A threat feed is a collection of actionable threat information that helps defenders mitigate harmful events. This blog post covers developing an IP-based threat feed, or blacklist. We will look at how to gather, aggregate, enrich, and extract threat data for consumption.
Gathering the threat data
I have several servers in the US, Europe, and Asia running modified versions of Cowrie, a medium-interaction honeypot. These honeypots allow SSH access by accepting a login after a random number of guesses for each attacker. The configuration setting is below:
auth_class = AuthRandom
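To make the idea concrete, here is a minimal Python sketch of "accept a login after a random number of guesses per attacker". It is an illustration only and not Cowrie's actual code; the class name RandomGuessAuth and the min_tries and max_tries values are assumptions made for this example.

```python
import random


class RandomGuessAuth:
    """Accept a login only after a per-source random number of attempts.

    Illustrative sketch only; this is not Cowrie's implementation.
    """

    def __init__(self, min_tries=2, max_tries=5, rng=None):
        # min_tries and max_tries are assumed values for the example.
        self.min_tries = min_tries
        self.max_tries = max_tries
        self.rng = rng if rng is not None else random.Random()
        self.required = {}  # source IP -> attempts needed before success
        self.attempts = {}  # source IP -> attempts seen so far

    def check_login(self, src_ip):
        # Choose the threshold the first time this source IP is seen.
        if src_ip not in self.required:
            self.required[src_ip] = self.rng.randint(self.min_tries, self.max_tries)
        self.attempts[src_ip] = self.attempts.get(src_ip, 0) + 1
        # The login succeeds once the attacker has guessed often enough.
        return self.attempts[src_ip] >= self.required[src_ip]
```

Because the threshold differs per source address, an attacker cannot tell from a single failed guess that the server is a honeypot.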
Once an attacker gains entry to a server, the honeypot records their actions and saves any artifacts they create. A Splunk Universal Forwarder (UF) monitors the honeypot log file and ships it to a Security Information and Event Management (SIEM) platform. This process happens for every honeypot that I have deployed.
Aggregation of the threat data
The objective is to consolidate the IP addresses of attackers that have gained unauthorized access to at least one server. An IP address that only scans or connects to one of our servers does not give us high confidence that it belongs to a malicious actor or a compromised machine. The additional requirement that the source must also gain entry provides enough confidence that it has malicious intent. We can use a Splunk query to achieve the objective and bucket the aggregated log events that have source IP addresses with a login_success event type.
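Outside of Splunk, the same aggregation can be sketched in a few lines of Python. This is a rough equivalent and not the query used for this post. It assumes the honeypot writes one JSON event per line with the fields eventid and src_ip, and that a successful login is logged as cowrie.login.success; the function name is made up for the example.

```python
import json


def successful_login_ips(log_lines):
    """Return the set of source IPs with at least one successful login."""
    ips = set()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        # Assumed field names: "eventid" and "src_ip".
        if event.get("eventid") == "cowrie.login.success" and event.get("src_ip"):
            ips.add(event["src_ip"])
    return ips
```

Sources that only connect or fail every login never reach the set, which matches the confidence requirement described above.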

At the time of this writing, the query generated over 15 thousand unique IP addresses from my honeypot dataset. The generated list can be used as an IP blacklist or an IP-based egress filter. Another option is to check how much it overlaps with the commercial and open-source IP lists you already use.
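The overlap check comes down to set arithmetic. The sketch below assumes both lists are already loaded as plain IP address strings; the function name and the keys of the returned dictionary are chosen for illustration.

```python
def overlap_report(honeypot_ips, other_ips):
    """Compare the honeypot list with another IP list."""
    honeypot_ips = set(honeypot_ips)
    other_ips = set(other_ips)
    shared = honeypot_ips & other_ips
    return {
        "shared": shared,
        # Addresses the other list does not know about yet.
        "only_honeypot": honeypot_ips - other_ips,
        # Fraction of the honeypot list that the other list also contains.
        "overlap_ratio": len(shared) / len(honeypot_ips) if honeypot_ips else 0.0,
    }
```

A low overlap ratio suggests the honeypot list adds coverage that the other list lacks, while a high ratio suggests the two are largely redundant.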
Data Enrichment
The IP address list alone has limited value. We can make the data more useful by adding context, for example third-party IP reputation information from VirusTotal, AbuseIPDB, or one of many other providers. The graphic below contains another Splunk query that enhances the IP address list with the associated country, last-seen time, and first-seen time.
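For readers without Splunk, the first-seen and last-seen part of the enrichment can be sketched in Python. This is an assumption-laden illustration: events are taken to be (src_ip, timestamp) pairs with ISO 8601 timestamps, and country_lookup is a hypothetical callable standing in for whatever geolocation source you use.

```python
from datetime import datetime


def enrich(events, country_lookup=None):
    """Summarize (src_ip, iso_timestamp) pairs into one record per IP."""
    summary = {}
    for src_ip, ts in events:
        # Accept a trailing "Z" on older Python versions as well.
        seen = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        row = summary.get(src_ip)
        if row is None:
            summary[src_ip] = {"first_seen": seen, "last_seen": seen}
        else:
            row["first_seen"] = min(row["first_seen"], seen)
            row["last_seen"] = max(row["last_seen"], seen)
    for src_ip, row in summary.items():
        # country_lookup is hypothetical; None means no country data.
        row["country"] = country_lookup(src_ip) if country_lookup else None
    return summary
```

Reputation scores from a third-party provider could be attached in the same loop, keyed by the source IP.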

Extraction
Splunk allows you to run scheduled searches and output the results to a lookup table. I have a recurring search that saves a CSV file with the results below and then pushes the changes to a Git repo once a day. There is also an option to save the results as XML, JSON, or PDF.
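If you are producing the feed outside of Splunk, writing the CSV takes only the standard library. The column names below are assumptions for the example and not the exact fields in my feed.

```python
import csv

# Assumed column layout for the feed file.
FEED_COLUMNS = ["src_ip", "country", "first_seen", "last_seen"]


def write_feed_csv(rows, out):
    """Write feed rows (dicts keyed by FEED_COLUMNS) to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FEED_COLUMNS)
    writer.writeheader()
    # Sort by IP string so the daily commit only shows real changes.
    for row in sorted(rows, key=lambda r: r["src_ip"]):
        writer.writerow(row)
```

Writing the rows in a stable order keeps the daily diff in the repo small and readable.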

Thanks for reading.
thanks for the knowledge share.
No problem. Glad to give back.