If you manage servers with OpenSSH access, you have no doubt been subject to the barrage of SSH brute-force attempts that occurs across the internet. Some administrators deal with this by changing the default port (security by obscurity), using public keys, blocking sources after a threshold of failures, or white-listing source IP addresses, among other things. AWS has security groups that let you do the white-listing easily. However, if you are among those who have to allow public access from anywhere on the default port, you can at least monitor these attempts for trends, or set up an SSH honeypot and do all the sweet things. I decided to try the former using Splunk.
The first thing I did was figure out my search terms:
index="logger" sourcetype=syslog host=labs.sawbox.net
This gives me all the logs from the ‘logger’ index that were ingested via syslog, restricted to the specified host. This is great, but we only need the SSH login failures. Adding a few more search terms, we get:
index="logger" sourcetype=syslog host=labs.sawbox.net sshd CASE(Invalid) user
Now that looks much better. CASE() forces a case-sensitive match on the term, so we get back sshd’s ‘Invalid user’ events, which are the login failures we are after. We can do things like track the total count or, in our case, aggregate the usernames and look for trends. Splunk didn’t carve out the user or IP field automatically, so we will have to do that using field extraction.
We highlight the two fields we need and assign them the names ‘User’ and ‘src’. Now when we perform the search again, we see them populated under Interesting Fields.
If you look towards the bottom of the graphic, you can see that User and src have now been extracted. Let’s use these fields in a search:
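To make the extraction step concrete outside of Splunk, here is a minimal Python sketch of the same idea: a regular expression that pulls ‘User’ and ‘src’ out of an ‘Invalid user’ line. The sample log lines, the addresses and the helper name extract_fields are assumptions made up for illustration; they are not taken from my logs, and this is not the expression Splunk generates behind the scenes.

```python
import re

# Hypothetical sample lines in the style sshd writes to syslog.
SAMPLE_LINES = [
    "Jan 10 03:14:07 labs sshd[2211]: Invalid user admin from 203.0.113.5",
    "Jan 10 03:14:09 labs sshd[2213]: Invalid user test from 198.51.100.23 port 51234",
    "Jan 10 03:15:01 labs sshd[2290]: Accepted publickey for deploy from 192.0.2.10 port 40022 ssh2",
]

# Case-sensitive by default, much like CASE(Invalid) in the search.
INVALID_USER = re.compile(
    r"sshd\[\d+\]: Invalid user (?P<User>\S+) from (?P<src>\S+)"
)


def extract_fields(line):
    """Return {'User': ..., 'src': ...} for an 'Invalid user' line, else None."""
    match = INVALID_USER.search(line)
    return match.groupdict() if match else None


# Keep only the lines where the extraction matched.
extracted = [fields for fields in map(extract_fields, SAMPLE_LINES) if fields]
for fields in extracted:
    print(fields["User"], fields["src"])
```

The named groups play the role of the two highlighted fields: anything that is not an ‘Invalid user’ line simply produces no fields.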
index="logger" sourcetype=syslog host=labs.sawbox.net sshd CASE(Invalid) user | chart count by User
If we can sort the list by username frequency that would be much more usable:
index="logger" sourcetype=syslog host=labs.sawbox.net sshd CASE(Invalid) user | chart count by User | sort -count limit=10
We sorted the list in descending order and limited the output to the top 10 usernames. Now we can see that ‘admin’ was the most frequently attempted username over the last 30 days. We can take this one step further and visualize the data with dashboards, which gives us:
Great! Now we can see at a glance which usernames are popular across a rolling 30-day period. Hopefully, you do not have actual users in the top 10. OK, let us shift over to those IP addresses. Since we already have our search template, all we need to do is tailor it to show the top 10 IP addresses that have been probing our server:
index="logger" sourcetype=syslog host=labs.sawbox.net sshd CASE(Invalid) user | chart count by src | sort -count limit=10
Here we chart by source IP address and count the number of times we see each address. Because of how we did the extraction, each occurrence is a failed attempt to log in. You can see that several hosts have tried a few hundred times to log in to this particular machine. How about a visual aid for this data:
Perfect! There you have it. Top 10 IP addresses that you can use to bolster your defenses.
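If it helps to see what the count, sort and limit steps amount to, here is a rough Python sketch of the same aggregation using collections.Counter. The (User, src) pairs and the helper name top_values are hypothetical stand-ins for the extracted fields, not real data from this server.

```python
from collections import Counter

# Hypothetical (User, src) pairs standing in for the extracted Splunk fields.
EVENTS = [
    ("admin", "203.0.113.5"),
    ("admin", "203.0.113.5"),
    ("test", "198.51.100.23"),
    ("admin", "198.51.100.23"),
    ("oracle", "203.0.113.5"),
]


def top_values(events, index, limit=10):
    """Count the field at position `index` and return the `limit` most common values."""
    counts = Counter(event[index] for event in events)
    # most_common() sorts by count, descending, and truncates to `limit`.
    return counts.most_common(limit)


top_users = top_values(EVENTS, 0)    # roughly "chart count by User", sorted
top_sources = top_values(EVENTS, 1)  # roughly "chart count by src", sorted
print(top_users)
print(top_sources)
```

The same helper serves both searches because only the field being counted changes, which is why swapping User for src in the Splunk search was all it took.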
Be careful not to focus solely on the top 10, as that is only the noisy traffic. Keep an eye on the low-and-slow attempts too.
There are many other things we can do with this data, including using a similar process for successful logins, but that’s for another blog post. Thanks for reading.