In a previous blog post I promised to show how to use pipes to read log files more effectively. This can be an intimidating process to say the least, so let's start with an example.
[root@server logs]# ls -lh
-rw-r--r-- 1 root root    0 Mar 28 04:02 access_log
-rw-r--r-- 1 root root 320K Mar 24 22:14 access_log.1
-rw-r--r-- 1 root root 327K Mar 24 03:36 access_log.2
-rw-r--r-- 1 root root  94K Mar  6 21:48 access_log.3
-rw-r--r-- 1 root root 1.7M Feb 28 03:10 access_log.4
Notice I have over two megabytes of access logs here. This is a sandbox that only I really play on. I've got WordPress, Drupal, and a few other minor things installed. Nothing special, virtually NO traffic. Regardless, when we do:
[root@server logs]# cat access_log* | wc -l
16176
There are over 16,000 entries! That is a lot of text to go through, and these logs are TINY compared to what you can see on a production server, where they will likely run to millions of lines. Without a good question we will not get a good answer, so we have to know what data we are looking for and how to narrow the pool down to only the entries we need, or at the very least an amount we can go through manually. Sometimes we don't even know which log file to look in. At that point, you can use grep like so:
grep -ilr searchterm /logdir/*
and this will print the names of the files containing your search term, so you know where to start processing data. I don't usually have to do this unless I am unfamiliar with the application. If you're getting an error about a configuration setting, this is a GREAT way to find out which configuration file contains the parameter you need to set.
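As a minimal sketch of that technique, here is a self-contained demo with two hypothetical config files (the directory, file names, and directive are made up for illustration); `grep -ilr` reports only the file that actually contains the directive:

```shell
# Set up two small example config files in a scratch directory
rm -rf /tmp/grepdemo && mkdir -p /tmp/grepdemo
echo "MaxClients 150" > /tmp/grepdemo/mpm.conf
echo "DocumentRoot /var/www" > /tmp/grepdemo/vhost.conf

# -i ignore case, -l print only matching file names, -r recurse
grep -ilr maxclients /tmp/grepdemo/
# → /tmp/grepdemo/mpm.conf
```

Note that `-i` matters here: the file says `MaxClients`, but we searched for `maxclients` and still found it.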
Getting back to the task at hand, which is how to interpret our findings, let's say I was looking for some PHP code that was causing trouble. Because network traffic monitoring is set up, I know that I had an outbound flood to the IP address 203.0.113.52. I can grep for the IP in question in the Apache logs to find out what is happening and where. These logs can be set up in a few different ways, so assuming you're on cPanel I would run:
[root@server logs]# grep -ilr 203.0.113.52 /etc/httpd/domlogs/*
and see which logs mention this IP address in their requests. It would likely only be entries for one or two PHP pages, and they would likely be highly suspicious. Let's say we found a script called suspicious.php that had this IP in its GET string, and it was mentioned a ton of times in one domain's log. We could find out how many times it was hit by running:
[root@server logs]# grep 203.0.113.52 /etc/httpd/domlogs/domain.com | grep suspicious.php | wc -l
and this would tell us the number of times it has been run. (Note that we drop the -l flag here: -l prints matching file names, but for counting we want the matching lines themselves.) Depending on the attack script this can be thousands of log entries. Let's say we wanted to see if the attacker had been doing anything else on the server, but the suspicious.php entries were so numerous that we had thousands of lines to search through. Instead of wading through those, we would just run:
[root@server logs]# grep -r 203.0.113.52 /etc/httpd/domlogs/* | grep -v suspicious.php
where 203.0.113.52 is the IP that accessed suspicious.php. The -v flag eliminates any entries containing suspicious.php from the output. We can chain this multiple times with different terms as necessary. By gradually including or excluding terms we can distill the logs into usable data. It takes a bit of time, but you can track compromises, spammers, and other server-side problems this way. You can even do it in real time. Let's say you had a 503 error. All you would do is run:
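The include/exclude pipeline above can be tried out on a small fabricated access log (the IP, timestamps, and file names here are hypothetical, chosen just to mimic the scenario in the text):

```shell
# Build a tiny sample access log for the demo
cat > /tmp/access_sample.log <<'EOF'
203.0.113.52 - - [28/Mar/2015:04:02:01] "GET /suspicious.php?cmd=x HTTP/1.1" 200
203.0.113.52 - - [28/Mar/2015:04:02:02] "GET /suspicious.php?cmd=y HTTP/1.1" 200
203.0.113.52 - - [28/Mar/2015:04:03:10] "POST /wp-login.php HTTP/1.1" 200
198.51.100.7 - - [28/Mar/2015:04:05:00] "GET /index.php HTTP/1.1" 200
EOF

# How many times did that IP hit suspicious.php? (-c counts matching lines)
grep 203.0.113.52 /tmp/access_sample.log | grep -c suspicious.php
# → 2

# What ELSE did that IP touch? Exclude the noisy script with -v
grep 203.0.113.52 /tmp/access_sample.log | grep -v suspicious.php
# → the wp-login.php entry only
```

The second pipeline immediately surfaces the wp-login.php request that would otherwise be buried under the suspicious.php noise.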
[root@server logs]# tail -f /etc/httpd/logs/error_log | grep searchterm

Or, to watch mail move through the server in real time:

[root@server logs]# tail -f /var/log/exim_mainlog | grep user@domain.com

where the email address is the sending or receiving address you are interested in. This will show you the flow of the email through the server as it happens.
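Since `tail -f` never exits, the same filtering is easier to demonstrate on a static sample. Here is a sketch with a fabricated exim-style log excerpt (the message IDs and addresses are invented; `<=` marks an arriving message and `=>` a delivery, as in exim's main log):

```shell
# Fabricated exim main log excerpt for the demo
cat > /tmp/exim_sample.log <<'EOF'
2015-03-28 04:10:01 1Yb001-0001-AA <= user@domain.com H=localhost
2015-03-28 04:10:02 1Yb001-0001-AA => friend@example.org R=dnslookup
2015-03-28 04:11:00 1Yb002-0002-BB <= other@elsewhere.net H=localhost
EOF

# Filter for one address; with tail -f instead of a file argument,
# this same grep would show matches live as they are logged
grep user@domain.com /tmp/exim_sample.log
```

From there you could grep for the message ID (1Yb001-0001-AA) to see every stage that message went through, arrival and delivery alike.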