The junk that you will find in web access logs

If you have a website, you probably look at your logs from time to time to see who is visiting your site and how many visitors you get. I’ve looked at a lot of logfiles, both for my own site and for others, and thought I’d pass along some things you will likely see. For starters, you are likely to see requests for pages that don’t exist. Even if you’ve never made changes to your site, you may see requests for files like ../../cmd.exe

I was at first amused when I saw this entry in my website logs. .exe files are Windows executables, and cmd.exe is basically the Windows command shell. My server was NOT Windows, so I knew that not only would this file not be found, but the vulnerability they were trying to exploit would fail as well. Sometimes you’ll see extremely long entries in your logs. For instance, I’ve seen one lately that looks like this: SEARCH /\x90\x04H\x04H\x04H\ except it goes on and on for 3 pages worth of scrolling. My suspicion is that someone’s trying to do a “buffer overflow” attack. The result code was 414 (Request-URI Too Long), which means the server rejected the request outright and the attempt to overflow the buffer failed.
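If you would rather not scroll through pages of this by eye, a short script can flag the kinds of entries described above. What follows is only a rough sketch, not a real intrusion detector. It assumes your server writes Common or Combined Log Format lines (request in double quotes, followed by the status code). The function name suspicious_reasons and the 1024-character threshold are my own inventions for illustration.

```python
import re

# Assumes Common/Combined Log Format: ... "METHOD /path PROTOCOL" STATUS SIZE ...
# Requests containing escaped double quotes are not handled by this simple pattern.
LOG_PATTERN = re.compile(r'"(?P<request>[^"]*)" (?P<status>\d{3})\b')

# Arbitrary cutoff chosen for illustration; tune it for your own site.
MAX_REQUEST_LENGTH = 1024


def suspicious_reasons(line):
    """Return a list of reasons why a log line looks like junk (empty if none)."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return []
    request = match.group("request")
    reasons = []
    # Attempts to climb out of the web root, e.g. ../../cmd.exe
    if "../" in request or "..\\" in request:
        reasons.append("path traversal")
    # Requests for Windows executables
    if ".exe" in request.lower():
        reasons.append("Windows executable")
    # Extremely long requests, as in the possible buffer overflow attempt
    if len(request) > MAX_REQUEST_LENGTH:
        reasons.append("oversized request")
    # The server itself said the request line was too long
    if match.group("status") == "414":
        reasons.append("status 414")
    return reasons
```

To use it, loop over the lines of your access log and print any line where suspicious_reasons returns a non-empty list.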

Another thing that I’ve found in my logs is junk in the referrer log. The referrer log can be useful for finding out how people got to your site. Say another website has just posted a link to my site: when people click on that link to visit my site, I can see the “referrer”, the page that housed the link they clicked on. Unfortunately, the referrer is supplied by the client making the request, so it’s possible to craft a request that manufactures an address. This is called referrer spam. For instance, I’ve found addresses of several porn sites in my referrer logs. I seriously doubt they have a link to my page. Recently I’ve been getting referrer spam from a “smokersteeth” website of some sort.
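One way to spot referrer spam is to tally the referring hosts and look for names that have no business linking to you. Here is a minimal sketch, assuming your server writes Combined Log Format (the referrer and user agent are the last two quoted fields). The function name count_referrer_hosts and the own_host parameter are hypothetical names I made up for this example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumes Combined Log Format: "request" status size "referrer" "user-agent"
COMBINED_PATTERN = re.compile(
    r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)


def count_referrer_hosts(lines, own_host):
    """Count external referring hosts, skipping empty referrers and own_host."""
    counts = Counter()
    for line in lines:
        match = COMBINED_PATTERN.search(line)
        if match is None:
            continue
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # no referrer was sent
        try:
            host = urlsplit(referrer).hostname
        except ValueError:
            continue  # malformed URL, ignore it
        if host is None or host == own_host:
            continue  # not a URL, or an internal link on your own site
        counts[host] += 1
    return counts
```

Calling counts.most_common(20) on the result gives a quick list of the busiest referrers to eyeball. A host that shows up many times but does not actually link to you is a likely spammer.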

It’s worth password protecting any web directories that let you view your logs or statistics over the web, so that they can’t be used as a way of advertising these sites. The spam may still show up in your logs, since it’s pretty simple to automate requests to a large batch of sites on the chance that someone is going to see the link and visit. As I’ve discussed before, just because there are no links to something on your website doesn’t mean that it can’t be found. So, best practice is to password protect your log viewing pages.

