Too Much Robot Probing is Making Me Annoyed


Have you ever had a dream where you were being probed by robots? Neither have I, thank goodness. That would be a nightmare. Although I'm not being probed by robots in my nightmares, my websites are being probed by robots in real life. Nearly every website is.

In case you're unaware, search engines gather information for their search results by sending robots across the Internet to look for new webpages. When they find new pages, they index them for search. The problem is that robots do so much probing that they can account for a substantial percentage of the traffic a small website receives. If a website runs scripts to count page views, as many of mine do, those scripts would have to be more sophisticated than is practical to distinguish page views by real people from page views by robots. The bottom line is that, thanks to all the robot probing, you can't tell exactly how much real traffic your small website is getting.
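For what it's worth, a rough first pass at the people-versus-robots problem can be sketched in a few lines of Python. The well-known search engine robots announce themselves in the User-Agent header (Googlebot and bingbot both do), so a counting script that can see that header could skip the obvious ones. Everything here is an assumption for illustration: the function name, the list of markers, and the idea that the script has access to the header at all. It will miss any robot that pretends to be a browser, so it narrows the uncertainty rather than removing it.

```python
# Hypothetical marker list: substrings that commonly appear in the
# User-Agent strings of self-identifying robots. Not exhaustive.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def looks_like_robot(user_agent):
    """Guess whether a page view came from a self-identifying robot.

    This only catches robots that say so in their User-Agent string.
    """
    if not user_agent:
        # Assumption: real browsers always send a User-Agent,
        # so a missing one is treated as a robot.
        return True
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)
```

A counting script could call `looks_like_robot()` on each request and keep two tallies, one for likely robots and one for everything else.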

I have some idea of how much robot traffic I'm getting from the dates and times at which my page-view scripts are triggered. When I see twenty page views of twenty different pages in the same minute, that's a good indication that no real person is involved. However, it's more complicated than that, because robots don't read every page on a website all at once. They'll read 5, 10, 20, 30, or maybe 50 seemingly random pages at a time and then leave. My experience has been that as my websites get more traffic from actual humans, the robots also seem to show up more frequently.
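The "twenty different pages in the same minute" rule of thumb can be written down as a short Python sketch. It assumes the page-view log is available as pairs of (timestamp, page), which is a made-up format for illustration, and the function name and the threshold of 20 are likewise just placeholders. Since robots sometimes read only 5 or 10 pages per visit, a lower threshold would catch more of them at the cost of occasionally flagging an enthusiastic human.

```python
from collections import defaultdict
from datetime import datetime


def find_robot_bursts(page_views, min_pages=20):
    """Return the minutes in which at least `min_pages` different pages were viewed.

    `page_views` is an iterable of (datetime, page) pairs (an assumed log format).
    """
    pages_by_minute = defaultdict(set)
    for timestamp, page in page_views:
        # Drop seconds so all views in the same clock minute share a key.
        minute = timestamp.replace(second=0, microsecond=0)
        pages_by_minute[minute].add(page)
    return sorted(
        minute
        for minute, pages in pages_by_minute.items()
        if len(pages) >= min_pages
    )
```

Each minute it returns is a likely robot visit; page views outside those minutes are the ones more likely to be real people.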

The really annoying thing is that despite all the probing that search engine robots are doing, they still are not directing much search traffic my way. I know I can use my robots.txt file to ask the robots to stay away, but I don't, because some search traffic is still better than none. So, I have to put up with never really knowing how much of my websites' traffic is human and how much is probing from the Google robot from hell.
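For anyone curious what the robots.txt option looks like, here is a small Python sketch that uses the standard library's `urllib.robotparser` to show how a well-behaved robot would read two possible files. The file contents, the `/private/` path, the example.com URLs, and the `allowed()` helper are all hypothetical. Keep in mind that robots.txt is a request, not a lock: robots that ignore the rules will probe anyway.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every robot to stay off the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only fences off one directory.
BLOCK_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Report whether a rule-following robot may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With `BLOCK_ALL`, every page is off limits to robots that follow the rules, which would also mean giving up search traffic. With `BLOCK_PRIVATE`, the rest of the site stays open to indexing.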

