James
Posts:1757
Moderator
Member since: 2006-11-29
Subject: Statistics analysis - you've gotta be a bit careful sometimes!
Going on from an earlier comment about keeping track of visitors, I found out something surprising today.

I tend to use several ways of analysing data, as some packages have pretty reports, some include certain reports in the free version, and some have different reports altogether. I was looking for a package that listed the pages with 404 errors, and came up with Deep Log Analyser. I've used this before, but gave up on it because the charts aren't pretty enough (or some airhead reason). Anyway, looking at what it reckons are my top pages, I noticed Bing have been sending loads of traffic to a .htm page that has never existed, but which they seem to have indexed anyway. We are talking several thousand hits over the course of a week, so not insubstantial. I've been showing the site error page to all of these visitors for ages, I guess.

The reason I picked this up is that most packages have not shown the entire URL of the page that's been indexed, but this one did. I'd assume that the fact they got shown the error page SHOULD show up in the logs, and be picked up that way. WHY the 404 error doesn't show up in other programs is a mystery. I can't claim to understand log files in any depth - just enough to find basic errors so I can resolve site issues. I've put in a permanent redirect for the .htm page, so hopefully the Bing traffic (if it exists?!?) will become proper visitors.
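
For anyone wanting to check this sort of thing by hand, here is a minimal Python sketch, assuming the server writes the usual Apache "combined" log format. The sample lines, the path /wordpress/old-page.htm and the name count_404s are all made up for illustration; none of them come from the actual logs. It counts 404 responses by the full requested URL, which is the detail most of the packages were not showing.

```python
import re
from collections import Counter

# Assumption: Apache/NCSA "combined" log format. Other formats need another pattern.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def count_404s(lines):
    """Return a Counter of full request URLs that were answered with a 404."""
    missing = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not fit the assumed format are skipped silently.
        if match and match.group("status") == "404":
            missing[match.group("url")] += 1
    return missing


# Hypothetical sample lines, invented for this example.
SAMPLE_LOG = [
    '203.0.113.5 - - [08/Aug/2009:19:50:01 +0100] "GET /wordpress/old-page.htm HTTP/1.1" 404 512 "http://www.bing.com/search?q=example" "Mozilla/4.0"',
    '203.0.113.6 - - [08/Aug/2009:19:50:07 +0100] "GET /wordpress/ HTTP/1.1" 200 10240 "-" "Mozilla/4.0"',
    '203.0.113.7 - - [08/Aug/2009:19:51:12 +0100] "GET /wordpress/old-page.htm HTTP/1.1" 404 512 "http://www.bing.com/search?q=example" "Mozilla/4.0"',
    '203.0.113.8 - - [08/Aug/2009:19:52:30 +0100] "GET /favicon.ico HTTP/1.1" 404 512 "-" "Mozilla/4.0"',
]

if __name__ == "__main__":
    # Print every missing URL with its hit count, most requested first.
    for url, hits in count_404s(SAMPLE_LOG).most_common():
        print(hits, url)
```

To run it against a real log, pass an open file instead of SAMPLE_LOG, since a file object yields one line at a time.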
August 08, 2009 07:57PM
DamonHD
Posts:6158
Moderator
Member since: 2006-11-30
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
Google Analytics cannot provide data for pages that don't contain its JavaScript because they don't exist...

Which is a point that I was forgetting. B^>

Rgds

Damon
August 08, 2009 08:01PM
James
Posts:1757
Moderator
Member since: 2006-11-29
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
Oh yeah - never thought of that! But the error doesn't show on log file analysers either. I think it might be because most show only the top few failed requests, and as that list is often taken up by favicon requests and the like, a page that doesn't exist might not make the front page and so slips through the net. Still trying to work this out as adsense is offline at the moment, and I have half an hour to kill before we watch a film.
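
A rough Python sketch of that theory, for what it's worth. The counts, the URLs and the list of "noise" file names are invented for the example, and this is not how any particular analyser builds its report; it only shows how a short top-N list can be filled by favicon-style requests while a genuinely missing page sits just below the cut.

```python
from collections import Counter

# Assumption: these file names are the usual background noise of failed requests.
NOISE_SUFFIXES = ("/favicon.ico", "/robots.txt")


def top_failed(counts, limit, ignore_noise=False):
    """Return the `limit` most-requested failed URLs, optionally without noise."""
    items = list(counts.items())
    if ignore_noise:
        items = [(url, hits) for url, hits in items
                 if not url.endswith(NOISE_SUFFIXES)]
    # Highest hit count first, then cut the list off at `limit` entries.
    return sorted(items, key=lambda pair: pair[1], reverse=True)[:limit]


# Hypothetical failed-request counts, invented for this example.
FAILED = Counter({
    "/favicon.ico": 9000,
    "/robots.txt": 4000,
    "/blog/favicon.ico": 3500,
    "/wordpress/old-page.htm": 3000,
})

if __name__ == "__main__":
    print(top_failed(FAILED, 3))                     # the missing page is cut off
    print(top_failed(FAILED, 3, ignore_noise=True))  # the missing page is listed
```

With a limit of 3 the missing .htm page never appears unless the noise is filtered out first, even though it has thousands of hits.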
August 08, 2009 08:08PM
GegaBit
Posts:3311
Senior member
Member since: 2006-11-30
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
Monitoring via JS has its advantages, but I spare my pages and visitors the extra code and delay and use AWStats, which picks up all the info I need directly from the server log file and comes with most hosting packages.

James, it also reports the errors with details.
August 08, 2009 08:56PM
Joshua
Posts:2831
Administrator
Member since: 2007-03-16
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
You could put the analytics JS on a custom 404 page if you want to (it doesn't track robots, but it does catch most of the humans reaching the page).

Quote:
This code sends a virtual pageview of "/404.html?page=[pagename.html?queryparameter]&from=[referrer]" to your account, where [pagename.html?queryparameters] is the missing page name and referrer is the page URL from where the user reached the 404 page.
How do I track error pages so they show up in my reports?


Also with some modification in the links to pages without the ability to place js (eg. PDFs) you can still track them with analytics. Or even outgoing links if you want to.

How do I track files (such as PDF, AVI, or WMV) that are downloaded from my site?



Edited 2 time(s). Last edit at 08/08/2009 11:10PM by Joshua.
August 08, 2009 10:54PM
DamonHD
Posts:6158
Moderator
Member since: 2006-11-30
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
Thanks for that J: I was sure there'd be something similar, but I just wanted to highlight the simple difference between JS-driven and log-file-driven stats that I hadn't thought carefully enough about!

Rgds

Damon
August 09, 2009 05:03AM
James
Posts:1757
Moderator
Member since: 2006-11-29
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
Glad you can track 404s in Analytics, so thanks for posting that. Personally, I still prefer to use the server logs for tracking errors though.

GB - I tried AWStats, and as you say it is included with the hosting package. Like Weblog Expert, AWStats reports attempted file access as the root of the directory, i.e. /wordpress/. Click on that link, and you get sent to the root file and see the page intended, so nothing appears to be wrong. It was only when I saw the more detailed info in Deep Log that I realised that some of the traffic was going to the non-existent page thanks to Bing indexing it, and that it needed a redirect to work.
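
To show what I mean about the directory hiding the problem, here is a small Python sketch. I don't know how AWStats or Weblog Expert actually build that report, so treat this purely as an illustration of the effect; the request list and the function name by_directory are invented.

```python
import posixpath
from collections import Counter


def by_directory(urls):
    """Tally requests by directory only, dropping the file name and query string."""
    tally = Counter()
    for url in urls:
        path = url.split("?", 1)[0]
        directory = posixpath.dirname(path)
        # Normalise so that "/wordpress" and "/" both end in a single slash.
        tally[directory.rstrip("/") + "/"] += 1
    return tally


# Hypothetical requests, invented for this example.
REQUESTS = [
    "/wordpress/",
    "/wordpress/old-page.htm",
    "/wordpress/old-page.htm",
    "/wordpress/",
]

if __name__ == "__main__":
    print(by_directory(REQUESTS))  # everything lands under /wordpress/
    print(Counter(REQUESTS))       # the missing page is visible on its own
```

Rolled up by directory, all four requests look like visits to /wordpress/, which loads fine when you click it. Only the full-URL tally shows that half of them were for a page that isn't there.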

None of this explains why this doesn't show up in the error logs though. Oh well....
August 09, 2009 09:43AM
Pengi
Posts:3345
Senior member
Member since: 2006-12-17
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
Quote:
I noticed Bing have been sending loads of traffic to a .htm page that has never existed

Any idea as to how this could come about? Given the effort that is put into making search engines send traffic to intended relevant pages, it seems strange. Could there be some links to this "page" from somewhere on your site (or someone else's)?
August 09, 2009 10:28AM
James
Posts:1757
Moderator
Member since: 2006-11-29
Subject: Re: Statistics analysis - you've gotta be a bit careful sometimes!
Been thinking about this one, and I do think there is one possibility:-

I think what might have happened is idiot-features here made an error and FTP'd the wrong page to the wrong location at some point. Somehow or other M$ managed to find it and index it. I think this might have happened shortly before I moved hosts, when I started again with a clean install and checked what files were supposed to go where. It's also possible that I noticed the file in the wrong place and removed it, but not until after M$ had indexed it - hence the traffic. They clearly can't index a non-existent file, so at some point they must have had a file to index, and I think that this is the most likely explanation.

The stats thing doesn't really matter, as now that the indexing error has been picked up I can correct it. My only point is that sometimes it's better to use several analysis methods than to rely on just one. You might pick up errors a bit more easily.



Edited 1 time(s). Last edit at 08/09/2009 11:06AM by James.
August 09, 2009 11:05AM
