To quote the late, great Peter Boyle, "Holy Crap!" But I'm getting ahead of myself.
First, some background: Familygreenberg.com has more than 75 pages on it. And that's not counting the blog, with all of its individual posts and monthly archive files, which brings the total to more than 450. It's hosted on Yahoo! GeoCities Pro ($9/month, 5 e-mail addresses, 100GB of bandwidth, 2GB of storage, various web-based tools, etc.). Maybe not the best deal around today, but certainly more than I need and I've had very few complaints.
GeoCities Site Statistics
One of those few complaints has always been access to web server stats. If I had my own web server, I'd be able to look at the server logs, and see each and every request the server processed. Then, I could use one of a wide variety of available tools to slice & dice the data and generate interesting reports about who was accessing the site, what they were looking at, how long they were staying, etc.
People who sell things on their websites find this kind of information invaluable, since it tells them what's hot and what's not. For me, of course, it's just an exercise in ego. How many people are reading what I write? What are the most popular entries and what are the least? Is it the same three or four people who keep coming back, or am I reaching hundreds of people around the world? Egomania has certainly gone hi-tech in the 21st century...
Yahoo! GeoCities offers a page called Site Statistics that attempts to fulfill this need. The presentation is rather poor, the navigation is difficult, and the data is relatively sparse. Here are some samples:
The data is only available on a page by page basis, so determining how many people are viewing the photo albums, or how many people are reading the blog, or even how many people are viewing the site in total, is basically impossible.
To compensate for these shortcomings, I created a rather complex Excel spreadsheet. The spreadsheet made copious use of Excel's Web Query function, looping through all the pages on the site, computing the Site Statistics URL for each page, and then pulling all the relevant data into the spreadsheet. Once the data was there, I was able to create some more advanced graphs:
I could also generate lists of referring site domains and keyword searches, which allowed me to post "How People Found Me" entries like this and this. The process was a little kludgy, but at least it worked.
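For the curious, the roll-up step the spreadsheet performed (turning page-by-page counts into per-section and site-wide totals) can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet's actual logic: the page paths and view counts are made-up sample data, and the function names are my own.

```python
from collections import defaultdict


def section_of(path):
    """Return the top-level folder of a page path, or '(root)' for top-level pages."""
    parts = path.strip("/").split("/")
    return parts[0] if len(parts) > 1 else "(root)"


def totals_by_section(page_views):
    """Sum per-page view counts into one total per site section."""
    totals = defaultdict(int)
    for path, views in page_views.items():
        totals[section_of(path)] += views
    return dict(totals)


# Hypothetical sample data: page path -> views for the month.
page_views = {
    "index.html": 40,
    "blog/2006_12_01_archive.html": 25,
    "blog/2007_01_01_archive.html": 30,
    "photos/disney2006.html": 12,
}

section_totals = totals_by_section(page_views)
site_total = sum(page_views.values())

print(section_totals)
print(site_total)
```

Grouping by top-level folder is just one way to define a "section"; any rule that maps a page to a group would work the same way.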
DoS Filters Mark the Downfall of GeoCities Site Statistics
About a month ago, I sat down to run my monthly update on the spreadsheet, and got an error message: Yahoo's Site Statistics page was down. Bummer. So I waited about an hour & tried again. Still down. Major bummer. I was going to have to go to sleep that night without knowing how many (dozens of) people were reading my blog! It was a rough night, but I pulled through. The next day, Site Statistics was behaving normally again, so I ran the spreadsheet's looping algorithm again. Wouldn't you know it? Same error message. What lousy luck!
Too lousy, in fact. I started investigating, and realized eventually that the number of pages on the site (including the blog) was causing the spreadsheet to appear as a malicious program to Yahoo's servers (hundreds of HTTP requests from the same IP address over a short span of time). When that happens, Yahoo denies access to the site from the offending IP address for some period of time. Rebooting my computer lets me try again sooner (since Comcast assigns a new IP address to my cable modem when it gets cycled off/on), but I could never make it all the way through the page list before getting locked out.
So, I was left with two choices: either capture the data in segments, separated by some period of time (hours, I'm guessing, if not a full day), or find another way to capture statistics about my site. Since the former was frankly more work than even I was willing to do, I went looking for another option. Enter Google Analytics.
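For the record, the first option (capturing the data in segments) would have looked something like the Python sketch below. It is a rough illustration only: the segment size and pause length are guesses, since I never learned Yahoo's actual thresholds, and `fetch` is a stand-in for whatever function retrieves one page's statistics.

```python
import time


def in_segments(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_in_segments(pages, fetch, segment_size, pause_seconds, sleep=time.sleep):
    """Call fetch(page) for every page, pausing between segments.

    `fetch` is a hypothetical placeholder for the real per-page request.
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    results = {}
    segments = list(in_segments(pages, segment_size))
    for index, segment in enumerate(segments):
        for page in segment:
            results[page] = fetch(page)
        # No need to wait after the final segment.
        if index < len(segments) - 1:
            sleep(pause_seconds)
    return results


# Dry run: record the pauses instead of actually sleeping, and use len()
# as a stand-in for the real request.
pauses = []
demo = fetch_in_segments(
    pages=["a.html", "b.html", "c.html", "d.html", "e.html"],
    fetch=len,
    segment_size=2,
    pause_seconds=3600,
    sleep=pauses.append,
)
print(demo)
print(pauses)
```

With 450-plus pages and an hour or more between segments, a full run would stretch across days, which is exactly why I didn't bother.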
Google Analytics - What Happens When You Let the Professionals Handle Things
Setup was simple: I pasted four lines of HTML at the end of each page in my site. In many cases, because of templates available from MS FrontPage and Blogger, I was able to paste the code in one place and refresh a template, which replicated the code to hundreds of pages simultaneously. In any case, the whole thing took me about 15 minutes. After that, I had to wait around 24 hours for the site to start collecting data (the help text says the site refreshes roughly once every 24 hours, but experience suggests it's more often than that).
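If your pages aren't built from templates, the same paste job could be scripted. Here's a minimal Python sketch, assuming each page is ordinary HTML with a closing body tag. The snippet text is a placeholder, not Google's actual code; you would substitute the lines your own account gives you.

```python
import re

# Placeholder only: replace with the tracking lines from your own account.
TRACKING_SNIPPET = "<!-- tracking code placeholder -->"


def add_snippet(html, snippet=TRACKING_SNIPPET):
    """Insert `snippet` just before the last closing body tag of `html`.

    Pages that already contain the snippet are returned unchanged, so the
    function is safe to run more than once. If no closing body tag is
    found, the snippet is appended to the end of the page instead.
    """
    if snippet in html:
        return html
    last_match = None
    for last_match in re.finditer(r"</body\s*>", html, flags=re.IGNORECASE):
        pass
    if last_match is None:
        return html + "\n" + snippet + "\n"
    cut = last_match.start()
    return html[:cut] + snippet + "\n" + html[cut:]


page = "<html><body><p>Hello</p></body></html>"
tagged = add_snippet(page)
print(tagged)
```

To run it across a whole site, you would read each file, pass its contents through `add_snippet`, and write the result back (after making a backup, naturally).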
Then the data started rolling in. Here's an example of the "Executive Summary" homepage:
On one page, I can see total visits and pageviews for the site, as well as the ratio of new vs. returning visitors, a geographic representation of my visitors, and a breakdown of the referring sources.
Here are some more samples:
In the upper-left is a full-sized version of the Geo Map Overlay. Hovering over each orange dot tells me the city & state from which the visit originated. The larger the dot, the more visits from that city.
In the upper-right is the keyword conversion chart. Each word or phrase entered into a search engine that resulted in a visit to my site is listed here, along with the number of visits the keyword yielded and the average number of pages for each visit. The "+" next to each term drills down into the sources (i.e., search engines) that were used for that keyword. The red circle with the arrows in it provides a pop-up menu, which allows me to view activity for that keyword over a date/time range, or drill down on one of eighteen different metrics (e.g., region, city, state, network location, browser type). Just about anything I'd like to know about the keywords used to access my site is available on this one, interactive page.
(NOTE: I didn't provide screen shots for all of them, but similar pages exist with lists of page filenames, page titles, domains, locations, and others. Each of them provides the same kind of drill down capability described above. So, for instance, I can replicate the functionality of GeoCities Site Statistics by bringing up the page list, and then drilling down for each page on referrer, keyword, or date/time range, if that's what I wanted to know.)
In the lower-left is the navigation funnel. The site's file structure is shown in the page's middle column. I can click on any page, and on the right, I see a graphical display of all the pages that users came from to visit that page (the top of the funnel), as well as all the pages they visited after that page (the bottom of the funnel). This graph also shows entrance and exit rates for each page, so you can tell if a particular page is the users' jumping off point to the rest of the site, or if it is the end of the road.
If you click on the selected page (the middle of the funnel), you see the Site Overlay page, a sample of which is shown in the lower-right. This view shows a preview of the page itself, and provides a small bar graph next to each link on the page, indicating how many times users clicked on the link. So, from the sample I've included here, I learn that an equal number of people clicked on the oval, "Photo Album" graphic as clicked on the "Photo Album" menu on the page's menu-bar (5 each), but only 3 people clicked on the "Disney, 2006" link under "New Photos." Over time, this will provide me valuable feedback about how people use the site.
Speaking of time, all of the data I've discussed so far can be viewed for a single day or a span of multiple days. Some views can even reflect intra-day (hourly) data. In addition, you can specify two date/time ranges, and Google Analytics will show you side-by-side comparisons of the statistics for the two ranges. So, if I were so inclined, I could compare this week to last week, or this month to the same month last year, and so on.
But Wait, There's (Theoretically) More!
All of that said, there are significant pieces of Google Analytics that I'm not even using. For instance, the tool allows me to tag my web pages (using another two lines of code on each page), and then group my statistics by these tags. So, for instance, if I wanted to tag a series of pages as "technology-related blog entries," I could then see visit rates, pageviews, keyword usage, geo map overlays, etc. for that group of pages as a single unit.
For eCommerce sites, the tool also tracks dollars spent per page, and allows you to identify specific marketing campaigns on the site, which it uses to show dollars spent per marketing campaign. Assuming you already know how much each campaign is costing you, this makes marketing ROI very easy to calculate, on an overall basis and for each campaign individually. And if you use Google's AdWords feature, Google Analytics will tell you how many impressions and click-throughs, and how much revenue, were generated by each AdGroup and/or Keyword.
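To make the ROI arithmetic concrete, here is a small Python sketch using the usual definition, (revenue - cost) / cost. The campaign names and dollar amounts are invented for illustration; in practice the revenue would come from the tool's reports and the cost from your own records.

```python
def roi(revenue, cost):
    """Return return-on-investment as a fraction: (revenue - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (revenue - cost) / cost


# Hypothetical campaigns: name -> (revenue attributed to it, what it cost).
campaigns = {
    "spring_newsletter": (300.0, 100.0),
    "banner_ads": (150.0, 200.0),
}

# ROI for each campaign individually.
per_campaign = {name: roi(rev, cost) for name, (rev, cost) in campaigns.items()}

# Overall ROI across all campaigns combined.
total_revenue = sum(rev for rev, _ in campaigns.values())
total_cost = sum(cost for _, cost in campaigns.values())
overall = roi(total_revenue, total_cost)

print(per_campaign)
print(overall)
```

A negative number means the campaign cost more than it brought in, which is precisely the kind of thing you'd want to know before renewing it.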
Finally, Google Analytics has a concept called "Goals." I haven't investigated this too thoroughly yet, but the upshot is that you can establish goals for your site (number of visits, amount of money spent, etc.) and the tool will tell you how far along you are on each goal.
A Very Satisfied Customer
As you can imagine, I don't think I'll be visiting the GeoCities Site Statistics page too often in the future. Google Analytics tells me far more about my site than Yahoo! does, even though the site lives on Yahoo!'s own servers. That's rather ironic, since one would think that Yahoo! would have the edge (after all, they can see the server logs).
As has always been the case on the web, though, switching costs are near zero, and the best product usually wins out. I'm officially hooked on Google Analytics.