As the title suggests I want to track the following types of content in Google Analytics:
- XML Files – e.g. Sitemap.xml, RSS feeds or subscriptions, template files, data files loaded by Flash etc.
- PHP Files – as in on the server, never outputting JavaScript to the client – pure 100% unadulterated server-side
- Images – directly linked images
- PDFs – again direct links
- Video files
- Flash files
- Etc etc
I do not want to have to send the Urchin JavaScript snippet to the browser for any of these and I’d like to specify the object hit in a little more detail.
Is that too hard?
Apparently not.
So – here’s what I did:
- Redirected the specific page/file request to a PHP page in Apache using .htaccess and Mod_Rewrite
- Wrote a PHP page to trigger a hit in Google Analytics and then output the correct data using the required file header information
Here’s how you do it:
Let’s start with a simple example and track just one file, sitemap.xml. Once you’ve got this sorted you can expand it to cover any files dynamically, but then you get into more complex regular expressions.
1: Create your PHP page
First we want to capture all the requests for sitemap.xml and forward them through to a PHP page where we can run our tracking code. So create a PHP page and call it something like “trackSitemap.php” and in it just echo something like “this is the sitemap” for now. Hint – to make it easier save it in the same directory as your sitemap.xml file and edit the .htaccess file in that directory.
2: Redirect the request using Apache Mod_Rewrite
Open your .htaccess file and use the following – you will need to edit the paths/addresses to suit:
# INTERCEPT SITEMAP XML FILE
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(.*)mydomain\.com$ [NC]
RewriteRule ^sitemap\.xml$ trackSitemap.php [L]
3: Test the redirect
Try browsing to yourdomain.com/sitemap.xml – if you see your test message (“this is the sitemap”) then the redirect works and you can carry on. If not, you may need to tweak the paths and addresses in your .htaccess file. For example, I track my site’s blog sitemap, which is in the /blog directory, so I used the above code in the .htaccess file within the /blog directory and not in the root.
4: Trigger the Google Analytics tracking in your PHP page
Paste the following code in your PHP page:
# Track using Google Analytics
$ga_uid='UA-XXXXXXX-X'; // Enter your unique GA Urchin ID (utmac)
$ga_domain='mydomain.com'; // Enter your domain name/host name (utmhn)
$ga_randNum=mt_rand(1000000000,2147483647);// Creates a random request number (utmn), capped so it also fits a 32-bit integer
$ga_cookie=rand(10000000,99999999);// Creates a random cookie number (cookie)
$ga_rand=rand(1000000000,2147483647); // Creates a random number below 2147483647 (random)
$ga_today=time(); // Current Timestamp
$ga_referrer=isset($_SERVER['HTTP_REFERER']) ? urlencode($_SERVER['HTTP_REFERER']) : '-'; // Referrer url, URL-encoded ('-' when the request has none)
$ga_userVar=''; // Enter any variable data you want to pass to GA or leave blank
$ga_hitPage='/blog/sitemap.xml'; // Enter the page address you want to track
$gaURL='http://www.google-analytics.com/__utm.gif?utmwv=1&utmn='.$ga_randNum.'&utmsr=-&utmsc=-&utmul=-&utmje=0&utmfl=-&utmdt=-&utmhn='.$ga_domain.'&utmr='.$ga_referrer.'&utmp='.$ga_hitPage.'&utmac='.$ga_uid.'&utmcc=__utma%3D'.$ga_cookie.'.'.$ga_rand.'.'.$ga_today.'.'.$ga_today.'.'.$ga_today.'.2%3B%2B__utmb%3D'.$ga_cookie.'%3B%2B__utmc%3D'.$ga_cookie.'%3B%2B__utmz%3D'.$ga_cookie.'.'.$ga_today.'.2.2.utmccn%3D(direct)%7Cutmcsr%3D(direct)%7Cutmcmd%3D(none)%3B%2B__utmv%3D'.$ga_cookie.'.'.$ga_userVar.'%3B';
$handle = @fopen($gaURL, "r"); // request the GA tracking GIF to register the hit
if ($handle) {
$fget = @fgets($handle); // read the response so the request completes
@fclose($handle); // close the connection
}
header('Content-Type: text/xml'); // set the document content type for the output
$xml = @file_get_contents('sitemap.xml'); // get the actual file to output
echo $xml; // output the data
NOTE: The @ symbol before each function suppresses any errors – you can remove these if you want to debug the code. Opening a URL with fopen also needs allow_url_fopen to be enabled in your PHP configuration.
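If you are worried about a slow response from Google holding up your sitemap, one option is to wrap the request in a small function with a timeout. This is only a sketch: sendGaHit and its two-second default are my own made-up names and values, not anything Google provides, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the original script): request the
// tracking GIF with a short timeout so a slow reply cannot stall the page.
function sendGaHit(string $gaURL, float $timeoutSeconds = 2.0): bool
{
    // The 'http' context option 'timeout' is a read timeout in seconds.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // @ hides the warning on failure; the return value reports it instead.
    $response = @file_get_contents($gaURL, false, $context);

    return $response !== false;
}
```

You would then call sendGaHit($gaURL); in place of the three file-handle lines, and ignore the result if you do not care whether the hit registered.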
5: Alter the above code to suit
Change the $ga_uid to your Google Analytics Urchin reference and alter the $ga_domain to be your primary domain name/hostname – you can include the www. if you need.
Change the $ga_hitPage to be the page you want to show a hit for – for example if you specify “www.mydomain.com” as your $ga_domain and you want to track www.mydomain.com/blog/sitemap.xml then enter “/blog/sitemap.xml” as your hitPage. If you want to log any user defined parameters then you can add them to the $ga_userVar variable.
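If you find yourself changing these values for several files, it may be easier to build the tracking URL in a function. The sketch below assumes the same utm parameters as the script above and lets http_build_query handle the URL-encoding. buildGaHitUrl is a name I made up for illustration, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: assemble the __utm.gif URL from the same values
// used in the script above, with every value URL-encoded.
function buildGaHitUrl(string $uid, string $domain, string $hitPage,
                       string $referrer = '-', string $userVar = ''): string
{
    $cookie = mt_rand(10000000, 99999999);      // random cookie number
    $rand   = mt_rand(1000000000, 2147483647);  // random number up to 2147483647
    $today  = time();                           // current timestamp

    // The cookie string from the script above, written unencoded.
    $utmcc = sprintf(
        '__utma=%1$d.%2$d.%3$d.%3$d.%3$d.2;+__utmb=%1$d;+__utmc=%1$d;+'
        . '__utmz=%1$d.%3$d.2.2.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);+'
        . '__utmv=%1$d.%4$s;',
        $cookie, $rand, $today, $userVar
    );

    $params = [
        'utmwv' => 1,
        'utmn'  => mt_rand(1000000000, 2147483647), // random request number
        'utmsr' => '-', 'utmsc' => '-', 'utmul' => '-',
        'utmje' => 0,   'utmfl' => '-', 'utmdt' => '-',
        'utmhn' => $domain,
        'utmr'  => $referrer,
        'utmp'  => $hitPage,
        'utmac' => $uid,
        'utmcc' => $utmcc,
    ];

    // PHP_QUERY_RFC3986 encodes values the same way rawurlencode() does.
    return 'http://www.google-analytics.com/__utm.gif?'
         . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

$gaURL = buildGaHitUrl('UA-XXXXXXX-X', 'mydomain.com', '/blog/sitemap.xml');
```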
Alter the path and filename of your actual XML file (the file_get_contents call that sets $xml) so that it is either relative to the trackSitemap.php file or an absolute file path.
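When you expand this to other file types, the Content-Type header has to change to match the file you output. A rough way to pick it is a lookup by file extension. The function name and the list of types below are my own illustration, so extend or trim the table to suit the files you actually track.

```php
<?php
// Hypothetical lookup: choose a Content-Type from the file extension.
function contentTypeFor(string $filename): string
{
    $types = [
        'xml' => 'text/xml',
        'pdf' => 'application/pdf',
        'jpg' => 'image/jpeg',
        'png' => 'image/png',
        'gif' => 'image/gif',
        'swf' => 'application/x-shockwave-flash',
    ];

    // Lower-case the extension so "Brochure.PDF" matches too.
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    // Fall back to a generic binary type for anything not listed.
    return $types[$ext] ?? 'application/octet-stream';
}

// Usage in the tracking page, before echoing the file:
// header('Content-Type: ' . contentTypeFor('sitemap.xml'));
```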
6: Test it!
Try browsing again to your Sitemap.xml file and hopefully you should see the XML file as normal – if so then your script has processed correctly and output the right file. If not then double check through your code above.
Now we all know that it takes a little while for Google Analytics to show you the new hit data so check it the next day to make sure you see the hits on your Sitemap file clocking up.
EXTRA HINT: I also use my own custom tracking on these files so I can see realtime tracking data coming in – if you want to double check your code is working then you can obviously add extra code into the PHP file to trigger any functions you like (database logging, email calls etc etc).
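As a rough idea of what that extra code could look like, here is a minimal sketch that appends one line per hit to a log file. It is not my actual tracking code: logHit, the tab-separated format and the log path are all assumptions for illustration.

```php
<?php
// Hypothetical logger: append one tab-separated line per hit so you can
// watch requests arrive without waiting for Google Analytics to update.
function logHit(string $logFile, string $hitPage, string $referrer = '-'): bool
{
    // ISO 8601 timestamp, page and referrer on a single line.
    $line = date('c') . "\t" . $hitPage . "\t" . $referrer . PHP_EOL;

    // FILE_APPEND adds to the end of the file; LOCK_EX stops two
    // simultaneous requests from interleaving their lines.
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Usage in the tracking page (the log path is an assumption):
// logHit(__DIR__ . '/hits.log', '/blog/sitemap.xml');
```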
Let me know how you get on using this by commenting below or Tweet me @mjdigital