The checkout process on one of my sites has multiple steps, and I use the same backend file (checkout.php) to process every step. To track each of the pages separately in Google Analytics, I set a custom page name / event for each step in the checkout process.
For the first step, for example:
With the old urchin.js tracker:
urchinTracker("/checkout-S1");
Or with the new ga.js tracker:
pageTracker._trackPageview("/checkout-S1");
Example:
| Page | Url | Page name / event sent to GA |
|---|---|---|
| Verify Items | usablelayout.com/checkout.php | /checkout-S1 |
| Enter mailing address | usablelayout.com/checkout.php | /checkout-S2 |
| Enter payment info | usablelayout.com/checkout.php | /checkout-S3 |
| Thank you / confirmation | usablelayout.com/checkout.php | /checkout-S4 |
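The mapping in the table can be sketched as a small helper. This is only an illustration: `CHECKOUT_STEPS`, `checkoutPageName`, and `trackCheckoutStep` are hypothetical names that do not come from the site's real code, and the tracker is assumed to be the `pageTracker` object that ga.js creates.

```javascript
// Hypothetical lookup of checkout steps, mirroring the table above.
var CHECKOUT_STEPS = {
  1: "Verify Items",
  2: "Enter mailing address",
  3: "Enter payment info",
  4: "Thank you / confirmation"
};

// Builds the virtual page name sent to Google Analytics for a step.
function checkoutPageName(step) {
  if (!Object.prototype.hasOwnProperty.call(CHECKOUT_STEPS, step)) {
    throw new Error("Unknown checkout step: " + step);
  }
  return "/checkout-S" + step;
}

// Reports the step to the tracker, if one is available, and returns
// the page name that was used.
function trackCheckoutStep(step, tracker) {
  var pageName = checkoutPageName(step);
  // Skip the call when ga.js has not loaded or the tracker is missing.
  if (tracker && typeof tracker._trackPageview === "function") {
    tracker._trackPageview(pageName);
  }
  return pageName;
}
```

On the "Enter mailing address" page, for instance, `trackCheckoutStep(2, pageTracker);` would send `/checkout-S2`.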
I have noticed requests in my server logs and in my page not found (404) error report showing that Googlebot tried to access /checkout-S1, /checkout-S2, ... on my server. Those paths exist only as page names sent to Google Analytics, so it appears that Google is using the data collected in Google Analytics to find new pages for Googlebot to crawl.
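For anyone who wants to check their own logs for the same pattern, here is a rough sketch. It assumes the access log is in Apache "combined" format and that the virtual page names follow the `/checkout-S<number>` pattern used here. `findBotHitsOnVirtualPages` is a hypothetical helper, not part of any library.

```javascript
// Assumes Apache "combined" log format; adjust the pattern for other formats.
// Captures: request path, HTTP status, and the final quoted field (user agent).
var LINE_PATTERN = /"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) .*"([^"]*)"$/;

// Returns the Googlebot requests that hit analytics-only page names.
function findBotHitsOnVirtualPages(logLines) {
  var hits = [];
  logLines.forEach(function (line) {
    var match = LINE_PATTERN.exec(line.trim());
    if (!match) return; // not a GET/HEAD line in the expected format
    var path = match[1];
    var status = Number(match[2]);
    var userAgent = match[3];
    if (/^\/checkout-S\d+$/.test(path) && /Googlebot/i.test(userAgent)) {
      hits.push({ path: path, status: status });
    }
  });
  return hits;
}
```

Passing in the lines of an access log returns one entry per matching request, with the requested path and the status code the server returned.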
Has anyone else seen this sort of thing happen? We all knew Google would (at some point) analyze the data it collects from our sites via Google Analytics and start using it for its main search algorithms, but this looks like evidence that it is already doing so.
What are your thoughts? What other data that Google collects from our sites through Analytics is being used for its search algorithms?
Comments
Not to worry
Someone just pointed me to this. This link may be helpful: http://blogs.zdnet.com/Google/?p=39 . The most likely explanation is that Googlebot can scan JavaScript to discover new urls. But even then, we'd rather not discover Analytics/Urchin-related urls, so I passed this url on to some folks from the crawl team to check into.