Ryan Cramer

Search Accessibility

Using Google's First Click Free with PHP

Solving subscriber-only search engine indexing

Content on subscription websites is largely invisible to search engines. Google can’t index it because you have to log in to view it. How do you retain the benefits of being indexed by Google and still keep the content subscriber-only?

Some have solved this problem by detecting Googlebot and then opening up the content to it. By doing this, the content gets indexed by Google, but real users still have to log in to view it. That sounds like a good solution, but it violates Google’s webmaster guidelines. Googlebot doesn’t always identify itself; it will sometimes visit your site pretending to be a real visitor. If it finds that you are delivering different content to real visitors than you are delivering to Googlebot, a penalty is likely to be issued.

Google’s Solution

Google recently launched a service called First Click Free. With this service, Google is essentially giving the green light to the solution mentioned above, but only if you also deliver the full page content when the user clicks to your site from Google. That means you must show the content without requiring a login or subscription. It’s called “First Click Free” because only that first click from Google is bound to this policy. If the user clicks further in your site, then you are under no obligation to continue delivering free content … unless the user clicks from Google again.

How to detect Googlebot and Google clicks

Google’s information says that we should detect clicks by checking the referrer. If the referrer is Google, then we allow the user. To detect Googlebot, we need to check both the user agent and the IP. If the user agent contains “Googlebot”, then we look up the IP to make sure that it really belongs to Google.

Below is a function I’ve put together that performs this task with PHP. It should be used as a condition in code that checks whether or not to allow a user to view subscriber-only content. Because some of the checks take more time than others, this function does quick checks with stripos() first and only then confirms them with the more time-consuming functions: preg_match() and gethostbyaddr().

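function isGoogleClick() {
    // Click from Google: quick stripos() check, confirmed with preg_match() (the exact pattern is an assumption)
    if(!empty($_SERVER['HTTP_REFERER']) && stripos($_SERVER['HTTP_REFERER'], 'google.') !== false)
        if(preg_match('!^https?://([^/]+\.)?google\.[a-z]{2,3}(\.[a-z]{2})?(/|$)!i', $_SERVER['HTTP_REFERER']))
            return true;
    // Googlebot: quick user agent check, confirmed with a reverse DNS lookup on the visitor's IP
    if(!empty($_SERVER['HTTP_USER_AGENT']) && stripos($_SERVER['HTTP_USER_AGENT'], 'Googlebot') !== false)
        return (bool) preg_match('/\.googlebot\.com$/i', (string) gethostbyaddr($_SERVER['REMOTE_ADDR']));
    return false;
}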
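
As a rough sketch of how a check like this might gate a page: the helper below returns the full text for subscribers and for clicks from Google, and a teaser for everyone else. The name renderArticle() and its parameters are hypothetical. The two flags are passed in as plain booleans so the sketch runs on its own; in a real page the Google flag would come from isGoogleClick() and the subscriber flag from your own login code.

```php
<?php
// Hypothetical helper for illustration: decides what a visitor gets to see.
function renderArticle($fullText, $teaser, $isSubscriber, $isGoogleClick) {
    // Subscribers always get the full text; so does a first click from Google
    if ($isSubscriber || $isGoogleClick) {
        return $fullText;
    }
    // Everyone else gets the teaser plus a login prompt
    return $teaser . "\n[Log in or subscribe to read the rest]";
}

// Flags are hard-coded here so the sketch runs on its own.
// In practice: $isGoogleClick = isGoogleClick(); $isSubscriber comes from your session code.
echo renderArticle("Full article text", "First paragraph only", false, true), "\n";
```

Keeping the decision in one place means any change to the policy only has to be made once.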

This is just a code sample to use as a starting point and it’s not guaranteed to be perfect. Use it at your own risk. If you find a bug or have an improvement please let me know.
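
One possible improvement to the IP check is to confirm the reverse lookup with a forward lookup, which is the approach Google describes for verifying Googlebot: resolve the IP to a host name, check that the name ends in googlebot.com or google.com, then resolve that name back and make sure it leads to the same IP. The sketch below is a minimal version of that idea, not a definitive implementation. The function name isVerifiedGooglebotIp() and the two optional lookup parameters are assumptions added for illustration (they make the function testable without a network); by default it uses PHP’s gethostbyaddr() and gethostbynamel(). Note that gethostbynamel() only returns IPv4 addresses.

```php
<?php
/**
 * Sketch: forward-confirmed reverse DNS check for an IP claiming to be Googlebot.
 * The function name and the injectable $reverse/$forward lookups are assumptions
 * for illustration; they default to PHP's own DNS functions.
 */
function isVerifiedGooglebotIp($ip, $reverse = 'gethostbyaddr', $forward = 'gethostbynamel') {
    // Step 1: reverse lookup. gethostbyaddr() returns the IP unchanged on
    // failure and false on malformed input.
    $host = call_user_func($reverse, $ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }
    // Step 2: the host name must end in .googlebot.com or .google.com
    if (!preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }
    // Step 3: the forward lookup must lead back to the same IP.
    // gethostbynamel() returns a list of IPv4 addresses, or false on failure.
    $ips = call_user_func($forward, $host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

In a live request this would be called as isVerifiedGooglebotIp($_SERVER['REMOTE_ADDR']). Because it performs two DNS lookups, it is best run only after the quick user agent check has passed.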

—Ryan Cramer