Date: 2024-05-29

A leak at Google has revealed the inner workings of its search engine and how it ranks content. Thousands of pages of internal documents from Google’s Content API Warehouse were leaked on GitHub in March by an automated bot named yoshi-code-bot. Those documents were then shared earlier this month by startup co-founder Rand Fishkin.

Fishkin told The Verge that a source had shared the 2,500 pages of documents with him in the hope of disproving Google’s “lies” about how the search algorithm works.

While Google hasn’t responded to the leak or to allegations that it was dishonest about the Search algorithm, Fishkin told the news outlet that the tech giant has not disputed the documents’ authenticity, and that a Google employee asked him to change some of the wording used to characterise the event.

The leaked documents contain details about the kind of data Google collects and uses, which sites Google ranks highly for sensitive topics, and how it handles smaller websites. The methods described in the documents differ from public statements made by Google executives on these subjects.

For example, while Google officials have said multiple times that Google Chrome data is not used in search ranking, the leaked documents mention Chrome specifically as a parameter in many sections. Another factor Google had played down is the author byline. Google had previously stated that bylines should be written for readers, not for rankings, but it appears that Google does at least keep track of this attribute, even though it is unclear whether it is used as a ranking metric.

The secrecy around the Search algorithm has given rise to an industry of marketing and SEO experts who help companies navigate rankings on Google. Fishkin, who has worked in SEO for more than a decade, called out such experts for “uncritically repeating Google’s public statements” and asked them to do better.

The Hindu has not independently verified these leaked documents.