Google Passage Indexing and BERT

June 3, 2020
Google does not index passages separately. They understand the content while scraping the whole page at the same time.

You do not optimize for BERT. If your site is well optimized, you are fine; this is just about Google being able to surface content within poorly optimized pages.

Everything Google does to update its search engine algorithm sends ripples across the internet.

Even the smallest adjustments to the underlying technology responsible for producing billions of results a day with lightning-like speed have an outsized impact that is felt almost immediately.

At the same time, every now and again Google drops the equivalent of a nuclear bomb into their algorithm – changing EVERYTHING about how the search giant operates, rewriting the rules of search engine optimisation, traffic generation, and so much more.

That's exactly what happened when Google introduced BERT into Search in October 2019.

A revolutionary new technology that uses artificial intelligence to more efficiently process natural language typed into the Google search box (and elsewhere), BERT is maybe the most advanced language recognition technology anywhere on the planet.

Originally trained on the English version of Wikipedia to recognise sentence structure, grammar, and even colloquial speech and more conversational language, today BERT can read, analyse, and understand 104 languages around the world, and it's only getting smarter every day.

The impact that BERT has on search engine optimisation is significant, and we are only 12 months in!

Why BERT is Necessary

One challenge that BERT seems to have overcome, with almost 100% (up from last year's 10%) of English search queries now analysed by it, is the complexity of the English language. The world of linguistics is incredibly complex; words with multiple meanings can be difficult for a machine to understand when used together with others in a sentence. Ambiguity is something that 'machines' have trouble getting to grips with, so when a neural network such as BERT seems to have mastered it, that is extraordinarily impressive.

  • Lexical ambiguity: A single word has several possible meanings, and only the rest of the sentence settles which one is meant. "Bass", for example, can be a fish or a low-pitched sound.
  • Polysemy and homonymy: Ambiguity at the vocabulary level. "Get" can mean acquire or understand. "Rose" can be a flower or the past tense of "rise".
  • Coreference resolution: Analyses pronouns like "they," "he," "it," "them," "she" to identify what, or who, is being talked about.

Source: How BERT processes natural language understanding
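
To make the "bass" example concrete, here is a minimal sketch of how context can settle the meaning of an ambiguous word. It is a toy word-overlap heuristic and not how BERT works (BERT learns contextual representations from huge amounts of text). The sense list and context words below are hand-picked assumptions for illustration only.

```python
import re

# Hypothetical mini sense inventory: each sense of "bass" is described by a
# handful of hand-picked context words (an assumption for illustration).
SENSES = {
    "bass (fish)": {"fish", "lake", "river", "caught", "fishing", "water"},
    "bass (sound)": {"guitar", "music", "band", "speaker", "loud", "play"},
}


def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose context words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    # On a tie, max() keeps the first sense listed in the inventory.
    return max(senses, key=lambda sense: len(senses[sense] & words))


print(disambiguate("He caught a huge bass in the lake"))   # bass (fish)
print(disambiguate("Turn up the bass on that speaker"))    # bass (sound)
```

The same word gets a different reading depending on its neighbours, which is the basic idea behind context-aware language models, just done here in the crudest possible way.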

How Google decides your page is relevant

  1. Analyse passages to understand "similarity and relatedness". For example, in the sentence "A ____ is a vehicle", either "car" or "bus" fits the gap. A car is similar to a bus because both are vehicles, whereas a car is related to concepts such as "road" and "driving".
  2. Assign categories and topics to your page, then rank by relevance score (the scoring system itself is not public, so we will never know exactly how it works).
  3. Understand the queries and respond with pages on relevant topics.
  4. Show the passage directly as a "featured snippet" if the passage can answer the question.
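
The steps above can be sketched very roughly in code. Google's real scoring system is not public, so this is only an illustration under stated assumptions: it swaps in a simple bag-of-words cosine similarity as the "relevance score", and the page passages and the `rank_passages` helper are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query, passages):
    """Return (score, passage) pairs, highest score first."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(p))), p)
        for p in passages
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical page split into passages (illustrative text only).
passages = [
    "Our company was founded in Sydney and has grown every year.",
    "To descale a kettle, fill it with equal parts water and vinegar and boil.",
    "Contact our team for a quote on kitchen appliances.",
]

best_score, best_passage = rank_passages("how to descale a kettle", passages)[0]
print(best_passage)
```

The passage that actually answers the query comes out on top, even though the rest of the page is about something else. A production system would rely on learned language understanding rather than raw word overlap.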

Passage Indexing CONFIRMED as a Ranking Factor

Even more interesting, though, is the way that Google has leveraged BERT to better index different passages on websites and pages all over the web.

When Passage Indexing first hit the search engine scene (it was announced in October 2020) it also sent shockwaves throughout the search engine community.

Google now used its potent algorithmic and analysis technology in conjunction with BERT to rank specific passages from individual pages across entire websites, giving even more detailed and relevant answers than was ever possible before.

Now search engine users weren't just provided with links to pages that may contain the information they were after.

No, with Passage Indexing they were provided direct links to the exact passages (highlighted, too) that pertained to their specific search query.

The popularity of Passage Indexing was apparent immediately.

Google's search engineers, who are notoriously tight-lipped about what influences search engine rankings, were more than happy to tell the world that this was a huge piece of the puzzle.

If you're looking to rank highly in Google these days you absolutely must be focused on optimising with Passage Indexing in mind, creating the most highly relevant content possible – content written for people and not search engine spiders.

Passage Indexing and BERT - A Sea Change in the World of SEO

At the end of the day, Google has always been about one thing above all else, and that's providing the most highly relevant answers to whatever is punched into their search box.

In the "wild west" days of Google, it was possible to rank highly (the top spot, even) for truly competitive results by keyword stuffing your content far beyond anything recognisable to the human eye.

Today, though – in large part thanks to the major advances made possible by BERT and Passage Indexing – nothing could be further from the truth.

Keyword stuffing is going to destroy your rankings, going so far as to shuffle you right to the back of the Google sandbox if you abuse this approach.

No, instead you're going to need to carefully create content that is highly relevant to the search strings you want to rank highly for.

Tight content explicitly written with humans in mind (free of fluff and filler) will dominate over the next few years thanks to BERT and Passage Indexing.

You'll also need to focus on building credibility and authority to rank highly, but that's more of an off-page search engine optimisation approach than anything else. Zero in on creating smart, relevant, and concise content for your readers and you should be able to shoot up the ranks in no time thanks to BERT and Passage Indexing.


Why do we care? The very basics, such as relevant keywords, metadata, title tags etc., have not changed (if you are doing it properly). What has changed, or should change, is the way that pages are written. BERT cannot be optimised for, but if you are writing clearly and concisely with no fluff (again, this should be the norm anyway), then there is no reason why you cannot be the Ernie to Google's BERT.

When it comes to passage 'indexing', answering a question in a single paragraph, rather than spreading it out over several passages, is going to help you a tremendous amount here. A summary, for instance, at the end of the page (or even at the start, followed by a more in-depth explanation, depending on your site style) is going to capture Google's attention better than a more protracted answer.
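
As a rough way to check your own pages against this advice, the sketch below measures how many of the query's words appear in the single best paragraph of a page. This is a hypothetical self-check, not a Google metric; the function name, the sample query, and both sample pages are assumptions made up for illustration.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def best_paragraph_coverage(query, page_text):
    """Fraction of query terms found in the single best paragraph."""
    query_terms = tokenize(query)
    # Paragraphs are assumed to be separated by a blank line.
    paragraphs = [p for p in page_text.split("\n\n") if p.strip()]
    if not query_terms or not paragraphs:
        return 0.0
    return max(
        len(query_terms & tokenize(p)) / len(query_terms) for p in paragraphs
    )


query = "reset router password"

# The answer sits in one paragraph.
concentrated = (
    "To reset your router password, hold the reset button for ten seconds.\n\n"
    "More tips follow."
)

# The same answer is spread over three paragraphs.
spread = (
    "Find the reset button on the back.\n\n"
    "Your router will restart.\n\n"
    "Choose a new password afterwards."
)

print(best_paragraph_coverage(query, concentrated))  # 1.0
print(best_paragraph_coverage(query, spread))
```

The concentrated page has one paragraph covering every query term, while on the spread-out page no single paragraph covers more than a third of them.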


More on Google passage indexing as of 21st Nov 2020: Martin Splitt from Google was interviewed about passage indexing and reiterated that (1) you do not optimize for it; if your site is well optimized, you are fine, because this is just about Google being able to surface content within poorly optimized pages, and (2) the featured snippet is pulled out as an "instant answer" to a question, while passage indexing is not about answering a specific question like that. Google will show a featured snippet first if "Google has enough pages containing concise answers".

Dean Long | Expert in Growth Marketing

Dean Long is a Sydney-based creative marketing and communication professional with expertise in paid search, paid social, affiliate and e-commerce. He's also a distinct MBA Graduate from Western Sydney University.

