
Googlebot

![[googlebot-workflow.png]]

Overview

  1. Find URLs: Discover URLs from links on websites or from XML sitemaps.
  2. Add to Crawl Queue: Each discovered URL is queued as a crawl task to be executed later.
  3. HTTP Request: The crawler makes an HTTP request, reads the response headers, and acts according to the returned status code:
    • 200 - it crawls and parses the HTML.
    • 30X - it follows the redirect.
    • 40X - it notes the error and does not load the HTML.
    • 50X - it may come back later to check whether the status code has changed.
  4. Render Queue: Rendering costs more resources than fetching the HTML, so your site might not be rendered.
  5. Ready to be indexed: If all criteria are met, the page is eligible to be indexed and shown in search results.
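
The queue and status-code steps above can be sketched as a toy crawler. This is only an illustration of the workflow in these notes, not how Googlebot is actually implemented: the names `action_for_status`, `crawl`, `fake_fetch`, and `FAKE_SITE`, the action labels, and the example.com URLs are all hypothetical, and no real network requests are made.

```python
from collections import deque


def action_for_status(status: int) -> str:
    """Map an HTTP status code to the crawler action listed in step 3."""
    if status == 200:
        return "parse"            # crawl and parse the HTML
    if 300 <= status < 400:
        return "follow_redirect"  # follow the redirect target
    if 400 <= status < 500:
        return "record_error"     # note the error, do not load the HTML
    if 500 <= status < 600:
        return "retry_later"      # may come back later
    return "ignore"               # assumption: codes not covered by the notes


def crawl(seed_urls, fetch):
    """Work through a crawl queue.

    `fetch(url)` is a hypothetical callable returning
    (status, redirect_target_or_None, links_found_in_html).
    """
    queue = deque(seed_urls)  # step 2: the crawl queue
    seen = set(seed_urls)     # avoid queueing the same URL twice
    log = []
    while queue:
        url = queue.popleft()
        status, location, links = fetch(url)
        action = action_for_status(status)
        log.append((url, action))
        if action == "follow_redirect" and location:
            targets = [location]
        elif action == "parse":
            targets = links   # step 1: URLs found on the page
        else:
            targets = []
        for target in targets:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return log


# A made-up site standing in for real HTTP responses.
FAKE_SITE = {
    "https://example.com/": (
        200, None, ["https://example.com/old", "https://example.com/missing"]
    ),
    "https://example.com/old": (301, "https://example.com/new", []),
    "https://example.com/new": (200, None, []),
    "https://example.com/missing": (404, None, []),
}


def fake_fetch(url):
    # Unknown URLs pretend to be a server error.
    return FAKE_SITE.get(url, (503, None, []))


if __name__ == "__main__":
    for url, action in crawl(["https://example.com/"], fake_fetch):
        print(url, "->", action)
```

The sketch stops at the crawl stage. The render queue and indexing decision (steps 4 and 5) are left out because the notes do not specify the criteria involved.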

Readings

Reference