You may have heard the phrase crawl budget and wondered what it means for your website. If you have a relatively small site, it’s not something that you should be overly concerned about. Having said that, understanding what it is can help you further optimise your pages and prepare your business for better online success.
Crawl budget is all about how crawlers such as Googlebot decide which pages to fetch, how often, and how much resource to spend on each site. New pages such as blog posts and general website content are usually crawled on the same day they are published, and each fetch takes just a few seconds at most.
For large sites with thousands of pages or frequently updated content, crawl budget becomes increasingly important, and various factors determine whether pages are crawled and indexed promptly.
Googlebot: What is it?
Whenever you publish content to your website, Google and other search engines need to assess and rank it. Google ‘fetches’ pages and checks them using a web crawler program called Googlebot. With billions of pages out there, you can understand the need for automation and for making this process as productive and efficient as possible.
What Factors Affect Crawl Budget?
Googlebot’s job is to crawl pages on the web so that Google can index and rank them. It must do this without degrading the experience of users visiting the site at the same time. This is generally solved with a crawl rate limit, which caps how much fetching the bot can do on a given site. The rate is determined by a couple of factors:
- Crawl health: If your site is slow for any reason, crawling is reduced. It increases again once page loading improves, which is why aspects such as server speed and response times are important.
- Setting limits: While Googlebot usually sets its own limit based on crawl health, a website owner can also restrict crawling, either for an entire site or for individual pages.
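As a sketch, this kind of site-wide or per-page restriction is typically done with a robots.txt file at the site root; the paths below are hypothetical examples, not recommendations for any particular site:

```
# robots.txt — hypothetical example paths
User-agent: Googlebot
Disallow: /staging/        # block an entire section of the site
Disallow: /print-version/  # block duplicate page variants

User-agent: *
Crawl-delay: 10            # some crawlers honour this directive; Googlebot ignores it
```

Note that robots.txt controls crawling, not indexing: a blocked URL can still appear in results if other sites link to it.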
How you make pages available for crawling is important, but other factors matter as well. For instance, the more popular your site becomes, the more often Googlebot is likely to visit it, so page download speed needs to keep pace with crawl demand. Pages that are no longer visited as regularly tend to be of less interest to crawlers and can fall down the rankings. And if you suddenly decide to move your website to a new URL, it can trigger greater crawling activity because everything has to be re-indexed.
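When a site does move to a new URL, the standard way to signal this to crawlers is a permanent (301) redirect from every old address to its new counterpart, so that the recrawl consolidates on the new pages. A minimal sketch as an nginx server block, assuming hypothetical domains:

```
# nginx — hypothetical domains, for illustration only
server {
    listen 80;
    server_name old-example.com;
    # 301 tells crawlers the move is permanent; $request_uri
    # preserves the path and query string of each old URL
    return 301 https://new-example.com$request_uri;
}
```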
Why Should You Be Interested in Crawl Budget?
While crawl budget can seem most relevant to larger, busier sites, your business will want to maximise the way it is indexed and make sure that each page ranks as high as possible. For instance, research has shown that sites with many low-value pages, including duplicate content or pages of substandard quality, can be crawled and indexed less. Other factors which can make a difference are:
- If your site has been hacked.
- Soft 404 errors, where a missing page returns a normal page and success status instead of a proper 404 error.
- Faceted navigation, which helps people filter your site by facets such as date and price range but can generate many near-duplicate URLs.
- Session identifiers in URLs, which produce duplicate content.
- Dynamically created links that produce what is called infinite space, something which is difficult for bots to crawl.
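Several of the crawl traps listed above can be kept away from crawlers with robots.txt rules that block parameterised URLs. The parameter names below are hypothetical, chosen purely to illustrate the pattern:

```
# robots.txt — hypothetical parameter names for illustration
User-agent: *
Disallow: /*?sessionid=    # session identifiers create duplicate URLs
Disallow: /*?sort=         # faceted/filtered views of the same listing
Disallow: /*?price=
Disallow: /calendar/       # a common source of "infinite space" pages
```

Major crawlers such as Googlebot support the `*` wildcard in Disallow rules, though not every bot does.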
Optimising a site for crawl bots is something that many businesses overlook, generally because they are focused on the content they are producing, traditional keywords and other SEO work. If you want to make the most of crawl time on your site, whether it’s a busy one or not, paying attention to these areas is important.