The Last Hurdle

We are a digital marketing agency offering full digital marketing services including website design and management, social media marketing, content writing, brand and logo design as well as traditional marketing services.

Robots.txt – The Bouncer at Your Website’s Door

[Image: a bouncer at a nightclub door with an arm outstretched, blocking entry: a visual metaphor for the robots.txt file]

Meet Your Website’s Doorman

If your sitemap is the guest list for your website, then robots.txt is the bouncer at the door. It’s a tiny but mighty text file that decides which guests (search engine crawlers) are allowed in, and which ones should be politely turned away. No clipboard, no velvet rope — just a line or two of code keeping things in order.

What Is Robots.txt (In Plain English)?

The robots.txt file lives in the root of your website (that’s the top-level folder — e.g. https://yourdomain.com/robots.txt). It contains a set of rules written in plain text, giving instructions to web crawlers (also known as bots or spiders).

Two key terms:

  • Crawling = bots scanning your website’s pages.
  • Indexing = adding those pages to search results.

Robots.txt controls crawling, not indexing — a crucial distinction. If another site links to a page you’ve blocked, it might still end up in Google, even if it’s not crawled.
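This is why a Disallow rule on its own won't keep a page out of Google. If you genuinely need a page excluded from search results, keep it crawlable and add a noindex tag to the page's head instead, for example:

<meta name="robots" content="noindex">

And counter-intuitively, don't also block that page in robots.txt: if crawlers can't fetch the page, they never get to see the noindex instruction.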

Why Robots.txt Matters for SEO

Robots.txt may be small, but it plays a big role:

  • Guides search engines to your good stuff – helps crawlers focus on your important pages.
  • Keeps junk out of search results – stops thank-you pages, test pages or duplicate content from wasting crawl budget.
  • Protects server resources – large sites can avoid being overwhelmed by too many bots rummaging around.
  • Connects with your sitemap – robots.txt can reference your sitemap, making sure crawlers know exactly where to look.

💡 Want to know more about sitemaps? Read our guide: https://www.thelasthurdle.co.uk/sitemaps-the-secret-roadmap-your-website-cant-do-without/.

Common Robots.txt Mistakes We See in Audits

We run a lot of audits at The Last Hurdle, and robots.txt errors are one of the usual suspects:

  1. Blocking the whole site
    A classic: Disallow: / means nothing gets crawled. Great during development, disastrous if left live.
  2. Forgetting to remove development rules
    We often find “temporary” disallows that became permanent by accident.
  3. Blocking CSS and JavaScript
    Search engines need to see your site the way users do. If your robots.txt blocks essential resources, your site might look broken to crawlers.
  4. Not referencing the sitemap
    This is a missed trick — adding a sitemap line makes your robots.txt file far more useful.
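To make mistake number one concrete, the difference between blocking everything and allowing everything is a single character. This:

User-agent: *
Disallow: /

blocks every page from every crawler, while this:

User-agent: *
Disallow:

(an empty Disallow) permits everything. It's an easy slip to make when moving a site out of development.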

[Image: a man against a bright yellow background peering intently at his laptop: a robots.txt file gone awry]

How to View and Edit Your Robots.txt in WordPress

  • Check it: Just type https://yourdomain.com/robots.txt into your browser. If one exists, you’ll see it straight away.
  • Create or edit it: WordPress serves a basic virtual robots.txt by default, but there's no physical file to edit. Plugins like Yoast SEO or Rank Math let you create and edit robots.txt safely from your dashboard.
  • Manual option: Advanced users can add a robots.txt file directly to the root folder via FTP or their hosting control panel.

Best Practice Tips for a Healthy Robots.txt

  • Allow important sections (homepage, blog posts, products).
  • Disallow admin areas (/wp-admin/), cart/checkout duplicates, or staging subdomains.
  • Always include your sitemap reference, e.g.:

Sitemap: https://yourdomain.com/sitemap.xml

  • Keep it simple. Over-engineered rules often backfire.
  • Test changes using the robots.txt report in Google Search Console.

A Safe Starting Point: Sample Robots.txt for WordPress

If you’re not sure where to start, here’s a simple, safe “starter pack” robots.txt file for most WordPress sites:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://yourdomain.com/sitemap.xml

Think of it as the vanilla ice cream of robots.txt files: not flashy, but it gets the job done without upsetting anyone.

How to Test Robots.txt (Before You Break Something)

Always test your file before and after making changes. Two handy options:

  • Google Search Console – the robots.txt report shows how Google fetched and parsed your file, and flags any rules it couldn’t understand.
  • TechnicalSEO.com’s robots.txt tester – a free, easy-to-use tool for quick checks.
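
If you're comfortable with a little code, Python's standard library also includes a robots.txt parser, handy for sanity-checking rules before you upload them. This sketch tests the starter file from above against a few example URLs (yourdomain.com and the page paths are placeholders; note that Python's parser applies rules top to bottom, so the Allow line goes first):

```python
from urllib.robotparser import RobotFileParser

# The starter robots.txt from above. Python's parser uses
# first-match-wins ordering, so Allow must come before Disallow here.
rules = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Admin pages are blocked for all crawlers...
print(rp.can_fetch("*", "https://yourdomain.com/wp-admin/options.php"))    # False
# ...except the AJAX endpoint many WordPress themes rely on...
print(rp.can_fetch("*", "https://yourdomain.com/wp-admin/admin-ajax.php")) # True
# ...and ordinary content stays crawlable.
print(rp.can_fetch("*", "https://yourdomain.com/blog/robots-txt-guide/"))  # True
```

Be aware that Google's own parser is more forgiving than Python's (it picks the most specific matching rule regardless of order), so treat this as a quick local check, not the final word.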

Myth Busting: Robots.txt Edition

Let’s clear up a few common misunderstandings:

  • Myth: Robots.txt hides pages from Google.
    Truth: It only controls crawling. If someone links to your “hidden” page, it may still appear in results.
  • Myth: Every site needs a complex robots.txt.
    Truth: For most sites, a simple file with a sitemap reference and a few disallow rules is all you need.
  • Myth: Robots.txt improves rankings.
    Truth: It won’t boost you up the charts directly, but it ensures crawlers focus on the right content, which helps your SEO efforts pay off.

Help With Robots.txt

Robots.txt may not look like much, just a few lines of text, but it’s one of the unsung heroes of website health. Think of it as your digital doorman: quietly working in the background to keep things organised, efficient, and welcoming for the right visitors.

Not sure if your robots.txt is working for you or against you? Let The Last Hurdle take a look. We’ll check your setup, make sure search engines are crawling the right pages, and give your site a clean bill of health.

We don’t just fix broken robots.txt files; we help you build a long-term SEO strategy that keeps your site running smoothly. Contact us today on 01604 654545 or email hello@thelasthurdle.co.uk and let’s get your website back on the guest list where it belongs.
