Robots.txt WordPress allow and disallow


This article collects my personal experience and the latest information I have found on the internet. The robots.txt file and the instructions it contains can affect the SEO of your website. In my opinion, creating a robots.txt file and placing it on your website is an integral part of the long on-page SEO process.

What does Google want from you?

  • Google fetches everything.
  • Google renders your pages completely.
  • Google likes CSS and JavaScript files.

Summary: Google wants to know your entire website. This helps it better understand what the site is about and determine its value.
Tip: If you do not know what to block for Google’s crawler, do not block anything.


Robots.txt WordPress allow

From articles by various companies (companies that do SEO professionally, not bloggers who write about SEO) I learned that the entire robots.txt file should look like this:

User-Agent: *

This means that you allow crawling of the entire site (robots.txt controls crawling, not indexing as such). Googlebot will crawl everything.
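If you want to check a robots.txt file like this yourself, here is a minimal sketch using Python's standard `urllib.robotparser` module. The helper name `is_allowed` and the `example.com` URLs are my own assumptions for illustration, and Python's parser is not Google's, so treat the result as a sanity check rather than proof of how Googlebot behaves.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The whole file from the article: one user-agent line and no blocking rules.
ALLOW_ALL = "User-Agent: *"

# Hypothetical paths on a hypothetical site, used only for the check.
for path in ("/", "/author/john-doe/", "/wp-content/themes/style.css"):
    print(path, is_allowed(ALLOW_ALL, "https://example.com" + path))
```

With no Disallow rules in the file, every path in the loop is reported as allowed.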

Robots.txt WordPress disallow

Of course, you can block Googlebot. The robots.txt file should then contain a line such as:

Disallow: /search/

Again, the professional articles suggest that in many cases it makes no sense to put blocking rules in a robots.txt file. What’s more, it can hurt your website. Google needs all the resources to render the page correctly.

Summary: There is usually no point in blocking anything. Google has evolved, and thanks to that it can and wants to know more.

Tip: However, I think that indexing author pages and archives clutters how the site looks in the search engine. So I use this in my robots.txt:

User-Agent: *
Disallow: /author/
Disallow: /archives/
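To see which URLs these two rules block, here is a small sketch with Python's standard `urllib.robotparser`. The function `blocked_paths`, the sample paths and the `example.com` domain are assumptions made up for this example, and Python's matching is simple prefix matching, which may differ in details from Google's own parser.

```python
from urllib.robotparser import RobotFileParser

# The rules from the article, written as one group with no blank line inside it.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /author/
Disallow: /archives/
"""


def blocked_paths(robots_txt, paths, agent="Googlebot"):
    """Return the paths that `agent` is not allowed to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host; only the path matters for matching.
    return [p for p in paths if not parser.can_fetch(agent, "https://example.com" + p)]


# Hypothetical WordPress-style paths.
SAMPLE_PATHS = ["/", "/hello-world/", "/author/john-doe/", "/archives/2017/"]
print(blocked_paths(ROBOTS_TXT, SAMPLE_PATHS))
```

Only the author and archive paths end up in the blocked list; the home page and the post stay crawlable.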

So I allow Googlebot to crawl everything except author pages and archives. There is, however, a small “but”.

“Noindex, follow” is better than robots.txt WordPress disallow

The robots meta tag is a page-level setting that lets you control how a selected page is indexed and displayed to users in search results. Put it in the <head> section of that page, like this:

An example of a robots meta tag

<!DOCTYPE html>
<html><head>
<meta name="robots" content="noindex" />
(…)
</head>
<body>(…)</body>
</html>
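To confirm that a page really carries the tag, here is a minimal sketch that reads the robots directives out of a page's HTML with Python's standard `html.parser`. The class `RobotsMetaParser`, the function `robots_directives` and the sample page are hypothetical names and data of mine, not part of WordPress or any plugin.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # "noindex, follow" becomes ["noindex", "follow"].
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html: str) -> list:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for this check.
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex, follow" /></head><body></body></html>'
print(robots_directives(SAMPLE_PAGE))
```

For the sample page this prints the two directives, `noindex` and `follow`; a page without the tag gives an empty list.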

Using the robots meta tag is more difficult (and certainly takes more time) than writing a short robots.txt. Choosing the right meta tags and inserting them into the page header by hand is tedious and error-prone. You can make it easier by installing another WordPress plugin, but how many plugins can you have? Each one uses memory. Personally, I do not like having many plugins in the admin panel (it bothers me).

Summary: With the robots meta tag the page is not blocked, so Google can still crawl it and read it correctly. Google will see all the links (even those you have no idea of), and that improves the value of your website.

Tip: Think of it this way: you start with 10 articles. Each one names the author “John Doe”, and each one targets a different keyword. What does Google see? The keyword “John Doe” 10 times and every other keyword once. Are you optimizing the site for the keyword “John Doe”? I doubt it. That is why, at the beginning, I recommend the simpler way, i.e. robots.txt.


Tip 2: As you develop the website, you will have plenty of keywords. Then you can return to the “noindex, follow” topic.

All of this, i.e. robots.txt WordPress allow and disallow, is only a temporary solution

  • There are websites that scan the content of web domains, ignore robots.txt rules, robots meta tags and nofollow attributes, and then publish links to that content from your site.
  • Those links are a back door to your website, and Google will easily get in through them.

Final summary: robots.txt WordPress allow and disallow

It follows that we are back at the first robots.txt example:

User-Agent: *

But for the time being I will use this robots.txt on my website:

User-Agent: *
Disallow: /author/
Disallow: /archives/

Sitemap: http://rithven.com/sitemap_index.xml

Yes, I know. I should regularly submit my XML sitemap in Google Search Console. But do you know what? I have too little time. And I noticed that even without adding the XML sitemap manually, it grows nicely on its own. Is that just because of this extra instruction in robots.txt?
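The Sitemap line is machine-readable, so a crawler that reads robots.txt can find the sitemap on its own. Here is a minimal sketch of that, assuming Python 3.8 or newer (where `RobotFileParser.site_maps()` exists); the helper name `sitemap_urls` is my own, and the sketch says nothing about how often Google actually re-reads the file.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the article, including the Sitemap line.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /author/
Disallow: /archives/

Sitemap: http://rithven.com/sitemap_index.xml
"""


def sitemap_urls(robots_txt: str) -> list:
    """Return every sitemap URL declared in the robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemap_urls(ROBOTS_TXT))
```

The parser returns the single sitemap URL from the file, which is the same information a search engine crawler gets when it fetches robots.txt.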

Rithven

