+3 votes
168 views
How to create a robots.txt file

in SEO by (551k points)
reopened

1 Answer

+4 votes
Best answer

When search engines arrive at a website, the first thing they look for is the robots.txt file, which they then read. Its content determines whether the search engine's spider continues through the site or moves on to another.

The robots.txt file contains a list of rules telling crawlers which pages they may or may not access, which lets you restrict access selectively for certain search engines.

It is a plain-text (ASCII) file that must be located at the root of the site. The directives it can contain are:

User-agent : specifies which robot the rules that follow apply to.

Disallow : identifies which paths are excluded from crawling by the search engine. Each path to be excluded must go on its own line and must begin with the / symbol. Disallow: / on its own means "all pages of the website".

Avoid stray empty lines inside a rule group: a blank line is what separates one User-agent group from the next.

Here are some examples:
  • To exclude all pages: User-agent: * followed by Disallow: /.
  • To exclude nothing, the robots.txt file can simply be omitted from the website (or contain an empty Disallow:), and all pages of the site will be crawled equally.
  • To exclude a specific robot while allowing all others:
    User-agent: RobotName
    Disallow: /
    User-agent: *
    Disallow:
  • To exclude a single page: User-agent: * followed by Disallow: /directory/path/page.html
  • To exclude all pages of a directory of the website, with their corresponding subfolders: User-agent: * followed by Disallow: /directory/
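Taken together, the patterns above can be sketched as one robots.txt file (the robot name "BadBot" and the paths are hypothetical examples):

```text
# Block one specific robot from the whole site
User-agent: BadBot
Disallow: /

# Everyone else: skip one page and one directory (with its subfolders)
User-agent: *
Disallow: /directory/path/page.html
Disallow: /directory/
```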
With this file you can prevent search engines from accessing certain pages or directories; you can also block access to individual files or utilities.

Another use is to prevent the indexing of duplicate content found on the site, so that the site is not penalized for it.

Also keep in mind that some robots may ignore the instructions in this file, and that the file is public: anyone who visits www.example.com/robots.txt can read it.

The next question may be: how do you generate the robots.txt file?

It is quite simple: create a text document named "robots.txt" and upload it to the root of the domain; that is where search engines will go to find and read it.
A basic robots.txt file can be:

User-agent: *
Disallow: /private/

These instructions deny all search engines access to a directory named "private".
The first line addresses all robots (User-agent: *); the second marks the directory as off-limits (Disallow: /private/).
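You can verify rules like these locally with Python's standard urllib.robotparser module; a minimal sketch (the example.com URLs are placeholders):

```python
# Check robots.txt rules locally using Python's standard library.
# The rules below mirror the "private" example from the answer.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# parse() accepts the file's lines directly, so no web request is needed
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# A page inside /private/ is blocked for every crawler
print(rp.can_fetch("*", "https://www.example.com/private/data.html"))  # False
# Pages outside /private/ remain crawlable
print(rp.can_fetch("*", "https://www.example.com/index.html"))         # True
```

RobotFileParser can also fetch a live file with set_url() and read(), which is useful for checking a deployed robots.txt before relying on it.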

The User-agent value for Google's crawler is: User-agent: Googlebot

As mentioned above, in SEO this file is commonly used to keep robots away from duplicate content.

by (3.5m points)
edited
