Support Article

Web application does not specify crawling rules in page

SA-34614

Summary



With the current application configuration, web crawlers can access all pages for indexing.
The web application does not specify crawling rules in its page markup, so search engines can index and cache the production URLs.


Error Messages



Not applicable


Steps to Reproduce



Not applicable

Root Cause



The application does not include a robots META tag in its pages. This tag helps prevent spiders and robots from crawling the application and indexing its URLs into search results, where the pages may also be cached.
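To illustrate how a compliant crawler might read the CONTENT value of a robots META tag, here is a minimal Python sketch. The function names are hypothetical and chosen for illustration only. Real crawlers apply their own, more detailed rules, and honoring the tag is voluntary on the crawler's side.

```python
from __future__ import annotations


def parse_robots_content(content: str) -> set[str]:
    """Split a robots META CONTENT value into lower-case directives."""
    return {part.strip().lower() for part in content.split(",") if part.strip()}


def may_index(content: str) -> bool:
    """Return False when the directives ask crawlers not to index the page."""
    # "none" is commonly treated as shorthand for "noindex, nofollow".
    return not ({"noindex", "none"} & parse_robots_content(content))


def may_follow(content: str) -> bool:
    """Return False when the directives ask crawlers not to follow links."""
    return not ({"nofollow", "none"} & parse_robots_content(content))
```

With the value used in this article, "NOINDEX, NOFOLLOW", both functions return False, so a well-behaved crawler neither indexes the page nor follows its links.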

Resolution



Perform the following local change:

Add the following tag to the UserWorkForm HTML fragment. This embeds the tag in all HTML pages of the Pega application.

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

The UserWorkForm change does not apply to the login page, so you must also customize the Web-Login HTML rule to add this meta tag.

Note that the ruleset version containing the customized Web-Login rule must be available to the unauthenticated ruleset, which needs to be specified in the Browser requestor type for the current system name.
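After making both changes, it can help to confirm that the rendered pages, including the login page, actually carry the tag. The following Python sketch checks a page's HTML source for a robots META tag with a NOINDEX directive. The class and function names are hypothetical, and how you obtain the HTML (for example, by saving the page source from a browser) is left as an assumption.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the CONTENT value of every robots META tag in a page."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names, so <META NAME=...>
        # and <meta name=...> are handled the same way.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.contents.append(attr_map.get("content", ""))


def page_blocks_indexing(html: str) -> bool:
    """Return True if any robots META tag in the page contains NOINDEX."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    for content in finder.contents:
        directives = {d.strip().lower() for d in content.split(",")}
        if "noindex" in directives:
            return True
    return False
```

Run the check against the source of a work form page and of the login page separately, because the two pages are customized through different rules.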

Published March 7, 2017 - Updated March 21, 2017
