Hosting & Search Knowledge Base

404 pages not being cleared from SearchG2 collection

How do I troubleshoot 404 pages not being cleared from a SearchG2 collection?
Symptoms:

After multiple crawls, 404 pages are still in the SearchG2 collection.

Troubleshooting Steps:
  1. Check whether the page is still returned by search queries and/or is still present in the search collection.
  2. Using Fiddler or another network tool, check whether the page returns a 404 or a 302.
  3. If the result is a 404, check how many times the site has been crawled and open a ticket if applicable.
  4. If the result is a 302, read below.
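
If Fiddler is not at hand, step 2 can also be done with a short script. The following is a minimal sketch in Python, not part of the original procedure: the helper names fetch_status and next_step are hypothetical, and it assumes the page is reachable with a plain GET request. It uses http.client because that module does not follow redirects, so a 302 is reported as a 302 instead of being hidden behind the final response.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10.0):
    """Return (status, location) for a GET request without following redirects.

    Hypothetical helper: http.client reports the first response as-is,
    so a 302 from custom errors stays visible.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def next_step(status):
    """Map a status code to the matching troubleshooting step in this article."""
    if status == 404:
        return "404: check how many times the site has been crawled; open a ticket if applicable"
    if status == 302:
        return "302: custom errors are redirecting; apply the ResponseRewrite changes"
    return "%d: not covered by this article" % status
```

For example, calling fetch_status with the URL of a page that should be gone, then passing the returned status to next_step, indicates which branch of the steps above applies. The Location value, when present, shows where the 302 points.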

By default, ASP.NET custom errors respond with a 302 redirect to the error page, not a 404. As a result, SearchG2 keeps the page in the index. The page must return a 404 response code in order to be removed. The required changes are below.

First, make sure the 404 error handling in web.config uses ResponseRewrite:

<customErrors mode="On" redirectMode="ResponseRewrite">
   <error statusCode="404" redirect="404.aspx" />
</customErrors>

Then make sure the error page itself sets the 404 status code:

<script runat="server" language="c#">
protected void Page_Load(object sender, EventArgs e)
{
   // Return a real 404 status so the crawler can drop the page from the index
   Response.StatusCode = 404;
}
</script>

Version history: revision 1 of 1, last updated 04-03-2019 08:42 AM