DavidGreenberg
Crownpeak Employee

404 pages not being cleared from SearchG2 collection

How do I troubleshoot 404 pages not being cleared from a SearchG2 collection?
Symptoms:

After multiple crawls, pages that should return a 404 are still present in the SearchG2 collection.

Troubleshooting Steps:
  1. Check whether the page is returned by search queries and/or is present in the search collection.
  2. Using Fiddler or another HTTP debugging tool, check whether the page returns a 404 or a 302.
  3. If the result is a 404, check how many times the site has been crawled, and open a ticket if applicable.
  4. If the result is a 302, read below.
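
If Fiddler is not at hand, the status check in step 2 can be approximated with a short script. The following is a minimal sketch, not a Crownpeak tool: the function names `fetch_status` and `diagnose` are made up for illustration, and the messages are only hints. It sends a single GET request and does not follow redirects, so a 302 is reported as a 302 rather than as the status of the page it points to.

```python
import http.client
from urllib.parse import urlsplit

# Status codes that mean "the server redirected us somewhere else".
REDIRECTS = (301, 302, 303, 307, 308)


def fetch_status(url, timeout=10.0):
    """Send one GET and return (status, Location header) without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def diagnose(status):
    """Map a status code to a troubleshooting hint (illustrative wording only)."""
    if status == 404:
        return "404: expected status for a removed page"
    if status in REDIRECTS:
        return f"{status}: redirect; check the custom error configuration"
    if status == 200:
        return "200: the URL still serves content; check the error page's status code"
    return f"{status}: unexpected status; investigate further"
```

For example, `status, location = fetch_status("https://www.example.com/removed-page")` followed by `print(diagnose(status), location)` prints the hint and any redirect target. The URL here is a placeholder; substitute a page that should be gone from the collection.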

By default, ASP.NET custom errors redirect the request to the error page with a 302 response instead of returning a 404. As a result, G2 keeps the page in the index. The page must return a 404 response code in order to be removed. The required changes are below.

First make sure the 404 error handling is using ResponseRewrite:

<customErrors mode="On" redirectMode="ResponseRewrite">
   <error statusCode="404" redirect="404.aspx" />
</customErrors>

Then make sure the error page itself (404.aspx) sets the 404 status code:

<script runat="server" language="c#">
protected void Page_Load(object sender, EventArgs e)
{
   // Set the status explicitly so the rewritten response carries a 404.
   Response.StatusCode = 404;
}
</script>