
Crownpeak Employee
404 pages not being cleared from SearchG2 collection
How do I troubleshoot 404 pages not being cleared from a SearchG2 collection?
Symptoms:
After multiple crawls, pages that return a 404 are still in the SearchG2 collection.
Troubleshooting Steps:
- Check whether the page is still returned by search queries and/or is still present in the search collection.
- Using Fiddler or another network tool, check whether the page returns a 404 or a 302.
- If the result is a 404, check how many times the site has been crawled and open a ticket if applicable.
- If the result is a 302, read below.
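If a tool such as Fiddler is not at hand, the status check in the steps above can be approximated with a small console program. This is only a sketch: the URL is a placeholder, the StatusCheck class and its Describe helper are hypothetical names introduced for illustration, and an async Main assumes C# 7.1 or later. The key detail is that automatic redirects are turned off, since a client that follows redirects would hide the 302.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class StatusCheck
{
    // Hypothetical helper: maps a status code to the matching troubleshooting step.
    public static string Describe(int statusCode)
    {
        switch (statusCode)
        {
            case 404: return "404: real not-found response; check the crawl count";
            case 302: return "302: custom errors are redirecting; fix the 404 handling";
            default:  return statusCode + ": not one of the two cases covered here";
        }
    }

    public static async Task Main(string[] args)
    {
        // Placeholder URL; pass the address of the missing page as the first argument.
        string url = args.Length > 0 ? args[0] : "https://www.example.com/missing-page";

        // Do not follow redirects, otherwise a 302 is hidden behind the final response.
        var handler = new HttpClientHandler { AllowAutoRedirect = false };
        using (var client = new HttpClient(handler))
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            Console.WriteLine(Describe((int)response.StatusCode));
            if (response.Headers.Location != null)
            {
                // Shows where the redirect points, typically the custom error page.
                Console.WriteLine("Redirects to: " + response.Headers.Location);
            }
        }
    }
}
```

Run it against a URL that is known to be missing. A 302 line followed by a redirect target pointing at the error page suggests the custom error configuration is the cause.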
By default, ASP.NET custom errors respond with a 302 redirect to the error page, not a 404. As a result, G2 keeps the file in the index. The page must return a 404 response code in order to be removed. The required changes are below.
First, make sure the 404 error handling is using ResponseRewrite:
<customErrors mode="On" redirectMode="ResponseRewrite">
  <error statusCode="404" redirect="404.aspx" />
</customErrors>
Then make sure the error page itself sets the 404 status code:
<script runat="server" language="c#">
  protected void Page_Load(object sender, EventArgs e)
  {
    // Return a real 404 so the page can be removed from the index
    Response.StatusCode = 404;
  }
</script>