How does Crownpeak DQM check for broken links?
Crownpeak DQM uses the HTTP response codes returned from the server to determine whether a link should be classed as broken.
When Crownpeak DQM scans your site, it records every link it comes across. Each link is then tested to determine whether it is broken.
Broken link codes
| HTTP response code | Marked as broken? | Explanation |
|---|---|---|
| 400 - 499 | Yes | This indicates an error with the page and includes common statuses such as 404 (Not Found). The exceptions are 401, 403 and 407, listed below. |
| 401 | No | 'Unauthorized' - authentication is needed to reach the page and has not been provided. |
| 403 | No | 'Forbidden' - the server has understood the request but is refusing to fulfill it. |
| 407 | No | 'Proxy Authentication Required' - this is similar to a 401 but indicates that Crownpeak DQM should authenticate itself with the proxy. |
| 500 - 599 | No | The server could not service the request made by Crownpeak DQM. This can happen sporadically when the server gets busy. |
| 300 - 399 | Link check is repeated on redirected URL | The server is redirecting Crownpeak DQM to another page. Crownpeak DQM goes to the new URL and link checks that URL as well. To prevent redirect loops, if the same URL is seen more than 7 times, the link is marked as broken. |
| 200 - 299 | No | A successful response code. If the page is part of the site that is being spidered, we check whether the page in question is an error page. |
These default responses can be modified if required. Contact the Product Support Team if you would like to discuss this.
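The default rules in the table above can be summarised in a few lines of Python. This is only an illustrative sketch, not Crownpeak DQM's actual implementation: the function name `classify_status`, the constant `NOT_BROKEN_4XX`, the returned labels, and the "unknown" fallback for codes outside the table are all assumptions made for the example.

```python
# Codes in the 400 - 499 range that are NOT treated as broken by default.
NOT_BROKEN_4XX = {401, 403, 407}


def classify_status(status: int) -> str:
    """Map an HTTP status code to 'broken', 'not broken', 'redirect' or 'unknown'."""
    if 200 <= status <= 299:
        # Success. The extra error-page check made for spidered pages
        # is not modelled in this sketch.
        return "not broken"
    if 300 <= status <= 399:
        # The link check is repeated on the redirected URL.
        return "redirect"
    if 400 <= status <= 499:
        return "not broken" if status in NOT_BROKEN_4XX else "broken"
    if 500 <= status <= 599:
        # Server errors can be sporadic, so they are not classed as broken.
        return "not broken"
    # Assumption: the table does not cover codes outside 200 - 599.
    return "unknown"
```

For example, `classify_status(404)` returns `"broken"`, while `classify_status(403)` and `classify_status(503)` both return `"not broken"`.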
Other reasons why links may be marked as broken
In addition to the HTTP response codes above, a link is also deemed broken if an error occurs at any point while the page is being downloaded. Reasons for such errors include:
- Host part of the URL is not valid
- Issue with downloading the page via SSL
- The attempt to connect or read times out. Connection attempts time out after 1 minute; reads time out after 2 minutes
- Illegal characters in the URL
- Illegal characters in the URL that Crownpeak DQM is redirected to via the page in question
- The response code indicates a redirect, but gives no URL to redirect to
- The scanner is redirected more than seven times
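The download errors listed above could be detected along the following lines. This is a rough sketch under stated assumptions, not the product's code: `url_problem`, `download_error` and the injected `fetch` callable are hypothetical names, and the rule used here for "illegal characters" (whitespace or control characters) is a simplification chosen for the example. Only the timeout values and the seven-redirect limit come from this article.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urlsplit

CONNECT_TIMEOUT_S = 60   # from the article; a real fetcher would pass this to its HTTP client
READ_TIMEOUT_S = 120     # from the article; likewise applied by a real fetcher
MAX_REDIRECTS = 7        # more than seven redirects marks the link as broken

# Hypothetical fetcher: returns (status code, redirect target or None)
# and raises OSError on SSL failures or timeouts.
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def url_problem(url: str) -> Optional[str]:
    """Return a reason if the URL itself is unusable, else None."""
    # Assumption: whitespace and control characters count as illegal.
    if any(ch.isspace() or ord(ch) < 32 for ch in url):
        return "illegal characters in URL"
    try:
        host = urlsplit(url).hostname
    except ValueError:
        return "invalid host"
    return None if host else "invalid host"


def download_error(url: str, fetch: Fetch) -> Optional[str]:
    """Follow redirects and return the first download error found, else None."""
    redirects = 0
    while True:
        problem = url_problem(url)  # also covers URLs reached via a redirect
        if problem:
            return problem
        try:
            status, location = fetch(url)
        except OSError as exc:  # ssl.SSLError and TimeoutError are OSError subclasses
            return f"download failed: {exc}"
        if not 300 <= status <= 399:
            return None  # no download error; the status-code rules apply from here
        if not location:
            return "redirect without target URL"
        redirects += 1
        if redirects > MAX_REDIRECTS:
            return "redirected more than seven times"
        url = location
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular HTTP client and makes each failure case easy to simulate.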
Excluding links from the link checking process
Crownpeak DQM can ignore whole URLs, or parts of URLs, that do not need to be checked (e.g. login pages). Please contact the Product Support Team if you would like to add broken link exclusions to your Crownpeak DQM configuration.
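Conceptually, an exclusion acts as a filter applied before links are checked. The sketch below assumes simple substring matching and uses hypothetical names (`is_excluded`, `links_to_check`). The actual exclusion syntax is configured by the Product Support Team and may work differently.

```python
from typing import Iterable, List


def is_excluded(url: str, exclusions: Iterable[str]) -> bool:
    """True if the URL equals an exclusion or contains it as a substring."""
    # Assumption: exclusions are plain strings matched anywhere in the URL.
    return any(pattern in url for pattern in exclusions)


def links_to_check(urls: Iterable[str], exclusions: Iterable[str]) -> List[str]:
    """Drop excluded URLs, keeping the original order of the rest."""
    patterns = list(exclusions)  # allow the exclusions to be reused per URL
    return [url for url in urls if not is_excluded(url, patterns)]
```

With an exclusion of `/login`, a URL such as `https://www.example.com/login` would be skipped, while `https://www.example.com/about` would still be link checked.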