If a URL has already been crawled and returned a 20x or 30x, we know it was good, and we are simply checking for regressions.
If it returned a 40x, it is either a regression we want detected, which might be fixed on the other end, but more likely the link itself is broken and needs to be fixed.
50x responses, on the other hand, are typically transient and a clear signal that we should retry and expect a different response.
So I am suggesting that the previous return code be used as a soft weighting factor in queue prioritization, so that URLs that previously returned a 50x are crawled earlier to try to clear them.
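A minimal sketch of how this soft weighting could look (Python, with hypothetical names; the actual crawler's queue API may differ): previously seen status classes map to a priority bias, and the recrawl queue orders URLs accordingly.

```python
import heapq
import itertools

# Hypothetical priority biases by previous status class (lower = crawled sooner):
# 50x first (likely transient), then never-crawled URLs, then 40x, then 20x/30x.
STATUS_BIAS = {5: 0, None: 1, 4: 2, 3: 3, 2: 3}

def bias_for(prev_status):
    """Map a previous HTTP status (or None if never crawled) to a soft weight."""
    status_class = prev_status // 100 if prev_status is not None else None
    return STATUS_BIAS.get(status_class, 1)

class RecrawlQueue:
    """Priority queue that softly favors URLs whose last response was a 50x."""
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a bias

    def push(self, url, prev_status=None):
        heapq.heappush(self._heap, (bias_for(prev_status), next(self._counter), url))

    def pop(self):
        return heapq.heappop(self._heap)[2]

# Example: the 50x URL is recrawled first, even though it was enqueued last.
q = RecrawlQueue()
q.push("https://example.com/ok", prev_status=200)
q.push("https://example.com/missing", prev_status=404)
q.push("https://example.com/flaky", prev_status=503)
assert q.pop() == "https://example.com/flaky"
```

Since this is meant as a soft factor, in practice the bias would presumably be combined with the crawler's other ordering signals (e.g. time since last crawl) rather than acting as the sole sort key.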