
Weaver

weaver is a command-line tool for checking links on websites.

Old stories would tell how Weavers would kill each other over aesthetic disagreements, such as whether it was prettier to destroy an army of a thousand men or to leave it be, or whether a particular dandelion should or should not be plucked. For a Weaver, to think was to think aesthetically. To act—to Weave—was to bring about more pleasing patterns. They did not eat physical food: they seemed to subsist on the appreciation of beauty.
—China Miéville, “Perdido Street Station”

Here's how to install it:

go install github.com/bitfield/weaver/cmd/weaver@latest

To run it:

weaver https://example.com
Links: 2 (2 OK, 0 errors, 0 warnings) [1s]

Verbose mode

To see more information about what's going on, use the -v flag:

weaver -v https://example.com
[OKAY] https://example.com (200 OK) (referrer: START)
[OKAY] https://www.iana.org/domains/example (200 OK) (referrer: https://example.com)

Links: 2 (2 OK, 0 errors, 0 warnings) [800ms]

How it works

The program checks the status of the specified URL. If the server responds with an HTML page, the program will parse this page for links, and check each new link for its status.

If a link points to the same domain as the original URL, the linked page is also parsed for further links, and so on recursively until every link on the site has been visited.

Any broken links will be reported, together with the referring page:

[DEAD] https://example.com/bogus (404 Not Found) (referrer: https://example.com/)
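
The crawl described in this section can be sketched roughly as follows. This is an illustration only, not weaver's actual code: the names resolve, sameHost, and crawl are hypothetical, and the links callback stands in for "fetch the page and extract its links", so the example runs without network access.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolve turns a link found on a page into an absolute URL,
// using the referring page as the base for relative links.
func resolve(referrer, href string) (string, error) {
	base, err := url.Parse(referrer)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

// sameHost reports whether two absolute URLs share a hostname.
func sameHost(a, b string) bool {
	ua, err := url.Parse(a)
	if err != nil {
		return false
	}
	ub, err := url.Parse(b)
	if err != nil {
		return false
	}
	return ua.Hostname() == ub.Hostname()
}

// crawl visits every reachable URL once. Pages on the start URL's host are
// parsed for further links; external pages are visited but not parsed.
// links is a hypothetical stand-in for fetching a page and extracting hrefs.
func crawl(start string, links func(page string) []string) []string {
	seen := map[string]bool{start: true}
	queue := []string{start}
	var visited []string
	for len(queue) > 0 {
		page := queue[0]
		queue = queue[1:]
		visited = append(visited, page)
		if !sameHost(start, page) {
			continue // external link: status check only
		}
		for _, href := range links(page) {
			abs, err := resolve(page, href)
			if err != nil || seen[abs] {
				continue
			}
			seen[abs] = true
			queue = append(queue, abs)
		}
	}
	return visited
}

func main() {
	// A tiny in-memory "site" used in place of real HTTP requests.
	site := map[string][]string{
		"https://example.com": {"https://www.iana.org/domains/example"},
	}
	for _, u := range crawl("https://example.com", func(p string) []string { return site[p] }) {
		fmt.Println(u)
	}
}
```

The seen map is what stops the recursion: each URL is queued at most once, so pages that link to each other do not cause an endless loop.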

Rate limiting

The program attempts to continuously adapt its request rate to suit the server. On receiving a 429 Too Many Requests response, it will reduce the current request rate. After a while with no further 429 responses, it will steadily increase the rate until it trips the rate limit once again.

Even without receiving any 429 responses, the program limits itself to a maximum of 5 requests per second, to be respectful of server resources.