
Fix Hugo Build & Deploy to correctly generate robots.txt #1800

Closed
stumbo opened this issue Aug 8, 2024 · 4 comments
Assignees
Labels
bug Something isn't working (as per documentation)

Comments


stumbo commented Aug 8, 2024

Describe the bug
robots.txt is supposed to be generated automatically when the production flag is set to true, overwriting the default robots.txt file, which disallows indexing. This is not happening.
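For context, Hugo only emits a generated robots.txt when `enableRobotsTXT = true` is set in the site configuration, and the template can gate its contents on the build environment. A minimal sketch of such a production-gated template (hypothetical; the project's actual layout is not shown in this issue):

```
{{/* layouts/robots.txt -- sketch; assumes enableRobotsTXT = true in config */}}
User-agent: *
{{ if hugo.IsProduction }}
Disallow:
{{ else }}
Disallow: /
{{ end }}
```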

Running locally, the robots.txt file is generated correctly; this isn't happening in the GitHub Action. Figure out where the workflow is going wrong.
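One common cause of this local-vs-CI difference, sketched here purely as an assumption (the actual workflow file is not shown in this issue), is the CI build step not selecting the production environment, so the environment-gated template never fires:

```yaml
# .github/workflows/deploy.yml -- hypothetical fragment
- name: Build site
  run: hugo --minify --environment production
  env:
    HUGO_ENVIRONMENT: production
```

Running `hugo` locally without flags defaults to the production environment, while a CI step that sets a different environment (or a server-style build) would keep the default, indexing-disallowed robots.txt.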

@stumbo stumbo added the bug Something isn't working (as per documentation) label Aug 8, 2024
@stumbo stumbo self-assigned this Aug 8, 2024
@stumbo stumbo added this to website Aug 8, 2024

stumbo commented Sep 18, 2024

Draft PR 239 resolves this issue. It needs some final clean-up and documentation before being converted from draft.


masinter commented Oct 5, 2024

A fix for this was approved and merged. Is there a way of checking whether it was effective? Do the web crawlers now crawl all our pages correctly?


stumbo commented Oct 5, 2024

There are several online tools that claim to scan websites and identify issues. 

From seomator.com:

    URL: /
      Google crawl rule: Allow
      Bing crawl rule:   Allow
      Google index rule: none
      Bing index rule:   none
      Crawlable: Google, Bing
      Indexable: Google, Bing

Google Search Console (access limited to Interlisp.org site maintainers) notes the new robots.txt file was successfully fetched on 10/3/2024:

    https://interlisp.org/robots.txt: 10/3/24, 2:07 AM, Fetched, 49 bytes

I requested a recrawl, which will surface any issues with the robots.txt file. But given that tools such as seomator consider the robots.txt file good and report that both Google and Bing are allowed to crawl, I don't expect any issues.


stumbo commented Oct 24, 2024

Google Page Indexing shows no pages being blocked by robots.txt.

@stumbo stumbo closed this as completed Oct 24, 2024
@github-project-automation github-project-automation bot moved this to Done in website Oct 24, 2024