Create redirects of docs.plone.org to 6.docs.plone.org #1496
Comments
I've chewed on this for a while today. @stevepiercy The 'crawling errors report' in that article is a promotion for the paid SearchEngineKeywordPerformance plugin, and it only works for Bing and Yahoo search, not for Google. For some searches on Plone terms I already get 6.docs.plone.org results on Google. Within maybe a few weeks most links will already be reindexed and old suggestions dropped.

I thought about adding a custom 404 page where we track 404s using the instructions at https://matomo.org/faq/how-to/faq_60. But that doesn't work for us because we use a tag container. There are instructions for that situation, where you check the page title: https://matomo.org/blog/2019/07/how-to-analyse-404-pages/ But that doesn't apply here either: we redirect any docs.plone.org/(.*) to 6.docs.plone.org, so on the destination site the page view is for the homepage, which is not a 404. So to detect 'old' redirects on the target site we need to change the current redirect so that it actually generates a 404 on the new docs site that we can track. Then a 404 report would help.

We could set up Google Search Console access for the site to get more information from there. But I checked Search Console for another site I manage, and I don't think it will help. We would first need to register docs.plone.org in Google Search Console, but that URL is redirecting to ... 6.docs.plone.org. If we somehow fix that (disable the redirect, install a token for an account to please Search Console, enable the redirect again), then we will probably just get a big report of all the pages from the old docs site that were once there and that Google can't find anymore. That doesn't give us any extra information.

The only way I've come up with so far to do this efficiently is to look up the 10-20 most used search terms in Google Search Console, check where Google would have sent people (while the wrongly indexed data is still there), and add manual redirects for those to our webserver config. But for how many weeks and wrong searches will we make that effort?

The lowest hanging fruit at the moment is probably adding an extra message on the homepage that kindly asks people who arrived from a search result to repeat their search terms in the local search toolbar on the site to find the new content. That is, by the way, something else we should validate in Matomo, if we register internal site search.... Just checked, we didn't, because we used the default search parameters (Searchabletext). I updated the config parameter now to be 'q', so we can inspect what people are searching for using the site search. |
It's easy to redirect docs.plone.org/whatever not to 6.docs.plone.org but to 6.docs.plone.org/whatever, so that it would generate 404s. Would that be helpful? Any URL after 6.docs.plone.org that does not exist will show the custom error page at that non-existing URL. I can see if I can do that tonight, but I just had an 8hr bus journey and need to get some rest, and supplies for tonight as |
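As an illustration of the path-preserving redirect described above, here is a minimal Python sketch of the URL mapping only. It is not the actual webserver configuration, which is not shown in this thread; the function name `redirect_target` is hypothetical, and forcing `https` on the target is an assumption.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "docs.plone.org"
NEW_HOST = "6.docs.plone.org"


def redirect_target(url: str) -> Optional[str]:
    """Return the 301 target that keeps the original path, or None for other hosts."""
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return None
    # Keep path and query so that a missing page produces a 404 at its own URL
    # on the new site instead of a page view for the homepage.
    return urlunsplit(("https", NEW_HOST, parts.path or "/", parts.query, ""))
```

With this mapping, an old deep link lands on the same path on the new host, where the custom error page can be tracked.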
@polyester Yes, changing the redirect to generate the 404 and then picking up the 404s in Matomo to see the original paths to the documents is doable. But please don't activate it yet; let's first gather some more feedback. The question is whether it is desirable to show 404s just to get that data in Matomo, and for how many weeks/searches we would be organising this effort before Google's index has been updated and the broken inbound links from Google search results are gone anyway. If we do activate it, then it would be nice to have a note on the 404 page (instead of the homepage) that suggests using the internal site search to find what they're looking for, and/or also suggests searching on 5.docs.plone.org. But we might get as many results from keeping an eye on Search Console search keywords for the next few weeks and adding redirects for the most popular destination pages, as I suggested, although this procedure is fuzzier than seeing the actual 404s for page links. |
Let's not do this.
I think it would be helpful to know what we don't know. Read on for why.
With 301 redirects, crawlers will "self-heal" faster and to the correct destination, than without them, if at all. I think that is useful. I am interested in checking Matomo periodically for a few months. Once the 301-to-404s drop off, I can do a reverse URL search on them to identify the sites that are still driving traffic to the old URL. If those external sites are under our control, then I'll fix them. For the dregs, such as personal bookmarks or third-party websites, I probably don't care enough but I will reserve judgment until I have the data. The current 404 has a search field, just like the default page, so I am not terribly concerned about the inconvenience of the end user having to repeat their search on our site. To summarize the actions I would like to see:
How does that sound? |
My five cents. At the moment, Google and its friends still return many pages of the old documentation, like this one: https://docs.plone.org/4/en/manage/installing/requirements.html which now redirects to a 404 at https://6.docs.plone.org/4/en/manage/installing/requirements.html. Isn't it better to redirect to something like https://4.docs.plone.org/manage/installing/requirements.html and point out at the top of these pages that the 4/5 documentation is obsolete? |
@mamico thanks for the report. I agree that we need redirects for Plone 3, 4, and 5 docs. For all of them, the version number moved from the path to the subdomain. For 3 and 4, we should also drop the Thus for 3 and 4:
...should redirect to:
And for v5:
...should redirect to:
@polyester and @fredvd do you agree? <6 documentation is still used by roughly 20% of visitors, based on less than a day of data collected. That's too large to call them obsolete. But I can update the 404 page message to indicate that the visitor may go to previous versions of the docs. See #1504. We do have warnings on both 3 and 5, but not 4, about the latest versions of docs. Also the warning on 3 points to 5 as the latest, and should instead point to 6. @polyester would you be able to add a warning to 4 docs, and fix the warning on 3? |
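The move of the version number from the path to the subdomain, discussed above, could be sketched as follows. This is a hedged illustration, not the deployed redirect rules: the function name `legacy_redirect` is hypothetical, the dropped `en` segment for versions 3 and 4 is inferred only from the example URL pair given earlier in the thread, and keeping the rest of the path unchanged for version 5 is an assumption.

```python
import re
from typing import Optional

# Matches /3, /4 or /5 as the first path segment, plus whatever follows.
VERSIONED = re.compile(r"^/(?P<version>[345])(?P<rest>/.*)?$")

# Assumption: versions 3 and 4 drop a leading "en" segment, as in the
# example /4/en/manage/... -> 4.docs.plone.org/manage/...
DROP_SEGMENT = {"3": "en", "4": "en"}


def legacy_redirect(path: str) -> Optional[str]:
    """Map an old docs.plone.org path to a versioned subdomain URL, if it has one."""
    match = VERSIONED.match(path)
    if match is None:
        return None  # not a versioned path; leave it to the default redirect
    version = match.group("version")
    rest = match.group("rest") or "/"
    segment = DROP_SEGMENT.get(version)
    if segment:
        prefix = "/" + segment
        if rest == prefix or rest.startswith(prefix + "/"):
            rest = rest[len(prefix):] or "/"
    return f"https://{version}.docs.plone.org{rest}"
```

Paths that do not start with a version segment return `None`, so they can fall through to the general docs.plone.org redirect.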
@mamico Thanks for your feedback. To update this ticket: from the discussion here we have implemented a 'tracking strategy' so that we can see which Google/search traffic hits which pages. Those hits now trigger a 404 that can be analysed in Matomo, so that we can add manual 'deeplink' paths to our redirects for the most popular destinations. We didn't know this before, because everything was redirected to the 6.docs.plone.org homepage.

Copied from a discussion on Thursday the 25th:

I have updated the tag container for the new docs website in Matomo according to https://matomo.org/blog/2019/07/how-to-analyse-404-pages/ and published a version 1.1. You have to play a bit with triggers. There is a second trigger now that detects if the page title is "Page not found". This custom 404 trigger fires the 404 custom HTML tag. (This custom HTML tag is NOT the tag manager snippet, by the way; otherwise you might get some nice recursive behavior.) And the important detail, there should always be one: you should also add the 404 trigger to the normal analytics tag, but as an exclusion condition, so that both tags don't run at the same time. It takes some time for Matomo to process things, so we might start to see results tomorrow.

All this fancy tag manager stuff is necessary so that you don't have to change the documentation itself to have the snippet hardcoded only on the 404 page in the Sphinx/MyST templates. (That's the suggestion in https://matomo.org/faq/how-to/faq_60/ ) |
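The trigger logic described above can be modelled in a few lines of Python. This is only a sketch of the decision rule, not Matomo Tag Manager configuration: the tag names are hypothetical, and matching the title by substring rather than exact equality is an assumption.

```python
from typing import List

NOT_FOUND_TITLE = "Page not found"


def tags_to_fire(page_title: str) -> List[str]:
    """Return the tags that should run for a page view with the given title."""
    is_404 = NOT_FOUND_TITLE in page_title  # assumption: substring match
    if is_404:
        # The 404 trigger fires the custom HTML tag ...
        return ["404-custom-html"]
    # ... and acts as an exclusion condition on the normal analytics tag,
    # so the two tags never run for the same page view.
    return ["analytics-pageview"]
```

The point of the exclusion is that exactly one tag fires per page view, so a 404 is not also counted as a normal page view.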
@stevepiercy I'm still considering whether we should register the docs.plone.org site for Search Console so that we can log in there occasionally to see what search keywords people use before they are referred to our documentation sites. We would have to place a small identifier snippet on docs.plone.org and serve it instead of redirecting; then we have 'proven' that we own the domain, so it is added to Google Search Console. Google is by far the most used search engine: in May alone 1800 searches came from Google, whereas the runner-up is Bing with 57. The search engines unfortunately don't pass the keywords along, so for Google, Google Search Console is the only way to get insight there. It has no GDPR implications: you get access to statistics they gather themselves from google.com usage. |
@fredvd Let's try to figure out whether using Google Search Console would help. Advantages
Disadvantages
I think that is reason enough to enable it. For which sites would you set up properties? I'll send you my Google Account privately, if you want to grant me access. |
Search engines have indexed docs.plone.org heavily. We need to set up 301 redirects on the server. I don't have access to do this; @polyester and @fredvd do. We should also discuss how to discover 301 redirects. Matomo should be able to do it, according to this blog article, but I see no such report. It might be for the paid version only. We might need to parse server logs.
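If parsing server logs turns out to be necessary, a small script along these lines could list the most requested redirected paths. It is a sketch that assumes the server writes the common/combined access log format; the actual log format and file location are not stated in this thread, and the function name is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Request line and status code as they appear in common/combined log format:
# ... "GET /some/path HTTP/1.1" 301 169 ...
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_paths_by_status(lines: Iterable[str], status: str = "301") -> Counter:
    """Count requested paths whose response had the given HTTP status."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == status:
            counts[match.group("path")] += 1
    return counts
```

Calling `count_paths_by_status(open_log_file).most_common(20)` on an open log file would give candidates for the list of redirects below; passing `status="404"` would give the equivalent report for missing pages.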
List of redirects (add more as they are discovered)
Reference: post by @1letter on Community Forum:
https://community.plone.org/t/plone-6-documentation-update-2023-05-12-plone-6-documentation-released/17451/2