A common pattern emerging on government sites is an HTML page with numerous direct links to content URLs; for example: http://www.nrel.gov/gis/data_solar.html
In an ideal world, these pages would automatically generate collections, with metadata attributed to each collection based on the HTML content (page title as the collection title, meta tags scrutinized and added, etc.).
I'm not totally sure how to pull this off; it may be as simple as looking for more than 10 direct links to content URLs. Some of this thinking should be driven by analyzing already-crawled content.
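The heuristic above could be sketched roughly as follows. This is just an illustration, not a proposed implementation: the list of "content" file extensions and the threshold of 10 links are placeholder assumptions that would need to be tuned against already-crawled data, and `detect_collection` is a hypothetical name.

```python
from html.parser import HTMLParser

# Assumed list of extensions that mark a "direct content URL"; the real
# list would come from analyzing already-crawled content.
CONTENT_EXTENSIONS = (".csv", ".xls", ".xlsx", ".zip", ".kml", ".shp", ".json", ".xml")

# Threshold from the idea above: more than 10 direct content links.
MIN_CONTENT_LINKS = 10


class CollectionCandidateParser(HTMLParser):
    """Collects the page title, meta tags, and direct content links."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False
        self.meta = {}
        self.content_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs and "content" in attrs:
            self.meta[attrs["name"]] = attrs["content"]
        elif tag == "a" and attrs.get("href", "").lower().endswith(CONTENT_EXTENSIONS):
            self.content_links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def detect_collection(html):
    """Return collection metadata if the page looks like a collection, else None."""
    parser = CollectionCandidateParser()
    parser.feed(html)
    if len(parser.content_links) > MIN_CONTENT_LINKS:
        return {
            "title": parser.title.strip(),   # page title as collection title
            "metadata": parser.meta,         # meta tags carried over
            "resources": parser.content_links,
        }
    return None
```

A page like the NREL example, with a title, a few meta tags, and a dozen links to `.csv`/`.zip` files, would come back as a collection candidate; a page with only a handful of content links would not.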