import { Configuration, PlaywrightCrawler } from 'crawlee'

// Build a crawler whose default storages are all keyed by crawlId.
// `router` is defined elsewhere.
const getCrawler = (crawlId) => {
    const config = new Configuration()
    config.set('defaultRequestQueueId', crawlId)
    config.set('defaultKeyValueStoreId', crawlId)
    config.set('defaultDatasetId', crawlId)
    return new PlaywrightCrawler({
        requestHandler: router,
    }, config)
}

const crawler = getCrawler("testcrawlid")
console.log(crawler.config.get('defaultRequestQueueId')) // here I get 'testcrawlid'
// added some requests using crawler.addRequests()
await crawler.run()
I am creating the method shown above.
When the crawl finishes, there is no folder for testcrawlid in storage/request_queues; I only see the default folder.
To add more context: I want to be able to add the same requests again, with the crawlId as the only difference.
I could prepend the crawlId to the unique key when adding requests, but I also use enqueueLinks everywhere, and in those cases I don't have the ability to construct the URL myself. Within the context of one crawlId, I want to make sure I am not crawling duplicate URLs, but if the crawlId is different, the same URL can be crawled again.
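To make the desired behaviour concrete, here is a minimal sketch of the key-prefixing idea, independent of Crawlee. The names `scopeToCrawl` and `createToyQueue` are hypothetical, and the toy queue only imitates deduplication by unique key. It also assumes the raw URL is the default key, which is a simplification.

```javascript
// Hypothetical helper: returns a function that scopes a request's uniqueKey to one crawl.
// Assumes each request is a plain options object with at least a `url` field.
function scopeToCrawl(crawlId) {
    return (request) => ({
        ...request,
        // Fall back to the URL when no uniqueKey was set explicitly.
        uniqueKey: `${crawlId}:${request.uniqueKey ?? request.url}`,
    });
}

// Toy stand-in for a request queue: accepts a request only if its uniqueKey is new.
function createToyQueue() {
    const seen = new Set();
    return {
        add(request) {
            if (seen.has(request.uniqueKey)) return false;
            seen.add(request.uniqueKey);
            return true;
        },
    };
}

const queue = createToyQueue();
const crawlA = scopeToCrawl('crawl-a');
const crawlB = scopeToCrawl('crawl-b');

const results = [
    queue.add(crawlA({ url: 'https://example.com/page' })), // first time in crawl-a
    queue.add(crawlA({ url: 'https://example.com/page' })), // duplicate within crawl-a
    queue.add(crawlB({ url: 'https://example.com/page' })), // same URL, different crawl
];
console.log(results); // [ true, false, true ]
```

If enqueueLinks accepts a per-request transform (I believe the option is called `transformRequestFunction`, but that should be checked against the Crawlee docs), a function like `scopeToCrawl(crawlId)` could be passed there so that enqueued links get the same prefix without building the URLs by hand.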