Hi friends, let's say I am scraping three sites, a, b, and c:

await crawler.run([
  `https://www.a.com`,
  `https://www.b.com`,
  `https://www.c.com`,
]);

Based on how the data is stored, it seems that site C starts crawling only when site B is completed, and site B starts only when site A is completed.
Should I use parallelism if I want sites A, B, and C crawled at the same time? Perhaps I should make a separate crawler for each site:

await crawlerA.run([
  `https://www.a.com`,
]);
await crawlerB.run([
  `https://www.b.com`,
]);
await crawlerC.run([
  `https://www.c.com`,
]);

What would be my best approach?
The requests should be processed in the same order as you enqueue them, unless you explicitly change the queue ordering. Your second snippet would also run sequentially, since you await each run call before starting the next one. If you start all the runs first and only then wait for them (for example with Promise.all), the crawlers can run concurrently.
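To make the difference concrete, here is a minimal sketch using plain async functions as stand-ins for the crawler instances (the names and timings are invented for illustration; this is not Crawlee code):

```javascript
// fakeRun stands in for crawlerX.run(): it resolves with the site name
// after a simulated crawl duration in milliseconds.
const fakeRun = (site, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(site), ms));

async function sequential() {
  // Each run only starts after the previous one resolves,
  // just like awaiting crawlerA.run(), then crawlerB.run(), etc.
  const order = [];
  order.push(await fakeRun('a', 30));
  order.push(await fakeRun('b', 20));
  order.push(await fakeRun('c', 10));
  return order; // total time ~ 30 + 20 + 10 ms
}

async function concurrent() {
  // All three runs start immediately; Promise.all waits for all of
  // them and preserves the input order in its result array.
  return Promise.all([
    fakeRun('a', 30),
    fakeRun('b', 20),
    fakeRun('c', 10),
  ]); // total time ~ 30 ms (the slowest run)
}
```

Both functions resolve to `['a', 'b', 'c']`, but the concurrent version finishes in roughly the time of the slowest crawl instead of the sum of all three.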
Yes, this is the same as using Promise.all afaik (and I would strongly prefer Promise.all over starting the promises and awaiting them later). Because they would all use the same default request queue, you need to create the queues explicitly; you can't use the default queue. A queue without a name will always be the same queue, regardless of how many instances you create.