-
Hello, I'm using Crawlee to crawl web pages. My scenario: I want to crawl the same URL multiple times. I first tried adding a `uniqueKey` to each request, but it doesn't work. Then I tried an explicitly opened `RequestQueue` with `maxRequestsPerCrawl: 5`, and the output still isn't what I expect. What's the problem with my code, and how can I implement this scenario? A simplified sketch of my attempt is below.
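Roughly, my code looks like this (simplified; I'm using a `CheerioCrawler` here, but the crawler type shouldn't matter):

```ts
import { CheerioCrawler, RequestQueue } from 'crawlee';
import { randomUUID } from 'node:crypto';

// UUID generated once, up front
const uuid = randomUUID();

// an explicitly opened queue under a random name
const requestQueue = await RequestQueue.open(randomUUID());

const crawler = new CheerioCrawler({
    requestQueue,
    maxRequestsPerCrawl: 5,
    async requestHandler({ request, enqueueLinks, log }) {
        log.info(`Crawling ${request.url}`);
        await enqueueLinks({
            transformRequestFunction: (req) => {
                // the same UUID gets appended to every request's uniqueKey
                req.uniqueKey = `${req.url}#${uuid}`;
                return req;
            },
        });
    },
});

await crawler.run(['https://example.com']);
```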
Replies: 1 comment 16 replies
-
This is a wrong take: you are creating a named queue (with a random name), and those are not removed automatically on start, so you just generate data that you will need to clean up manually. Using the default queue is fine here; your actual problem is deduplication on the request level.
The `uniqueKey` approach works fine; your code looks like it reuses a single UUID generated upfront. You need to generate a new one for each request, so this needs to happen inside the `transformRequestFunction`, as in the sketch below.
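A minimal sketch of that fix (assuming links are enqueued via `enqueueLinks`; the key point is that `randomUUID()` runs once per request, inside the transform):

```ts
import { CheerioCrawler } from 'crawlee';
import { randomUUID } from 'node:crypto';

// default (unnamed) queue, nothing to clean up afterwards
const crawler = new CheerioCrawler({
    maxRequestsPerCrawl: 5,
    async requestHandler({ request, enqueueLinks, log }) {
        log.info(`Crawling ${request.url}`);
        await enqueueLinks({
            transformRequestFunction: (req) => {
                // a fresh UUID per request, so two enqueues of the same
                // URL get different uniqueKeys and are not deduplicated
                req.uniqueKey = `${req.url}#${randomUUID()}`;
                return req;
            },
        });
    },
});

await crawler.run(['https://example.com']);
```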
It also depends on what you are after: if it's about respecting URL fragments (the part after `#`), there is an option for that too, which would be better than adding random strings to the unique key.
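For the fragment case, a sketch of what I mean (if I recall the option name correctly, it's `keepUrlFragment` on the request options; with it set, the fragment becomes part of the computed `uniqueKey`, so `/page#a` and `/page#b` count as different requests):

```ts
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    async requestHandler({ enqueueLinks }) {
        await enqueueLinks({
            transformRequestFunction: (req) => {
                // keep the #fragment instead of stripping it (the default),
                // so it is included when the uniqueKey is computed
                req.keepUrlFragment = true;
                return req;
            },
        });
    },
});

await crawler.run(['https://example.com']);
```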