Reusing one session for the whole flow should work if the token is stored in cookies. If it is sent as a regular header, it will not be attached automatically. If the SessionPool approach doesn't work for some reason, you can simply pass headers with the request when enqueueing it.
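To illustrate the last option, here is a minimal sketch of building a request with explicit headers from the cookies collected after sign-in. The cookie name `XSRF-TOKEN`, the header name `X-XSRF-TOKEN`, the URL-decoding step, the `buildAuthedRequest` helper, and the `PAGE_WITH_DATA` label are all assumptions for illustration; check what your site actually sends (for example, in the working Postman call) and adjust accordingly.

```typescript
// Minimal cookie shape; the objects returned by Playwright's
// `context.cookies()` include at least these two fields.
interface CookieLike {
    name: string;
    value: string;
}

// Subset of the request options passed when enqueueing a request.
interface RequestWithHeaders {
    url: string;
    label?: string;
    headers: Record<string, string>;
}

// ASSUMPTION: these names follow a common convention and are placeholders.
const XSRF_COOKIE_NAME = 'XSRF-TOKEN';
const XSRF_HEADER_NAME = 'X-XSRF-TOKEN';

// Hypothetical helper: turns the post-sign-in cookies into explicit headers.
function buildAuthedRequest(
    url: string,
    cookies: CookieLike[],
    label?: string,
): RequestWithHeaders {
    const xsrf = cookies.find((c) => c.name === XSRF_COOKIE_NAME);
    if (!xsrf) {
        throw new Error(`Cookie ${XSRF_COOKIE_NAME} not found; sign-in may have failed`);
    }
    const headers: Record<string, string> = {
        // Replay all cookies so the server sees the same signed-in session.
        Cookie: cookies.map((c) => `${c.name}=${c.value}`).join('; '),
        // ASSUMPTION: the cookie value is URL-encoded and the header
        // expects the decoded token.
        [XSRF_HEADER_NAME]: decodeURIComponent(xsrf.value),
    };
    return label === undefined ? { url, headers } : { url, label, headers };
}

// Possible usage inside the sign-in handler (URL and label are placeholders):
//   const cookies = await page.context().cookies();
//   await crawler.addRequests([
//       buildAuthedRequest('https://example.com/data', cookies, 'PAGE_WITH_DATA'),
//   ]);
```

Keeping the header construction in a plain function makes it easy to compare the result with the headers Postman sends before wiring it into the crawler.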
-
Hi folks,
I'm trying to use Crawlee to scrape data from a website with XSRF token protection. The data I need is behind a sign-in wall.
I set up the session pool to use just one session, expecting it to reuse the same session across all requests, but that doesn't seem to be happening.
I have a chain of handlers like:
This is where, by the look of it, Crawlee destroys the Playwright context and gets stuck on the sign-in page. The context contains cookies with the XSRF token, which Playwright should then use to access the page with the data. I also tried to work around it by calling the website's API directly, like the following:
But that results in a 401. If I don't skip navigation, I see in the browser that the authenticated 'sign in' tab gets closed, then a new 'pageWithData' tab opens and is redirected to the sign-in page. If I skip navigation, the tab is destroyed shortly after signing in and the context is lost.
I reproduced the call in Postman and it works fine with the XSRF token I get after signing in. Could you suggest how to properly share the Playwright context/cookies between handlers in this case?