feat: `browserPerProxy` browser launch option (#2418)
Fixes the performance issues with the new proxy handling in browser crawlers reported by @AndreyBykov's team. The trade-off is weaker proxy anti-blocking by default: as the output below shows, requests can end up leaving through the same proxy IP even though a new proxy URL is generated for each of them.

Consider the following snippet:

```typescript
import { ProxyConfiguration, PuppeteerCrawler } from 'crawlee';

// Generate a fresh proxy URL (random session ID) on every call.
const proxyConfiguration = new ProxyConfiguration({
    newUrlFunction: async () => {
        return `http://session-${Math.random().toString().slice(2,6)}:[email protected]:8000`;
    }
})

const crawler = new PuppeteerCrawler({
    proxyConfiguration,
    requestHandler: async ({ response, proxyInfo }) => {
        // Print the IP address the target site saw for this request.
        console.log((await response?.json()).ip);
    },
    headless: false,
    // `launchContext.browserPerProxy` is `false` by default
});

await crawler.run([
    'https://api.ipify.org/?format=json&q=qnom',
    'https://api.ipify.org/?format=json&q=bugt',
    'https://api.ipify.org/?format=json&q=qfju',
    'https://api.ipify.org/?format=json&q=utbb',
    'https://api.ipify.org/?format=json&q=ekqu',
]);
```

```
INFO  System info {"apifyVersion":"3.1.16","apifyClientVersion":"2.9.3","crawleeVersion":"3.9.0","osType":"Linux","nodeVersion":"v20.2.0"}
INFO  PuppeteerCrawler: Starting the crawler.
139.28.120.90
139.28.120.90
139.28.120.90
139.28.120.90
139.28.120.90
INFO  PuppeteerCrawler: All requests from the queue have been processed, the crawler will shut down.
INFO  PuppeteerCrawler: Final request statistics: {"requestsFinished":5,"requestsFailed":0,"retryHistogram":[5],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":1189,"requestsFinishedPerMinute":86,"requestsFailedPerMinute":0,"requestTotalDurationMillis":5946,"requestsTotal":5,"crawlerRuntimeMillis":3489}
INFO  PuppeteerCrawler: Finished! Total 5 requests: 5 succeeded, 0 failed. {"terminal":true}

real    0m6,358s
user    0m6,097s
sys     0m0,929s
```

-------

With `browserPerProxy` enabled, the same snippet takes roughly twice as long to run, but the result is correct: every request reports a different IP address.
```diff
 const proxyConfiguration = new ProxyConfiguration({
     newUrlFunction: async () => {
         return `http://session-${Math.random().toString().slice(2,6)}:[email protected]:8000`;
     }
 })

 const crawler = new PuppeteerCrawler({
     proxyConfiguration,
     requestHandler: async ({ response, proxyInfo }) => {
         console.log((await response?.json()).ip);
     },
     headless: false,
+    launchContext: {
+        browserPerProxy: true,
+    }
 });

 await crawler.run([
     'https://api.ipify.org/?format=json&q=qnom',
     'https://api.ipify.org/?format=json&q=bugt',
     'https://api.ipify.org/?format=json&q=qfju',
     'https://api.ipify.org/?format=json&q=utbb',
     'https://api.ipify.org/?format=json&q=ekqu',
 ]);
```

```
INFO  PuppeteerCrawler: Starting the crawler.
119.13.197.92
43.228.238.111
107.175.80.114
104.165.1.67
192.3.93.50
INFO  PuppeteerCrawler: All requests from the queue have been processed, the crawler will shut down.
INFO  PuppeteerCrawler: Final request statistics: {"requestsFinished":5,"requestsFailed":0,"retryHistogram":[5],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":2263,"requestsFinishedPerMinute":34,"requestsFailedPerMinute":0,"requestTotalDurationMillis":11317,"requestsTotal":5,"crawlerRuntimeMillis":8765}
INFO  PuppeteerCrawler: Finished! Total 5 requests: 5 succeeded, 0 failed. {"terminal":true}

real    0m11,610s
user    0m12,990s
sys     0m3,295s
```

---------

Co-authored-by: Martin Adámek <[email protected]>
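
The trade-off in the two runs above can be put into numbers. The following is only a small sketch for reading the logs: `countDistinctIps` and `slowdown` are hypothetical helpers, not part of Crawlee, and the inputs are copied by hand from the output shown in this commit message.

```typescript
// IPs printed by the run with the default settings (`browserPerProxy` not set).
const defaultRunIps: string[] = [
    '139.28.120.90', '139.28.120.90', '139.28.120.90', '139.28.120.90', '139.28.120.90',
];

// IPs printed by the run with `launchContext.browserPerProxy: true`.
const browserPerProxyRunIps: string[] = [
    '119.13.197.92', '43.228.238.111', '107.175.80.114', '104.165.1.67', '192.3.93.50',
];

// Hypothetical helper: how many different exit IPs a run used.
function countDistinctIps(ips: string[]): number {
    return new Set(ips).size;
}

// Hypothetical helper: how many times longer `candidate` took than `baseline`.
function slowdown(baseline: number, candidate: number): number {
    return candidate / baseline;
}

console.log(countDistinctIps(defaultRunIps));         // 1
console.log(countDistinctIps(browserPerProxyRunIps)); // 5

// Wall-clock time ("real"), in seconds: 6.358 s vs 11.610 s.
console.log(slowdown(6.358, 11.610).toFixed(2));      // "1.83"

// Average finished-request duration, in milliseconds: 1189 ms vs 2263 ms.
console.log(slowdown(1189, 2263).toFixed(2));         // "1.90"
```

With this sample of five requests, the default run used one exit IP and the `browserPerProxy` run used five, at a cost of roughly 1.8x to 1.9x the runtime. Five requests are a very small sample, so the ratio should be read as an indication rather than a benchmark.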
Showing 5 changed files with 69 additions and 3 deletions.