As shown in the screenshot, when comments are scraped normally every log line has the form `DEBUG: Scraped from <200 URL> {content}`. In the screenshot, however, long runs of `DEBUG: Crawled (200) (referer: None)` appear instead. Once this happens, comment.py usually finishes very quickly (possibly because it simply skips the weibo posts it fails to scrape).

I modified comment.py so that the tweet_id is added to each comment's record (see the attached [comment.py.zip](https://github.com/user-attachments/files/17088611/comment.py.zip)). I also changed settings.py, lowering the concurrency from 16 to 8 and raising the upper bound of the random request delay from 1 to 5.
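For reference, a minimal sketch of what the settings change might look like, assuming the project uses Scrapy's standard `CONCURRENT_REQUESTS` and `DOWNLOAD_DELAY` settings; the setting names actually used by this repository may differ:

```python
# settings.py — sketch of the throttling changes described above (assumed setting names)
CONCURRENT_REQUESTS = 8          # lowered from 16 to reduce request pressure
DOWNLOAD_DELAY = 5               # assumed to be the "random request delay upper bound", raised from 1
RANDOMIZE_DOWNLOAD_DELAY = True  # Scrapy then waits 0.5x-1.5x DOWNLOAD_DELAY between requests
```

And a hypothetical sketch of carrying tweet_id into each comment record; the callback name, the `response.meta` key, and the field names are illustrative assumptions, not the actual code in the attached file:

```python
# comment.py — hypothetical sketch of attaching tweet_id to each scraped comment
def parse_comment(self, response):
    # tweet_id is assumed to be passed along in request meta when the comment page is scheduled
    tweet_id = response.meta.get('tweet_id')
    for comment in response.json().get('data', []):
        yield {
            'tweet_id': tweet_id,              # ties the comment back to its weibo post
            'comment_id': comment.get('id'),
            'text': comment.get('text'),
        }
```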