Feat: Scrape Google Scholar and automatically download relevant research papers. #962

Closed
3 tasks done
deepashri30 opened this issue May 14, 2024 · 4 comments
Labels
gssoc GSSoC 2024

Comments

@deepashri30

Describe the feature

Feat: Scrape Google Scholar and automatically download relevant research papers.

  • Saves researchers time by eliminating manual searches and downloads.
  • Increases efficiency by allowing users to target specific research areas with defined keywords.
  • Creates a centralized collection of research papers for easy access and reference.

How it Works:

User Input: Users enter keywords or phrases related to their research topic.
Search Automation: The scraper automatically queries Google Scholar using the provided keywords.
Intelligent Filtering (Optional): The system can be configured to filter results based on additional criteria (e.g., publication date, author, citation count).
Download Management: The scraper retrieves and downloads the full text of relevant research papers (when available, whether from legal or illegal sources). Downloaded papers are saved to a user-defined location.
File Management (Optional): The system can be configured to automatically rename and categorize downloaded papers for better organization.
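
The search and filtering steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Paper` record, `build_search_url`, and `filter_papers` are hypothetical names that do not come from this project, and the base URL and the `as_ylo` ("published since") query parameter are assumptions about how Google Scholar search links look, so they should be verified before use. No network request is made here.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlencode

# Assumed base URL; not confirmed by this issue.
SCHOLAR_URL = "https://scholar.google.com/scholar"


@dataclass
class Paper:
    """Hypothetical record for one search result."""
    title: str
    authors: List[str]
    year: int
    citations: int


def build_search_url(keywords: str, year_from: Optional[int] = None) -> str:
    """Turn the user's keywords into a search URL (User Input + Search Automation)."""
    params = {"q": keywords}
    if year_from is not None:
        # Assumed parameter name for "published since this year".
        params["as_ylo"] = str(year_from)
    return f"{SCHOLAR_URL}?{urlencode(params)}"


def filter_papers(
    papers: List[Paper],
    min_year: Optional[int] = None,
    author: Optional[str] = None,
    min_citations: int = 0,
) -> List[Paper]:
    """Keep only results matching the optional criteria (Intelligent Filtering)."""
    kept = []
    for paper in papers:
        if min_year is not None and paper.year < min_year:
            continue
        # Case-insensitive substring match against any listed author.
        if author is not None and not any(
            author.lower() in name.lower() for name in paper.authors
        ):
            continue
        if paper.citations < min_citations:
            continue
        kept.append(paper)
    return kept
```

For example, `build_search_url("graph neural networks", year_from=2020)` produces a URL with the keywords and start year encoded as query parameters, and `filter_papers(results, min_citations=10)` drops results cited fewer than ten times.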

Add ScreenShots

image

Record

  • I agree to follow this project's Code of Conduct
  • I'm a GSSoC'24 contributor
  • I want to work on this issue

Hi there! Thanks for opening this issue. We appreciate your contribution to this open-source project. We aim to respond or assign your issue as soon as possible.

@DivyaVijay1234

Hello, I would like to work on this issue. Please assign it to me.

@nikhil25803
Member

nikhil25803 commented May 17, 2024

Go ahead @deepashri30

Note

  • Please create a separate module for this, following the existing folder and project structure (if the module already exists, just add your features as functions in it).
  • Do not use the `selenium` web driver, as it is not compatible with all devices and cloud platforms.
  • Before making any changes, please check whether the module you want to add already exists. If it does, add your functionality as a method only; otherwise, make a separate module and class for it.

All the best 👨‍💻

@nikhil25803
Member

Hey @deepashri30, do not add the functionality to download the papers. Just scrape the information and serve it as a JSON response.
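
A minimal sketch of "scrape the information and serve it as JSON", using only the Python standard library (so no `selenium`). Everything here is illustrative: `ResultParser` and `scrape_to_json` are hypothetical names, and the `gs_rt` class on the title heading is an assumption about the result markup that may not match the live page. The HTML is a hard-coded sample, and fetching the page is deliberately left out.

```python
import json
from html.parser import HTMLParser


class ResultParser(HTMLParser):
    """Collect title and link from each <h3 class="gs_rt"> (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._current = None  # result being built while inside a title heading

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "h3" and "gs_rt" in classes:
            self._current = {"title": "", "link": None}
        elif tag == "a" and self._current is not None:
            self._current["link"] = attr_map.get("href")

    def handle_data(self, data):
        # Accumulate text only while inside a title heading.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "h3" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.results.append(self._current)
            self._current = None


def scrape_to_json(html: str) -> str:
    """Parse result titles and links from HTML and return them as a JSON string."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    return json.dumps({"results": parser.results}, indent=2)


# Hard-coded sample standing in for a fetched results page.
SAMPLE_HTML = """
<div class="gs_ri">
  <h3 class="gs_rt"><a href="https://example.org/paper-1">A Sample Paper</a></h3>
</div>
"""

if __name__ == "__main__":
    print(scrape_to_json(SAMPLE_HTML))
```

The function returns metadata only and never downloads a paper, which matches the scope requested above.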
