Added a new category: Other Collection. It contains mostly catalogues and has not yet been checked. Please report any issues you find, and feel free to fork and submit corrections.
A Government of India website hosts many rare books, manuscripts, eBooks, etc.
This script was created to collect those books.
The rare books total more than 1 TB and the manuscripts more than 130 GB. The total size of the eBooks is not known.
This script was written to practise Python. Please don't abuse the website; use the script to download only the items you need.
- Create a new virtual environment, then either activate it (source it) or point the shebang line of the main script at its Python interpreter.
- Install requests and bs4:
  pip install requests
  pip install bs4
- cd to the directory where the script is located.
- Run the script in one of two ways:
  a. Run it with python:
     python ./v1_DownloadAllBooksFromIndianCultureGovIn_Release_270320221.py
  b. Or make the script executable and then run it directly. The shebang line must point to your environment's Python interpreter.
     chmod +x ./v1_DownloadAllBooksFromIndianCultureGovIn_Release_270320221.py
     ./v1_DownloadAllBooksFromIndianCultureGovIn_Release_270320221.py
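Before running the script, it can help to confirm that the two packages from the install step are importable in the active environment. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and is not part of the script.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "requests" and "bs4" are the two packages named in the install step.
    missing = missing_packages(["requests", "bs4"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, the virtual environment is probably not the one that is active, or the install step was run with a different interpreter.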
To download all three categories of PDF (rare books, manuscripts, eBooks) without being prompted,
replace
download_this_category = input('do you want to download this category of PDF? yes(y), No(n)\n')
with
download_this_category = 'y'
To download every PDF in each category without being prompted,
replace
download_this_book = input('Do you want to download this book. Yes(y), No(n)?\n')
with
download_this_book = 'y'
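As an alternative to editing both input() lines by hand, the two prompts could be routed through one small helper controlled by a single switch. This is only a sketch: the names ask and AUTO_YES are hypothetical and do not exist in the script, and only the two prompt strings are taken from it.

```python
def ask(prompt, auto_yes, reader=input):
    """Return 'y' without prompting when auto_yes is true, else ask the user.

    `reader` defaults to the built-in input() and can be swapped for testing.
    The answer is stripped and lower-cased, which is slightly more lenient
    than a bare input() call.
    """
    if auto_yes:
        return "y"
    return reader(prompt).strip().lower()


# Hypothetical switch: True answers every prompt with 'y', False asks as before.
AUTO_YES = True

download_this_category = ask(
    'do you want to download this category of PDF? yes(y), No(n)\n', AUTO_YES)
download_this_book = ask(
    'Do you want to download this book. Yes(y), No(n)?\n', AUTO_YES)

print(download_this_category, download_this_book)
```

With AUTO_YES set to False the helper falls back to the interactive prompts, so the same code serves both selective and unattended downloads.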