Skip to content

A Python tool for downloading Issuu documents as PDFs and a JavaScript snippet for extracting publication URLs from Issuu user pages. For educational use only.

License

Notifications You must be signed in to change notification settings

AhmedOsamaMath/issuu-downloader

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Issuu Downloader

This repository contains two scripts for downloading and extracting content from publicly accessible Issuu pages. These scripts are intended for educational purposes only. Please ensure you comply with Issuu's terms of service when using these tools.

Files

  1. downloader.py – A Python script to download an Issuu document as a PDF by fetching individual pages and converting them into a single PDF.
  2. url-scraper.js – A JavaScript snippet to be executed in the browser console to extract all project URLs from an Issuu user's page.

1. downloader.py

The Python script downloads pages from an Issuu document (provided via a URL) as images and converts them into a single PDF file. The tool supports both individual URLs and batch processing from a file.

Dependencies

  • Python 3.x
  • requests library
  • img2pdf library
  • re and os (built-in modules)

You can install the necessary dependencies by running:

pip install requests img2pdf

How to Use

  1. Direct URL input:

    • When prompted, input an Issuu document URL in the format:
      https://issuu.com/username/docs/document_id
      
  2. Batch processing:

    • If you have a file containing multiple Issuu URLs (one per line), you can input the file path when prompted.
  3. The downloaded PDF will be saved in the current directory with the name based on the document ID.

Example:

Enter the Issuu PDF URL (or press Enter to input a file path): https://issuu.com/someuser/docs/document_id

The output file will be saved as document_id.pdf.

Script Flow:

  • The script first fetches the document's JSON data.
  • It then extracts the image URLs of the document pages.
  • The images are downloaded and combined into a single PDF.
  • Temporary images are cleaned up after conversion.

2. url-scraper.js

This JavaScript snippet can be used to extract all publication URLs from an Issuu user's page. It needs to be run in the browser's console on an Issuu user profile page.

How to Use

  1. Open the developer console (press F12 or Ctrl + Shift + I in most browsers).
  2. Navigate to the Issuu user's page (e.g., https://issuu.com/jotunpaintsarabia).
  3. Paste the contents of url-scraper.js into the console and press Enter.
  4. The console will output the URLs of all publications found on the page.

Script Example:

/* JavaScript Code for Console */

// Select all publication links
let links = document.querySelectorAll('div[data-testid="publication-card"] a');

// Loop through each link and log the href attribute
links.forEach((link, index) => {
    console.log(`Link ${index + 1}: ${link.href}`);
});

Notes

  • Educational Use Only: These tools are intended for educational purposes. Always respect the terms of service of Issuu and other websites.
  • Responsibility: The repository author is not responsible for any misuse of these tools.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

A Python tool for downloading Issuu documents as PDFs and a JavaScript snippet for extracting publication URLs from Issuu user pages. For educational use only.

Topics

Resources

License

Stars

Watchers

Forks