Developer Documentation

عبدالمنان edited this page Dec 10, 2024 · 1 revision

Installation

Step 1: Clone Repository

git clone https://github.com/msmannan00/Orion-Crawler.git
cd Orion-Crawler

Step 2: Build and Start the Docker Containers

docker-compose up --build

This command will build and start the following services:

API Service (api): The main web crawler service, which runs according to the predefined settings.
MongoDB (mongo): Database for storing crawled data.
Redis (redis_server): In-memory data store for caching and task queuing.
Tor Containers (tor-extend-*): Multiple Tor instances to route crawler traffic through different Tor exit nodes.
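
To confirm that the services listed above came up, one option is to compare the Compose service list against the expected names. The following is a minimal sketch, assuming the service names api, mongo and redis_server match those in the repository's Compose file; missing_services is a hypothetical helper and not part of the project.

```shell
# Hypothetical helper: reads service names (one per line) on stdin and
# prints every expected name, passed as an argument, that is not in the list.
missing_services() {
    running=$(cat)
    for svc in "$@"; do
        # -x matches the whole line, -q suppresses output
        printf '%s\n' "$running" | grep -qx -- "$svc" || echo "$svc"
    done
}
```

It could be used as `docker-compose ps --services | missing_services api mongo redis_server`; empty output would mean all three names were reported by Compose.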

Step 3: Run the Web Crawler

You can run the web crawler in two ways:

Direct Execution:

  • Copy the app/libs/nltk_data folder to AppData on Windows or to your home directory on Linux.
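
On Linux, the copy step above can be sketched as a small shell function. This assumes the command is run from the repository root and that the folder should end up as ~/nltk_data, which is one of the locations NLTK searches by default; copy_nltk_data is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: copy the bundled NLTK data into a folder NLTK can find.
# Defaults assume the repository root as working directory (an assumption).
copy_nltk_data() {
    src="${1:-app/libs/nltk_data}"
    dest="${2:-$HOME/nltk_data}"
    if [ ! -d "$src" ]; then
        echo "source folder not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest"
    # Copy the contents of src, including hidden files, into dest
    cp -R "$src"/. "$dest"/
}
```

Calling `copy_nltk_data` with no arguments uses the defaults; on Windows the equivalent destination would be the nltk_data folder under AppData.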

2. Install Dependencies

Ensure you have Docker and Docker Compose installed on your machine. Once they are installed, Docker Compose handles the remaining dependencies.
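
A quick way to check the prerequisites is to look the tools up on the PATH. The sketch below assumes the command names docker and docker-compose (newer installations may expose Compose only as `docker compose`); missing_tools is a hypothetical helper.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Running `missing_tools docker docker-compose` prints nothing when both commands are available.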

3. Build and Start the Search

Use Docker Compose to build and run the search:

Create a .env file and paste in the following content, adjusting the values for your environment:

# GENERAL KEYS
S_FERNET_KEY='^@ve!A#(UpMVtTRDx)&ZLXfsjqmIYHCP'
S_APP_BLOCK_KEY='vZ^BbKMzxra!ESkGfVcjLNP$sTe@RJI%Cd#yng*XD&A(UhutFq'
REDIS_PASSWORD='B8WTLk5QW69YF9VE1sV3iimCnWpYqSSjwt1ub2PPi0WMRBMpVojYCXA'

# MONGO KEYS
MONGO_ROOT_USERNAME='admin'
MONGO_ROOT_PASSWORD='rT2hzvlYnCG6nXbCrpw0f0AgssekarUw1dYEaaoZds0qfuu0VwkJi6W'

# TOR KEYS
TOR_PASSWORD='TK2JyQEU9T2K4B7eVhmx1aE7yfWZKZqqfuaI7Bb3t3RnId4N6ZTrcZl'

# CELERY KEYS
CELERY_WORKER_COUNT=30

# FLOWER KEYS
FLOWER_USERNAME='admin'
FLOWER_PASSWORD='qdISx1JoJto2z1lgtkXJw5myqwf5Q2BsnlgPOkQUcUg1RtS1nELQyNQ'

# SERVER KEYS
S_SERVER=http://192.168.3.20:8080
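
Before building, it may help to verify that the .env file defines the keys shown above. This is a minimal sketch that only checks for lines starting with KEY=; it does not validate the values, and missing_env_keys is a hypothetical helper rather than part of run.sh.

```shell
# Hypothetical helper: print each key that has no KEY=... line in the file.
# Usage: missing_env_keys FILE KEY [KEY...]
missing_env_keys() {
    env_file="$1"
    shift
    for key in "$@"; do
        grep -q "^${key}=" "$env_file" || echo "$key"
    done
}
```

For example, `missing_env_keys .env S_FERNET_KEY REDIS_PASSWORD MONGO_ROOT_PASSWORD TOR_PASSWORD` lists any of those keys that are absent.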

Then build the services:

./run.sh build

To simply start the search, run:

./run.sh

Usage Pointers

  1. To view highly optimized, thread-level logs, go to http://dozzle.localhost:5555/