BigGIS Spark

Docker container for Apache Spark

Prerequisites

Docker Compose >= 1.9.0

Deployment

On local setup:

docker-compose up -d

On Rancher:

Note: The HDFS stack must be deployed and running before the Spark stack is deployed.

An NFS server and the Rancher NFS service need to be configured in the cluster. The NFS volume spark-home, which is needed for Apache Zeppelin, needs to be created via the Rancher WebUI.
Add the host label spark-master=true to one of your hosts.
Create a new Spark stack named spark via the Rancher WebUI and deploy docker-compose.rancher.yml.

Submit Spark Job

Build the Spark sample job:

cd job/spark-example
mvn clean package

Using Spark Client Image

The image biggis/spark-client:2.1.0 can be used to submit Spark jobs to the Spark cluster. Edit the environment variables and volumes in docker-compose.client.yml according to your setup and specify which Spark job (jar and class) to submit. The jar file is mapped as a local volume.

Then run the docker-compose.client.yml file as follows:

docker-compose -f docker-compose.client.yml run --rm spark-client

Using Spark REST API

You can also upload the job jar to HDFS and deploy the Spark job via curl.

Example: WordCount hamlet.txt

Upload hamlet.txt from the biggis-hdfs repository:

curl -u hdfs:password \
     -F 'file=@data/hamlet.txt' \
     -X POST 'http://localhost:3000/api/v1/upload/files?hdfspath=/demo/hamlet.txt'

Upload the packaged spark-example-1.0-SNAPSHOT.jar:

curl -u hdfs:password \
     -F 'job=@job/spark-example/target/spark-example-1.0-SNAPSHOT.jar' \
     'http://localhost:3000/api/v1/upload/jobs?hdfspath=/jobs/spark-example'

Deploy the Spark job:

curl -X POST http://localhost:6066/v1/submissions/create \
     --header "Content-Type:application/json;charset=UTF-8" \
     --data @wordcount-job.json
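
The contents of wordcount-job.json are not shown in this README. As a rough sketch, the request body accepted by the Spark standalone REST server on port 6066 is a CreateSubmissionRequest JSON document. The helper below prints such a body. The function name spark_job_json, the master host, the HDFS URI, and the main class used in the usage note are all hypothetical placeholders, not values taken from this repository; compare with the wordcount-job.json shipped here before relying on it.

```shell
# Print a CreateSubmissionRequest body for the Spark standalone REST server.
# Usage: spark_job_json MASTER_URL JAR_URI MAIN_CLASS [APP_ARG...]
# Assumption: values contain no characters that need JSON escaping.
spark_job_json() {
  master=$1
  jar=$2
  main_class=$3
  shift 3

  # Join the remaining arguments into a JSON array body: "a", "b", ...
  args=""
  for a in "$@"; do
    args="${args:+$args, }\"$a\""
  done

  cat <<EOF
{
  "action": "CreateSubmissionRequest",
  "appResource": "$jar",
  "mainClass": "$main_class",
  "appArgs": [$args],
  "clientSparkVersion": "2.1.0",
  "environmentVariables": { "SPARK_ENV_LOADED": "1" },
  "sparkProperties": {
    "spark.app.name": "$main_class",
    "spark.jars": "$jar",
    "spark.master": "$master",
    "spark.submit.deployMode": "cluster",
    "spark.driver.supervise": "false"
  }
}
EOF
}
```

For example, `spark_job_json spark://spark-master:6066 hdfs://namenode:8020/jobs/spark-example/spark-example-1.0-SNAPSHOT.jar com.example.WordCount /demo/hamlet.txt > my-job.json` writes a body that can be passed to the curl command above with `--data @my-job.json` (host names and class name are placeholders). The server's response should contain a submissionId, which can then be polled at http://localhost:6066/v1/submissions/status/ followed by that id.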

Ports

The Spark WebUI runs on port 8080.
