This is a simple Apache Spark application intended for use with Apache Kafka and the ELK stack. It uses machine learning to analyze network traffic generated by the SS7 Attack Simulator.
The application reads preprocessed data from the Kafka topic ss7-preprocessed, which is produced by SS7 ML Preprocessing, and writes its results to the Elasticsearch index ss7-ml-results.
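The data flow described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: it assumes Spark Structured Streaming with the spark-sql-kafka-0-10 and elasticsearch-spark connectors on the classpath, the default ports 9092 (Kafka) and 9200 (Elasticsearch), and a hypothetical checkpoint directory. The object name and the omitted analysis step are placeholders.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: the real application's structure may differ.
object KafkaToElasticsearchSketch {
  def main(args: Array[String]): Unit = {
    // args(0) = Spark master URL, args(1)/args(2) = Elasticsearch credentials
    val spark = SparkSession.builder()
      .appName("ss7-ml-sketch")
      .master(args(0))
      .getOrCreate()

    // Read the preprocessed records from Kafka (port 9092 is an assumption).
    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "ss7-preprocessed")
      .load()
      .selectExpr("CAST(value AS STRING) AS value")

    // The machine learning step would go here; this sketch forwards records unchanged.

    // Write to Elasticsearch (port 9200 and the checkpoint path are assumptions).
    val query = input.writeStream
      .format("es")
      .option("es.nodes", "localhost")
      .option("es.port", "9200")
      .option("es.net.http.auth.user", args(1))
      .option("es.net.http.auth.pass", args(2))
      .option("checkpointLocation", "/tmp/ss7-ml-checkpoint")
      .start("ss7-ml-results")

    query.awaitTermination()
  }
}
```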
This project is part of a master's thesis currently being written at NTNU Gjøvik, Norway.
Both Elasticsearch and Apache Kafka must be running on localhost. The application expects the following command line arguments, in this order:
- URL of the Spark master, in the form spark://host:port.
- Username for the Elasticsearch cluster.
- Password for the Elasticsearch cluster.
- File path to the training data.
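As a minimal sketch of how these four positional arguments could be validated before the Spark job starts, the snippet below uses a hypothetical AppConfig case class. The names, the error messages, and the check for the spark:// prefix are assumptions made for illustration and are not taken from the project itself.

```scala
// Hypothetical container for the four positional arguments listed above.
final case class AppConfig(
    masterUrl: String,
    esUsername: String,
    esPassword: String,
    trainingDataPath: String
)

object AppConfig {
  // Returns Right(config) when exactly four arguments are given and the
  // master URL starts with spark://, otherwise Left(message).
  def parse(args: Array[String]): Either[String, AppConfig] = args match {
    case Array(master, user, password, path) =>
      if (master.startsWith("spark://")) Right(AppConfig(master, user, password, path))
      else Left(s"Master URL must have the form spark://host:port, got: $master")
    case _ =>
      Left("Usage: <master URL> <Elasticsearch username> <Elasticsearch password> <training data path>")
  }
}
```

A main method could then call AppConfig.parse(args), print the message and exit on Left, and continue with the parsed values on Right.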
The application can be started using spark-submit:
bin/spark-submit \
--class Main \
--master spark://host:port \
[application jar] \
[Master URL] \
[Elasticsearch username] \
[Elasticsearch password] \
[Path for training data]
The project uses sbt as its build tool. To build the project, run the assembly plugin:
sbt assembly
This creates a fat jar that bundles the application with the dependencies it needs to run on Spark.
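For readers setting up a similar build, the sbt-assembly plugin is typically enabled along the following lines. This is a sketch, not the project's actual build definition: the plugin version is only an example and the jar name is hypothetical.

```scala
// project/plugins.sbt
// Registers the sbt-assembly plugin (example version; choose one that matches your sbt release).
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.1")

// build.sbt
// Main class, matching the --class Main flag passed to spark-submit above.
assembly / mainClass := Some("Main")
// Hypothetical jar name; without this setting sbt-assembly derives it from the project name and version.
assembly / assemblyJarName := "ss7-ml-analysis.jar"
```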