A simple Apache Spark Streaming application that preprocesses SS7 network traffic delivered over Kafka. It extracts relevant features from traffic generated by the SS7 Attack Simulator.
The application reads data from the Kafka topic ss7-raw-input on localhost and writes its output to both the Elasticsearch index ss7-ml-preprocessed and the Kafka topic ss7-preprocessed.
Both Elasticsearch and Apache Kafka must be running on localhost. The application expects the following command-line arguments, in this order:
- URL of the Spark master, in the form spark://host:port.
- Username for the Elasticsearch cluster.
- Password for the Elasticsearch cluster.
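As a minimal sketch of how these three positional arguments could be validated before any Spark context is created, the snippet below parses them into a small case class. The names AppArgs and parse are hypothetical and not taken from the project source. The check on the spark:// prefix is an assumption based on the form stated above, and it would reject other master strings such as local[*].

```scala
// Hypothetical holder for the three positional arguments (not from the project source).
final case class AppArgs(masterUrl: String, esUser: String, esPassword: String)

object AppArgs {
  /** Parse the positional arguments; return an error message instead of throwing. */
  def parse(args: Array[String]): Either[String, AppArgs] = args match {
    // Exactly three arguments, and the first looks like a standalone master URL.
    case Array(master, user, password) if master.startsWith("spark://") =>
      Right(AppArgs(master, user, password))
    // Three arguments, but the master URL has an unexpected form (assumed check).
    case Array(master, _, _) =>
      Left(s"Master URL must have the form spark://host:port, got: $master")
    // Any other argument count.
    case _ =>
      Left("Usage: <master-url> <es-username> <es-password>")
  }
}
```

A main method could call AppArgs.parse(args) and exit with the returned message on a Left, so that a missing or misordered argument fails early rather than during cluster startup.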
The application can be started using spark-submit:
bin/spark-submit \
--class Main \
--master spark://host:port \
[application jar] \
[Master URL] \
[Elasticsearch username] \
[Elasticsearch password]
The project uses sbt as its build tool. To build the project, use the assembly plugin:
sbt assembly
This creates a fat jar that bundles the application together with the dependencies it needs to run on Spark.