(Entomomachia) Error-classification-ISD

The aim of this project is to implement a complex distributed architecture with microservices to build an interface that could be used by a user to share some code and get predictions of errors and mutations based on the code of common repositories from defect4j-dissection.

The prediction is done by a neural network (multilayer perceptrons) in the Spark framework, the model is trained with the data from defect4j, the resultant predictions are a vector of feature for the code and 2 labels(one label for errors, 1 label for mutations). Code with the same label is deemed similar by the framework and will be returned to end users as possible Errors or Mutations.

The application exposes a web interface(for normal users) and some client API(REST)

This project was built for a course at university (engineering of distributed systems).

Technologies used in this project:

Elasticsearch as a repository for useful and labelled code and errors.

Spark as the main classification framework and backend behind the prediction

Spring Framework as the main component for the web server and the backend for communication between end users and prediction framework

Kafka as the messaging and event system that makes the whole project event driven.

Docker and Docker-compose to simply create microservices, orchestrate the whole project and make It Scalable.

Redis for the transaction database.

IMPLEMENTED

Classification framework for errors and mutations based on code passed on input

receiving and sending streaming data from kafka, to kafka(to revise after Client and REST API backend is done)

Querying elasticsearch and create a remote facade as an interface to the queries(other queries could be added, especially for ignoring users of other organizations).

REST APIs

Communication framework with kafka, client side, switched with a request-responce pattern to decouple the client request and the whole backbone, the request are now done with transactions registered in redis, when the whole computation of possible errors and mutation has finished, the result are written to redis.

Sharing codes and errors via REST API to improve the model

Propagation of errors through the transaction status(to enhance for further problems with data)

PARTIALLY IMPLEMENTED

Access control for users with different roles (access to certain code, access to all code for admins, implementation in elastic-facade done , more controls over access and authentication and authorization need to be implemented)

Aspects logging

Tests for the elastic-facade-server and minor parts of the code in other services

Better logging for the elastic-facade server and client, along with more detailed logging for the spring REST interface(and for the web interface as well)

TODOs

WEB interface(It was in production but the backbone was more important)

Tracing

logging analysis by passing logs in ElasticSearch and visualizing them in Kibana (controlling logs file and passing new logs everytime the logfile gets written)

Better error handling

DEPENDENCIES

java openjdk (11 for building and running Spring applications)

docker and docker-compose

git

OPTIONAL DEPENDENCIES FOR TESTING

This dependencies are not required to run the application in docker but are useful when developing and testing the services

java openjdk 8 for testing scala applications, only java 11 is required to build the Spring image

scala and/or sbt

postman or curl to test the REST calls

BUILDING AND RUNNING THE PROJECT

To build and run the whole project

git clone https://github.com/josura/Error-classification-ISD.git
cd backend/RESTinterface
chmod a+x mvnw
./mvnw spring-boot:build-image
cd ../..
docker-compose up

The last bash command(to bring up the whole application with docker) is not stable and most of the times some component fails due to heavy dependencies to other components, so a different approach would be bringing up every service in different windows/tabs in the following order

zookeeper and kafka
elastic-search
redis
general-classifier and improve-labeller-model
elastic-facade-server
elastic-facade-client
rest-interface

USAGE

On a web browser or with http requests, go to http://localhost:8888 and experiment with:

GET requests like http://localhost:8888/id where id is a transaction number

POST requests like http://localhost:8888 to get a transaction ID used to see if the prediction of errors and mutation has finished, with a request body with the following json structure:

{
    "user": "GIORGIO",
    "group": "SELF",
    "code": "if(testing = false){String goodbye = \"\";}"
}

POST requests like http://localhost:8888/share to share some code with errors and the corrected code to enhance the model that predicts errors and mutations, with a request body with the following json structure:

{
    "user": "GIORGIO",
    "group": "ALL",
    "code": "if(testing = false){String goodbye = \"\";}",
    "solution": "if(testing == false){String goodbye = \"\";}"
}

The code sended should not contains any comment and should have good spacing between words and operators.

EXAMPLE

An example is the following:

The user will send a POST request with the following request body:

{
    "user": "GIORGIO",
    "group": "SELF",
    "code": "if(testing = false){String goodbye = \"\";}"
}

The user will receive a response with the following body:

{
    "id":"411",
	"status":"PENDING",
    "resultError":"",
    "resultMutation":""
}

The user will query periodically until status is FINISHED or ERROR, the response body when the transaction has finished is the following:

{
    "id":"411",
	"status":"FINISHED",
    "resultError":"... \"code\" : \"....\", \"solution\": ....  ",
    "resultMutation":"... \"code\" : \"....\", \"solution\": ....  "
}

The resultError is a list of errors associated with solutions, It could be used to take only the solution and give them to the user in the web interface. The resultMutation is also a list of errors associated with solutions, the usage of this result is similar to the one with resultError, but the field taken could be code instead of solution, to predict the mutation.

Name		Name	Last commit message	Last commit date
Latest commit History 212 Commits
.github/workflows		.github/workflows
backend		backend
debugUtil		debugUtil
docs		docs
frontend/web		frontend/web
logs		logs
mining		mining
.gitignore		.gitignore
Entomachia_logo.png		Entomachia_logo.png
README.md		README.md
docker-compose-dev.yml		docker-compose-dev.yml
docker-compose-new.yml		docker-compose-new.yml
docker-compose-snap.yml		docker-compose-snap.yml
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

(Entomomachia) Error-classification-ISD

IMPLEMENTED

PARTIALLY IMPLEMENTED

TODOs

DEPENDENCIES

OPTIONAL DEPENDENCIES FOR TESTING

BUILDING AND RUNNING THE PROJECT

USAGE

EXAMPLE

About

Releases

Packages

Languages

josura/Error-classification-ISD

Folders and files

Latest commit

History

Repository files navigation

(Entomomachia) Error-classification-ISD

IMPLEMENTED

PARTIALLY IMPLEMENTED

TODOs

DEPENDENCIES

OPTIONAL DEPENDENCIES FOR TESTING

BUILDING AND RUNNING THE PROJECT

USAGE

EXAMPLE

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages