diff --git a/INSTALL b/INSTALL
new file mode 100644
index 0000000..b92151b
--- /dev/null
+++ b/INSTALL
@@ -0,0 +1,36 @@
+Usage Instructions:
+-------------------
+For MethodName task as an example,
+
+- Clone `SIVAND` from "https://github.com/mdrafiqulrabin/SIVAND". Here, we need DD.py, helper.py, and MyDD.py files.
+- In `helper.py`, update `` (path to a file that contains all selected inputs) and `` (select token or char type delta for DD).
+ - Then, have to modify "load_model_M()" to load a target model (i.e., code2seq) from ``, and "prediction_with_M()" to get the predicted name, score, and loss value with `` for an input ``.
+ - Also, need to check whether `` is parsable into "is_parsable()" and load method by language (i.e. Java) from "load_method()".
+- Finally, run `MyDD.py` that will simplify programs one by one and save all simplified traces in the `dd_data/` folder.
+
+
+Usage Example:
+--------------
+Here is an example of simplification using code2seq model for MethodName task.
+
+path = <..>/java-large/test/pnikosis__materialish-progress/library/src/main/java/com/pnikosis/materialishprogress/ProgressWheel_setRimColor.java
+method_name = setRimColor
+method_body = public void setRimColor(int rimColor) { this.rimColor = rimColor; setupPaints(); if (!isSpinning) { invalidate(); } }
+predict, score, loss = setRimColor, 0.9996458292007446, 0.0015064467443153262
+
+Trace of simplified code(s):
+{"time": "2021-02-13 03:39:54.916934", "score": "0.9996", "loss": "0.0015", "code": "public void setRimColor(int rimColor) { this.rimColor = rimColor; setupPaints(); if (!isSpinning) { invalidate(); } }", "n_tokens": 44, "n_pass": [1, 1, 1]}
+{"time": "2021-02-13 03:39:56.577097", "score": "0.9999", "loss": "0.0006", "code": "public void setRimColor(int rimColor) { this.rimColor = rimColor; { invalidate(); } }", "n_tokens": 33, "n_pass": [10, 2, 2]}
+{"time": "2021-02-13 03:39:58.246853", "score": "0.9999", "loss": "0.0006", "code": "void setRimColor(int rimColor) { this.rimColor = rimColor; { invalidate(); } }", "n_tokens": 31, "n_pass": [41, 3, 3]}
+{"time": "2021-02-13 03:39:59.928924", "score": "0.8938", "loss": "0.6254", "code": "void setRimColor() { this.rimColor = rimColor; { invalidate(); } }", "n_tokens": 28, "n_pass": [44, 4, 4]}
+{"time": "2021-02-13 03:40:01.244292", "score": "0.8657", "loss": "0.7068", "code": "void setRimColor() {rimColor = rimColor; { invalidate(); } }", "n_tokens": 25, "n_pass": [46, 5, 5]}
+{"time": "2021-02-13 03:40:02.544957", "score": "0.767", "loss": "1.726", "code": "void setRimColor() { rimColor; { invalidate(); } }", "n_tokens": 22, "n_pass": [47, 6, 6]}
+{"time": "2021-02-13 03:40:07.090771", "score": "0.767", "loss": "1.726", "code": "void setRimColor() {rimColor; { invalidate(); } }", "n_tokens": 21, "n_pass": [73, 8, 7]}
+{"time": "2021-02-13 03:40:09.673635", "score": "0.767", "loss": "1.726", "code": "void setRimColor() {rimColor;{ invalidate(); } }", "n_tokens": 19, "n_pass": [76, 10, 8]}
+{"time": "2021-02-13 03:40:11.640979", "score": "0.767", "loss": "1.726", "code": "void setRimColor(){rimColor;{ invalidate(); } }", "n_tokens": 18, "n_pass": [87, 11, 9]}
+{"time": "2021-02-13 03:40:16.230386", "score": "0.767", "loss": "1.726", "code": "void setRimColor(){rimColor;{invalidate(); } }", "n_tokens": 17, "n_pass": [112, 13, 10]}
+{"time": "2021-02-13 03:40:17.521532", "score": "0.767", "loss": "1.726", "code": "void setRimColor(){rimColor;{invalidate();} }", "n_tokens": 16, "n_pass": [117, 14, 11]}
+{"time": "2021-02-13 03:40:18.838683", "score": "0.767", "loss": "1.726", "code": "void setRimColor(){rimColor;{invalidate();}}", "n_tokens": 15, "n_pass": [119, 15, 12]}
+
+Minimal simplified code:
+void setRimColor(){rimColor;{invalidate();}}
diff --git a/README.md b/README.md
index 21a8519..4e00206 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 ## SIVAND: Prediction-Preserving Program Simplification
 
-This repository contains the code of prediction-preserving simplification and the simplified data using DD module for our paper '[Understanding Neural Code Intelligence Through Program Simplification](https://github.com/mdrafiqulrabin/SIVAND/)' accepted at ESEC/FSE'21.
+This repository contains the code of prediction-preserving simplification, and the simplified data using DD module for our paper 'Understanding Neural Code Intelligence Through Program Simplification' accepted at ESEC/FSE'21.
 
 ---
 
@@ -27,11 +27,17 @@ This repository contains the code of prediction-preserving simplification and th
 |Workflow in SIVAND|
 :-------------------------:
 
-[Delta Debugging (DD)](https://www.st.cs.uni-saarland.de/dd/) was implemented with Python 2. We have modified the core modules ([DD.py](https://www.st.cs.uni-saarland.de/dd/DD.py), [MyDD.py](https://www.st.cs.uni-saarland.de/dd/MyDD.py)) to run in [Python 3](https://github.com/mdrafiqulrabin/dd-py3), and then adopted the DD modules for prediction-preserving program simplification using different models. The approach, SIVAND, is model-agnostic and can be applied to any model, but by loading a model and making a prediction with the model for a task.
+[Delta Debugging (DD)](https://www.st.cs.uni-saarland.de/dd/) was implemented with Python 2. We have modified the core modules ([DD.py](https://www.st.cs.uni-saarland.de/dd/DD.py), [MyDD.py](https://www.st.cs.uni-saarland.de/dd/MyDD.py)) to run in [Python 3 (we use Python 3.7.3)](https://github.com/mdrafiqulrabin/dd-py3), and then adopted the DD modules for prediction-preserving program simplification using different models. The approach, SIVAND, is model-agnostic and can be applied to any model by loading a model and making a prediction with the model for a task.
 
-**How to Start**: To apply SIVAND (for MethodName task as an example), first update `` (path to a file that contains all selected inputs) and `` (select token or char type delta for DD) in `helper.py`. Then, write your code into `load_model_M()` to load a target model from ``, and into `prediction_with_M()` to get the predicted name, score, and loss value with `` for an input ``. Also, load method by language (i.e. Java) from `load_method()` and check whether `` is parsable into `is_parsable()`. Finally, run `MyDD.py` that will simplify programs one by one and save all simplified traces in the `dd_data/` folder.
+**How to Start**:
+To apply SIVAND (for MethodName task as an example), first update `` (path to a file that contains all selected inputs) and `` (select token or char type delta for DD) in `helper.py`.
+Then, modify `load_model_M()` to load a target model (i.e., code2vec/code2seq) from ``, and `prediction_with_M()` to get the predicted name, score, and loss value with `` for an input ``.
+Also, check whether `` is parsable into `is_parsable()`, and load method by language (i.e. Java) from `load_method()`.
+Finally, run `MyDD.py` that will simplify programs one by one and save all simplified traces in the `dd_data/` folder.
 
-**Check More**: For more detailed code, check `models/dd-code2vec/` and `models/dd-code2seq/` folders to see how SIVAND works with code2vec and code2seq models for MethodName task on Java program. Similarly, for VarMisuse task (RNN & Transformer models, Python program), check the `models/dd-great/` folder for our modified code.
+**More Details**:
+Check `models/dd-code2vec/` and `models/dd-code2seq/` folders to see how SIVAND works with code2vec and code2seq models for MethodName task on Java program.
+Similarly, for VarMisuse task (RNN & Transformer models, Python program), check the `models/dd-great/` folder for our modified code.
 
 ---
 
diff --git a/REQUIREMENTS b/REQUIREMENTS
new file mode 100644
index 0000000..728a6d4
--- /dev/null
+++ b/REQUIREMENTS
@@ -0,0 +1,16 @@
+# python3 --version
+# Python 3.7.3
+
+# packages for MN (code2vec/code2seq)
+tensorflow == 1.15.0
+numpy
+
+# packages for VM (RNN/Transformer)
+tensorflow == 2.2.0
+pyaml
+
+# packages for SIVAND
+pathlib
+datetime
+javalang
+pandas
diff --git a/STATUS b/STATUS
new file mode 100644
index 0000000..5a667ca
--- /dev/null
+++ b/STATUS
@@ -0,0 +1,7 @@
+Artifacts Available:
+We apply for this badge because we provide a link to the publicly accessible GitHub repository
+where our artifact is permanently stored and available.
+
+Artifacts Evaluated:
+We apply for this badge because we document the artifact and share our code to reuse/replicate by other researchers,
+and we also include the detailed simplified results as appropriate evidence of verification and validation.
diff --git a/paper.pdf b/paper.pdf
new file mode 100644
index 0000000..0cb214b
Binary files /dev/null and b/paper.pdf differ
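
Note on the simplification loop described in INSTALL: the sketch below illustrates the idea behind token-level, prediction-preserving reduction. It is not the code in `DD.py` or `MyDD.py`. It is a simplified ddmin variant that only tries removing chunks (complements), and `keeps_prediction` is a hypothetical stand-in for the real check, which would call `is_parsable()` and `prediction_with_M()` on the candidate program. The token list and the toy rule inside `keeps_prediction` are assumptions made only for this illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical token-level input: the setRimColor method from the usage example,
# pre-split on whitespace for simplicity.
TOKENS = (
    "public void setRimColor ( int rimColor ) { this . rimColor = rimColor ; "
    "setupPaints ( ) ; if ( ! isSpinning ) { invalidate ( ) ; } }"
).split()


def keeps_prediction(candidate: List[str]) -> bool:
    """Toy stand-in for the model check (assumption, not a real model call)."""
    return "setRimColor" in candidate and "invalidate" in candidate


def ddmin(tokens: Sequence[str],
          test: Callable[[List[str]], bool]) -> List[str]:
    """Shrink `tokens` while `test` stays True, by removing chunks of tokens."""
    current = list(tokens)
    n = 2  # granularity: roughly how many chunks the input is split into
    while len(current) >= 2:
        chunk = max(1, len(current) // n)
        subsets = [current[i:i + chunk] for i in range(0, len(current), chunk)]
        reduced = False
        for i in range(len(subsets)):
            # Candidate = everything except chunk i, original order preserved.
            candidate = [t for j, s in enumerate(subsets) if j != i for t in s]
            if candidate and test(candidate):
                current = candidate
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(current):
                break  # single-token granularity reached; nothing removable
            n = min(n * 2, len(current))
    return current


if __name__ == "__main__":
    print(" ".join(ddmin(TOKENS, keeps_prediction)))
```

With the toy rule, the loop stops once removing any single remaining token would change the "prediction", which mirrors how the trace in INSTALL ends at a program that cannot be shrunk further without losing the predicted name.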