Eye in the Sky: Monitoring Financial Fraud using Satellite Data

The contributions of this research are two-fold. First, our research contributes to the financial fraud detection by introducing satellite data to achieve a better performance. The performance improvement confirms the existence of relationship between satellite data and firm financial performance, which suggests that satellite data can be used as signals for firm financial analysis. Second, this study proposes a framework to integrate multiple sources of features. A method of fusing features and classifiers is designed to reduce the declarative and procedural bias for a given classification task. Experiments show that this design leads to further performance improvement for fraud detection.

This repository stores code from data fusion to model prediction, but the code capability is not strong, for reference only. All code is compiled using Python 3.8.

Data Fusion
Prediction
Feature selection

Data Fusion

Environment: Pytorch 1.x

Data_fusion. py contains three data fusion methods, namely Concat, CompactBilinarPooling, and TensorFusion. After generating three different fusion data, they are compared on CM4.

Prediction

Environment：Tensorflow 2.x

Baseline

The baseline shows the model from CM1 to CM4. The evaluation criteria predicted by LR, SVM, XGBoost, and CNN will be presented separately.

CM5

cm5. py shows the stacking model CM5 that only uses three types of data fusion.

CM6

cm6. py shows the final stacking model. Replacing different fused datasets in cm6.py will also draw different feature importance. stacking roc. py displays the ROC curves of the CM4, CM5, and CM6 models, demonstrating the superiority of the CM6 model, while cost_roc. py displays the ROC curves of the CM6 model after adding error costs.

The above three prediction files can be directly run together through main.py. And the cm6_demo can display the prediction process and results of the cm6 model, running completely in about 20 minutes.

Feature selection

The top30_features.py shows the importance ranking of three types of feature data, with the higher the ranking, the greater the impact of this feature on fraud prediction.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
code		code
data		data
README.md		README.md
cm6_demo.ipynb		cm6_demo.ipynb
example.png		example.png
main.py		main.py
requirement.txt		requirement.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Eye in the Sky: Monitoring Financial Fraud using Satellite Data

Data Fusion

Prediction

Baseline

CM5

CM6

Feature selection

About

Releases 1

Packages

Languages

huangzy2333/Finance-Fraud

Folders and files

Latest commit

History

Repository files navigation

Eye in the Sky: Monitoring Financial Fraud using Satellite Data

Data Fusion

Prediction

Baseline

CM5

CM6

Feature selection

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages