CPU: Intel 8-Core i7-11800
GPU: RTX 3060
keras
pandas
sklearn
seaborn
statsmodels
tensorflow
pymongo[srv]
This project collects AAPL historical stock data from April 2017 to April 2021 and uses it to predict the closing price on each trading day in May 2021. The goal is to compare a multi-step-output LSTM trained on closing prices alone against the same model trained on closing prices plus auxiliary features. The auxiliary features are the Nasdaq and Dow Jones stock indices, Google Trends data, and the USD/CNY exchange rate.
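As a rough illustration of the multi-step setup described above, the sketch below slices a series into input windows and multi-step targets. It is not the project's actual code: the function name `make_windows` and the parameters `n_in` and `n_out` are hypothetical, and the toy arrays stand in for real prices and auxiliary features.

```python
import numpy as np


def make_windows(features, target, n_in, n_out):
    """Slice a series into (input window, multi-step target) pairs.

    features: array of shape (T,) or (T, n_features)
    target:   array of shape (T,), e.g. the closing price
    n_in:     number of past steps fed to the model (hypothetical name)
    n_out:    number of future steps predicted at once (hypothetical name)
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    if features.ndim == 1:
        features = features[:, None]  # treat a 1-D series as one feature
    n_samples = len(target) - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    X = np.stack([features[i:i + n_in] for i in range(n_samples)])
    y = np.stack([target[i + n_in:i + n_in + n_out] for i in range(n_samples)])
    return X, y


# Toy stand-ins for the closing price and one auxiliary series.
close = np.arange(10, dtype=float)
aux = np.arange(10, dtype=float) * 2

# Variant A: closing prices only. Variant B: closing prices plus auxiliary data.
X_a, y_a = make_windows(close, close, n_in=4, n_out=2)
X_b, y_b = make_windows(np.column_stack([close, aux]), close, n_in=4, n_out=2)
```

Both variants share the same targets; only the number of input features per time step differs, which is what makes the two training runs comparable.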
- AAPL stock data acquired via API
- AAPL stock data downloaded from the Internet
- Nasdaq index acquired via API, over the same dates as the AAPL stock data
- Dow Jones index acquired via API, over the same dates as the AAPL stock data
- Google search trend for APPLE, weekly, 2017/4/1-2021/6/1, from the Internet
- USD/CNY exchange rate, weekly, 2017/4/1-2021/6/1, from the Internet
- Store and retrieve data.
- Upload to MongoDB and test the API.
Task 3.1: Clean missing values and outliers from the data, if any
3.1.1 Use Z-scores to detect outliers in the AAPL stock dataset and cap them
3.1.2 Use Z-scores to detect outliers in the Google Trends and USD/CNY data and cap them
3.1.3 Count the number of missing values in each column of the dataset.
Task 3.2: Data integration
Task 3.3: Data Visualization
3.3.1 AAPL Boxplots monthly
3.3.2 Explore dependency on day of the week and month via carpet plot/heatmap
3.3.3 Autocorrelation of AAPL Close
Task 3.4: Data Normalization and Dimensionality reduction
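The outlier capping and missing-value count in Task 3.1 could look roughly like the sketch below. The threshold of 3 standard deviations, the use of the population standard deviation, and the helper names `cap_outliers_zscore` and `count_missing` are assumptions made for illustration; the toy DataFrame is not project data.

```python
import numpy as np
import pandas as pd


def cap_outliers_zscore(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Clip values whose |z-score| exceeds `threshold` to the boundary value."""
    mean = series.mean()
    std = series.std(ddof=0)  # population standard deviation (assumption)
    if std == 0 or np.isnan(std):
        return series.copy()  # constant or empty series: nothing to cap
    lower = mean - threshold * std
    upper = mean + threshold * std
    return series.clip(lower=lower, upper=upper)


def count_missing(df: pd.DataFrame) -> pd.Series:
    """Number of missing values in each column."""
    return df.isna().sum()


# Toy data: one extreme Close value and one missing Volume value.
prices = pd.DataFrame({
    "Close": [10.0] * 19 + [1000.0],
    "Volume": [1.0, np.nan, 3.0] + [2.0] * 17,
})

capped_close = cap_outliers_zscore(prices["Close"])
missing = count_missing(prices)
```

Capping keeps every row in place, which matters for a time series where dropping a trading day would break the date alignment with the auxiliary data.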
Task 4.1: Trend, seasonality and random noise
Task 4.2: Proximity Measures
Task 4.3: Data Relationships - Spearman correlations and scatter plots
Task 4.4: Hypothesis Testing
Task 4.4.1 AAPL Close price rise/fall vs AAPL Volume rise/fall --> Dependent
Task 4.4.2 AAPL Close price rise/fall vs with/without pandemic --> Independent
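The source does not say which test backs the dependent/independent conclusions in Task 4.4. Assuming a Pearson chi-square test of independence on rise/fall labels, a minimal sketch is shown below. The helpers `rise_fall` and `chi_square_independence` are hypothetical names, no continuity correction is applied, and the labels are toy data.

```python
import numpy as np
import pandas as pd


def rise_fall(series: pd.Series) -> pd.Series:
    """True where the value rose relative to the previous observation."""
    return series.diff().dropna() > 0


def chi_square_independence(x: pd.Series, y: pd.Series):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    observed = pd.crosstab(x, y).to_numpy(dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    # Expected counts under independence: row total * column total / grand total.
    expected = row_totals @ col_totals / observed.sum()
    statistic = float(((observed - expected) ** 2 / expected).sum())
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return statistic, dof


# Toy labels: perfectly dependent pair vs. perfectly balanced pair.
x_dep = pd.Series([True] * 10 + [False] * 10)
y_dep = pd.Series([True] * 10 + [False] * 10)
stat_dep, dof_dep = chi_square_independence(x_dep, y_dep)

x_ind = pd.Series([True, True, False, False] * 5)
y_ind = pd.Series([True, False, True, False] * 5)
stat_ind, dof_ind = chi_square_independence(x_ind, y_ind)

# For a 2x2 table (1 degree of freedom), the 5% critical value is 3.841.
CRITICAL_5PCT_DOF1 = 3.841
dependent = stat_dep > CRITICAL_5PCT_DOF1
independent = stat_ind <= CRITICAL_5PCT_DOF1
```

In practice `rise_fall` would be applied to the AAPL Close and Volume columns first, and the resulting boolean series passed to the test.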
5.1 Train a 2-layer LSTM on the dataset.
A: using only time series of stock prices
B: using the time series of stock prices and the auxiliary data
5.2 Predict AAPL closing prices for one month of trading days
5.3 Evaluation Metrics Implementation
5.3.1 STATISTICAL PERFORMANCE: MSE, MAE, R-squared
5.3.2 JOINT PLOTS
5.3.3 RESIDUAL DISTRIBUTION PLOTS
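The three statistics listed in 5.3.1 follow directly from the residuals. The sketch below computes them with NumPy on toy values; `regression_metrics` is a hypothetical helper name, and the project itself may well use the equivalent `sklearn.metrics` functions (`mean_squared_error`, `mean_absolute_error`, `r2_score`) instead.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE and R-squared for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = float(np.mean(residuals ** 2))
    mae = float(np.mean(np.abs(residuals)))
    ss_res = float(np.sum(residuals ** 2))                  # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "r2": r2}


# Toy values, not model output.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 2.0]
metrics = regression_metrics(actual, predicted)
# mse = 1.0, mae = 0.5, r2 is approximately 0.2
```

The same `residuals` array is what a residual distribution plot (5.3.3) would visualise, so computing it once and reusing it keeps the metrics and the plots consistent.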
Stores the auxiliary datasets: Nasdaq index, Dow Jones index, Google Trends, USD/CNY
Stores the AAPL stock dataset
Stores the plots produced by the code (saved without calling plot.show())