Machine Learning Class Guide

To customize and override the default settings for your machine learning, you subclass the package's machine learning interface. In this guide we will create a class that uses Random Forest Regression (RFR) as the model.

To start we want to import the parent class.

from ml_backtest.interfaces import MachineLearningInterface

Building the class

Next, we build the class one method at a time. First up is the constructor, which makes the data available to you so you can perform your feature engineering on it.

from ml_backtest.interfaces import MachineLearningInterface
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
import pandas as pd
import talib
from typing import Optional, List


class RandomForestRegressorTrainer(MachineLearningInterface):

    def __init__(self, data: pd.DataFrame, rows: Optional[int] = None,
                 columns: Optional[List[str]] = None):
        super().__init__(data, rows, columns)
        # Additional initialization specific to RandomForestRegressorTrainer can go here.
        # For example, setting up model-specific parameters or preprocessing steps.
        # self.model_specific_param = some_value

        # If there's additional setup required for the RandomForest model,
        # that doesn't fit the pattern provided by MachineLearningInterface,
        # it can be performed here.

After defining the constructor we can get to the fun part: creating extra features so we can get the best possible predictions. With the extra features in place, all that is left is the train method and the predict method.

from ml_backtest.interfaces import MachineLearningInterface
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
import pandas as pd
import talib
from typing import Optional, List


class RandomForestRegressorTrainer(MachineLearningInterface):

    def __init__(self, data: pd.DataFrame, rows: Optional[int] = None,
                 columns: Optional[List[str]] = None):
        super().__init__(data, rows, columns)
        # Additional initialization specific to RandomForestRegressorTrainer can go here.
        # For example, setting up model-specific parameters or preprocessing steps.
        # self.model_specific_param = some_value

        # If there's additional setup required for the RandomForest model,
        # that doesn't fit the pattern provided by MachineLearningInterface,
        # it can be performed here.

    def feature_engineer(self):
        # Add any additional feature columns you want to use in training here.
        # Make sure you write them to 'self.data', since that is the DataFrame
        # used in training.
        self.data['SMA'] = self.data['close'].rolling(window=10).mean()
        self.data['EMA'] = talib.EMA(self.data['close'], timeperiod=10)
        self.data['RSI'] = talib.RSI(self.data['close'], timeperiod=14)
        self.data['MACD'], self.data['MACD_signal'], self.data['MACD_hist'] = talib.MACD(self.data['close'],
                                                                                         fastperiod=12, slowperiod=26,
                                                                                         signalperiod=9)
        self.data['SMA_Diff'] = self.data['SMA'].diff()
        self.data['EMA_Diff'] = self.data['EMA'].diff()

    def train(self, x_train, y_train, x_test, y_test):
        # here is where you define the model you want for training
        self.model = RandomForestRegressor(n_estimators=100, random_state=42)
        self.model.fit(x_train, y_train)

    def predict(self, x_train, y_train, x_test, y_test):
        # here is where the predictions will appear
        # you can get different values for the predictions
        # like calculating the residuals
        self.predictions = self.model.predict(x_test)

        mse = mean_squared_error(y_test, self.predictions)
        print(f"Mean Squared Error: {mse}")
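
The comments in predict mention calculating residuals. As a minimal sketch of what that could look like, the snippet below uses made-up arrays standing in for y_test and self.predictions (the values are illustrative assumptions, not output from the package) and shows that the mean squared error is simply the mean of the squared residuals.

```python
import numpy as np

# Made-up values standing in for y_test and self.predictions.
y_test = np.array([3.0, 5.0, 2.5, 7.0])
predictions = np.array([2.5, 5.0, 4.0, 8.0])

# Residual = actual value minus predicted value, one per test row.
residuals = y_test - predictions

# Mean squared error is the mean of the squared residuals.
mse = np.mean(residuals ** 2)

# Mean absolute error is less sensitive to a few large misses.
mae = np.mean(np.abs(residuals))

print(f"Residuals: {residuals}")
print(f"Mean Squared Error: {mse}")
print(f"Mean Absolute Error: {mae}")
```

Storing the residuals on the instance, next to self.predictions, would be one way to keep them around for later inspection.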

That is it! The goal of this child class is to abstract away the tedious parts of stock market machine learning. You do not have to worry about how to get the data; you just need to define what you want.
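
One thing to keep in mind when adding rolling features: they are undefined for the first few rows. The sketch below reproduces only the pandas-based SMA and SMA_Diff columns on synthetic prices (the data and the dropna step are assumptions for illustration; this guide does not say whether the parent class already handles missing values) to show how many warm-up rows end up as NaN.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices, for illustration only.
rng = np.random.default_rng(seed=0)
data = pd.DataFrame({"close": 100 + rng.normal(0, 1, size=50).cumsum()})

# The same pandas-only features used in feature_engineer.
data["SMA"] = data["close"].rolling(window=10).mean()
data["SMA_Diff"] = data["SMA"].diff()

# A 10-period rolling mean is undefined for the first 9 rows,
# and diff() loses one more row on top of that.
nan_counts = data.isna().sum()
print(nan_counts)

# One option (an assumption, not necessarily what the package does):
# drop the warm-up rows before training.
clean = data.dropna().reset_index(drop=True)
print(len(data), "->", len(clean))
```

Longer indicator windows, such as the 26-period slow average in MACD, leave correspondingly more warm-up rows, so it is worth checking the NaN counts after adding any new feature.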

Changing Target

Right now the target can only be adapted through a numeric value. If you do not want the default magnitude value for the target (which is the highest high of every trade), you can define your own target by subclassing the following parent class.

from ml_backtest.interfaces import TargetInterface

After that, just write a simple method for your target class.

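class BasicTarget(TargetInterface):
    def target_engineer(self):
        # Remember the smallest target before any values are changed.
        min_value = self.trades['target'].min()
        # Cap every target at 7.
        self.trades['target'] = self.trades['target'].apply(lambda x: min(x, 7))
        # Keep every target at or above the original minimum.
        self.trades['target'] = self.trades['target'].apply(lambda x: max(x, min_value))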

For every trade there is a target column that you can easily manipulate. That is it! All of this code can be found and reviewed inside the package's models folder, under the name rfr.py.
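
To see what the capping in target_engineer does to actual numbers, here is a small standalone sketch. The trades table is hypothetical, and Series.clip is shown only as a possible vectorized alternative to the two apply calls, not as what rfr.py uses.

```python
import pandas as pd

# Hypothetical trades table; only the 'target' column matters here.
trades = pd.DataFrame({"target": [1.5, 3.0, 7.0, 9.2, 12.0]})

min_value = trades["target"].min()

# Approach from BasicTarget.target_engineer: two element-wise passes.
via_apply = (
    trades["target"]
    .apply(lambda x: min(x, 7))
    .apply(lambda x: max(x, min_value))
)

# Vectorized alternative: clip to the range [min_value, 7] in one call.
via_clip = trades["target"].clip(lower=min_value, upper=7)

print(via_apply.tolist())
print(via_clip.tolist())
```

Both versions leave targets at or below 7 untouched and pull anything larger down to 7, so a few outsized trades no longer dominate the target.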