Reinforced Learning for Crypto Trading

{"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.10.12","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"},"kaggle":{"accelerator":"none","dataSources":[],"dockerImageVersionId":30822,"isInternetEnabled":true,"language":"python","sourceType":"notebook","isGpuEnabled":false}},"nbformat_minor":4,"nbformat":4,"cells":[{"source":"<a href=\"https://www.kaggle.com/code/dascient/rlbot?scriptVersionId=216606486\" target=\"_blank\"><img align=\"left\" alt=\"Kaggle\" title=\"Open in Kaggle\" src=\"https://kaggle.com/static/images/open-in-kaggle.svg\"></a>","metadata":{},"cell_type":"markdown"},{"cell_type":"markdown","source":"# [@donutz.ai](https://donutz.ai)\n\nCreating a Python script that integrates reinforcement learning, machine learning ensembles, and cryptocurrency trading using SHIB on a minute-by-minute basis can be complex, especially if the goal is to build a profitable and interactive system. Here's a basic outline and script that:\n- Applies Reinforcement Learning (RL) for trading (based on our [SuperTrend Trading Bot Project - 2020](https://github.com/DaScient/SuperTrendTradingBot/)).\n- Incorporates Machine Learning Ensembles for improved trading decisions.\n- Demonstrates how this can be gamified, adds user incentives, and provides user tier incentives for subscribing to higher-level analysis.\n\nThis script is designed to work with Kaggle GPU environments. It assumes you are already familiar with using Binance API, data preprocessing, and RL setup.","metadata":{"_uuid":"8f2839f25d086af736a60e9eeb907d3b93b6e0e5","_cell_guid":"b1076dfc-b9ad-4769-8c92-a6c4dae69d19"}},{"cell_type":"code","source":"# install Required Packages\nfrom IPython.display import clear_output\n!pip install numpy pandas tensorflow keras scikit-learn gym plotly\nclear_output()","metadata":{"trusted":true,"execution":{"iopub.status.busy":"2025-01-06T05:44:24.314063Z","iopub.execute_input":"2025-01-06T05:44:24.314346Z","iopub.status.idle":"2025-01-06T05:44:28.823634Z","shell.execute_reply.started":"2025-01-06T05:44:24.314313Z","shell.execute_reply":"2025-01-06T05:44:28.822671Z"}},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"# script","metadata":{}},{"cell_type":"code","source":"import numpy as np\nimport pandas as pd\nimport random\nimport gym\nfrom sklearn.ensemble import RandomForestClassifier\nimport matplotlib.pyplot as plt\nimport plotly.express as px\nimport tensorflow as tf\nfrom tensorflow.keras.models import Sequential\nfrom tensorflow.keras.layers import Dense\n\n# Secret message for the user (δφ = Delta Phi)\ndef secret_message():\n    print(\"Welcome to the delta φ trading bot! Keep learning, stay profitable!\")\n    print(\"If you're subscribed to a higher tier, the analysis is deeper, and profits greater.\")\n    print(\"Unlock advanced strategies and become a master trader!\")\n\n# Define the environment for Reinforcement Learning\nclass TradingEnvironment(gym.Env):\n    def __init__(self, df):\n        super(TradingEnvironment, self).__init__()\n        self.df = df\n        self.current_step = 0\n        self.balance = 10000  # Starting balance in USD\n        self.shares_held = 0\n        self.net_worth = self.balance\n        self.action_space = gym.spaces.Discrete(3)  # 3 actions: 0 = Buy, 1 = Sell, 2 = Hold\n        self.observation_space = gym.spaces.Box(low=0, high=1, shape=(5,), dtype=np.float32)  # Adjusted for 5 features\n\n    def reset(self):\n        self.current_step = 0\n        self.balance = 10000\n        self.shares_held = 0\n        self.net_worth = self.balance\n        # Return only the relevant state features, excluding timestamp/epoch_time\n        return self.df.iloc[self.current_step][['open', 'high', 'low', 'close', 'volume']].values\n\n    def step(self, action):\n        self.current_step += 1\n        if self.current_step >= len(self.df) - 1:\n            done = True\n        else:\n            done = False\n\n        prev_balance = self.balance\n        prev_net_worth = self.net_worth\n\n        current_price = self.df.iloc[self.current_step]['close']\n        reward = 0\n\n        if action == 0:  # Buy\n            if self.balance >= current_price:\n                self.shares_held += 1\n                self.balance -= current_price\n        elif action == 1:  # Sell\n            if self.shares_held > 0:\n                self.shares_held -= 1\n                self.balance += current_price\n        elif action == 2:  # Hold\n            pass\n\n        self.net_worth = self.balance + self.shares_held * current_price\n        reward = self.net_worth - prev_net_worth\n\n        return self.df.iloc[self.current_step][['open', 'high', 'low', 'close', 'volume']].values, reward, done, {}\n\n# Load and preprocess SHIB data from the provided CSV link\ndef load_data():\n    url = 'https://www.cryptodatadownload.com/cdd/Binance_SHIBUSDT_1h.csv'\n    df = pd.read_csv(url, header=1)\n    \n    # Convert Timestamp to epoch time (seconds since 1970)\n    df['timestamp'] = pd.to_datetime(df['Date'])\n    df['epoch_time'] = df['timestamp'].astype(np.int64) // 10**9  # Convert to seconds\n    \n    # Use 'epoch_time' instead of 'timestamp' for the model\n    df = df[['epoch_time', 'Open', 'High', 'Low', 'Close', 'Volume SHIB']].copy()\n    df.rename(columns={'Open': 'open', 'High': 'high', 'Low': 'low', 'Close': 'close', 'Volume SHIB': 'volume'}, inplace=True)\n    \n    return df\n\n# Training Model: Reinforcement Learning (Deep Q-Learning)\nclass DQNAgent:\n    def __init__(self, state_size, action_size):\n        self.state_size = state_size\n        self.action_size = action_size\n        self.memory = []\n        self.gamma = 0.95  # Discount factor\n        self.epsilon = 1.0  # Exploration rate\n        self.epsilon_min = 0.01\n        self.epsilon_decay = 0.995\n        self.model = self.build_model()\n\n    def build_model(self):\n        model = Sequential()\n        model.add(Dense(24, input_dim=self.state_size, activation='relu'))\n        model.add(Dense(24, activation='relu'))\n        model.add(Dense(self.action_size, activation='linear'))\n        model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))\n        return model\n\n    def act(self, state):\n        if np.random.rand() <= self.epsilon:\n            return random.randrange(self.action_size)\n        act_values = self.model.predict(state)\n        return np.argmax(act_values[0])\n\n    def remember(self, state, action, reward, next_state, done):\n        self.memory.append((state, action, reward, next_state, done))\n\n    def replay(self, batch_size):\n        if len(self.memory) < batch_size:\n            return\n        batch = random.sample(self.memory, batch_size)\n        for state, action, reward, next_state, done in batch:\n            target = reward\n            if not done:\n                target = reward + self.gamma * np.amax(self.model.predict(next_state)[0])\n            target_f = self.model.predict(state)\n            target_f[0][action] = target\n            self.model.fit(state, target_f, epochs=1, verbose=0)\n        if self.epsilon > self.epsilon_min:\n            self.epsilon *= self.epsilon_decay\n\n# Main script for training the agent\ndef train_trading_bot():\n    df = load_data()\n    env = TradingEnvironment(df)\n    agent = DQNAgent(state_size=5, action_size=3)  # We now have 5 state variables (after excluding epoch_time)\n    episodes = 1000\n    batch_size = 32\n\n    for e in range(episodes):\n        state = env.reset()\n        state = np.reshape(state, [1, 5])  # Adjusted shape after removing timestamp\n        done = False\n        while not done:\n            action = agent.act(state)\n            next_state, reward, done, _ = env.step(action)\n            next_state = np.reshape(next_state, [1, 5])  # Adjusted shape\n            agent.remember(state, action, reward, next_state, done)\n            state = next_state\n        agent.replay(batch_size)\n\n        if e % 100 == 0:\n            print(f\"Episode {e}/{episodes} completed\")\n\n    # Secret message\n    secret_message()\n\n# Machine Learning Ensemble: Random Forest for predictions (optional enhancement)\ndef ensemble_model(df):\n    features = ['open', 'high', 'low', 'volume']  # Add more features as needed\n    X = df[features]\n    y = df['close']  # Target: Predicting the closing price\n\n    model = RandomForestClassifier(n_estimators=100, random_state=42)\n    model.fit(X, y)\n    \n    # Prediction (for testing purposes)\n    predictions = model.predict(X)\n    return predictions\n\n# Plotting and user incentive: (spunky, fun, interactive chart)\ndef plot_results(df):\n    fig = px.line(df, x='timestamp', y=['close'], title=\"SHIB Price Analysis\")\n    fig.update_layout(template=\"plotly_dark\", title=\"SHIB Price Movement\")\n    fig.show()\n\n# Run the bot (for Kaggle, this will work with GPU enabled)\ntrain_trading_bot()","metadata":{"trusted":true,"execution":{"iopub.status.busy":"2025-01-06T05:44:28.824843Z","iopub.execute_input":"2025-01-06T05:44:28.825111Z"}},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Setup Instructions:\n- Install Dependencies: Install the necessary libraries:\npip install numpy pandas tensorflow keras scikit-learn gym plotly matplotlib\n- Data Fetching: The script uses pandas to load the SHIB data from the CSV file available via the provided URL (https://www.cryptodatadownload.com/cdd/Binance_SHIBUSDT_1h.csv). Ensure you have internet access for the data fetching.\n- Run the Script: This can be run in any Python environment (Jupyter Notebook, Google Colab, local Python setup). The script will train the agent and display the results of the trading bot as it learns. The plot will be interactive, and a secret message will be printed periodically.\n- Secret Messages:\nThe script prints fun, gamified secret messages, e.g., “Welcome to the delta φ trading bot! Keep learning, stay profitable!” and \"Unlock advanced strategies and become a master trader.\" These are designed to motivate users and encourage engagement.","metadata":{}},{"cell_type":"code","source":"# en fin","metadata":{"trusted":true},"outputs":[],"execution_count":null}]}