Merge pull request #69 from stratosphereips/cleanup-2-players

Cleanup 2 players

ondrej-lukas authored Oct 24, 2024
2 parents a34a50d + c9548b9 commit e8f8d17
Showing 60 changed files with 711 additions and 1,780 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -148,4 +148,7 @@ replay_buffer.csv
agents/*/saved_models/*
mlruns/*
agents/*/mlruns/*
agents/mlruns*
agents/mlruns*

agents/*/*/mlruns/
agents/*/*/logs
67 changes: 43 additions & 24 deletions README.md
@@ -22,35 +22,54 @@ There are 4 important methods to be used for interaction with the environment:
4. `terminate_connection()`: Should be used ONCE at the end of the interaction to properly disconnect the agent from the game server.

Examples of agents extending the BaseAgent can be found in:
- [RandomAgent](./agents/random/random_agent.py)
- [InteractiveAgent](./agents/interactive_tui/interactive_tui.py)
- [Q-learningAgent](./agents/q_learning/q_agent.py) (Documentation [here](./docs/q-learning.md))
- [RandomAgent](./agents/attackers/random/random_agent.py)
- [InteractiveAgent](./agents/attackers/interactive_tui/interactive_tui.py)
- [Q-learningAgent](./agents/attackers/q_learning/q_agent.py) (Documentation [here](./docs/q-learning.md))
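
Putting the interaction methods together, the sketch below shows how the pieces of a minimal agent could fit. Only `terminate_connection()` is quoted above; `register()`, `make_step()`, the `Observation` fields, and the `BaseAgent` constructor arguments are assumptions drawn from the example agents, not guarantees of this README.

```python
# Minimal random agent on top of BaseAgent (illustrative sketch).
# ASSUMED, not confirmed by the excerpt above: register(), make_step(),
# the Observation fields (.state, .end), the constructor signature, and
# that the repository root is on sys.path so these imports resolve.
import random

from agents.base_agent import BaseAgent
from agents.agent_utils import generate_valid_actions

class MinimalRandomAgent(BaseAgent):
    def play_episode(self) -> None:
        observation = self.register()  # 1. join the game server
        while not observation.end:
            # Pick any action that is valid in the current GameState.
            action = random.choice(generate_valid_actions(observation.state))
            observation = self.make_step(action)  # 2./3. act and observe
        self.terminate_connection()  # 4. disconnect cleanly (quoted above)

if __name__ == "__main__":
    MinimalRandomAgent("127.0.0.1", 9000, role="Attacker").play_episode()
```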

## Agents' compatibility with the environment

| Agent | NetSecGame branch | Tag|
| ----- |-----| ---- |
|[BaseAgent](./agents/base_agent.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`|
|[RandomAgent](./agents/random/random_agent.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`|
|[InteractiveAgent](./agents/interactive_tui/interactive_tui.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`|
|[Q-learning](./agents/q_learning/q_agent.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`|
|[LLM](./agents/llm/llm_agent.py)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | [release_out_of_cage](https://github.com/stratosphereips/NetSecGame/tree/release_out_of_cage)|
|[LLM_QA](./agents/llm_qa/llm_agent_qa.py)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | [release_out_of_cage](https://github.com/stratosphereips/NetSecGame/tree/release_out_of_cage)|
|[GNN_REINFORCE](./agents/gnn_reinforce/)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | [release_out_of_cage](https://github.com/stratosphereips/NetSecGame/tree/release_out_of_cage)|

## Agent types
There are three roles an agent can play in NetSecEnv:
1. Attacker
2. Defender
3. Benign

Agents of each type are stored in the corresponding directory within this repository:
```
├── agents
│   ├── attackers
│   │   ├── concepts_q_learning
│   │   ├── double_q_learning
│   │   ├── gnn_reinforce
│   │   ├── interactive_tui
│   │   ├── ...
│   ├── defenders
│   │   ├── random
│   │   ├── probabilistic
│   ├── benign
│   │   ├── benign_random
```
### Agent utils
Utility functions in [agent_utils.py](./agents/agent_utils.py) can be used by any agent to evaluate a `GameState`, generate the set of valid `Actions` in a `GameState`, etc.
Additionally, there are several files with utility functions that can be used by any agent:
- [agent_utils.py](./agents/agent_utils.py): formatting of `GameState` and generation of valid actions
- [graph_agent_utils.py](./agents/graph_agent_utils.py): `GameState` -> graph conversion
- [llm_utils.py](./agents/llm_utils.py): utility functions for LLM-based agents
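
For illustration, a hedged sketch of how these helpers might be called (both function names appear in the `agent_utils.py` diff further down; the flat import assumes the `agents/` directory is on `sys.path`):

```python
# Sketch: pick a random valid action and build a hashable key for a Q-table.
# The import style is an assumption about sys.path; see agents/agent_utils.py.
import random

from agent_utils import generate_valid_actions, state_as_ordered_string

def pick_random_action(state):
    """Return one uniformly random valid Action for the given GameState."""
    return random.choice(generate_valid_actions(state))

def q_table_key(state):
    """Stable string representation, usable as a dict key in tabular agents."""
    return state_as_ordered_string(state)
```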

## About us
This code was developed at the [Stratosphere Laboratory at the Czech Technical University in Prague](https://www.stratosphereips.org/).

## Agents' compatibility with the environment

| Agent | NetSecGame branch | Tag | Status |
| ----- | ----- | ---- | ---- |
|[BaseAgent](./agents/base_agent.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`||
|[Random Attacker](./agents/attackers/random/random_agent.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`||
|[InteractiveAgent](./agents/attackers/interactive_tui/interactive_tui.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`||
|[Q-learning](./agents/attackers/q_learning/q_agent.py) | [main](https://github.com/stratosphereips/NetSecGame/tree/main) | `HEAD`||
|[LLM](./agents/attackers/llm/llm_agent.py)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | [release_out_of_cage](https://github.com/stratosphereips/NetSecGame/tree/release_out_of_cage)||
|[LLM_QA](./agents/attackers/llm_qa/llm_agent_qa.py)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | [release_out_of_cage](https://github.com/stratosphereips/NetSecGame/tree/release_out_of_cage)||
|[GNN_REINFORCE](./agents/attackers/gnn_reinforce/)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | [release_out_of_cage](https://github.com/stratosphereips/NetSecGame/tree/release_out_of_cage)||
|[Random Defender](./agents/defenders/random/random_agent.py)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | | 👷🏼‍♀️ |
|[Probabilistic Defender](./agents/defenders/probabilistic/probabilistic_agent.py)| [main](https://github.com/stratosphereips/NetSecGame/tree/main) | | 👷🏼‍♀️ |

## How to visualize the results of mlflow
### Locally

1. `export MLFLOW_TRACKING_URI=sqlite:///mlruns.db`
2. Then run the agent code.

From the folder where you run the Python agent:
```bash
mlflow ui --port 8080 --backend-store-uri sqlite:///mlruns.db
```
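
As a hedged sketch, an agent run against the sqlite backend configured above might log like this (the experiment name is taken from the q_agent diff below; the parameter and metric names are illustrative):

```python
# Sketch of mlflow logging against the sqlite backend set up above.
import mlflow

mlflow.set_tracking_uri("sqlite:///mlruns.db")
mlflow.set_experiment("Training and Eval of Conceptual Q-learning Agent")

with mlflow.start_run():
    mlflow.log_param("gamma", 0.9)   # discount factor (default in the diff below)
    mlflow.log_param("alpha", 0.1)   # learning rate (default in the diff below)
    mlflow.log_metric("win_rate", 0.85, step=100)  # illustrative value
```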
## About us
This code was developed at the [Stratosphere Laboratory at the Czech Technical University in Prague](https://www.stratosphereips.org/) as part of the AIDojo Project.
14 changes: 11 additions & 3 deletions agents/agent_utils.py
@@ -7,10 +7,10 @@
"""
import random
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname( os.path.abspath(__file__) )))
from os import path
sys.path.append(path.dirname(path.dirname(path.dirname( path.abspath(__file__) ))))
#with the path fixed, we can import now
from env.game_components import Action, ActionType, GameState, Observation, Data, IP, Network
from env.game_components import Action, ActionType, GameState, Observation, IP, Network
import ipaddress

def generate_valid_actions_concepts(state: GameState)->list:
@@ -75,6 +75,14 @@ def generate_valid_actions(state: GameState)->list:
for trg_host in state.controlled_hosts:
if trg_host != src_host:
valid_actions.add(Action(ActionType.ExfiltrateData, params={"target_host": trg_host, "source_host": src_host, "data": data}))

# BlockIP
for src_host in state.controlled_hosts:
for target_host in state.controlled_hosts:
for blocked_ip in state.known_hosts:
valid_actions.add(Action(ActionType.BlockIP, {"target_host":target_host, "source_host":src_host, "blocked_host":blocked_ip}))


return list(valid_actions)

def state_as_ordered_string(state:GameState)->str:
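
The `BlockIP` loop added above enumerates one action per combination of source controlled host, target controlled host, and known host to block. For illustration, a single such action built by hand (the addresses are hypothetical; the three parameter names match the `params` dict in the diff):

```python
# One hand-built BlockIP action (sketch; IP addresses are made up).
from env.game_components import Action, ActionType, IP

block_action = Action(
    ActionType.BlockIP,
    params={
        "target_host": IP("192.168.2.2"),    # host where the block is applied
        "source_host": IP("192.168.2.2"),    # controlled host issuing the action
        "blocked_host": IP("192.168.1.99"),  # host to be blocked
    },
)
```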
@@ -2,15 +2,15 @@
# Arti
# Sebastian Garcia. [email protected]
import sys
import os
from os import path, makedirs
import numpy as np
import random
import pickle
import argparse
import logging
# This is used so the agent can see the environment and game component
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__) ) ) )))
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname(path.dirname(path.abspath(__file__) ) ) ))))
sys.path.append(path.dirname(path.dirname(path.dirname(path.abspath(__file__) ))))

# This is used so the agent can see the environment and game component
# with the path fixed, we can import now
@@ -174,7 +174,7 @@ def play_game(self, observation_ip, episode_num, testing=False):
parser.add_argument("--epsilon_max_episodes", help="Max episodes for epsilon to reach maximum decay", default=5000, type=int)
parser.add_argument("--gamma", help="Sets gamma discount for Q-learing during training.", default=0.9, type=float)
parser.add_argument("--alpha", help="Sets alpha for learning rate during training.", default=0.1, type=float)
parser.add_argument("--logdir", help="Folder to store logs", default=os.path.join(os.path.dirname(os.path.abspath(__file__)), "logs"))
parser.add_argument("--logdir", help="Folder to store logs", default=path.join(path.dirname(path.abspath(__file__)), "logs"))
parser.add_argument("--previous_model", help="Load the previous model. If training, it will start from here. If testing, will use to test.", type=str)
parser.add_argument("--testing", help="Test the agent. No train.", default=False, type=bool)
parser.add_argument("--experiment_id", help="Id of the experiment to record into Mlflow.", default='', type=str)
@@ -184,9 +184,9 @@ def play_game(self, observation_ip, episode_num, testing=False):
parser.add_argument("--early_stop_threshold", help="Threshold for win rate for testing. If the value goes over this threshold, the training is stopped. Defaults to 95 (mean 95% perc)", required=False, default=95, type=float)
args = parser.parse_args()

if not os.path.exists(args.logdir):
os.makedirs(args.logdir)
logging.basicConfig(filename=os.path.join(args.logdir, "q_agent.log"), filemode='w', format='%(asctime)s %(name)s %(levelname)s %(message)s', datefmt='%H:%M:%S',level=logging.ERROR)
if not path.exists(args.logdir):
makedirs(args.logdir)
logging.basicConfig(filename=path.join(args.logdir, "q_agent.log"), filemode='w', format='%(asctime)s %(name)s %(levelname)s %(message)s', datefmt='%H:%M:%S',level=logging.ERROR)

# Create agent
agent = QAgent(args.host, args.port, alpha=args.alpha, gamma=args.gamma, epsilon_start=args.epsilon_start, epsilon_end=args.epsilon_end, epsilon_max_episodes=args.epsilon_max_episodes)
@@ -195,7 +195,7 @@ def play_game(self, observation_ip, episode_num, testing=False):
actions_logger = logging.getLogger('QAgentActions')
actions_logger.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
actions_handler = logging.FileHandler(os.path.join(args.logdir, "q_agent_actions.log"), mode="w")
actions_handler = logging.FileHandler(path.join(args.logdir, "q_agent_actions.log"), mode="w")
actions_handler.setLevel(logging.INFO)
actions_handler.setFormatter(formatter)
actions_logger.addHandler(actions_handler)
@@ -217,13 +217,13 @@ def play_game(self, observation_ip, episode_num, testing=False):

if not args.testing:
# Mlflow experiment name
experiment_name = f"Training and Eval of Q-learning Agent"
experiment_name = "Training and Eval of Conceptual Q-learning Agent"
mlflow.set_experiment(experiment_name)
elif args.testing:
# Evaluate the agent performance

# Mlflow experiment name
experiment_name = f"Testing of Q-learning Agent"
experiment_name = "Testing of ConceptualQ-learning Agent"
mlflow.set_experiment(experiment_name)


@@ -2,7 +2,7 @@
import sys
from os import path
# This is used so the agent can see the environment and game components
sys.path.append(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) ))))
# import env.game_components as components

import numpy as np
@@ -14,7 +14,7 @@
import logging
from torch.utils.tensorboard import SummaryWriter
import time
from env.network_security_game import NetworkSecurityEnvironment
from env.worlds.network_security_game import NetworkSecurityEnvironment
from env.game_components import Action, Observation, GameState

class DoubleQAgent:
@@ -10,12 +10,15 @@
from random import choice, choices
from tensorflow_gnn.models.gcn import gcn_conv


# TODO ADAPT TO USE BASE AGENT

# This is used so the agent can see the environment and game components
from os import path
sys.path.append(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) ))))

#with the path fixed, we can import now
from env.network_security_game import NetworkSecurityEnvironment
from env.worlds.network_security_game import NetworkSecurityEnvironment

os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
File renamed without changes.
File renamed without changes.
10 changes: 10 additions & 0 deletions agents/attackers/interactive_tui/README.md
@@ -0,0 +1,10 @@
# Interactive TUI agent

This is the main agent intended to be played by humans. It can be played in several modes.

1. Human, without autocompletion of fields nor assistance.
2. Human, with autocompletion of fields, but without assistance.
3. Human, with autocompletion of fields and LLM assistance.

# Display
Because of how the LLM prompt is built, the **known hosts** field is _not_ pre-filled with the known hosts sent in the initial observation of the environment. This avoids errors where the LLM believes those hosts are also controlled.
@@ -11,14 +11,14 @@
from tenacity import retry, stop_after_attempt

sys.path.append(
path.dirname(path.dirname(path.dirname(path.dirname(path.abspath(__file__)))))
path.dirname(path.dirname(path.dirname(path.dirname(path.dirname(path.abspath(__file__))))))
)
from env.game_components import (
ActionType,
Observation,
)

sys.path.append(path.dirname(path.dirname(path.abspath(__file__))))
sys.path.append(path.dirname(path.dirname(path.dirname(path.abspath(__file__)))))

from llm_utils import (
create_action_from_response,
@@ -20,7 +20,7 @@

# This is used so the agent can see the environment and game components
sys.path.append(
path.dirname(path.dirname(path.dirname(path.dirname(path.abspath(__file__)))))
path.dirname(path.dirname(path.dirname(path.dirname(path.dirname(path.abspath(__file__))))))
)
from env.game_components import Network, IP
from env.game_components import ActionType, Action, GameState, Observation
@@ -352,7 +352,7 @@ def handle_inputs(self, event: Input.Changed) -> None:
"""
Handles the manual inputs that are typed by the user.
"""
log = self.query_one("RichLog")
# log = self.query_one("RichLog")
# log.write(f"Input received: {event.value}")
if event._sender.id == "src_host":
if event.validation_result.is_valid:
@@ -709,4 +709,4 @@ def _clear_state(self) -> None:
args.memory_len,
args.max_repetitions,
)
app.run()
app.run()
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -5,10 +5,10 @@
import sys
from os import path

sys.path.append(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) ))))


from env.network_security_game import NetworkSecurityEnvironment
from env.worlds.network_security_game import NetworkSecurityEnvironment
from env.game_components import ActionType, Action, IP, Data, Network, Service

import openai
@@ -4,10 +4,10 @@
"""
import sys
from os import path
sys.path.append(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) ))))


from env.network_security_game import NetworkSecurityEnvironment
from env.worlds.network_security_game import NetworkSecurityEnvironment
from env.game_components import ActionType, Action, IP, Data, Network, Service

import openai
4 changes: 2 additions & 2 deletions agents/llm/llm_agent.py → agents/attackers/llm/llm_agent.py
@@ -4,10 +4,10 @@
"""
import sys
from os import path
sys.path.append(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) ))))


from env.network_security_game import NetworkSecurityEnvironment
from env.worlds.network_security_game import NetworkSecurityEnvironment
from env.game_components import ActionType, Action, IP, Data, Network, Service

import openai
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -9,9 +9,9 @@

# This is used so the agent can see the environment and game components
from os import path
sys.path.append(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) ))))

from env.network_security_game import NetworkSecurityEnvironment
from env.worlds.network_security_game import NetworkSecurityEnvironment
from env.game_components import Action, ActionType

import numpy as np
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -11,9 +11,9 @@

# This is used so the agent can see the environment and game components
from os import path
sys.path.append(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) )))
sys.path.append(path.dirname(path.dirname(path.dirname(path.dirname( path.dirname( path.abspath(__file__) ) ) ))))

from env.network_security_game import NetworkSecurityEnvironment
from env.worlds.network_security_game import NetworkSecurityEnvironment
from env.game_components import Action, ActionType
from sentence_transformers import SentenceTransformer

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,7 +1,6 @@
import argparse
from math import inf
import pickle
from colorama import Fore, Style, init
from colorama import Fore, init

#q_values = {}
#states = {}