add basic api sources
superstes committed Nov 2, 2024
1 parent fdf172e commit e21f1f0
Showing 20 changed files with 337 additions and 15 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -10,6 +10,8 @@ By flagging clients originating from these sources you can achieve a nice securi

The databases created from the gathered data will be and stay open-source!

If you (*just*) want to keep track of abusers internally, you can also host your own dedicated instance of [this app](https://github.com/O-X-L/risk-db/blob/latest/src).

<a href="https://github.com/O-X-L/risk-db/blob/latest/visualization">
<img src="https://raw.githubusercontent.com/O-X-L/risk-db/refs/heads/latest/visualization/world_map_example.webp" alt="World Map Example" width="800"/>
<img src="https://raw.githubusercontent.com/O-X-L/risk-db/refs/heads/latest/visualization/asn_chart_example.webp" alt="ASN Chart Example" width="800"/>
@@ -48,6 +50,7 @@ You may also want to check out these projects: (*not open/free data*)
* [CrowdSec](https://www.crowdsec.net/)
* [AbuseIP-DB](https://www.abuseipdb.com/)
* [IPInfo Privacy-DB](https://ipinfo.io/products/proxy-vpn-detection-api)
* [nitefood/asn CLI-Tools](https://github.com/nitefood/asn)

----

16 changes: 9 additions & 7 deletions reporting/Graylog.md
@@ -4,6 +4,8 @@ We can create a Graylog Alert Notification to Report Abusers to this Risk-Databa

You can find an example on how to split HAProxy logs into different fields here: [gist.github.com](https://gist.github.com/superstes/a2f6c5d855857e1f10dcb51255fe08c6#haproxy-split) (*via Pipeline Rules*)

Hint: You can use [Lookup Tables](https://graylog.org/post/how-to-use-graylog-lookup-tables/) to check whether an IP address is in your custom safe-IP list and flag it for further filtering. (*to exclude it from being reported*)

## API Service

As Graylog has no option to add advanced filters for the data sent by the notifications, we will have to add a minimal service to do so.
@@ -36,8 +38,8 @@ As Graylog has no option to add advanced filters for the data sent by the notifi
app = Flask(__name__)


@app.route('/report-abuse/haproxy', methods=['POST'])
def report_abuse_haproxy():
@app.route('/report-abuse', methods=['POST'])
def report_abuse():
unique_list = []

for log in request.json['backlog']:
@@ -141,9 +143,9 @@ As Graylog has no option to add advanced filters for the data sent by the notifi

`https://<SERVER>/alerts/notifications`

* **Title**: `Report Abuse - HAProxy`
* **Title**: `Report Abuse`
* **Notification Type**: `HTTP Notification`
* **URL**: `http://127.0.0.1:8000/report-abuse/haproxy`
* **URL**: `http://127.0.0.1:8000/report-abuse`


### Create an Alert-Event
@@ -152,13 +154,13 @@ As Graylog has no option to add advanced filters for the data sent by the notifi

**Event Details**:

* **Title**: `HAProxy Abuse`
* **Title**: `Abuse`
* **Priority**: `Low`

**Condition**:

* **Condition Type**: `Filter & Aggregation`
* **Streams**: Select your HAProxy Access-Log stream
* **Streams**: Select your App's Access-Log stream
* **Search Query**: Filter the logs so that only requests blocked by your security filters are included. Also exclude your `safe-ips` and other trusted sources
* **Search within the last**: 1 minute
* **Execute search every**: 1 minute
@@ -168,6 +170,6 @@ As Graylog has no option to add advanced filters for the data sent by the notifi

**Notifications**:

* **Choose Notification**: `Report Abuse - HAProxy`
* **Choose Notification**: `Report Abuse`
* **Grace Period**: Disable
* **Message Backlog**: 500 (duplicates will be filtered by the API-service)
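
The duplicate filtering mentioned for the message backlog could look roughly like the following. This is a minimal sketch, not the service's actual code: it assumes each backlog entry stores the client address under `fields` with the key `src_ip`. That field name and the helper `unique_source_ips` are hypothetical and depend on how your pipeline rules split the logs.

```python
def unique_source_ips(backlog: list, ip_field: str = 'src_ip') -> list:
    """Return the source IPs of a backlog in first-seen order, without duplicates."""
    seen = set()
    unique = []

    for log in backlog:
        # assumption: the IP is stored in the 'fields' mapping of each message
        ip = log.get('fields', {}).get(ip_field)
        if ip is None or ip in seen:
            continue

        seen.add(ip)
        unique.append(ip)

    return unique


# documentation-range addresses used as sample data
example_backlog = [
    {'fields': {'src_ip': '192.0.2.10'}},
    {'fields': {'src_ip': '198.51.100.7'}},
    {'fields': {'src_ip': '192.0.2.10'}},
    {'fields': {}},
]
print(unique_source_ips(example_backlog))
```

With a one-minute search window and a backlog of 500 messages, a single abusive client can appear many times, so reducing the backlog to unique addresses before reporting keeps the number of reports small.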
10 changes: 8 additions & 2 deletions src/README.md
@@ -1,6 +1,8 @@
# Risk-DB Generator
# Risk-DB Sources

This Python3 scripts are used to generate the Risk-Databases from the reports we received.
These Python3 scripts are used for building and managing the Risk-DB.

You can also run your own dedicated instances of these services.

We want to be transparent. All code that is not security-related will be Open-Source.

@@ -9,3 +11,7 @@ We want to be transparent. All code that is not security-related will be Open-So
Contributions like [reporting issues](https://github.com/O-X-L/risk-db/issues/new), [engaging in discussions](https://github.com/O-X-L/risk-db/discussions) or [PRs](https://github.com/O-X-L/risk-db/pulls) are welcome!

Feel free to share your opinion about possible optimizations/extensions.

## Docker

Dockerized services will be added later on.
76 changes: 76 additions & 0 deletions src/api/README.md
@@ -0,0 +1,76 @@
# Risk-DB API

This Python3 script serves as the API for the Risk-Databases.

We want to be transparent. All code that is not security-related will be Open-Source.

## Contribute

Contributions like [reporting issues](https://github.com/O-X-L/risk-db/issues/new), [engaging in discussions](https://github.com/O-X-L/risk-db/discussions) or [PRs](https://github.com/O-X-L/risk-db/pulls) are welcome!

Feel free to share your opinion about possible optimizations/extensions.

----

## Serviceuser

To run the API as a non-root user, add a dedicated service user:

```bash
useradd -U --shell /usr/sbin/nologin --home-dir /var/local/lib/risk-db --create-home risk-db
```

----

## VirtualEnv

You need to create a Python3 virtualenv to run this app:

```bash
sudo apt install python3-virtualenv
python3 -m virtualenv /var/local/lib/risk-db/venv
source /var/local/lib/risk-db/venv/bin/activate
pip install flask waitress maxminddb
```

----

## Service

You can run it as a systemd service:

```
# file: /etc/systemd/system/risk-db.service
[Unit]
Description=Service to run OXL Risk-DB API Service
Documentation=https://github.com/O-X-L/oxl-riskdb
[Service]
Type=simple
Environment=PYTHONUNBUFFERED=1
WorkingDirectory=/var/local/lib/risk-db
ExecStart=/bin/bash -c 'source /var/local/lib/risk-db/venv/bin/activate && \
python3 /var/local/lib/risk-db/main.py'
User=risk-db
Group=risk-db
Restart=on-failure
RestartSec=10s
StandardOutput=journal
StandardError=journal
SyslogIdentifier=oxl-riskdb
[Install]
WantedBy=multi-user.target
```

Enable & Start:

```
systemctl enable risk-db.service
systemctl start risk-db.service
```
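
Once the service is running, reports can be submitted to the `/api/report` endpoint that `main.py` defines. The following is a minimal client sketch using only the standard library. It assumes the service listens on `127.0.0.1:8000` (the address `main.py` binds to); `API_BASE`, `build_report_request` and `send` are hypothetical names chosen for illustration.

```python
import json
from urllib.request import Request, urlopen

# assumption: a local instance started via the systemd service above
API_BASE = 'http://127.0.0.1:8000'


def build_report_request(ip: str, category: str, comment: str = None, token: str = None) -> Request:
    """Build (but do not send) a POST request for the /api/report endpoint."""
    payload = {'ip': ip, 'cat': category}
    if comment is not None:
        payload['cmt'] = comment

    # the API rejects requests whose Content-Type is not application/json
    headers = {'Content-Type': 'application/json'}
    if token is not None:
        headers['Token'] = token

    return Request(
        url=f'{API_BASE}/api/report',
        data=json.dumps(payload).encode('utf-8'),
        headers=headers,
        method='POST',
    )


def send(req: Request) -> str:
    """Send the request; requires a running instance."""
    with urlopen(req, timeout=5) as resp:
        return resp.read().decode('utf-8')


req = build_report_request('1.1.1.1', 'bot', comment='example')
print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes the payload easy to inspect before anything goes over the network.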



224 changes: 224 additions & 0 deletions src/api/main.py
@@ -0,0 +1,224 @@
#!/usr/bin/env python3

from ipaddress import IPv4Address, IPv6Address, AddressValueError, IPv4Interface, IPv6Interface
from re import sub as regex_replace
from threading import Lock
from json import dumps as json_dumps
from json import loads as json_loads
from time import time
from socket import gethostname
from pathlib import Path
from datetime import datetime

from flask import Flask, request, Response, json, redirect
from waitress import serve
import maxminddb

app = Flask('risk-db')
BASE_DIR = Path('/var/local/lib/risk-db')
RISKY_DB_FILE = {
    4: BASE_DIR / 'risk_ip4_med.mmdb',
    6: BASE_DIR / 'risk_ip6_med.mmdb',
}
ASN_JSON_FILE = BASE_DIR / 'risk_asn_med.json'
NET_JSON_FILES = {
    4: BASE_DIR / 'risk_net4_med.json',
    6: BASE_DIR / 'risk_net6_med.json',
}

RISK_CATEGORIES = ['bot', 'attack', 'crawler', 'rate', 'hosting', 'vpn', 'proxy', 'probe']
RISK_REPORT_DIR = BASE_DIR / 'reports'
TOKENS = []
NET_SIZE = {4: '24', 6: '64'}
report_lock = Lock()


def _valid_ipv4(ip: str) -> bool:
    try:
        IPv4Address(ip)
        return True

    except AddressValueError:
        return False


def _valid_public_ip(ip: str) -> bool:
    ip = str(ip)
    try:
        ip = IPv4Address(ip)
        return ip.is_global and \
            not ip.is_loopback and \
            not ip.is_reserved and \
            not ip.is_multicast and \
            not ip.is_link_local

    except AddressValueError:
        try:
            ip = IPv6Address(ip)
            return ip.is_global and \
                not ip.is_loopback and \
                not ip.is_reserved and \
                not ip.is_multicast and \
                not ip.is_link_local

        except AddressValueError:
            return False


def _valid_asn(_asn: str) -> bool:
    return _asn.isdigit() and 0 <= int(_asn) <= 4_294_967_294


def _safe_comment(cmt: str) -> str:
    return regex_replace(r'[^\sa-zA-Z0-9_=+.-]', '', cmt)[:50]


def _response_json(code: int, data: dict) -> Response:
    return app.response_class(
        response=json.dumps(data, indent=2),
        status=code,
        mimetype='application/json'
    )


def _get_ipv(ip: str) -> int:
    if _valid_ipv4(ip):
        return 4

    return 6


def _get_src_ip() -> str:
    if _valid_public_ip(request.remote_addr):
        return request.remote_addr

    if 'X-Real-IP' in request.headers:
        return request.headers['X-Real-IP'].replace('::ffff:', '')

    if 'X-Forwarded-For' in request.headers:
        return request.headers['X-Forwarded-For'].replace('::ffff:', '')

    return request.remote_addr


# curl -XPOST https://risk.oxl.app/api/report --data '{"ip": "1.1.1.1", "cat": "bot"}' -H 'Content-Type: application/json'
@app.route('/api/report', methods=['POST'])
def report() -> Response:
    if 'Content-Type' not in request.headers or request.headers['Content-Type'] != 'application/json':
        return _response_json(code=400, data={'msg': 'Expected JSON'})

    data = request.get_json()

    if 'ip' in data and data['ip'].startswith('::ffff:'):
        data['ip'] = data['ip'].replace('::ffff:', '')

    if 'ip' not in data or not _valid_public_ip(data['ip']):
        return _response_json(code=400, data={'msg': 'Invalid IP provided'})

    if 'cat' not in data or data['cat'].lower() not in RISK_CATEGORIES:
        return _response_json(
            code=400,
            data={'msg': f'Invalid Category provided - must be one of: {RISK_CATEGORIES}'},
        )

    # 'by' must hold the reporter's IP: call _get_src_ip(), as the bare function object is not JSON-serializable
    r = {
        'ip': data['ip'], 'cat': data['cat'].lower(), 'time': int(time()),
        'v': 4 if _valid_ipv4(data['ip']) else 6, 'cmt': None, 'token': None, 'by': _get_src_ip(),
    }

    if 'cmt' in data:
        r['cmt'] = _safe_comment(data['cmt'])

    if 'Token' in request.headers and request.headers['Token'] in TOKENS:
        r['token'] = request.headers['Token']

    out_file = RISK_REPORT_DIR / f'{datetime.now().strftime("%Y-%m-%d")}_{gethostname()}.txt'
    with report_lock:
        with open(out_file, 'a+', encoding='utf-8') as f:
            f.write(json_dumps(r) + '\n')

    return _response_json(code=200, data={'msg': 'Reported'})


@app.route('/api/ip/<ip>', methods=['GET'])
def check(ip) -> Response:
    if ip.startswith('::ffff:'):
        ip = ip.replace('::ffff:', '')

    if not _valid_public_ip(ip):
        return _response_json(code=400, data={'msg': 'Invalid IP provided'})

    try:
        with maxminddb.open_database(RISKY_DB_FILE[_get_ipv(ip)]) as m:
            r = m.get(ip)
            if r is None:
                return _response_json(code=404, data={'msg': 'Provided IP not reported'})

            return _response_json(code=200, data=r)

    except FileNotFoundError:
        return _response_json(code=404, data={'msg': 'Temporary lookup failure'})


@app.route('/api/net/<ip>', methods=['GET'])
def check_net(ip) -> Response:
    if ip.startswith('::ffff:'):
        ip = ip.replace('::ffff:', '')

    if ip.find('/') != -1:
        ip = ip.split('/', 1)[0]

    if not _valid_public_ip(ip):
        return _response_json(code=400, data={'msg': 'Invalid IP provided'})

    ipv = _get_ipv(ip)

    if ipv == 4:
        net = IPv4Interface(f"{ip}/{NET_SIZE[ipv]}").network.network_address.compressed

    else:
        net = IPv6Interface(f"{ip}/{NET_SIZE[ipv]}").network.network_address.compressed

    net = f"{net}/{NET_SIZE[ipv]}"

    try:
        return _response_json(code=200, data={**NET_DATA[ipv][net], 'network': net})

    except KeyError:
        return _response_json(code=404, data={'msg': 'Provided network not reported'})


@app.route('/api/asn/<nr>', methods=['GET'])
def check_asn(nr) -> Response:
    if not _valid_asn(nr):
        return _response_json(code=400, data={'msg': 'Invalid ASN provided'})

    try:
        return _response_json(code=200, data=ASN_DATA[str(nr)])

    except KeyError:
        return _response_json(code=404, data={'msg': 'Provided ASN not reported'})


@app.route('/')
def catch_base():
    return redirect(f"/api/ip/{_get_src_ip()}", code=302)


@app.route('/<path:path>')
def catch_all(path):
    del path
    return redirect(f"/api/ip/{_get_src_ip()}", code=302)


if __name__ == '__main__':
    with open(ASN_JSON_FILE, 'r', encoding='utf-8') as f:
        ASN_DATA = json_loads(f.read())

    NET_DATA = {}

    for _ipv, file in NET_JSON_FILES.items():
        with open(file, 'r', encoding='utf-8') as f:
            NET_DATA[_ipv] = json_loads(f.read())

    serve(app, host='127.0.0.1', port=8000)
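
The `/api/net/<ip>` handler maps an address to its enclosing `/24` (IPv4) or `/64` (IPv6) network before looking it up. As a minimal sketch of that derivation, the same lookup key can also be computed with `ipaddress.ip_network(..., strict=False)`. The helper name `containing_network` is hypothetical and not part of the commit; the prefix sizes mirror `NET_SIZE` from the file above.

```python
from ipaddress import ip_address, ip_network

# same prefix lengths as NET_SIZE in main.py
NET_SIZE = {4: 24, 6: 64}


def containing_network(ip: str) -> str:
    """Return the enclosing /24 or /64 network of an address in CIDR notation."""
    version = ip_address(ip).version
    # strict=False masks the host bits instead of raising ValueError
    return str(ip_network(f'{ip}/{NET_SIZE[version]}', strict=False))


# documentation-range addresses used as sample input
print(containing_network('192.0.2.77'))
print(containing_network('2001:db8:1:2:3:4:5:6'))
```

Letting `ip_address` determine the IP version avoids a separate branch for IPv4 and IPv6, and the resulting string has the same `<network>/<prefix>` shape that the handler uses as a lookup key.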