[finish] backend documentation
bassamadnan committed Aug 17, 2024
1 parent c56cc7b commit f9c27a3
Showing 7 changed files with 240 additions and 6 deletions.
Binary file added docs/assets/AOSSIE-logo.png
Binary file added docs/assets/backend-architecture.jpeg
108 changes: 108 additions & 0 deletions docs/backend/database.md
@@ -1 +1,109 @@
# Database

## Overview

PictoPy uses several SQLite databases to manage various aspects of the application. This document provides an overview of each database, its structure, and its primary operations.

!!! note "Database Engine"
    All databases in PictoPy use SQLite, a lightweight, serverless database engine.

## Album Database

### File Location

The database path is defined in the configuration file as `ALBUM_DATABASE_PATH`.

### Table Structure

| Column Name | Data Type | Constraints | Description |
|-------------|-----------|-------------|-------------|
| album_name | TEXT | PRIMARY KEY | Unique name of the album |
| image_ids | TEXT | | JSON-encoded list of image IDs |
| description | TEXT | | Album description |
| date_created | TIMESTAMP | DEFAULT CURRENT_TIMESTAMP | Creation date of the album |

### Functionality

The `albums.py` file contains functions for managing photo albums. It allows for creating and deleting albums, adding and removing photos from albums, retrieving album photos, editing album descriptions, and getting all albums.

!!! tip "JSON Encoding"
    The `image_ids` field uses JSON encoding to store lists in a TEXT field.
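
The table and the JSON-encoded `image_ids` column can be sketched as follows. This is a minimal illustration, not the code in `albums.py`: the table name `albums`, the helper names, and the in-memory database are assumptions made for the example.

```python
import json
import sqlite3

# Assumption: an in-memory database stands in for the file at ALBUM_DATABASE_PATH.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS albums (
        album_name TEXT PRIMARY KEY,
        image_ids TEXT,
        description TEXT,
        date_created TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)


def create_album(name, description=""):
    # Every album starts with an empty JSON list of image IDs.
    conn.execute(
        "INSERT INTO albums (album_name, image_ids, description) VALUES (?, ?, ?)",
        (name, json.dumps([]), description),
    )


def get_album_image_ids(name):
    # Decode the TEXT column back into a Python list.
    row = conn.execute(
        "SELECT image_ids FROM albums WHERE album_name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else []


def add_image_to_album(name, image_id):
    # Read, modify, and re-encode the list; duplicates are skipped.
    ids = get_album_image_ids(name)
    if image_id not in ids:
        ids.append(image_id)
    conn.execute(
        "UPDATE albums SET image_ids = ? WHERE album_name = ?",
        (json.dumps(ids), name),
    )


create_album("holiday", "Summer trip")
add_image_to_album("holiday", 3)
add_image_to_album("holiday", 7)
add_image_to_album("holiday", 3)  # already present, ignored
print(get_album_image_ids("holiday"))  # [3, 7]
```

Storing the list as JSON keeps the schema to a single table, at the cost of having to decode and re-encode the whole list on every change.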

## Faces Database

### File Location

The database path is defined in the configuration file as `FACES_DATABASE_PATH`.

### Table Structure

| Column Name | Data Type | Constraints | Description |
|-------------|-----------|-------------|-------------|
| id | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique identifier for each face entry |
| image_id | INTEGER | FOREIGN KEY | References image_id_mapping(id) |
| embeddings | TEXT | | JSON-encoded face embeddings |

### Functionality

The `faces.py` file manages face embeddings for images. It provides functionality for inserting and retrieving face embeddings, getting all face embeddings, deleting face embeddings for an image, and cleaning up orphaned face embeddings.

!!! warning "Referential Integrity"
    The `image_id` column maintains referential integrity with the Images database.
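
A minimal sketch of storing and reading JSON-encoded embeddings is shown below. It assumes, for illustration only, that `faces` and `image_id_mapping` live in the same SQLite file and that the helper names match nothing in particular in `faces.py`. Note that SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import json
import sqlite3

# Assumption: both tables share one in-memory database for this example.
conn = sqlite3.connect(":memory:")
# SQLite ignores FOREIGN KEY constraints unless this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE image_id_mapping (id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT UNIQUE)"
)
conn.execute(
    """
    CREATE TABLE faces (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        image_id INTEGER,
        embeddings TEXT,
        FOREIGN KEY (image_id) REFERENCES image_id_mapping(id)
    )
    """
)


def insert_face_embeddings(image_id, embeddings):
    # One row per image; the embeddings for all faces are stored as a JSON list of lists.
    conn.execute(
        "INSERT INTO faces (image_id, embeddings) VALUES (?, ?)",
        (image_id, json.dumps(embeddings)),
    )


def get_face_embeddings(image_id):
    row = conn.execute(
        "SELECT embeddings FROM faces WHERE image_id = ?", (image_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn.execute("INSERT INTO image_id_mapping (path) VALUES (?)", ("/photos/a.jpg",))
insert_face_embeddings(1, [[0.1, 0.2], [0.3, 0.4]])  # toy 2-D embeddings

try:
    insert_face_embeddings(999, [[0.0, 0.0]])  # no such image
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

With the pragma enabled, the second insert fails because image `999` does not exist in `image_id_mapping`.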

## Images Database

### File Location

The database path is defined in the configuration file as `IMAGES_DATABASE_PATH`.

### Table Structures

#### 1. image_id_mapping

| Column Name | Data Type | Constraints | Description |
|-------------|-----------|-------------|-------------|
| id | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique identifier for each image |
| path | TEXT | UNIQUE | Absolute path to the image file |

#### 2. images

| Column Name | Data Type | Constraints | Description |
|-------------|-----------|-------------|-------------|
| id | INTEGER | PRIMARY KEY, FOREIGN KEY | References image_id_mapping(id) |
| class_ids | TEXT | | JSON-encoded class IDs |
| metadata | TEXT | | JSON-encoded metadata |

### Functionality

The `images.py` file manages image information, including paths, object classes, and metadata. It provides functions for inserting and deleting images, retrieving image paths and IDs, getting object classes for an image, and checking if an image is in the database.

!!! info "Path Handling"
    The system uses absolute paths for image files to ensure consistency across different operations.
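
The two tables and the absolute-path convention can be sketched like this. The helper names and the in-memory database are assumptions for the example; the functions in `images.py` may be organized differently.

```python
import json
import os
import sqlite3

# Assumption: an in-memory database stands in for the file at IMAGES_DATABASE_PATH.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE image_id_mapping (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        path TEXT UNIQUE
    );
    CREATE TABLE images (
        id INTEGER PRIMARY KEY,
        class_ids TEXT,
        metadata TEXT,
        FOREIGN KEY (id) REFERENCES image_id_mapping(id)
    );
    """
)


def get_id_from_path(path):
    # Always look up by absolute path so relative spellings resolve to the same row.
    row = conn.execute(
        "SELECT id FROM image_id_mapping WHERE path = ?", (os.path.abspath(path),)
    ).fetchone()
    return row[0] if row else None


def insert_image(path, class_ids, metadata):
    abs_path = os.path.abspath(path)
    # The UNIQUE constraint on path makes a repeated insert a no-op.
    conn.execute("INSERT OR IGNORE INTO image_id_mapping (path) VALUES (?)", (abs_path,))
    image_id = get_id_from_path(abs_path)
    conn.execute(
        "INSERT OR REPLACE INTO images (id, class_ids, metadata) VALUES (?, ?, ?)",
        (image_id, json.dumps(class_ids), json.dumps(metadata)),
    )
    return image_id


first = insert_image("photos/cat.jpg", [15], {"width": 640})
again = insert_image("./photos/cat.jpg", [15, 16], {"width": 640})  # same file, different spelling
```

Both calls return the same ID because `os.path.abspath` normalizes the two spellings to one path.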

## YOLO Mappings Database

### File Location

The database path is defined in the configuration file as `MAPPINGS_DATABASE_PATH`.

### Table Structure

| Column Name | Data Type | Constraints | Description |
|-------------|-----------|-------------|-------------|
| class_id | INTEGER | PRIMARY KEY | YOLO class identifier |
| name | TEXT | NOT NULL | Human-readable class name |

### Functionality

The `yolo_mapping.py` file is responsible for creating and populating the mappings table with YOLO class names. This database stores mappings between YOLO class IDs and their corresponding names.
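
Populating the table might look like the sketch below. The table name `mappings`, the helper `class_name`, and the three-entry class list are assumptions for illustration; the real list comes from the YOLO model in use.

```python
import sqlite3

# Illustrative subset only; the full list of class names comes from the model.
CLASS_NAMES = ["person", "bicycle", "car"]

# Assumption: an in-memory database stands in for the file at MAPPINGS_DATABASE_PATH.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS mappings (class_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# enumerate() pairs each name with its class ID: (0, "person"), (1, "bicycle"), ...
conn.executemany(
    "INSERT OR REPLACE INTO mappings (class_id, name) VALUES (?, ?)",
    list(enumerate(CLASS_NAMES)),
)


def class_name(class_id):
    row = conn.execute(
        "SELECT name FROM mappings WHERE class_id = ?", (class_id,)
    ).fetchone()
    return row[0] if row else None


print(class_name(0))  # person
```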

## Database Interactions

The databases in PictoPy interact with each other in the following ways:

1. The Albums database uses image IDs from the Images database to manage photos within albums.
2. The Faces database references image IDs from the Images database to associate face embeddings with specific images.
3. The Images database uses class IDs that correspond to the YOLO Mappings database for object recognition.

!!! example "Cross-Database Operation"
    When adding a photo to an album, the system first checks if the image exists in the Images database, then adds its ID to the album in the Albums database.
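
The cross-database operation above can be sketched with two separate connections. The function name `add_photo_to_album`, the table name `albums`, and the error handling are assumptions for this example rather than the project's actual API.

```python
import json
import os
import sqlite3

# Assumption: two in-memory databases stand in for the Images and Albums database files.
images_db = sqlite3.connect(":memory:")
albums_db = sqlite3.connect(":memory:")
images_db.execute(
    "CREATE TABLE image_id_mapping (id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT UNIQUE)"
)
albums_db.execute(
    "CREATE TABLE albums (album_name TEXT PRIMARY KEY, image_ids TEXT, description TEXT)"
)


def add_photo_to_album(album_name, image_path):
    # Step 1: the image must already be known to the Images database.
    row = images_db.execute(
        "SELECT id FROM image_id_mapping WHERE path = ?",
        (os.path.abspath(image_path),),
    ).fetchone()
    if row is None:
        raise ValueError(f"image not in database: {image_path}")

    # Step 2: append its ID to the album's JSON-encoded list.
    album = albums_db.execute(
        "SELECT image_ids FROM albums WHERE album_name = ?", (album_name,)
    ).fetchone()
    if album is None:
        raise ValueError(f"no such album: {album_name}")
    ids = json.loads(album[0])
    if row[0] not in ids:
        ids.append(row[0])
    albums_db.execute(
        "UPDATE albums SET image_ids = ? WHERE album_name = ?",
        (json.dumps(ids), album_name),
    )
    return ids


images_db.execute(
    "INSERT INTO image_id_mapping (path) VALUES (?)", (os.path.abspath("photos/cat.jpg"),)
)
albums_db.execute(
    "INSERT INTO albums (album_name, image_ids, description) VALUES (?, ?, ?)",
    ("pets", json.dumps([]), ""),
)
album_ids = add_photo_to_album("pets", "photos/cat.jpg")

try:
    add_photo_to_album("pets", "photos/unknown.jpg")
    missing_rejected = False
except ValueError:
    missing_rejected = True
```
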
94 changes: 94 additions & 0 deletions docs/backend/image-processing.md
@@ -1 +1,95 @@
# Image Processing


We use `asyncio` to process multiple images concurrently in the background without blocking the frontend. The implementation is in
`app/routes/images.py`.
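
As a rough sketch of the idea, blocking per-image work can be handed to worker threads so the event loop stays responsive. The function names here are hypothetical and `time.sleep` stands in for model inference; the actual code in `app/routes/images.py` may be structured differently. `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def process_image(path):
    # Hypothetical stand-in for blocking work such as detection and embedding.
    time.sleep(0.01)
    return {"path": path, "status": "done"}


async def process_images(paths):
    # Each blocking call runs in a worker thread, so the event loop
    # (and any request handlers on it) is not blocked while images are processed.
    tasks = [asyncio.to_thread(process_image, p) for p in paths]
    # gather() returns results in the same order as the input tasks.
    return await asyncio.gather(*tasks)


results = asyncio.run(process_images(["a.jpg", "b.jpg", "c.jpg"]))
print([r["path"] for r in results])  # ['a.jpg', 'b.jpg', 'c.jpg']
```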

PictoPy uses several models for its tagging capabilities.
The models discussed below are the defaults. To change them, go to the `app/models` directory and update the paths in the configuration files.

## Object Detection with YOLOv8

We use YOLOv8 to spot objects in your photos. Here's what it does:

YOLOv8 takes your image and runs it through its model to work out which objects are in the image and where they are.
The result is a list of objects, their locations, and the model's confidence in each detection. If the `person` class is predicted, we pass the image on
to the face detection model, which is discussed in the next section.
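
The hand-off from object detection to face detection can be sketched as follows. The detection format (a list of dicts), the function names, and the placeholder `detect_faces` are all assumptions made for this example.

```python
def detect_faces(image_path):
    # Placeholder for the face detection step described in the next section.
    return []


def tag_image(image_path, detections, class_names):
    # Collect the distinct class names predicted for this image.
    tags = sorted({class_names[d["class_id"]] for d in detections})
    # Only run the face pipeline when a person was detected.
    faces = detect_faces(image_path) if "person" in tags else None
    return {"tags": tags, "faces": faces}


# Hypothetical detector output: class ID, confidence score, and (x1, y1, x2, y2) box.
detections = [
    {"class_id": 0, "score": 0.91, "box": (10, 20, 110, 220)},
    {"class_id": 16, "score": 0.84, "box": (150, 90, 300, 240)},
]
class_names = {0: "person", 16: "dog"}

with_person = tag_image("park.jpg", detections, class_names)
without_person = tag_image("dog.jpg", detections[1:], class_names)
```

Here `with_person["faces"]` is an empty list (the placeholder found no faces), while `without_person["faces"]` is `None` because face detection was skipped.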


???+ tip "Fun Fact"
    YOLO stands for "You Only Look Once". We use the model provided by [Ultralytics](https://github.com/ultralytics/ultralytics) by default.

## Face Detection and Recognition

For faces, we do a bit more:

We start with a version of YOLOv8 trained specifically to find faces. Once we find a face, we zoom in on it
(by cropping it to `160x160`, the shape FaceNet expects) and pass it to our FaceNet model.
FaceNet then creates a unique 'embedding' for each face: a representation of the face as a list of numbers.


???+ tip "Fun Fact"
    We use another YOLOv8 model for this by default. It was trained on top of the model provided by Ultralytics and is called
    [yolov8-face](https://github.com/akanametov/yolo-face).

???+ note "What's an embedding?"
    An embedding is a list of numbers that represents a face. Similar faces have similar embeddings. FaceNet creates a 512-dimensional embedding
    for each face found in an image.

## Face Clustering

Now, here's where it gets interesting:

We use an algorithm called DBSCAN to group similar faces together. This happens automatically as you add new photos to the system.
We recluster after every 5 photos added (this can be changed in the code). Between reclustering runs, each face in a new photo is assigned to a cluster
based on the distance between its embedding and the mean embedding of each cluster.
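
The assignment step between reclustering runs can be sketched in pure Python as below. The function names, the `max_distance` cutoff (set here to the DBSCAN `eps` value of 0.3), and the cluster layout are assumptions for illustration; only the "every 5 photos" interval and the distance-to-mean idea come from the description above.

```python
import math

RECLUSTER_EVERY = 5  # reclustering interval described above


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def mean_vector(vectors):
    # Component-wise mean of a list of equal-length vectors.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def assign_cluster(embedding, clusters, max_distance=0.3):
    """Return the ID of the nearest cluster mean, or None if none is close enough."""
    best_id, best_distance = None, max_distance
    for cluster_id, members in clusters.items():
        distance = cosine_distance(embedding, mean_vector(members))
        if distance <= best_distance:
            best_id, best_distance = cluster_id, distance
    return best_id


def should_recluster(photos_added_since_last_run):
    return photos_added_since_last_run >= RECLUSTER_EVERY


# Toy 2-D embeddings; real FaceNet embeddings have 512 dimensions.
clusters = {0: [[1.0, 0.0], [0.9, 0.1]], 1: [[0.0, 1.0]]}
near = assign_cluster([0.95, 0.05], clusters)   # matches the mean of cluster 0
far = assign_cluster([-1.0, 0.0], clusters)     # too far from every cluster mean
```

Comparing against cluster means is cheap, which is why it suits the per-photo path, while the full DBSCAN pass runs only periodically.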

## How It All Fits Together

When you add a new photo, we first look for objects and faces. If we find faces, we generate embeddings for them. These embeddings then get added to our face clusters.
All this information gets stored in our database so we can find it later.


## Under the Hood

We use ONNX Runtime to run our models quickly. Everything is stored in SQLite databases, which keeps the data easy to manage.
The system updates clusters as you add or remove photos, so the face groups stay current as your library changes.

## PictoPy Model Parameters

Here are some key parameters for the main models used in PictoPy's image processing pipeline.

### YOLOv8 Object Detection

| Parameter | Value | Description |
|-----------|-------|-------------|
| `conf_thres` | 0.7 | Confidence threshold for object detection |
| `iou_thres` | 0.5 | IoU (Intersection over Union) threshold for NMS |
| Input Shape | Varies | Determined dynamically from the model |
| Output | Multiple | Includes bounding boxes, scores, and class IDs |
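
To show how the two thresholds interact, here is a minimal sketch of confidence filtering followed by non-maximum suppression (NMS). It is class-agnostic and written in pure Python for clarity; the box format `(x1, y1, x2, y2)` and the function names are assumptions, and the project's own post-processing may differ.

```python
def iou(a, b):
    # Boxes are (x1, y1, x2, y2). Intersection over Union = overlap area / combined area.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, conf_thres=0.7, iou_thres=0.5):
    """Return indices of the boxes kept after thresholding and NMS."""
    # 1. Drop low-confidence boxes, then visit the rest from highest score down.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thres),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # 2. Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_thres for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.75, 0.3]
print(filter_detections(boxes, scores))  # [0, 2]
```

Box 1 is suppressed because its IoU with box 0 is about 0.68, above `iou_thres`; box 3 is dropped earlier because its score is below `conf_thres`.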

### Face Detection (YOLOv8 variant)

| Parameter | Value | Description |
|-----------|-------|-------------|
| `conf_thres` | 0.2 | Confidence threshold for face detection |
| `iou_thres` | 0.3 | IoU threshold for NMS in face detection |
| Model Path | `DEFAULT_FACE_DETECTION_MODEL` | Path to the face detection model file |

### FaceNet (Face Recognition)

| Parameter | Value | Description |
|-----------|-------|-------------|
| Model Path | `DEFAULT_FACENET_MODEL` | Path to the FaceNet model file |
| Input Shape | (1, 3, 160, 160) | Expected input shape for face images |
| Output | 512-dimensional vector | Face embedding dimension |

### Face Clustering (DBSCAN)

| Parameter | Value | Description |
|-----------|-------|-------------|
| `eps` | 0.3 | Maximum distance between two samples for them to be considered as in the same neighborhood |
| `min_samples` | 2 | Number of samples in a neighborhood for a point to be considered as a core point |
| `metric` | "cosine" | Distance metric used for clustering |
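
To make the roles of `eps`, `min_samples`, and the cosine metric concrete, here is a small pure-Python DBSCAN written for illustration. It is a teaching sketch with toy 2-D vectors, not the project's implementation; in practice a library implementation would normally be preferred.

```python
import math


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def dbscan(points, eps=0.3, min_samples=2):
    """Return one cluster label per point; -1 marks noise."""
    n = len(points)
    # The neighborhood of a point includes the point itself.
    neighbors = [
        [j for j in range(n) if cosine_distance(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # not a core point; may later become a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reached from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep expanding the cluster
    return labels


# Toy embeddings: two pairs pointing in similar directions and one outlier.
points = [[1.0, 0.0], [0.95, 0.1], [0.0, 1.0], [0.1, 0.95], [-1.0, -1.0]]
print(dbscan(points))  # [0, 0, 1, 1, -1]
```

With `min_samples` set to 2, any two faces within `eps` of each other form a cluster, and a face with no close match is labeled as noise.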

Note: Some of these values are default parameters and can be adjusted when initializing the models or during runtime, depending on the specific use case or performance requirements.
7 changes: 6 additions & 1 deletion docs/index.md
@@ -3,8 +3,13 @@
PictoPy is a modern desktop app designed to transform the handling of digital photos. It facilitates efficient gallery management with a robust focus on privacy, offering smart tagging capabilities for photos based on objects, faces, or scenes.

<br>
This project was announced by AOSSIE, an umbrella organization and was to be implemented from scratch. It provides features such as object detection and face similarity,
<div style="text-align: center;">
<img src="assets/AOSSIE-logo.png" alt="AOSSIE Logo" style="display:flex; margin:0 auto; justify-content: center;">
</div>

This project was announced by [AOSSIE](https://aossie.org/), an umbrella organization, and was to be implemented from scratch. It provides features such as object detection and face similarity,
offering smart tagging capabilities for photos based on objects and faces.

<div style="display:flex; margin:0 auto; justify-content: center;">
<div style="width:33%">
<h2>Overview</h2>
19 changes: 19 additions & 0 deletions docs/overview/architecture.md
@@ -1 +1,20 @@
# Architecture
## Frontend

## Backend
<div style="text-align: center;">
<img src="../../assets/backend-architecture.jpeg" alt="Backend Architecture" style="width: 80%; max-width: 600px; height: auto; display: block; margin: 0 auto;">
</div>

<br>
For the backend, we rely on several technologies. Our database is SQLite, and we use asyncio for concurrent processing because of its compatibility
with FastAPI. Our models come from various sources: we use YOLO models for object and face detection, and FaceNet to generate embeddings
for the detected faces. All of these models run on ONNX Runtime to avoid heavy dependencies, keeping the application lightweight.


We use the DBSCAN algorithm to cluster the generated face embeddings. All of our data is stored in SQL (SQLite), and our API calls rely
on queries made by the backend.

!!! note "Note"
    We discuss all of the features and configuration of our application in later sections of the documentation. They are intended for both developers
    and users who want to use the app. A Postman collection has also been added, which can be found in our API section.
18 changes: 13 additions & 5 deletions mkdocs.yml
@@ -78,11 +78,19 @@ nav:
- Gallery View: frontend/gallery-view.md

extra:
analytics:
provider: google
property: G-N3Q505TMQ6
social:
- icon: fontawesome/brands/twitter
link: https://x.com/aossie_org
- icon: fontawesome/solid/envelope
link: mailto:[email protected]
name: Contact by Email
- icon: fontawesome/brands/gitlab
link: https://gitlab.com/aossie
name: AOSSIE on GitLab
- icon: fontawesome/brands/github
link: https://github.com/AOSSIE-Org/PictoPy/
name: PictoPy on GitHub
- icon: fontawesome/brands/discord
link: https://discord.com/invite/6mFZ2S846n
name: Join AOSSIE on Discord
- icon: fontawesome/brands/twitter
link: https://x.com/aossie_org
name: Follow AOSSIE on Twitter
