Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent c56cc7b, commit f9c27a3
Showing 7 changed files with 240 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,109 @@ | ||
# Database | ||
|
||
## Overview | ||
|
||
PictoPy uses several SQLite databases to manage various aspects of the application. This document provides an overview of each database, its structure, and its primary operations. | ||
|
||
!!! note "Database Engine" | ||
All databases in PictoPy use SQLite, a lightweight, serverless database engine. | ||
|
||
## Album Database | ||
|
||
### File Location | ||
|
||
The database path is defined in the configuration file as `ALBUM_DATABASE_PATH`. | ||
|
||
### Table Structure | ||
|
||
| Column Name | Data Type | Constraints | Description | | ||
|-------------|-----------|-------------|-------------| | ||
| album_name | TEXT | PRIMARY KEY | Unique name of the album | | ||
| image_ids | TEXT | | JSON-encoded list of image IDs | | ||
| description | TEXT | | Album description | | ||
| date_created | TIMESTAMP | DEFAULT CURRENT_TIMESTAMP | Creation date of the album | | ||
|
||
### Functionality | ||
|
||
The `albums.py` file contains functions for managing photo albums. It allows for creating and deleting albums, adding and removing photos from albums, retrieving album photos, editing album descriptions, and getting all albums. | ||
|
||
!!! tip "JSON Encoding" | ||
The `image_ids` field uses JSON encoding to store lists in a TEXT field. | ||
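The JSON-encoded `image_ids` column can be illustrated with a minimal sketch. It is not the code in `albums.py`: the table name `albums`, the helper names, and the in-memory connection are assumptions made for the example, and only the column layout comes from the table above.

```python
import json
import sqlite3

# In-memory database for illustration; the real path comes from ALBUM_DATABASE_PATH.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS albums (  -- table name is an assumption
        album_name   TEXT PRIMARY KEY,
        image_ids    TEXT,
        description  TEXT,
        date_created TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)


def create_album(name, description=""):
    # A new album starts with an empty JSON list in the TEXT column.
    conn.execute(
        "INSERT INTO albums (album_name, image_ids, description) VALUES (?, ?, ?)",
        (name, json.dumps([]), description),
    )


def get_album_image_ids(name):
    row = conn.execute(
        "SELECT image_ids FROM albums WHERE album_name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"album not found: {name}")
    return json.loads(row[0])  # decode the TEXT value back into a Python list


def add_photo(name, image_id):
    ids = get_album_image_ids(name)
    if image_id not in ids:  # skip IDs that are already in the album
        ids.append(image_id)
    conn.execute(
        "UPDATE albums SET image_ids = ? WHERE album_name = ?",
        (json.dumps(ids), name),
    )


create_album("holiday", "Summer trip")
add_photo("holiday", 3)
add_photo("holiday", 7)
add_photo("holiday", 3)  # duplicate, ignored
print(get_album_image_ids("holiday"))
```

Storing the list as JSON keeps the schema to a single table, at the cost of reading and rewriting the whole list on every change.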
|
||
## Faces Database | ||
|
||
### File Location | ||
|
||
The database path is defined in the configuration file as `FACES_DATABASE_PATH`. | ||
|
||
### Table Structure | ||
|
||
| Column Name | Data Type | Constraints | Description | | ||
|-------------|-----------|-------------|-------------| | ||
| id | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique identifier for each face entry | | ||
| image_id | INTEGER | FOREIGN KEY | References image_id_mapping(id) | | ||
| embeddings | TEXT | | JSON-encoded face embeddings | | ||
|
||
### Functionality | ||
|
||
The `faces.py` file manages face embeddings for images. It provides functionality for inserting and retrieving face embeddings, getting all face embeddings, deleting face embeddings for an image, and cleaning up orphaned face embeddings. | ||
|
||
!!! warning "Referential Integrity" | ||
The `image_id` column maintains referential integrity with the Images database. | ||
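As a rough sketch of how embeddings might be stored and orphaned rows cleaned up, consider the following. The table name `faces`, the function names, and the single in-memory database holding both tables are assumptions for illustration, not the contents of `faces.py`. In standard SQLite builds, foreign keys are only enforced when `PRAGMA foreign_keys = ON` is set on the connection, which is one reason an explicit cleanup step can be useful.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # illustration only
conn.executescript(
    """
    CREATE TABLE image_id_mapping (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        path TEXT UNIQUE
    );
    CREATE TABLE faces (  -- table name is an assumption
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        image_id   INTEGER,
        embeddings TEXT,
        FOREIGN KEY (image_id) REFERENCES image_id_mapping(id)
    );
    """
)


def insert_face_embeddings(image_id, embeddings):
    # Embeddings are a list of vectors (one per face), serialized as JSON text.
    conn.execute(
        "INSERT INTO faces (image_id, embeddings) VALUES (?, ?)",
        (image_id, json.dumps(embeddings)),
    )


def get_face_embeddings(image_id):
    row = conn.execute(
        "SELECT embeddings FROM faces WHERE image_id = ?", (image_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def cleanup_orphaned_faces():
    # Delete face rows whose image no longer exists; return how many were removed.
    cur = conn.execute(
        "DELETE FROM faces WHERE image_id NOT IN (SELECT id FROM image_id_mapping)"
    )
    return cur.rowcount


conn.execute("INSERT INTO image_id_mapping (path) VALUES (?)", ("/photos/a.jpg",))
insert_face_embeddings(1, [[0.1, 0.2, 0.3]])
insert_face_embeddings(99, [[0.4, 0.5, 0.6]])  # no image has id 99
removed = cleanup_orphaned_faces()
```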
|
||
## Images Database | ||
|
||
### File Location | ||
|
||
The database path is defined in the configuration file as `IMAGES_DATABASE_PATH`. | ||
|
||
### Table Structures | ||
|
||
#### 1. image_id_mapping | ||
|
||
| Column Name | Data Type | Constraints | Description | | ||
|-------------|-----------|-------------|-------------| | ||
| id | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique identifier for each image | | ||
| path | TEXT | UNIQUE | Absolute path to the image file | | ||
|
||
#### 2. images | ||
|
||
| Column Name | Data Type | Constraints | Description | | ||
|-------------|-----------|-------------|-------------| | ||
| id | INTEGER | PRIMARY KEY, FOREIGN KEY | References image_id_mapping(id) | | ||
| class_ids | TEXT | | JSON-encoded class IDs | | ||
| metadata | TEXT | | JSON-encoded metadata | | ||
|
||
### Functionality | ||
|
||
The `images.py` file manages image information, including paths, object classes, and metadata. It provides functions for inserting and deleting images, retrieving image paths and IDs, getting object classes for an image, and checking if an image is in the database. | ||
|
||
!!! info "Path Handling" | ||
The system uses absolute paths for image files to ensure consistency across different operations. | ||
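The two-table layout and the absolute-path convention can be sketched as follows. The function names and the in-memory database are assumptions for the example; `images.py` may be organized differently.

```python
import json
import os
import sqlite3

conn = sqlite3.connect(":memory:")  # illustration only
conn.executescript(
    """
    CREATE TABLE image_id_mapping (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        path TEXT UNIQUE
    );
    CREATE TABLE images (
        id        INTEGER PRIMARY KEY,
        class_ids TEXT,
        metadata  TEXT,
        FOREIGN KEY (id) REFERENCES image_id_mapping(id)
    );
    """
)


def insert_image(path, class_ids, metadata):
    # Normalize to an absolute path so the same file always maps to one row.
    abs_path = os.path.abspath(path)
    conn.execute(
        "INSERT OR IGNORE INTO image_id_mapping (path) VALUES (?)", (abs_path,)
    )
    image_id = conn.execute(
        "SELECT id FROM image_id_mapping WHERE path = ?", (abs_path,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO images (id, class_ids, metadata) VALUES (?, ?, ?)",
        (image_id, json.dumps(class_ids), json.dumps(metadata)),
    )
    return image_id


def is_image_in_database(path):
    row = conn.execute(
        "SELECT 1 FROM image_id_mapping WHERE path = ?", (os.path.abspath(path),)
    ).fetchone()
    return row is not None


def get_class_ids(path):
    row = conn.execute(
        "SELECT i.class_ids FROM images i "
        "JOIN image_id_mapping m ON m.id = i.id WHERE m.path = ?",
        (os.path.abspath(path),),
    ).fetchone()
    return json.loads(row[0]) if row else None


first_id = insert_image("photos/cat.jpg", [15], {"width": 640, "height": 480})
# The same file given as an absolute path resolves to the same row.
second_id = insert_image(os.path.abspath("photos/cat.jpg"), [15, 16], {})
```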
|
||
## YOLO Mappings Database | ||
|
||
### File Location | ||
|
||
The database path is defined in the configuration file as `MAPPINGS_DATABASE_PATH`. | ||
|
||
### Table Structure | ||
|
||
| Column Name | Data Type | Constraints | Description | | ||
|-------------|-----------|-------------|-------------| | ||
| class_id | INTEGER | PRIMARY KEY | YOLO class identifier | | ||
| name | TEXT | NOT NULL | Human-readable class name | | ||
|
||
### Functionality | ||
|
||
The `yolo_mapping.py` file is responsible for creating and populating the mappings table with YOLO class names. This database stores mappings between YOLO class IDs and their corresponding names. | ||
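A minimal sketch of creating and populating such a table is shown below. The table name `mappings`, the helper `class_name`, and the truncated class list are assumptions for illustration; the real file presumably loads the full list of class names used by the model.

```python
import sqlite3

# Only the first few class names, for brevity (assumed example data).
CLASS_NAMES = ["person", "bicycle", "car"]

conn = sqlite3.connect(":memory:")  # the real path comes from MAPPINGS_DATABASE_PATH
conn.execute(
    "CREATE TABLE IF NOT EXISTS mappings ("
    "class_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# enumerate() yields (class_id, name) pairs starting at 0.
conn.executemany(
    "INSERT OR REPLACE INTO mappings (class_id, name) VALUES (?, ?)",
    list(enumerate(CLASS_NAMES)),
)


def class_name(class_id):
    row = conn.execute(
        "SELECT name FROM mappings WHERE class_id = ?", (class_id,)
    ).fetchone()
    return row[0] if row else None
```

Using `INSERT OR REPLACE` makes the population step safe to run more than once.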
|
||
## Database Interactions | ||
|
||
The databases in PictoPy interact with each other in the following ways: | ||
|
||
1. The Albums database uses image IDs from the Images database to manage photos within albums. | ||
2. The Faces database references image IDs from the Images database to associate face embeddings with specific images. | ||
3. The Images database uses class IDs that correspond to the YOLO Mappings database for object recognition. | ||
|
||
!!! example "Cross-Database Operation" | ||
When adding a photo to an album, the system first checks if the image exists in the Images database, then adds its ID to the album in the Albums database. |
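The cross-database check described above might look roughly like this. Two separate in-memory connections stand in for the Images and Albums database files; the function name `add_photo_to_album` and the table name `albums` are assumptions made for the sketch.

```python
import json
import sqlite3

# Two independent connections, mirroring two separate database files.
images_db = sqlite3.connect(":memory:")
albums_db = sqlite3.connect(":memory:")

images_db.execute(
    "CREATE TABLE image_id_mapping ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT UNIQUE)"
)
albums_db.execute(
    "CREATE TABLE albums (album_name TEXT PRIMARY KEY, image_ids TEXT)"
)
images_db.execute(
    "INSERT INTO image_id_mapping (path) VALUES (?)", ("/photos/beach.jpg",)
)
albums_db.execute("INSERT INTO albums VALUES (?, ?)", ("holiday", json.dumps([])))


def add_photo_to_album(album_name, image_path):
    # Step 1: confirm the image exists in the Images database.
    row = images_db.execute(
        "SELECT id FROM image_id_mapping WHERE path = ?", (image_path,)
    ).fetchone()
    if row is None:
        raise ValueError(f"image not in database: {image_path}")
    image_id = row[0]

    # Step 2: append its ID to the album's JSON list in the Albums database.
    album = albums_db.execute(
        "SELECT image_ids FROM albums WHERE album_name = ?", (album_name,)
    ).fetchone()
    if album is None:
        raise ValueError(f"album not found: {album_name}")
    ids = json.loads(album[0])
    if image_id not in ids:
        ids.append(image_id)
    albums_db.execute(
        "UPDATE albums SET image_ids = ? WHERE album_name = ?",
        (json.dumps(ids), album_name),
    )
    return ids
```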
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,95 @@ | ||
# Image Processing | ||
|
||
|
||
We use `asyncio` to process multiple images concurrently in the background without blocking the frontend. The relevant code is in | ||
`app/routes/images.py`. | ||
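One common way to keep an event loop responsive while running blocking work is to hand each job to a worker thread. The sketch below illustrates that general pattern and is not the code in `app/routes/images.py`; `tag_image` is a hypothetical stand-in for model inference, and the use of `asyncio.to_thread` (Python 3.9+) is an assumption.

```python
import asyncio
import time


def tag_image(path):
    # Hypothetical stand-in for blocking work such as model inference.
    time.sleep(0.05)
    return {"path": path, "tags": []}


async def process_images(paths):
    # Each blocking call runs in a worker thread, so the event loop stays
    # free to serve other requests while the images are processed.
    tasks = [asyncio.to_thread(tag_image, p) for p in paths]
    # gather() returns results in the same order as the input tasks.
    return await asyncio.gather(*tasks)


results = asyncio.run(process_images(["a.jpg", "b.jpg", "c.jpg"]))
```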
|
||
PictoPy uses several models for its tagging capabilities. | ||
The models discussed below are the defaults; to change them, go to the `app/models` directory and update the paths in the configuration files. | ||
|
||
## Object Detection with YOLOv8 | ||
|
||
We use YOLOv8 to spot objects in your photos. Here's what it does: | ||
|
||
YOLOv8 takes your image and runs it through its model. It figures out what objects are in the image and where they are. | ||
The result is a list of objects, their locations, and how confident the model is about each detection. If a `person` class is predicted, we pass it on | ||
to the face detection model, which we discuss in the next section. | ||
|
||
|
||
???+ tip "Fun Fact" | ||
YOLO stands for "You Only Look Once". We use the model provided by [Ultralytics](https://github.com/ultralytics/ultralytics) by default. | ||
|
||
## Face Detection and Recognition | ||
|
||
For faces, we do a bit more: | ||
|
||
We start with a special version of YOLOv8 that's really good at finding faces. Once we find a face, we zoom in on it | ||
(by cropping it to `160x160`, the shape FaceNet expects) and pass it to our FaceNet model. | ||
FaceNet then creates a unique 'embedding' for each face: a representation of the face as a list of numbers. | ||
|
||
|
||
???+ tip "Fun Fact" | ||
We use another YOLOv8 model for this by default as well. It was pretrained on top of the one provided by Ultralytics and is called | ||
[yolov8-face](https://github.com/akanametov/yolo-face). | ||
|
||
???+ note "What's an embedding?" | ||
An embedding is a bunch of numbers that represent the face. Similar faces will have similar numbers. FaceNet creates a 512-dimensional embedding array | ||
if an image has | ||
|
||
## Face Clustering | ||
|
||
Now, here's where it gets interesting: | ||
|
||
We use an algorithm called DBSCAN to group similar faces together. This happens automatically as you add new photos to the system. We recluster | ||
after every 5 photos are added (this can be changed in the code). Between reclustering runs, each photo is assigned to a cluster based on the distance | ||
between the embeddings of the faces in the photo and the mean embedding of each cluster. | ||
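The assignment step between reclustering runs can be sketched as a nearest-mean lookup. This is an illustration under stated assumptions, not the project's code: the function names are hypothetical, cosine distance is assumed because it is the metric listed for DBSCAN later in this document, and the `max_distance` cutoff of 0.3 is an assumption borrowed from the `eps` value.

```python
import math

RECLUSTER_EVERY = 5  # the text above states reclustering runs every 5 photos


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest_cluster(embedding, cluster_means, max_distance=0.3):
    """Return the ID of the closest cluster mean, or None if none is close enough."""
    best_id, best_dist = None, None
    for cluster_id, mean in cluster_means.items():
        d = cosine_distance(embedding, mean)
        if d <= max_distance and (best_dist is None or d < best_dist):
            best_id, best_dist = cluster_id, d
    return best_id


def should_recluster(photos_added):
    # True on the 5th, 10th, 15th, ... photo.
    return photos_added > 0 and photos_added % RECLUSTER_EVERY == 0


# Toy 2-dimensional "embeddings"; real FaceNet embeddings have 512 dimensions.
cluster_means = {0: [1.0, 0.0], 1: [0.0, 1.0]}
close_match = nearest_cluster([0.9, 0.1], cluster_means)
no_match = nearest_cluster([-1.0, 0.0], cluster_means)
```

Assigning against cluster means is cheap, while the periodic full reclustering corrects any drift that accumulates from these incremental assignments.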
|
||
## How It All Fits Together | ||
|
||
When you add a new photo, we first look for objects and faces. If we find faces, we generate embeddings for them. These embeddings then get added to our face clusters. | ||
All this information gets stored in our database so we can find it later. | ||
|
||
|
||
## Under the Hood | ||
|
||
We're using ONNX Runtime to run our AI models quickly. Everything's stored in SQLite databases, making it easy to manage. | ||
The system updates clusters as you add or remove photos, so it keeps getting smarter over time. | ||
|
||
## PictoPy Model Parameters | ||
|
||
Here are some key parameters for the main models used in PictoPy's image processing pipeline. | ||
|
||
### YOLOv8 Object Detection | ||
|
||
| Parameter | Value | Description | | ||
|-----------|-------|-------------| | ||
| `conf_thres` | 0.7 | Confidence threshold for object detection | | ||
| `iou_thres` | 0.5 | IoU (Intersection over Union) threshold for NMS | | ||
| Input Shape | Varies | Determined dynamically from the model | | ||
| Output | Multiple | Includes bounding boxes, scores, and class IDs | | ||
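To show how `conf_thres` and `iou_thres` typically interact, here is a small pure-Python sketch of confidence filtering followed by greedy non-maximum suppression (NMS). It is a generic, class-agnostic version written for illustration; the function names are hypothetical and PictoPy's actual post-processing may differ, for example by applying NMS per class.

```python
def iou(a, b):
    # Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, conf_thres=0.7, iou_thres=0.5):
    """Return indices of boxes kept after confidence filtering and greedy NMS."""
    # Drop low-confidence boxes, then visit the rest from highest score down.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thres),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too heavily.
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 100, 100), (5, 5, 105, 105), (200, 200, 300, 300), (0, 0, 50, 50)]
scores = [0.9, 0.8, 0.75, 0.3]
kept = filter_detections(boxes, scores)
```

In this example the second box overlaps the first too heavily and is suppressed, and the last box falls below the confidence threshold.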
|
||
### Face Detection (YOLOv8 variant) | ||
|
||
| Parameter | Value | Description | | ||
|-----------|-------|-------------| | ||
| `conf_thres` | 0.2 | Confidence threshold for face detection | | ||
| `iou_thres` | 0.3 | IoU threshold for NMS in face detection | | ||
| Model Path | `DEFAULT_FACE_DETECTION_MODEL` | Path to the face detection model file | | ||
|
||
### FaceNet (Face Recognition) | ||
|
||
| Parameter | Value | Description | | ||
|-----------|-------|-------------| | ||
| Model Path | `DEFAULT_FACENET_MODEL` | Path to the FaceNet model file | | ||
| Input Shape | (1, 3, 160, 160) | Expected input shape for face images | | ||
| Output | 512-dimensional vector | Face embedding dimension | | ||
|
||
### Face Clustering (DBSCAN) | ||
|
||
| Parameter | Value | Description | | ||
|-----------|-------|-------------| | ||
| `eps` | 0.3 | Maximum distance between two samples for them to be considered as in the same neighborhood | | ||
| `min_samples` | 2 | Number of samples in a neighborhood for a point to be considered as a core point | | ||
| `metric` | "cosine" | Distance metric used for clustering | | ||
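The effect of these three parameters can be seen in a compact pure-Python DBSCAN. This is a teaching sketch, not the project's implementation; the parameter names happen to match those of scikit-learn's `DBSCAN`, but whether PictoPy uses that library is not stated here. As in the usual definition, a point counts as its own neighbor, and the label `-1` marks noise.

```python
import math


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def dbscan(points, eps=0.3, min_samples=2):
    """Return one cluster label per point; -1 means noise."""
    n = len(points)
    # Neighborhood of each point (including itself) within distance eps.
    neighbors = [
        [j for j in range(n) if cosine_distance(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None = not yet visited
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # not a core point; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep growing the cluster
    return labels


# Toy 2-dimensional vectors standing in for 512-dimensional face embeddings.
points = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99], [-1.0, -1.0]]
labels = dbscan(points)
```

With `min_samples` set to 2, any two faces within `eps` of each other already form a cluster, so only faces with no close match end up as noise.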
|
||
Note: Some of these values are default parameters and can be adjusted when initializing the models or during runtime, depending on the specific use case or performance requirements. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,20 @@ | ||
# Architecture | ||
## Frontend | ||
|
||
## Backend | ||
<div style="text-align: center;"> | ||
<img src="../../assets/backend-architecture.jpeg" alt="Backend Architecture" style="width: 80%; max-width: 600px; height: auto; display: block; margin: 0 auto;"> | ||
</div> | ||
|
||
<br> | ||
For the backend, we rely on several technologies. Our database is SQLite, and we use the asynchronous processing capabilities of asyncio because of its compatibility | ||
with FastAPI. Our models come from various sources: we use YOLO models for object and face detection and FaceNet for generating embeddings | ||
of the detected faces. All these models run on ONNX Runtime to avoid heavy dependencies, keeping the application lightweight. | ||
|
||
|
||
We use the DBSCAN algorithm to cluster the generated face embeddings. All of our data is stored in SQL (SQLite), and our API calls rely | ||
on queries from the backend. | ||
|
||
!!! note "Note" | ||
We discuss all of the features and configuration of our application in later sections of the documentation. They are intended for both developers | ||
and users who want to use the app. A Postman collection has also been added; it can be found in our API section. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -78,11 +78,19 @@ nav: | |
- Gallery View: frontend/gallery-view.md | ||
|
||
extra: | ||
analytics: | ||
provider: google | ||
property: G-N3Q505TMQ6 | ||
social: | ||
- icon: fontawesome/brands/twitter | ||
link: https://x.com/aossie_org | ||
- icon: fontawesome/solid/envelope | ||
link: mailto:[email protected] | ||
name: Contact by Email | ||
- icon: fontawesome/brands/gitlab | ||
link: https://gitlab.com/aossie | ||
name: AOSSIE on GitLab | ||
- icon: fontawesome/brands/github | ||
link: https://github.com/AOSSIE-Org/PictoPy/ | ||
name: PictoPy on GitHub | ||
- icon: fontawesome/brands/discord | ||
link: https://discord.com/invite/6mFZ2S846n | ||
name: Join AOSSIE on Discord | ||
- icon: fontawesome/brands/twitter | ||
link: https://x.com/aossie_org | ||
name: Follow AOSSIE on Twitter |