PyValentin - Advanced Matchmaking System

Overview

PyValentin is a matchmaking system that combines multi-dimensional distance calculations, gender/preference filtering, and grade-based weighting to pair respondents from survey data. It processes raw survey responses in stages (normalization, distance and similarity scoring, filtering, and pairing), applying a mathematical model at each stage to quantify compatibility.

Features

Multi-dimensional compatibility analysis
Grade-based matching with configurable weights
Customizable gender/preference filtering
Quality vs. quantity optimization
Grade difference consideration
Multiple matching algorithms (Greedy and Hungarian)
Interactive GUI with progress tracking
Drag-and-drop file support
Comprehensive results output
Automatic dependency management

Setup and Installation

Prerequisites

Python 3.8+
Required packages (automatically installed):
```
tkinterdnd2
numpy
scipy
```

Installation

Download and extract the latest .zip release
Run python core/update_dependencies.py to check and install dependencies
Configure your settings files:
- Config.json (response mappings)
- Filter.json (preference rules)
- defaults.json (default file paths)

Usage

Launch the application:
```
python main.py
```
Select required files:
- Survey responses (CSV)
- Configuration file (JSON)
- Filter rules (JSON)
- Grade data (CSV)
Adjust sliders:
- Quality-quantity balance
- Grade weight importance
Click "Process Files"
Check the genR folder for results

The Mathematics Behind PyValentin

1. Data Normalization

Before processing, all categorical survey responses are converted to numerical values using the Config.json mapping. This creates a consistent numerical space for calculations.
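
As a rough illustration of this step, the sketch below converts response text to numbers using a mapping shaped like Config.json. The mapping entries, the `normalize_row` helper, and the assumption that the first four columns are metadata are all hypothetical and not taken from the project code.

```python
# Hypothetical mapping in the shape of Config.json; real keys and values will differ.
config = {"Strongly agree": "5", "Agree": "4", "Neutral": "3"}

def normalize_row(row, mapping, first_question_col=4):
    """Replace response text with floats from the first question column onward.

    Assumes the leading columns (Timestamp, Email, Gender, Attracted To)
    are metadata and are passed through unchanged.
    """
    out = list(row[:first_question_col])
    for answer in row[first_question_col:]:
        # An unmapped answer raises KeyError, so gaps in the mapping surface early.
        out.append(float(mapping[answer]))
    return out

row = ["2024-01-01", "user@example.com", "Male", "Female", "Agree", "Neutral"]
normalized = normalize_row(row, config)
print(normalized)
```

Failing loudly on an unmapped answer is a design choice for this sketch; the actual preprocessing may handle missing mappings differently.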

2. Distance Calculation

For each pair of users (i, j), we calculate a multi-dimensional Euclidean distance:

distance(i,j) = √( Σ_k (x[i,k] - x[j,k])² )
where k indexes the survey questions and x[i,k] is user i's numerical response to question k

This produces a distance matrix D where D[i,j] represents how different two users are across all responses.
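
A minimal sketch of the distance matrix, using only the standard library (`math.dist` is available from Python 3.8). The `distance_matrix` name and the sample responses are illustrative; the project itself may compute this with numpy.

```python
import math

def distance_matrix(responses):
    """Return D where D[i][j] is the Euclidean distance between rows i and j."""
    return [[math.dist(a, b) for b in responses] for a in responses]

# Three users, three questions each (already normalized to numbers).
responses = [
    [1, 2, 3],
    [1, 2, 3],
    [4, 6, 3],
]
D = distance_matrix(responses)
print(D[0][1])  # identical answers
print(D[0][2])  # differences of 3, 4 and 0 across the three questions
```

The matrix is symmetric with zeros on the diagonal, so only the upper triangle carries new information.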

3. Similarity Transformation

The distance matrix is transformed into a similarity matrix using:

similarity(i,j) = 1 / (1 + distance(i,j))

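This yields a similarity score in the range (0, 1], where:

1.0 = perfect match (identical responses, distance 0)
values approaching 0.0 = increasingly poor match (the score never reaches exactly 0.0)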
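
To make the transformation concrete, here is a small worked sketch. The `similarity` function name is illustrative; it applies the formula above to a few distances and shows that the score decreases toward 0 as distance grows without ever reaching it.

```python
def similarity(distance):
    """Map a non-negative distance to a similarity score in (0, 1]."""
    return 1.0 / (1.0 + distance)

for d in (0.0, 1.0, 5.0, 99.0):
    print(d, similarity(d))
```

A distance of 0 gives 1.0, a distance of 1 gives 0.5, and a distance of 99 gives 0.01.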

4. Preference Filtering

The system applies a boolean matrix F where:

F[i,j] = 1 if preferences match
F[i,j] = 0 if preferences conflict

The final compatibility score becomes:

compatibility(i,j) = similarity(i,j) × F[i,j]
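
The sketch below shows one way F could be built from rules shaped like the "filters" section of Filter.json. It assumes that each user is described by a (gender value, preference value) pair and that a pair is allowed only when each user accepts the other; both assumptions, and the helper names, are hypothetical.

```python
# Hypothetical rules: seeker preference value -> acceptable gender values.
filters = {"1": ["1"], "2": ["2"], "3": ["1", "2"]}

def accepts(seeker_pref, candidate_gender, rules):
    """True if the seeker's preference value allows the candidate's gender value."""
    return candidate_gender in rules.get(seeker_pref, [])

def filter_matrix(users, rules):
    """Build F, with F[i][j] = 1 only when i and j accept each other."""
    n = len(users)
    F = [[0] * n for _ in range(n)]
    for i, (gender_i, pref_i) in enumerate(users):
        for j, (gender_j, pref_j) in enumerate(users):
            if i != j and accepts(pref_i, gender_j, rules) and accepts(pref_j, gender_i, rules):
                F[i][j] = 1
    return F

def apply_filter(sim, F):
    """Element-wise product: compatibility(i,j) = similarity(i,j) * F[i][j]."""
    return [[s * f for s, f in zip(sim_row, f_row)] for sim_row, f_row in zip(sim, F)]

# (gender value, preference value) per user; values are illustrative.
users = [("1", "2"), ("2", "1"), ("1", "1")]
F = filter_matrix(users, filters)
print(F)
```

Requiring mutual acceptance keeps F symmetric, which matches the symmetric similarity matrix it is multiplied with.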

5. Optimal Pairing Algorithm

The system uses a modified stable marriage algorithm with the following steps:

Sort users by number of potential matches
For each user i:
- Find top N matches based on quality weight
- Select best available match j
- Remove both i and j from available pool

Quality weight affects match selection:

High (>0.5): Selects from top 25% of matches
Low (<0.5): Considers up to 75% of matches
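
The steps above can be sketched roughly as follows. This is not the project's implementation: the `greedy_pairs` name, the choice to process users with the fewest candidates first, the rounding of the top-N cutoff, and the treatment of a weight of exactly 0.5 as "low" are all assumptions made for illustration.

```python
import math

def greedy_pairs(compat, quality=0.7):
    """Pair users greedily from a compatibility matrix (0 means filtered out)."""
    n = len(compat)
    # For each user, candidates with nonzero compatibility, best first.
    ranked = {
        i: sorted(
            (j for j in range(n) if j != i and compat[i][j] > 0),
            key=lambda j: compat[i][j],
            reverse=True,
        )
        for i in range(n)
    }
    # High quality weight: top 25% of candidates; otherwise up to 75%.
    fraction = 0.25 if quality > 0.5 else 0.75
    available = set(range(n))
    pairs = []
    # Assumption: users with the fewest potential matches are handled first.
    for i in sorted(range(n), key=lambda u: len(ranked[u])):
        if i not in available:
            continue
        top_n = max(1, math.ceil(len(ranked[i]) * fraction))
        for j in ranked[i][:top_n]:
            if j in available:
                pairs.append((i, j))
                available -= {i, j}
                break
    return pairs, sorted(available)

compat = [
    [0.0, 0.9, 0.2, 0.0],
    [0.9, 0.0, 0.5, 0.0],
    [0.2, 0.5, 0.0, 0.4],
    [0.0, 0.0, 0.4, 0.0],
]
pairs, unpaired = greedy_pairs(compat, quality=0.7)
print(pairs, unpaired)
```

Handling the most constrained users first reduces the chance that a user with a single viable candidate is left unpaired.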

6. Second Pass Optimization

For remaining unmatched users:

Create a graph of mutual matches
Find maximal matching using a greedy algorithm
Optimize for global satisfaction using local improvements
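
A minimal sketch of the greedy maximal matching step, assuming "mutual match" means both directions of the compatibility matrix are nonzero. The function name and sample matrix are hypothetical, and the local-improvement step is not shown.

```python
def greedy_maximal_matching(unmatched, compat):
    """Greedily pair leftover users along the strongest mutual edges."""
    edges = []
    for a, i in enumerate(unmatched):
        for j in unmatched[a + 1:]:
            # Keep only mutual matches: both directions must be nonzero.
            if compat[i][j] > 0 and compat[j][i] > 0:
                edges.append((compat[i][j], i, j))
    edges.sort(reverse=True)  # strongest edges first
    used, pairs = set(), []
    for _score, i, j in edges:
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs

compat = [
    [0.0, 0.8, 0.0, 0.1],
    [0.8, 0.0, 0.9, 0.0],
    [0.0, 0.9, 0.0, 0.3],
    [0.1, 0.0, 0.3, 0.0],
]
second_pass = greedy_maximal_matching([0, 1, 2, 3], compat)
print(second_pass)
```

In this example the greedy result pairs (1, 2) and (0, 3) for a total score of 1.0, while pairing (0, 1) and (2, 3) would total 1.1. A greedy matching is maximal but not necessarily optimal, which is the gap a local-improvement pass is meant to narrow.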

7. Grade-Based Optimization

The system incorporates grade differences into the matching process:

final_score = (1 - grade_weight) * compatibility_score + grade_weight * (1 - grade_penalty)

where grade_penalty is looked up by the absolute grade difference between the two users:
grade_penalty = {
    0: 0.0,    # Same grade
    1: 0.3,    # One grade difference
    2: 0.7,    # Two grades difference
    3: 0.9     # Three+ grades difference
}

The grade_weight slider (0.0-1.0) determines the importance of grade matching:

0.0: Ignore grades entirely
0.7: Recommended balance (default)
1.0: Prioritize grade matching above all else
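
The formula and penalty table above can be combined into a small function. The `final_score` name, its signature, and the use of integer grade levels are assumptions for illustration.

```python
# Penalty by absolute grade difference; differences of 3 or more share one entry.
GRADE_PENALTY = {0: 0.0, 1: 0.3, 2: 0.7, 3: 0.9}

def final_score(compatibility_score, grade_a, grade_b, grade_weight=0.7):
    """Blend compatibility with a grade-difference penalty."""
    diff = min(abs(grade_a - grade_b), 3)
    penalty = GRADE_PENALTY[diff]
    return (1 - grade_weight) * compatibility_score + grade_weight * (1 - penalty)

# One grade apart with compatibility 0.8 and the default weight:
# 0.3 * 0.8 + 0.7 * (1 - 0.3) = 0.24 + 0.49 = 0.73
print(final_score(0.8, 10, 11))
```

With grade_weight set to 0.0 the function returns the compatibility score unchanged, and with 1.0 it depends only on the grade difference.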

File Structure

PyValentin/
├── main.py           # Main application
├── FixCSV.py        # Data preprocessing
├── Ski.py           # Core algorithms
├── PyValentin.py    # Improved UI
├── genR/            # Generated results
├── ASF Specific/    # Configuration files
└── README.md        # Documentation

Configuration

Config.json

Defines mappings from survey responses to numerical values:

{
     "Response Text": "Numerical Value",

}

Filter.json

Defines preference matching rules:

{
     "filterables": {
          "1": "Male",
          "2": "Female",

     },
     "filters": {
          "5": ["1", "2", "3"], 
          "4": ["1", "2"],      

     }
}

Input Format

Required CSV columns:

Timestamp
Email
Gender (a)
Attracted to (b)
Question responses...

Example:

Timestamp,Email,Gender,Attracted To,Q1,Q2,...
2024-01-01,user@example.com,Male,Female,Response1,Response2,...
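
A short sketch of reading a file in this layout and checking the leading columns with the standard csv module. The `read_responses` helper and the exact header spellings are assumptions based on the example row above, not the project's own loader.

```python
import csv
import io

# Assumed header spellings, taken from the example header row.
REQUIRED_COLUMNS = ["Timestamp", "Email", "Gender", "Attracted To"]

def read_responses(handle):
    """Return (header, rows) after checking the leading column order."""
    reader = csv.reader(handle)
    header = next(reader)
    leading = header[:len(REQUIRED_COLUMNS)]
    if leading != REQUIRED_COLUMNS:
        raise ValueError(f"unexpected leading columns: {leading}")
    rows = [row for row in reader if row]  # skip blank lines
    return header, rows

sample = io.StringIO(
    "Timestamp,Email,Gender,Attracted To,Q1,Q2\n"
    "2024-01-01,user@example.com,Male,Female,Response1,Response2\n"
)
header, rows = read_responses(sample)
print(header[4:], rows[0][2:4])
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to the same function.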

Output Files

modified_csv.csv: Normalized survey data
processed_distances.csv: Distance matrix
similarity_list.csv: Similarity scores
filtered_similarity_list.csv: Filtered matches
optimal_pairs_greed.csv: Greedy algorithm pairs
optimal_pairs_gluttony.csv: Hungarian algorithm pairs
optimal_pairs_with_info_greed.csv: Detailed greedy matches
optimal_pairs_with_info_gluttony.csv: Detailed Hungarian matches
unpaired_entries_greed.csv: Unmatched users (greedy)
unpaired_entries_gluttony.csv: Unmatched users (Hungarian)

Customization Guide

Modifying Match Criteria

Update Config.json with new response mappings
Modify the matching rules in Filter.json
Adjust the weights in the calculate_distances() function in Ski.py

Adding New Questions

Add response mappings to Config.json
Update CSV processing in FixCSV.py if needed
Modify distance calculation in Ski.py

Custom Filtering Rules

Modify Filter.json:

{
    "filterables": {
        "value": "label"
    },
    "filters": {
        "seeker_value": ["acceptable_values"]
    }
}

Technical Details

Distance Calculation

Uses normalized Euclidean distance
Configurable weights per question
Range: 0 (identical) to 1 (maximum difference)

Matching Algorithm

Converts responses to numerical values
Calculates distance matrix
Generates similarity scores
Applies filtering rules
Handles edge cases

Performance

O(n²) pairwise comparisons for n participants (the standard Hungarian algorithm is O(n³) in the worst case)
Memory usage: ~100MB for 1000 participants
Processing time: ~1-2 seconds per 100 participants

Troubleshooting

Common Issues

Missing dependencies
- Run pip install tkinterdnd2 numpy scipy
- Check Python version (3.8+ required)
File format errors
- Verify CSV column order
- Check JSON syntax
- Ensure UTF-8 encoding
No matches found
- Verify filter rules
- Check response mappings
- Confirm gender/attraction values

Debug Mode

Add debug logging:

import logging
logging.basicConfig(level=logging.DEBUG)

Contact & Support

For information about this project, contact NagusameCS on GitHub.

License

This project is licensed under the GNU General Public License v3.0 - see below for details:

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
