The Anki Duplicates Finder
I built this tool to identify duplicate notes in Anki. Anki's built-in duplicate check only works when the field is identical. This tool calculates a similarity score for every pair of notes and ranks the most similar pairs, so that non-identical duplicates can be found.
There are two functions.
- Find Duplicates
- This will tag all of the likely duplicate pairs (for review) sorted by most similar based on the algorithm
- Merge
- You can use this tool to merge duplicate fields or tags.
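To make the pairwise scoring idea concrete, here is a minimal sketch of ranking note pairs by text similarity. It is not DupCatch's actual algorithm: it swaps in Python's standard-library difflib.SequenceMatcher as the similarity measure, and the function name rank_similar_pairs, the notes dictionary, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def rank_similar_pairs(notes, threshold=0.8):
    """Score every pair of notes and return likely duplicates, most similar first.

    notes maps a (hypothetical) note id to the text of the field being compared.
    Returns a list of (score, id_a, id_b) tuples with score >= threshold.
    """
    scored = []
    # combinations() visits each unordered pair exactly once.
    for (id_a, text_a), (id_b, text_b) in combinations(notes.items(), 2):
        # ratio() is a similarity in [0, 1]; 1.0 means the strings are identical.
        score = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if score >= threshold:
            scored.append((score, id_a, id_b))
    # Highest similarity first, so the most likely duplicates are reviewed first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Illustrative data only; real notes would come from the Anki package.
notes = {
    1: "What is the capital of France?",
    2: "What's the capital of France?",
    3: "Name the largest planet in the solar system.",
}

for score, id_a, id_b in rank_similar_pairs(notes):
    print(f"notes {id_a} and {id_b}: similarity {score:.2f}")
```

Comparing every pair grows quadratically with the number of notes, so a large collection would likely need a faster pre-filter before a detailed comparison like this one.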
I have only tested this on Windows running Python 3.9, but it should work on other operating systems as well.
- Python 3.9+
- Python libraries from requirements.txt
- Clone or download and unzip the repo
git clone https://github.com/Mike7154/DupCatch.git
- Install dependencies (Python must be installed and available on your PATH)
pip install -r requirements.txt
or
py -m pip install -r requirements.txt
- You can verify the installation by running:
cd path/to/DupCatch
py dupcatch.py
or
cd path/to/DupCatch
python dupcatch.py
See more detailed instructions at https://1drv.ms/w/s!Ar3STOvKhP6Ymts6IuxRUntBSumx5A?e=h9scEH
- Copy the desired Anki package file (*.apkg, *.colpkg) to 'DupCatch/anki_collection'
- Modify settings.yml (copy settings_template.yml to settings.yml if settings.yml doesn't exist yet)
- If you are doing a 'Duplicates' run, at least modify the 'Duplicates' section in settings.yml
- If you are doing a 'Merge' run, at least modify the 'Merge' section in settings.yml
- Run the Script
cd path/to/DupCatch
python -m pip install -r requirements.txt
python dupcatch.py -r
- For a Merge run use
python dupcatch.py -m
- The tool will output a new *.apkg file into DupCatch/anki_collection which will include only notes that were modified
- Review the results in Anki (I recommend using the Special Fields Addon to choose whether you want tags or a full import)
- Tag the notes to merge fields, tags, or mark as 'not duplicate' or 'covered_by'
- Only the merge tags are directional: you tag only the receiving note.
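The directional nature of a merge can be illustrated with a small sketch: content flows from a donor note into the receiving note, and the donor is left untouched. This is not DupCatch's implementation. The note layout (a dict with "fields" and "tags"), the function name merge_into_receiver, and the choice to join field text with "<br>" are all assumptions made for illustration.

```python
def merge_into_receiver(receiver, donor, merge_fields=True, merge_tags=True):
    """Return a merged copy of receiver; neither input note is modified.

    Both notes use a hypothetical layout:
    {"fields": {name: text}, "tags": [tag, ...]}
    """
    merged = {"fields": dict(receiver["fields"]), "tags": list(receiver["tags"])}

    if merge_fields:
        for name, value in donor["fields"].items():
            current = merged["fields"].get(name, "")
            # Skip empty donor text and text the receiver already contains.
            if value and value not in current:
                merged["fields"][name] = f"{current}<br>{value}" if current else value

    if merge_tags:
        for tag in donor["tags"]:
            # Keep the receiver's tag order and append only new tags.
            if tag not in merged["tags"]:
                merged["tags"].append(tag)

    return merged


# Illustrative notes only.
receiver = {"fields": {"Front": "Capital of France?", "Extra": ""}, "tags": ["geo"]}
donor = {
    "fields": {"Front": "Capital of France?", "Extra": "Paris is on the Seine."},
    "tags": ["geo", "europe"],
}

merged = merge_into_receiver(receiver, donor)
print(merged["fields"]["Extra"])
print(merged["tags"])
```

Swapping the two arguments gives a different result, which is why only the receiving note needs to carry the merge tag.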
See the open issues for a full list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
GNU General Public License. See LICENSE.txt for more information.
You can copy, modify, and distribute this software as you please.
If you want to use this tool in proprietary software, contact me and I will send a private license agreement.
Project Link: https://github.com/Mike7154/DupCatch