Try it online: https://bibtex-clean-tool.streamlit.app/
Demo Video: https://www.bilibili.com/video/BV1XcRvYhE27
Inspired by the cool repos bibtex-tidy and arxiv-latex-cleaner, this repository automates the process of cleaning, organizing, and verifying .bib bibliography files.
It consists of two primary components:
- BibTeX Cleaner & Formatter (cleaner.py) - Reorders .bib entries based on citation order in the .tex file, removes duplicates, and appends unused references at the end. This was originally adapted from the repo SFRL/clean_bibtex.
- Double-Checker (checker.py) - Searches IEEE Xplore for accurate BibTeX citations and updates .bib entries while preserving the original keys. This is necessary because sources like Google Scholar often have incorrect years, incomplete metadata, or lack authoritative information.
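The cleaner's reorder, deduplicate, and append-unused steps can be sketched roughly as follows. This is a minimal regex-based illustration, not the code in cleaner.py: the function names (citation_order, split_entries, reorder) are hypothetical, and it assumes plain \cite-style commands and does not treat @string or @comment blocks specially.

```python
import re

# Matches \cite, \citep, \citet, etc., with optional [..] arguments.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the start of an entry such as "@article{key,".
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def citation_order(tex: str) -> list[str]:
    """Return citation keys in order of first appearance, without repeats."""
    seen: dict[str, None] = {}
    for match in CITE_RE.finditer(tex):
        for key in match.group(1).split(","):
            key = key.strip()
            if key:
                seen.setdefault(key, None)
    return list(seen)


def split_entries(bib: str) -> dict[str, str]:
    """Map each key to its raw entry text; the first copy of a duplicate wins."""
    entries: dict[str, str] = {}
    starts = list(ENTRY_RE.finditer(bib))
    for i, m in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib)
        entries.setdefault(m.group(2), bib[m.start():end].strip())
    return entries


def reorder(bib: str, tex: str, keep_unused: bool = True) -> str:
    """Cited entries first (in citation order), then unused ones if kept."""
    entries = split_entries(bib)
    cited = [k for k in citation_order(tex) if k in entries]
    cited_set = set(cited)
    unused = [k for k in entries if k not in cited_set] if keep_unused else []
    return "\n\n".join(entries[k] for k in cited + unused) + "\n"
```

Keying the entries by citation key and using setdefault handles duplicate removal as a side effect of parsing, so the reorder step only has to walk the citation list once.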
- 🔄 Reorder .bib entries according to citation order in the .tex file.
- 🏷️ Add reference comments (% reference 01, % reference 02, etc.) to track ordering.
- ❌ Remove duplicate citations, ensuring a concise bibliography.
- 📌 Remove or preserve unused entries; preserved entries are appended at the end for later use.
- 🔤 Ensure proper acronym formatting by wrapping specified terms in \text{} within the title field.
- 🌐 Automatically search IEEE Xplore, ensuring accurate metadata.
- 🔑 Maintain original BibTeX keys, replacing only outdated information.
- ⏳ Process a configurable number of entries (default: 60) with a progress bar.
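Two of the formatting features above, reference comments and \text{} wrapping, can be illustrated with a short sketch. The helper names (add_reference_comments, wrap_terms) are hypothetical and not taken from cleaner.py. The sketch assumes the comment goes on the line above each entry and that the title field sits on a single line of the form title = {...}.

```python
import re

# Assumes a single-line field of the form: title = {...},
TITLE_RE = re.compile(r"(?im)^([ \t]*title[ \t]*=[ \t]*\{)(.*)(\}[ \t]*,?[ \t]*)$")


def add_reference_comments(entries: list[str]) -> str:
    """Prefix each entry with a zero-padded '% reference NN' comment line."""
    width = max(2, len(str(len(entries))))
    numbered = (
        f"% reference {i:0{width}d}\n{entry}"
        for i, entry in enumerate(entries, start=1)
    )
    return "\n\n".join(numbered) + "\n"


def wrap_terms(entry: str, terms: list[str]) -> str:
    r"""Wrap whole-word occurrences of `terms` in the title field in \text{}."""

    def fix_title(match: re.Match) -> str:
        title = match.group(2)
        for term in terms:
            # The lookbehind skips terms that are already wrapped.
            pattern = re.compile(r"(?<!\\text\{)\b" + re.escape(term) + r"\b")
            title = pattern.sub(lambda m: r"\text{" + m.group(0) + "}", title)
        return match.group(1) + title + match.group(3)

    return TITLE_RE.sub(fix_title, entry)
```

Because of the lookbehind, running wrap_terms twice on the same entry should leave it unchanged, which matters if the cleaner is re-run on an already cleaned file.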
Ensure you have Python installed, then install dependencies:
pip install -r requirements.txt
You also need to install the appropriate WebDriver (e.g., ChromeDriver for Google Chrome) if using automated web scraping.
streamlit run app.py
python cleaner.py [bib_file] [tex_file] [--keep] [--wrap-text]
- bib_file (optional, default: ref.bib): Path to your .bib file.
- tex_file (optional, default: main.tex): Path to your .tex file.
- --keep: Preserve unused entries in the cleaned bibliography.
- --wrap-text: Wrap the first word in the title field with \text{} for proper LaTeX formatting.
python checker.py [bib_file] [--num <number_of_entries>] [--remove_unselected]
- bib_file (optional, default: cleaned_ref.bib): Path to your .bib file.
- --num (optional, default: 60): Number of bibliography entries to check and update.
- --remove_unselected: Remove entries that were not selected during the checking process.
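The "update while preserving the original keys" behavior of the checker can be pictured with a small sketch. It covers only the key-preserving merge, not the IEEE Xplore lookup itself. The function name replace_keeping_key is hypothetical, and the sample entries in the usage lines are made up for illustration.

```python
import re

# Captures the entry header, the citation key, and the trailing comma.
KEY_RE = re.compile(r"(@\w+\s*\{\s*)([^,\s]+)(\s*,)")


def replace_keeping_key(original: str, fetched: str) -> str:
    """Return `fetched` with its citation key swapped for the key of `original`."""
    orig_match = KEY_RE.search(original)
    if orig_match is None or KEY_RE.search(fetched) is None:
        # Leave the entry untouched if either side cannot be parsed.
        return original
    key = orig_match.group(2)
    return KEY_RE.sub(lambda m: m.group(1) + key + m.group(3), fetched, count=1)


# Made-up sample data: an outdated local entry and a freshly fetched one.
old = "@article{smith2020,\n  title={Old title},\n  year={2019}\n}"
new = "@INPROCEEDINGS{9123456,\n  title={New title},\n  year={2020}\n}"
print(replace_keeping_key(old, new))
```

Keeping the key stable means existing \cite commands in the .tex file continue to resolve after the metadata is refreshed.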
Developed and tested on Apple M1, macOS Sonoma 14.7.3.
Feel free to open issues or pull requests if you encounter any problems or have suggestions for improvement. If you find this project useful, consider giving it a ⭐ on GitHub!
After building this repo, I found that other repositories, such as bib-cleaner and bib-world, can also remove duplicates and unused entries. I particularly like the use case of bib-cleaner.
However, our tool provides more features, such as automatic ordering of .bib entries according to citation sequence and IEEE Xplore verification. Hope it helps you to some extent! 😊