UVVisML

Predict optical properties of molecules with machine learning.

Colab Examples

A Google Colab notebook is available with examples of using the various types of models and predictions. Alternatively, you may use the command line instructions below.

Command Line Setup

Install Anaconda or Miniconda if you have not yet done so.
git clone git@github.com:learningmatter-mit/uvvisml.git
cd uvvisml
conda env create -f environment.yml
cd uvvisml (This enters the inner uvvisml package directory, which contains get_model_files.sh.)
bash get_model_files.sh (This downloads trained model files from Zenodo.)
conda activate uvvisml
pip install chemprop

Making Predictions

Test file

To make predictions, specify a --test_file with the dyes or dye-solvent pairs for which you wish to predict properties. This should be a CSV with one dye (for vacuum TD-DFT predictions) or dye-solvent pair (for experimental predictions) per line. For example, the test file for vacuum TD-DFT predictions could be:

smiles
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1
CCN(CC)c1ccc2cc(-c3nc4ccccc4n3C)c(=O)oc2c1
C[SiH](C)c1cccc2ccccc12

The test file for experimental predictions could be:

smiles,solvent
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,C1CCCCC1
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CCOC(C)=O
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CC#N
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,CCO
CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1,OCC(O)CO
CCN(CC)c1ccc2cc(-c3nc4ccccc4n3C)c(=O)oc2c1,CC#N
C[SiH](C)c1cccc2ccccc12,C1CCCCC1
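Either layout can also be generated programmatically. The following is a minimal sketch using Python's standard csv module; the helper name write_test_file and the output file names are illustrative assumptions, not part of this repository. Only the column layout (smiles, or smiles,solvent) comes from the examples above.

```python
import csv


def write_test_file(path, rows, with_solvent):
    """Write a test CSV in one of the two layouts shown above.

    rows: list of SMILES strings (vacuum TD-DFT layout) or a list of
          (dye_smiles, solvent_smiles) tuples (experimental layout).
    NOTE: hypothetical helper; it is not shipped with UVVisML.
    """
    header = ["smiles", "solvent"] if with_solvent else ["smiles"]
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in rows:
            # A bare SMILES string becomes a one-column row.
            writer.writerow(list(row) if with_solvent else [row])


dye = "CCN(CC)c1ccc2c(C(F)(F)F)cc(=O)oc2c1"

# One dye per line for vacuum TD-DFT predictions.
write_test_file("tddft_test.csv", [dye], with_solvent=False)

# One dye-solvent pair per line for experimental predictions.
write_test_file(
    "expt_test.csv",
    [(dye, "C1CCCCC1"), (dye, "CCO")],
    with_solvent=True,
)
```

The resulting path can then be passed to --test_file.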

Property

Experimental peak wavelength of maximum absorption: --property absorption_peak_nm_expt
Vertical excitation energy with maximum oscillator strength in vacuum TD-DFT: --property vertical_excitation_eV_tddft
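The two properties are reported in different units (nm and eV). To put them on a common scale, the standard photon relation E = hc/λ can be used. The sketch below is not part of UVVisML, and the function names are assumptions. Note also that an experimental peak wavelength and a vacuum vertical excitation energy are different physical quantities, so a unit conversion alone does not make them directly equivalent.

```python
# Product of Planck's constant and the speed of light, in eV*nm.
HC_EV_NM = 1239.841984


def nm_to_ev(wavelength_nm):
    """Convert a photon wavelength in nm to an energy in eV."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


def ev_to_nm(energy_ev):
    """Convert a photon energy in eV to a wavelength in nm."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


# A 400 nm absorption peak corresponds to roughly 3.10 eV.
energy = nm_to_ev(400.0)
print(f"{energy:.2f} eV")
```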

Method

Single-fidelity (experiment or TD-DFT): --method chemprop
Multi-fidelity (experiment only): --method chemprop_tddft

Train dataset

Experiment: --train_dataset combined (default) or --train_dataset deep4chem
TD-DFT: --train_dataset all_wb97xd3
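The option combinations listed above can be checked before a run is launched. The sketch below assembles a predict.py command from the flags documented in this README and rejects combinations the lists above do not mention. The function name build_predict_command, the input file name my_dyes.csv, and the validation rules themselves are assumptions; predict.py may accept or reject combinations differently.

```python
import shlex

EXPERIMENT_PROPERTY = "absorption_peak_nm_expt"
TDDFT_PROPERTY = "vertical_excitation_eV_tddft"


def build_predict_command(test_file, preds_file, prop,
                          method="chemprop", train_dataset=None):
    """Return an argument list for predict.py (hypothetical helper)."""
    if prop not in (EXPERIMENT_PROPERTY, TDDFT_PROPERTY):
        raise ValueError(f"unknown property: {prop}")
    if method not in ("chemprop", "chemprop_tddft"):
        raise ValueError(f"unknown method: {method}")
    # The README lists the multi-fidelity method as experiment only.
    if method == "chemprop_tddft" and prop != EXPERIMENT_PROPERTY:
        raise ValueError("chemprop_tddft applies to experimental predictions only")
    # Datasets paired with each property as listed in the README.
    if prop == EXPERIMENT_PROPERTY:
        allowed = {"combined", "deep4chem"}
    else:
        allowed = {"all_wb97xd3"}
    if train_dataset is not None and train_dataset not in allowed:
        raise ValueError(f"{train_dataset} is not listed for {prop}")

    cmd = ["python", "uvvisml/predict.py",
           "--test_file", test_file,
           "--property", prop,
           "--method", method,
           "--preds_file", preds_file]
    if train_dataset is not None:
        cmd += ["--train_dataset", train_dataset]
    return cmd


# my_dyes.csv is a placeholder file name.
cmd = build_predict_command("my_dyes.csv", "test_preds.csv",
                            EXPERIMENT_PROPERTY, method="chemprop_tddft")
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run, or printed with shlex.join and pasted into a shell.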

Cluster

The cluster on which the script will run. Options are provided for the Supercloud and Engaging clusters at MIT. The default, None, runs the script on the local machine.

Uncertainty in Predictions

Output the ensemble variance (a measure of epistemic uncertainty) in predictions using --uncertainty_method ensemble_variance.
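To illustrate what an ensemble variance measures, the sketch below computes the mean and variance of predictions from several ensemble members for a single molecule. The prediction values are made up, the function name is an assumption, and the choice of population variance (rather than sample variance) is also an assumption, since the convention used by the tool is not stated here.

```python
from statistics import fmean, pvariance


def ensemble_mean_and_variance(member_predictions):
    """Return (mean, population variance) across ensemble members.

    NOTE: illustrative only; the population-variance convention is assumed.
    """
    if len(member_predictions) < 2:
        raise ValueError("need predictions from at least two ensemble members")
    return fmean(member_predictions), pvariance(member_predictions)


# Made-up peak wavelength predictions (nm) from five ensemble members.
preds_nm = [412.0, 415.0, 409.0, 414.0, 410.0]
mean_nm, var_nm2 = ensemble_mean_and_variance(preds_nm)
print(f"mean = {mean_nm:.1f} nm, variance = {var_nm2:.1f} nm^2")
```

A larger variance means the ensemble members disagree more, which is why it serves as a proxy for epistemic uncertainty.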

Examples

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop --preds_file test_preds.csv

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property vertical_excitation_eV_tddft --method chemprop --preds_file test_preds.csv

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop --preds_file test_preds.csv --train_dataset deep4chem

python uvvisml/predict.py --test_file uvvisml/data/splits/lambda_max_abs/deep4chem/group_by_smiles/smiles_target_test.csv --property absorption_peak_nm_expt --method chemprop_tddft --preds_file test_preds.csv --log_level info

Data

Please see the Data README for details on the sources and processing of the data used in this repository.

Citation

If you use this code, please cite the following manuscript:

@article{greenman2022multi,
  title={Multi-fidelity prediction of molecular optical peaks with deep learning},
  author={Greenman, Kevin P. and Green, William H. and G{\'{o}}mez-Bombarelli, Rafael},
  journal={Chemical Science},
  year={2022},
  volume={13},
  issue={4},
  pages={1152-1162},
  publisher={The Royal Society of Chemistry},
  doi={10.1039/D1SC05677H},
  url={http://dx.doi.org/10.1039/D1SC05677H}
}

The code for reproducing the results and figures from the above paper is available on Zenodo.
