v0.7.1 speedup, for dev only
Pre-release
For dev only
Speed up by:
- setting the default n_iters to 101
- removing unnecessary and duplicated calculations
- using in-place calculations whenever possible
- wrapping gradient-free work in no_grad
- better shape initialization and tensor slicing to reduce reshape calls
- avoiding large memory copies between CPU and GPU inside the optimization loop (see the sketch after this list)
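A minimal sketch of the loop-level patterns above, using hypothetical names (`optimize_shift`, a toy quadratic loss) rather than the actual DiffAtomComp code: allocate on the target device once, use in-place ops, wrap gradient-free bookkeeping in `torch.no_grad()`, and copy results back to the CPU only after the loop.

```python
import torch

def optimize_shift(coords, n_iters=101, lr=1e-2, device="cuda"):
    # One transfer up front; shapes stay fixed so the loop needs no reshape calls.
    coords = coords.to(device)                          # (N, 3) atom coordinates
    shift = torch.zeros(3, device=device, requires_grad=True)
    opt = torch.optim.Adam([shift], lr=lr)

    for _ in range(n_iters):
        opt.zero_grad(set_to_none=True)
        moved = coords + shift                          # broadcasts, no reshape
        loss = (moved ** 2).sum()                       # toy stand-in for the fit score
        loss.backward()
        opt.step()

        with torch.no_grad():                           # metrics need no gradients
            rmsd = moved.pow(2).sum(dim=1).mean().sqrt_()   # in-place sqrt

    # Single CPU copy after the loop instead of one per iteration.
    return shift.detach().cpu(), rmsd.item()

coords = torch.randn(5000, 3)
best_shift, final_rmsd = optimize_shift(
    coords, device="cuda" if torch.cuda.is_available() else "cpu")
```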
Not yet put into the code base:
- remove calculate_metrics (21-sec to 17-sec speedup for the test case below); the inside percentage still needs to be retained
- sample coords to mimic stochastic gradient descent (generating the random indices might take time, and it might reduce accuracy and thus require more n_iters; see the sketch after this list)
- separate the molecules so the division-by-num_atoms averaging can be removed (this might also allow better memory usage)
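A minimal sketch (hypothetical, not yet in the code base) of the coordinate sub-sampling idea above: draw a random subset of atom indices each iteration so every step evaluates the loss on fewer points, at the cost of the index generation itself and possibly more iterations.

```python
import torch

def sgd_style_step(coords, shift, opt, batch_size=1024):
    n = coords.shape[0]
    # Generate the random indices on the same device to avoid a CPU round-trip.
    idx = torch.randint(n, (min(batch_size, n),), device=coords.device)
    sampled = coords[idx]                     # (batch_size, 3) subset of atoms

    opt.zero_grad(set_to_none=True)
    loss = ((sampled + shift) ** 2).sum()     # toy stand-in for the density score
    loss.backward()
    opt.step()
    return loss.detach()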
Test case:

```
(.venv) D:\GIT\DiffFitStar\demo\5wvi>python D:\GIT\DiffFit\src\DiffAtomComp.py --target_vol D:\GIT\DiffFitStar\demo\5wvi\6693_seg\emd_6693_region_4208.mrc --target_surface_threshold 0.3 --structures_dir D:\GIT\DiffFitStar\demo\5wvi\AF_input --out_dir D:\GIT\DiffFitStar\demo\5wvi\DF_torch_profile --out_dir_exist_ok True --N_shifts 10 --device cuda --n_iters 101
```
Full Changelog: v0.7.0...v0.7.1