Machine-learned, GPU-accelerated particle flow reconstruction for CMS

Notes on modernizing CMS particle flow with machine learning. Internal documentation and results can be found at https://twiki.cern.ch/twiki/bin/view/CMS/MLParticleFlow.

Quickstart on Caltech iBanks:

#get the code
git clone https://github.com/jpata/particleflow.git
cd particleflow

#run a small prepared training
./test/train.sh

...wait...

#look at the output
ls data/PFNet*/epoch*/

Overview

  • set up datasets and ntuples for detailed PF analysis
  • reproduce existing PFCandidates with machine learning
    • end-to-end training of elements to MLPF-candidates using GNNs
  • reconstruct genparticles directly from detector elements à la HGCAL, neutrino experiments, etc.
    • set up datasets for regressing genparticles from elements
    • develop a baseline ML-PF model that is able to regress pions and neutral hadrons (currently GravNet-512 with 2D radius-graph)
    • develop improved loss function for event-to-event comparison: EMD, GAN
    • improve ML-PF model physics performance
    • improve ML-PF model computational performance
    • create CMSSW EDProducer for ML-PF particles
    • GPU-evaluation of MLPFProducer in CMSSW
    • implement a simple TensorFlow-based ML-PF training for evaluation in CMSSW
  • GPU code for existing PF algorithms
    • test CLUE for element to block clustering
    • port CLUE to PFBlockAlgo in CMSSW
    • parallelize PFAlgo calls on blocks
    • GPU-implementation of PFAlgo
    • GPU-implementation of PFBlockAlgo distances
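
The baseline model in the list above is described as using a "2D radius-graph". As a rough illustration of that idea only, the sketch below connects every pair of points closer than a given radius in a 2D plane. It assumes the two coordinates are (eta, phi) with phi wrapped to the azimuthal period; this choice, and the names `delta_phi` and `radius_graph_2d`, are hypothetical and are not taken from the repository.

```python
import math
from itertools import combinations


def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2, wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi


def radius_graph_2d(points, radius):
    """Return index pairs (i, j), i < j, of points closer than `radius`.

    `points` is a list of (eta, phi) tuples. Using (eta, phi) as the 2D
    plane is an assumption made for this sketch.
    """
    edges = []
    for (i, (eta_i, phi_i)), (j, (eta_j, phi_j)) in combinations(enumerate(points), 2):
        # Euclidean distance in the (eta, phi) plane with periodic phi.
        dr = math.hypot(eta_i - eta_j, delta_phi(phi_i, phi_j))
        if dr < radius:
            edges.append((i, j))
    return edges


points = [(0.0, 0.0), (0.1, 0.1), (2.0, 0.0), (0.0, 3.1), (0.0, -3.1)]
edges = radius_graph_2d(points, radius=0.2)
print(edges)
```

This brute-force version compares all pairs, so its cost grows quadratically with the number of points. It is meant to show the definition of the graph, not how to build it efficiently.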

Presentations

In case the above links do not load, the presentations are also mirrored on the following CERNBox link: https://cernbox.cern.ch/index.php/s/GkIRJU1YZuai4ix

Other relevant issues, repos, PRs:

Setting up the code

CMSSW recipe from setup.sh:

source /cvmfs/cms.cern.ch/cmsset_default.sh
export SCRAM_ARCH=slc7_amd64_gcc820

scramv1 project CMSSW CMSSW_11_1_0_pre5
cd CMSSW_11_1_0_pre5/src
eval `scramv1 runtime -sh`
git cms-init

git remote add -f jpata https://github.com/jpata/cmssw
git fetch -a jpata

git cms-addpkg RecoParticleFlow/PFProducer
git cms-addpkg Validation/RecoParticleFlow
git cms-addpkg SimGeneral/CaloAnalysis/
git cms-addpkg SimGeneral/MixingModule/

git checkout -b jpata_pfntuplizer --track jpata/jpata_pfntuplizer

#just to get an exact version of the code
git checkout 0fdcc0e8b6d848473170f0dc904468fa8a953aa8

#download the MLPF weight file
mkdir -p RecoParticleFlow/PFProducer/data/mlpf/
wget http://login-1.hep.caltech.edu/~jpata/particleflow/2020-05/models/mlpf_2020_05_19.pb -O RecoParticleFlow/PFProducer/data/mlpf/mlpf_2020_05_19.pb

scram b

#Run a small test of ML-PF
cmsRun RecoParticleFlow/PFProducer/test/mlpf_producer.py
edmDumpEventContent test.root | grep -i mlpf

#Run ML-PF within the reco framework up to ak4PFJets / ak4MLPFJets
cmsDriver.py step3 --runUnscheduled --conditions auto:phase1_2021_realistic \
  -s RAW2DIGI,L1Reco,RECO,RECOSIM,EI,PAT \
  --datatier MINIAODSIM --nThreads 1 -n 10 --era Run3 \
  --eventcontent MINIAODSIM --geometry=DB.Extended \
  --filein /store/relval/CMSSW_11_0_0_patch1/RelValQCD_FlatPt_15_3000HS_14/GEN-SIM-DIGI-RAW/PU_110X_mcRun3_2021_realistic_v6-v1/20000/087F3A84-A56F-784B-BE13-395D75616CC5.root \
  --customise RecoParticleFlow/PFProducer/mlpfproducer_customize.customize_step3 \
  --fileout file:step3_inMINIAODSIM.root

Datasets

  • May 2020
    • TTbar with PU for PhaseI, privately generated, 20k events
      • flat ROOT: /storage/group/gpu/bigdata/particleflow/TTbar_14TeV_TuneCUETP8M1_cfi/pfntuple_*.root
      • pickled graph data: /storage/group/gpu/bigdata/particleflow/TTbar_14TeV_TuneCUETP8M1_cfi/raw/*.pkl
      • processed pytorch: /storage/user/jpata/particleflow/data/TTbar_14TeV_TuneCUETP8M1_cfi/processed/*.pt
      • processed TFRecord: /storage/group/gpu/bigdata/particleflow/TTbar_14TeV_TuneCUETP8M1_cfi/tfr2/cand/*.tfrecords
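
Before starting a training it can help to check how many files of each type are visible from the current machine. The following is a minimal sketch that counts files matching the glob patterns listed above. The patterns are copied from the list, while the helper name `count_matching_files` is hypothetical. The paths only resolve on the cluster filesystem where the datasets are stored.

```python
import glob

# Glob patterns copied from the dataset list above.
DATASET_PATTERNS = {
    "flat ROOT": "/storage/group/gpu/bigdata/particleflow/TTbar_14TeV_TuneCUETP8M1_cfi/pfntuple_*.root",
    "pickled graph data": "/storage/group/gpu/bigdata/particleflow/TTbar_14TeV_TuneCUETP8M1_cfi/raw/*.pkl",
    "processed pytorch": "/storage/user/jpata/particleflow/data/TTbar_14TeV_TuneCUETP8M1_cfi/processed/*.pt",
    "processed TFRecord": "/storage/group/gpu/bigdata/particleflow/TTbar_14TeV_TuneCUETP8M1_cfi/tfr2/cand/*.tfrecords",
}


def count_matching_files(patterns):
    """Map each label to the number of files matching its glob pattern."""
    return {label: len(glob.glob(pattern)) for label, pattern in patterns.items()}


if __name__ == "__main__":
    for label, n_files in count_matching_files(DATASET_PATTERNS).items():
        print(f"{label}: {n_files} files")
```

A count of zero for every entry most likely means the storage area is not mounted on the machine in use.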

Creating the datasets

cd test
mkdir TTbar_14TeV_TuneCUETP8M1_cfi
python prepare_args.py > args.txt
condor_submit genjob.jdl

Contents of the flat ROOT output ntuple

The ROOT ntuple contains all PFElements, PFCandidates and GenParticles, along with the links between them. The following commands create the networkx graph data and a normalized data table:

#process a single file from ROOT to pickle, saving each event into a separate file
python test/postprocessing2.py --input data/TTbar_14TeV_TuneCUETP8M1_cfi/pfntuple_1.root --events-per-file 1 --save-full-graph --save-normalized-table

#produce the pytorch processed dataset, merging 5 pickle files into one pytorch file
python test/graph_data.py --dataset data/TTbar_14TeV_TuneCUETP8M1_cfi --num-files-merge 5
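
To make the effect of `--num-files-merge 5` concrete, the sketch below groups a list of per-event pickle files into consecutive batches of at most five, so that each batch would correspond to one merged output file. This is an illustration only and not the logic of `graph_data.py`. The sort order, the handling of a final partial batch, the function name `group_files` and the example file names are all assumptions.

```python
def group_files(paths, num_files_merge):
    """Split sorted paths into consecutive groups of at most `num_files_merge`."""
    if num_files_merge < 1:
        raise ValueError("num_files_merge must be at least 1")
    paths = sorted(paths)
    return [
        paths[start:start + num_files_merge]
        for start in range(0, len(paths), num_files_merge)
    ]


# Hypothetical file names, for illustration only.
pickle_files = [f"event_{i}.pkl" for i in range(7)]
groups = group_files(pickle_files, num_files_merge=5)
for index, group in enumerate(groups):
    print(index, group)
```

With seven input files and a merge size of five, this grouping gives one full batch and one batch holding the remaining two files.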

Contents of the numpy ntuple

For more details, see data.ipynb.

Model training

The main training code is in train_end2end.py. See the example script train.sh for how to run the training.

Model validation

Notebook: test_end2end

Acknowledgements

Part of this work was conducted at iBanks, the AI GPU cluster at Caltech. We acknowledge NVIDIA, SuperMicro and the Kavli Foundation for their support of iBanks.
