Code repository for the paper "Tracking People by Predicting 3D Appearance, Location & Pose".
Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik.
This repository provides the implementation of our paper, PHALP, with instructions for installation, dataset preparation, and evaluation, plus demo code that runs on any YouTube video.
Abstract : In this paper, we present an approach for tracking people in monocular videos, by predicting their future 3D representations. To achieve this, we first lift people to 3D from a single frame in a robust way. This lifting includes information about the 3D pose of the person, his or her location in the 3D space, and the 3D appearance. As we track a person, we collect 3D observations over time in a tracklet representation. Given the 3D nature of our observations, we build temporal models for each one of the previous attributes. We use these models to predict the future state of the tracklet, including 3D location, 3D appearance, and 3D pose. For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner. Association is solved with simple Hungarian matching, and the matches are used to update the respective tracklets. We evaluate our approach on various benchmarks and report state-of-the-art results.
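The association step described in the abstract can be illustrated with a small, self-contained sketch. It is not the code from this repository: the function names `hungarian` and `associate`, the `max_cost` gate, and the use of plain Euclidean distance between predicted and observed 3D locations are assumptions made for illustration. The paper combines location, appearance, and pose in a probabilistic similarity, which is replaced here by a single distance so that the matching itself is easy to follow.

```python
from math import dist, inf


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list `match` where match[i] is the column assigned to row i.
    Standard O(n^2 * m) potentials-based formulation of the Hungarian method.
    """
    n = len(cost)
    m = len(cost[0]) if n else 0
    assert n <= m, "expects at least as many columns as rows"
    u = [0.0] * (n + 1)      # row potentials (1-indexed)
    v = [0.0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)        # p[j] = row matched to column j, 0 = unmatched
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    return match


def associate(predicted, observed, max_cost=1.0):
    """Match predicted tracklet locations to single-frame detections.

    `predicted` and `observed` are lists of (x, y, z) tuples. Returns sorted
    (track_index, detection_index) pairs whose distance is at most `max_cost`
    (a hypothetical gating threshold, not a value from the paper).
    """
    if not predicted or not observed:
        return []
    cost = [[dist(p, o) for o in observed] for p in predicted]
    transposed = len(predicted) > len(observed)
    if transposed:
        # The solver needs rows <= columns, so put detections on the rows.
        cost = [list(col) for col in zip(*cost)]
    pairs = []
    for row, col in enumerate(hungarian(cost)):
        track, det = (col, row) if transposed else (row, col)
        if cost[row][col] <= max_cost:
            pairs.append((track, det))
    return sorted(pairs)
```

For example, `associate([(0, 0, 5), (2, 0, 5)], [(2.1, 0, 5), (0.1, 0, 5), (10, 10, 10)])` pairs track 0 with detection 1 and track 1 with detection 0, leaving the distant third detection unmatched. In a tracker, an unmatched detection would typically start a new tracklet.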
We recommend creating a clean conda environment and installing all dependencies in it. You can do this as follows:
conda env create -f scripts/_environment.yml
After the installation is complete you can activate the conda environment by running:
conda activate PHALP
Install PyOpenGL, TrackEval, PyTube and Neural Mesh Renderer from their respective repositories:
./scripts/setup.sh
Additionally, install Detectron2 from the official repository.
Please download this folder and extract it inside the main repository, or run the following command:
curl -L -o '_DATA.zip' 'https://drive.google.com/uc?id=1jEUahdb0WU5FOTllTEfFZrU4yQhshlQL&confirm=t'; unzip _DATA.zip
Besides these files, you also need to download the neutral SMPL model. Please go to the website for the corresponding project and register to get access to the downloads section. Create the folder _DATA/models/smpl/ and place the model there. Alternatively, you can run:
python3 utils/convert_smpl.py
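Before running the tracker, it can help to confirm that the expected folders are in place. The following is a minimal sketch, not part of this repository: the helper name `missing_paths` is hypothetical, and it only checks that the directories exist, not that the SMPL model file inside is valid.

```python
from pathlib import Path


def missing_paths(repo_root, required=("_DATA/models/smpl",)):
    """Return the entries of `required` that are not directories under `repo_root`.

    `required` defaults to the SMPL folder named in this README; pass other
    relative paths to check additional data folders.
    """
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the main repository folder.
    missing = missing_paths(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```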
Once the PoseTrack dataset is downloaded to "_DATA/posetrack/posetrack_data/", run the following command to run our tracker on all validation videos:
python demo.py --track_dataset posetrack
To evaluate tracking performance in terms of ID switches, MOTA, IDF1, and HOTA, please run the following command:
python3 evaluate_PHALP.py out/Videos_results/results/ PHALP posetrack
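For readers unfamiliar with these metrics, the sketch below shows how two of them are computed from aggregate counts, using the standard definitions MOTA = 1 - (FN + FP + IDSW) / GT and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). It is illustrative only and is not the code of the evaluation script; the function names and the example counts are made up. HOTA is omitted because it is averaged over localization thresholds and does not reduce to a single formula over counts.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy from counts summed over all frames.

    `num_gt` is the total number of ground-truth boxes. The value can be
    negative when the tracker makes more errors than there are ground-truth boxes.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1 score from identity-level true/false positives and false negatives."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(f"MOTA: {mota(10, 5, 5, 100):.3f}")
    print(f"IDF1: {idf1(80, 10, 30):.3f}")
```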
Please run the following command to run our method on a YouTube video. This downloads the YouTube video with the given ID, extracts frames, runs Detectron2, runs HMAR, and finally runs our tracker and renders the video.
python3 demo.py --track_dataset demo
Results (Project site)
We evaluated our method on the PoseTrack, MuPoTS, and AVA datasets. Our results show significant improvements over state-of-the-art methods for person tracking. For more results, please visit our website.
Parts of the code are taken or adapted from the following repos:
Jathushan Rajasegaran - [email protected] or [email protected]
To ask questions or report issues, please open an issue on the issues tracker.
Discussions, suggestions and questions are welcome!
If you find this code useful for your research, or if you use data generated by our method, please consider citing the following paper:
@article{rajasegaran2021tracking,
title={Tracking People by Predicting 3D Appearance, Location \& Pose},
author={Rajasegaran, Jathushan and Pavlakos, Georgios and Kanazawa, Angjoo and Malik, Jitendra},
journal={arXiv preprint arXiv:2112.04477},
year={2021}
}