Zuoyue Li - 李 作越

I am a Ph.D. student in the Computer Vision and Geometry (CVG) group at ETH Zurich, supervised by Prof. Marc Pollefeys. My research focuses on 3D vision and 3D generative models, and I collaborate closely with Prof. Martin R. Oswald and Prof. Zhaopeng Cui. My doctoral research has been funded mainly by a Swiss Data Science Center (SDSC) fellowship.

I am currently a research intern at Google Zurich, working on generative AI and digital humans. Previously, I was a research engineer intern at Meta Zurich, working on 3D object detection and scene understanding, and an overseas researcher in the Computer Vision Group at the Institute of Industrial Science (IIS), The University of Tokyo (東京大学), supervised by Prof. Yoichi Sato and funded by a Japan Society for the Promotion of Science (JSPS) fellowship.

I obtained my M.Sc. degree in Computer Science with distinction from ETH Zurich, and completed my B.Eng. degree in Electronic and Information Engineering as an outstanding graduate of Zhejiang University (浙江大学).

Email  /  GitHub  /  Google Scholar  /  LinkedIn

Research

<script type="text/javascript"> function sat2scene_start() { document.getElementById("sat2scene_video").play(); } function sat2scene_stop() { document.getElementById("sat2scene_video").pause(); } </script>

3D Urban Scene Generation from Satellite Images with Diffusion
Zuoyue Li, Zhenqiang Li, Zhaopeng Cui, Marc Pollefeys, Martin R. Oswald.
CVPR 2024 (Highlight)  /  Paper  /  Project Page  /  Code

Generalize diffusion models to sparse 3D space and generate urban scenes on a given or predicted geometry. Neural rendering then produces arbitrary views with high single-frame quality and inter-frame consistency.

<script type="text/javascript"> function compnvs_start() { document.getElementById("compnvs_video").play(); } function compnvs_stop() { document.getElementById("compnvs_video").pause(); } </script>

CompNVS: Novel View Synthesis with Scene Completion
Zuoyue Li, Tianxing Fan, Zhenqiang Li, Zhaopeng Cui, Yoichi Sato, Marc Pollefeys, Martin R. Oswald.
ECCV 2022  /  Paper  /  Code

Synthesize novel views from RGB-D images with largely incomplete scene coverage. Perform generation on a sparse grid-based neural representation to complete unobserved scene parts. Extrapolate the missing area and render consistent photorealistic image sequences.

<script type="text/javascript"> function acmmm_start() { document.getElementById("acmmm_video").play(); } function acmmm_stop() { document.getElementById("acmmm_video").pause(); } </script>

Factorized and Controllable Neural Re-rendering of Outdoor Scene for Photo Extrapolation
Boming Zhao, Bangbang Yang, Zhenyang Li, Zuoyue Li, Guofeng Zhang, Jiashu Zhao, Dawei Yin, Zhaopeng Cui, Hujun Bao.
ACM Multimedia 2022 (Oral)  /  Paper  /  Project Page

Expand tourist photos from a narrow field of view to a wider one while maintaining a similar visual style. Propose a factorized neural re-rendering model that produces photorealistic novel views from cluttered outdoor Internet photo collections, enabling applications such as controllable scene re-rendering, photo extrapolation, and 3D photo generation.

<script type="text/javascript"> function sat2vid_start() { document.getElementById("sat2vid_video").play(); } function sat2vid_stop() { document.getElementById("sat2vid_video").pause(); } </script>

Sat2Vid: Street-view Panoramic Video Synthesis from a Single Satellite Image
Zuoyue Li, Zhenqiang Li, Zhaopeng Cui, Rongjun Qin, Marc Pollefeys, Martin R. Oswald.
ICCV 2021  /  Paper

Synthesize street-view panoramic video that is both temporally and geometrically consistent from a single satellite image and a camera trajectory. Explicitly create a 3D point cloud representation of the scene and maintain dense 3D-2D correspondences across frames that reflect the geometric scene configuration inferred from the satellite view. Generation uses GAN-based methods in sparse 3D space.

NVS-MonoDepth: Improving Monocular Depth Prediction with Novel View Synthesis
Zuria Bauer, Zuoyue Li, Sergio Orts Escolano, Miguel Cazorla, Marc Pollefeys, Martin R. Oswald.
3DV 2021  /  Paper

Apply novel view synthesis to improve monocular depth estimation, with a warping scheme that uses the estimated depth to synthesize an additional viewpoint. The same depth network is then applied to the synthesized view, which provides an additional supervision signal.

Spatio-Temporal Perturbations for Video Attribution
Zhenqiang Li, Weimin Wang, Zuoyue Li, Yifei Huang, Yoichi Sato.
TCSVT 2021  /  Paper

Pay particular attention to the evaluation metrics for video attribution methods. Specifically, propose a new reliability measure that is used to select reliable and objective metrics. Investigate the effectiveness of the proposed attribution method extensively through both subjective and objective evaluation, and through comparison with multiple significant baseline attribution methods.

Towards Visually Explaining Video Understanding Networks with Perturbation
Zhenqiang Li, Weimin Wang, Zuoyue Li, Yifei Huang, Yoichi Sato.
WACV 2021  /  Paper

Aim to provide an easy-to-use visual explanation method for video understanding networks with diversified structures. Propose a generic perturbation-based visual explanation method, enhanced by a novel spatiotemporal smoothness constraint. The method enables the comparison of explanation results between different video classification networks and avoids generating pathological adversarial explanations for video inputs.

Geometry-Aware Satellite-to-Ground Image Synthesis for Urban Areas
Xiaohu Lu*, Zuoyue Li*, Zhaopeng Cui, Martin R. Oswald, Marc Pollefeys, Rongjun Qin.
*Equal contribution. CVPR 2020  /  Paper  /  Code

Generate panoramic street-view images that are geometrically consistent with a given satellite image via a GAN-based network with the proposed geo-transformation layer that retains the physical satellite-to-ground relation. The synthesized images retain well-articulated and authentic geometric shapes, as well as the texture richness of the street view in various scenarios.

Topological Map Extraction from Overhead Images
Zuoyue Li, Jan Dirk Wegner, Aurelien Lucchi.
ICCV 2019  /  Paper  /  Code

Circumvent the conventional pixel-wise segmentation of aerial images and predict objects directly in a vector representation. Extract the topological map of a city from overhead images as collections of building footprints and road networks.

Awards

Doctoral Consortium Participant, CVPR 2024
Outstanding Reviewer, CVPR 2023
Outstanding Reviewer, ECCV 2022
National Scholarship for Outstanding Students Abroad, 2022
Japan Society for the Promotion of Science (JSPS) Fellowships for Research in Japan, 2020
Swiss Data Science Center (SDSC) Fellowship, 2019
Graduate with Distinction (M.Sc.) at ETH Zürich, 2018
Second Runner-up (student teams) at Helvetic Coding Contest, 2017
Outstanding Graduate at Zhejiang University, 2015

Academic Service

Conference Reviewer: CVPR, ICCV, ECCV, NeurIPS, WACV.
Journal Reviewer: TPAMI, TGRS.

Teaching

Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich Spring 2024
Teaching Assistant, 263-5902-00L Computer Vision, ETH Zürich Autumn 2023
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich Spring 2023
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich Spring 2023
Teaching Assistant, 252-0847-00L Computer Science, ETH Zürich Autumn 2022
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich Spring 2022
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich Spring 2022
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich Spring 2021
Teaching Assistant, 263-5902-00L Computer Vision, ETH Zürich Autumn 2020
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich Spring 2020
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich Spring 2020
Teaching Assistant, 263-5902-00L Computer Vision, ETH Zürich Autumn 2019
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich Spring 2019
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich Spring 2019

Contact

Zuoyue Li
CAB G 85.2
Universitätstrasse 6
8092 Zürich
Switzerland

Last update: 28 Apr 2024

<script type="text/javascript">
// Fade the element with id `elmId` to `targetOpacity` in increments of
// `stepSize`, applying one increment every `stepTimeMs` milliseconds.
function setOpacity(elmId, targetOpacity, stepSize, stepTimeMs) {
  var elm = document.getElementById(elmId);
  var currentOpacity = parseFloat(elm.style.opacity);
  var numSteps = Math.ceil(Math.abs(targetOpacity - currentOpacity) / stepSize);
  // Step towards the target: negative when fading out.
  stepSize = Math.abs(stepSize);
  if (targetOpacity < currentOpacity) {
    stepSize = -stepSize;
  }
  var i = 0;
  var k = window.setInterval(function() {
    if (i < (numSteps - 1)) {
      i++;
      elm.style.opacity = currentOpacity + i * stepSize;
    } else {
      // Final step: set the exact target value and stop the timer.
      elm.style.opacity = targetOpacity;
      clearInterval(k);
    }
  }, stepTimeMs);
};
</script>