Mixed-reality headsets offer new ways to perceive our environment. They employ visible-spectrum cameras to capture and display the environment on screens in front of the user's eyes. However, these cameras introduce limitations. Firstly, they capture only a partial view of the environment: they are positioned to capture whatever is in front of the user, creating blind spots during complete immersion and failing to detect events outside the restricted field of view. Secondly, they capture only visible light fields, ignoring other fields, such as acoustics and radio, that are also present in the environment. Finally, these power-hungry cameras rapidly deplete the mixed-reality headset's battery. We introduce PixelGen to rethink embedded cameras for mixed-reality headsets. PixelGen decouples the cameras from the mixed-reality headset and balances resolution and fidelity to minimize power consumption. It employs low-resolution, monochrome image sensors and environmental sensors to capture the surroundings of the headset. This approach reduces the system's communication bandwidth and power consumption. A transformer-based language and image model processes this information to overcome the resolution trade-off, generating a higher-resolution representation of the environment.
View what PixelGen can do! Here is a stream of images generated by PixelGen: http://bit.ly/generated_video
- ImmerCom'24 - PixelGen: Rethinking Embedded Cameras for Mixed-Reality
The 2nd ACM Workshop on Mobile Immersive Computing, Networking, and Systems (ImmerCom) was held in conjunction with ACM MobiCom 2024.
@inproceedings{10.1145/3636534.3696216,
  author    = {Li, Kunjun and Gulati, Manoj and Shah, Dhairya and Waskito, Steven and Chakrabarty, Shantanu and Varshney, Ambuj},
  title     = {PixelGen: Rethinking Embedded Cameras for Mixed-Reality},
  year      = {2024},
  isbn      = {9798400704895},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/3636534.3696216},
  doi       = {10.1145/3636534.3696216},
  abstract  = {Mixed-reality headsets offer new ways to perceive our environment. They employ visible spectrum cameras to capture and display the environment on screens in front of the user's eyes. However, these cameras lead to limitations. Firstly, they capture only a partial view of the environment. They are positioned to capture whatever is in front of the user, thus creating blind spots during complete immersion and failing to detect events outside the restricted field of view. Secondly, they capture only visible light fields, ignoring other fields like acoustics and radio that are also present in the environment. Finally, these power-hungry cameras rapidly deplete the mixed-reality headset's battery. We introduce PixelGen to rethink embedded cameras for mixed-reality headsets. PixelGen proposes to decouple cameras from the mixed-reality headset and balance resolution and fidelity to minimize the power consumption. It employs low-resolution, monochrome image sensors and environmental sensors to capture the surroundings around the headset. This approach reduces the system's communication bandwidth and power consumption. A transformer-based language and image model process this information to overcome resolution trade-offs, thus generating a higher-resolution representation of the environment. We present initial experiments that show PixelGen's viability.},
  booktitle = {Proceedings of the 30th Annual International Conference on Mobile Computing and Networking},
  pages     = {2128–2135},
  numpages  = {8},
  keywords  = {embedded systems, networking, large language models},
  location  = {Washington D.C., DC, USA},
  series    = {ACM MobiCom '24}
}
- IPSN'24 Demo - PixelGen: Rethinking Embedded Camera Systems for Mixed-Reality
PixelGen won the Best Demonstration Runner-Up award at IPSN'24.
@inproceedings{10577362,
  author    = {Li, Kunjun and Gulati, Manoj and Shah, Dhairya and Waskito, Steven and Chakrabarty, Shantanu and Varshney, Ambuj},
  booktitle = {2024 23rd ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN)},
  title     = {Demo Abstract: PixelGen: Rethinking Embedded Camera Systems for Mixed-Reality},
  year      = {2024},
  pages     = {271-272},
  keywords  = {Headphones;Visualization;Power demand;Magnetic sensors;Mixed reality;Virtual reality;Sensor phenomena and characterization;Embedded Camera Systems;Mixed Reality;Multimodal AI;Low-power Systems},
  doi       = {10.1109/IPSN61024.2024.00036}
}
- arXiv - PixelGen: Rethinking Embedded Camera Systems
@misc{li2024pixelgenrethinkingembeddedcamera,
  title         = {PixelGen: Rethinking Embedded Camera Systems},
  author        = {Kunjun Li and Manoj Gulati and Steven Waskito and Dhairya Shah and Shantanu Chakrabarty and Ambuj Varshney},
  year          = {2024},
  eprint        = {2402.03390},
  archivePrefix = {arXiv},
  primaryClass  = {eess.IV},
  url           = {https://arxiv.org/abs/2402.03390}
}
This repository contains everything you need to get started with PixelGen, including scripts, hardware designs, and model compression tools.
- fast_stable_diffusion_AUTOMATIC1111.ipynb: Runs the diffusion model on collected sensor data to generate high-resolution images.
- ControlNet: A quick guide to using ControlNet with Stable Diffusion, conditioned on either Canny edge detection or OneFormer segmentation.
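The notebook above uses the AUTOMATIC1111 web UI. As an alternative illustration, the sketch below shows ControlNet conditioning with the Hugging Face diffusers library, guided by a Canny edge map derived from a low-resolution capture; the model IDs, file names, and prompt are assumptions for illustration, not the exact settings used by the notebook.

```python
# Illustrative sketch using the diffusers library (not the AUTOMATIC1111 workflow
# from the notebook above). Model IDs, file names, and the prompt are assumptions.
import cv2
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from PIL import Image

# Load a Canny-edge ControlNet and attach it to Stable Diffusion v1.5.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# Derive a Canny edge map from the low-resolution monochrome capture to guide generation.
low_res = np.array(Image.open("capture.png").convert("L"))
edges = cv2.Canny(low_res, 100, 200)
control_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

# In PixelGen, the text prompt would be assembled from the environmental sensor readings.
result = pipe(
    "a well-lit indoor office scene, photorealistic",
    image=control_image,
    num_inference_steps=30,
).images[0]
result.save("generated.png")
```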
- capture.py: Captures a low-resolution monochrome image using the PixelGen platform.
- fusion.py: Captures an image and simultaneously collects sensor data.
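These scripts define the actual interface to the board; the sketch below only illustrates the general shape of such a capture-and-fuse step. The serial port, baud rate, commands, frame size, and packet format are all assumptions, not the protocol used by capture.py or fusion.py.

```python
# Illustrative sketch only: the serial port, commands, frame size, and packet
# format below are assumptions and do not reflect the real capture.py/fusion.py.
import json
import time

import numpy as np
import serial              # pyserial
from PIL import Image

PORT = "/dev/ttyUSB0"      # assumed link to the PixelGen board
WIDTH, HEIGHT = 160, 120   # assumed low-resolution monochrome frame size

with serial.Serial(PORT, 115200, timeout=5) as link:
    link.write(b"CAPTURE\n")                       # hypothetical capture command
    raw = link.read(WIDTH * HEIGHT)                # one byte per monochrome pixel
    frame = np.frombuffer(raw, dtype=np.uint8).reshape(HEIGHT, WIDTH)
    Image.fromarray(frame, mode="L").save("capture.png")

    link.write(b"SENSORS\n")                       # hypothetical sensor-read command
    readings = json.loads(link.readline())         # e.g. {"temp_c": 24.1, "lux": 310}

# Store the paired sensor readings alongside the image for the generation stage.
with open(f"fusion_{int(time.time())}.json", "w") as f:
    json.dump(readings, f, indent=2)
```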
- AmbiqSDK: Contains the driver code required to interface with the various sensors and the microcontroller on the PixelGen platform.
- board: Includes the schematics of the hardware components, detailing how the sensors, microcontroller, and transceivers are interconnected, along with the Gerber files needed to manufacture the custom PixelGen board.
This repo contains pruning.ipynb, which compresses the Stable Diffusion 3 transformer with one-shot pruning (SparseGPT), and Tiny-SD, a Stable Diffusion v1.5 model compressed via knowledge distillation. Note that this work is still in progress.
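For intuition, the sketch below shows the shape of a one-shot (no-retraining) pruning pass over a transformer's linear layers, using plain magnitude pruning from torch.nn.utils.prune. SparseGPT additionally uses per-layer calibration data and a Hessian-based weight update, which this simplified illustration omits; the module names are generic, not those of the Stable Diffusion 3 transformer.

```python
# Simplified illustration of one-shot pruning on a transformer's linear layers.
# Unlike SparseGPT, this uses plain magnitude pruning and no calibration data;
# it only conveys the one-shot (prune once, no retraining) idea.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def one_shot_prune(model: nn.Module, sparsity: float = 0.5) -> nn.Module:
    """Zero out the smallest-magnitude weights of every Linear layer in one pass."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=sparsity)
            prune.remove(module, "weight")   # make the zeros permanent
    return model

# Example on a toy transformer block; the same call would target the attention
# and MLP projections of a diffusion transformer.
block = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
one_shot_prune(block, sparsity=0.5)
zeros = sum((m.weight == 0).sum().item() for m in block.modules() if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in block.modules() if isinstance(m, nn.Linear))
print(f"achieved sparsity: {zeros / total:.2f}")
```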
- Clone the repository:
git clone https://github.com/weiserlab/PixelGen.git
cd PixelGen
This figure shows a high-level overview of PixelGen.
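As a companion to the overview, here is a minimal, hypothetical sketch of the data flow: the board's low-resolution monochrome frame and environmental readings are fused into a prompt and a conditioning image for the generator. The names (SensorFrame, build_prompt, generate_view) are illustrative only and do not correspond to the repository's actual API.

```python
# Hypothetical sketch of the PixelGen data flow; names are illustrative only.
from dataclasses import dataclass

import numpy as np


@dataclass
class SensorFrame:
    """One capture from the decoupled PixelGen board."""
    image: np.ndarray        # low-resolution monochrome frame
    temperature_c: float     # example environmental readings
    humidity_pct: float
    sound_level_db: float


def build_prompt(frame: SensorFrame) -> str:
    """Fold the non-visual sensor readings into a text prompt for the generator."""
    return (
        f"indoor scene, ambient temperature {frame.temperature_c:.1f} C, "
        f"humidity {frame.humidity_pct:.0f}%, sound level {frame.sound_level_db:.0f} dB"
    )


def generate_view(frame: SensorFrame, pipeline):
    """Condition an image-generation pipeline on the low-res frame and sensor prompt."""
    return pipeline(prompt=build_prompt(frame), image=frame.image).images[0]
```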
We welcome contributions from the community. Please fork the repository and create a pull request with your enhancements or bug fixes.
PixelGen is developed and maintained by the following researchers from the WEISER group at the School of Computing, National University of Singapore (NUS).
Feel free to contact us at [email protected] or [email protected] for any questions or suggestions.
This work was supported primarily through a grant from the NUS-NCS center, a startup grant, a MoE Tier 1 Grant, and an unrestricted gift from Google through their Research Scholar Program. All of these grants were administered through the National University of Singapore.