Commit 1ede346
Merge branch 'shubh/regional_nerf_tests' of https://github.com/Stanford-NavLab/nerfstudio into shubh/regional_nerf_tests
shubhg1996 committed Dec 30, 2023
2 parents fe634e7 + a8e6f8f commit 1ede346
Showing 108 changed files with 5,740 additions and 1,047 deletions.
34 changes: 34 additions & 0 deletions .github/workflows/doc.yml
@@ -0,0 +1,34 @@
name: Docs
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: write
jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: |
          pip install .[docs]
      - name: Sphinx build
        # fail on warnings
        run: |
          sphinx-build docs _build -W --keep-going
      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        with:
          publish_branch: gh-pages
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: _build/
          force_orphan: true
          cname: docs.nerf.studio
        if: github.event_name != 'pull_request'
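
To reproduce the CI check locally before pushing, the same two steps can be run from the repository root:

```bash
# Mirrors the workflow above: install the docs dependencies, then build with
# warnings treated as errors.
pip install .[docs]
sphinx-build docs _build -W --keep-going
```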
37 changes: 0 additions & 37 deletions .readthedocs.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion .vscode/settings.json
@@ -33,7 +33,7 @@
  "editor.formatOnSave": true,
  "python.envFile": "${workspaceFolder}/.env",
  "python.formatting.provider": "none",
-  "python.formatting.blackArgs": ["--line-length=120"],
+  "black-formatter.args": ["--line-length=120"],
  "python.linting.pylintEnabled": false,
  "python.linting.flake8Enabled": false,
  "python.linting.enabled": true,
13 changes: 9 additions & 4 deletions Dockerfile
@@ -135,9 +135,10 @@ RUN git clone --branch v0.4.0 --recursive https://github.com/colmap/pycolmap.git
    python3.10 -m pip install . && \
    cd ..

-# Install hloc master (last release (1.3) is too old) as alternative feature detector and matcher option for nerfstudio.
+# Install hloc 1.4 as alternative feature detector and matcher option for nerfstudio.
RUN git clone --branch master --recursive https://github.com/cvg/Hierarchical-Localization.git && \
    cd Hierarchical-Localization && \
+    git checkout v1.4 && \
    python3.10 -m pip install -e . && \
    cd ..

@@ -148,8 +149,10 @@ RUN git clone --branch v1.0 --recursive https://github.com/cvg/pyceres.git && \
    cd ..

# Install pixel perfect sfm.
-RUN git clone --branch v1.0 --recursive https://github.com/cvg/pixel-perfect-sfm.git && \
+RUN git clone --recursive https://github.com/cvg/pixel-perfect-sfm.git && \
    cd pixel-perfect-sfm && \
+    git reset --hard 40f7c1339328b2a0c7cf71f76623fb848e0c0357 && \
+    git clean -df && \
    python3.10 -m pip install -e . && \
    cd ..

@@ -168,6 +171,8 @@ RUN cd nerfstudio && \
# Change working directory
WORKDIR /workspace

-# Install nerfstudio cli auto completion and enter shell if no command was provided.
-CMD ns-install-cli --mode install && /bin/bash
+# Install nerfstudio cli auto completion
+RUN ns-install-cli --mode install
+
+# Bash as default entrypoint.
+CMD /bin/bash -l
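
With the completions installed at build time, an interactive container now drops straight into a login shell with `ns-*` completion available. A hypothetical invocation (the image tag is an assumption):

```bash
docker run --rm -it --gpus all nerfstudio:latest
```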
2 changes: 1 addition & 1 deletion LICENSE
@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

-Copyright [yyyy] [name of copyright owner]
+Copyright 2023 The Nerfstudio Team

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
2 changes: 1 addition & 1 deletion docs/Makefile
@@ -3,7 +3,7 @@

# You can set these variables from the command line, and also
# from the environment for the first two.
-SPHINXOPTS ?=
+SPHINXOPTS ?= -W --keep-going # build fail on warning
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
2 changes: 1 addition & 1 deletion docs/developer_guides/config.md
@@ -89,7 +89,7 @@ Often times, you just want to play with the parameters of an existing model with
ns-train --help
```

-- List out all exist configurable parameters for `{METHOD_NAME}`
+- List out all existing configurable parameters for `{METHOD_NAME}`

```bash
ns-train {METHOD_NAME} --help
2 changes: 1 addition & 1 deletion docs/developer_guides/debugging_tools/benchmarking.md
@@ -16,7 +16,7 @@ Simply replace the arguments in brackets with the correct arguments.

- `-m {METHOD_NAME}`: Name of the method you want to benchmark (e.g. `nerfacto`, `mipnerf`).
- `-s`: Launch a single job per GPU.
-- `-v {VIS}`: Use another visualization than wandb, which is the default. Only other option is tensorboard.
+- `-v {VIS}`: Use another visualization than wandb, which is the default. Other options are comet & tensorboard.
- `{GPU_LIST}`: (optional) Specify the list of gpus you want to use on your machine space separated. for instance, if you want to use GPU's 0-3, you will need to pass in `0 1 2 3`. If left empty, the script will automatically find available GPU's and distribute training jobs on the available GPUs.
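
For example, a hypothetical invocation (the script path is an assumption based on the nerfstudio repository layout):

```bash
# Benchmark nerfacto with tensorboard logging, one job per GPU, on GPUs 0-3.
./nerfstudio/scripts/benchmarking/launch_train_blender.sh -m nerfacto -s -v tensorboard 0 1 2 3
```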

:::{admonition} Tip
2 changes: 1 addition & 1 deletion docs/developer_guides/new_methods.md
@@ -132,7 +132,7 @@ finally run the following to register the dataparser.
pip install -e .
```

-Similarly to the method develomement, you can also use environment variables to register dataparsers.
+Similarly to the method development, you can also use environment variables to register dataparsers.
Use the `NERFSTUDIO_DATAPARSER_CONFIGS` environment variable:

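```bash
# Hypothetical example; the exact entry format is an assumption based on the
# method-registration pattern described above.
export NERFSTUDIO_DATAPARSER_CONFIGS="my-parser=my_package.my_config:MyDataParserSpecification"
```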
4 changes: 1 addition & 3 deletions docs/developer_guides/pipelines/datamanagers.md
@@ -62,13 +62,11 @@ class VanillaDataManagerConfig(InstantiateConfig):
```python
    """number of rays per batch to use per eval iteration"""
    eval_num_images_to_sample_from: int = -1
    """number of images to sample during eval iteration"""
-    eval_image_indices: Optional[Tuple[int, ...]] = (0,)
-    """specifies the image indices to use during eval; if None, uses all"""
    camera_optimizer: CameraOptimizerConfig = CameraOptimizerConfig()
    """specifies the camera pose optimizer used during training"""
```

-Let's take a quick look at how the `run_train` method is implemented. Here we sample images, then pixels, and then return the RayBundle and RayGT information.
+Let's take a quick look at how the `next_train` method is implemented. Here we sample images, then pixels, and then return the RayBundle and RayGT information.

```python
def next_train(self, step: int) -> Tuple[RayBundle, Dict]:
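    # Body elided in the diff view; a minimal sketch of the described pattern
    # (attribute names are assumptions based on the VanillaDataManager config above):
    self.train_count += 1
    image_batch = next(self.iter_train_image_dataloader)  # sample images
    batch = self.train_pixel_sampler.sample(image_batch)  # then sample pixels
    ray_indices = batch["indices"]
    ray_bundle = self.train_ray_generator(ray_indices)    # build rays for those pixels
    return ray_bundle, batch
```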
2 changes: 1 addition & 1 deletion docs/developer_guides/pipelines/models.md
@@ -55,7 +55,7 @@ class Model:
```python
        """Process a RayBundle object and return RayOutputs describing quanties for each ray."""

    def get_metrics_dict(self, outputs, batch):
-        """Returns metrics dictionary which will be plotted with wandb or tensorboard."""
+        """Returns metrics dictionary which will be plotted with comet, wandb or tensorboard."""

    def get_loss_dict(self, outputs, batch, metrics_dict=None):
        """Returns a dictionary of losses to be summed which will be your loss."""
```
2 changes: 1 addition & 1 deletion docs/developer_guides/viewer/custom_gui.md
@@ -59,7 +59,7 @@ class MyModel(Model):


**Writing to the element**
-You can write to a viewer element in Python, which provides a convenient way to track values in your code without the need for wandb/tensorboard or relying on `print` statements.
+You can write to a viewer element in Python, which provides a convenient way to track values in your code without the need for comet/wandb/tensorboard or relying on `print` statements.

```python
self.custom_value.value = x
```
2 changes: 1 addition & 1 deletion docs/developer_guides/viewer/index.md
@@ -16,7 +16,7 @@ local_viewer

We thank the authors and contributors to the following repos, which we've started, used, and modified for our use-cases.

-- [Viser](https://github.com/brentyi/viser/tree/main/viser) - made by [Brent Yi](https://github.com/brentyi)
+- [Viser](https://github.com/brentyi/viser/) - made by [Brent Yi](https://github.com/brentyi)
- [meshcat-python](https://github.com/rdeits/meshcat-python) - made by [Robin Deits](https://github.com/rdeits)
- [meshcat](https://github.com/rdeits/meshcat) - made by [Robin Deits](https://github.com/rdeits)
- [ThreeJS](https://threejs.org/)
2 changes: 1 addition & 1 deletion docs/developer_guides/viewer/local_viewer.md
@@ -1,6 +1,6 @@
# Local Server

-If you are unable to connect to `https://viewer.nerf.studio`, want to use Safari, or want develop the viewer codebase, you can launch your own local viewer.
+If you are unable to connect to `https://viewer.nerf.studio`, want to use Safari, or want to develop the viewer codebase, you can launch your own local viewer.

## Installing Dependencies

14 changes: 12 additions & 2 deletions docs/developer_guides/viewer/viewer_control.md
@@ -62,8 +62,8 @@ class MyModel(nn.Module): # Must inherit from nn.Module
```python
        self.viewer_button = ViewerButton(name="Dummy Button",cb_hook=button_cb)
```

-## Double-click Callbacks
-We forward *double* clicks inside the viewer to the ViewerControl object, which you can use to interact with the scene. To do this, register a callback using `register_click_cb()`. The click is defined to be a ray that starts at the camera origin and passes through the click point on the screen, in world coordinates.
+## Scene Click Callbacks
+We forward *single* clicks inside the viewer to the ViewerControl object, which you can use to interact with the scene. To do this, register a callback using `register_click_cb()`. The click is defined to be a ray that starts at the camera origin and passes through the click point on the screen, in world coordinates.

```python
from nerfstudio.viewer.server.viewer_elements import ViewerControl,ViewerClick
```
@@ -77,6 +77,16 @@ class MyModel(nn.Module): # must inherit from nn.Module
```python
        self.viewer_control.register_click_cb(click_cb)
```

+You can also use `unregister_click_cb()` to remove callbacks that are no longer needed. A good example is a "Click on Scene" button that, when pressed, registers a callback that waits for the next click and then unregisters itself.
+```python
+...
+def button_cb(button: ViewerButton):
+    def click_cb(click: ViewerClick):
+        print(f"Click at {click.origin} in direction {click.direction}")
+        self.viewer_control.unregister_click_cb(click_cb)
+    self.viewer_control.register_click_cb(click_cb)
+```

### Thread safety
Just like `ViewerElement` callbacks, click callbacks are asynchronous to training and can potentially interrupt a call to `get_outputs()`.
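
One hypothetical way to guard state that a click callback shares with `get_outputs()` is an explicit lock (everything below other than `ViewerControl`/`ViewerClick` is an assumption):

```python
import threading

from torch import nn
from nerfstudio.viewer.server.viewer_elements import ViewerControl, ViewerClick


class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.viewer_control = ViewerControl()
        self._state_lock = threading.Lock()
        self._last_click = None

        def click_cb(click: ViewerClick):
            with self._state_lock:  # runs asynchronously to the training loop
                self._last_click = click

        self.viewer_control.register_click_cb(click_cb)

    def get_outputs(self, ray_bundle):
        with self._state_lock:  # read the shared state under the same lock
            click = self._last_click
        ...
```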

3 changes: 2 additions & 1 deletion docs/index.md
@@ -135,7 +135,7 @@ This documentation is organized into 3 parts:

### Included Methods

-- [**Nerfacto**](nerfology/methods/nerfacto.md): Recommended method, integrates mutiple methods into one.
+- [**Nerfacto**](nerfology/methods/nerfacto.md): Recommended method, integrates multiple methods into one.
- [Instant-NGP](nerfology/methods/instant_ngp.md): Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
- [NeRF](nerfology/methods/nerf.md): OG Neural Radiance Fields
- [Mip-NeRF](nerfology/methods/mipnerf.md): A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
@@ -151,6 +151,7 @@ This documentation is organized into 3 parts:
- [Nerfbusters](nerfology/methods/nerfbusters.md): Removing Ghostly Artifacts from Casually Captured NeRFs
- [NeRFPlayer](nerfology/methods/nerfplayer.md): 4D Radiance Fields by Streaming Feature Channels
- [Tetra-NeRF](nerfology/methods/tetranerf.md): Representing Neural Radiance Fields Using Tetrahedra
+- [Instruct-GS2GS](nerfology/methods/igs2gs.md): Editing 3DGS Scenes with Instructions

**Eager to contribute a method?** We'd love to see you use nerfstudio in implementing new (or even existing) methods! Please view our {ref}`guide<own_method_docs>` for more details about how to add to this list!

1 change: 1 addition & 0 deletions docs/make.bat
@@ -9,6 +9,7 @@ if "%SPHINXBUILD%" == "" (
)
set SOURCEDIR=.
set BUILDDIR=_build
+set SPHINXOPTS="-W --keep-going"

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
103 changes: 103 additions & 0 deletions docs/nerfology/methods/igs2gs.md
@@ -0,0 +1,103 @@
# Instruct-GS2GS

<h4>Editing Gaussian Splatting Scenes with Instructions</h4>

```{button-link} https://instruct-gs2gs.github.io/
:color: primary
:outline:
Paper Website
```

```{button-link} https://github.com/cvachha/instruct-gs2gs
:color: primary
:outline:
Code
```

<video id="teaser" muted autoplay playsinline loop controls width="100%">
<source id="mp4" src="https://instruct-gs2gs.github.io/data/videos/face.mp4" type="video/mp4">
</video>

**Instruct-GS2GS enables instruction-based editing of 3D Gaussian Splatting scenes via a 2D diffusion model**

## Installation

First install nerfstudio dependencies. Then run:

```bash
git clone https://github.com/cvachha/instruct-gs2gs
cd instruct-gs2gs
pip install --upgrade pip setuptools
pip install -e .
```

## Running Instruct-GS2GS

Details for running Instruct-GS2GS (built with Nerfstudio!) can be found [here](https://github.com/cvachha/instruct-gs2gs). Once installed, run:

```bash
ns-train igs2gs --help
```

| Method | Description | Memory |
| ------------ | ---------------------------- | ------ |
| `igs2gs` | Full model, used in paper | ~15GB |

Datasets need to be processed with COLMAP for Gaussian Splatting support.

Once you have trained your GS scene for 20k iterations, the checkpoints will be saved to the `outputs` directory. Copy the path to the `nerfstudio_models` folder. (Note: we noticed that training for 20k iterations rather than 30k seemed to run more reliably.)

To start training for editing the GS, run the following command:

```bash
ns-train igs2gs --data {PROCESSED_DATA_DIR} --load-dir {outputs/.../nerfstudio_models} --pipeline.prompt {"prompt"} --pipeline.guidance-scale 12.5 --pipeline.image-guidance-scale 1.5
```

The `{PROCESSED_DATA_DIR}` must be the same path as used in training the original GS. Using the CLI commands, you can choose the prompt and the guidance scales used for InstructPix2Pix.

## Method

### Overview

Instruct-GS2GS is a method for editing 3D Gaussian Splatting (3DGS) scenes with text instructions, following an approach based on [Instruct-NeRF2NeRF](https://instruct-nerf2nerf.github.io/). Given a 3DGS reconstruction of a scene and the collection of images used to create it, this method uses an image-conditioned diffusion model ([InstructPix2Pix](https://www.timothybrooks.com/instruct-pix2pix)) to iteratively edit the input images while optimizing the underlying scene, resulting in an optimized 3D scene that respects the edit instruction. The paper demonstrates that the method can edit large-scale, real-world scenes and accomplish realistic, targeted edits.


## Pipeline

<video id="pipeline" muted autoplay playsinline loop controls width="100%">
<source id="mp4" src="https://instruct-gs2gs.github.io/data/videos/pipeline.mp4" type="video/mp4">
</video>

This section will walk through each component of the Instruct-GS2GS method.

### How it Works

Instruct-GS2GS gradually updates a reconstructed Gaussian Splatting scene by iteratively updating the dataset images while training the 3DGS (a code sketch of the loop follows the list):

1. Images are rendered from the scene at all training viewpoints.
2. They get edited by InstructPix2Pix given a global text instruction.
3. The training dataset images are replaced with the edited images.
4. The 3DGS continues training as usual for 2.5k iterations.
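
A schematic of this loop (all names below are hypothetical; the real implementation lives in the [instruct-gs2gs repository](https://github.com/cvachha/instruct-gs2gs)):

```python
from typing import Callable, Iterable

def igs2gs_edit_loop(
    render: Callable,      # view -> image rendered from the current 3DGS
    edit: Callable,        # (rendered, original, prompt) -> edited image
    train_step: Callable,  # one 3DGS optimization step on the dataset
    views: Iterable,       # training views with .image / .original_image
    prompt: str,
    start_iter: int = 20_000,
    max_iter: int = 27_500,
    edit_every: int = 2_500,
) -> None:
    """Hypothetical schematic of the Instruct-GS2GS dataset-update loop."""
    for it in range(start_iter, max_iter):
        if (it - start_iter) % edit_every == 0:
            # Steps 1-3: re-render each training view and replace the dataset
            # image with an InstructPix2Pix edit of that render.
            for view in views:
                rendered = render(view)
                view.image = edit(rendered, view.original_image, prompt)
        # Step 4: continue standard 3DGS optimization on the (edited) dataset.
        train_step()
```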

### Editing Images with InstructPix2Pix

To update a dataset image from a given viewpoint, Instruct-GS2GS takes the original, unedited training image as image conditioning and uses the global text instruction as text conditioning. This process mixes the information of the diffusion model, which attempts to edit the image, the current 3D structure of the 3DGS, and view-consistent information from the unedited, ground-truth images. By combining this set of information, the edit is respected while maintaining 3D consistency.

The code snippet for how an image is edited in the pipeline can be found [here](https://github.com/cvachha/instruct-gs2gs/blob/main/igs2gs/ip2p.py).

### Iterative Dataset Update and Implementation

The method takes in a dataset of camera poses and training images, a trained 3DGS scene, and a user-specified text-prompt instruction, e.g. “make him a marble statue”. Instruct-GS2GS constructs the edited GS scene guided by the text prompt by applying a 2D text- and image-conditioned diffusion model, in this case Instruct-Pix2Pix, to all training images over the course of training. It performs these edits using an iterative update scheme in which all training dataset images are individually updated by the diffusion model, one after another over a full pass through the dataset, every 2.5k training iterations. This process gives the GS a holistic edit while maintaining 3D consistency.

The process is similar to Instruct-NeRF2NeRF: for a given training camera view, it sets the original training image as the conditioning image, uses the GS render from that camera combined with randomly selected noise as the noisy image input, and receives back an edited image that respects the text conditioning. In this way the edits are propagated to the GS scene, and they stay grounded because Instruct-Pix2Pix is always conditioned on the original unedited training image.

This method uses Nerfstudio's gsplat library for the underlying Gaussian Splatting model. We adapt similar parameters for the diffusion model from Instruct-NeRF2NeRF, among them the values that define the amount of noise (and therefore the amount of signal retained from the original images). We vary the classifier-free guidance scales per edit and scene, using a range of values. We edit the entire dataset and then train the scene for 2.5k iterations. For GS training, we use L1 and LPIPS losses. We train for a maximum of 27.5k iterations (starting from a GS scene trained for 20k iterations), but in practice we stop training once the edit has converged. In many cases the optimal training length is a subjective decision: a user may prefer more subtle or more extreme edits that are best found at different stages of training.


## Results

For results, view the [project page](https://instruct-gs2gs.github.io/)!

<video id="teaser" muted autoplay playsinline loop controls width="100%">
<source id="mp4" src="https://instruct-gs2gs.github.io/data/videos/campanile_all.mp4" type="video/mp4">
</video>
3 changes: 2 additions & 1 deletion docs/nerfology/methods/index.md
@@ -38,6 +38,7 @@ The following methods are supported in nerfstudio:
Tetra-NeRF<tetranerf.md>
TensoRF<tensorf.md>
Generfacto<generfacto.md>
+Instruct-GS2GS<igs2gs.md>
```

(own_method_docs)=
@@ -50,7 +51,7 @@ We also welcome additions to the list of methods above. To do this, simply creat

1. Add a markdown file describing the model to the `docs/nerfology/methods` folder
2. Update the above list of implement methods in this file.
-3. Add the method to the {ref}`this<third_party_methods>` list in `docs/index.md`.
+3. Add the method to {ref}`this<third_party_methods>` list in `docs/index.md`.
4. Add a new `ExternalMethod` entry to the `nerfstudio/configs/external_methods.py` file.
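
For step 4, a sketch of what such an entry might look like (the field names are assumptions based on the registration pattern; check `external_methods.py` for the exact dataclass):

```python
# Hypothetical entry in nerfstudio/configs/external_methods.py.
external_methods.append(
    ExternalMethod(
        instructions="To enable MyMethod, run: pip install git+https://github.com/me/my-method",
        configurations=[("my-method", "MyMethod: one-line description shown by ns-train")],
        pip_package="git+https://github.com/me/my-method",
    )
)
```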

For the method description, please refer to the [Instruct-NeRF2NeRF](in2n) page as an example of the layout. Please try to include the following information: