[BE] Fix some typos #2811

Merged (2 commits, Feb 25, 2025)
Changes from 1 commit

docs/source/reference/collectors.rst (1 addition, 1 deletion)
@@ -155,7 +155,7 @@ will work:
>>> collector = SyncDataCollector(env, policy, frames_per_batch=N, total_frames=-1)
>>> for data in collector:
... memory.extend(data)
>>> # MultiSyncDataCollector + regular env: behaves like a ParallelEnv iif cat_results="stack"
>>> # MultiSyncDataCollector + regular env: behaves like a ParallelEnv if cat_results="stack"
>>> memory = ReplayBuffer(
... storage=LazyTensorStorage(N, ndim=2),
... sampler=SliceSampler(num_slices=4, trajectory_key=("collector", "traj_ids"))
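For context, a minimal sketch of the MultiSyncDataCollector pattern the corrected comment refers to; `make_env`, `policy` and `N` are illustrative placeholders rather than names taken from the diff:

>>> from torchrl.collectors import MultiSyncDataCollector
>>> from torchrl.data import LazyTensorStorage, ReplayBuffer, SliceSampler
>>> collector = MultiSyncDataCollector(
...     [make_env, make_env],
...     policy,
...     frames_per_batch=N,
...     total_frames=-1,
...     cat_results="stack",  # keeps a [num_workers, time] layout, as the comment above notes
... )
>>> memory = ReplayBuffer(
...     storage=LazyTensorStorage(N, ndim=2),
...     sampler=SliceSampler(num_slices=4, trajectory_key=("collector", "traj_ids")),
... )
>>> for data in collector:
...     memory.extend(data)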

docs/source/reference/data.rst (2 additions, 2 deletions)
@@ -930,7 +930,7 @@ It is important that your environment specs match the input and output that it s
:class:`~torchrl.envs.ParallelEnv` will create buffers from these specs to communicate with the spawn processes.
Check the :func:`torchrl.envs.utils.check_env_specs` method for a sanity check.

If needed, specs can be automatially generated from data using the :func:`~torchrl.envs.utils.make_composite_from_td`
If needed, specs can be automatically generated from data using the :func:`~torchrl.envs.utils.make_composite_from_td`
function.
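For reference, a small sketch of generating a spec from sample data with this function (the tensor names are illustrative):

>>> import torch
>>> from tensordict import TensorDict
>>> from torchrl.envs.utils import make_composite_from_td
>>> td = TensorDict({"obs": torch.zeros(3), "reward": torch.zeros(1)}, batch_size=[])
>>> spec = make_composite_from_td(td)  # composite spec mirroring the tensordict's structure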

Specs fall in two main categories, numerical and categorical.
@@ -1073,7 +1073,7 @@ Then, a second storage keeps track of the actions and results associated with th
>>> next_data = forest.data_map[index]

The ``next_data`` entry can have any shape, but it will usually match the shape of ``index`` (since at each index
corresponds one action). Once ``next_data`` is obtrained, it can be put together with ``data`` to form a set of nodes,
corresponds one action). Once ``next_data`` is obtained, it can be put together with ``data`` to form a set of nodes,
and the tree can be expanded for each of these. The following figure shows how this is done.

.. figure:: /_static/img/collector-copy.png

docs/source/reference/envs.rst (3 additions, 3 deletions)
@@ -301,7 +301,7 @@ The ``"_reset"`` key has two distinct functionalities:
Designing an environment that behaves according to ``"_reset"`` inputs is the
developer's responsibility, as TorchRL has no control over the inner logic
of :meth:`~.EnvBase._reset`. Nevertheless, the following point should be
kept in mind when desiging that method.
kept in mind when designing that method.

2. After a call to :meth:`~.EnvBase._reset`, the output will be masked with the
``"_reset"`` entries and the output of the previous :meth:`~.EnvBase.step`
@@ -329,7 +329,7 @@ designing reset functionalities:
whether the ``"_reset"`` at the root level corresponds to an ``all()``, ``any()``
or custom call to the nested ``"done"`` entries cannot be known in advance,
and it is explicitly assumed that the ``"_reset"`` at the root was placed
there to superseed the nested values (for an example, have a look at
there to supersede the nested values (for an example, have a look at
:class:`~.PettingZooWrapper` implementation where each group has one or more
``"done"`` entries associated which is aggregated at the root level with a
``any`` or ``all`` logic depending on the task).
@@ -1126,7 +1126,7 @@ to always know what the latest available actions are. You can do this like so:
>>> )

.. note::
In case you are using a parallel environment it is important to add the transform to the parallel enviornment itself
In case you are using a parallel environment it is important to add the transform to the parallel environment itself
and not to its sub-environments.
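As a rough illustration of that note (with a hypothetical `make_env` factory and `my_transform`):

>>> from torchrl.envs import ParallelEnv, TransformedEnv
>>> # transform added to the parallel environment itself:
>>> env = TransformedEnv(ParallelEnv(2, make_env), my_transform)
>>> # rather than to each sub-environment:
>>> # ParallelEnv(2, lambda: TransformedEnv(make_env(), my_transform))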



docs/source/reference/trainers.rst (1 addition, 1 deletion)
@@ -9,7 +9,7 @@ The trainer package provides utilities to write re-usable training scripts. The
trainer that implements a nested loop, where the outer loop runs the data collection steps and the inner
loop the optimization steps. We believe this fits multiple RL training schemes, such as
on-policy, off-policy, model-based and model-free solutions, offline RL and others.
More particular cases, such as meta-RL algorithms may have training schemes that differ substentially.
More particular cases, such as meta-RL algorithms may have training schemes that differ substantially.

The ``trainer.train()`` method can be sketched as follows:
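Roughly, and only as an illustration of the nested loop described above (not the actual ``Trainer`` source; ``collector``, ``replay_buffer``, ``loss_module`` and ``optimizer`` are placeholders):

for batch in collector:                      # outer loop: data collection
    replay_buffer.extend(batch)
    for _ in range(optim_steps_per_batch):   # inner loop: optimization
        sample = replay_buffer.sample()
        loss_td = loss_module(sample)
        loss = sum(value for key, value in loss_td.items() if key.startswith("loss_"))
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()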


@@ -3,7 +3,7 @@
===========================

This example illustrates how a skeleton reinforcement learning algorithm can be implemented in a distributed fashion with communication between nodes/workers handled using `torch.rpc`.
It focusses on how to set up a replay buffer worker that accepts remote operation requests efficiently, and so omits any learning component such as parameter updates that may be required for a complete distributed reinforcement learning algorithm implementation.
It focuses on how to set up a replay buffer worker that accepts remote operation requests efficiently, and so omits any learning component such as parameter updates that may be required for a complete distributed reinforcement learning algorithm implementation.
In this model, >= 1 data collectors workers are responsible for collecting experiences in an environment, the replay buffer worker receives all of these experiences and exposes them to a trainer that is responsible for making parameter updates to any required models.
"""

@@ -150,7 +150,7 @@ def _create_and_launch_data_collectors(self) -> None:

class ReplayBufferNode(RemoteTensorDictReplayBuffer):
"""Experience replay buffer node that is capable of accepting remote connections. Being a `RemoteTensorDictReplayBuffer`
means all of its public methods are remotely invokable using `torch.rpc`.
means all of its public methods are remotely invocable using `torch.rpc`.
Using a LazyMemmapStorage is highly advised in distributed settings with shared storage due to the lower serialization
cost of MemoryMappedTensors as well as the ability to specify file storage locations which can improve ability to recover from node failures.
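For context, a hedged sketch of how a trainer process might drive such a node over `torch.rpc` (the worker name and capacity below are illustrative):

import torch.distributed.rpc as rpc

# obtain a remote reference to a buffer hosted on the replay-buffer worker
buffer_rref = rpc.remote("ReplayBuffer", ReplayBufferNode, args=(10_000,))
# any public method can then be invoked through the RRef proxy
batch = buffer_rref.rpc_sync().sample(64)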


knowledge_base/MUJOCO_INSTALLATION.md (2 additions, 2 deletions)
@@ -189,7 +189,7 @@ issues when running `import mujoco_py` and some troubleshooting for each of them
/path/to/conda/envs/mj_envs/lib/python3.8/site-packages/glfw/__init__.py:912: GLFWError: (65537) b'The GLFW library is not initialized'
```

_Solution_: This can usually be sovled by setting EGL as your mujoco_gl backend: `MUJOCO_GL=egl python myscript.py`
_Solution_: This can usually be solved by setting EGL as your mujoco_gl backend: `MUJOCO_GL=egl python myscript.py`



@@ -208,7 +208,7 @@ RuntimeError: Failed to initialize OpenGL
> Mujoco's EGL code indexes devices globally while CUDA_VISIBLE_DEVICES
(when used with job schedulers like slurm) returns the local device ids.
This can be worked around by setting the `GPUS` environment variable to the
global device id. For slurm, it can be obtained using `SLURM_STEP_GPUS` enviroment variable.
global device id. For slurm, it can be obtained using `SLURM_STEP_GPUS` environment variable.
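A hedged sketch of setting both variables from Python before importing mujoco (values are illustrative):

```
import os

os.environ.setdefault("MUJOCO_GL", "egl")
# map the slurm-local GPU id to the global id expected by mujoco's EGL code
os.environ.setdefault("GPUS", os.environ.get("SLURM_STEP_GPUS", "0"))

import mujoco_py  # noqa: E402  (import only after the variables are set)
```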

8. Rendered images are completely black.


knowledge_base/PRO-TIPS.md (2 additions, 2 deletions)
@@ -76,7 +76,7 @@ Errors to look for that may be related to this misconception are the following:
than the number of environments you're working with (twice as much for instance). This
is also and especially true for environments that are rendered (even if they are rendered on GPU).
- The speed of training depends upon several factors and there is not a one-fits-all
solution to every problem. The common bottlnecks are:
solution to every problem. The common bottlenecks are:
- **data collection**: the simulator speed may affect performance, as can the data
transformation that follows. Speeding up environment interactions is usually
done via vectorization (if the simulators enables it, e.g. Brax and other Jax-based
@@ -93,7 +93,7 @@ Errors to look for that may be related to this misconception are the following:
a computational bottleneck as these are usually coded using plain for loops.
If profiling indicates that this operation is taking a considerable amount
of time, consider using our fully vectorized solutions instead.
- **Loss compuation**: The loss computation and the optimization
- **Loss computation**: The loss computation and the optimization
steps are frequently responsible of a significant share of the compute time.
Some techniques can speed things up. For instance, if multiple target networks
are being used, using vectorized maps and functional programming (through
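Picking up the vectorized-maps point above, a rough sketch of evaluating several identical target networks with a single batched call (assuming `torch.func`; `target_nets` and `obs_action` are illustrative names, not from the repo):

```
import torch
from torch.func import functional_call, stack_module_state

params, buffers = stack_module_state(target_nets)  # target_nets: list of identical nn.Modules
base = target_nets[0]

def call_single(p, b, x):
    return functional_call(base, (p, b), (x,))

# one vectorized call instead of a Python loop over the ensemble
q_values = torch.vmap(call_single, in_dims=(0, 0, None))(params, buffers, obs_action)
```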

knowledge_base/VERSIONING_ISSUES.md (2 additions, 2 deletions)
@@ -7,7 +7,7 @@ ImportError: /usr/local/lib/python3.7/dist-packages/torchrl/_torchrl.so: undefin
```

### How to reproduce
1. Create an Colab Notebook (at 24/11/2022 Colab enviroment has Python 3.7 and Pytorch 1.12 installed by default).
1. Create an Colab Notebook (at 24/11/2022 Colab environment has Python 3.7 and Pytorch 1.12 installed by default).
2. ``` !pip install torchrl ```
3. ``` import torchrl ```

@@ -20,4 +20,4 @@
### Workarounds
There are two workarounds to this issue
1. Install/upgrade to the latest pytorch release before installing torchrl.
2. If you need to use a previous pytorch relase: Install functorch version related to your torch distribution: e.g. ``` pip install functorch==0.2.0 ``` and install library from source ``` pip install git+https://github.com/pytorch/rl@<lib_version_here> ```.
2. If you need to use a previous pytorch release: Install functorch version related to your torch distribution: e.g. ``` pip install functorch==0.2.0 ``` and install library from source ``` pip install git+https://github.com/pytorch/rl@<lib_version_here> ```.

sota-implementations/decision_transformer/lamb.py (1 addition, 1 deletion)
@@ -85,7 +85,7 @@ def step(self, closure=None):
grad = p.grad
if grad.is_sparse:
raise RuntimeError(
"Lamb does not support sparse gradients, consider SparseAdam instad."
"Lamb does not support sparse gradients, consider SparseAdam instead."
)
global_grad_norm.add_(grad.pow(2).sum())


sota-implementations/redq/utils.py (4 additions, 4 deletions)
@@ -108,7 +108,7 @@
def correct_for_frame_skip(cfg: "DictConfig") -> "DictConfig": # noqa: F821
"""Correct the arguments for the input frame_skip, by dividing all the arguments that reflect a count of frames by the frame_skip.

This is aimed at avoiding unknowingly over-sampling from the environment, i.e. targetting a total number of frames
This is aimed at avoiding unknowingly over-sampling from the environment, i.e. targeting a total number of frames
of 1M but actually collecting frame_skip * 1M frames.

Args:
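As a toy illustration of the correction this docstring describes (not the actual helper):

# with frame_skip=4, asking for 1M frames would otherwise trigger 4M simulator steps;
# dividing the frame counts keeps the target at 1M environment frames
frame_skip = 4
total_frames = 1_000_000 // frame_skip
frames_per_batch = 1_000 // frame_skip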
@@ -578,7 +578,7 @@ def transformed_env_constructor(
stats (dict, optional): a dictionary containing the :obj:`loc` and :obj:`scale` for the `ObservationNorm` transform
norm_obs_only (bool, optional): If `True` and `VecNorm` is used, the reward won't be normalized online.
Default is `False`.
use_env_creator (bool, optional): wheter the `EnvCreator` class should be used. By using `EnvCreator`,
use_env_creator (bool, optional): whether the `EnvCreator` class should be used. By using `EnvCreator`,
one can make sure that running statistics will be put in shared memory and accessible for all workers
when using a `VecNorm` transform. Default is `True`.
custom_env_maker (callable, optional): if your env maker is not part
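To illustrate the `use_env_creator` point above, a sketch assuming a Gym pendulum environment (not taken from this file):

from torchrl.envs import EnvCreator, ParallelEnv, TransformedEnv
from torchrl.envs.libs.gym import GymEnv
from torchrl.envs.transforms import VecNorm

# wrapping the factory in EnvCreator lets VecNorm's running statistics live in shared
# memory, so every worker reads and updates the same values
make_env = EnvCreator(lambda: TransformedEnv(GymEnv("Pendulum-v1"), VecNorm()))
env = ParallelEnv(4, make_env)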
@@ -644,7 +644,7 @@ def make_transformed_env(**kwargs) -> TransformedEnv:
elif custom_env_maker is None and custom_env is not None:
env = custom_env
else:
raise RuntimeError("cannot provive both custom_env and custom_env_maker")
raise RuntimeError("cannot provide both custom_env and custom_env_maker")

if cfg.env.noops and custom_env is None:
# this is a bit hacky: if custom_env is not None, it is probably a ParallelEnv
@@ -699,7 +699,7 @@ def initialize_observation_norm_transforms(

Args:
proof_environment (EnvBase instance, optional): if provided, this env will
be used ot execute the rollouts. If not, it will be created using
be used to execute the rollouts. If not, it will be created using
the cfg object.
num_iter (int): Number of iterations used for initializing the :obj:`ObservationNorms`
key (str, optional): if provided, the stats of this key will be gathered.

test/_utils_internal.py (1 addition, 1 deletion)
@@ -592,7 +592,7 @@ class LSTMNet(nn.Module):
TensorDict of size [batch x time_steps].

If a 2D tensor is provided as input, it is assumed that it is a batch of data
with only one time step. This means that we explicitely assume that users will
with only one time step. This means that we explicitly assume that users will
unsqueeze inputs of a single batch with multiple time steps.

Args:

test/mocking_classes.py (6 additions, 6 deletions)
@@ -970,15 +970,15 @@ def _get_in_obs(self, tensordict):


class DummyModelBasedEnvBase(ModelBasedEnvBase):
"""Dummy environnement for Model Based RL sota-implementations.
"""Dummy environment for Model Based RL sota-implementations.

This class is meant to be used to test the model based environnement.
This class is meant to be used to test the model based environment.

Args:
world_model (WorldModel): the world model to use for the environnement.
device (str or torch.device, optional): the device to use for the environnement.
dtype (torch.dtype, optional): the dtype to use for the environnement.
batch_size (sequence of int, optional): the batch size to use for the environnement.
world_model (WorldModel): the world model to use for the environment.
device (str or torch.device, optional): the device to use for the environment.
dtype (torch.dtype, optional): the dtype to use for the environment.
batch_size (sequence of int, optional): the batch size to use for the environment.
"""

def __init__(

test/test_collector.py (1 addition, 1 deletion)
@@ -862,7 +862,7 @@ def test_env_that_errors(self, ctype):
"collector_cls", ["MultiSyncDataCollector", "MultiaSyncDataCollector"]
)
def test_env_that_waits(self, to, collector_cls):
# Tests that the collector fails if the MAX_IDLE_COUNT<waiting time, but succeeeds otherwise
# Tests that the collector fails if the MAX_IDLE_COUNT<waiting time, but succeeds otherwise
# We run this in a subprocess to control the env variable.
script = f"""import os


test/test_env.py (1 addition, 1 deletion)
@@ -1305,7 +1305,7 @@ def test_parallel_env(
td_reset = TensorDict(source=rand_reset(env_parallel), batch_size=[N])
env_parallel.reset(tensordict=td_reset)

# check that interruption occured because of max_steps or done
# check that interruption occurred because of max_steps or done
td = env_parallel.rollout(policy=None, max_steps=T)
assert (
td.shape == torch.Size([N, T]) or td.get(("next", "done")).sum(1).any()

test/test_loggers.py (2 additions, 2 deletions)
@@ -187,14 +187,14 @@ def test_log_video(self, steps, video_format, tmpdir):
sleep(0.01) # wait until events are registered

# check that the logged videos are the same as the initial video
extention = (
extension = (
".pt"
if video_format == "pt"
else ".memmap"
if video_format == "memmap"
else ".mp4"
)
video_file_name = "foo_" + ("0" if not steps else str(steps[0])) + extention
video_file_name = "foo_" + ("0" if not steps else str(steps[0])) + extension
path = os.path.join(tmpdir, exp_name, "videos", video_file_name)
if video_format == "pt":
logged_video = torch.load(path)

test/test_rb.py (2 additions, 2 deletions)
@@ -2871,7 +2871,7 @@ def test_slice_sampler_prioritized(self, ndim, strict_length, circ, at_capacity)
assert (samples["traj"] != 0).all(), samples["traj"].unique()
else:
assert (samples["traj"] == 0).any()
# Check that all samples of the first traj contain all elements (since it's too short to fullfill 10 elts)
# Check that all samples of the first traj contain all elements (since it's too short to fulfill 10 elts)
sc = samples[samples["traj"] == 0]["step_count"]
assert (sc == 1).sum() == (sc == 2).sum()
assert (sc == 1).sum() == (sc == 4).sum()
@@ -3393,7 +3393,7 @@ def test_rb(
error_catcher = (
pytest.raises(
ValueError,
match="Samplers with drop_last=True must work with a predictible batch-size",
match="Samplers with drop_last=True must work with a predictable batch-size",
)
if batch_size is None
and issubclass(sampler_type, SamplerWithoutReplacement)

test/test_storage_map.py (2 additions, 2 deletions)
@@ -274,15 +274,15 @@ def __call__(self):
gen_id = IDGen()
gen_hash = lambda: hash(torch.rand(1).item())

def dummy_node_stack(obervations):
def dummy_node_stack(observations):
return TensorDict.lazy_stack(
[
Tree(
node_data=TensorDict({"obs": torch.tensor(obs)}),
hash=gen_hash(),
node_id=gen_id(),
)
for obs in obervations
for obs in observations
]
)


test/test_transforms.py (1 addition, 1 deletion)
@@ -535,7 +535,7 @@ def test_transform_env(self, device):
low=torch.randn(2),
high=None,
)
with pytest.raises(ValueError, match="`low` must be stricly lower than `high`"):
with pytest.raises(ValueError, match="`low` must be strictly lower than `high`"):
ClipTransform(
in_keys=["observation", "reward"],
in_keys_inv=["observation_orig"],

torchrl/_utils.py (1 addition, 1 deletion)
@@ -375,7 +375,7 @@ def get_func_name(cls, fn):
elif fn_str[0].startswith("<function "):
first = fn_str[0][len("<function ") :]
else:
raise RuntimeError(f"Unkown func representation {fn}")
raise RuntimeError(f"Unknown func representation {fn}")
last = fn_str[1:]
if last:
first = [first]

torchrl/collectors/collectors.py (3 additions, 3 deletions)
@@ -663,7 +663,7 @@ def __init__(
self.closed = False

if not reset_when_done:
raise ValueError("reset_when_done is deprectated.")
raise ValueError("reset_when_done is deprecated.")
self.reset_when_done = reset_when_done
self.n_env = self.env.batch_size.numel()

@@ -2541,7 +2541,7 @@ class MultiaSyncDataCollector(_MultiDataCollector):

Environment types can be identical or different.

The collection keeps on occuring on all processes even between the time
The collection keeps on occurring on all processes even between the time
the batch of rollouts is collected and the next call to the iterator.
This class can be safely used with offline RL sota-implementations.
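A hedged sketch of the usage described here (`make_env`, `policy` and `replay_buffer` are placeholders):

from torchrl.collectors import MultiaSyncDataCollector

collector = MultiaSyncDataCollector(
    [make_env, make_env],      # one env factory per worker process
    policy,
    frames_per_batch=200,
    total_frames=10_000,
)
# batches are yielded as soon as any worker finishes; the others keep collecting meanwhile
for batch in collector:
    replay_buffer.extend(batch)
collector.shutdown()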

@@ -3130,7 +3130,7 @@ def _main_async_collector(
"the shared device (aka storing_device) is set to CPU."
)
if collected_tensordict.device is not None:
# placehoder in case we need different behaviors
# placeholder in case we need different behaviors
if collected_tensordict.device.type in ("cpu",):
collected_tensordict.share_memory_()
elif collected_tensordict.device.type in ("mps",):

torchrl/collectors/distributed/generic.py (1 addition, 1 deletion)
@@ -463,7 +463,7 @@ def __init__(
self.max_weight_update_interval = max_weight_update_interval
if self.update_after_each_batch and self.max_weight_update_interval > -1:
raise RuntimeError(
"Got conflicting udpate instructions: `update_after_each_batch` "
"Got conflicting update instructions: `update_after_each_batch` "
"`max_weight_update_interval` are incompatible."
)
self.launcher = launcher

torchrl/collectors/distributed/rpc.py (1 addition, 1 deletion)
@@ -320,7 +320,7 @@ def __init__(
self.max_weight_update_interval = max_weight_update_interval
if self.update_after_each_batch and self.max_weight_update_interval > -1:
raise RuntimeError(
"Got conflicting udpate instructions: `update_after_each_batch` "
"Got conflicting update instructions: `update_after_each_batch` "
"`max_weight_update_interval` are incompatible."
)
self.launcher = launcher