Commit

Merge branch 'main' into mgs28-char-rnn-update
mgs28 authored Sep 4, 2024
2 parents 8508a8b + 748e52b commit fb72a18
Showing 46 changed files with 795 additions and 66 deletions.
3 changes: 2 additions & 1 deletion .ci/docker/build.sh
@@ -11,8 +11,9 @@ IMAGE_NAME="$1"
shift

export UBUNTU_VERSION="20.04"
export CUDA_VERSION="12.4.1"

export BASE_IMAGE="ubuntu:${UBUNTU_VERSION}"
export BASE_IMAGE="nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${UBUNTU_VERSION}"
echo "Building ${IMAGE_NAME} Docker image"

docker build \
2 changes: 1 addition & 1 deletion .ci/docker/common/common_utils.sh
@@ -22,5 +22,5 @@ conda_run() {
}

pip_install() {
as_ci_user conda run -n py_$ANACONDA_PYTHON_VERSION pip install --progress-bar off $*
as_ci_user conda run -n py_$ANACONDA_PYTHON_VERSION pip3 install --progress-bar off $*
}
6 changes: 3 additions & 3 deletions .ci/docker/requirements.txt
@@ -30,8 +30,8 @@ pytorch-lightning
torchx
torchrl==0.5.0
tensordict==0.5.0
ax-platform>==0.4.0
nbformat>==5.9.2
ax-platform>=0.4.0
nbformat>=5.9.2
datasets
transformers
torchmultimodal-nightly # needs to be updated to stable as soon as it's available
@@ -68,4 +68,4 @@ pygame==2.1.2
pycocotools
semilearn==0.3.2
torchao==0.0.3
segment_anything==1.0
segment_anything==1.0
3 changes: 3 additions & 0 deletions .jenkins/metadata.json
@@ -28,6 +28,9 @@
"intermediate_source/model_parallel_tutorial.py": {
"needs": "linux.16xlarge.nvidia.gpu"
},
"recipes_source/torch_export_aoti_python.py": {
"needs": "linux.g5.4xlarge.nvidia.gpu"
},
"advanced_source/pendulum.py": {
"needs": "linux.g5.4xlarge.nvidia.gpu",
"_comment": "need to be here for the compiling_optimizer_lr_scheduler.py to run."
10 changes: 5 additions & 5 deletions README.md
@@ -22,6 +22,8 @@ We use sphinx-gallery's [notebook styled examples](https://sphinx-gallery.github

Here is how you can create a new tutorial (for a detailed description, see [CONTRIBUTING.md](./CONTRIBUTING.md)):

NOTE: Before submitting a new tutorial, read [PyTorch Tutorial Submission Policy](./tutorial_submission_policy.md).

1. Create a Python file. If you want it executed while inserted into documentation, save the file with the suffix `tutorial` so that the file name is `your_tutorial.py`.
2. Put it in one of the `beginner_source`, `intermediate_source`, `advanced_source` directory based on the level of difficulty. If it is a recipe, add it to `recipes_source`. For tutorials demonstrating unstable prototype features, add to the `prototype_source`.
3. For Tutorials (except if it is a prototype feature), include it in the `toctree` directive and create a `customcarditem` in [index.rst](./index.rst).
@@ -31,7 +33,7 @@ If you are starting off with a Jupyter notebook, you can use [this script](https

## Building locally

The tutorial build is very large and requires a GPU. If your machine does not have a GPU device, you can preview your HTML build without actually downloading the data and running the tutorial code:
The tutorial build is very large and requires a GPU. If your machine does not have a GPU device, you can preview your HTML build without actually downloading the data and running the tutorial code:

1. Install required dependencies by running: `pip install -r requirements.txt`.

@@ -40,8 +42,6 @@ The tutorial build is very large and requires a GPU. If your machine does not ha
- If you have a GPU-powered laptop, you can build using `make docs`. This will download the data, execute the tutorials and build the documentation to `docs/` directory. This might take about 60-120 min for systems with GPUs. If you do not have a GPU installed on your system, then see next step.
- You can skip the computationally intensive graph generation by running `make html-noplot` to build basic html documentation to `_build/html`. This way, you can quickly preview your tutorial.

> If you get **ModuleNotFoundError: No module named 'pytorch_sphinx_theme' make: *** [html-noplot] Error 2** from /tutorials/src/pytorch-sphinx-theme or /venv/src/pytorch-sphinx-theme (while using virtualenv), run `python setup.py install`.
## Building a single tutorial

You can build a single tutorial by using the `GALLERY_PATTERN` environment variable. For example to run only `neural_style_transfer_tutorial.py`, run:
@@ -59,8 +59,8 @@ The `GALLERY_PATTERN` variable respects regular expressions.
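
Since `GALLERY_PATTERN` is treated as a regular expression, one pattern can select a single tutorial or several. The following is only a minimal sketch of that selection, assuming the pattern is applied as a search against each file path; the file list and the `select_tutorials` helper are hypothetical and are not part of the repository's build code.

```python
import re

# Hypothetical file list for illustration; the real gallery has many more tutorials.
tutorial_files = [
    "advanced_source/neural_style_transfer_tutorial.py",
    "beginner_source/fgsm_tutorial.py",
    "beginner_source/saving_loading_models.py",
]


def select_tutorials(pattern, files):
    """Return the files whose path contains a match for the regex pattern."""
    regex = re.compile(pattern)
    return [path for path in files if regex.search(path)]


# A literal file name (with the dot escaped) selects one tutorial.
single = select_tutorials(r"neural_style_transfer_tutorial\.py", tutorial_files)

# An alternation selects several tutorials with one pattern.
several = select_tutorials(r"(fgsm|neural_style_transfer)_tutorial\.py", tutorial_files)

print(single)
print(several)
```

Escaping the dot keeps the pattern from matching unintended names, which matters once the pattern is no longer a plain file name.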


## About contributing to PyTorch Documentation and Tutorials
* You can find information about contributing to PyTorch documentation in the
PyTorch Repo [README.md](https://github.com/pytorch/pytorch/blob/master/README.md) file.
* You can find information about contributing to PyTorch documentation in the
PyTorch Repo [README.md](https://github.com/pytorch/pytorch/blob/master/README.md) file.
* Additional information can be found in [PyTorch CONTRIBUTING.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md).


21 changes: 21 additions & 0 deletions _static/css/custom.css
@@ -91,3 +91,24 @@
transition: none;
transform-origin: none;
}

.pytorch-left-menu-search input[type=text] {
background-image: none;
}

.gsc-control-cse {
padding-left: 0px !important;
padding-bottom: 0px !important;
}

.gsc-search-button .gsc-search-button-v2:focus {
border: transparent !important;
outline: none;
box-shadow: none;
}
.gsc-search-button-v2:active {
border: none !important;
}
.gsc-search-button-v2 {
border: none !important;
}
17 changes: 17 additions & 0 deletions _templates/layout.html
@@ -11,6 +11,23 @@
</script>
{%- endblock %}

{% block sidebartitle %}
{% if theme_display_version %}
{%- set nav_version = version %}
{% if READTHEDOCS and current_version %}
{%- set nav_version = current_version %}
{% endif %}
{% if nav_version %}
<div class="version">
{{ nav_version }}
</div>
{% endif %}
{% endif %}
<div class="searchbox">
<script async src="https://cse.google.com/cse.js?cx=e65585f8c3ea1440e"></script>
<div class="gcse-search"></div>
</div>
{% endblock %}

{% block footer %}
{{ super() }}
2 changes: 2 additions & 0 deletions advanced_source/cpp_custom_ops.rst
@@ -174,6 +174,8 @@ To add ``torch.compile`` support for an operator, we must add a FakeTensor kerne
known as a "meta kernel" or "abstract impl"). FakeTensors are Tensors that have
metadata (such as shape, dtype, device) but no data: the FakeTensor kernel for an
operator specifies how to compute the metadata of output tensors given the metadata of input tensors.
The FakeTensor kernel should return dummy Tensors of your choice with
the correct Tensor metadata (shape/strides/``dtype``/device).

We recommend that this be done from Python via the `torch.library.register_fake` API,
though it is possible to do this from C++ as well (see
3 changes: 2 additions & 1 deletion advanced_source/dynamic_quantization_tutorial.py
@@ -151,7 +151,8 @@ def tokenize(self, path):
model.load_state_dict(
torch.load(
model_data_filepath + 'word_language_model_quantize.pth',
map_location=torch.device('cpu')
map_location=torch.device('cpu'),
weights_only=True
)
)

15 changes: 10 additions & 5 deletions advanced_source/python_custom_ops.py
@@ -66,7 +66,7 @@ def display(img):
######################################################################
# ``crop`` is not handled effectively out-of-the-box by
# ``torch.compile``: ``torch.compile`` induces a
# `"graph break" <https://pytorch.org/docs/stable/torch.compiler_faq.html#graph-breaks>`_
# `"graph break" <https://pytorch.org/docs/stable/torch.compiler_faq.html#graph-breaks>`_
# on functions it is unable to handle and graph breaks are bad for performance.
# The following code demonstrates this by raising an error
# (``torch.compile`` with ``fullgraph=True`` raises an error if a
@@ -85,9 +85,9 @@ def f(img):
#
# 1. wrap the function into a PyTorch custom operator.
# 2. add a "``FakeTensor`` kernel" (aka "meta kernel") to the operator.
# Given the metadata (e.g. shapes)
# of the input Tensors, this function says how to compute the metadata
# of the output Tensor(s).
# Given some ``FakeTensors`` inputs (dummy Tensors that don't have storage),
# this function should return dummy Tensors of your choice with the correct
# Tensor metadata (shape/strides/``dtype``/device).


from typing import Sequence
@@ -130,6 +130,11 @@ def f(img):
# ``autograd.Function`` with PyTorch operator registration APIs can lead to (and
# has led to) silent incorrectness when composed with ``torch.compile``.
#
# If you don't need training support, there is no need to use
# ``torch.library.register_autograd``.
# If you end up training with a ``custom_op`` that doesn't have an autograd
# registration, we'll raise an error message.
#
# The gradient formula for ``crop`` is essentially ``PIL.paste`` (we'll leave the
# derivation as an exercise to the reader). Let's first wrap ``paste`` into a
# custom operator:
@@ -203,7 +208,7 @@ def setup_context(ctx, inputs, output):
######################################################################
# Mutable Python Custom operators
# -------------------------------
# You can also wrap a Python function that mutates its inputs into a custom
# You can also wrap a Python function that mutates its inputs into a custom
# operator.
# Functions that mutate inputs are common because that is how many low-level
# kernels are written; for example, a kernel that computes ``sin`` may take in
2 changes: 1 addition & 1 deletion advanced_source/static_quantization_tutorial.rst
@@ -286,7 +286,7 @@ We next define several helper functions to help with model evaluation. These mos
def load_model(model_file):
model = MobileNetV2()
state_dict = torch.load(model_file)
state_dict = torch.load(model_file, weights_only=True)
model.load_state_dict(state_dict)
model.to('cpu')
return model
2 changes: 1 addition & 1 deletion beginner_source/basics/quickstart_tutorial.py
@@ -216,7 +216,7 @@ def test(dataloader, model, loss_fn):
# the state dictionary into it.

model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))
model.load_state_dict(torch.load("model.pth", weights_only=True))

#############################################################
# This model can now be used to make predictions.
16 changes: 13 additions & 3 deletions beginner_source/basics/saveloadrun_tutorial.py
@@ -32,9 +32,14 @@
##########################
# To load model weights, you need to create an instance of the same model first, and then load the parameters
# using ``load_state_dict()`` method.
#
# In the code below, we set ``weights_only=True`` to limit the
# functions executed during unpickling to only those necessary for
# loading weights. Using ``weights_only=True`` is considered
# a best practice when loading weights.

model = models.vgg16() # we do not specify ``weights``, i.e. create untrained model
model.load_state_dict(torch.load('model_weights.pth'))
model.load_state_dict(torch.load('model_weights.pth', weights_only=True))
model.eval()

###########################
@@ -50,9 +55,14 @@
torch.save(model, 'model.pth')

########################
# We can then load the model like this:
# We can then load the model as demonstrated below.
#
# As described in `Saving and loading torch.nn.Modules <https://pytorch.org/docs/main/notes/serialization.html#saving-and-loading-torch-nn-modules>`__,
# saving the ``state_dict`` is considered the best practice. However,
# below we use ``weights_only=False`` because this involves loading the
# whole model, which is a legacy use case for ``torch.save``.

model = torch.load('model.pth')
model = torch.load('model.pth', weights_only=False)

########################
# .. note:: This approach uses Python `pickle <https://docs.python.org/3/library/pickle.html>`_ module when serializing the model, thus it relies on the actual class definition to be available when loading the model.
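
The `weights_only=True` change in this file limits what may run during unpickling. As a rough illustration of that idea only, the sketch below uses the standard library's `pickle.Unpickler.find_class` hook with an allowlist. It is not how `torch.load` is implemented; `SAFE_GLOBALS`, `RestrictedUnpickler`, `restricted_loads`, and `Malicious` are hypothetical names introduced for this example.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist for this sketch; torch.load keeps its own, different list.
SAFE_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not on the allowlist."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Deserialize bytes, allowing only allowlisted globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Malicious:
    """Stand-in for an untrusted payload: a plain unpickle would call print()."""

    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))


# A plain mapping of names to lists of numbers loads without trouble.
weights = OrderedDict(layer1=[0.1, 0.2], layer2=[0.3])
loaded = restricted_loads(pickle.dumps(weights))

# The payload is rejected when the unpickler tries to resolve builtins.print.
try:
    restricted_loads(pickle.dumps(Malicious()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Loading a whole model with `weights_only=False` gives up this kind of restriction, which is why it should be reserved for files from a trusted source.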
2 changes: 1 addition & 1 deletion beginner_source/blitz/cifar10_tutorial.py
@@ -221,7 +221,7 @@ def forward(self, x):
# wasn't necessary here, we only did it to illustrate how to do so):

net = Net()
net.load_state_dict(torch.load(PATH))
net.load_state_dict(torch.load(PATH, weights_only=True))

########################################################################
# Okay, now let us see what the neural network thinks these examples above are:
3 changes: 1 addition & 2 deletions beginner_source/chatbot_tutorial.py
@@ -84,8 +84,7 @@
# Preparations
# ------------
#
# To start, Download the data ZIP file
# `here <https://zissou.infosci.cornell.edu/convokit/datasets/movie-corpus/movie-corpus.zip>`__
# To get started, `download <https://zissou.infosci.cornell.edu/convokit/datasets/movie-corpus/movie-corpus.zip>`__ the Movie-Dialogs Corpus zip file.

# and put in a ``data/`` directory under the current directory.
#
4 changes: 4 additions & 0 deletions beginner_source/deeplabv3_on_android.rst
@@ -5,6 +5,10 @@ Image Segmentation DeepLabV3 on Android

**Reviewed by**: `Jeremiah Chung <https://github.com/jeremiahschung>`_

.. warning::
PyTorch Mobile is no longer actively supported. Please check out `ExecuTorch <https://pytorch.org/executorch-overview>`_, PyTorch’s all-new on-device inference library. You can also review our `end-to-end workflows <https://github.com/pytorch/executorch/tree/main/examples/portable#readme>`_ and review the `source code for DeepLabV3 <https://github.com/pytorch/executorch/tree/main/examples/models/deeplab_v3>`_.


Introduction
------------

2 changes: 1 addition & 1 deletion beginner_source/fgsm_tutorial.py
@@ -192,7 +192,7 @@ def forward(self, x):
model = Net().to(device)

# Load the pretrained model
model.load_state_dict(torch.load(pretrained_model, map_location=device))
model.load_state_dict(torch.load(pretrained_model, map_location=device, weights_only=True))

# Set the model in evaluation mode. In this case this is for the Dropout layers
model.eval()
16 changes: 8 additions & 8 deletions beginner_source/saving_loading_models.py
@@ -153,7 +153,7 @@
# .. code:: python
#
# model = TheModelClass(*args, **kwargs)
# model.load_state_dict(torch.load(PATH))
# model.load_state_dict(torch.load(PATH, weights_only=True))
# model.eval()
#
# .. note::
@@ -206,7 +206,7 @@
# .. code:: python
#
# # Model class must be defined somewhere
# model = torch.load(PATH)
# model = torch.load(PATH, weights_only=False)
# model.eval()
#
# This save/load process uses the most intuitive syntax and involves the
@@ -290,7 +290,7 @@
# model = TheModelClass(*args, **kwargs)
# optimizer = TheOptimizerClass(*args, **kwargs)
#
# checkpoint = torch.load(PATH)
# checkpoint = torch.load(PATH, weights_only=True)
# model.load_state_dict(checkpoint['model_state_dict'])
# optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
# epoch = checkpoint['epoch']
@@ -354,7 +354,7 @@
# optimizerA = TheOptimizerAClass(*args, **kwargs)
# optimizerB = TheOptimizerBClass(*args, **kwargs)
#
# checkpoint = torch.load(PATH)
# checkpoint = torch.load(PATH, weights_only=True)
# modelA.load_state_dict(checkpoint['modelA_state_dict'])
# modelB.load_state_dict(checkpoint['modelB_state_dict'])
# optimizerA.load_state_dict(checkpoint['optimizerA_state_dict'])
@@ -407,7 +407,7 @@
# .. code:: python
#
# modelB = TheModelBClass(*args, **kwargs)
# modelB.load_state_dict(torch.load(PATH), strict=False)
# modelB.load_state_dict(torch.load(PATH, weights_only=True), strict=False)
#
# Partially loading a model or loading a partial model are common
# scenarios when transfer learning or training a new complex model.
@@ -446,7 +446,7 @@
#
# device = torch.device('cpu')
# model = TheModelClass(*args, **kwargs)
# model.load_state_dict(torch.load(PATH, map_location=device))
# model.load_state_dict(torch.load(PATH, map_location=device, weights_only=True))
#
# When loading a model on a CPU that was trained with a GPU, pass
# ``torch.device('cpu')`` to the ``map_location`` argument in the
@@ -469,7 +469,7 @@
#
# device = torch.device("cuda")
# model = TheModelClass(*args, **kwargs)
# model.load_state_dict(torch.load(PATH))
# model.load_state_dict(torch.load(PATH, weights_only=True))
# model.to(device)
# # Make sure to call input = input.to(device) on any input tensors that you feed to the model
#
@@ -497,7 +497,7 @@
#
# device = torch.device("cuda")
# model = TheModelClass(*args, **kwargs)
# model.load_state_dict(torch.load(PATH, map_location="cuda:0")) # Choose whatever GPU device number you want
# model.load_state_dict(torch.load(PATH, weights_only=True, map_location="cuda:0")) # Choose whatever GPU device number you want
# model.to(device)
# # Make sure to call input = input.to(device) on any input tensors that you feed to the model
#
2 changes: 1 addition & 1 deletion beginner_source/transfer_learning_tutorial.py
@@ -209,7 +209,7 @@ def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
print(f'Best val Acc: {best_acc:4f}')

# load best model weights
model.load_state_dict(torch.load(best_model_params_path))
model.load_state_dict(torch.load(best_model_params_path, weights_only=True))
return model


3 changes: 2 additions & 1 deletion en-wordlist.txt
@@ -2,6 +2,7 @@
ACL
ADI
AOT
AOTInductor
APIs
ATen
AVX
@@ -624,4 +625,4 @@ warmstarting
warmup
webp
wsi
wsis
wsis
4 changes: 1 addition & 3 deletions intermediate_source/TP_tutorial.rst
@@ -83,8 +83,6 @@ To see how to utilize DeviceMesh to set up multi-dimensional parallelisms, pleas

.. code-block:: python
# run this via torchrun: torchrun --standalone --nproc_per_node=8 ./tp_tutorial.py
from torch.distributed.device_mesh import init_device_mesh
tp_mesh = init_device_mesh("cuda", (8,))
@@ -360,4 +358,4 @@ Conclusion
This tutorial demonstrates how to train a large Transformer-like model across hundreds to thousands of GPUs using Tensor Parallel in combination with Fully Sharded Data Parallel.
It explains how to apply Tensor Parallel to different parts of the model, with **no code changes** to the model itself. Tensor Parallel is an efficient model parallelism technique for large scale training.

To see the complete end to end code example explained in this tutorial, please refer to the `Tensor Parallel examples <https://github.com/pytorch/examples/blob/main/distributed/tensor_parallelism/fsdp_tp_example.py>`__ in the pytorch/examples repository.
To see the complete end-to-end code example explained in this tutorial, please refer to the `Tensor Parallel examples <https://github.com/pytorch/examples/blob/main/distributed/tensor_parallelism/fsdp_tp_example.py>`__ in the pytorch/examples repository.