diff --git a/.github/workflows/check_markdown_links.yaml b/.github/workflows/check_markdown_links.yaml
new file mode 100644
index 0000000..dd33901
--- /dev/null
+++ b/.github/workflows/check_markdown_links.yaml
@@ -0,0 +1,14 @@
+name: Check Markdown links
+on:
+  pull_request:
+    branches:
+      - main
+
+jobs:
+  markdown-link-check:
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v4
+    - uses: gaurav-nelson/github-action-markdown-link-check@v1
+      with:
+        use-quiet-mode: 'yes'
diff --git a/README.md b/README.md
index efd1adc..e3e6143 100644
--- a/README.md
+++ b/README.md
@@ -21,7 +21,7 @@ This repository contains a variety of Determined examples that are not actively
 | [cifar10\_pytorch\_inference](computer_vision/cifar10_pytorch_inference)     | CIFAR-10                     | PyTorch                                  |
 | [cifar10\_tf\_keras](computer_vision/cifar10_tf_keras)                       | CIFAR-10                     | TensorFlow (tf.keras)                    |
 | [fasterrcnn\_coco\_pytorch](computer_vision/fasterrcnn_coco_pytorch)         | Penn-Fudan Dataset           | PyTorch                                  |
-| [mmdetection\_pytorch](computer_vision/mmdetection_pytorch)                  | COCO                         | PyTorch                                  |
+| [mmdetection](model_hub/mmdetection)                                         | COCO                         | PyTorch                                  |
 | [detr\_coco\_pytorch](computer_vision/detr_coco_pytorch)                     | COCO                         | PyTorch                                  |
 | [deformabledetr\_coco\_pytorch](computer_vision/deformabledetr_coco_pytorch) | COCO                         | PyTorch                                  |
 | [iris\_tf\_keras](computer_vision/iris_tf_keras)                             | Iris Dataset                 | TensorFlow (tf.keras)                    |
diff --git a/blog/python_sdk_demo/README.md b/blog/python_sdk_demo/README.md
index 1248095..1d3aa7a 100644
--- a/blog/python_sdk_demo/README.md
+++ b/blog/python_sdk_demo/README.md
@@ -42,4 +42,4 @@ python determined_sdk_demo.py
 - [Wesley Turner](https://github.com/wes-turner)
 - [Kevin Musgrave](https://github.com/KevinMusgrave)
 
-The code in the `medmnist_model` directory is based on the [`determined_medmnist_e2e`](https://github.com/ighodgao/determined_medmnist_e2e) repo by [Isha Ghodgaonkar](ighodgao).
\ No newline at end of file
+The code in the `medmnist_model` directory is based on the [`determined_medmnist_e2e`](https://github.com/ighodgao/determined_medmnist_e2e) repo by [Isha Ghodgaonkar](https://github.com/ighodgao).
\ No newline at end of file
diff --git a/computer_vision/byol_pytorch/README.md b/computer_vision/byol_pytorch/README.md
index f90f502..740a233 100644
--- a/computer_vision/byol_pytorch/README.md
+++ b/computer_vision/byol_pytorch/README.md
@@ -3,7 +3,7 @@
 This example shows how to perform self-supervised image classifier training with BYOL using
 Determined's PyTorch API.  This example is based on the [byol-pytorch](https://github.com/lucidrains/byol-pytorch/tree/master/byol_pytorch) package.
 
-Original BYOL paper: https://arxiv.org/abs/2006.0
+Original BYOL paper: https://arxiv.org/abs/2006.07733
 
 Code and configuration details also sourced from the following BYOL implementations:
   - (JAX, paper authors) https://github.com/deepmind/deepmind-research/tree/master/byol
diff --git a/computer_vision/cifar10_pytorch/README.md b/computer_vision/cifar10_pytorch/README.md
index a27df7f..68c464b 100644
--- a/computer_vision/cifar10_pytorch/README.md
+++ b/computer_vision/cifar10_pytorch/README.md
@@ -2,7 +2,7 @@
 
 This example shows how to build a simple CNN on the CIFAR-10 dataset using
 Determined's PyTorch API. This example is adapted from this [Keras CNN
-example](https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py).
+example](https://github.com/keras-team/keras/blob/keras-2/examples/cifar10_cnn.py).
 
 ## Files
 * **model_def.py**: The core code for the model. This includes building and compiling the model.
diff --git a/computer_vision/cifar10_pytorch/model_def.py b/computer_vision/cifar10_pytorch/model_def.py
index 253fc4f..3e7193f 100644
--- a/computer_vision/cifar10_pytorch/model_def.py
+++ b/computer_vision/cifar10_pytorch/model_def.py
@@ -1,6 +1,6 @@
 """
 CNN on Cifar10 from Keras example:
-https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py
+https://github.com/keras-team/keras/blob/keras-2/examples/cifar10_cnn.py
 """
 import os
 import tempfile
diff --git a/computer_vision/cifar10_tf_keras/README.md b/computer_vision/cifar10_tf_keras/README.md
index 2b02dc0..8902b33 100644
--- a/computer_vision/cifar10_tf_keras/README.md
+++ b/computer_vision/cifar10_tf_keras/README.md
@@ -2,7 +2,7 @@
 
 This example shows how to build a simple CNN on the CIFAR-10 dataset using
 Determined's tf.keras API. This example is adapted from this [Keras CNN
- example](https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py).
+ example](https://github.com/keras-team/keras/blob/keras-2/examples/cifar10_cnn.py).
 
 ## Files
 * **model_def.py**: Organizes the model and data-loaders into the Determined TFKerasTrial API.
diff --git a/computer_vision/cifar10_tf_keras/cifar_model.py b/computer_vision/cifar10_tf_keras/cifar_model.py
index a74f2ea..fd9c5eb 100644
--- a/computer_vision/cifar10_tf_keras/cifar_model.py
+++ b/computer_vision/cifar10_tf_keras/cifar_model.py
@@ -1,6 +1,6 @@
 """
 Original CIFAR-10 CNN Keras model code from:
-https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py.
+https://github.com/keras-team/keras/blob/keras-2/examples/cifar10_cnn.py.
 """
 
 import numpy as np
diff --git a/computer_vision/cifar10_tf_keras/model_def.py b/computer_vision/cifar10_tf_keras/model_def.py
index da1ad7f..84b738a 100644
--- a/computer_vision/cifar10_tf_keras/model_def.py
+++ b/computer_vision/cifar10_tf_keras/model_def.py
@@ -10,7 +10,7 @@
     https://docs.determined.ai/latest/reference/api/keras.html
     https://www.tensorflow.org/guide/keras
 
-Based on: https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py
+Based on: https://github.com/keras-team/keras/blob/keras-2/examples/cifar10_cnn.py
 
 """
 from typing import Generator, List, Tuple
diff --git a/deepspeed/cifar10_cpu_offloading/README.md b/deepspeed/cifar10_cpu_offloading/README.md
index 44e0531..ba57729 100644
--- a/deepspeed/cifar10_cpu_offloading/README.md
+++ b/deepspeed/cifar10_cpu_offloading/README.md
@@ -1,6 +1,6 @@
 # DeepSpeed CPU Offloading example
 This example is adapted from the 
-[CIFAR example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/cifar) 
+[CIFAR example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/training/cifar) 
 repository. It is intended to show how to configure 
 [ZeRO Stage 3 with CPU offloading](https://www.deepspeed.ai/tutorials/zero/) for a simple CNN network.
 
diff --git a/deepspeed/cifar10_moe/README.md b/deepspeed/cifar10_moe/README.md
index 39b9ca4..4219d49 100644
--- a/deepspeed/cifar10_moe/README.md
+++ b/deepspeed/cifar10_moe/README.md
@@ -1,6 +1,6 @@
 # DeepSpeed CIFAR Example
 This example is adapted from the 
-[CIFAR example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/cifar) 
+[CIFAR example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/training/cifar) 
 repository. It is intended to demonstrate a simple usecase of DeepSpeed with Determined.
 
 ## Files
diff --git a/deepspeed/deepspeed_dcgan/README.md b/deepspeed/deepspeed_dcgan/README.md
index de517f2..25f742b 100644
--- a/deepspeed/deepspeed_dcgan/README.md
+++ b/deepspeed/deepspeed_dcgan/README.md
@@ -1,6 +1,6 @@
 # DeepSpeed CIFAR Example
 This example is adapted from the
-[DCGAN example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/gan)
+[DCGAN example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/training/gan)
 repository. It is intended to demonstrate a simple usecase of DeepSpeed with Determined.
 
 ## Files
diff --git a/deepspeed/pipeline_parallelism/README.md b/deepspeed/pipeline_parallelism/README.md
index 51ee275..1c42293 100644
--- a/deepspeed/pipeline_parallelism/README.md
+++ b/deepspeed/pipeline_parallelism/README.md
@@ -1,6 +1,6 @@
 # DeepSpeed CIFAR Example
 This example is adapted from the 
-[pipeline parallelism example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/pipeline_parallelism) 
+[pipeline parallelism example in the DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/training/pipeline_parallelism) 
 repository. It is intended to demonstrate a simple usecase of DeepSpeed's PipelineEngine with Determined.
 
 ## Files
diff --git a/gan/cyclegan/README.md b/gan/cyclegan/README.md
index 07bd4b0..d143b55 100644
--- a/gan/cyclegan/README.md
+++ b/gan/cyclegan/README.md
@@ -6,7 +6,7 @@ Determined implementation of [CycleGAN](https://github.com/eriklindernoren/PyTor
 
 - **determined_model_def.py** is the user model definition file for Determined-managed training. 
   This file is ported from **cyclegan.py** to implement the 
-  [Determined Pytorch API](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial).
+  [Determined PyTorch API](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial).
 - **const.yaml** is the user configuration for Determined-managed training.
 - **startup-hook.sh** is the startup bash script that is used for downloading and 
   extracting the training data.
@@ -56,5 +56,5 @@ hyperparameters:
 * The throughput is unstable due to inter-node communication when the global batch size 
   is 64 and the aggregation frequency is 1. Use a larger batch size or a larger aggregation 
   frequency to increase the scaling efficiency of the throughput. See 
-  [Effective Distributed Training](https://docs.determined.ai/latest/topic-guides/effective-distributed-training.html#effective-distributed-training)
+  [Distributed Training Performance Optimization](https://docs.determined.ai/latest/model-dev-guide/dtrain/dtrain-introduction.html#performance-optimization)
   for details.
diff --git a/gan/cyclegan/cyclegan.py b/gan/cyclegan/cyclegan.py
index b08027e..28c07ca 100644
--- a/gan/cyclegan/cyclegan.py
+++ b/gan/cyclegan/cyclegan.py
@@ -1,6 +1,6 @@
 """
 This file is not used in the Determined-managed training. We port this file to use the
-Determined Pytorch API (https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial).
+Determined PyTorch API (https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial).
 """
 import argparse
 import os
diff --git a/gan/gan_mnist_pytorch/README.md b/gan/gan_mnist_pytorch/README.md
index b8280d7..3dd616f 100644
--- a/gan/gan_mnist_pytorch/README.md
+++ b/gan/gan_mnist_pytorch/README.md
@@ -2,7 +2,7 @@
 
 This example demonstrates how to build a simple GAN on the MNIST dataset using
 Determined's PyTorch API. This example is adapted from this [PyTorch Lightning GAN
-example](https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/domain_templates/generative_adversarial_net.py).
+example](https://github.com/Lightning-AI/pytorch-lightning/blob/master/examples/pytorch/domain_templates/generative_adversarial_net.py).
 
 ## Files
 * **model_def.py**: The core code for the model. This includes building and compiling the model.
diff --git a/gan/pix2pix_tf_keras/README.md b/gan/pix2pix_tf_keras/README.md
index ddd0bf8..de00d1e 100644
--- a/gan/pix2pix_tf_keras/README.md
+++ b/gan/pix2pix_tf_keras/README.md
@@ -40,6 +40,6 @@ The following image was created using a model trained with `const.yaml`.
 
 ## Results
 The following plots show differences in performance made when utilizing GPUs with the Determined distributed system.
-![Cumulative Batches vs. Time](./images/batches_vs_time.png)
-![Training Loss vs. Time](./images/training_loss_vs_time.png)
-![Validation Loss vs. Time](./images/validation_loss_vs_time.png)
\ No newline at end of file
+![Cumulative Batches vs. Time](./images/batches_vs_time.jpg)
+![Training Loss vs. Time](./images/training_loss_vs_time.jpg)
+![Validation Loss vs. Time](./images/validation_loss_vs_time.jpg)
\ No newline at end of file
diff --git a/gan/pix2pix_tf_keras/images/batches_vs_time.jpg b/gan/pix2pix_tf_keras/images/batches_vs_time.jpg
new file mode 100644
index 0000000..28bfc92
Binary files /dev/null and b/gan/pix2pix_tf_keras/images/batches_vs_time.jpg differ
diff --git a/gan/pix2pix_tf_keras/images/training_loss_vs_time.jpg b/gan/pix2pix_tf_keras/images/training_loss_vs_time.jpg
new file mode 100644
index 0000000..5ac90f2
Binary files /dev/null and b/gan/pix2pix_tf_keras/images/training_loss_vs_time.jpg differ
diff --git a/gan/pix2pix_tf_keras/images/validation_loss_vs_time.jpg b/gan/pix2pix_tf_keras/images/validation_loss_vs_time.jpg
new file mode 100644
index 0000000..94a91d6
Binary files /dev/null and b/gan/pix2pix_tf_keras/images/validation_loss_vs_time.jpg differ
diff --git a/model_hub/huggingface/language-modeling/README.md b/model_hub/huggingface/language-modeling/README.md
index dc698d6..4afdb43 100644
--- a/model_hub/huggingface/language-modeling/README.md
+++ b/model_hub/huggingface/language-modeling/README.md
@@ -4,9 +4,9 @@ The examples here mirror the [language-modeling examples](https://github.com/hug
 You can finetune GPT and GPT-2 with the causal language model (CLM); ALBERT, BERT, DistilBERT, and RoBERTa with the masked language model(MLM); and XLNet with the permutation language model (PLM).
 
 ## Files
-* **clm_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for CLM. A few class methods are overwritten and specialized for text classification but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
-* **mlm_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for MLM. A few class methods are overwritten and specialized for text classification but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
-* **plm_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for PLM. A few class methods are overwritten and specialized for text classification but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
+* **clm_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for CLM. A few class methods are overridden and specialized for causal language modeling, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
+* **mlm_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for MLM. A few class methods are overridden and specialized for masked language modeling, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
+* **plm_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for PLM. A few class methods are overridden and specialized for permutation language modeling, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
 
 ### Configuration Files
 * **clm_config.yaml**: Experiment configuration for finetuning on WikiText-2 with GPT2.  
diff --git a/model_hub/huggingface/multiple-choice/README.md b/model_hub/huggingface/multiple-choice/README.md
index d818db8..81e9d8c 100644
--- a/model_hub/huggingface/multiple-choice/README.md
+++ b/model_hub/huggingface/multiple-choice/README.md
@@ -2,7 +2,7 @@
 This example mirrors the [multiple choice example](https://github.com/huggingface/transformers/tree/master/examples/pytorch/multiple-choice) from the original huggingface transformers repo for finetuning on the SWAG dataset.
 
 ## Files
-* **swag_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for this example. A few class methods are overwritten and specialized for named-entity recognition but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
+* **swag_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for this example. A few class methods are overridden and specialized for multiple choice, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
 * **data.py**: Modifies ``DataCollatorForMultipleChoice`` to work with PyTorchTrial.
 
 ### Configuration FIles
diff --git a/model_hub/huggingface/question-answering/README.md b/model_hub/huggingface/question-answering/README.md
index 760395f..0a06935 100644
--- a/model_hub/huggingface/question-answering/README.md
+++ b/model_hub/huggingface/question-answering/README.md
@@ -2,9 +2,9 @@
 The examples here mirror the [question answering examples](https://github.com/huggingface/transformers/tree/master/examples/pytorch/question-answering) from the original huggingface transformers repo.
 
 ## Files
-* **qa_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for question-answering on SQuAD. A few class methods are overwritten and specialized for text classification but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
+* **qa_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for question-answering on SQuAD. A few class methods are overridden and specialized for question answering, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
 * **data.py**: data pre and post-processing for question answering.
-* **qa_beam_search_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for question-answering with beam search. A few class methods are overwritten and specialized for text classification but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
+* **qa_beam_search_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for question-answering with beam search. A few class methods are overridden and specialized for question answering, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
 * **data_beam_search.py**: data pre and post-processing for question answering with beam search.
 
 ### Configuration Files
diff --git a/model_hub/huggingface/text-classification/README.md b/model_hub/huggingface/text-classification/README.md
index 384f667..bf2f8d0 100644
--- a/model_hub/huggingface/text-classification/README.md
+++ b/model_hub/huggingface/text-classification/README.md
@@ -4,7 +4,7 @@ The two examples here mirror the [text-classification examples](https://github.c
 ## GLUE Benchmark
 
 ### Files
-* **glue_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for this example. A few class methods are overwritten and specialized for text classification but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
+* **glue_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for this example. A few class methods are overridden and specialized for text classification, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
 
 #### Configuration Files
 * **glue_config.yaml**: Experiment configuration for finetuning on GLUE datasets with Bert.  
@@ -32,7 +32,7 @@ Using the provided experiment config `glue_config.yaml` yields a Matthew's Corre
 ## XNLI Dataset
 
 ### Files
-* **xnli_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for this example. A few class methods are overwritten and specialized for text classification on XNLI but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
+* **xnli_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for this example. A few class methods are overridden and specialized for text classification on XNLI, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
 
 #### Configuration Files
 * **xnli_config.yaml**: Experiment configuration for finetuning on the XNLI dataset with Bert.
diff --git a/model_hub/huggingface/token-classification/README.md b/model_hub/huggingface/token-classification/README.md
index f53e53d..9f873ac 100644
--- a/model_hub/huggingface/token-classification/README.md
+++ b/model_hub/huggingface/token-classification/README.md
@@ -2,7 +2,7 @@
 This example mirrors the [token-classification example](https://github.com/huggingface/transformers/tree/master/examples/pytorch/token-classification) from the original huggingface transformers repo for named-entity recognition.
 
 ## Files
-* **ner_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/reference/api/pytorch.html#pytorch-trial) for this example. A few class methods are overwritten and specialized for named-entity recognition but otherwise the behavior is the same as the [BaseTransformerTrial class](../model_hub/transformers/_trial.py).
+* **ner_trial.py**: The [PyTorchTrial definition](https://docs.determined.ai/latest/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.html#pytorch-trial) for this example. A few class methods are overridden and specialized for named-entity recognition, but otherwise the behavior is the same as the [BaseTransformerTrial class](https://github.com/determined-ai/determined/blob/main/model_hub/model_hub/huggingface/_trial.py).
 * **ner_utils.py**: Utility functions for NER largely extracted from [run_ner.py](https://github.com/huggingface/transformers/tree/master/examples/pytorch/token-classification/run_ner.py) to separate example code from Determined code.
 
 ### Configuration Files
diff --git a/model_hub/mmdetection/README.md b/model_hub/mmdetection/README.md
index 45bd840..f5de334 100644
--- a/model_hub/mmdetection/README.md
+++ b/model_hub/mmdetection/README.md
@@ -2,7 +2,7 @@
 We have provided a default [Determined experiment configuration for Mask-RCNN on COCO 2017](./maskrcnn.yaml) as the starting point.
 
 ## Configuring MMDetection
-MMDetection has [its own configuration system](https://mmdetection.readthedocs.io/en/latest/tutorials/config.html) for specifying the dataset, model, optimizer, and other objects used during training.  Customizing MMDetection primarily amounts to specifying
+MMDetection has [its own configuration system](https://mmdetection.readthedocs.io/en/v2.27.0/tutorials/config.html) for specifying the dataset, model, optimizer, and other objects used during training.  Customizing MMDetection primarily amounts to specifying
 your own MMDetection configs and we preserve this when using MMDetection with Determined.  Hence, similar to directly using MMDetection, you will need to be familiar with MMDetection's configurable objects in order to get more custom behavior.
 
 We will cover how to specify and manipulate a *MMDetection config* in the experiment configuration below.  
@@ -56,7 +56,7 @@ field to turn on gradient clipping and mixed precision.
 
 ### Other experiment config fields
 #### Data backends
-We support `s3`, and `gcs` backends in addition to the [file client backends supported by MMCV](https://mmcv.readthedocs.io/en/latest/_modules/mmcv/fileio/file_client.html#FileClient) so you can easily access data in cloud storage buckets.  Note that MMDetection expects the data to follow [a particular structure for standard datasets](https://mmdetection.readthedocs.io/en/latest/1_exist_data_model.html#test-existing-models-on-standard-datasets) like COCO, Pascal, Cityscapes.
+We support `s3` and `gcs` backends in addition to the [file client backends supported by MMCV](https://mmcv.readthedocs.io/en/v1.7.0/_modules/mmcv/fileio/file_client.html#FileClient) so you can easily access data in cloud storage buckets.  Note that MMDetection expects the data to follow [a particular structure for standard datasets](https://mmdetection.readthedocs.io/en/v2.27.0/1_exist_data_model.html#test-existing-models-on-standard-datasets) such as COCO, Pascal, and Cityscapes.
 
 You can change the backend by modifying the `data.file_client_args` section of the experiment config.
 
@@ -67,10 +67,10 @@ The `hyperparameters.global_batch_size` field of the Determined experiment confi
 MMDetection provides pretrained checkpoints corresponding to some of the configurations as listed in the `README` files for each model type.  If a pretrained weight is available for the specified `config_file`, you can warmstart the model with these weights by setting `hyperparameters.use_pretrained` to `true`.  
 
 ## Using custom dataset
-Training MMDetection on custom datasets in Determined is largely the same process as doing so directly with MMDetection.  Please follow [this guide](https://mmdetection.readthedocs.io/en/latest/tutorials/customize_dataset.html) from MMDetection to register your own datasets.
+Training MMDetection on custom datasets in Determined is largely the same process as doing so directly with MMDetection.  Please follow [this guide](https://mmdetection.readthedocs.io/en/v2.27.0/tutorials/customize_dataset.html) from MMDetection to register your own datasets.
 
 ## Adding classes
-Creating custom classes for models, otpimizers, losses, and other MMDetection objects also requires following the same process as you would normally.  Please see the [MMDetection tutorials](https://mmdetection.readthedocs.io/en/latest/tutorials/customize_models.html) for more info.
+Creating custom classes for models, optimizers, losses, and other MMDetection objects follows the same process as it would when using MMDetection directly.  Please see the [MMDetection tutorials](https://mmdetection.readthedocs.io/en/v2.27.0/tutorials/customize_models.html) for more info.
 
 ## Results
 The validation bounding box mAP for Faster-RCNN is shown in the image below.
@@ -84,4 +84,4 @@ under `docs/install-admin.html` or at https://docs.determined.ai/latest/index.ht
 Make sure the environment variable `DET_MASTER` is set to your cluster URL.
 Then you run the following command from the command line: `det experiment create -f <experiment_config> .`. 
 
-For modular and composable configuration, please check out how to [use MMDetection with Facebook's hydra](./hydra/README.md`).
+For modular and composable configuration, please check out how to [use MMDetection with Facebook's Hydra](./hydra/README.md).
diff --git a/model_hub/mmdetection/fasterrcnn.yaml b/model_hub/mmdetection/fasterrcnn.yaml
index d112fb8..c587c40 100644
--- a/model_hub/mmdetection/fasterrcnn.yaml
+++ b/model_hub/mmdetection/fasterrcnn.yaml
@@ -19,7 +19,7 @@ hyperparameters:
   merge_config: null # You can specify a config you want to merge into the config_file above.
   use_pretrained: false # Whether to load pretrained weights for config if available.
   override_mmdet_config:
-    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/latest/tutorials/config.html #####
+    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/v2.27.0/tutorials/config.html #####
     ##### You can specify gradient clipping with below #####
     optimizer_config._delete_: true
     optimizer_config.grad_clip.max_norm: 100
diff --git a/model_hub/mmdetection/maskrcnn.yaml b/model_hub/mmdetection/maskrcnn.yaml
index 2139652..7049ef1 100644
--- a/model_hub/mmdetection/maskrcnn.yaml
+++ b/model_hub/mmdetection/maskrcnn.yaml
@@ -19,7 +19,7 @@ hyperparameters:
   merge_config: null # You can specify a config you want to merge into the config_file above.
   use_pretrained: false # Whether to load pretrained weights for config if available.
   override_mmdet_config:
-    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/latest/tutorials/config.html #####
+    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/v2.27.0/tutorials/config.html #####
     ##### You can specify gradient clipping with below #####
     optimizer_config._delete_: true
     optimizer_config.grad_clip.max_norm: 100
diff --git a/model_hub/mmdetection/panoptic_fpn.yaml b/model_hub/mmdetection/panoptic_fpn.yaml
index e7f2d62..7cdbd13 100644
--- a/model_hub/mmdetection/panoptic_fpn.yaml
+++ b/model_hub/mmdetection/panoptic_fpn.yaml
@@ -19,7 +19,7 @@ hyperparameters:
   merge_config: null # You can specify a config you want to merge into the config_file above.
   use_pretrained: false # Whether to load pretrained weights for config if available.
   #override_mmdet_config:
-    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/latest/tutorials/config.html #####
+    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/v2.27.0/tutorials/config.html #####
     ##### You can specify gradient clipping with below #####
     #optimizer_config._delete_: true
     #optimizer_config.grad_clip.max_norm: 100
diff --git a/model_hub/mmdetection/yolov3.yaml b/model_hub/mmdetection/yolov3.yaml
index 6656ace..60986d8 100644
--- a/model_hub/mmdetection/yolov3.yaml
+++ b/model_hub/mmdetection/yolov3.yaml
@@ -19,7 +19,7 @@ hyperparameters:
   merge_config: null # You can specify a config you want to merge into the config_file above.
   use_pretrained: false # Whether to load pretrained weights for config if available.
   #override_mmdet_config:
-    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/latest/tutorials/config.html #####
+    ##### Learn more about mmdet configs: https://mmdetection.readthedocs.io/en/v2.27.0/tutorials/config.html #####
     ##### You can specify gradient clipping with below #####
     #optimizer_config._delete_: true
     #optimizer_config.grad_clip.max_norm: 100
diff --git a/nas/gaea_pytorch/README.md b/nas/gaea_pytorch/README.md
index 00d419a..bbde802 100644
--- a/nas/gaea_pytorch/README.md
+++ b/nas/gaea_pytorch/README.md
@@ -27,4 +27,4 @@ The data for ImageNet is the ILSVRC2012 version of the dataset, which is availab
 
 ### Expected Performance
 After 24 Epochs, the top 5 validation accuracy should be close to 80% (see learning curve below).  At convergence, the top 1 validation accuracy should be close to the 76% reported in the original paper.  
-![](./eval/top5\_val.png)
+![](./eval/top5_val.png)
diff --git a/nlp/albert_squad_pytorch/README.md b/nlp/albert_squad_pytorch/README.md
index a7a28e4..d63c70c 100644
--- a/nlp/albert_squad_pytorch/README.md
+++ b/nlp/albert_squad_pytorch/README.md
@@ -49,7 +49,7 @@ to save and look for the cache file, make sure to set the `data.use_bind_mount`
 fields correctly in the experiment configuration.
 
 ### Data
-The data used for this script was fetched based on Huggingface's [SQuAD page](https://github.com/huggingface/transformers/tree/master/examples/question-answering).
+The data used for this script was fetched based on Hugging Face's [SQuAD page](https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering).
 
 The data will be automatically downloaded and saved before training. If you use a `bind_mount`, the 
 data will be saved between experiments and will not need to be downloaded again.