diff --git a/.github/workflows/actions.yml b/.github/workflows/actions.yml
index bc85e92866..004b197769 100644
--- a/.github/workflows/actions.yml
+++ b/.github/workflows/actions.yml
@@ -60,7 +60,7 @@ jobs:
         pip install keras-nightly --progress-bar off
     - name: Test with pytest
       run: |
-        pytest keras_nlp/
+        pytest keras_hub/
     - name: Run integration tests
       run: |
         python pip_build.py --install
diff --git a/.gitignore b/.gitignore
index 818a7d3f89..15d00c09cb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -7,7 +7,7 @@ __pycache__/
 *.swp
 *.swo
 
-keras_nlp.egg-info/
+keras_hub.egg-info/
 dist/
 
 .coverage
diff --git a/.kokoro/github/ubuntu/gpu/build.sh b/.kokoro/github/ubuntu/gpu/build.sh
index be0294126d..8e972459de 100644
--- a/.kokoro/github/ubuntu/gpu/build.sh
+++ b/.kokoro/github/ubuntu/gpu/build.sh
@@ -62,9 +62,9 @@ pip install huggingface_hub
 # Run Extra Large Tests for Continuous builds
 if [ "${RUN_XLARGE:-0}" == "1" ]
 then
-   pytest keras_nlp --check_gpu --run_large --run_extra_large \
-      --cov=keras-nlp
+   pytest keras_hub --check_gpu --run_large --run_extra_large \
+      --cov=keras_hub
 else
-   pytest keras_nlp --check_gpu --run_large \
-      --cov=keras-nlp
+   pytest keras_hub --check_gpu --run_large \
+      --cov=keras_hub
 fi
\ No newline at end of file
diff --git a/API_DESIGN_GUIDE.md b/API_DESIGN_GUIDE.md
index e1536ff130..8c2684c24d 100644
--- a/API_DESIGN_GUIDE.md
+++ b/API_DESIGN_GUIDE.md
@@ -3,7 +3,7 @@
 Before reading this document, please read the
 [Keras API design guidelines](https://github.com/keras-team/governance/blob/master/keras_api_design_guidelines.md).
 
-Below are some design considerations specific to KerasNLP.
+Below are some design considerations specific to KerasHub.
 
 ## Philosophy
 
@@ -18,16 +18,16 @@ Below are some design considerations specific to KerasNLP.
   arbitrarily advanced use cases should be possible. There should always be a
   "we need to go deeper" path available to our most expert users.
 
-- **Grow as a platform and as a community.** KerasNLP development should be
+- **Grow as a platform and as a community.** KerasHub development should be
   driven by the community, with feature and release planning happening in
   the open on GitHub.
 
 ## Avoid new dependencies
 
-The core dependencies of KerasNLP are Keras, NumPy, TensorFlow, and
+The core dependencies of KerasHub are Keras, NumPy, TensorFlow, and
 [Tensorflow Text](https://www.tensorflow.org/text).
 
-We strive to keep KerasNLP as self-contained as possible, and avoid adding
+We strive to keep KerasHub as self-contained as possible, and avoid adding
 dependencies to projects (for example NLTK or spaCy) for text preprocessing.
 
 In rare cases, particularly with tokenizers and metrics, we may need to add
@@ -65,7 +65,7 @@ calling a layer, metric or loss with `@tf.function` without running into issues.
 [tf.text](https://www.tensorflow.org/text/api_docs/python/text) provides a large
 surface on TensorFlow operations that manipulate strings. If an low-level (c++)
 operation we need is missing, we should add it in collaboration with core
-TensorFlow or TensorFlow Text. KerasNLP is a python-only library.
+TensorFlow or TensorFlow Text. KerasHub is a python-only library.
 
 We should also strive to keep computation XLA compilable wherever possible (e.g.
 `tf.function(jit_compile=True)`). For trainable modeling components this is
@@ -84,7 +84,7 @@ both batched and unbatched data as input to preprocessing layers.
 
 ## Prioritize multi-lingual support
 
-We strive to keep KerasNLP a friendly and useful library for speakers of all
+We strive to keep KerasHub a friendly and useful library for speakers of all
 languages. In general, prefer designing workflows that are language agnostic,
 and do not involve logic (e.g. stemming) that need to be rewritten
 per-language.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 0e60522b83..b11e6024b7 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,7 +1,7 @@
 # Contribution guide
 
-KerasNLP is an actively growing project and community! We would love for you
-to get involved. Below are instructions for how to plug into KerasNLP
+KerasHub is an actively growing project and community! We would love for you
+to get involved. Below are instructions for how to plug into KerasHub
 development.
 
 ## Background reading
@@ -83,13 +83,13 @@ Once the pull request is approved, a team member will take care of merging.
 
 Python 3.9 or later is required.
 
-Setting up your KerasNLP development environment requires you to fork the
-KerasNLP repository and clone it locally. With the
+Setting up your KerasHub development environment requires you to fork the
+KerasHub repository and clone it locally. With the
 [GitHub CLI](https://github.com/cli/cli) installed, you can do this as follows:
 
 ```shell
 gh repo fork keras-team/keras-nlp --clone --remote
-cd keras-nlp
+cd keras-nlp
 ```
 
 Next we must setup a python environment with the correct dependencies. We
@@ -97,7 +97,7 @@ recommend using `conda` to set up a base environment, and `pip` to install
 python packages from PyPI. The exact method will depend on your OS.
 
 **Note**: Be careful not to use mix pre-packaged tensorflow and jax libraries in
-`conda` with PyPI packages from `pip`. We recommend pulling *all* KerasNLP
+`conda` with PyPI packages from `pip`. We recommend pulling *all* KerasHub
 dependencies via `pip` as described below.
 
 ### Linux (recommended)
@@ -108,29 +108,29 @@ want accelerator support. The easiest way to get GPU support across all of our
 backends is to set up a few different python environements and pull in all cuda
 dependencies via `pip`.
 
-The shell snippet below will install four conda environments: `keras-nlp-cpu`,
-`keras-nlp-jax`, `keras-nlp-torch`, and `keras-nlp-tensorflow`. The cpu
+The shell snippet below will install four conda environments: `keras-hub-cpu`,
+`keras-hub-jax`, `keras-hub-torch`, and `keras-hub-tensorflow`. The cpu
 environement supports all backends without cuda, and each backend environement
 has cuda support.
 
 ```shell
-conda create -y -n keras-nlp-cpu python=3.10
-conda activate keras-nlp-cpu
+conda create -y -n keras-hub-cpu python=3.10
+conda activate keras-hub-cpu
 pip install -r requirements.txt  # install deps
-pip install -e .  # install keras-nlp
+pip install -e .  # install keras-hub
 
 for backend in "jax" "torch" "tensorflow"; do
-    conda create -y -n keras-nlp-${backend} python=3.10
-    conda activate keras-nlp-${backend}
+    conda create -y -n keras-hub-${backend} python=3.10
+    conda activate keras-hub-${backend}
     pip install -r requirements-${backend}-cuda.txt  # install deps
-    pip install -e .  # install keras-nlp
+    pip install -e .  # install keras-hub
 done
 ```
 
 To activate the jax environment and set keras to use jax, run:
 
 ```shell
-conda activate keras-nlp-jax && export KERAS_BACKEND=jax
+conda activate keras-hub-jax && export KERAS_BACKEND=jax
 ```
 
 ### MacOS
@@ -160,8 +160,8 @@ repository.
 
 ## Update Public API
 
-Run API generation script when creating PRs that update `keras_nlp_export`
-public APIs. Add the files changed in `keras_nlp/api` to the same PR.
+Run API generation script when creating PRs that update `keras_hub_export`
+public APIs. Add the files changed in `keras_hub/api` to the same PR.
 
 ```
 ./shell/api_gen.sh
@@ -169,7 +169,7 @@ public APIs. Add the files changed in `keras_nlp/api` to the same PR.
 
 ## Testing changes
 
-KerasNLP is tested using [PyTest](https://docs.pytest.org/en/6.2.x/).
+KerasHub is tested using [PyTest](https://docs.pytest.org/en/6.2.x/).
 
 ### Run a test file
 
@@ -184,7 +184,7 @@ can use the following command to run all the tests in `import_test.py`
 whose names contain `import`:
 
 ```shell
-pytest keras_nlp/keras_nlp/integration_tests/import_test.py -k="import"
+pytest keras_hub/integration_tests/import_test.py -k="import"
 ```
 
 ### Run the full test suite
diff --git a/CONTRIBUTING_MODELS.md b/CONTRIBUTING_MODELS.md
index 40028aac15..a9a67f99a5 100644
--- a/CONTRIBUTING_MODELS.md
+++ b/CONTRIBUTING_MODELS.md
@@ -1,13 +1,13 @@
 # Model Contribution Guide
 
-KerasNLP has a plethora of pre-trained large language models
+KerasHub has a plethora of pre-trained large language models
 ranging from BERT to OPT. We are always looking for more models and are always
 open to contributions!
 
 In this guide, we will walk you through the steps one needs to take in order to
-contribute a new pre-trained model to KerasNLP. For illustration purposes, let's
+contribute a new pre-trained model to KerasHub. For illustration purposes, let's
 assume that you want to contribute the DistilBERT model. Before we dive in, we encourage you to go through
-[our getting started guide](https://keras.io/guides/keras_nlp/getting_started/)
+[our getting started guide](https://keras.io/guides/keras_hub/getting_started/)
 for an introduction to the library, and our
 [contribution guide](https://github.com/keras-team/keras-nlp/blob/master/CONTRIBUTING.md).
 
@@ -22,19 +22,19 @@ Keep this checklist handy!
 
 ### Step 2: PR #1 - Add XXBackbone
 
-- [ ] An `xx/xx_backbone.py` file which has the model graph \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)\].
-- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone_test.py)\].
+- [ ] An `xx/xx_backbone.py` file which has the model graph \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py)\].
+- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone_test.py)\].
 - [ ] A Colab notebook link in the PR description which matches the outputs of the implemented backbone model with the original source \[[Example](https://colab.research.google.com/drive/1SeZWJorKWmwWJax8ORSdxKrxE25BfhHa?usp=sharing)\].
 
 ### Step 3: PR #2 - Add XXTokenizer
 
-- [ ] An `xx/xx_tokenizer.py` file which has the tokenizer for the model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer.py)\].
-- [ ] An `xx/xx_tokenizer_test.py` file which has unit tests for the model tokenizer \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer_test.py)\].
+- [ ] An `xx/xx_tokenizer.py` file which has the tokenizer for the model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py)\].
+- [ ] An `xx/xx_tokenizer_test.py` file which has unit tests for the model tokenizer \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer_test.py)\].
 - [ ] A Colab notebook link in the PR description, demonstrating that the output of the tokenizer matches the original tokenizer \[[Example](https://colab.research.google.com/drive/1MH_rpuFB1Nz_NkKIAvVtVae2HFLjXZDA?usp=sharing)].
 
 ### Step 4: PR #3 - Add XX Presets
 
-- [ ] An `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_presets.py)\].
+- [ ] An `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_presets.py)\].
 - [ ] A `tools/checkpoint_conversion/convert_xx_checkpoints.py` which is reusable script for converting checkpoints \[[Example](https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_distilbert_checkpoints.py)\].
 - [ ] A Colab notebook link in the PR description, showing an end-to-end task such as text classification, etc. The task model can be built using the backbone model, with the task head on top \[[Example](https://gist.github.com/mattdangerw/bf0ca07fb66b6738150c8b56ee5bab4e)\].
 
@@ -42,9 +42,9 @@ Keep this checklist handy!
 
 This PR is optional.
 
-- [ ] An `xx/xx_<task>.py` file for adding a task model like classifier, masked LM, etc. \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier.py)\]
-- [ ] An `xx/xx_<task>_preprocessor.py` file which has the preprocessor and can be used to get inputs suitable for the task model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_preprocessor.py)\].
-- [ ] `xx/xx_<task>_test.py` file and `xx/xx_<task>_preprocessor_test.py` files which have unit tests for the above two modules \[[Example 1](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier_test.py) and [Example 2](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_preprocessor_test.py)\].
+- [ ] An `xx/xx_<task>.py` file for adding a task model like classifier, masked LM, etc. \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier.py)\]
+- [ ] An `xx/xx_<task>_preprocessor.py` file which has the preprocessor and can be used to get inputs suitable for the task model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor.py)\].
+- [ ] `xx/xx_<task>_test.py` file and `xx/xx_<task>_preprocessor_test.py` files which have unit tests for the above two modules \[[Example 1](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier_test.py) and [Example 2](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor_test.py)\].
 - [ ] A Colab notebook link in the PR description, demonstrating that the output of the preprocessor matches the output of the original preprocessor \[[Example](https://colab.research.google.com/drive/1GFFC7Y1I_2PtYlWDToqKvzYhHWv1b3nC?usp=sharing)].
 
 ## Detailed Instructions
@@ -81,7 +81,7 @@ around by a class to implement our models.
 
 A model is typically split into three/four sections. We would recommend you to
 compare this side-by-side with the
-[`keras_nlp.layers.DistilBertBackbone` source code](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)!
+[`keras_hub.models.DistilBertBackbone` source code](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py)!
 
 **Inputs to the model**
 
@@ -92,15 +92,15 @@ Generally, the standard inputs to any text model are:
 **Embedding layer(s)**
 
 Standard layers used: `keras.layers.Embedding`,
-`keras_nlp.layers.PositionEmbedding`, `keras_nlp.layers.TokenAndPositionEmbedding`.
+`keras_hub.layers.PositionEmbedding`, `keras_hub.layers.TokenAndPositionEmbedding`.
 
 **Encoder layers**
 
-Standard layers used: `keras_nlp.layers.TransformerEncoder`, `keras_nlp.layers.FNetEncoder`.
+Standard layers used: `keras_hub.layers.TransformerEncoder`, `keras_hub.layers.FNetEncoder`.
 
 **Decoder layers (possibly)**
 
-Standard layers used: `keras_nlp.layers.TransformerDecoder`.
+Standard layers used: `keras_hub.layers.TransformerDecoder`.
 
 **Other layers which might be used**
 
@@ -108,16 +108,16 @@ Standard layers used: `keras_nlp.layers.TransformerDecoder`.
 
 <br/>
 
-The standard layers provided in Keras and KerasNLP are generally enough for
+The standard layers provided in Keras and KerasHub are generally enough for
 most of the usecases and it is recommended to do a thorough search
-[here](https://keras.io/api/layers/) and [here](https://keras.io/api/keras_nlp/layers/).
+[here](https://keras.io/api/layers/) and [here](https://keras.io/api/keras_hub/layers/).
 However, sometimes, models have small tweaks/paradigm changes in their architecture.
 This is when things might slightly get complicated.
 
 If the model introduces a paradigm shift, such as using relative attention instead
 of vanilla attention, the contributor will have to implement complete custom layers. A case
-in point is `keras_nlp.models.DebertaV3Backbone` where we had to [implement layers
-from scratch](https://github.com/keras-team/keras-nlp/tree/master/keras_nlp/models/deberta_v3).
+in point is `keras_hub.models.DebertaV3Backbone` where we had to [implement layers
+from scratch](https://github.com/keras-team/keras-nlp/tree/master/keras_hub/models/deberta_v3).
 
 On the other hand, if the model has a small tweak, something simpler can be done.
 For instance, in the Whisper model, the self-attention and cross-attention mechanism
@@ -154,23 +154,23 @@ and loaded correctly, etc.
 #### Tokenizer
 
 Most text models nowadays use subword tokenizers such as WordPiece, SentencePiece
-and BPE Tokenizer. Since KerasNLP has implementations of most of the popular
+and BPE Tokenizer. Since KerasHub has implementations of most of the popular
 subword tokenizers, the model tokenizer layer typically inherits from a base
 tokenizer class.
 
 For example, DistilBERT uses the WordPiece tokenizer. So, we can introduce a new
-class, `DistilBertTokenizer`, which inherits from `keras_nlp.tokenizers.WordPieceTokenizer`.
+class, `DistilBertTokenizer`, which inherits from `keras_hub.tokenizers.WordPieceTokenizer`.
 All the underlying actual tokenization will be taken care of by the superclass.
 
 The important thing here is adding "special tokens". Most models have
 special tokens such as beginning-of-sequence token, end-of-sequence token,
 mask token, pad token, etc. These have to be
-[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
+[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
 to the tokenizer class. These member attributes are then accessed by the
 preprocessor layers.
 
-For a full list of the tokenizers KerasNLP offers, please visit
-[this link](https://keras.io/api/keras_nlp/tokenizers/) and make use of the
+For a full list of the tokenizers KerasHub offers, please visit
+[this link](https://keras.io/api/keras_hub/tokenizers/) and make use of the
 tokenizer your model uses!
 
 #### Unit Tests
@@ -193,7 +193,7 @@ files. These files will then be uploaded to GCP by us!
 After wrapping up the preset configuration file, you need to
 add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
 and `DistilBertTokenizer`. Here is an
-[example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py#L187-L189).
+[example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py#L187-L189).
 
 The testing for presets is divided into two: "large" and "extra large".
 For "large" tests, we pick the smallest preset (in terms of number of parameters)
@@ -228,12 +228,12 @@ and return the dictionary in the form expected by the model.
 
 The preprocessor class might have a few intricacies depending on the model. For example,
 the DeBERTaV3 tokenizer does not have the `[MASK]` in the provided sentencepiece
-proto file, and we had to make some modifications [here](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/deberta_v3/deberta_v3_preprocessor.py). Secondly, we have
+proto file, and we had to make some modifications [here](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/deberta_v3/deberta_v3_preprocessor.py). Secondly, we have
 a separate preprocessor class for every task. This is because different tasks
-might require different input formats. For instance, we have a [separate preprocessor](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_masked_lm_preprocessor.py)
+might require different input formats. For instance, we have a [separate preprocessor](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_masked_lm_preprocessor.py)
 for masked language modeling (MLM) for DistilBERT.
 
 ## Conclusion
 
 Once all three PRs (and optionally, the fourth PR) have been merged, you have
-successfully contributed a model to KerasNLP. Congratulations! 🔥
+successfully contributed a model to KerasHub. Congratulations! 🔥
diff --git a/LICENSE b/LICENSE
index c08c53b76d..ff4bb93add 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,4 +1,4 @@
-Copyright 2024, KerasNLP authors. All rights reserved.
+Copyright 2024, KerasHub authors. All rights reserved.
 
                                  Apache License
                            Version 2.0, January 2004
@@ -188,7 +188,7 @@ Copyright 2024, KerasNLP authors. All rights reserved.
       same "printed page" as the copyright notice for easier
       identification within third-party archives.
 
-   Copyright 2024, KerasNLP authors.
+   Copyright 2024, KerasHub authors.
 
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
diff --git a/README.md b/README.md
index a5bb0b0bdd..8232581041 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,13 @@
-# KerasNLP: Multi-framework NLP Models
+# KerasHub: Multi-framework NLP Models
 [![](https://github.com/keras-team/keras-nlp/workflows/Tests/badge.svg?branch=master)](https://github.com/keras-team/keras-nlp/actions?query=workflow%3ATests+branch%3Amaster)
 ![Python](https://img.shields.io/badge/python-v3.9.0+-success.svg)
 [![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/keras-team/keras-nlp/issues)
 
 > [!IMPORTANT]
-> KerasNLP is becoming KerasHub! Read the announcement [here](https://github.com/keras-team/keras-nlp/issues/1831).
+> KerasNLP is becoming KerasHub! Read the announcement [here](https://github.com/keras-team/keras-nlp/issues/1831).
 
-KerasNLP is a natural language processing library that works natively
-with TensorFlow, JAX, or PyTorch. KerasNLP provides a repository of pre-trained
+KerasHub is a natural language processing library that works natively
+with TensorFlow, JAX, or PyTorch. KerasHub provides a repository of pre-trained
 models and a collection of lower-level building blocks for language modeling.
 Built on Keras 3, models can be trained and serialized in any framework
 and re-used in another without costly migrations.
@@ -15,13 +15,13 @@ and re-used in another without costly migrations.
 This library is an extension of the core Keras API; all high-level modules are
 Layers and Models that receive that same level of polish as core Keras.
 If you are familiar with Keras, congratulations! You already understand most of
-KerasNLP.
+KerasHub.
 
 All models support JAX, TensorFlow, and PyTorch from a single model
 definition and can be fine-tuned on GPUs and TPUs out of the box. Models can
 be trained on individual accelerators with built-in PEFT techniques, or
 fine-tuned at scale with model and data parallel training. See our
-[Getting Started guide](https://keras.io/guides/keras_nlp/getting_started)
+[Getting Started guide](https://keras.io/guides/keras_hub/getting_started)
 to start learning our API. Browse our models on
 [Kaggle](https://www.kaggle.com/organizations/keras/models).
 We welcome contributions.
@@ -30,9 +30,9 @@ We welcome contributions.
 
 ### For everyone
 
-- [Home Page](https://keras.io/keras_nlp)
-- [Developer Guides](https://keras.io/guides/keras_nlp)
-- [API Reference](https://keras.io/api/keras_nlp)
+- [Home Page](https://keras.io/keras_hub)
+- [Developer Guides](https://keras.io/guides/keras_hub)
+- [API Reference](https://keras.io/api/keras_hub)
 - [Pre-trained Models](https://www.kaggle.com/organizations/keras/models)
 
 ### For contributors
@@ -51,7 +51,7 @@ Fine-tune BERT on IMDb movie reviews:
 import os
 os.environ["KERAS_BACKEND"] = "jax"  # Or "tensorflow" or "torch"!
 
-import keras_nlp
+import keras_hub
 import tensorflow_datasets as tfds
 
 imdb_train, imdb_test = tfds.load(
@@ -61,7 +61,7 @@ imdb_train, imdb_test = tfds.load(
     batch_size=16,
 )
 # Load a BERT model.
-classifier = keras_nlp.models.Classifier.from_preset(
+classifier = keras_hub.models.Classifier.from_preset(
     "bert_base_en", 
     num_classes=2,
     activation="softmax",
@@ -74,24 +74,24 @@ classifier.predict(["What an amazing movie!", "A total waste of my time."])
 
 Try it out [in a colab](https://colab.research.google.com/gist/mattdangerw/e457e42d5ea827110c8d5cb4eb9d9a07/kerasnlp-quickstart.ipynb).
 For more in depth guides and examples, visit
-[keras.io/keras_nlp](https://keras.io/keras_nlp/).
+[keras.io/keras_hub](https://keras.io/keras_hub/).
 
 ## Installation
 
-To install the latest KerasNLP release with Keras 3, simply run:
+To install the latest KerasHub release with Keras 3, simply run:
 
 ```
-pip install --upgrade keras-nlp
+pip install --upgrade keras-hub
 ```
 
-To install the latest nightly changes for both KerasNLP and Keras, you can use
+To install the latest nightly changes for both KerasHub and Keras, you can use
 our nightly package.
 
 ```
-pip install --upgrade keras-nlp-nightly
+pip install --upgrade keras-hub-nightly
 ```
 
-Note that currently, installing KerasNLP will always pull in TensorFlow for use
+Note that currently, installing KerasHub will always pull in TensorFlow for use
 of the `tf.data` API for preprocessing. Even when pre-processing with `tf.data`,
 training can still happen on any backend.
 
@@ -99,13 +99,13 @@ Read [Getting started with Keras](https://keras.io/getting_started/) for more
 information on installing Keras 3 and compatibility with different frameworks.
 
 > [!IMPORTANT]
-> We recommend using KerasNLP with TensorFlow 2.16 or later, as TF 2.16 packages
+> We recommend using KerasHub with TensorFlow 2.16 or later, as TF 2.16 packages
 > Keras 3 by default.
 
 ## Configuring your backend
 
 If you have Keras 3 installed in your environment (see installation above),
-you can use KerasNLP with any of JAX, TensorFlow and PyTorch. To do so, set the
+you can use KerasHub with any of JAX, TensorFlow and PyTorch. To do so, set the
 `KERAS_BACKEND` environment variable. For example:
 
 ```shell
@@ -118,7 +118,7 @@ Or in Colab, with:
 import os
 os.environ["KERAS_BACKEND"] = "jax"
 
-import keras_nlp
+import keras_hub
 ```
 
 > [!IMPORTANT]
@@ -134,21 +134,21 @@ may break compatibility at any time and APIs should not be consider stable.
 
 ## Disclaimer
 
-KerasNLP provides access to pre-trained models via the `keras_nlp.models` API.
+KerasHub provides access to pre-trained models via the `keras_hub.models` API.
 These pre-trained models are provided on an "as is" basis, without warranties
 or conditions of any kind. The following underlying models are provided by third
 parties, and subject to separate licenses:
 BART, BLOOM, DeBERTa, DistilBERT, GPT-2, Llama, Mistral, OPT, RoBERTa, Whisper,
 and XLM-RoBERTa.
 
-## Citing KerasNLP
+## Citing KerasHub
 
-If KerasNLP helps your research, we appreciate your citations.
+If KerasHub helps your research, we appreciate your citations.
 Here is the BibTeX entry:
 
 ```bibtex
 @misc{kerasnlp2022,
-  title={KerasNLP},
+  title={KerasHub},
   author={Watson, Matthew, and Qian, Chen, and Bischof, Jonathan and Chollet, 
   Fran\c{c}ois and others},
   year={2022},
diff --git a/RELEASE_PROCESS.md b/RELEASE_PROCESS.md
index c220e080d6..32b33349fc 100644
--- a/RELEASE_PROCESS.md
+++ b/RELEASE_PROCESS.md
@@ -1,6 +1,6 @@
 # Release Process
 
-⚠️ This doc is intended for maintainers of the KerasNLP library. Steps below
+⚠️ This doc is intended for maintainers of the KerasHub library. Steps below
 require push access to base repository. However, all are welcome to use this
 process for other projects, or suggest improvements!
 
@@ -8,9 +8,9 @@ process for other projects, or suggest improvements!
 
 Our release process consists of two main components:
 
-- Adding a new release to the [keras-nlp](https://pypi.org/project/keras-nlp/)
+- Adding a new release to the [keras-hub](https://pypi.org/project/keras-hub/)
   project on the Python Package Index (pypi).
-- Updating our documentation on [keras.io](https://keras.io/keras_nlp/) to match
+- Updating our documentation on [keras.io](https://keras.io/keras_hub/) to match
   the release.
 
 We follow [semantic versioning](https://semver.org/) for our releases, and
@@ -56,7 +56,7 @@ Use the following steps to create an `X.Y.0` release.
    ```shell
    git fetch --all
    git checkout --no-track -b version-bump-X.Y.0.dev0 upstream/rX.Y
-   # Update both setup.py and keras_nlp/__init__.py with an editor.
+   # Update both setup.py and keras_hub/__init__.py with an editor.
    git commit -m "Version bump to X.Y.0.dev0"
    git push -u origin version-bump-X.Y.0.dev0
    ```
@@ -77,7 +77,7 @@ Use the following steps to create an `X.Y.0` release.
    configured by [this file](.github/workflows/publish-to-pypi.yml).
 
 4. Wait a few minutes until the release appears on pypi, then test out the
-   release by running `pip install keras-nlp==X.Y.0.dev0`.
+   release by running `pip install keras-hub==X.Y.0.dev0`.
 
    Try to test the package thoroughly! It is a good idea to run through a few
    of our guides with the new version. Fix any bugs you find, and repeat steps
@@ -106,7 +106,7 @@ Use the following steps to create an `X.Y.0` release.
    [this PR](https://github.com/keras-team/keras-io/pull/1134) as a reference
    for what to change. Ask fchollet@ to review.
 
-   During development of the branch, you can pin the keras-nlp dev release in
+   During development of the branch, you can pin the keras-hub dev release in
    the keras-io `requirements.txt` file. Remember to update this to the official
    release before we merge the PR.
 
@@ -129,7 +129,7 @@ Use the following steps to create an `X.Y.0` release.
    ```shell
    git fetch --all
    git checkout --no-track -b version-bump-X.Ŷ.0 upstream/master
-   # Update both setup.py and keras_nlp/__init__.py with an editor.
+   # Update both setup.py and keras_hub/__init__.py with an editor.
    git commit -m "Version bump to X.Ŷ.0"
    git push -u origin version-bump-X.Ŷ.0
    ```
@@ -169,7 +169,7 @@ to push certain fixes out to our users.
    ```shell
    git fetch --all
    git checkout --no-track -b version-bump-X.Y.Z.dev0 upstream/rX.Y
-   # Update both setup.py and keras_nlp/__init__.py with an editor.
+   # Update both setup.py and keras_hub/__init__.py with an editor.
    git commit -m "Version bump to X.Y.Z.dev0"
    git push -u origin version-bump-X.Y.Z.dev0
    ```
@@ -188,7 +188,7 @@ to push certain fixes out to our users.
    configured by [this file](.github/workflows/publish-to-pypi.yml).
 
 4. Wait a few minutes until the release appears on pypi, then test out the
-   release by running `pip install keras-nlp==X.Y.Z.dev0`.
+   release by running `pip install keras-hub==X.Y.Z.dev0`.
 
    Try to test the package thoroughly! It is a good idea to run through a few
    of our guides with the new version. Fix any bugs you find, and repeat steps
@@ -203,7 +203,7 @@ to push certain fixes out to our users.
    [this PR](https://github.com/keras-team/keras-io/pull/1134) as a reference
    for what to change. Ask fchollet@ to review.
 
-   During development of the branch, you can pin the keras-nlp dev release in
+   During development of the branch, you can pin the keras-hub dev release in
    the keras-io `requirements.txt` file. Remember to update this to the official
    release before we merge the PR.
 
diff --git a/STYLE_GUIDE.md b/STYLE_GUIDE.md
index 335f7ade97..b84036534d 100644
--- a/STYLE_GUIDE.md
+++ b/STYLE_GUIDE.md
@@ -27,7 +27,7 @@ model naming, subject to the consistency constraints laid out here.
 - The model and preset names should be recognizable to users familiar with the
   original release. E.g. the model that goes with the "DeBERTaV3" paper should
   be called `DebertaV3`. A release of a [toxic-bert](https://huggingface.co/unitary/toxic-bert)
-  checkpoint for `keras_nlp.models.Bert`, should include the string
+  checkpoint for `keras_hub.models.Bert`, should include the string
   `"toxic_bert"`.
 - All preset names should include the language of the pretraining data. If three
   or more languages are supported, the preset name should include `"multi"` (not
@@ -52,16 +52,16 @@ Small and/or unexported utility classes may live together along with code that
 uses it if convenient, e.g., our `BytePairTokenizerCache` is collocated in the
 same file as our `BytePairTokenizer`.
 
-## Import keras and keras_nlp as top-level objects
+## Import keras and keras_hub as top-level objects
 
-Prefer importing `tf`, `keras` and `keras_nlp` as top-level objects. We want
-it to be clear to a reader which symbols are from `keras_nlp` and which are
+Prefer importing `tf`, `keras` and `keras_hub` as top-level objects. We want
+it to be clear to a reader which symbols are from `keras_hub` and which are
 from core `keras`.
 
-For guides and examples using KerasNLP, the import block should look as follows:
+For guides and examples using KerasHub, the import block should look as follows:
 
 ```python
-import keras_nlp
+import keras_hub
 import tensorflow as tf
 from tensorflow import keras
 ```
@@ -70,18 +70,18 @@ from tensorflow import keras
 ✅ `keras.activations.X`
 
 ❌ `layers.X`<br/>
-✅ `keras.layers.X` or `keras_nlp.layers.X`
+✅ `keras.layers.X` or `keras_hub.layers.X`
 
 ❌ `Dense(1, activation='softmax')`<br/>
 ✅ `keras.layers.Dense(1, activation='softmax')`
 
-For KerasNLP library code, `keras_nlp` will not be directly imported, but
+For KerasHub library code, `keras_hub` will not be directly imported, but
 `keras` should still be used as a top-level object used to access library
 symbols.
 
 ## Ideal layer style
 
-When writing a new KerasNLP layer (or tokenizer or metric), please make sure to
+When writing a new KerasHub layer (or tokenizer or metric), please make sure to
 do the following:
 
 - Accept `**kwargs` in `__init__` and forward this to the super class.
@@ -96,7 +96,7 @@ do the following:
 - Document the
   [masking](https://keras.io/guides/understanding_masking_and_padding/) behavior
   of the layer in the class level docstring as well.
-- Always include usage examples using the full symbol location in `keras_nlp`.
+- Always include usage examples using the full symbol location in `keras_hub`.
 - Include a reference citation if applicable.
 
 ````python
@@ -119,7 +119,7 @@ class PositionEmbedding(keras.layers.Layer):
     Example:
 
     Direct call.
-    >>> layer = keras_nlp.layers.PositionEmbedding(sequence_length=10)
+    >>> layer = keras_hub.layers.PositionEmbedding(sequence_length=10)
     >>> layer(tf.zeros((8, 10, 16))).shape
     TensorShape([8, 10, 16])
 
@@ -132,7 +132,7 @@ class PositionEmbedding(keras.layers.Layer):
     token_embeddings = keras.layers.Embedding(
         input_dim=vocab_size, output_dim=embed_dim
     )(inputs)
-    position_embeddings = keras_nlp.layers.PositionEmbedding(
+    position_embeddings = keras_hub.layers.PositionEmbedding(
         sequence_length=seq_length
     )(token_embeddings)
     outputs = token_embeddings + position_embeddings
diff --git a/api_gen.py b/api_gen.py
index 734e79f1d0..c76ea87303 100644
--- a/api_gen.py
+++ b/api_gen.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,7 +11,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""Script to generate keras_nlp public API in `keras_nlp/api` directory.
+"""Script to generate keras_hub public API in `keras_hub/api` directory.
 
 Usage:
 
@@ -24,7 +24,7 @@
 
 import namex
 
-package = "keras_nlp"
+package = "keras_hub"
 
 
 def ignore_files(_, filenames):
@@ -32,7 +32,7 @@ def ignore_files(_, filenames):
 
 
 def copy_source_to_build_directory(root_path):
-    # Copy sources (`keras_nlp/` directory and setup files) to build dir
+    # Copy sources (`keras_hub/` directory and setup files) to build dir
     build_dir = os.path.join(root_path, "tmp_build_dir")
     if os.path.exists(build_dir):
         shutil.rmtree(build_dir)
@@ -47,12 +47,12 @@ def export_version_string(api_init_fname):
     with open(api_init_fname) as f:
         contents = f.read()
     with open(api_init_fname, "w") as f:
-        contents += "from keras_nlp.src.version_utils import __version__\n"
+        contents += "from keras_hub.src.version_utils import __version__\n"
         f.write(contents)
 
 
 def build():
-    # Backup the `keras_nlp/__init__.py` and restore it on error in api gen.
+    # Backup the `keras_hub/__init__.py` and restore it on error in api gen.
     root_path = os.path.dirname(os.path.abspath(__file__))
     code_api_dir = os.path.join(root_path, package, "api")
     # Create temp build dir
@@ -62,18 +62,18 @@ def build():
     build_api_init_fname = os.path.join(build_api_dir, "__init__.py")
     try:
         os.chdir(build_dir)
-        # Generates `keras_nlp/api` directory.
+        # Generates `keras_hub/api` directory.
         if os.path.exists(build_api_dir):
             shutil.rmtree(build_api_dir)
         if os.path.exists(build_init_fname):
             os.remove(build_init_fname)
         os.makedirs(build_api_dir)
         namex.generate_api_files(
-            "keras_nlp", code_directory="src", target_directory="api"
+            "keras_hub", code_directory="src", target_directory="api"
         )
         # Add __version__ to keras package
         export_version_string(build_api_init_fname)
-        # Copy back the keras_nlp/api and keras_nlp/__init__.py from build dir
+        # Copy back the keras_hub/api and keras_hub/__init__.py from build dir
         if os.path.exists(code_api_dir):
             shutil.rmtree(code_api_dir)
         shutil.copytree(build_api_dir, code_api_dir)
diff --git a/benchmarks/README.md b/benchmarks/README.md
index 87092950d8..54e9830285 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -1,14 +1,14 @@
-# KerasNLP Benchmarks
+# KerasHub Benchmarks
 
 This directory houses a collection of scripts for benchmarking APIs and utility
-functions which KerasNLP provides.
+functions which KerasHub provides.
 
 ## Text Generation
 For benchmarking text generation functions, the following command can be run
 from the root of the repository:
 
 ```sh
-python3 ./keras_nlp/benchmarks/text_generation.py
+python3 ./keras_hub/benchmarks/text_generation.py
 ```
 
 On running this script on Google Colab (with 3090 GPU, and TensorFlow 2.11.0),
@@ -31,7 +31,7 @@ For benchmarking classification models, the following command can be run
 from the root of the repository:
 
 ```sh
-python3 keras_nlp/benchmarks/sentiment_analysis.py \
+python3 keras_hub/benchmarks/sentiment_analysis.py \
     --model="BertTextClassifier" \
     --preset="bert_small_en_uncased" \
     --learning_rate=5e-5 \
diff --git a/benchmarks/glue.py b/benchmarks/glue.py
index 8c14145b3b..76912f31c5 100644
--- a/benchmarks/glue.py
+++ b/benchmarks/glue.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -37,7 +37,7 @@
 from absl import logging
 from tensorflow import keras
 
-import keras_nlp
+import keras_hub
 
 seed = 42
 tf.random.set_seed(seed)
@@ -101,7 +101,7 @@ def split_features(x):
 
 
 def load_model(model, preset, num_classes):
-    for name, symbol in keras_nlp.models.__dict__.items():
+    for name, symbol in keras_hub.models.__dict__.items():
         if inspect.isclass(symbol) and issubclass(symbol, keras.Model):
             if model and name != model:
                 continue
diff --git a/benchmarks/sentiment_analysis.py b/benchmarks/sentiment_analysis.py
index 09e9cf6c1c..2f3b90ccd7 100644
--- a/benchmarks/sentiment_analysis.py
+++ b/benchmarks/sentiment_analysis.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -20,7 +20,7 @@
 from absl import flags
 from tensorflow import keras
 
-import keras_nlp
+import keras_hub
 
 FLAGS = flags.FLAGS
 flags.DEFINE_string(
@@ -79,7 +79,7 @@ def create_imdb_dataset():
 
 
 def create_model():
-    for name, symbol in keras_nlp.models.__dict__.items():
+    for name, symbol in keras_hub.models.__dict__.items():
         if inspect.isclass(symbol) and issubclass(symbol, keras.Model):
             if FLAGS.model and name != FLAGS.model:
                 continue
diff --git a/benchmarks/text_generation.py b/benchmarks/text_generation.py
index fe829b77ca..3491541cdc 100644
--- a/benchmarks/text_generation.py
+++ b/benchmarks/text_generation.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -19,7 +19,7 @@
 import tensorflow as tf
 from tensorflow import keras
 
-import keras_nlp
+import keras_hub
 
 SEED = 42
 
@@ -76,7 +76,7 @@ def build_model(
 ):
     inputs = keras.layers.Input(shape=(None,), dtype="int32")
     # Embedding.
-    x = keras_nlp.layers.TokenAndPositionEmbedding(
+    x = keras_hub.layers.TokenAndPositionEmbedding(
         vocabulary_size=vocab_size,
         sequence_length=max_length,
         embedding_dim=embed_dim,
@@ -84,7 +84,7 @@ def build_model(
     )(inputs)
     # Transformer decoders.
     for _ in range(num_layers):
-        x = keras_nlp.layers.TransformerDecoder(
+        x = keras_hub.layers.TransformerDecoder(
             num_heads=num_heads,
             intermediate_dim=ff_dim,
         )(x)
@@ -102,7 +102,7 @@ def generate_text(
 ):
     class TestModel(tf.keras.Model):
         def call(self, inputs):
-            generated = keras_nlp.samplers.get(sampler)(
+            generated = keras_hub.samplers.get(sampler)(
                 next=next,
                 prompt=inputs,
             )
diff --git a/conftest.py b/conftest.py
index 327654cf51..5449f5bd92 100644
--- a/conftest.py
+++ b/conftest.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/integration_tests/__init__.py b/integration_tests/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/integration_tests/__init__.py
+++ b/integration_tests/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/integration_tests/basic_usage_test.py b/integration_tests/basic_usage_test.py
index 7dff522537..00e5887d56 100644
--- a/integration_tests/basic_usage_test.py
+++ b/integration_tests/basic_usage_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,7 +17,7 @@
 import keras
 import numpy as np
 
-import keras_nlp
+import keras_hub
 
 
 class BasicUsageTest(unittest.TestCase):
@@ -25,7 +25,7 @@ def test_transformer(self):
         # Tokenize some inputs with a binary label.
         vocab = ["[UNK]", "the", "qu", "##ick", "br", "##own", "fox", "."]
         sentences = ["The quick brown fox jumped.", "The fox slept."]
-        tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
+        tokenizer = keras_hub.tokenizers.WordPieceTokenizer(
             vocabulary=vocab,
             sequence_length=10,
         )
@@ -33,12 +33,12 @@ def test_transformer(self):
 
         # Create a tiny transformer.
         inputs = keras.Input(shape=(None,), dtype="int32")
-        outputs = keras_nlp.layers.TokenAndPositionEmbedding(
+        outputs = keras_hub.layers.TokenAndPositionEmbedding(
             vocabulary_size=len(vocab),
             sequence_length=10,
             embedding_dim=16,
         )(inputs)
-        outputs = keras_nlp.layers.TransformerEncoder(
+        outputs = keras_hub.layers.TransformerEncoder(
             num_heads=4,
             intermediate_dim=32,
         )(outputs)
@@ -56,7 +56,7 @@ def test_transformer(self):
     def test_quickstart(self):
         """This roughly matches the quick start example in our base README."""
         # Load a BERT model.
-        classifier = keras_nlp.models.TextClassifier.from_preset(
+        classifier = keras_hub.models.TextClassifier.from_preset(
             "bert_tiny_en_uncased",
             num_classes=2,
             activation="softmax",
diff --git a/integration_tests/import_test.py b/integration_tests/import_test.py
index bd36202cff..5cb6586249 100644
--- a/integration_tests/import_test.py
+++ b/integration_tests/import_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,9 +14,9 @@
 
 import unittest
 
-import keras_nlp
+import keras_hub
 
 
 class ImportTest(unittest.TestCase):
     def test_version(self):
-        self.assertIsNotNone(keras_nlp.__version__)
+        self.assertIsNotNone(keras_hub.__version__)
diff --git a/integration_tests/no_tensorflow_test.py b/integration_tests/no_tensorflow_test.py
index 2562041218..ea5b5774f8 100644
--- a/integration_tests/no_tensorflow_test.py
+++ b/integration_tests/no_tensorflow_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,12 +16,12 @@
 
 import numpy as np
 
-import keras_nlp
+import keras_hub
 
 
 class NoTensorflow(unittest.TestCase):
     def test_backbone_works(self):
-        backbone = keras_nlp.models.BertBackbone.from_preset(
+        backbone = keras_hub.models.BertBackbone.from_preset(
             "bert_tiny_en_uncased",
         )
         backbone.predict(
@@ -34,7 +34,7 @@ def test_backbone_works(self):
 
     def test_tokenizer_errors(self):
         with self.assertRaises(Exception) as e:
-            keras_nlp.models.BertTokenizer.from_preset(
+            keras_hub.models.BertTokenizer.from_preset(
                 "bert_tiny_en_uncased",
             )
             self.assertTrue("pip install tensorflow-text" in e.exception)
diff --git a/keras_nlp/__init__.py b/keras_hub/__init__.py
similarity index 90%
rename from keras_nlp/__init__.py
rename to keras_hub/__init__.py
index 4f02f98bd6..068cc28e72 100644
--- a/keras_nlp/__init__.py
+++ b/keras_hub/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -26,8 +26,8 @@
     pass
 
 # Import everything from /api/ into keras.
-from keras_nlp.api import *  # noqa: F403
-from keras_nlp.api import __version__  # Import * ignores names start with "_".
+from keras_hub.api import *  # noqa: F403
+from keras_hub.api import __version__  # Import * ignores names start with "_".
 
 # Add everything in /api/ to the module search path.
 __path__.append(os.path.join(os.path.dirname(__file__), "api"))  # noqa: F405
diff --git a/keras_nlp/api/__init__.py b/keras_hub/api/__init__.py
similarity index 62%
rename from keras_nlp/api/__init__.py
rename to keras_hub/api/__init__.py
index 46b16d5d8c..6320bfdbcc 100644
--- a/keras_nlp/api/__init__.py
+++ b/keras_hub/api/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,12 +17,12 @@
 since your modifications would be overwritten.
 """
 
-from keras_nlp.api import bounding_box
-from keras_nlp.api import layers
-from keras_nlp.api import metrics
-from keras_nlp.api import models
-from keras_nlp.api import samplers
-from keras_nlp.api import tokenizers
-from keras_nlp.src.utils.preset_utils import upload_preset
-from keras_nlp.src.version_utils import __version__
-from keras_nlp.src.version_utils import version
+from keras_hub.api import bounding_box
+from keras_hub.api import layers
+from keras_hub.api import metrics
+from keras_hub.api import models
+from keras_hub.api import samplers
+from keras_hub.api import tokenizers
+from keras_hub.src.utils.preset_utils import upload_preset
+from keras_hub.src.version_utils import __version__
+from keras_hub.src.version_utils import version
diff --git a/keras_hub/api/bounding_box/__init__.py b/keras_hub/api/bounding_box/__init__.py
new file mode 100644
index 0000000000..989cdb1c6f
--- /dev/null
+++ b/keras_hub/api/bounding_box/__init__.py
@@ -0,0 +1,36 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""DO NOT EDIT.
+
+This file was autogenerated. Do not edit it by hand,
+since your modifications would be overwritten.
+"""
+
+from keras_hub.src.bounding_box.converters import convert_format
+from keras_hub.src.bounding_box.formats import CENTER_XYWH
+from keras_hub.src.bounding_box.formats import REL_XYWH
+from keras_hub.src.bounding_box.formats import REL_XYXY
+from keras_hub.src.bounding_box.formats import REL_YXYX
+from keras_hub.src.bounding_box.formats import XYWH
+from keras_hub.src.bounding_box.formats import XYXY
+from keras_hub.src.bounding_box.formats import YXYX
+from keras_hub.src.bounding_box.iou import compute_ciou
+from keras_hub.src.bounding_box.iou import compute_iou
+from keras_hub.src.bounding_box.to_dense import to_dense
+from keras_hub.src.bounding_box.to_ragged import to_ragged
+from keras_hub.src.bounding_box.utils import as_relative
+from keras_hub.src.bounding_box.utils import clip_boxes
+from keras_hub.src.bounding_box.utils import clip_to_image
+from keras_hub.src.bounding_box.utils import is_relative
+from keras_hub.src.bounding_box.validate_format import validate_format
diff --git a/keras_hub/api/layers/__init__.py b/keras_hub/api/layers/__init__.py
new file mode 100644
index 0000000000..b653a06332
--- /dev/null
+++ b/keras_hub/api/layers/__init__.py
@@ -0,0 +1,61 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""DO NOT EDIT.
+
+This file was autogenerated. Do not edit it by hand,
+since your modifications would be overwritten.
+"""
+
+from keras_hub.src.layers.modeling.alibi_bias import AlibiBias
+from keras_hub.src.layers.modeling.cached_multi_head_attention import (
+    CachedMultiHeadAttention,
+)
+from keras_hub.src.layers.modeling.f_net_encoder import FNetEncoder
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
+    ReversibleEmbedding,
+)
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.layers.modeling.sine_position_encoding import (
+    SinePositionEncoding,
+)
+from keras_hub.src.layers.modeling.token_and_position_embedding import (
+    TokenAndPositionEmbedding,
+)
+from keras_hub.src.layers.modeling.transformer_decoder import TransformerDecoder
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.layers.preprocessing.audio_converter import AudioConverter
+from keras_hub.src.layers.preprocessing.image_converter import ImageConverter
+from keras_hub.src.layers.preprocessing.masked_lm_mask_generator import (
+    MaskedLMMaskGenerator,
+)
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
+    MultiSegmentPacker,
+)
+from keras_hub.src.layers.preprocessing.random_deletion import RandomDeletion
+from keras_hub.src.layers.preprocessing.random_swap import RandomSwap
+from keras_hub.src.layers.preprocessing.resizing_image_converter import (
+    ResizingImageConverter,
+)
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.models.pali_gemma.pali_gemma_image_converter import (
+    PaliGemmaImageConverter,
+)
+from keras_hub.src.models.resnet.resnet_image_converter import (
+    ResNetImageConverter,
+)
+from keras_hub.src.models.whisper.whisper_audio_converter import (
+    WhisperAudioConverter,
+)
diff --git a/keras_nlp/api/metrics/__init__.py b/keras_hub/api/metrics/__init__.py
similarity index 69%
rename from keras_nlp/api/metrics/__init__.py
rename to keras_hub/api/metrics/__init__.py
index aa1f9b90eb..fc3c46c78d 100644
--- a/keras_nlp/api/metrics/__init__.py
+++ b/keras_hub/api/metrics/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,8 +17,8 @@
 since your modifications would be overwritten.
 """
 
-from keras_nlp.src.metrics.bleu import Bleu
-from keras_nlp.src.metrics.edit_distance import EditDistance
-from keras_nlp.src.metrics.perplexity import Perplexity
-from keras_nlp.src.metrics.rouge_l import RougeL
-from keras_nlp.src.metrics.rouge_n import RougeN
+from keras_hub.src.metrics.bleu import Bleu
+from keras_hub.src.metrics.edit_distance import EditDistance
+from keras_hub.src.metrics.perplexity import Perplexity
+from keras_hub.src.metrics.rouge_l import RougeL
+from keras_hub.src.metrics.rouge_n import RougeN
diff --git a/keras_hub/api/models/__init__.py b/keras_hub/api/models/__init__.py
new file mode 100644
index 0000000000..400284e487
--- /dev/null
+++ b/keras_hub/api/models/__init__.py
@@ -0,0 +1,298 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""DO NOT EDIT.
+
+This file was autogenerated. Do not edit it by hand,
+since your modifications would be overwritten.
+"""
+
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_masked_lm import AlbertMaskedLM
+from keras_hub.src.models.albert.albert_masked_lm_preprocessor import (
+    AlbertMaskedLMPreprocessor,
+)
+from keras_hub.src.models.albert.albert_text_classifier import (
+    AlbertTextClassifier,
+)
+from keras_hub.src.models.albert.albert_text_classifier import (
+    AlbertTextClassifier as AlbertClassifier,
+)
+from keras_hub.src.models.albert.albert_text_classifier_preprocessor import (
+    AlbertTextClassifierPreprocessor,
+)
+from keras_hub.src.models.albert.albert_text_classifier_preprocessor import (
+    AlbertTextClassifierPreprocessor as AlbertPreprocessor,
+)
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.models.bart.bart_seq_2_seq_lm import BartSeq2SeqLM
+from keras_hub.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
+    BartSeq2SeqLMPreprocessor,
+)
+from keras_hub.src.models.bart.bart_tokenizer import BartTokenizer
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_masked_lm import BertMaskedLM
+from keras_hub.src.models.bert.bert_masked_lm_preprocessor import (
+    BertMaskedLMPreprocessor,
+)
+from keras_hub.src.models.bert.bert_text_classifier import BertTextClassifier
+from keras_hub.src.models.bert.bert_text_classifier import (
+    BertTextClassifier as BertClassifier,
+)
+from keras_hub.src.models.bert.bert_text_classifier_preprocessor import (
+    BertTextClassifierPreprocessor,
+)
+from keras_hub.src.models.bert.bert_text_classifier_preprocessor import (
+    BertTextClassifierPreprocessor as BertPreprocessor,
+)
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.models.bloom.bloom_backbone import BloomBackbone
+from keras_hub.src.models.bloom.bloom_causal_lm import BloomCausalLM
+from keras_hub.src.models.bloom.bloom_causal_lm_preprocessor import (
+    BloomCausalLMPreprocessor,
+)
+from keras_hub.src.models.bloom.bloom_tokenizer import BloomTokenizer
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.csp_darknet.csp_darknet_backbone import (
+    CSPDarkNetBackbone,
+)
+from keras_hub.src.models.csp_darknet.csp_darknet_image_classifier import (
+    CSPDarkNetImageClassifier,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
+    DebertaV3Backbone,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_masked_lm import (
+    DebertaV3MaskedLM,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
+    DebertaV3MaskedLMPreprocessor,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier import (
+    DebertaV3TextClassifier,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier import (
+    DebertaV3TextClassifier as DebertaV3Classifier,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
+    DebertaV3TextClassifierPreprocessor,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
+    DebertaV3TextClassifierPreprocessor as DebertaV3Preprocessor,
+)
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
+    DebertaV3Tokenizer,
+)
+from keras_hub.src.models.densenet.densenet_backbone import DenseNetBackbone
+from keras_hub.src.models.densenet.densenet_image_classifier import (
+    DenseNetImageClassifier,
+)
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
+    DistilBertBackbone,
+)
+from keras_hub.src.models.distil_bert.distil_bert_masked_lm import (
+    DistilBertMaskedLM,
+)
+from keras_hub.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
+    DistilBertMaskedLMPreprocessor,
+)
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier import (
+    DistilBertTextClassifier,
+)
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier import (
+    DistilBertTextClassifier as DistilBertClassifier,
+)
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
+    DistilBertTextClassifierPreprocessor,
+)
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
+    DistilBertTextClassifierPreprocessor as DistilBertPreprocessor,
+)
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
+    DistilBertTokenizer,
+)
+from keras_hub.src.models.efficientnet.efficientnet_backbone import (
+    EfficientNetBackbone,
+)
+from keras_hub.src.models.electra.electra_backbone import ElectraBackbone
+from keras_hub.src.models.electra.electra_tokenizer import ElectraTokenizer
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_masked_lm import FNetMaskedLM
+from keras_hub.src.models.f_net.f_net_masked_lm_preprocessor import (
+    FNetMaskedLMPreprocessor,
+)
+from keras_hub.src.models.f_net.f_net_text_classifier import FNetTextClassifier
+from keras_hub.src.models.f_net.f_net_text_classifier import (
+    FNetTextClassifier as FNetClassifier,
+)
+from keras_hub.src.models.f_net.f_net_text_classifier_preprocessor import (
+    FNetTextClassifierPreprocessor,
+)
+from keras_hub.src.models.f_net.f_net_text_classifier_preprocessor import (
+    FNetTextClassifierPreprocessor as FNetPreprocessor,
+)
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.models.falcon.falcon_backbone import FalconBackbone
+from keras_hub.src.models.falcon.falcon_causal_lm import FalconCausalLM
+from keras_hub.src.models.falcon.falcon_causal_lm_preprocessor import (
+    FalconCausalLMPreprocessor,
+)
+from keras_hub.src.models.falcon.falcon_tokenizer import FalconTokenizer
+from keras_hub.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.models.gemma.gemma_causal_lm import GemmaCausalLM
+from keras_hub.src.models.gemma.gemma_causal_lm_preprocessor import (
+    GemmaCausalLMPreprocessor,
+)
+from keras_hub.src.models.gemma.gemma_tokenizer import GemmaTokenizer
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
+from keras_hub.src.models.gpt2.gpt2_causal_lm_preprocessor import (
+    GPT2CausalLMPreprocessor,
+)
+from keras_hub.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_causal_lm import GPTNeoXCausalLM
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
+    GPTNeoXCausalLMPreprocessor,
+)
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
+from keras_hub.src.models.image_classifier import ImageClassifier
+from keras_hub.src.models.image_classifier_preprocessor import (
+    ImageClassifierPreprocessor,
+)
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.models.llama3.llama3_causal_lm import Llama3CausalLM
+from keras_hub.src.models.llama3.llama3_causal_lm_preprocessor import (
+    Llama3CausalLMPreprocessor,
+)
+from keras_hub.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.models.llama.llama_causal_lm import LlamaCausalLM
+from keras_hub.src.models.llama.llama_causal_lm_preprocessor import (
+    LlamaCausalLMPreprocessor,
+)
+from keras_hub.src.models.llama.llama_tokenizer import LlamaTokenizer
+from keras_hub.src.models.masked_lm import MaskedLM
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.models.mistral.mistral_causal_lm import MistralCausalLM
+from keras_hub.src.models.mistral.mistral_causal_lm_preprocessor import (
+    MistralCausalLMPreprocessor,
+)
+from keras_hub.src.models.mistral.mistral_tokenizer import MistralTokenizer
+from keras_hub.src.models.mix_transformer.mix_transformer_backbone import (
+    MiTBackbone,
+)
+from keras_hub.src.models.mix_transformer.mix_transformer_classifier import (
+    MiTImageClassifier,
+)
+from keras_hub.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
+from keras_hub.src.models.mobilenet.mobilenet_image_classifier import (
+    MobileNetImageClassifier,
+)
+from keras_hub.src.models.opt.opt_backbone import OPTBackbone
+from keras_hub.src.models.opt.opt_causal_lm import OPTCausalLM
+from keras_hub.src.models.opt.opt_causal_lm_preprocessor import (
+    OPTCausalLMPreprocessor,
+)
+from keras_hub.src.models.opt.opt_tokenizer import OPTTokenizer
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
+    PaliGemmaBackbone,
+)
+from keras_hub.src.models.pali_gemma.pali_gemma_causal_lm import (
+    PaliGemmaCausalLM,
+)
+from keras_hub.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
+    PaliGemmaCausalLMPreprocessor,
+)
+from keras_hub.src.models.pali_gemma.pali_gemma_tokenizer import (
+    PaliGemmaTokenizer,
+)
+from keras_hub.src.models.phi3.phi3_backbone import Phi3Backbone
+from keras_hub.src.models.phi3.phi3_causal_lm import Phi3CausalLM
+from keras_hub.src.models.phi3.phi3_causal_lm_preprocessor import (
+    Phi3CausalLMPreprocessor,
+)
+from keras_hub.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.models.resnet.resnet_image_classifier import (
+    ResNetImageClassifier,
+)
+from keras_hub.src.models.resnet.resnet_image_classifier_preprocessor import (
+    ResNetImageClassifierPreprocessor,
+)
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_masked_lm import RobertaMaskedLM
+from keras_hub.src.models.roberta.roberta_masked_lm_preprocessor import (
+    RobertaMaskedLMPreprocessor,
+)
+from keras_hub.src.models.roberta.roberta_text_classifier import (
+    RobertaTextClassifier,
+)
+from keras_hub.src.models.roberta.roberta_text_classifier import (
+    RobertaTextClassifier as RobertaClassifier,
+)
+from keras_hub.src.models.roberta.roberta_text_classifier_preprocessor import (
+    RobertaTextClassifierPreprocessor,
+)
+from keras_hub.src.models.roberta.roberta_text_classifier_preprocessor import (
+    RobertaTextClassifierPreprocessor as RobertaPreprocessor,
+)
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.models.seq_2_seq_lm import Seq2SeqLM
+from keras_hub.src.models.seq_2_seq_lm_preprocessor import Seq2SeqLMPreprocessor
+from keras_hub.src.models.t5.t5_backbone import T5Backbone
+from keras_hub.src.models.t5.t5_tokenizer import T5Tokenizer
+from keras_hub.src.models.task import Task
+from keras_hub.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier as Classifier
+from keras_hub.src.models.text_classifier_preprocessor import (
+    TextClassifierPreprocessor,
+)
+from keras_hub.src.models.vgg.vgg_backbone import VGGBackbone
+from keras_hub.src.models.vgg.vgg_image_classifier import VGGImageClassifier
+from keras_hub.src.models.vit_det.vit_det_backbone import ViTDetBackbone
+from keras_hub.src.models.whisper.whisper_backbone import WhisperBackbone
+from keras_hub.src.models.whisper.whisper_tokenizer import WhisperTokenizer
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
+    XLMRobertaBackbone,
+)
+from keras_hub.src.models.xlm_roberta.xlm_roberta_masked_lm import (
+    XLMRobertaMaskedLM,
+)
+from keras_hub.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
+    XLMRobertaMaskedLMPreprocessor,
+)
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier import (
+    XLMRobertaTextClassifier,
+)
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier import (
+    XLMRobertaTextClassifier as XLMRobertaClassifier,
+)
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
+    XLMRobertaTextClassifierPreprocessor,
+)
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
+    XLMRobertaTextClassifierPreprocessor as XLMRobertaPreprocessor,
+)
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+    XLMRobertaTokenizer,
+)
+from keras_hub.src.models.xlnet.xlnet_backbone import XLNetBackbone
+from keras_hub.src.tokenizers.tokenizer import Tokenizer
diff --git a/keras_nlp/api/samplers/__init__.py b/keras_hub/api/samplers/__init__.py
similarity index 51%
rename from keras_nlp/api/samplers/__init__.py
rename to keras_hub/api/samplers/__init__.py
index a825c276de..52d4a8920e 100644
--- a/keras_nlp/api/samplers/__init__.py
+++ b/keras_hub/api/samplers/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 since your modifications would be overwritten.
 """
 
-from keras_nlp.src.samplers.beam_sampler import BeamSampler
-from keras_nlp.src.samplers.contrastive_sampler import ContrastiveSampler
-from keras_nlp.src.samplers.greedy_sampler import GreedySampler
-from keras_nlp.src.samplers.random_sampler import RandomSampler
-from keras_nlp.src.samplers.sampler import Sampler
-from keras_nlp.src.samplers.serialization import deserialize
-from keras_nlp.src.samplers.serialization import get
-from keras_nlp.src.samplers.serialization import serialize
-from keras_nlp.src.samplers.top_k_sampler import TopKSampler
-from keras_nlp.src.samplers.top_p_sampler import TopPSampler
+from keras_hub.src.samplers.beam_sampler import BeamSampler
+from keras_hub.src.samplers.contrastive_sampler import ContrastiveSampler
+from keras_hub.src.samplers.greedy_sampler import GreedySampler
+from keras_hub.src.samplers.random_sampler import RandomSampler
+from keras_hub.src.samplers.sampler import Sampler
+from keras_hub.src.samplers.serialization import deserialize
+from keras_hub.src.samplers.serialization import get
+from keras_hub.src.samplers.serialization import serialize
+from keras_hub.src.samplers.top_k_sampler import TopKSampler
+from keras_hub.src.samplers.top_p_sampler import TopPSampler
diff --git a/keras_hub/api/tokenizers/__init__.py b/keras_hub/api/tokenizers/__init__.py
new file mode 100644
index 0000000000..9a011836ed
--- /dev/null
+++ b/keras_hub/api/tokenizers/__init__.py
@@ -0,0 +1,65 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""DO NOT EDIT.
+
+This file was autogenerated. Do not edit it by hand,
+since your modifications would be overwritten.
+"""
+
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.models.bart.bart_tokenizer import BartTokenizer
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.models.bloom.bloom_tokenizer import BloomTokenizer
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
+    DebertaV3Tokenizer,
+)
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
+    DistilBertTokenizer,
+)
+from keras_hub.src.models.electra.electra_tokenizer import ElectraTokenizer
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.models.falcon.falcon_tokenizer import FalconTokenizer
+from keras_hub.src.models.gemma.gemma_tokenizer import GemmaTokenizer
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
+from keras_hub.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
+from keras_hub.src.models.llama.llama_tokenizer import LlamaTokenizer
+from keras_hub.src.models.mistral.mistral_tokenizer import MistralTokenizer
+from keras_hub.src.models.opt.opt_tokenizer import OPTTokenizer
+from keras_hub.src.models.pali_gemma.pali_gemma_tokenizer import (
+    PaliGemmaTokenizer,
+)
+from keras_hub.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.models.t5.t5_tokenizer import T5Tokenizer
+from keras_hub.src.models.whisper.whisper_tokenizer import WhisperTokenizer
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+    XLMRobertaTokenizer,
+)
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.tokenizers.byte_tokenizer import ByteTokenizer
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
+    SentencePieceTokenizer,
+)
+from keras_hub.src.tokenizers.sentence_piece_tokenizer_trainer import (
+    compute_sentence_piece_proto,
+)
+from keras_hub.src.tokenizers.tokenizer import Tokenizer
+from keras_hub.src.tokenizers.unicode_codepoint_tokenizer import (
+    UnicodeCodepointTokenizer,
+)
+from keras_hub.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
+from keras_hub.src.tokenizers.word_piece_tokenizer_trainer import (
+    compute_word_piece_vocabulary,
+)
diff --git a/keras_nlp/src/layers/modeling/__init__.py b/keras_hub/src/__init__.py
similarity index 93%
rename from keras_nlp/src/layers/modeling/__init__.py
rename to keras_hub/src/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/keras_nlp/src/layers/modeling/__init__.py
+++ b/keras_hub/src/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/api_export.py b/keras_hub/src/api_export.py
similarity index 82%
rename from keras_nlp/src/api_export.py
rename to keras_hub/src/api_export.py
index 93e7b54c2f..798302cf3b 100644
--- a/keras_nlp/src/api_export.py
+++ b/keras_hub/src/api_export.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -26,17 +26,17 @@ def maybe_register_serializable(symbol):
     if isinstance(symbol, types.FunctionType) or hasattr(symbol, "get_config"):
         # We register twice, first with the old name, second with the new name,
         # so loading still works under the old name.
-        # TODO replace compat_package_name with keras-nlp after rename.
-        compat_name = "compat_package_name"
+        # TODO replace keras_nlp with keras-hub after rename.
+        compat_name = "keras_nlp"
         keras.saving.register_keras_serializable(package=compat_name)(symbol)
-        keras.saving.register_keras_serializable(package="keras_nlp")(symbol)
+        keras.saving.register_keras_serializable(package="keras_hub")(symbol)
 
 
 if namex:
 
-    class keras_nlp_export(namex.export):
+    class keras_hub_export(namex.export):
         def __init__(self, path):
-            super().__init__(package="keras_nlp", path=path)
+            super().__init__(package="keras_hub", path=path)
 
         def __call__(self, symbol):
             maybe_register_serializable(symbol)
@@ -44,7 +44,7 @@ def __call__(self, symbol):
 
 else:
 
-    class keras_nlp_export:
+    class keras_hub_export:
         def __init__(self, path):
             pass
 
diff --git a/keras_nlp/src/layers/__init__.py b/keras_hub/src/bounding_box/__init__.py
similarity index 93%
rename from keras_nlp/src/layers/__init__.py
rename to keras_hub/src/bounding_box/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/keras_nlp/src/layers/__init__.py
+++ b/keras_hub/src/bounding_box/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/bounding_box/converters.py b/keras_hub/src/bounding_box/converters.py
similarity index 98%
rename from keras_nlp/src/bounding_box/converters.py
rename to keras_hub/src/bounding_box/converters.py
index 0e363fc6f7..63996e6abf 100644
--- a/keras_nlp/src/bounding_box/converters.py
+++ b/keras_hub/src/bounding_box/converters.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,7 +16,7 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 try:
     import tensorflow as tf
@@ -301,7 +301,7 @@ def _xyxy_to_rel_yxyx(boxes, images=None, image_shape=None):
 }
 
 
-@keras_nlp_export("keras_nlp.bounding_box.convert_format")
+@keras_hub_export("keras_hub.bounding_box.convert_format")
 def convert_format(
     boxes, source, target, images=None, image_shape=None, dtype="float32"
 ):
@@ -343,7 +343,7 @@ def convert_format(
 
     ```python
     boxes = load_coco_dataset()
-    boxes_in_xywh = keras_nlp.bounding_box.convert_format(
+    boxes_in_xywh = keras_hub.bounding_box.convert_format(
         boxes,
         source='xyxy',
         target='xyWH'
diff --git a/keras_nlp/src/bounding_box/converters_test.py b/keras_hub/src/bounding_box/converters_test.py
similarity index 97%
rename from keras_nlp/src/bounding_box/converters_test.py
rename to keras_hub/src/bounding_box/converters_test.py
index f6f3adfa17..9b46294569 100644
--- a/keras_nlp/src/bounding_box/converters_test.py
+++ b/keras_hub/src/bounding_box/converters_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -20,10 +20,10 @@
 from absl.testing import parameterized
 from keras import backend
 
-from keras_nlp.src.bounding_box import converters
-from keras_nlp.src.bounding_box import to_dense
-from keras_nlp.src.bounding_box import to_ragged
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.bounding_box import converters
+from keras_hub.src.bounding_box import to_dense
+from keras_hub.src.bounding_box import to_ragged
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ConvertersTestCase(TestCase):
diff --git a/keras_nlp/src/bounding_box/formats.py b/keras_hub/src/bounding_box/formats.py
similarity index 89%
rename from keras_nlp/src/bounding_box/formats.py
rename to keras_hub/src/bounding_box/formats.py
index fda64a860e..4f419457c4 100644
--- a/keras_nlp/src/bounding_box/formats.py
+++ b/keras_hub/src/bounding_box/formats.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 formats.py contains axis information for each supported format.
 """
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.bounding_box.XYXY")
+@keras_hub_export("keras_hub.bounding_box.XYXY")
 class XYXY:
     """XYXY contains axis indices for the XYXY format.
 
@@ -38,7 +38,7 @@ class XYXY:
     BOTTOM = 3
 
 
-@keras_nlp_export("keras_nlp.bounding_box.REL_XYXY")
+@keras_hub_export("keras_hub.bounding_box.REL_XYXY")
 class REL_XYXY:
     """REL_XYXY contains axis indices for the REL_XYXY format.
 
@@ -60,7 +60,7 @@ class REL_XYXY:
     BOTTOM = 3
 
 
-@keras_nlp_export("keras_nlp.bounding_box.CENTER_XYWH")
+@keras_hub_export("keras_hub.bounding_box.CENTER_XYWH")
 class CENTER_XYWH:
     """CENTER_XYWH contains axis indices for the CENTER_XYWH format.
 
@@ -80,7 +80,7 @@ class CENTER_XYWH:
     HEIGHT = 3
 
 
-@keras_nlp_export("keras_nlp.bounding_box.XYWH")
+@keras_hub_export("keras_hub.bounding_box.XYWH")
 class XYWH:
     """XYWH contains axis indices for the XYWH format.
 
@@ -100,7 +100,7 @@ class XYWH:
     HEIGHT = 3
 
 
-@keras_nlp_export("keras_nlp.bounding_box.REL_XYWH")
+@keras_hub_export("keras_hub.bounding_box.REL_XYWH")
 class REL_XYWH:
     """REL_XYWH contains axis indices for the XYWH format.
 
@@ -120,7 +120,7 @@ class REL_XYWH:
     HEIGHT = 3
 
 
-@keras_nlp_export("keras_nlp.bounding_box.YXYX")
+@keras_hub_export("keras_hub.bounding_box.YXYX")
 class YXYX:
     """YXYX contains axis indices for the YXYX format.
 
@@ -140,7 +140,7 @@ class YXYX:
     RIGHT = 3
 
 
-@keras_nlp_export("keras_nlp.bounding_box.REL_YXYX")
+@keras_hub_export("keras_hub.bounding_box.REL_YXYX")
 class REL_YXYX:
     """REL_YXYX contains axis indices for the REL_YXYX format.
 
diff --git a/keras_nlp/src/bounding_box/iou.py b/keras_hub/src/bounding_box/iou.py
similarity index 96%
rename from keras_nlp/src/bounding_box/iou.py
rename to keras_hub/src/bounding_box/iou.py
index 46ea2a34b4..4f404f008a 100644
--- a/keras_nlp/src/bounding_box/iou.py
+++ b/keras_hub/src/bounding_box/iou.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,10 +17,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.bounding_box.converters import convert_format
-from keras_nlp.src.bounding_box.utils import as_relative
-from keras_nlp.src.bounding_box.utils import is_relative
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.bounding_box.converters import convert_format
+from keras_hub.src.bounding_box.utils import as_relative
+from keras_hub.src.bounding_box.utils import is_relative
 
 
 def _compute_area(box):
@@ -64,7 +64,7 @@ def _compute_intersection(boxes1, boxes2):
     return intersect_height * intersect_width
 
 
-@keras_nlp_export("keras_nlp.bounding_box.compute_iou")
+@keras_hub_export("keras_hub.bounding_box.compute_iou")
 def compute_iou(
     boxes1,
     boxes2,
@@ -175,7 +175,7 @@ def compute_iou(
     return iou_lookup_table
 
 
-@keras_nlp_export("keras_nlp.bounding_box.compute_ciou")
+@keras_hub_export("keras_hub.bounding_box.compute_ciou")
 def compute_ciou(boxes1, boxes2, bounding_box_format):
     """
     Computes the Complete IoU (CIoU) between two bounding boxes or between
diff --git a/keras_nlp/src/bounding_box/iou_test.py b/keras_hub/src/bounding_box/iou_test.py
similarity index 97%
rename from keras_nlp/src/bounding_box/iou_test.py
rename to keras_hub/src/bounding_box/iou_test.py
index ffd3b61cf3..5469dea03e 100644
--- a/keras_nlp/src/bounding_box/iou_test.py
+++ b/keras_hub/src/bounding_box/iou_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 
 import numpy as np
 
-from keras_nlp.src.bounding_box import iou as iou_lib
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.bounding_box import iou as iou_lib
+from keras_hub.src.tests.test_case import TestCase
 
 
 class IoUTest(TestCase):
diff --git a/keras_nlp/src/bounding_box/to_dense.py b/keras_hub/src/bounding_box/to_dense.py
similarity index 93%
rename from keras_nlp/src/bounding_box/to_dense.py
rename to keras_hub/src/bounding_box/to_dense.py
index 3c42d09f4f..90ba829346 100644
--- a/keras_nlp/src/bounding_box/to_dense.py
+++ b/keras_hub/src/bounding_box/to_dense.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-import keras_nlp.src.bounding_box.validate_format as validate_format
-from keras_nlp.src.api_export import keras_nlp_export
+import keras_hub.src.bounding_box.validate_format as validate_format
+from keras_hub.src.api_export import keras_hub_export
 
 try:
     import tensorflow as tf
@@ -40,7 +40,7 @@ def _classes_shape(batched, classes_shape, max_boxes):
     return [max_boxes] + classes_shape[2:]
 
 
-@keras_nlp_export("keras_nlp.bounding_box.to_dense")
+@keras_hub_export("keras_hub.bounding_box.to_dense")
 def to_dense(bounding_boxes, max_boxes=None, default_value=-1):
     """to_dense converts bounding boxes to Dense tensors
 
diff --git a/keras_nlp/src/bounding_box/to_dense_test.py b/keras_hub/src/bounding_box/to_dense_test.py
similarity index 90%
rename from keras_nlp/src/bounding_box/to_dense_test.py
rename to keras_hub/src/bounding_box/to_dense_test.py
index 4bb795659b..d33d093cd2 100644
--- a/keras_nlp/src/bounding_box/to_dense_test.py
+++ b/keras_hub/src/bounding_box/to_dense_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 import tensorflow as tf
 from keras import backend
 
-from keras_nlp.src.bounding_box import to_dense
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.bounding_box import to_dense
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ToDenseTest(TestCase):
diff --git a/keras_nlp/src/bounding_box/to_ragged.py b/keras_hub/src/bounding_box/to_ragged.py
similarity index 93%
rename from keras_nlp/src/bounding_box/to_ragged.py
rename to keras_hub/src/bounding_box/to_ragged.py
index 2ebd4a00f4..104916af82 100644
--- a/keras_nlp/src/bounding_box/to_ragged.py
+++ b/keras_hub/src/bounding_box/to_ragged.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import keras
 
-import keras_nlp.src.bounding_box.validate_format as validate_format
-from keras_nlp.src.api_export import keras_nlp_export
+import keras_hub.src.bounding_box.validate_format as validate_format
+from keras_hub.src.api_export import keras_hub_export
 
 try:
     import tensorflow as tf
@@ -22,7 +22,7 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.bounding_box.to_ragged")
+@keras_hub_export("keras_hub.bounding_box.to_ragged")
 def to_ragged(bounding_boxes, sentinel=-1, dtype="float32"):
     """converts a Dense padded bounding box `tf.Tensor` to a `tf.RaggedTensor`.
 
diff --git a/keras_nlp/src/bounding_box/to_ragged_test.py b/keras_hub/src/bounding_box/to_ragged_test.py
similarity index 94%
rename from keras_nlp/src/bounding_box/to_ragged_test.py
rename to keras_hub/src/bounding_box/to_ragged_test.py
index cbe5146d11..203924671b 100644
--- a/keras_nlp/src/bounding_box/to_ragged_test.py
+++ b/keras_hub/src/bounding_box/to_ragged_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,9 +16,9 @@
 import pytest
 from keras import backend
 
-from keras_nlp.src.bounding_box import to_dense
-from keras_nlp.src.bounding_box import to_ragged
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.bounding_box import to_dense
+from keras_hub.src.bounding_box import to_ragged
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ToRaggedTest(TestCase):
diff --git a/keras_nlp/src/bounding_box/utils.py b/keras_hub/src/bounding_box/utils.py
similarity index 94%
rename from keras_nlp/src/bounding_box/utils.py
rename to keras_hub/src/bounding_box/utils.py
index a96c284a6c..4fe27e93b5 100644
--- a/keras_nlp/src/bounding_box/utils.py
+++ b/keras_hub/src/bounding_box/utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,12 +15,12 @@
 
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.bounding_box import converters
-from keras_nlp.src.bounding_box.formats import XYWH
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.bounding_box import converters
+from keras_hub.src.bounding_box.formats import XYWH
 
 
-@keras_nlp_export("keras_nlp.bounding_box.is_relative")
+@keras_hub_export("keras_hub.bounding_box.is_relative")
 def is_relative(bounding_box_format):
     """A util to check if a bounding box format uses relative coordinates"""
     if bounding_box_format.lower() not in converters.TO_XYXY_CONVERTERS:
@@ -34,7 +34,7 @@ def is_relative(bounding_box_format):
     return bounding_box_format.startswith("rel")
 
 
-@keras_nlp_export("keras_nlp.bounding_box.as_relative")
+@keras_hub_export("keras_hub.bounding_box.as_relative")
 def as_relative(bounding_box_format):
     """A util to get the relative equivalent of a provided bounding box format.
 
@@ -62,7 +62,7 @@ def _relative_area(boxes, bounding_box_format):
     )
 
 
-@keras_nlp_export("keras_nlp.bounding_box.clip_to_image")
+@keras_hub_export("keras_hub.bounding_box.clip_to_image")
 def clip_to_image(
     bounding_boxes, bounding_box_format, images=None, image_shape=None
 ):
@@ -128,7 +128,7 @@ class ID set to -1, indicating that there is no object present in them.
     return bounding_boxes
 
 
-@keras_nlp_export("keras_nlp.bounding_box.clip_boxes")
+@keras_hub_export("keras_hub.bounding_box.clip_boxes")
 def clip_boxes(boxes, image_shape):
     """Clip boxes to the boundaries of the image shape"""
     if boxes.shape[-1] != 4:
diff --git a/keras_nlp/src/bounding_box/utils_test.py b/keras_hub/src/bounding_box/utils_test.py
similarity index 97%
rename from keras_nlp/src/bounding_box/utils_test.py
rename to keras_hub/src/bounding_box/utils_test.py
index cf61436846..044fa088cc 100644
--- a/keras_nlp/src/bounding_box/utils_test.py
+++ b/keras_hub/src/bounding_box/utils_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 import numpy as np
 from keras import ops
 
-from keras_nlp.src.bounding_box import utils
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.bounding_box import utils
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BoundingBoxUtilTest(TestCase):
diff --git a/keras_nlp/src/bounding_box/validate_format.py b/keras_hub/src/bounding_box/validate_format.py
similarity index 95%
rename from keras_nlp/src/bounding_box/validate_format.py
rename to keras_hub/src/bounding_box/validate_format.py
index 51fb310807..5cd5660b52 100644
--- a/keras_nlp/src/bounding_box/validate_format.py
+++ b/keras_hub/src/bounding_box/validate_format.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 try:
     import tensorflow as tf
@@ -20,9 +20,9 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.bounding_box.validate_format")
+@keras_hub_export("keras_hub.bounding_box.validate_format")
 def validate_format(bounding_boxes, variable_name="bounding_boxes"):
-    """validates that a given set of bounding boxes complies with KerasNLP
+    """validates that a given set of bounding boxes complies with KerasHub
     format.
 
     For a set of bounding boxes to be valid it must satisfy the following
diff --git a/keras_nlp/src/bounding_box/validate_format_test.py b/keras_hub/src/bounding_box/validate_format_test.py
similarity index 91%
rename from keras_nlp/src/bounding_box/validate_format_test.py
rename to keras_hub/src/bounding_box/validate_format_test.py
index 020279f334..496d9fc729 100644
--- a/keras_nlp/src/bounding_box/validate_format_test.py
+++ b/keras_hub/src/bounding_box/validate_format_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.bounding_box import validate_format
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.bounding_box import validate_format
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ValidateTest(TestCase):
diff --git a/keras_nlp/src/__init__.py b/keras_hub/src/layers/__init__.py
similarity index 93%
rename from keras_nlp/src/__init__.py
rename to keras_hub/src/layers/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/keras_nlp/src/__init__.py
+++ b/keras_hub/src/layers/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/bounding_box/__init__.py b/keras_hub/src/layers/modeling/__init__.py
similarity index 93%
rename from keras_nlp/src/bounding_box/__init__.py
rename to keras_hub/src/layers/modeling/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/keras_nlp/src/bounding_box/__init__.py
+++ b/keras_hub/src/layers/modeling/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/layers/modeling/alibi_bias.py b/keras_hub/src/layers/modeling/alibi_bias.py
similarity index 96%
rename from keras_nlp/src/layers/modeling/alibi_bias.py
rename to keras_hub/src/layers/modeling/alibi_bias.py
index c109d16b0c..6de1df61da 100644
--- a/keras_nlp/src/layers/modeling/alibi_bias.py
+++ b/keras_hub/src/layers/modeling/alibi_bias.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,10 +16,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.layers.AlibiBias")
+@keras_hub_export("keras_hub.layers.AlibiBias")
 class AlibiBias(keras.layers.Layer):
     """A layer that adds the alibi bias to attention scores.
 
@@ -53,7 +53,7 @@ class AlibiBias(keras.layers.Layer):
     hidden_dim = 8
 
     # Create new alibi layer.
-    alibi_layer = keras_nlp.layers.AlibiBias()
+    alibi_layer = keras_hub.layers.AlibiBias()
 
     query = np.zeros((batch_size, num_heads, query_length, hidden_dim))
     key = np.zeros((batch_size, num_heads, hidden_dim, key_length))
diff --git a/keras_nlp/src/layers/modeling/alibi_bias_test.py b/keras_hub/src/layers/modeling/alibi_bias_test.py
similarity index 97%
rename from keras_nlp/src/layers/modeling/alibi_bias_test.py
rename to keras_hub/src/layers/modeling/alibi_bias_test.py
index beb5a482b0..5f8075aa1d 100644
--- a/keras_nlp/src/layers/modeling/alibi_bias_test.py
+++ b/keras_hub/src/layers/modeling/alibi_bias_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.alibi_bias import AlibiBias
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.modeling.alibi_bias import AlibiBias
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AlibiBiasTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/cached_multi_head_attention.py b/keras_hub/src/layers/modeling/cached_multi_head_attention.py
similarity index 97%
rename from keras_nlp/src/layers/modeling/cached_multi_head_attention.py
rename to keras_hub/src/layers/modeling/cached_multi_head_attention.py
index 7c18d63e89..ad83f79c67 100644
--- a/keras_nlp/src/layers/modeling/cached_multi_head_attention.py
+++ b/keras_hub/src/layers/modeling/cached_multi_head_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.layers.CachedMultiHeadAttention")
+@keras_hub_export("keras_hub.layers.CachedMultiHeadAttention")
 class CachedMultiHeadAttention(keras.layers.MultiHeadAttention):
     """MultiHeadAttention layer with cache support.
 
diff --git a/keras_nlp/src/layers/modeling/cached_multi_head_attention_test.py b/keras_hub/src/layers/modeling/cached_multi_head_attention_test.py
similarity index 96%
rename from keras_nlp/src/layers/modeling/cached_multi_head_attention_test.py
rename to keras_hub/src/layers/modeling/cached_multi_head_attention_test.py
index e4d170e067..27eeb99573 100644
--- a/keras_nlp/src/layers/modeling/cached_multi_head_attention_test.py
+++ b/keras_hub/src/layers/modeling/cached_multi_head_attention_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.cached_multi_head_attention import (
+from keras_hub.src.layers.modeling.cached_multi_head_attention import (
     CachedMultiHeadAttention,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class CachedMultiHeadAttentionTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/f_net_encoder.py b/keras_hub/src/layers/modeling/f_net_encoder.py
similarity index 96%
rename from keras_nlp/src/layers/modeling/f_net_encoder.py
rename to keras_hub/src/layers/modeling/f_net_encoder.py
index 4b650eea5c..3fa5b22655 100644
--- a/keras_nlp/src/layers/modeling/f_net_encoder.py
+++ b/keras_hub/src/layers/modeling/f_net_encoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,11 +15,11 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
-@keras_nlp_export("keras_nlp.layers.FNetEncoder")
+@keras_hub_export("keras_hub.layers.FNetEncoder")
 class FNetEncoder(keras.layers.Layer):
     """FNet encoder.
 
@@ -55,7 +55,7 @@ class FNetEncoder(keras.layers.Layer):
 
     ```python
     # Create a single FNet encoder layer.
-    encoder = keras_nlp.layers.FNetEncoder(
+    encoder = keras_hub.layers.FNetEncoder(
         intermediate_dim=64)
 
     # Create a simple model containing the encoder.
diff --git a/keras_nlp/src/layers/modeling/f_net_encoder_test.py b/keras_hub/src/layers/modeling/f_net_encoder_test.py
similarity index 94%
rename from keras_nlp/src/layers/modeling/f_net_encoder_test.py
rename to keras_hub/src/layers/modeling/f_net_encoder_test.py
index 79270f6fed..a594c4e84d 100644
--- a/keras_nlp/src/layers/modeling/f_net_encoder_test.py
+++ b/keras_hub/src/layers/modeling/f_net_encoder_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.f_net_encoder import FNetEncoder
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.modeling.f_net_encoder import FNetEncoder
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FNetEncoderTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/masked_lm_head.py b/keras_hub/src/layers/modeling/masked_lm_head.py
similarity index 96%
rename from keras_nlp/src/layers/modeling/masked_lm_head.py
rename to keras_hub/src/layers/modeling/masked_lm_head.py
index 962b2883c5..9979625f7a 100644
--- a/keras_nlp/src/layers/modeling/masked_lm_head.py
+++ b/keras_hub/src/layers/modeling/masked_lm_head.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.layers.MaskedLMHead")
+@keras_hub_export("keras_hub.layers.MaskedLMHead")
 class MaskedLMHead(keras.layers.Layer):
     """Masked Language Model (MaskedLM) head.
 
@@ -40,12 +40,12 @@ class MaskedLMHead(keras.layers.Layer):
     `(batch_size, masks_per_sequence, vocabulary_size)`, which can be used to
     compute an MaskedLM loss function.
 
-    This layer is often be paired with `keras_nlp.layers.MaskedLMMaskGenerator`,
+    This layer is often paired with `keras_hub.layers.MaskedLMMaskGenerator`,
     which will help prepare inputs for the MaskedLM task.
 
     Args:
         vocabulary_size: The total size of the vocabulary for predictions.
-        token_embedding: Optional. A `keras_nlp.layers.ReversibleEmbedding`
+        token_embedding: Optional. A `keras_hub.layers.ReversibleEmbedding`
             instance. If passed, the layer will be used to project from the
             `hidden_dim` of the model to the output `vocabulary_size`.
         intermediate_activation: The activation function of intermediate dense layer.
@@ -77,13 +77,13 @@ class MaskedLMHead(keras.layers.Layer):
     mask_positions = np.random.randint(seq_length, size=(batch_size, 5))
 
     # Embed tokens in a `hidden_dim` feature space.
-    token_embedding = keras_nlp.layers.ReversibleEmbedding(
+    token_embedding = keras_hub.layers.ReversibleEmbedding(
         vocab_size,
         hidden_dim,
     )
     hidden_states = token_embedding(token_ids)
 
-    preds = keras_nlp.layers.MaskedLMHead(
+    preds = keras_hub.layers.MaskedLMHead(
         vocabulary_size=vocab_size,
         token_embedding=token_embedding,
         activation="softmax",
diff --git a/keras_nlp/src/layers/modeling/masked_lm_head_test.py b/keras_hub/src/layers/modeling/masked_lm_head_test.py
similarity index 92%
rename from keras_nlp/src/layers/modeling/masked_lm_head_test.py
rename to keras_hub/src/layers/modeling/masked_lm_head_test.py
index b035390eb1..bd9daf92ea 100644
--- a/keras_nlp/src/layers/modeling/masked_lm_head_test.py
+++ b/keras_hub/src/layers/modeling/masked_lm_head_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 from keras import random
 
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MaskedLMHeadTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/position_embedding.py b/keras_hub/src/layers/modeling/position_embedding.py
similarity index 93%
rename from keras_nlp/src/layers/modeling/position_embedding.py
rename to keras_hub/src/layers/modeling/position_embedding.py
index 205417beaa..f59ac87a9d 100644
--- a/keras_nlp/src/layers/modeling/position_embedding.py
+++ b/keras_hub/src/layers/modeling/position_embedding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.layers.PositionEmbedding")
+@keras_hub_export("keras_hub.layers.PositionEmbedding")
 class PositionEmbedding(keras.layers.Layer):
     """A layer which learns a position embedding for inputs sequences.
 
@@ -49,7 +49,7 @@ class PositionEmbedding(keras.layers.Layer):
     Example:
 
     Called directly on input.
-    >>> layer = keras_nlp.layers.PositionEmbedding(sequence_length=10)
+    >>> layer = keras_hub.layers.PositionEmbedding(sequence_length=10)
     >>> layer(np.zeros((8, 10, 16)))
 
     Combine with a token embedding.
@@ -61,7 +61,7 @@ class PositionEmbedding(keras.layers.Layer):
     token_embeddings = keras.layers.Embedding(
         input_dim=vocab_size, output_dim=embed_dim
     )(inputs)
-    position_embeddings = keras_nlp.layers.PositionEmbedding(
+    position_embeddings = keras_hub.layers.PositionEmbedding(
         sequence_length=seq_length
     )(token_embeddings)
     outputs = token_embeddings + position_embeddings
diff --git a/keras_nlp/src/layers/modeling/position_embedding_test.py b/keras_hub/src/layers/modeling/position_embedding_test.py
similarity index 97%
rename from keras_nlp/src/layers/modeling/position_embedding_test.py
rename to keras_hub/src/layers/modeling/position_embedding_test.py
index 3d099b57d6..02d224e762 100644
--- a/keras_nlp/src/layers/modeling/position_embedding_test.py
+++ b/keras_hub/src/layers/modeling/position_embedding_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,8 +17,8 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.tests.test_case import TestCase
 
 
 def custom_init(shape, dtype=None):
diff --git a/keras_nlp/src/layers/modeling/reversible_embedding.py b/keras_hub/src/layers/modeling/reversible_embedding.py
similarity index 97%
rename from keras_nlp/src/layers/modeling/reversible_embedding.py
rename to keras_hub/src/layers/modeling/reversible_embedding.py
index 485fb45606..963085d5d8 100644
--- a/keras_nlp/src/layers/modeling/reversible_embedding.py
+++ b/keras_hub/src/layers/modeling/reversible_embedding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 from keras import ops
 from packaging.version import parse
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.keras_utils import assert_quantization_support
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.keras_utils import assert_quantization_support
 
 
-@keras_nlp_export("keras_nlp.layers.ReversibleEmbedding")
+@keras_hub_export("keras_hub.layers.ReversibleEmbedding")
 class ReversibleEmbedding(keras.layers.Embedding):
     """An embedding layer which can project backwards to the input dim.
 
@@ -75,7 +75,7 @@ class ReversibleEmbedding(keras.layers.Embedding):
     # Generate random inputs.
     token_ids = np.random.randint(vocab_size, size=(batch_size, seq_length))
 
-    embedding = keras_nlp.layers.ReversibleEmbedding(vocab_size, hidden_dim)
+    embedding = keras_hub.layers.ReversibleEmbedding(vocab_size, hidden_dim)
     # Embed tokens to shape `(batch_size, seq_length, hidden_dim)`.
     hidden_states = embedding(token_ids)
     # Project hidden states to shape `(batch_size, seq_length, vocab_size)`.
diff --git a/keras_nlp/src/layers/modeling/reversible_embedding_test.py b/keras_hub/src/layers/modeling/reversible_embedding_test.py
similarity index 96%
rename from keras_nlp/src/layers/modeling/reversible_embedding_test.py
rename to keras_hub/src/layers/modeling/reversible_embedding_test.py
index 9d6921a126..80ad2457ee 100644
--- a/keras_nlp/src/layers/modeling/reversible_embedding_test.py
+++ b/keras_hub/src/layers/modeling/reversible_embedding_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -20,11 +20,11 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.utils.keras_utils import has_quantization_support
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.utils.keras_utils import has_quantization_support
 
 
 class ReversibleEmbeddingTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/rotary_embedding.py b/keras_hub/src/layers/modeling/rotary_embedding.py
similarity index 97%
rename from keras_nlp/src/layers/modeling/rotary_embedding.py
rename to keras_hub/src/layers/modeling/rotary_embedding.py
index 9b49a1f3ee..0350c6b687 100644
--- a/keras_nlp/src/layers/modeling/rotary_embedding.py
+++ b/keras_hub/src/layers/modeling/rotary_embedding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.layers.RotaryEmbedding")
+@keras_hub_export("keras_hub.layers.RotaryEmbedding")
 class RotaryEmbedding(keras.layers.Layer):
     """Rotary positional encoding layer.
 
diff --git a/keras_nlp/src/layers/modeling/rotary_embedding_test.py b/keras_hub/src/layers/modeling/rotary_embedding_test.py
similarity index 98%
rename from keras_nlp/src/layers/modeling/rotary_embedding_test.py
rename to keras_hub/src/layers/modeling/rotary_embedding_test.py
index fa176b8ebd..73678308d4 100644
--- a/keras_nlp/src/layers/modeling/rotary_embedding_test.py
+++ b/keras_hub/src/layers/modeling/rotary_embedding_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RotaryEmbeddingTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/sine_position_encoding.py b/keras_hub/src/layers/modeling/sine_position_encoding.py
similarity index 94%
rename from keras_nlp/src/layers/modeling/sine_position_encoding.py
rename to keras_hub/src/layers/modeling/sine_position_encoding.py
index b0c4b07fae..4294052a52 100644
--- a/keras_nlp/src/layers/modeling/sine_position_encoding.py
+++ b/keras_hub/src/layers/modeling/sine_position_encoding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.layers.SinePositionEncoding")
+@keras_hub_export("keras_hub.layers.SinePositionEncoding")
 class SinePositionEncoding(keras.layers.Layer):
     """Sinusoidal positional encoding layer.
 
@@ -55,7 +55,7 @@ class SinePositionEncoding(keras.layers.Layer):
     embedding = keras.layers.Embedding(
         input_dim=vocab_size, output_dim=embedding_dim
     )(inputs)
-    positional_encoding = keras_nlp.layers.SinePositionEncoding()(embedding)
+    positional_encoding = keras_hub.layers.SinePositionEncoding()(embedding)
     outputs = embedding + positional_encoding
     ```
 
diff --git a/keras_nlp/src/layers/modeling/sine_position_encoding_test.py b/keras_hub/src/layers/modeling/sine_position_encoding_test.py
similarity index 96%
rename from keras_nlp/src/layers/modeling/sine_position_encoding_test.py
rename to keras_hub/src/layers/modeling/sine_position_encoding_test.py
index 4c1c107647..edecda3fbe 100644
--- a/keras_nlp/src/layers/modeling/sine_position_encoding_test.py
+++ b/keras_hub/src/layers/modeling/sine_position_encoding_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,10 +16,10 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.sine_position_encoding import (
+from keras_hub.src.layers.modeling.sine_position_encoding import (
     SinePositionEncoding,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class SinePositionEncodingTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/token_and_position_embedding.py b/keras_hub/src/layers/modeling/token_and_position_embedding.py
similarity index 91%
rename from keras_nlp/src/layers/modeling/token_and_position_embedding.py
rename to keras_hub/src/layers/modeling/token_and_position_embedding.py
index 38988f2478..20cbcfd00d 100644
--- a/keras_nlp/src/layers/modeling/token_and_position_embedding.py
+++ b/keras_hub/src/layers/modeling/token_and_position_embedding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,21 +14,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
-@keras_nlp_export("keras_nlp.layers.TokenAndPositionEmbedding")
+@keras_hub_export("keras_hub.layers.TokenAndPositionEmbedding")
 class TokenAndPositionEmbedding(keras.layers.Layer):
     """A layer which sums a token and position embedding.
 
     Token and position embeddings are ways of representing words and their order
     in a sentence. This layer creates a `keras.layers.Embedding` token embedding
-    and a `keras_nlp.layers.PositionEmbedding` position embedding and sums their
+    and a `keras_hub.layers.PositionEmbedding` position embedding and sums their
     output when called. This layer assumes that the last dimension in the input
     corresponds to the sequence dimension.
 
@@ -55,7 +55,7 @@ class TokenAndPositionEmbedding(keras.layers.Layer):
     Example:
     ```python
     inputs = np.ones(shape=(1, 50), dtype="int32")
-    embedding_layer = keras_nlp.layers.TokenAndPositionEmbedding(
+    embedding_layer = keras_hub.layers.TokenAndPositionEmbedding(
         vocabulary_size=10_000,
         sequence_length=50,
         embedding_dim=128,
diff --git a/keras_nlp/src/layers/modeling/token_and_position_embedding_test.py b/keras_hub/src/layers/modeling/token_and_position_embedding_test.py
similarity index 91%
rename from keras_nlp/src/layers/modeling/token_and_position_embedding_test.py
rename to keras_hub/src/layers/modeling/token_and_position_embedding_test.py
index bc27a3cbec..dc52a56245 100644
--- a/keras_nlp/src/layers/modeling/token_and_position_embedding_test.py
+++ b/keras_hub/src/layers/modeling/token_and_position_embedding_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,10 +16,10 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.token_and_position_embedding import (
+from keras_hub.src.layers.modeling.token_and_position_embedding import (
     TokenAndPositionEmbedding,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TokenAndPositionEmbeddingTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/transformer_decoder.py b/keras_hub/src/layers/modeling/transformer_decoder.py
similarity index 97%
rename from keras_nlp/src/layers/modeling/transformer_decoder.py
rename to keras_hub/src/layers/modeling/transformer_decoder.py
index c7c0064305..a5e6d82327 100644
--- a/keras_nlp/src/layers/modeling/transformer_decoder.py
+++ b/keras_hub/src/layers/modeling/transformer_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,19 +15,19 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.cached_multi_head_attention import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.cached_multi_head_attention import (
     CachedMultiHeadAttention,
 )
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (  # isort:skip
+from keras_hub.src.layers.modeling.transformer_layer_utils import (  # isort:skip
     compute_causal_mask,
     merge_padding_and_attention_mask,
 )
 
 
-@keras_nlp_export("keras_nlp.layers.TransformerDecoder")
+@keras_hub_export("keras_hub.layers.TransformerDecoder")
 class TransformerDecoder(keras.layers.Layer):
     """Transformer decoder.
 
@@ -76,7 +76,7 @@ class TransformerDecoder(keras.layers.Layer):
     Example:
     ```python
     # Create a single transformer decoder layer.
-    decoder = keras_nlp.layers.TransformerDecoder(
+    decoder = keras_hub.layers.TransformerDecoder(
         intermediate_dim=64, num_heads=8)
 
     # Create a simple model containing the decoder.
@@ -320,7 +320,7 @@ def call(
         if not has_cross_attention and has_encoder_sequence:
             raise ValueError(
                 "The number of call arguments to "
-                "`keras_nlp.layers.TransformerDecoder` should not change. "
+                "`keras_hub.layers.TransformerDecoder` should not change. "
                 "Use `layer(decoder_sequence, encoder_sequence)` to "
                 "build a layer with cross attention, or "
                 "`layer(decoder_sequence)` to build a layer without. "
@@ -330,7 +330,7 @@ def call(
         elif has_cross_attention and not has_encoder_sequence:
             raise ValueError(
                 "The number of call arguments to "
-                "`keras_nlp.layers.TransformerDecoder` should not change. "
+                "`keras_hub.layers.TransformerDecoder` should not change. "
                 "Use `layer(decoder_sequence, encoder_sequence)` to "
                 "build a layer with cross attention, or "
                 "`layer(decoder_sequence)` to build a layer without. "
@@ -344,7 +344,7 @@ def call(
             has_self_attention_cache != has_cross_attention_cache
         ):
             raise ValueError(
-                "When calling `keras_nlp.layers.TransformerDecoder` with "
+                "When calling `keras_hub.layers.TransformerDecoder` with "
                 "cross-attention (with both `encoder_sequence` and "
                 "`decoder_sequence`), `self_attention_cache` and "
                 "`cross_attention_cache` should both be set or both be `None`. "
diff --git a/keras_nlp/src/layers/modeling/transformer_decoder_test.py b/keras_hub/src/layers/modeling/transformer_decoder_test.py
similarity index 98%
rename from keras_nlp/src/layers/modeling/transformer_decoder_test.py
rename to keras_hub/src/layers/modeling/transformer_decoder_test.py
index 73799883ee..73bb9d225a 100644
--- a/keras_nlp/src/layers/modeling/transformer_decoder_test.py
+++ b/keras_hub/src/layers/modeling/transformer_decoder_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.transformer_decoder import TransformerDecoder
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.modeling.transformer_decoder import TransformerDecoder
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TransformerDecoderTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/transformer_encoder.py b/keras_hub/src/layers/modeling/transformer_encoder.py
similarity index 96%
rename from keras_nlp/src/layers/modeling/transformer_encoder.py
rename to keras_hub/src/layers/modeling/transformer_encoder.py
index db96ccbb1e..614ba0f0f4 100644
--- a/keras_nlp/src/layers/modeling/transformer_encoder.py
+++ b/keras_hub/src/layers/modeling/transformer_encoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,15 +14,15 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.keras_utils import clone_initializer
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (  # isort:skip
+from keras_hub.src.layers.modeling.transformer_layer_utils import (  # isort:skip
     merge_padding_and_attention_mask,
 )
 
 
-@keras_nlp_export("keras_nlp.layers.TransformerEncoder")
+@keras_hub_export("keras_hub.layers.TransformerEncoder")
 class TransformerEncoder(keras.layers.Layer):
     """Transformer encoder.
 
@@ -66,7 +66,7 @@ class TransformerEncoder(keras.layers.Layer):
 
     ```python
     # Create a single transformer encoder layer.
-    encoder = keras_nlp.layers.TransformerEncoder(
+    encoder = keras_hub.layers.TransformerEncoder(
         intermediate_dim=64, num_heads=8)
 
     # Create a simple model containing the encoder.
diff --git a/keras_nlp/src/layers/modeling/transformer_encoder_test.py b/keras_hub/src/layers/modeling/transformer_encoder_test.py
similarity index 95%
rename from keras_nlp/src/layers/modeling/transformer_encoder_test.py
rename to keras_hub/src/layers/modeling/transformer_encoder_test.py
index 623f9202f2..9640d02a19 100644
--- a/keras_nlp/src/layers/modeling/transformer_encoder_test.py
+++ b/keras_hub/src/layers/modeling/transformer_encoder_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,8 +17,8 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TransformerEncoderTest(TestCase):
diff --git a/keras_nlp/src/layers/modeling/transformer_layer_utils.py b/keras_hub/src/layers/modeling/transformer_layer_utils.py
similarity index 99%
rename from keras_nlp/src/layers/modeling/transformer_layer_utils.py
rename to keras_hub/src/layers/modeling/transformer_layer_utils.py
index fff8ce76bc..de18fec771 100644
--- a/keras_nlp/src/layers/modeling/transformer_layer_utils.py
+++ b/keras_hub/src/layers/modeling/transformer_layer_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/layers/modeling/transformer_layer_utils_test.py b/keras_hub/src/layers/modeling/transformer_layer_utils_test.py
similarity index 92%
rename from keras_nlp/src/layers/modeling/transformer_layer_utils_test.py
rename to keras_hub/src/layers/modeling/transformer_layer_utils_test.py
index e914d8cee1..8fc232b1cb 100644
--- a/keras_nlp/src/layers/modeling/transformer_layer_utils_test.py
+++ b/keras_hub/src/layers/modeling/transformer_layer_utils_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 from keras import ops
 from keras import random
 
-import keras_nlp.src.layers.modeling.transformer_layer_utils as utils
-from keras_nlp.src.tests.test_case import TestCase
+import keras_hub.src.layers.modeling.transformer_layer_utils as utils
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TransformerLayerUtilsTest(TestCase):
diff --git a/keras_hub/src/layers/preprocessing/__init__.py b/keras_hub/src/layers/preprocessing/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/layers/preprocessing/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/layers/preprocessing/audio_converter.py b/keras_hub/src/layers/preprocessing/audio_converter.py
similarity index 81%
rename from keras_nlp/src/layers/preprocessing/audio_converter.py
rename to keras_hub/src/layers/preprocessing/audio_converter.py
index aa70ebcb15..a07b45cd19 100644
--- a/keras_nlp/src/layers/preprocessing/audio_converter.py
+++ b/keras_hub/src/layers/preprocessing/audio_converter.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,19 +11,19 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.preset_utils import AUDIO_CONVERTER_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import builtin_presets
-from keras_nlp.src.utils.preset_utils import find_subclass
-from keras_nlp.src.utils.preset_utils import get_preset_loader
-from keras_nlp.src.utils.preset_utils import save_serialized_object
-from keras_nlp.src.utils.python_utils import classproperty
+from keras_hub.src.utils.preset_utils import AUDIO_CONVERTER_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import builtin_presets
+from keras_hub.src.utils.preset_utils import find_subclass
+from keras_hub.src.utils.preset_utils import get_preset_loader
+from keras_hub.src.utils.preset_utils import save_serialized_object
+from keras_hub.src.utils.python_utils import classproperty
 
 
-@keras_nlp_export("keras_nlp.layers.AudioConverter")
+@keras_hub_export("keras_hub.layers.AudioConverter")
 class AudioConverter(PreprocessingLayer):
     """Convert raw audio for models that support audio input.
 
@@ -41,7 +41,7 @@ class AudioConverter(PreprocessingLayer):
     Examples:
     ```python
     # Load an audio converter from a preset.
-    converter = keras_nlp.layers.AudioConverter.from_preset("whisper_base_en")
+    converter = keras_hub.layers.AudioConverter.from_preset("whisper_base_en")
     # Convert some raw audio input.
     converter(np.ones(2, 1_000))
     ```
@@ -64,7 +64,7 @@ def from_preset(
         preset,
         **kwargs,
     ):
-        """Instantiate a `keras_nlp.layers.AudioConverter` from a model preset.
+        """Instantiate a `keras_hub.layers.AudioConverter` from a model preset.
 
         A preset is a directory of configs, weights and other file assets used
         to save and load a pre-trained model. The `preset` can be passed as
@@ -80,8 +80,8 @@ def from_preset(
         on the class.
 
         This constructor can be called in one of two ways. Either from the base
-        class like `keras_nlp.models.AudioConverter.from_preset()`, or from a
-        model class like `keras_nlp.models.WhisperAudioConverter.from_preset()`.
+        class like `keras_hub.models.AudioConverter.from_preset()`, or from a
+        model class like `keras_hub.models.WhisperAudioConverter.from_preset()`.
         If calling from the base class, the subclass of the returning object
         will be inferred from the config in the preset directory.
 
@@ -95,7 +95,7 @@ class like `keras_nlp.models.AudioConverter.from_preset()`, or from a
         Examples:
         ```python
         # Load an audio converter from a preset.
-        converter = keras_nlp.layers.AudioConverter.from_preset(
+        converter = keras_hub.layers.AudioConverter.from_preset(
             "whisper_base_en"
         )
         # Convert some raw mono channel audio input.
diff --git a/keras_nlp/src/layers/preprocessing/audio_converter_test.py b/keras_hub/src/layers/preprocessing/audio_converter_test.py
similarity index 89%
rename from keras_nlp/src/layers/preprocessing/audio_converter_test.py
rename to keras_hub/src/layers/preprocessing/audio_converter_test.py
index 42a21c5e0f..be06bac65d 100644
--- a/keras_nlp/src/layers/preprocessing/audio_converter_test.py
+++ b/keras_hub/src/layers/preprocessing/audio_converter_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,12 +18,12 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.layers.preprocessing.audio_converter import AudioConverter
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.whisper.whisper_audio_converter import (
+from keras_hub.src.layers.preprocessing.audio_converter import AudioConverter
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.whisper.whisper_audio_converter import (
     WhisperAudioConverter,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AudioConverterTest(TestCase):
diff --git a/keras_nlp/src/layers/preprocessing/image_converter.py b/keras_hub/src/layers/preprocessing/image_converter.py
similarity index 80%
rename from keras_nlp/src/layers/preprocessing/image_converter.py
rename to keras_hub/src/layers/preprocessing/image_converter.py
index e53803f49e..893ad6ccf3 100644
--- a/keras_nlp/src/layers/preprocessing/image_converter.py
+++ b/keras_hub/src/layers/preprocessing/image_converter.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,19 +11,19 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.preset_utils import IMAGE_CONVERTER_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import builtin_presets
-from keras_nlp.src.utils.preset_utils import find_subclass
-from keras_nlp.src.utils.preset_utils import get_preset_loader
-from keras_nlp.src.utils.preset_utils import save_serialized_object
-from keras_nlp.src.utils.python_utils import classproperty
+from keras_hub.src.utils.preset_utils import IMAGE_CONVERTER_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import builtin_presets
+from keras_hub.src.utils.preset_utils import find_subclass
+from keras_hub.src.utils.preset_utils import get_preset_loader
+from keras_hub.src.utils.preset_utils import save_serialized_object
+from keras_hub.src.utils.python_utils import classproperty
 
 
-@keras_nlp_export("keras_nlp.layers.ImageConverter")
+@keras_hub_export("keras_hub.layers.ImageConverter")
 class ImageConverter(PreprocessingLayer):
     """Convert raw image for models that support image input.
 
@@ -42,10 +42,10 @@ class ImageConverter(PreprocessingLayer):
     Examples:
     ```python
     # Resize images for `"pali_gemma_3b_224"`.
-    converter = keras_nlp.layers.ImageConverter.from_preset("pali_gemma_3b_224")
+    converter = keras_hub.layers.ImageConverter.from_preset("pali_gemma_3b_224")
     converter(np.ones(2, 512, 512, 3)) # Output shape: (2, 224, 224, 3)
     # Resize images for `"pali_gemma_3b_448"`.
-    converter = keras_nlp.layers.ImageConverter.from_preset("pali_gemma_3b_448")
+    converter = keras_hub.layers.ImageConverter.from_preset("pali_gemma_3b_448")
     converter(np.ones(2, 512, 512, 3)) # Output shape: (2, 448, 448, 3)
     ```
     """
@@ -67,7 +67,7 @@ def from_preset(
         preset,
         **kwargs,
     ):
-        """Instantiate a `keras_nlp.layers.ImageConverter` from a model preset.
+        """Instantiate a `keras_hub.layers.ImageConverter` from a model preset.
 
         A preset is a directory of configs, weights and other file assets used
         to save and load a pre-trained model. The `preset` can be passed as
@@ -83,9 +83,9 @@ def from_preset(
         on the class.
 
         This constructor can be called in one of two ways. Either from the base
-        class like `keras_nlp.models.ImageConverter.from_preset()`, or from a
+        class like `keras_hub.models.ImageConverter.from_preset()`, or from a
         model class like
-        `keras_nlp.models.PaliGemmaImageConverter.from_preset()`. If calling
+        `keras_hub.models.PaliGemmaImageConverter.from_preset()`. If calling
         from the base class, the subclass of the returning object will be
         inferred from the config in the preset directory.
 
@@ -99,12 +99,12 @@ class like `keras_nlp.models.ImageConverter.from_preset()`, or from a
         Examples:
         ```python
         # Resize images for `"pali_gemma_3b_224"`.
-        converter = keras_nlp.layers.ImageConverter.from_preset(
+        converter = keras_hub.layers.ImageConverter.from_preset(
             "pali_gemma_3b_224"
         )
         converter(np.ones(2, 512, 512, 3)) # Output shape: (2, 224, 224, 3)
         # Override arguments on the base class.
-        converter = keras_nlp.layers.ImageConverter.from_preset(
+        converter = keras_hub.layers.ImageConverter.from_preset(
             "pali_gemma_3b_448",
             crop_to_aspect_ratio=False,
         )
diff --git a/keras_nlp/src/layers/preprocessing/image_converter_test.py b/keras_hub/src/layers/preprocessing/image_converter_test.py
similarity index 90%
rename from keras_nlp/src/layers/preprocessing/image_converter_test.py
rename to keras_hub/src/layers/preprocessing/image_converter_test.py
index f2b1754033..46fc837eaa 100644
--- a/keras_nlp/src/layers/preprocessing/image_converter_test.py
+++ b/keras_hub/src/layers/preprocessing/image_converter_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,14 +18,14 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.layers.preprocessing.image_converter import ImageConverter
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.layers.preprocessing.image_converter import ImageConverter
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_image_converter import (
+from keras_hub.src.models.pali_gemma.pali_gemma_image_converter import (
     PaliGemmaImageConverter,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ImageConverterTest(TestCase):
diff --git a/keras_nlp/src/layers/preprocessing/masked_lm_mask_generator.py b/keras_hub/src/layers/preprocessing/masked_lm_mask_generator.py
similarity index 94%
rename from keras_nlp/src/layers/preprocessing/masked_lm_mask_generator.py
rename to keras_hub/src/layers/preprocessing/masked_lm_mask_generator.py
index fe78e0c172..f38618d361 100644
--- a/keras_nlp/src/layers/preprocessing/masked_lm_mask_generator.py
+++ b/keras_hub/src/layers/preprocessing/masked_lm_mask_generator.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -28,7 +28,7 @@
     tf_text = None
 
 
-@keras_nlp_export("keras_nlp.layers.MaskedLMMaskGenerator")
+@keras_hub_export("keras_hub.layers.MaskedLMMaskGenerator")
 class MaskedLMMaskGenerator(PreprocessingLayer):
     """Layer that applies language model masking.
 
@@ -87,7 +87,7 @@ class MaskedLMMaskGenerator(PreprocessingLayer):
 
     Basic usage.
     ```python
-    masker = keras_nlp.layers.MaskedLMMaskGenerator(
+    masker = keras_hub.layers.MaskedLMMaskGenerator(
         vocabulary_size=10,
         mask_selection_rate=0.2,
         mask_token_id=0,
@@ -108,7 +108,7 @@ class MaskedLMMaskGenerator(PreprocessingLayer):
         [cls_id,   4,    5, sep_id,      6,    7,    8,      9, sep_id, pad_id],
     ]
 
-    masker = keras_nlp.layers.MaskedLMMaskGenerator(
+    masker = keras_hub.layers.MaskedLMMaskGenerator(
         vocabulary_size = 10,
         mask_selection_rate = 0.2,
         mask_selection_length = 5,
diff --git a/keras_nlp/src/layers/preprocessing/masked_lm_mask_generator_test.py b/keras_hub/src/layers/preprocessing/masked_lm_mask_generator_test.py
similarity index 97%
rename from keras_nlp/src/layers/preprocessing/masked_lm_mask_generator_test.py
rename to keras_hub/src/layers/preprocessing/masked_lm_mask_generator_test.py
index cfc497c6f5..ce2c3653a0 100644
--- a/keras_nlp/src/layers/preprocessing/masked_lm_mask_generator_test.py
+++ b/keras_hub/src/layers/preprocessing/masked_lm_mask_generator_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import tensorflow as tf
 from keras import ops
 
-from keras_nlp.src.layers.preprocessing.masked_lm_mask_generator import (
+from keras_hub.src.layers.preprocessing.masked_lm_mask_generator import (
     MaskedLMMaskGenerator,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MaskedLMMaskGeneratorTest(TestCase):
diff --git a/keras_nlp/src/layers/preprocessing/multi_segment_packer.py b/keras_hub/src/layers/preprocessing/multi_segment_packer.py
similarity index 96%
rename from keras_nlp/src/layers/preprocessing/multi_segment_packer.py
rename to keras_hub/src/layers/preprocessing/multi_segment_packer.py
index 53625783cc..c7ec72a603 100644
--- a/keras_nlp/src/layers/preprocessing/multi_segment_packer.py
+++ b/keras_hub/src/layers/preprocessing/multi_segment_packer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,12 +12,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -27,7 +27,7 @@
     tf_text = None
 
 
-@keras_nlp_export("keras_nlp.layers.MultiSegmentPacker")
+@keras_hub_export("keras_hub.layers.MultiSegmentPacker")
 class MultiSegmentPacker(PreprocessingLayer):
     """Packs multiple sequences into a single fixed width model input.
 
@@ -90,7 +90,7 @@ class MultiSegmentPacker(PreprocessingLayer):
 
     *Pack a single input for classification.*
     >>> seq1 = [1, 2, 3, 4]
-    >>> packer = keras_nlp.layers.MultiSegmentPacker(
+    >>> packer = keras_hub.layers.MultiSegmentPacker(
     ...     sequence_length=8, start_value=101, end_value=102
     ... )
     >>> token_ids, segment_ids = packer((seq1,))
@@ -102,7 +102,7 @@ class MultiSegmentPacker(PreprocessingLayer):
     *Pack multiple inputs for classification.*
     >>> seq1 = [1, 2, 3, 4]
     >>> seq2 = [11, 12, 13, 14]
-    >>> packer = keras_nlp.layers.MultiSegmentPacker(
+    >>> packer = keras_hub.layers.MultiSegmentPacker(
     ...     sequence_length=8, start_value=101, end_value=102
     ... )
     >>> token_ids, segment_ids = packer((seq1, seq2))
@@ -114,7 +114,7 @@ class MultiSegmentPacker(PreprocessingLayer):
     *Pack multiple inputs for classification with different sep tokens.*
     >>> seq1 = [1, 2, 3, 4]
     >>> seq2 = [11, 12, 13, 14]
-    >>> packer = keras_nlp.layers.MultiSegmentPacker(
+    >>> packer = keras_hub.layers.MultiSegmentPacker(
     ...     sequence_length=8,
     ...     start_value=101,
     ...     end_value=102,
diff --git a/keras_nlp/src/layers/preprocessing/multi_segment_packer_test.py b/keras_hub/src/layers/preprocessing/multi_segment_packer_test.py
similarity index 97%
rename from keras_nlp/src/layers/preprocessing/multi_segment_packer_test.py
rename to keras_hub/src/layers/preprocessing/multi_segment_packer_test.py
index 2c34dcd0bd..78d6ccaa00 100644
--- a/keras_nlp/src/layers/preprocessing/multi_segment_packer_test.py
+++ b/keras_hub/src/layers/preprocessing/multi_segment_packer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,10 +14,10 @@
 
 import numpy as np
 
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MultiSegmentPackerTest(TestCase):
diff --git a/keras_nlp/src/layers/preprocessing/preprocessing_layer.py b/keras_hub/src/layers/preprocessing/preprocessing_layer.py
similarity index 91%
rename from keras_nlp/src/layers/preprocessing/preprocessing_layer.py
rename to keras_hub/src/layers/preprocessing/preprocessing_layer.py
index 590a7760ac..924c89c23b 100644
--- a/keras_nlp/src/layers/preprocessing/preprocessing_layer.py
+++ b/keras_hub/src/layers/preprocessing/preprocessing_layer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,7 +14,7 @@
 
 import keras
 
-from keras_nlp.src.utils.tensor_utils import assert_tf_libs_installed
+from keras_hub.src.utils.tensor_utils import assert_tf_libs_installed
 
 
 class PreprocessingLayer(keras.layers.Layer):
diff --git a/keras_nlp/src/layers/preprocessing/random_deletion.py b/keras_hub/src/layers/preprocessing/random_deletion.py
similarity index 93%
rename from keras_nlp/src/layers/preprocessing/random_deletion.py
rename to keras_hub/src/layers/preprocessing/random_deletion.py
index 6df8b6ef28..62225f90a9 100644
--- a/keras_nlp/src/layers/preprocessing/random_deletion.py
+++ b/keras_hub/src/layers/preprocessing/random_deletion.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,14 +14,14 @@
 
 import random
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_int_dtype
-from keras_nlp.src.utils.tensor_utils import is_string_dtype
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_int_dtype
+from keras_hub.src.utils.tensor_utils import is_string_dtype
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -29,7 +29,7 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.layers.RandomDeletion")
+@keras_hub_export("keras_hub.layers.RandomDeletion")
 class RandomDeletion(PreprocessingLayer):
     """Augments input by randomly deleting tokens.
 
@@ -68,7 +68,7 @@ class RandomDeletion(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey I like", "Keras and Tensorflow"]
     >>> x = list(map(lambda x: x.split(), x))
-    >>> augmenter = keras_nlp.layers.RandomDeletion(rate=0.4, seed=42)
+    >>> augmenter = keras_hub.layers.RandomDeletion(rate=0.4, seed=42)
     >>> y = augmenter(x)
     >>> list(map(lambda y: " ".join(y), y))
     ['I like', 'and']
@@ -77,7 +77,7 @@ class RandomDeletion(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey Dude", "Speed Up"]
     >>> x = list(map(lambda x: list(x), x))
-    >>> augmenter = keras_nlp.layers.RandomDeletion(rate=0.4, seed=42)
+    >>> augmenter = keras_hub.layers.RandomDeletion(rate=0.4, seed=42)
     >>> y = augmenter(x)
     >>> list(map(lambda y: "".join(y), y))
     ['H Dude', 'pedUp']
@@ -86,7 +86,7 @@ class RandomDeletion(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey I like", "Keras and Tensorflow"]
     >>> x = list(map(lambda x: x.split(), x))
-    >>> augmenter = keras_nlp.layers.RandomDeletion(rate=0.4,
+    >>> augmenter = keras_hub.layers.RandomDeletion(rate=0.4,
     ...     skip_list=["Keras", "Tensorflow"], seed=42)
     >>> y = augmenter(x)
     >>> list(map(lambda y: " ".join(y), y))
@@ -98,7 +98,7 @@ class RandomDeletion(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey I like", "Keras and Tensorflow"]
     >>> x = list(map(lambda x: x.split(), x))
-    >>> augmenter = keras_nlp.layers.RandomDeletion(rate=0.4,
+    >>> augmenter = keras_hub.layers.RandomDeletion(rate=0.4,
     ...     skip_fn=skip_fn, seed=42)
     >>> y = augmenter(x)
     >>> list(map(lambda y: " ".join(y), y))
diff --git a/keras_nlp/src/layers/preprocessing/random_deletion_test.py b/keras_hub/src/layers/preprocessing/random_deletion_test.py
similarity index 97%
rename from keras_nlp/src/layers/preprocessing/random_deletion_test.py
rename to keras_hub/src/layers/preprocessing/random_deletion_test.py
index ed3df6afdd..109387c37a 100644
--- a/keras_nlp/src/layers/preprocessing/random_deletion_test.py
+++ b/keras_hub/src/layers/preprocessing/random_deletion_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import keras
 import tensorflow as tf
 
-from keras_nlp.src.layers.preprocessing.random_deletion import RandomDeletion
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.preprocessing.random_deletion import RandomDeletion
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RandomDeletionTest(TestCase):
diff --git a/keras_nlp/src/layers/preprocessing/random_swap.py b/keras_hub/src/layers/preprocessing/random_swap.py
similarity index 92%
rename from keras_nlp/src/layers/preprocessing/random_swap.py
rename to keras_hub/src/layers/preprocessing/random_swap.py
index 32f7a08d26..31d1d7bd65 100644
--- a/keras_nlp/src/layers/preprocessing/random_swap.py
+++ b/keras_hub/src/layers/preprocessing/random_swap.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,14 +14,14 @@
 
 import random
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_int_dtype
-from keras_nlp.src.utils.tensor_utils import is_string_dtype
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_int_dtype
+from keras_hub.src.utils.tensor_utils import is_string_dtype
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -29,7 +29,7 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.layers.RandomSwap")
+@keras_hub_export("keras_hub.layers.RandomSwap")
 class RandomSwap(PreprocessingLayer):
     """Augments input by randomly swapping words.
 
@@ -70,7 +70,7 @@ class RandomSwap(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey I like", "Keras and Tensorflow"]
     >>> x = list(map(lambda x: x.split(), x))
-    >>> augmenter = keras_nlp.layers.RandomSwap(rate=0.4, seed=42)
+    >>> augmenter = keras_hub.layers.RandomSwap(rate=0.4, seed=42)
     >>> y = augmenter(x)
     >>> list(map(lambda y: " ".join(y), y))
     ['like I Hey', 'and Keras Tensorflow']
@@ -79,7 +79,7 @@ class RandomSwap(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey Dude", "Speed Up"]
     >>> x = list(map(lambda x: list(x), x))
-    >>> augmenter = keras_nlp.layers.RandomSwap(rate=0.4, seed=42)
+    >>> augmenter = keras_hub.layers.RandomSwap(rate=0.4, seed=42)
     >>> y = augmenter(x)
     >>> list(map(lambda y: "".join(y), y))
     ['deD yuHe', 'SUede pp']
@@ -88,7 +88,7 @@ class RandomSwap(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey I like", "Keras and Tensorflow"]
     >>> x = list(map(lambda x: x.split(), x))
-    >>> augmenter = keras_nlp.layers.RandomSwap(rate=0.4,
+    >>> augmenter = keras_hub.layers.RandomSwap(rate=0.4,
     ...     skip_list=["Keras"], seed=42)
     >>> y = augmenter(x)
     >>> list(map(lambda y: " ".join(y), y))
@@ -100,7 +100,7 @@ class RandomSwap(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["Hey I like", "Keras and Tensorflow"]
     >>> x = list(map(lambda x: x.split(), x))
-    >>> augmenter = keras_nlp.layers.RandomSwap(rate=0.9, max_swaps=3,
+    >>> augmenter = keras_hub.layers.RandomSwap(rate=0.9, max_swaps=3,
     ...     skip_fn=skip_fn, seed=11)
     >>> y = augmenter(x)
     >>> list(map(lambda y: " ".join(y), y))
@@ -112,7 +112,7 @@ class RandomSwap(PreprocessingLayer):
     >>> keras.utils.set_random_seed(1337)
     >>> x = ["He was drifting along", "With the wind"]
     >>> x = list(map(lambda x: x.split(), x))
-    >>> augmenter = keras_nlp.layers.RandomSwap(rate=0.8, max_swaps=2,
+    >>> augmenter = keras_hub.layers.RandomSwap(rate=0.8, max_swaps=2,
     ...     skip_py_fn=skip_py_fn, seed=15)
     >>> y = augmenter(x)
     >>> list(map(lambda y: " ".join(y), y))
diff --git a/keras_nlp/src/layers/preprocessing/random_swap_test.py b/keras_hub/src/layers/preprocessing/random_swap_test.py
similarity index 97%
rename from keras_nlp/src/layers/preprocessing/random_swap_test.py
rename to keras_hub/src/layers/preprocessing/random_swap_test.py
index 92db967143..dfc9cb6f61 100644
--- a/keras_nlp/src/layers/preprocessing/random_swap_test.py
+++ b/keras_hub/src/layers/preprocessing/random_swap_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import keras
 import tensorflow as tf
 
-from keras_nlp.src.layers.preprocessing.random_swap import RandomSwap
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.preprocessing.random_swap import RandomSwap
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RandomSwapTest(TestCase):
diff --git a/keras_nlp/src/layers/preprocessing/resizing_image_converter.py b/keras_hub/src/layers/preprocessing/resizing_image_converter.py
similarity index 90%
rename from keras_nlp/src/layers/preprocessing/resizing_image_converter.py
rename to keras_hub/src/layers/preprocessing/resizing_image_converter.py
index 0e42bb9b39..cfce694b65 100644
--- a/keras_nlp/src/layers/preprocessing/resizing_image_converter.py
+++ b/keras_hub/src/layers/preprocessing/resizing_image_converter.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.image_converter import ImageConverter
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.image_converter import ImageConverter
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.layers.ResizingImageConverter")
+@keras_hub_export("keras_hub.layers.ResizingImageConverter")
 class ResizingImageConverter(ImageConverter):
     """An `ImageConverter` that simply resizes the input image.
 
@@ -52,10 +52,10 @@ class ResizingImageConverter(ImageConverter):
     Examples:
     ```python
     # Resize images for `"pali_gemma_3b_224"`.
-    converter = keras_nlp.layers.ImageConverter.from_preset("pali_gemma_3b_224")
+    converter = keras_hub.layers.ImageConverter.from_preset("pali_gemma_3b_224")
     converter(np.ones(2, 512, 512, 3)) # Output shape: (2, 224, 224, 3)
     # Resize images for `"pali_gemma_3b_224"`.
-    converter = keras_nlp.layers.ImageConverter.from_preset("pali_gemma_3b_448")
+    converter = keras_hub.layers.ImageConverter.from_preset("pali_gemma_3b_448")
     converter(np.ones(2, 512, 512, 3)) # Output shape: (2, 448, 448, 3)
     ```
     """
diff --git a/keras_nlp/src/layers/preprocessing/resizing_image_converter_test.py b/keras_hub/src/layers/preprocessing/resizing_image_converter_test.py
similarity index 91%
rename from keras_nlp/src/layers/preprocessing/resizing_image_converter_test.py
rename to keras_hub/src/layers/preprocessing/resizing_image_converter_test.py
index f96a3a3488..857cf578a8 100644
--- a/keras_nlp/src/layers/preprocessing/resizing_image_converter_test.py
+++ b/keras_hub/src/layers/preprocessing/resizing_image_converter_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import numpy as np
 from keras import ops
 
-from keras_nlp.src.layers.preprocessing.resizing_image_converter import (
+from keras_hub.src.layers.preprocessing.resizing_image_converter import (
     ResizingImageConverter,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ResizingImageConverterTest(TestCase):
diff --git a/keras_nlp/src/layers/preprocessing/start_end_packer.py b/keras_hub/src/layers/preprocessing/start_end_packer.py
similarity index 92%
rename from keras_nlp/src/layers/preprocessing/start_end_packer.py
rename to keras_hub/src/layers/preprocessing/start_end_packer.py
index a6e7b4d068..0f218b5e74 100644
--- a/keras_nlp/src/layers/preprocessing/start_end_packer.py
+++ b/keras_hub/src/layers/preprocessing/start_end_packer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -26,7 +26,7 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.layers.StartEndPacker")
+@keras_hub_export("keras_hub.layers.StartEndPacker")
 class StartEndPacker(PreprocessingLayer):
     """Adds start and end tokens to a sequence and pads to a fixed length.
 
@@ -68,7 +68,7 @@ class StartEndPacker(PreprocessingLayer):
 
     Unbatched input (int).
     >>> inputs = [5, 6, 7]
-    >>> start_end_packer = keras_nlp.layers.StartEndPacker(
+    >>> start_end_packer = keras_hub.layers.StartEndPacker(
     ...     sequence_length=7, start_value=1, end_value=2,
     ... )
     >>> outputs = start_end_packer(inputs)
@@ -77,7 +77,7 @@ class StartEndPacker(PreprocessingLayer):
 
     Batched input (int).
     >>> inputs = [[5, 6, 7], [8, 9, 10, 11, 12, 13, 14]]
-    >>> start_end_packer = keras_nlp.layers.StartEndPacker(
+    >>> start_end_packer = keras_hub.layers.StartEndPacker(
     ...     sequence_length=6, start_value=1, end_value=2,
     ... )
     >>> outputs = start_end_packer(inputs)
@@ -87,7 +87,7 @@ class StartEndPacker(PreprocessingLayer):
 
     Unbatched input (str).
     >>> inputs = tf.constant(["this", "is", "fun"])
-    >>> start_end_packer = keras_nlp.layers.StartEndPacker(
+    >>> start_end_packer = keras_hub.layers.StartEndPacker(
     ...     sequence_length=6, start_value="<s>", end_value="</s>",
     ...     pad_value="<pad>"
     ... )
@@ -97,7 +97,7 @@ class StartEndPacker(PreprocessingLayer):
 
     Batched input (str).
     >>> inputs = tf.ragged.constant([["this", "is", "fun"], ["awesome"]])
-    >>> start_end_packer = keras_nlp.layers.StartEndPacker(
+    >>> start_end_packer = keras_hub.layers.StartEndPacker(
     ...     sequence_length=6, start_value="<s>", end_value="</s>",
     ...     pad_value="<pad>"
     ... )
@@ -108,7 +108,7 @@ class StartEndPacker(PreprocessingLayer):
 
     Multiple start tokens.
     >>> inputs = tf.ragged.constant([["this", "is", "fun"], ["awesome"]])
-    >>> start_end_packer = keras_nlp.layers.StartEndPacker(
+    >>> start_end_packer = keras_hub.layers.StartEndPacker(
     ...     sequence_length=6, start_value=["</s>", "<s>"], end_value="</s>",
     ...     pad_value="<pad>"
     ... )
diff --git a/keras_nlp/src/layers/preprocessing/start_end_packer_test.py b/keras_hub/src/layers/preprocessing/start_end_packer_test.py
similarity index 93%
rename from keras_nlp/src/layers/preprocessing/start_end_packer_test.py
rename to keras_hub/src/layers/preprocessing/start_end_packer_test.py
index 5fb77a930e..bf989bd459 100644
--- a/keras_nlp/src/layers/preprocessing/start_end_packer_test.py
+++ b/keras_hub/src/layers/preprocessing/start_end_packer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.tests.test_case import TestCase
 
 
 class StartEndPackerTest(TestCase):
@@ -88,7 +88,7 @@ def test_end_token_value_during_truncation(self):
         self.assertAllEqual(output, expected_output)
 
     def test_string_input(self):
-        input_data = [["KerasNLP", "is", "awesome"], ["amazing"]]
+        input_data = [["KerasHub", "is", "awesome"], ["amazing"]]
         start_end_packer = StartEndPacker(
             sequence_length=5,
             start_value="[START]",
@@ -97,13 +97,13 @@ def test_string_input(self):
         )
         output = start_end_packer(input_data)
         expected_output = [
-            ["[START]", "KerasNLP", "is", "awesome", "[END]"],
+            ["[START]", "KerasHub", "is", "awesome", "[END]"],
             ["[START]", "amazing", "[END]", "[PAD]", "[PAD]"],
         ]
         self.assertAllEqual(output, expected_output)
 
     def test_string_input_with_multiple_special_values(self):
-        input_data = [["KerasNLP", "is", "awesome"], ["amazing"]]
+        input_data = [["KerasHub", "is", "awesome"], ["amazing"]]
         start_end_packer = StartEndPacker(
             sequence_length=6,
             start_value=["[END]", "[START]"],
@@ -112,7 +112,7 @@ def test_string_input_with_multiple_special_values(self):
         )
         output = start_end_packer(input_data)
         expected_output = [
-            ["[END]", "[START]", "KerasNLP", "is", "awesome", "[END]"],
+            ["[END]", "[START]", "KerasHub", "is", "awesome", "[END]"],
             ["[END]", "[START]", "amazing", "[END]", "[PAD]", "[PAD]"],
         ]
         self.assertAllEqual(output, expected_output)
diff --git a/keras_hub/src/metrics/__init__.py b/keras_hub/src/metrics/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/metrics/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/metrics/bleu.py b/keras_hub/src/metrics/bleu.py
similarity index 98%
rename from keras_nlp/src/metrics/bleu.py
rename to keras_hub/src/metrics/bleu.py
index 973228f6c9..c6829e740e 100644
--- a/keras_nlp/src/metrics/bleu.py
+++ b/keras_hub/src/metrics/bleu.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,9 +18,9 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.tensor_utils import is_float_dtype
-from keras_nlp.src.utils.tensor_utils import tensor_to_list
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.tensor_utils import is_float_dtype
+from keras_hub.src.utils.tensor_utils import tensor_to_list
 
 try:
     import tensorflow as tf
@@ -55,7 +55,7 @@
 ]
 
 
-@keras_nlp_export("keras_nlp.metrics.Bleu")
+@keras_hub_export("keras_hub.metrics.Bleu")
 class Bleu(keras.metrics.Metric):
     """BLEU metric.
 
diff --git a/keras_nlp/src/metrics/bleu_test.py b/keras_hub/src/metrics/bleu_test.py
similarity index 96%
rename from keras_nlp/src/metrics/bleu_test.py
rename to keras_hub/src/metrics/bleu_test.py
index f8526ab3b9..a33ad51002 100644
--- a/keras_nlp/src/metrics/bleu_test.py
+++ b/keras_hub/src/metrics/bleu_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,9 +16,9 @@
 import pytest
 import tensorflow as tf
 
-from keras_nlp.src.metrics.bleu import Bleu
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.byte_tokenizer import ByteTokenizer
+from keras_hub.src.metrics.bleu import Bleu
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.byte_tokenizer import ByteTokenizer
 
 
 class BleuTest(TestCase):
diff --git a/keras_nlp/src/metrics/edit_distance.py b/keras_hub/src/metrics/edit_distance.py
similarity index 95%
rename from keras_nlp/src/metrics/edit_distance.py
rename to keras_hub/src/metrics/edit_distance.py
index 91698af094..edda7ad40d 100644
--- a/keras_nlp/src/metrics/edit_distance.py
+++ b/keras_hub/src/metrics/edit_distance.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.tensor_utils import is_float_dtype
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.tensor_utils import is_float_dtype
 
 try:
     import tensorflow as tf
@@ -23,7 +23,7 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.metrics.EditDistance")
+@keras_hub_export("keras_hub.metrics.EditDistance")
 class EditDistance(keras.metrics.Metric):
     """Edit Distance metric.
 
@@ -62,14 +62,14 @@ class EditDistance(keras.metrics.Metric):
     Various Input Types.
 
     Single-level Python list.
-    >>> edit_distance = keras_nlp.metrics.EditDistance()
+    >>> edit_distance = keras_hub.metrics.EditDistance()
     >>> y_true = "the tiny little cat was found under the big funny bed".split()
     >>> y_pred = "the cat was found under the bed".split()
     >>> edit_distance(y_true, y_pred)
     <tf.Tensor: shape=(), dtype=float32, numpy=0.36363637>
 
     Nested Python list.
-    >>> edit_distance = keras_nlp.metrics.EditDistance()
+    >>> edit_distance = keras_hub.metrics.EditDistance()
     >>> y_true = [
     ...     "the tiny little cat was found under the big funny bed".split(),
     ...     "it is sunny today".split(),
diff --git a/keras_nlp/src/metrics/edit_distance_test.py b/keras_hub/src/metrics/edit_distance_test.py
similarity index 98%
rename from keras_nlp/src/metrics/edit_distance_test.py
rename to keras_hub/src/metrics/edit_distance_test.py
index 8bd8c6fcc3..43fd1e93a8 100644
--- a/keras_nlp/src/metrics/edit_distance_test.py
+++ b/keras_hub/src/metrics/edit_distance_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 import pytest
 import tensorflow as tf
 
-from keras_nlp.src.metrics.edit_distance import EditDistance
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.metrics.edit_distance import EditDistance
+from keras_hub.src.tests.test_case import TestCase
 
 
 class EditDistanceTest(TestCase):
diff --git a/keras_nlp/src/metrics/perplexity.py b/keras_hub/src/metrics/perplexity.py
similarity index 93%
rename from keras_nlp/src/metrics/perplexity.py
rename to keras_hub/src/metrics/perplexity.py
index cf567d0a2d..db2431c89a 100644
--- a/keras_nlp/src/metrics/perplexity.py
+++ b/keras_hub/src/metrics/perplexity.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,11 +15,11 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.tensor_utils import is_float_dtype
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.tensor_utils import is_float_dtype
 
 
-@keras_nlp_export("keras_nlp.metrics.Perplexity")
+@keras_hub_export("keras_hub.metrics.Perplexity")
 class Perplexity(keras.metrics.Metric):
     """Perplexity metric.
 
@@ -46,7 +46,7 @@ class Perplexity(keras.metrics.Metric):
     1. Calculate perplexity by calling update_state() and result().
     1.1. `sample_weight`, and `mask_token_id` are not provided.
     >>> np.random.seed(42)
-    >>> perplexity = keras_nlp.metrics.Perplexity(name="perplexity")
+    >>> perplexity = keras_hub.metrics.Perplexity(name="perplexity")
     >>> target = np.random.randint(10, size=[2, 5])
     >>> logits = np.random.uniform(size=(2, 5, 10))
     >>> perplexity.update_state(target, logits)
@@ -55,7 +55,7 @@ class Perplexity(keras.metrics.Metric):
 
     1.2. `sample_weight` specified (masking token with ID 0).
     >>> np.random.seed(42)
-    >>> perplexity = keras_nlp.metrics.Perplexity(name="perplexity")
+    >>> perplexity = keras_hub.metrics.Perplexity(name="perplexity")
     >>> target = np.random.randint(10, size=[2, 5])
     >>> logits = np.random.uniform(size=(2, 5, 10))
     >>> sample_weight = (target != 0).astype("float32")
@@ -65,7 +65,7 @@ class Perplexity(keras.metrics.Metric):
 
     2. Call perplexity directly.
     >>> np.random.seed(42)
-    >>> perplexity = keras_nlp.metrics.Perplexity(name="perplexity")
+    >>> perplexity = keras_hub.metrics.Perplexity(name="perplexity")
     >>> target = np.random.randint(10, size=[2, 5])
     >>> logits = np.random.uniform(size=(2, 5, 10))
     >>> perplexity(target, logits)
@@ -74,7 +74,7 @@ class Perplexity(keras.metrics.Metric):
     3. Provide the padding token ID and let the class compute the mask on its
        own.
     >>> np.random.seed(42)
-    >>> perplexity = keras_nlp.metrics.Perplexity(mask_token_id=0)
+    >>> perplexity = keras_hub.metrics.Perplexity(mask_token_id=0)
     >>> target = np.random.randint(10, size=[2, 5])
     >>> logits = np.random.uniform(size=(2, 5, 10))
     >>> perplexity(target, logits)
diff --git a/keras_nlp/src/metrics/perplexity_test.py b/keras_hub/src/metrics/perplexity_test.py
similarity index 98%
rename from keras_nlp/src/metrics/perplexity_test.py
rename to keras_hub/src/metrics/perplexity_test.py
index 32c993cf4a..6b6589aafa 100644
--- a/keras_nlp/src/metrics/perplexity_test.py
+++ b/keras_hub/src/metrics/perplexity_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 from keras import ops
 
-from keras_nlp.src.metrics.perplexity import Perplexity
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.metrics.perplexity import Perplexity
+from keras_hub.src.tests.test_case import TestCase
 
 
 class PerplexityTest(TestCase):
diff --git a/keras_nlp/src/metrics/rouge_base.py b/keras_hub/src/metrics/rouge_base.py
similarity index 97%
rename from keras_nlp/src/metrics/rouge_base.py
rename to keras_hub/src/metrics/rouge_base.py
index 4c84454590..0869d9e960 100644
--- a/keras_nlp/src/metrics/rouge_base.py
+++ b/keras_hub/src/metrics/rouge_base.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.utils.tensor_utils import is_float_dtype
-from keras_nlp.src.utils.tensor_utils import tensor_to_list
+from keras_hub.src.utils.tensor_utils import is_float_dtype
+from keras_hub.src.utils.tensor_utils import tensor_to_list
 
 try:
     import tensorflow as tf
diff --git a/keras_nlp/src/metrics/rouge_l.py b/keras_hub/src/metrics/rouge_l.py
similarity index 83%
rename from keras_nlp/src/metrics/rouge_l.py
rename to keras_hub/src/metrics/rouge_l.py
index 94aa342c0c..619d48452d 100644
--- a/keras_nlp/src/metrics/rouge_l.py
+++ b/keras_hub/src/metrics/rouge_l.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,11 +12,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.metrics.rouge_base import RougeBase
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.metrics.rouge_base import RougeBase
 
 
-@keras_nlp_export("keras_nlp.metrics.RougeL")
+@keras_hub_export("keras_hub.metrics.RougeL")
 class RougeL(RougeBase):
     """ROUGE-L metric.
 
@@ -43,7 +43,7 @@ class RougeL(RougeBase):
     Examples:
 
     1. Python string.
-    >>> rouge_l = keras_nlp.metrics.RougeL()
+    >>> rouge_l = keras_hub.metrics.RougeL()
     >>> y_true = "the tiny little cat was found under the big funny bed"
     >>> y_pred = "the cat was under the bed"
     >>> rouge_l(y_true, y_pred)["f1_score"]
@@ -51,28 +51,28 @@ class RougeL(RougeBase):
 
     2. List inputs.
     a. Python list.
-    >>> rouge_l = keras_nlp.metrics.RougeL()
+    >>> rouge_l = keras_hub.metrics.RougeL()
     >>> y_true = [
     ...     "the tiny little cat was found under the big funny bed",
-    ...     "i really love contributing to KerasNLP",
+    ...     "i really love contributing to KerasHub",
     ... ]
     >>> y_pred = [
     ...     "the cat was under the bed",
-    ...     "i love contributing to KerasNLP",
+    ...     "i love contributing to KerasHub",
     ... ]
     >>> rouge_l(y_true, y_pred)["f1_score"]
     <tf.Tensor: shape=(), dtype=float32, numpy=0.80748665>
 
 
     3. 2D inputs.
-    >>> rouge_l = keras_nlp.metrics.RougeL()
+    >>> rouge_l = keras_hub.metrics.RougeL()
     >>> y_true = [
     ...     ["the tiny little cat was found under the big funny bed"],
-    ...     ["i really love contributing to KerasNLP"],
+    ...     ["i really love contributing to KerasHub"],
     ... ]
     >>> y_pred = [
     ...     ["the cat was under the bed"],
-    ...     ["i love contributing to KerasNLP"],
+    ...     ["i love contributing to KerasHub"],
     ... ]
     >>> rouge_l(y_true, y_pred)["f1_score"]
     <tf.Tensor: shape=(), dtype=float32, numpy=0.80748665>
diff --git a/keras_nlp/src/metrics/rouge_l_test.py b/keras_hub/src/metrics/rouge_l_test.py
similarity index 88%
rename from keras_nlp/src/metrics/rouge_l_test.py
rename to keras_hub/src/metrics/rouge_l_test.py
index 98cdf9e01a..7d6c8cd325 100644
--- a/keras_nlp/src/metrics/rouge_l_test.py
+++ b/keras_hub/src/metrics/rouge_l_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.metrics.rouge_l import RougeL
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.metrics.rouge_l import RougeL
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RougeLTest(TestCase):
@@ -43,11 +43,11 @@ def test_string_list_input(self):
         rouge = RougeL(use_stemmer=False)
         y_true = [
             "the tiny little cat was found under the big funny bed",
-            "i really love contributing to KerasNLP",
+            "i really love contributing to KerasHub",
         ]
         y_pred = [
             "the cat was under the bed",
-            "i love contributing to KerasNLP",
+            "i love contributing to KerasHub",
         ]
 
         rouge_val = rouge(y_true, y_pred)
@@ -61,11 +61,11 @@ def test_tensor_input(self):
         y_true = tf.constant(
             [
                 "the tiny little cat was found under the big funny bed",
-                "i really love contributing to KerasNLP",
+                "i really love contributing to KerasHub",
             ]
         )
         y_pred = tf.constant(
-            ["the cat was under the bed", "i love contributing to KerasNLP"]
+            ["the cat was under the bed", "i love contributing to KerasHub"]
         )
 
         rouge_val = rouge(y_true, y_pred)
@@ -76,10 +76,10 @@ def test_tensor_input(self):
 
     def test_reset_state(self):
         rouge = RougeL()
-        y_true = ["hey, this is great fun", "i love contributing to KerasNLP"]
+        y_true = ["hey, this is great fun", "i love contributing to KerasHub"]
         y_pred = [
             "great fun indeed",
-            "KerasNLP is awesome, i love contributing to it",
+            "KerasHub is awesome, i love contributing to it",
         ]
 
         rouge.update_state(y_true, y_pred)
@@ -100,11 +100,11 @@ def test_update_state(self):
         rouge = RougeL()
         y_true_1 = [
             "the tiny little cat was found under the big funny bed",
-            "i really love contributing to KerasNLP",
+            "i really love contributing to KerasHub",
         ]
         y_pred_1 = [
             "the cat was under the bed",
-            "i love contributing to KerasNLP",
+            "i love contributing to KerasHub",
         ]
 
         rouge.update_state(y_true_1, y_pred_1)
diff --git a/keras_nlp/src/metrics/rouge_n.py b/keras_hub/src/metrics/rouge_n.py
similarity index 83%
rename from keras_nlp/src/metrics/rouge_n.py
rename to keras_hub/src/metrics/rouge_n.py
index 61b43866bd..e4f2242034 100644
--- a/keras_nlp/src/metrics/rouge_n.py
+++ b/keras_hub/src/metrics/rouge_n.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,11 +12,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.metrics.rouge_base import RougeBase
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.metrics.rouge_base import RougeBase
 
 
-@keras_nlp_export("keras_nlp.metrics.RougeN")
+@keras_hub_export("keras_hub.metrics.RougeN")
 class RougeN(RougeBase):
     """ROUGE-N metric.
 
@@ -45,47 +45,47 @@ class RougeN(RougeBase):
     Examples:
 
     1. Python string.
-    >>> rouge_n = keras_nlp.metrics.RougeN(order=2)
+    >>> rouge_n = keras_hub.metrics.RougeN(order=2)
     >>> y_true = "the tiny little cat was found under the big funny bed"
     >>> y_pred = "the cat was under the bed"
     >>> rouge_n(y_true, y_pred)["f1_score"]
     <tf.Tensor: shape=(), dtype=float32, numpy=0.26666668>
 
     2. List inputs.
-    >>> rouge_n = keras_nlp.metrics.RougeN(order=2)
+    >>> rouge_n = keras_hub.metrics.RougeN(order=2)
     >>> y_true = [
     ...     "the tiny little cat was found under the big funny bed",
-    ...     "i really love contributing to KerasNLP",
+    ...     "i really love contributing to KerasHub",
     ... ]
     >>> y_pred = [
     ...     "the cat was under the bed",
-    ...     "i love contributing to KerasNLP",
+    ...     "i love contributing to KerasHub",
     ... ]
     >>> rouge_n(y_true, y_pred)["f1_score"]
     <tf.Tensor: shape=(), dtype=float32, numpy=0.4666667>
 
     3. 2D inputs.
-    >>> rouge_n = keras_nlp.metrics.RougeN(order=2)
+    >>> rouge_n = keras_hub.metrics.RougeN(order=2)
     >>> y_true =[
     ...     ["the tiny little cat was found under the big funny bed"],
-    ...     ["i really love contributing to KerasNLP"],
+    ...     ["i really love contributing to KerasHub"],
     ... ]
     >>> y_pred =[
     ...     ["the cat was under the bed"],
-    ...     ["i love contributing to KerasNLP"],
+    ...     ["i love contributing to KerasHub"],
     ... ]
     >>> rouge_n(y_true, y_pred)["f1_score"]
     <tf.Tensor: shape=(), dtype=float32, numpy=0.4666667>
 
     4. Trigrams.
-    >>> rouge_n = keras_nlp.metrics.RougeN(order=3)
+    >>> rouge_n = keras_hub.metrics.RougeN(order=3)
     >>> y_true = [
     ...     "the tiny little cat was found under the big funny bed",
-    ...     "i really love contributing to KerasNLP",
+    ...     "i really love contributing to KerasHub",
     ... ]
     >>> y_pred = [
     ...     "the cat was under the bed",
-    ...     "i love contributing to KerasNLP",
+    ...     "i love contributing to KerasHub",
     ... ]
     >>> rouge_n(y_true, y_pred)["f1_score"]
     <tf.Tensor: shape=(), dtype=float32, numpy=0.2857143>
diff --git a/keras_nlp/src/metrics/rouge_n_test.py b/keras_hub/src/metrics/rouge_n_test.py
similarity index 89%
rename from keras_nlp/src/metrics/rouge_n_test.py
rename to keras_hub/src/metrics/rouge_n_test.py
index 5c48ae4a5d..92a8d8bbc9 100644
--- a/keras_nlp/src/metrics/rouge_n_test.py
+++ b/keras_hub/src/metrics/rouge_n_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 import pytest
 import tensorflow as tf
 
-from keras_nlp.src.metrics.rouge_n import RougeN
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.metrics.rouge_n import RougeN
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RougeNTest(TestCase):
@@ -45,11 +45,11 @@ def test_string_list_input(self):
         rouge = RougeN(order=2, use_stemmer=False)
         y_true = [
             "the tiny little cat was found under the big funny bed",
-            "i really love contributing to KerasNLP",
+            "i really love contributing to KerasHub",
         ]
         y_pred = [
             "the cat was under the bed",
-            "i love contributing to KerasNLP",
+            "i love contributing to KerasHub",
         ]
 
         rouge_val = rouge(y_true, y_pred)
@@ -63,11 +63,11 @@ def test_tensor_input(self):
         y_true = tf.constant(
             [
                 "the tiny little cat was found under the big funny bed",
-                "i really love contributing to KerasNLP",
+                "i really love contributing to KerasHub",
             ]
         )
         y_pred = tf.constant(
-            ["the cat was under the bed", "i love contributing to KerasNLP"]
+            ["the cat was under the bed", "i love contributing to KerasHub"]
         )
 
         rouge_val = rouge(y_true, y_pred)
@@ -100,11 +100,11 @@ def test_different_order(self):
         rouge = RougeN(order=3, use_stemmer=False)
         y_true = [
             "the tiny little cat was found under the big funny bed",
-            "i really love contributing to KerasNLP",
+            "i really love contributing to KerasHub",
         ]
         y_pred = [
             "the cat was under the bed",
-            "i love contributing to KerasNLP",
+            "i love contributing to KerasHub",
         ]
 
         rouge_val = rouge(y_true, y_pred)
@@ -115,10 +115,10 @@ def test_different_order(self):
 
     def test_reset_state(self):
         rouge = RougeN()
-        y_true = ["hey, this is great fun", "i love contributing to KerasNLP"]
+        y_true = ["hey, this is great fun", "i love contributing to KerasHub"]
         y_pred = [
             "great fun indeed",
-            "KerasNLP is awesome, i love contributing to it",
+            "KerasHub is awesome, i love contributing to it",
         ]
 
         rouge.update_state(y_true, y_pred)
@@ -139,11 +139,11 @@ def test_update_state(self):
         rouge = RougeN()
         y_true_1 = [
             "the tiny little cat was found under the big funny bed",
-            "i really love contributing to KerasNLP",
+            "i really love contributing to KerasHub",
         ]
         y_pred_1 = [
             "the cat was under the bed",
-            "i love contributing to KerasNLP",
+            "i love contributing to KerasHub",
         ]
 
         rouge.update_state(y_true_1, y_pred_1)
diff --git a/keras_hub/src/models/__init__.py b/keras_hub/src/models/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/albert/__init__.py b/keras_hub/src/models/albert/__init__.py
similarity index 72%
rename from keras_nlp/src/models/albert/__init__.py
rename to keras_hub/src/models/albert/__init__.py
index 4b788209b9..5f507ab1c2 100644
--- a/keras_nlp/src/models/albert/__init__.py
+++ b/keras_hub/src/models/albert/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, AlbertBackbone)
diff --git a/keras_nlp/src/models/albert/albert_backbone.py b/keras_hub/src/models/albert/albert_backbone.py
similarity index 95%
rename from keras_nlp/src/models/albert/albert_backbone.py
rename to keras_hub/src/models/albert/albert_backbone.py
index 16734364dc..04730e1a2b 100644
--- a/keras_nlp/src/models/albert/albert_backbone.py
+++ b/keras_hub/src/models/albert/albert_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,21 +14,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
 def albert_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.AlbertBackbone")
+@keras_hub_export("keras_hub.models.AlbertBackbone")
 class AlbertBackbone(Backbone):
     """ALBERT encoder network.
 
@@ -85,7 +85,7 @@ class AlbertBackbone(Backbone):
     }
 
     # Randomly initialized ALBERT encoder
-    model = keras_nlp.models.AlbertBackbone(
+    model = keras_hub.models.AlbertBackbone(
         vocabulary_size=30000,
         num_layers=12,
         num_heads=12,
diff --git a/keras_nlp/src/models/albert/albert_backbone_test.py b/keras_hub/src/models/albert/albert_backbone_test.py
similarity index 95%
rename from keras_nlp/src/models/albert/albert_backbone_test.py
rename to keras_hub/src/models/albert/albert_backbone_test.py
index 7d21ecdfb6..5a120e1b40 100644
--- a/keras_nlp/src/models/albert/albert_backbone_test.py
+++ b/keras_hub/src/models/albert/albert_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AlbertBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/albert/albert_masked_lm.py b/keras_hub/src/models/albert/albert_masked_lm.py
similarity index 83%
rename from keras_nlp/src/models/albert/albert_masked_lm.py
rename to keras_hub/src/models/albert/albert_masked_lm.py
index b861504d5d..097a3b827d 100644
--- a/keras_nlp/src/models/albert/albert_masked_lm.py
+++ b/keras_hub/src/models/albert/albert_masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,20 +14,20 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_backbone import (
     albert_kernel_initializer,
 )
-from keras_nlp.src.models.albert.albert_masked_lm_preprocessor import (
+from keras_hub.src.models.albert.albert_masked_lm_preprocessor import (
     AlbertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.masked_lm import MaskedLM
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.models.masked_lm import MaskedLM
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
-@keras_nlp_export("keras_nlp.models.AlbertMaskedLM")
+@keras_hub_export("keras_hub.models.AlbertMaskedLM")
 class AlbertMaskedLM(MaskedLM):
     """An end-to-end ALBERT model for the masked language modeling task.
 
@@ -46,8 +46,8 @@ class AlbertMaskedLM(MaskedLM):
     warranties or conditions of any kind.
 
     Args:
-        backbone: A `keras_nlp.models.AlbertBackbone` instance.
-        preprocessor: A `keras_nlp.models.AlbertMaskedLMPreprocessor` or
+        backbone: A `keras_hub.models.AlbertBackbone` instance.
+        preprocessor: A `keras_hub.models.AlbertMaskedLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -58,7 +58,7 @@ class AlbertMaskedLM(MaskedLM):
     features = ["The quick brown fox jumped.", "I forgot my homework."]
 
     # Pretrained language model.
-    masked_lm = keras_nlp.models.AlbertMaskedLM.from_preset(
+    masked_lm = keras_hub.models.AlbertMaskedLM.from_preset(
         "albert_base_en_uncased",
     )
     masked_lm.fit(x=features, batch_size=2)
@@ -87,7 +87,7 @@ class AlbertMaskedLM(MaskedLM):
     # Labels are the original masked values.
     labels = [[3, 5]] * 2
 
-    masked_lm = keras_nlp.models.AlbertMaskedLM.from_preset(
+    masked_lm = keras_hub.models.AlbertMaskedLM.from_preset(
         "albert_base_en_uncased",
         preprocessor=None,
     )
diff --git a/keras_nlp/src/models/albert/albert_masked_lm_preprocessor.py b/keras_hub/src/models/albert/albert_masked_lm_preprocessor.py
similarity index 87%
rename from keras_nlp/src/models/albert/albert_masked_lm_preprocessor.py
rename to keras_hub/src/models/albert/albert_masked_lm_preprocessor.py
index ce2bba2fb5..d4f02d4f67 100644
--- a/keras_nlp/src/models/albert/albert_masked_lm_preprocessor.py
+++ b/keras_hub/src/models/albert/albert_masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,19 +12,19 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
 
 
-@keras_nlp_export("keras_nlp.models.AlbertMaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.AlbertMaskedLMPreprocessor")
 class AlbertMaskedLMPreprocessor(MaskedLMPreprocessor):
     """ALBERT preprocessing for the masked language modeling task.
 
     This preprocessing layer will prepare inputs for a masked language modeling
     task. It is primarily intended for use with the
-    `keras_nlp.models.AlbertMaskedLM` task model. Preprocessing will occur in
+    `keras_hub.models.AlbertMaskedLM` task model. Preprocessing will occur in
     multiple steps.
 
     - Tokenize any number of input segments using the `tokenizer`.
@@ -35,10 +35,10 @@ class AlbertMaskedLMPreprocessor(MaskedLMPreprocessor):
     - Randomly select non-special tokens to mask, controlled by
       `mask_selection_rate`.
     - Construct a `(x, y, sample_weight)` tuple suitable for training with a
-      `keras_nlp.models.AlbertMaskedLM` task model.
+      `keras_hub.models.AlbertMaskedLM` task model.
 
     Args:
-        tokenizer: A `keras_nlp.models.AlbertTokenizer` instance.
+        tokenizer: A `keras_hub.models.AlbertTokenizer` instance.
         sequence_length: The length of the packed inputs.
         mask_selection_rate: The probability an input token will be dynamically
             masked.
@@ -69,7 +69,7 @@ class AlbertMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.AlbertMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.AlbertMaskedLMPreprocessor.from_preset(
         "albert_base_en_uncased"
     )
 
@@ -88,7 +88,7 @@ class AlbertMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.AlbertMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.AlbertMaskedLMPreprocessor.from_preset(
         "albert_base_en_uncased"
     )
 
diff --git a/keras_nlp/src/models/albert/albert_masked_lm_preprocessor_test.py b/keras_hub/src/models/albert/albert_masked_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/albert/albert_masked_lm_preprocessor_test.py
rename to keras_hub/src/models/albert/albert_masked_lm_preprocessor_test.py
index 343c53d039..e870fa4a63 100644
--- a/keras_nlp/src/models/albert/albert_masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/albert/albert_masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.albert.albert_masked_lm_preprocessor import (
+from keras_hub.src.models.albert.albert_masked_lm_preprocessor import (
     AlbertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AlbertMaskedLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/albert/albert_masked_lm_test.py b/keras_hub/src/models/albert/albert_masked_lm_test.py
similarity index 88%
rename from keras_nlp/src/models/albert/albert_masked_lm_test.py
rename to keras_hub/src/models/albert/albert_masked_lm_test.py
index 977f8f1fec..d95ad6fa1b 100644
--- a/keras_nlp/src/models/albert/albert_masked_lm_test.py
+++ b/keras_hub/src/models/albert/albert_masked_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 
 import pytest
 
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_masked_lm import AlbertMaskedLM
-from keras_nlp.src.models.albert.albert_masked_lm_preprocessor import (
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_masked_lm import AlbertMaskedLM
+from keras_hub.src.models.albert.albert_masked_lm_preprocessor import (
     AlbertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AlbertMaskedLMTest(TestCase):
diff --git a/keras_nlp/src/models/albert/albert_presets.py b/keras_hub/src/models/albert/albert_presets.py
similarity index 98%
rename from keras_nlp/src/models/albert/albert_presets.py
rename to keras_hub/src/models/albert/albert_presets.py
index db2507669a..b98d7373f9 100644
--- a/keras_nlp/src/models/albert/albert_presets.py
+++ b/keras_hub/src/models/albert/albert_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/albert/albert_text_classifier.py b/keras_hub/src/models/albert/albert_text_classifier.py
similarity index 86%
rename from keras_nlp/src/models/albert/albert_text_classifier.py
rename to keras_hub/src/models/albert/albert_text_classifier.py
index 3ffd193157..46e583c468 100644
--- a/keras_nlp/src/models/albert/albert_text_classifier.py
+++ b/keras_hub/src/models/albert/albert_text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,27 +14,27 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_backbone import (
     albert_kernel_initializer,
 )
-from keras_nlp.src.models.albert.albert_text_classifier_preprocessor import (
+from keras_hub.src.models.albert.albert_text_classifier_preprocessor import (
     AlbertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.AlbertTextClassifier",
-        "keras_nlp.models.AlbertClassifier",
+        "keras_hub.models.AlbertTextClassifier",
+        "keras_hub.models.AlbertClassifier",
     ]
 )
 class AlbertTextClassifier(TextClassifier):
     """An end-to-end ALBERT model for classification tasks
 
-    This model attaches a classification head to a `keras_nlp.model.AlbertBackbone`
+    This model attaches a classification head to a `keras_hub.models.AlbertBackbone`
     backbone, mapping from the backbone outputs to logit output suitable for
     a classification task. For usage of this model with pre-trained weights, see
     the `from_preset()` method.
@@ -48,9 +48,9 @@ class AlbertTextClassifier(TextClassifier):
     warranties or conditions of any kind.
 
     Args:
-        backbone: A `keras_nlp.models.AlertBackbone` instance.
+        backbone: A `keras_hub.models.AlbertBackbone` instance.
         num_classes: int. Number of classes to predict.
-        preprocessor: A `keras_nlp.models.AlbertTextClassifierPreprocessor` or `None`. If
+        preprocessor: A `keras_hub.models.AlbertTextClassifierPreprocessor` or `None`. If
             `None`, this model will not apply preprocessing, and inputs should
             be preprocessed before calling the model.
         activation: Optional `str` or callable. The
@@ -68,7 +68,7 @@ class AlbertTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier.
-    classifier = keras_nlp.models.AlbertTextClassifier.from_preset(
+    classifier = keras_hub.models.AlbertTextClassifier.from_preset(
         "albert_base_en_uncased",
         num_classes=4,
     )
@@ -97,7 +97,7 @@ class AlbertTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier without preprocessing.
-    classifier = keras_nlp.models.AlbertTextClassifier.from_preset(
+    classifier = keras_hub.models.AlbertTextClassifier.from_preset(
         "albert_base_en_uncased",
         num_classes=4,
         preprocessor=None,
@@ -127,14 +127,14 @@ class AlbertTextClassifier(TextClassifier):
         eos_piece="[SEP]",
         user_defined_symbols="[MASK]",
     )
-    tokenizer = keras_nlp.models.AlbertTokenizer(
+    tokenizer = keras_hub.models.AlbertTokenizer(
         proto=bytes_io.getvalue(),
     )
-    preprocessor = keras_nlp.models.AlbertTextClassifierPreprocessor(
+    preprocessor = keras_hub.models.AlbertTextClassifierPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.AlbertBackbone(
+    backbone = keras_hub.models.AlbertBackbone(
         vocabulary_size=tokenizer.vocabulary_size(),
         num_layers=4,
         num_heads=4,
@@ -143,7 +143,7 @@ class AlbertTextClassifier(TextClassifier):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    classifier = keras_nlp.models.AlbertTextClassifier(
+    classifier = keras_hub.models.AlbertTextClassifier(
         backbone=backbone,
         preprocessor=preprocessor,
         num_classes=4,
diff --git a/keras_nlp/src/models/albert/albert_text_classifier_preprocessor.py b/keras_hub/src/models/albert/albert_text_classifier_preprocessor.py
similarity index 86%
rename from keras_nlp/src/models/albert/albert_text_classifier_preprocessor.py
rename to keras_hub/src/models/albert/albert_text_classifier_preprocessor.py
index be533be259..811f0f3312 100644
--- a/keras_nlp/src/models/albert/albert_text_classifier_preprocessor.py
+++ b/keras_hub/src/models/albert/albert_text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,18 +12,18 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.AlbertTextClassifierPreprocessor",
-        "keras_nlp.models.AlbertPreprocessor",
+        "keras_hub.models.AlbertTextClassifierPreprocessor",
+        "keras_hub.models.AlbertPreprocessor",
     ]
 )
 class AlbertTextClassifierPreprocessor(TextClassifierPreprocessor):
@@ -32,11 +32,11 @@ class AlbertTextClassifierPreprocessor(TextClassifierPreprocessor):
     This preprocessing layer will do three things:
 
      - Tokenize any number of input segments using the `tokenizer`.
-     - Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
+     - Pack the inputs together using a `keras_hub.layers.MultiSegmentPacker`.
        with the appropriate `"[CLS]"`, `"[SEP]"` and `"<pad>"` tokens.
      - Construct a dictionary with keys `"token_ids"`, `"segment_ids"` and
        `"padding_mask"`, that can be passed directly to
-       `keras_nlp.models.AlbertBackbone`.
+       `keras_hub.models.AlbertBackbone`.
 
     This layer can be used directly with `tf.data.Dataset.map` to preprocess
     string data in the `(x, y, sample_weight)` format used by
@@ -56,7 +56,7 @@ class AlbertTextClassifierPreprocessor(TextClassifierPreprocessor):
     the layer, e.g. `ds.map(lambda seg1, seg2: preprocessor(x=(seg1, seg2)))`.
 
     Args:
-        tokenizer: A `keras_nlp.models.AlbertTokenizer` instance.
+        tokenizer: A `keras_hub.models.AlbertTokenizer` instance.
         sequence_length: The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -72,7 +72,7 @@ class AlbertTextClassifierPreprocessor(TextClassifierPreprocessor):
     Examples:
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "albert_base_en_uncased"
     )
 
@@ -106,16 +106,16 @@ class AlbertTextClassifierPreprocessor(TextClassifierPreprocessor):
         eos_piece="[SEP]",
         user_defined_symbols="[MASK]",
     )
-    tokenizer = keras_nlp.models.AlbertTokenizer(
+    tokenizer = keras_hub.models.AlbertTokenizer(
         proto=bytes_io.getvalue(),
     )
-    preprocessor = keras_nlp.models.AlbertTextClassifierPreprocessor(tokenizer)
+    preprocessor = keras_hub.models.AlbertTextClassifierPreprocessor(tokenizer)
     preprocessor("The quick brown fox jumped.")
     ```
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "albert_base_en_uncased"
     )
 
diff --git a/keras_nlp/src/models/albert/albert_text_classifier_preprocessor_test.py b/keras_hub/src/models/albert/albert_text_classifier_preprocessor_test.py
similarity index 91%
rename from keras_nlp/src/models/albert/albert_text_classifier_preprocessor_test.py
rename to keras_hub/src/models/albert/albert_text_classifier_preprocessor_test.py
index 6de562fb11..8f1ab0a93b 100644
--- a/keras_nlp/src/models/albert/albert_text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/albert/albert_text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.albert.albert_text_classifier_preprocessor import (
+from keras_hub.src.models.albert.albert_text_classifier_preprocessor import (
     AlbertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AlbertTextClassifierPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/albert/albert_text_classifier_test.py b/keras_hub/src/models/albert/albert_text_classifier_test.py
similarity index 88%
rename from keras_nlp/src/models/albert/albert_text_classifier_test.py
rename to keras_hub/src/models/albert/albert_text_classifier_test.py
index 5f98e0f8a9..40df84baec 100644
--- a/keras_nlp/src/models/albert/albert_text_classifier_test.py
+++ b/keras_hub/src/models/albert/albert_text_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,15 +16,15 @@
 
 import pytest
 
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_text_classifier import (
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_text_classifier import (
     AlbertTextClassifier,
 )
-from keras_nlp.src.models.albert.albert_text_classifier_preprocessor import (
+from keras_hub.src.models.albert.albert_text_classifier_preprocessor import (
     AlbertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AlbertTextClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/albert/albert_tokenizer.py b/keras_hub/src/models/albert/albert_tokenizer.py
similarity index 85%
rename from keras_nlp/src/models/albert/albert_tokenizer.py
rename to keras_hub/src/models/albert/albert_tokenizer.py
index b96b11cac4..0e54c5078f 100644
--- a/keras_nlp/src/models/albert/albert_tokenizer.py
+++ b/keras_hub/src/models/albert/albert_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,24 +12,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.AlbertTokenizer",
-        "keras_nlp.models.AlbertTokenizer",
+        "keras_hub.tokenizers.AlbertTokenizer",
+        "keras_hub.models.AlbertTokenizer",
     ]
 )
 class AlbertTokenizer(SentencePieceTokenizer):
     """ALBERT tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     ALBERT models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a ALBERT preset.
@@ -50,7 +50,7 @@ class AlbertTokenizer(SentencePieceTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.AlbertTokenizer.from_preset(
+    tokenizer = keras_hub.models.AlbertTokenizer.from_preset(
         "albert_base_en_uncased",
     )
     tokenizer("The quick brown fox jumped.")
@@ -79,7 +79,7 @@ class AlbertTokenizer(SentencePieceTokenizer):
         eos_piece="[SEP]",
         user_defined_symbols="[MASK]",
     )
-    tokenizer = keras_nlp.models.AlbertTokenizer(
+    tokenizer = keras_hub.models.AlbertTokenizer(
         proto=bytes_io.getvalue(),
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/albert/albert_tokenizer_test.py b/keras_hub/src/models/albert/albert_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/albert/albert_tokenizer_test.py
rename to keras_hub/src/models/albert/albert_tokenizer_test.py
index 39bafc3dc3..9699207c96 100644
--- a/keras_nlp/src/models/albert/albert_tokenizer_test.py
+++ b/keras_hub/src/models/albert/albert_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import pytest
 
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class AlbertTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/backbone.py b/keras_hub/src/models/backbone.py
similarity index 90%
rename from keras_nlp/src/models/backbone.py
rename to keras_hub/src/models/backbone.py
index 1c5addae9c..00481bd9eb 100644
--- a/keras_nlp/src/models/backbone.py
+++ b/keras_hub/src/models/backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,23 +16,23 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.keras_utils import assert_quantization_support
-from keras_nlp.src.utils.preset_utils import CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import MODEL_WEIGHTS_FILE
-from keras_nlp.src.utils.preset_utils import builtin_presets
-from keras_nlp.src.utils.preset_utils import get_preset_loader
-from keras_nlp.src.utils.preset_utils import save_metadata
-from keras_nlp.src.utils.preset_utils import save_serialized_object
-from keras_nlp.src.utils.python_utils import classproperty
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.keras_utils import assert_quantization_support
+from keras_hub.src.utils.preset_utils import CONFIG_FILE
+from keras_hub.src.utils.preset_utils import MODEL_WEIGHTS_FILE
+from keras_hub.src.utils.preset_utils import builtin_presets
+from keras_hub.src.utils.preset_utils import get_preset_loader
+from keras_hub.src.utils.preset_utils import save_metadata
+from keras_hub.src.utils.preset_utils import save_serialized_object
+from keras_hub.src.utils.python_utils import classproperty
 
 
-@keras_nlp_export("keras_nlp.models.Backbone")
+@keras_hub_export("keras_hub.models.Backbone")
 class Backbone(keras.Model):
     """Base class for all `Backbone` models.
 
     A `Backbone` is the basic architecture for a given NLP model. Unlike a
-    `keras_nlp.models.Task`, a `Backbone` is not tailored to any specific loss
+    `keras_hub.models.Task`, a `Backbone` is not tailored to any specific loss
     function and training setup. A `Backbone` generally outputs the last hidden
     states of an architecture before any output predictions.
 
@@ -52,11 +52,11 @@ class Backbone(keras.Model):
     Example:
     ```python
     # Load a BERT backbone with pre-trained weights.
-    backbone = keras_nlp.models.Backbone.from_preset(
+    backbone = keras_hub.models.Backbone.from_preset(
         "bert_base_en",
     )
     # Load a GPT2 backbone with pre-trained weights at bfloat16 precision.
-    backbone = keras_nlp.models.Backbone.from_preset(
+    backbone = keras_hub.models.Backbone.from_preset(
         "gpt2_base_en",
         dtype="bfloat16",
         trainable=False,
@@ -150,7 +150,7 @@ def from_preset(
         load_weights=True,
         **kwargs,
     ):
-        """Instantiate a `keras_nlp.models.Backbone` from a model preset.
+        """Instantiate a `keras_hub.models.Backbone` from a model preset.
 
         A preset is a directory of configs, weights and other file assets used
         to save and load a pre-trained model. The `preset` can be passed as a
@@ -162,8 +162,8 @@ def from_preset(
         4. a path to a local preset directory like `'./bert_base_en'`
 
         This constructor can be called in one of two ways. Either from the base
-        class like `keras_nlp.models.Backbone.from_preset()`, or from
-        a model class like `keras_nlp.models.GemmaBackbone.from_preset()`.
+        class like `keras_hub.models.Backbone.from_preset()`, or from
+        a model class like `keras_hub.models.GemmaBackbone.from_preset()`.
         If calling from the base class, the subclass of the returning object
         will be inferred from the config in the preset directory.
 
@@ -180,12 +180,12 @@ class like `keras_nlp.models.Backbone.from_preset()`, or from
         Examples:
         ```python
         # Load a Gemma backbone with pre-trained weights.
-        model = keras_nlp.models.Backbone.from_preset(
+        model = keras_hub.models.Backbone.from_preset(
             "gemma_2b_en",
         )
 
         # Load a Bert backbone with a pre-trained config and random weights.
-        model = keras_nlp.models.Backbone.from_preset(
+        model = keras_hub.models.Backbone.from_preset(
             "bert_base_en",
             load_weights=False,
         )
diff --git a/keras_nlp/src/models/backbone_test.py b/keras_hub/src/models/backbone_test.py
similarity index 87%
rename from keras_nlp/src/models/backbone_test.py
rename to keras_hub/src/models/backbone_test.py
index 03d1dc80f9..8822fe3bf5 100644
--- a/keras_nlp/src/models/backbone_test.py
+++ b/keras_hub/src/models/backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,15 +16,15 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.utils.preset_utils import CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import METADATA_FILE
-from keras_nlp.src.utils.preset_utils import MODEL_WEIGHTS_FILE
-from keras_nlp.src.utils.preset_utils import check_config_class
-from keras_nlp.src.utils.preset_utils import load_json
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.utils.preset_utils import CONFIG_FILE
+from keras_hub.src.utils.preset_utils import METADATA_FILE
+from keras_hub.src.utils.preset_utils import MODEL_WEIGHTS_FILE
+from keras_hub.src.utils.preset_utils import check_config_class
+from keras_hub.src.utils.preset_utils import load_json
 
 
 class TestBackbone(TestCase):
diff --git a/keras_nlp/src/models/bart/__init__.py b/keras_hub/src/models/bart/__init__.py
similarity index 72%
rename from keras_nlp/src/models/bart/__init__.py
rename to keras_hub/src/models/bart/__init__.py
index 8927e19913..80b6071a4a 100644
--- a/keras_nlp/src/models/bart/__init__.py
+++ b/keras_hub/src/models/bart/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.models.bart.bart_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.models.bart.bart_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, BartBackbone)
diff --git a/keras_nlp/src/models/bart/bart_backbone.py b/keras_hub/src/models/bart/bart_backbone.py
similarity index 94%
rename from keras_nlp/src/models/bart/bart_backbone.py
rename to keras_hub/src/models/bart/bart_backbone.py
index f64b9165b6..f15c6a8a6a 100644
--- a/keras_nlp/src/models/bart/bart_backbone.py
+++ b/keras_hub/src/models/bart/bart_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,21 +14,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_decoder import TransformerDecoder
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.models.backbone import Backbone
+from keras_hub.src.layers.modeling.transformer_decoder import TransformerDecoder
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.models.backbone import Backbone
 
 
 def bart_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.BartBackbone")
+@keras_hub_export("keras_hub.models.BartBackbone")
 class BartBackbone(Backbone):
     """BART encoder-decoder network.
 
@@ -78,11 +78,11 @@ class BartBackbone(Backbone):
     }
 
     # Pretrained BART encoder.
-    model = keras_nlp.models.BartBackbone.from_preset("bart_base_en")
+    model = keras_hub.models.BartBackbone.from_preset("bart_base_en")
     model(input_data)
 
     # Randomly initialized BART encoder-decoder model with a custom config
-    model = keras_nlp.models.BartBackbone(
+    model = keras_hub.models.BartBackbone(
         vocabulary_size=50265,
         num_layers=6,
         num_heads=12,
diff --git a/keras_nlp/src/models/bart/bart_backbone_test.py b/keras_hub/src/models/bart/bart_backbone_test.py
similarity index 95%
rename from keras_nlp/src/models/bart/bart_backbone_test.py
rename to keras_hub/src/models/bart/bart_backbone_test.py
index b9c87d524b..d51827dbe2 100644
--- a/keras_nlp/src/models/bart/bart_backbone_test.py
+++ b/keras_hub/src/models/bart/bart_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BartBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/bart/bart_presets.py b/keras_hub/src/models/bart/bart_presets.py
similarity index 98%
rename from keras_nlp/src/models/bart/bart_presets.py
rename to keras_hub/src/models/bart/bart_presets.py
index f73d5b569c..1e72eee9a7 100644
--- a/keras_nlp/src/models/bart/bart_presets.py
+++ b/keras_hub/src/models/bart/bart_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/bart/bart_seq_2_seq_lm.py b/keras_hub/src/models/bart/bart_seq_2_seq_lm.py
similarity index 94%
rename from keras_nlp/src/models/bart/bart_seq_2_seq_lm.py
rename to keras_hub/src/models/bart/bart_seq_2_seq_lm.py
index bcd957cdac..5adc45d61c 100644
--- a/keras_nlp/src/models/bart/bart_seq_2_seq_lm.py
+++ b/keras_hub/src/models/bart/bart_seq_2_seq_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,16 +15,16 @@
 
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
     BartSeq2SeqLMPreprocessor,
 )
-from keras_nlp.src.models.seq_2_seq_lm import Seq2SeqLM
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.models.seq_2_seq_lm import Seq2SeqLM
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.BartSeq2SeqLM")
+@keras_hub_export("keras_hub.models.BartSeq2SeqLM")
 class BartSeq2SeqLM(Seq2SeqLM):
     """An end-to-end BART model for seq2seq language modeling.
 
@@ -37,7 +37,7 @@ class BartSeq2SeqLM(Seq2SeqLM):
     This model has a `generate()` method, which generates text based on
     encoder inputs and an optional prompt for the decoder. The generation
     strategy used is controlled by an additional `sampler` argument passed to
-    `compile()`. You can recompile the model with different `keras_nlp.samplers`
+    `compile()`. You can recompile the model with different `keras_hub.samplers`
     objects to control the generation. By default, `"top_k"` sampling will be
     used.
 
@@ -52,8 +52,8 @@ class BartSeq2SeqLM(Seq2SeqLM):
     [here](https://github.com/facebookresearch/fairseq/).
 
     Args:
-        backbone: A `keras_nlp.models.BartBackbone` instance.
-        preprocessor: A `keras_nlp.models.BartSeq2SeqLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.BartBackbone` instance.
+        preprocessor: A `keras_hub.models.BartSeq2SeqLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
 
@@ -61,7 +61,7 @@ class BartSeq2SeqLM(Seq2SeqLM):
 
     Use `generate()` to do text generation, given an input context.
     ```python
-    bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
+    bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_base_en")
     bart_lm.generate("The quick brown fox", max_length=30)
 
     # Generate with batched inputs.
@@ -70,14 +70,14 @@ class BartSeq2SeqLM(Seq2SeqLM):
 
     Compile the `generate()` function with a custom sampler.
     ```python
-    bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
+    bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_base_en")
     bart_lm.compile(sampler="greedy")
     bart_lm.generate("The quick brown fox", max_length=30)
     ```
 
     Use `generate()` with encoder inputs and an incomplete decoder input (prompt).
     ```python
-    bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
+    bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_base_en")
     bart_lm.generate(
         {
             "encoder_text": "The quick brown fox",
@@ -100,7 +100,7 @@ class BartSeq2SeqLM(Seq2SeqLM):
         "decoder_padding_mask": np.array([[True, True, True, True, False, False]])
     }
 
-    bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset(
+    bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset(
         "bart_base_en",
         preprocessor=None,
     )
@@ -113,7 +113,7 @@ class BartSeq2SeqLM(Seq2SeqLM):
         "encoder_text": ["The quick brown fox jumped.", "I forgot my homework."],
         "decoder_text": ["The fast hazel fox leapt.", "I forgot my assignment."]
     }
-    bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
+    bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_base_en")
     bart_lm.fit(x=features, batch_size=2)
     ```
 
@@ -128,7 +128,7 @@ class BartSeq2SeqLM(Seq2SeqLM):
     y = np.array([[0, 133, 1769, 2, 1]] * 2)
     sw = np.array([[1, 1, 1, 1, 0]] * 2)
 
-    bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset(
+    bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset(
         "bart_base_en",
         preprocessor=None,
     )
@@ -152,16 +152,16 @@ class BartSeq2SeqLM(Seq2SeqLM):
     merges = ["Ġ a", "Ġ s", "Ġ n", "e r", "n o", "o n", "Ġs u", "Ġa f", "no on"]
     merges += ["Ġsu n", "Ġaf t", "Ġaft er"]
 
-    tokenizer = keras_nlp.models.BartTokenizer(
+    tokenizer = keras_hub.models.BartTokenizer(
         vocabulary=vocab,
         merges=merges,
     )
-    preprocessor = keras_nlp.models.BartSeq2SeqLMPreprocessor(
+    preprocessor = keras_hub.models.BartSeq2SeqLMPreprocessor(
         tokenizer=tokenizer,
         encoder_sequence_length=128,
         decoder_sequence_length=128,
     )
-    backbone = keras_nlp.models.BartBackbone(
+    backbone = keras_hub.models.BartBackbone(
         vocabulary_size=50265,
         num_layers=6,
         num_heads=12,
@@ -169,7 +169,7 @@ class BartSeq2SeqLM(Seq2SeqLM):
         intermediate_dim=3072,
         max_sequence_length=128,
     )
-    bart_lm = keras_nlp.models.BartSeq2SeqLM(
+    bart_lm = keras_hub.models.BartSeq2SeqLM(
         backbone=backbone,
         preprocessor=preprocessor,
     )
diff --git a/keras_nlp/src/models/bart/bart_seq_2_seq_lm_preprocessor.py b/keras_hub/src/models/bart/bart_seq_2_seq_lm_preprocessor.py
similarity index 84%
rename from keras_nlp/src/models/bart/bart_seq_2_seq_lm_preprocessor.py
rename to keras_hub/src/models/bart/bart_seq_2_seq_lm_preprocessor.py
index 315242511e..478ee1a269 100644
--- a/keras_nlp/src/models/bart/bart_seq_2_seq_lm_preprocessor.py
+++ b/keras_hub/src/models/bart/bart_seq_2_seq_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,19 +13,19 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.models.bart.bart_tokenizer import BartTokenizer
-from keras_nlp.src.models.seq_2_seq_lm_preprocessor import Seq2SeqLMPreprocessor
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.models.bart.bart_tokenizer import BartTokenizer
+from keras_hub.src.models.seq_2_seq_lm_preprocessor import Seq2SeqLMPreprocessor
 
 
-@keras_nlp_export("keras_nlp.models.BartSeq2SeqLMPreprocessor")
+@keras_hub_export("keras_hub.models.BartSeq2SeqLMPreprocessor")
 class BartSeq2SeqLMPreprocessor(Seq2SeqLMPreprocessor):
     """BART Seq2Seq LM preprocessor.
 
     This layer is used as preprocessor for seq2seq tasks using the BART model.
-    This class subclasses `keras_nlp.models.BartPreprocessor` and keeps most of
+    This class subclasses `keras_hub.models.BartPreprocessor` and keeps most of
     its functionality. It has two changes from the superclass:
 
      1. Sets the `y` (label) and `sample_weights` fields by shifting the
@@ -35,7 +35,7 @@ class BartSeq2SeqLMPreprocessor(Seq2SeqLMPreprocessor):
         a successor.
 
     Args:
-        tokenizer: A `keras_nlp.models.BartTokenizer` instance.
+        tokenizer: A `keras_hub.models.BartTokenizer` instance.
         encoder_sequence_length: The length of the packed encoder inputs.
         decoder_sequence_length: The length of the packed decoder inputs.
 
@@ -54,7 +54,7 @@ class BartSeq2SeqLMPreprocessor(Seq2SeqLMPreprocessor):
 
     Directly calling the layer on data
     ```python
-    preprocessor = keras_nlp.models.BartPreprocessor.from_preset("bart_base_en")
+    preprocessor = keras_hub.models.BartPreprocessor.from_preset("bart_base_en")
 
     # Preprocess unbatched inputs.
     inputs = {
@@ -82,11 +82,11 @@ class BartSeq2SeqLMPreprocessor(Seq2SeqLMPreprocessor):
     merges = ["Ġ a", "Ġ s", "Ġ n", "e r", "n o", "o n", "Ġs u", "Ġa f", "no on"]
     merges += ["Ġsu n", "Ġaf t", "Ġaft er"]
 
-    tokenizer = keras_nlp.models.BartTokenizer(
+    tokenizer = keras_hub.models.BartTokenizer(
         vocabulary=vocab,
         merges=merges,
     )
-    preprocessor = keras_nlp.models.BartPreprocessor(
+    preprocessor = keras_hub.models.BartPreprocessor(
         tokenizer=tokenizer,
         encoder_sequence_length=20,
         decoder_sequence_length=10,
@@ -100,7 +100,7 @@ class BartSeq2SeqLMPreprocessor(Seq2SeqLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.BartPreprocessor.from_preset("bart_base_en")
+    preprocessor = keras_hub.models.BartPreprocessor.from_preset("bart_base_en")
 
     # Map single sentences.
     features = {
diff --git a/keras_nlp/src/models/bart/bart_seq_2_seq_lm_preprocessor_test.py b/keras_hub/src/models/bart/bart_seq_2_seq_lm_preprocessor_test.py
similarity index 94%
rename from keras_nlp/src/models/bart/bart_seq_2_seq_lm_preprocessor_test.py
rename to keras_hub/src/models/bart/bart_seq_2_seq_lm_preprocessor_test.py
index 471fd45e95..21b329f134 100644
--- a/keras_nlp/src/models/bart/bart_seq_2_seq_lm_preprocessor_test.py
+++ b/keras_hub/src/models/bart/bart_seq_2_seq_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
+from keras_hub.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
     BartSeq2SeqLMPreprocessor,
 )
-from keras_nlp.src.models.bart.bart_tokenizer import BartTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bart.bart_tokenizer import BartTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BartSeq2SeqLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/bart/bart_seq_2_seq_lm_test.py b/keras_hub/src/models/bart/bart_seq_2_seq_lm_test.py
similarity index 94%
rename from keras_nlp/src/models/bart/bart_seq_2_seq_lm_test.py
rename to keras_hub/src/models/bart/bart_seq_2_seq_lm_test.py
index a218ae3e82..494e4116d0 100644
--- a/keras_nlp/src/models/bart/bart_seq_2_seq_lm_test.py
+++ b/keras_hub/src/models/bart/bart_seq_2_seq_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm import BartSeq2SeqLM
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.models.bart.bart_seq_2_seq_lm import BartSeq2SeqLM
+from keras_hub.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
     BartSeq2SeqLMPreprocessor,
 )
-from keras_nlp.src.models.bart.bart_tokenizer import BartTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bart.bart_tokenizer import BartTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BartSeq2SeqLMTest(TestCase):
diff --git a/keras_nlp/src/models/bart/bart_tokenizer.py b/keras_hub/src/models/bart/bart_tokenizer.py
similarity index 83%
rename from keras_nlp/src/models/bart/bart_tokenizer.py
rename to keras_hub/src/models/bart/bart_tokenizer.py
index 45dd2189b8..4cf53aa9b3 100644
--- a/keras_nlp/src/models/bart/bart_tokenizer.py
+++ b/keras_hub/src/models/bart/bart_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,28 +13,28 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.BartTokenizer",
-        "keras_nlp.models.BartTokenizer",
+        "keras_hub.tokenizers.BartTokenizer",
+        "keras_hub.models.BartTokenizer",
     ]
 )
 class BartTokenizer(BytePairTokenizer):
     """A BART tokenizer using Byte-Pair Encoding subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.BytePairTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by BART
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a BART preset.
 
     This tokenizer does not provide truncation or padding of inputs. It can be
-    combined with a `keras_nlp.models.BartPreprocessor` layer for input
+    combined with a `keras_hub.models.BartPreprocessor` layer for input
     packing.
 
     If input is a batch of strings (rank > 0), the layer will output a
@@ -55,7 +55,7 @@ class BartTokenizer(BytePairTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.BartTokenizer.from_preset(
+    tokenizer = keras_hub.models.BartTokenizer.from_preset(
         "bart_base_en",
     )
     tokenizer("The quick brown fox jumped.")
@@ -71,7 +71,7 @@ class BartTokenizer(BytePairTokenizer):
     vocab = {**vocab, "a": 4, "Ġquick": 5, "Ġfox": 6}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.BartTokenizer(
+    tokenizer = keras_hub.models.BartTokenizer(
         vocabulary=vocab,
         merges=merges,
     )
diff --git a/keras_nlp/src/models/bart/bart_tokenizer_test.py b/keras_hub/src/models/bart/bart_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/bart/bart_tokenizer_test.py
rename to keras_hub/src/models/bart/bart_tokenizer_test.py
index 3f01c6e712..633618976d 100644
--- a/keras_nlp/src/models/bart/bart_tokenizer_test.py
+++ b/keras_hub/src/models/bart/bart_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.bart.bart_tokenizer import BartTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bart.bart_tokenizer import BartTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BartTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/bert/__init__.py b/keras_hub/src/models/bert/__init__.py
similarity index 72%
rename from keras_nlp/src/models/bert/__init__.py
rename to keras_hub/src/models/bert/__init__.py
index bfc3e904e2..0ee5feb84a 100644
--- a/keras_nlp/src/models/bert/__init__.py
+++ b/keras_hub/src/models/bert/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, BertBackbone)
diff --git a/keras_nlp/src/models/bert/bert_backbone.py b/keras_hub/src/models/bert/bert_backbone.py
similarity index 93%
rename from keras_nlp/src/models/bert/bert_backbone.py
rename to keras_hub/src/models/bert/bert_backbone.py
index 2b47578be8..df00172b8b 100644
--- a/keras_nlp/src/models/bert/bert_backbone.py
+++ b/keras_hub/src/models/bert/bert_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,21 +14,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
 def bert_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.BertBackbone")
+@keras_hub_export("keras_hub.models.BertBackbone")
 class BertBackbone(Backbone):
     """A BERT encoder network.
 
@@ -74,11 +74,11 @@ class BertBackbone(Backbone):
     }
 
     # Pretrained BERT encoder.
-    model = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased")
+    model = keras_hub.models.BertBackbone.from_preset("bert_base_en_uncased")
     model(input_data)
 
     # Randomly initialized BERT encoder with a custom config.
-    model = keras_nlp.models.BertBackbone(
+    model = keras_hub.models.BertBackbone(
         vocabulary_size=30552,
         num_layers=4,
         num_heads=4,
diff --git a/keras_nlp/src/models/bert/bert_backbone_test.py b/keras_hub/src/models/bert/bert_backbone_test.py
similarity index 94%
rename from keras_nlp/src/models/bert/bert_backbone_test.py
rename to keras_hub/src/models/bert/bert_backbone_test.py
index f2e8bd5d34..91a703a08f 100644
--- a/keras_nlp/src/models/bert/bert_backbone_test.py
+++ b/keras_hub/src/models/bert/bert_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BertBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/bert/bert_masked_lm.py b/keras_hub/src/models/bert/bert_masked_lm.py
similarity index 84%
rename from keras_nlp/src/models/bert/bert_masked_lm.py
rename to keras_hub/src/models/bert/bert_masked_lm.py
index 1daa570559..4bbe6a595a 100644
--- a/keras_nlp/src/models/bert/bert_masked_lm.py
+++ b/keras_hub/src/models/bert/bert_masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,17 +14,17 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_backbone import bert_kernel_initializer
-from keras_nlp.src.models.bert.bert_masked_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_backbone import bert_kernel_initializer
+from keras_hub.src.models.bert.bert_masked_lm_preprocessor import (
     BertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.masked_lm import MaskedLM
+from keras_hub.src.models.masked_lm import MaskedLM
 
 
-@keras_nlp_export("keras_nlp.models.BertMaskedLM")
+@keras_hub_export("keras_hub.models.BertMaskedLM")
 class BertMaskedLM(MaskedLM):
     """An end-to-end BERT model for the masked language modeling task.
 
@@ -43,8 +43,8 @@ class BertMaskedLM(MaskedLM):
     warranties or conditions of any kind.
 
     Args:
-        backbone: A `keras_nlp.models.BertBackbone` instance.
-        preprocessor: A `keras_nlp.models.BertMaskedLMPreprocessor` or
+        backbone: A `keras_hub.models.BertBackbone` instance.
+        preprocessor: A `keras_hub.models.BertMaskedLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -55,7 +55,7 @@ class BertMaskedLM(MaskedLM):
     features = ["The quick brown fox jumped.", "I forgot my homework."]
 
     # Pretrained language model.
-    masked_lm = keras_nlp.models.BertMaskedLM.from_preset(
+    masked_lm = keras_hub.models.BertMaskedLM.from_preset(
         "bert_base_en_uncased",
     )
     masked_lm.fit(x=features, batch_size=2)
@@ -84,7 +84,7 @@ class BertMaskedLM(MaskedLM):
     # Labels are the original masked values.
     labels = [[3, 5]] * 2
 
-    masked_lm = keras_nlp.models.BertMaskedLM.from_preset(
+    masked_lm = keras_hub.models.BertMaskedLM.from_preset(
         "bert_base_en_uncased",
         preprocessor=None,
     )
diff --git a/keras_nlp/src/models/bert/bert_masked_lm_preprocessor.py b/keras_hub/src/models/bert/bert_masked_lm_preprocessor.py
similarity index 87%
rename from keras_nlp/src/models/bert/bert_masked_lm_preprocessor.py
rename to keras_hub/src/models/bert/bert_masked_lm_preprocessor.py
index ef060d2e46..84aa0dea86 100644
--- a/keras_nlp/src/models/bert/bert_masked_lm_preprocessor.py
+++ b/keras_hub/src/models/bert/bert_masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,19 +12,19 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
 
 
-@keras_nlp_export("keras_nlp.models.BertMaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.BertMaskedLMPreprocessor")
 class BertMaskedLMPreprocessor(MaskedLMPreprocessor):
     """BERT preprocessing for the masked language modeling task.
 
     This preprocessing layer will prepare inputs for a masked language modeling
     task. It is primarily intended for use with the
-    `keras_nlp.models.BertMaskedLM` task model. Preprocessing will occur in
+    `keras_hub.models.BertMaskedLM` task model. Preprocessing will occur in
     multiple steps.
 
     1. Tokenize any number of input segments using the `tokenizer`.
@@ -33,10 +33,10 @@ class BertMaskedLMPreprocessor(MaskedLMPreprocessor):
     3. Randomly select non-special tokens to mask, controlled by
       `mask_selection_rate`.
     4. Construct a `(x, y, sample_weight)` tuple suitable for training with a
-      `keras_nlp.models.BertMaskedLM` task model.
+      `keras_hub.models.BertMaskedLM` task model.
 
     Args:
-        tokenizer: A `keras_nlp.models.BertTokenizer` instance.
+        tokenizer: A `keras_hub.models.BertTokenizer` instance.
         sequence_length: int. The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -72,7 +72,7 @@ class BertMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.BertMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.BertMaskedLMPreprocessor.from_preset(
         "bert_base_en_uncased"
     )
 
@@ -91,7 +91,7 @@ class BertMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.BertMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.BertMaskedLMPreprocessor.from_preset(
         "bert_base_en_uncased"
     )
 
diff --git a/keras_nlp/src/models/bert/bert_masked_lm_preprocessor_test.py b/keras_hub/src/models/bert/bert_masked_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/bert/bert_masked_lm_preprocessor_test.py
rename to keras_hub/src/models/bert/bert_masked_lm_preprocessor_test.py
index 3e41b8c55e..c6076d701d 100644
--- a/keras_nlp/src/models/bert/bert_masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/bert/bert_masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.bert.bert_masked_lm_preprocessor import (
+from keras_hub.src.models.bert.bert_masked_lm_preprocessor import (
     BertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BertMaskedLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/bert/bert_masked_lm_test.py b/keras_hub/src/models/bert/bert_masked_lm_test.py
similarity index 87%
rename from keras_nlp/src/models/bert/bert_masked_lm_test.py
rename to keras_hub/src/models/bert/bert_masked_lm_test.py
index c00848c7c7..67e92e6bcf 100644
--- a/keras_nlp/src/models/bert/bert_masked_lm_test.py
+++ b/keras_hub/src/models/bert/bert_masked_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,13 +14,13 @@
 
 import pytest
 
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_masked_lm import BertMaskedLM
-from keras_nlp.src.models.bert.bert_masked_lm_preprocessor import (
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_masked_lm import BertMaskedLM
+from keras_hub.src.models.bert.bert_masked_lm_preprocessor import (
     BertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BertMaskedLMTest(TestCase):
diff --git a/keras_nlp/src/models/bert/bert_presets.py b/keras_hub/src/models/bert/bert_presets.py
similarity index 99%
rename from keras_nlp/src/models/bert/bert_presets.py
rename to keras_hub/src/models/bert/bert_presets.py
index 85d1c8f6e2..b4547e259c 100644
--- a/keras_nlp/src/models/bert/bert_presets.py
+++ b/keras_hub/src/models/bert/bert_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/bert/bert_text_classifier.py b/keras_hub/src/models/bert/bert_text_classifier.py
similarity index 84%
rename from keras_nlp/src/models/bert/bert_text_classifier.py
rename to keras_hub/src/models/bert/bert_text_classifier.py
index 3af9116914..0f21e0c82a 100644
--- a/keras_nlp/src/models/bert/bert_text_classifier.py
+++ b/keras_hub/src/models/bert/bert_text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,26 +14,26 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_backbone import bert_kernel_initializer
-from keras_nlp.src.models.bert.bert_text_classifier_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_backbone import bert_kernel_initializer
+from keras_hub.src.models.bert.bert_text_classifier_preprocessor import (
     BertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.BertTextClassifier",
-        "keras_nlp.models.BertClassifier",
+        "keras_hub.models.BertTextClassifier",
+        "keras_hub.models.BertClassifier",
     ]
 )
 class BertTextClassifier(TextClassifier):
     """An end-to-end BERT model for classification tasks.
 
     This model attaches a classification head to a
-    `keras_nlp.model.BertBackbone` instance, mapping from the backbone outputs
+    `keras_hub.models.BertBackbone` instance, mapping from the backbone outputs
     to logits suitable for a classification task. For usage of this model with
     pre-trained weights, use the `from_preset()` constructor.
 
@@ -46,9 +46,9 @@ class BertTextClassifier(TextClassifier):
     warranties or conditions of any kind.
 
     Args:
-        backbone: A `keras_nlp.models.BertBackbone` instance.
+        backbone: A `keras_hub.models.BertBackbone` instance.
         num_classes: int. Number of classes to predict.
-        preprocessor: A `keras_nlp.models.BertTextClassifierPreprocessor` or `None`. If
+        preprocessor: A `keras_hub.models.BertTextClassifierPreprocessor` or `None`. If
             `None`, this model will not apply preprocessing, and inputs should
             be preprocessed before calling the model.
         activation: Optional `str` or callable. The
@@ -66,7 +66,7 @@ class BertTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier.
-    classifier = keras_nlp.models.BertTextClassifier.from_preset(
+    classifier = keras_hub.models.BertTextClassifier.from_preset(
         "bert_base_en_uncased",
         num_classes=4,
     )
@@ -95,7 +95,7 @@ class BertTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier without preprocessing.
-    classifier = keras_nlp.models.BertTextClassifier.from_preset(
+    classifier = keras_hub.models.BertTextClassifier.from_preset(
         "bert_base_en_uncased",
         num_classes=4,
         preprocessor=None,
@@ -110,14 +110,14 @@ class BertTextClassifier(TextClassifier):
 
     vocab = ["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"]
     vocab += ["The", "quick", "brown", "fox", "jumped", "."]
-    tokenizer = keras_nlp.models.BertTokenizer(
+    tokenizer = keras_hub.models.BertTokenizer(
         vocabulary=vocab,
     )
-    preprocessor = keras_nlp.models.BertTextClassifierPreprocessor(
+    preprocessor = keras_hub.models.BertTextClassifierPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.BertBackbone(
+    backbone = keras_hub.models.BertBackbone(
         vocabulary_size=30552,
         num_layers=4,
         num_heads=4,
@@ -125,7 +125,7 @@ class BertTextClassifier(TextClassifier):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    classifier = keras_nlp.models.BertTextClassifier(
+    classifier = keras_hub.models.BertTextClassifier(
         backbone=backbone,
         preprocessor=preprocessor,
         num_classes=4,
diff --git a/keras_nlp/src/models/bert/bert_text_classifier_preprocessor.py b/keras_hub/src/models/bert/bert_text_classifier_preprocessor.py
similarity index 85%
rename from keras_nlp/src/models/bert/bert_text_classifier_preprocessor.py
rename to keras_hub/src/models/bert/bert_text_classifier_preprocessor.py
index 5ed47b7694..8239d300f9 100644
--- a/keras_nlp/src/models/bert/bert_text_classifier_preprocessor.py
+++ b/keras_hub/src/models/bert/bert_text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,18 +12,18 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.BertTextClassifierPreprocessor",
-        "keras_nlp.models.BertPreprocessor",
+        "keras_hub.models.BertTextClassifierPreprocessor",
+        "keras_hub.models.BertPreprocessor",
     ]
 )
 class BertTextClassifierPreprocessor(TextClassifierPreprocessor):
@@ -32,7 +32,7 @@ class BertTextClassifierPreprocessor(TextClassifierPreprocessor):
     This preprocessing layer will do three things:
 
     1. Tokenize any number of input segments using the `tokenizer`.
-    2. Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
+    2. Pack the inputs together using a `keras_hub.layers.MultiSegmentPacker`
        with the appropriate `"[CLS]"`, `"[SEP]"` and `"[PAD]"` tokens.
     3. Construct a dictionary with keys `"token_ids"`, `"segment_ids"`,
        `"padding_mask"`, that can be passed directly to a BERT model.
@@ -42,7 +42,7 @@ class BertTextClassifierPreprocessor(TextClassifierPreprocessor):
     `keras.Model.fit`.
 
     Args:
-        tokenizer: A `keras_nlp.models.BertTokenizer` instance.
+        tokenizer: A `keras_hub.models.BertTokenizer` instance.
         sequence_length: The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -67,7 +67,7 @@ class BertTextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "bert_base_en_uncased"
     )
 
@@ -86,14 +86,14 @@ class BertTextClassifierPreprocessor(TextClassifierPreprocessor):
     # Custom vocabulary.
     vocab = ["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"]
     vocab += ["The", "quick", "brown", "fox", "jumped", "."]
-    tokenizer = keras_nlp.models.BertTokenizer(vocabulary=vocab)
-    preprocessor = keras_nlp.models.BertTextClassifierPreprocessor(tokenizer)
+    tokenizer = keras_hub.models.BertTokenizer(vocabulary=vocab)
+    preprocessor = keras_hub.models.BertTextClassifierPreprocessor(tokenizer)
     preprocessor("The quick brown fox jumped.")
     ```
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "bert_base_en_uncased"
     )
 
diff --git a/keras_nlp/src/models/bert/bert_text_classifier_preprocessor_test.py b/keras_hub/src/models/bert/bert_text_classifier_preprocessor_test.py
similarity index 91%
rename from keras_nlp/src/models/bert/bert_text_classifier_preprocessor_test.py
rename to keras_hub/src/models/bert/bert_text_classifier_preprocessor_test.py
index 30215f2a81..53bd8727f2 100644
--- a/keras_nlp/src/models/bert/bert_text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/bert/bert_text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.bert.bert_text_classifier_preprocessor import (
+from keras_hub.src.models.bert.bert_text_classifier_preprocessor import (
     BertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BertTextClassifierPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/bert/bert_text_classifier_test.py b/keras_hub/src/models/bert/bert_text_classifier_test.py
similarity index 87%
rename from keras_nlp/src/models/bert/bert_text_classifier_test.py
rename to keras_hub/src/models/bert/bert_text_classifier_test.py
index 44b9a3d846..44fca4b38b 100644
--- a/keras_nlp/src/models/bert/bert_text_classifier_test.py
+++ b/keras_hub/src/models/bert/bert_text_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,13 +14,13 @@
 
 import pytest
 
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_text_classifier import BertTextClassifier
-from keras_nlp.src.models.bert.bert_text_classifier_preprocessor import (
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_text_classifier import BertTextClassifier
+from keras_hub.src.models.bert.bert_text_classifier_preprocessor import (
     BertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BertTextClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/bert/bert_tokenizer.py b/keras_hub/src/models/bert/bert_tokenizer.py
similarity index 85%
rename from keras_nlp/src/models/bert/bert_tokenizer.py
rename to keras_hub/src/models/bert/bert_tokenizer.py
index da11efec21..7bb5182863 100644
--- a/keras_nlp/src/models/bert/bert_tokenizer.py
+++ b/keras_hub/src/models/bert/bert_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,22 +12,22 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.BertTokenizer",
-        "keras_nlp.models.BertTokenizer",
+        "keras_hub.tokenizers.BertTokenizer",
+        "keras_hub.models.BertTokenizer",
     ]
 )
 class BertTokenizer(WordPieceTokenizer):
     """A BERT tokenizer using WordPiece subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.WordPieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.WordPieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by BERT
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a BERT preset.
@@ -52,7 +52,7 @@ class BertTokenizer(WordPieceTokenizer):
     Examples:
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.BertTokenizer.from_preset(
+    tokenizer = keras_hub.models.BertTokenizer.from_preset(
         "bert_base_en_uncased",
     )
     tokenizer("The quick brown fox jumped.")
@@ -66,7 +66,7 @@ class BertTokenizer(WordPieceTokenizer):
     # Custom vocabulary.
     vocab = ["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"]
     vocab += ["The", "quick", "brown", "fox", "jumped", "."]
-    tokenizer = keras_nlp.models.BertTokenizer(vocabulary=vocab)
+    tokenizer = keras_hub.models.BertTokenizer(vocabulary=vocab)
     tokenizer("The quick brown fox jumped.")
     ```
     """
diff --git a/keras_nlp/src/models/bert/bert_tokenizer_test.py b/keras_hub/src/models/bert/bert_tokenizer_test.py
similarity index 94%
rename from keras_nlp/src/models/bert/bert_tokenizer_test.py
rename to keras_hub/src/models/bert/bert_tokenizer_test.py
index 56e17327fb..5b8e3e754c 100644
--- a/keras_nlp/src/models/bert/bert_tokenizer_test.py
+++ b/keras_hub/src/models/bert/bert_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BertTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/bloom/__init__.py b/keras_hub/src/models/bloom/__init__.py
similarity index 72%
rename from keras_nlp/src/models/bloom/__init__.py
rename to keras_hub/src/models/bloom/__init__.py
index 35fe54e0f6..3f5897cf81 100644
--- a/keras_nlp/src/models/bloom/__init__.py
+++ b/keras_hub/src/models/bloom/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.bloom.bloom_backbone import BloomBackbone
-from keras_nlp.src.models.bloom.bloom_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.bloom.bloom_backbone import BloomBackbone
+from keras_hub.src.models.bloom.bloom_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, BloomBackbone)
diff --git a/keras_nlp/src/models/bloom/bloom_attention.py b/keras_hub/src/models/bloom/bloom_attention.py
similarity index 97%
rename from keras_nlp/src/models/bloom/bloom_attention.py
rename to keras_hub/src/models/bloom/bloom_attention.py
index d91eafd575..29199904a3 100644
--- a/keras_nlp/src/models/bloom/bloom_attention.py
+++ b/keras_hub/src/models/bloom/bloom_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.alibi_bias import AlibiBias
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.layers.modeling.alibi_bias import AlibiBias
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class BloomAttention(keras.layers.Layer):
diff --git a/keras_nlp/src/models/bloom/bloom_backbone.py b/keras_hub/src/models/bloom/bloom_backbone.py
similarity index 93%
rename from keras_nlp/src/models/bloom/bloom_backbone.py
rename to keras_hub/src/models/bloom/bloom_backbone.py
index a5d88d0011..f8c9a1eada 100644
--- a/keras_nlp/src/models/bloom/bloom_backbone.py
+++ b/keras_hub/src/models/bloom/bloom_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,19 +14,19 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.bloom.bloom_decoder import BloomDecoder
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.bloom.bloom_decoder import BloomDecoder
 
 
 def _bloom_kernel_initializer(stddev=0.02):
     return keras.initializers.RandomNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.BloomBackbone")
+@keras_hub_export("keras_hub.models.BloomBackbone")
 class BloomBackbone(Backbone):
     """A BLOOM decoder network.
 
@@ -66,11 +66,11 @@ class BloomBackbone(Backbone):
     }
 
     # Pretrained BLOOM decoder.
-    model = keras_nlp.models.BloomBackbone.from_preset("bloom_560m_multi")
+    model = keras_hub.models.BloomBackbone.from_preset("bloom_560m_multi")
     model(input_data)
 
     # Randomly initialized BLOOM decoder with a custom config.
-    model = keras_nlp.models.BloomBackbone(
+    model = keras_hub.models.BloomBackbone(
         vocabulary_size=10,
         num_layers=2,
         num_heads=2,
diff --git a/keras_nlp/src/models/bloom/bloom_backbone_test.py b/keras_hub/src/models/bloom/bloom_backbone_test.py
similarity index 93%
rename from keras_nlp/src/models/bloom/bloom_backbone_test.py
rename to keras_hub/src/models/bloom/bloom_backbone_test.py
index eeacf06291..75fcc02b29 100644
--- a/keras_nlp/src/models/bloom/bloom_backbone_test.py
+++ b/keras_hub/src/models/bloom/bloom_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.bloom.bloom_backbone import BloomBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bloom.bloom_backbone import BloomBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BloomBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/bloom/bloom_causal_lm.py b/keras_hub/src/models/bloom/bloom_causal_lm.py
similarity index 89%
rename from keras_nlp/src/models/bloom/bloom_causal_lm.py
rename to keras_hub/src/models/bloom/bloom_causal_lm.py
index 40cd4a8a5c..f047762182 100644
--- a/keras_nlp/src/models/bloom/bloom_causal_lm.py
+++ b/keras_hub/src/models/bloom/bloom_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,16 +15,16 @@
 
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bloom.bloom_backbone import BloomBackbone
-from keras_nlp.src.models.bloom.bloom_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bloom.bloom_backbone import BloomBackbone
+from keras_hub.src.models.bloom.bloom_causal_lm_preprocessor import (
     BloomCausalLMPreprocessor,
 )
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.BloomCausalLM")
+@keras_hub_export("keras_hub.models.BloomCausalLM")
 class BloomCausalLM(CausalLM):
     """An end-to-end BLOOM model for causal language modeling.
 
@@ -37,7 +37,7 @@ class BloomCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"greedy"` sampling will be used.
 
     This model can optionally be configured with a `preprocessor` layer, in
@@ -46,8 +46,8 @@ class BloomCausalLM(CausalLM):
     when creating the model with `from_preset()`.
 
     Args:
-        backbone: A `keras_nlp.models.BloomBackbone` instance.
-        preprocessor: A `keras_nlp.models.BloomCausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.BloomBackbone` instance.
+        preprocessor: A `keras_hub.models.BloomCausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
 
@@ -55,7 +55,7 @@ class BloomCausalLM(CausalLM):
 
     Use `generate()` to do text generation.
     ```python
-    bloom_lm = keras_nlp.models.BloomCausalLM.from_preset("bloom_560m_multi")
+    bloom_lm = keras_hub.models.BloomCausalLM.from_preset("bloom_560m_multi")
     bloom_lm.generate("I want to say", max_length=30)
 
     # Generate with batched prompts.
@@ -64,11 +64,11 @@ class BloomCausalLM(CausalLM):
 
     Compile the `generate()` function with a custom sampler.
     ```python
-    bloom_lm = keras_nlp.models.BloomCausalLM.from_preset("bloom_560m_multi")
+    bloom_lm = keras_hub.models.BloomCausalLM.from_preset("bloom_560m_multi")
     bloom_lm.compile(sampler="top_k")
     bloom_lm.generate("I want to say", max_length=30)
 
-    bloom_lm.compile(sampler=keras_nlp.samplers.BeamSampler(num_beams=2))
+    bloom_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
     bloom_lm.generate("I want to say", max_length=30)
     ```
 
@@ -81,7 +81,7 @@ class BloomCausalLM(CausalLM):
         "padding_mask": np.array([[1, 1, 1, 1, 0, 0, 0, 0, 0]] * 2),
     }
 
-    bloom_lm = keras_nlp.models.BloomCausalLM.from_preset(
+    bloom_lm = keras_hub.models.BloomCausalLM.from_preset(
         "bloom_560m_multi",
         preprocessor=None,
     )
@@ -91,7 +91,7 @@ class BloomCausalLM(CausalLM):
     Call `fit()` on a single batch.
     ```python
     features = ["The quick brown fox jumped.", "I forgot my homework."]
-    bloom_lm = keras_nlp.models.BloomCausalLM.from_preset("bloom_560m_multi")
+    bloom_lm = keras_hub.models.BloomCausalLM.from_preset("bloom_560m_multi")
     bloom_lm.fit(x=features, batch_size=2)
     ```
 
@@ -105,7 +105,7 @@ class BloomCausalLM(CausalLM):
     y = np.array([[214064, 603, 5271, 6044, 9581, 3, 0, 0]] * 2)
     sw = np.array([[1, 1, 1, 1, 1, 1, 0, 0]] * 2)
 
-    bloom_lm = keras_nlp.models.BloomCausalLM.from_preset(
+    bloom_lm = keras_hub.models.BloomCausalLM.from_preset(
         "bloom_560m_multi",
         preprocessor=None,
     )
@@ -124,19 +124,19 @@ class BloomCausalLM(CausalLM):
     merges = ["Ġ a", "Ġ t", "Ġ i", "Ġ b", "a i", "p l", "n e"]
     merges += ["Ġa t", "p o", "r t", "Ġt h", "ai r", "pl a", "po rt"]
     merges += ["Ġai r", "Ġa i", "pla ne"]
-    tokenizer = keras_nlp.models.BloomTokenizer(vocabulary=vocab, merges=merges)
-    preprocessor = keras_nlp.models.BloomCausalLMPreprocessor(
+    tokenizer = keras_hub.models.BloomTokenizer(vocabulary=vocab, merges=merges)
+    preprocessor = keras_hub.models.BloomCausalLMPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.BloomBackbone(
+    backbone = keras_hub.models.BloomBackbone(
         vocabulary_size=tokenizer.vocabulary_size(),
         num_layers=4,
         num_heads=4,
         hidden_dim=32,
         intermediate_dim=128,
     )
-    bloom_lm = keras_nlp.models.BloomCausalLM(
+    bloom_lm = keras_hub.models.BloomCausalLM(
         backbone=backbone,
         preprocessor=preprocessor,
     )
diff --git a/keras_nlp/src/models/bloom/bloom_causal_lm_preprocessor.py b/keras_hub/src/models/bloom/bloom_causal_lm_preprocessor.py
similarity index 83%
rename from keras_nlp/src/models/bloom/bloom_causal_lm_preprocessor.py
rename to keras_hub/src/models/bloom/bloom_causal_lm_preprocessor.py
index f713e2c578..de3dd90a0b 100644
--- a/keras_nlp/src/models/bloom/bloom_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/bloom/bloom_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,30 +13,30 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bloom.bloom_backbone import BloomBackbone
-from keras_nlp.src.models.bloom.bloom_tokenizer import BloomTokenizer
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bloom.bloom_backbone import BloomBackbone
+from keras_hub.src.models.bloom.bloom_tokenizer import BloomTokenizer
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
 
 
-@keras_nlp_export("keras_nlp.models.BloomCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.BloomCausalLMPreprocessor")
 class BloomCausalLMPreprocessor(CausalLMPreprocessor):
     """BLOOM Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.BloomCausalLM`. By default, it will take in batches of
+    `keras_hub.models.BloomCausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.BloomCausalLM` instance, these methods
+    is attached to a `keras_hub.models.BloomCausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.BloomTokenizer` instance.
+        tokenizer: A `keras_hub.models.BloomTokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence.
@@ -54,7 +54,7 @@ class BloomCausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.BloomCausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.BloomCausalLMPreprocessor.from_preset(
         "bloom_560m_multi"
     )
 
diff --git a/keras_nlp/src/models/bloom/bloom_causal_lm_preprocessor_test.py b/keras_hub/src/models/bloom/bloom_causal_lm_preprocessor_test.py
similarity index 94%
rename from keras_nlp/src/models/bloom/bloom_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/bloom/bloom_causal_lm_preprocessor_test.py
index e726768fda..6b59132653 100644
--- a/keras_nlp/src/models/bloom/bloom_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/bloom/bloom_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.bloom.bloom_causal_lm_preprocessor import (
+from keras_hub.src.models.bloom.bloom_causal_lm_preprocessor import (
     BloomCausalLMPreprocessor,
 )
-from keras_nlp.src.models.bloom.bloom_tokenizer import BloomTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bloom.bloom_tokenizer import BloomTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BloomCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/bloom/bloom_causal_lm_test.py b/keras_hub/src/models/bloom/bloom_causal_lm_test.py
similarity index 95%
rename from keras_nlp/src/models/bloom/bloom_causal_lm_test.py
rename to keras_hub/src/models/bloom/bloom_causal_lm_test.py
index d40798550b..7f4119c264 100644
--- a/keras_nlp/src/models/bloom/bloom_causal_lm_test.py
+++ b/keras_hub/src/models/bloom/bloom_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.bloom.bloom_backbone import BloomBackbone
-from keras_nlp.src.models.bloom.bloom_causal_lm import BloomCausalLM
-from keras_nlp.src.models.bloom.bloom_causal_lm_preprocessor import (
+from keras_hub.src.models.bloom.bloom_backbone import BloomBackbone
+from keras_hub.src.models.bloom.bloom_causal_lm import BloomCausalLM
+from keras_hub.src.models.bloom.bloom_causal_lm_preprocessor import (
     BloomCausalLMPreprocessor,
 )
-from keras_nlp.src.models.bloom.bloom_tokenizer import BloomTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bloom.bloom_tokenizer import BloomTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BloomCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/bloom/bloom_decoder.py b/keras_hub/src/models/bloom/bloom_decoder.py
similarity index 95%
rename from keras_nlp/src/models/bloom/bloom_decoder.py
rename to keras_hub/src/models/bloom/bloom_decoder.py
index e4e2f3a298..86a456a8b8 100644
--- a/keras_nlp/src/models/bloom/bloom_decoder.py
+++ b/keras_hub/src/models/bloom/bloom_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,14 +16,14 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     merge_padding_and_attention_mask,
 )
-from keras_nlp.src.models.bloom.bloom_attention import BloomAttention
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.models.bloom.bloom_attention import BloomAttention
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class BloomDecoder(keras.layers.Layer):
diff --git a/keras_nlp/src/models/bloom/bloom_presets.py b/keras_hub/src/models/bloom/bloom_presets.py
similarity index 99%
rename from keras_nlp/src/models/bloom/bloom_presets.py
rename to keras_hub/src/models/bloom/bloom_presets.py
index fec7128c62..e34554ed0d 100644
--- a/keras_nlp/src/models/bloom/bloom_presets.py
+++ b/keras_hub/src/models/bloom/bloom_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/bloom/bloom_tokenizer.py b/keras_hub/src/models/bloom/bloom_tokenizer.py
similarity index 83%
rename from keras_nlp/src/models/bloom/bloom_tokenizer.py
rename to keras_hub/src/models/bloom/bloom_tokenizer.py
index fb9debb628..a5b470f16e 100644
--- a/keras_nlp/src/models/bloom/bloom_tokenizer.py
+++ b/keras_hub/src/models/bloom/bloom_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,22 +13,22 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.bloom.bloom_backbone import BloomBackbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.bloom.bloom_backbone import BloomBackbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.BloomTokenizer",
-        "keras_nlp.models.BloomTokenizer",
+        "keras_hub.tokenizers.BloomTokenizer",
+        "keras_hub.models.BloomTokenizer",
     ]
 )
 class BloomTokenizer(BytePairTokenizer):
     """A BLOOM tokenizer using Byte-Pair Encoding subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.BytePairTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by BLOOM
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a BLOOM preset.
@@ -51,7 +51,7 @@ class BloomTokenizer(BytePairTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.BloomTokenizer.from_preset("bloom_560m_multi")
+    tokenizer = keras_hub.models.BloomTokenizer.from_preset("bloom_560m_multi")
     tokenizer("The quick brown fox jumped.")
 
     # Batched input.
@@ -64,7 +64,7 @@ class BloomTokenizer(BytePairTokenizer):
     vocab = {"<s>": 0, "</s>": 1, "<pad>": 2, "a": 3, "Ġquick": 4, "Ġfox": 5}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.BloomTokenizer(vocabulary=vocab, merges=merges)
+    tokenizer = keras_hub.models.BloomTokenizer(vocabulary=vocab, merges=merges)
     tokenizer("a quick fox.")
     ```
     """
diff --git a/keras_nlp/src/models/bloom/bloom_tokenizer_test.py b/keras_hub/src/models/bloom/bloom_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/bloom/bloom_tokenizer_test.py
rename to keras_hub/src/models/bloom/bloom_tokenizer_test.py
index b52e2fc45b..32984c92f8 100644
--- a/keras_nlp/src/models/bloom/bloom_tokenizer_test.py
+++ b/keras_hub/src/models/bloom/bloom_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.bloom.bloom_tokenizer import BloomTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bloom.bloom_tokenizer import BloomTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BloomTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/causal_lm.py b/keras_hub/src/models/causal_lm.py
similarity index 95%
rename from keras_nlp/src/models/causal_lm.py
rename to keras_hub/src/models/causal_lm.py
index 7fa61d6ba7..56b93cd78e 100644
--- a/keras_nlp/src/models/causal_lm.py
+++ b/keras_hub/src/models/causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -19,9 +19,9 @@
 from keras import ops
 from keras import tree
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.task import Task
-from keras_nlp.src.samplers.serialization import get as get_sampler
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.task import Task
+from keras_hub.src.samplers.serialization import get as get_sampler
 
 try:
     import tensorflow as tf
@@ -29,19 +29,19 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.models.CausalLM")
+@keras_hub_export("keras_hub.models.CausalLM")
 class CausalLM(Task):
     """Base class for generative language modeling tasks.
 
-    `CausalLM` tasks wrap a `keras_nlp.models.Backbone` and
-    a `keras_nlp.models.Preprocessor` to create a model that can be used for
+    `CausalLM` tasks wrap a `keras_hub.models.Backbone` and
+    a `keras_hub.models.Preprocessor` to create a model that can be used for
     generation and generative fine-tuning.
 
     `CausalLM` tasks provide an additional, high-level `generate()` function
     which can be used to auto-regressively sample a model token by token with a
     string in, string out signature. The `compile()` method of all `CausalLM`
     classes contains an additional `sampler` argument, which can be used to pass
-    a `keras_nlp.samplers.Sampler` to control how the predicted distribution
+    a `keras_hub.samplers.Sampler` to control how the predicted distribution
     will be sampled.
 
     When calling `fit()`, the tokenized input will be predicted token-by-token
@@ -54,14 +54,14 @@ class CausalLM(Task):
     Example:
     ```python
     # Load a GPT2 backbone with pre-trained weights.
-    causal_lm = keras_nlp.models.CausalLM.from_preset(
+    causal_lm = keras_hub.models.CausalLM.from_preset(
         "gpt2_base_en",
     )
     causal_lm.compile(sampler="top_k")
     causal_lm.generate("Keras is a", max_length=64)
 
     # Load a Mistral instruction tuned checkpoint at bfloat16 precision.
-    causal_lm = keras_nlp.models.CausalLM.from_preset(
+    causal_lm = keras_hub.models.CausalLM.from_preset(
         "mistral_instruct_7b_en",
         dtype="bfloat16",
     )
@@ -113,9 +113,9 @@ def compile(
                 applied to track the accuracy of the model at guessing masked
                 token values. See `keras.Model.compile` and `keras.metrics` for
                 more info on possible `weighted_metrics` values.
-            sampler: A sampler name, or a `keras_nlp.samplers.Sampler` instance.
+            sampler: A sampler name, or a `keras_hub.samplers.Sampler` instance.
                 Configures the sampling method used during `generate()` calls.
-                See `keras_nlp.samplers` for a full list of built-in sampling
+                See `keras_hub.samplers` for a full list of built-in sampling
                 strategies.
             **kwargs: See `keras.Model.compile` for a full list of arguments
                 supported by the compile method.
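
Reviewer note: the hunks in this patch repeat one mechanical substitution across imports, export decorators, docstrings, and license headers. As a rough aid for checking that a file was converted consistently, the rewrite can be sketched as below. This is only an illustration under stated assumptions: the `RENAMES` table and the `rename_identifiers` helper are hypothetical names, the mapping is read off the substitutions visible in this diff, and nothing here is claimed to be the tooling the authors actually used.

```python
import re

# Assumed mapping, inferred from the substitutions visible in this diff.
RENAMES = {
    "keras_nlp": "keras_hub",
    "keras-nlp": "keras-hub",
    "KerasNLP": "KerasHub",
}

# One alternation pattern so each occurrence is rewritten in a single pass.
_PATTERN = re.compile("|".join(re.escape(old) for old in RENAMES))


def rename_identifiers(source: str) -> str:
    """Return `source` with every old package spelling replaced."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], source)


sample = (
    "# Copyright 2024 The KerasNLP Authors\n"
    "from keras_nlp.src.api_export import keras_nlp_export\n"
    '@keras_nlp_export("keras_nlp.models.CausalLM")\n'
)
print(rename_identifiers(sample))
```

Because `keras_nlp_export` contains the substring `keras_nlp`, the same rule also produces `keras_hub_export`, which matches the decorator rename in the hunks above. A plain substring rewrite like this would not catch unrelated docstring slips such as a wrong module path, so those still need a manual read.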
diff --git a/keras_nlp/src/models/causal_lm_preprocessor.py b/keras_hub/src/models/causal_lm_preprocessor.py
similarity index 92%
rename from keras_nlp/src/models/causal_lm_preprocessor.py
rename to keras_hub/src/models/causal_lm_preprocessor.py
index 6a0dad3bdf..0963669c1e 100644
--- a/keras_nlp/src/models/causal_lm_preprocessor.py
+++ b/keras_hub/src/models/causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,18 +13,18 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
-from keras_nlp.src.utils.tensor_utils import strip_to_ragged
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import strip_to_ragged
 
 
-@keras_nlp_export("keras_nlp.models.CausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.CausalLMPreprocessor")
 class CausalLMPreprocessor(Preprocessor):
     """Base class for causal language modeling preprocessing layers.
 
-    `CausalLMPreprocessor` tasks wrap a `keras_nlp.tokenizer.Tokenizer` to
+    `CausalLMPreprocessor` tasks wrap a `keras_hub.tokenizers.Tokenizer` to
     create a preprocessing layer for causal language modeling tasks. It is
-    intended to be paired with a `keras.models.CausalLM` task.
+    intended to be paired with a `keras_hub.models.CausalLM` task.
 
@@ -48,7 +48,7 @@ class CausalLMPreprocessor(Preprocessor):
 
     Examples.
     ```python
-    preprocessor = keras_nlp.models.CausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.CausalLMPreprocessor.from_preset(
         "bert_base_en_uncased",
         sequence_length=256, # Optional.
     )
diff --git a/keras_nlp/src/models/causal_lm_preprocessor_test.py b/keras_hub/src/models/causal_lm_preprocessor_test.py
similarity index 85%
rename from keras_nlp/src/models/causal_lm_preprocessor_test.py
rename to keras_hub/src/models/causal_lm_preprocessor_test.py
index 969674e1b5..ce193907d3 100644
--- a/keras_nlp/src/models/causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,13 +13,13 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.gpt2.gpt2_causal_lm_preprocessor import (
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.gpt2.gpt2_causal_lm_preprocessor import (
     GPT2CausalLMPreprocessor,
 )
-from keras_nlp.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestCausalLMPreprocessor(TestCase):
diff --git a/keras_hub/src/models/csp_darknet/__init__.py b/keras_hub/src/models/csp_darknet/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/csp_darknet/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/csp_darknet/csp_darknet_backbone.py b/keras_hub/src/models/csp_darknet/csp_darknet_backbone.py
similarity index 97%
rename from keras_nlp/src/models/csp_darknet/csp_darknet_backbone.py
rename to keras_hub/src/models/csp_darknet/csp_darknet_backbone.py
index 40efb6de04..ab33823405 100644
--- a/keras_nlp/src/models/csp_darknet/csp_darknet_backbone.py
+++ b/keras_hub/src/models/csp_darknet/csp_darknet_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 import keras
 from keras import layers
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
 
 
-@keras_nlp_export("keras_nlp.models.CSPDarkNetBackbone")
+@keras_hub_export("keras_hub.models.CSPDarkNetBackbone")
 class CSPDarkNetBackbone(FeaturePyramidBackbone):
     """This class represents Keras Backbone of CSPDarkNet model.
 
@@ -46,13 +46,13 @@ class CSPDarkNetBackbone(FeaturePyramidBackbone):
     input_data = np.ones(shape=(8, 224, 224, 3))
 
     # Pretrained backbone
-    model = keras_nlp.models.CSPDarkNetBackbone.from_preset(
+    model = keras_hub.models.CSPDarkNetBackbone.from_preset(
         "csp_darknet_tiny_imagenet"
     )
     model(input_data)
 
     # Randomly initialized backbone with a custom config
-    model = keras_nlp.models.CSPDarkNetBackbone(
+    model = keras_hub.models.CSPDarkNetBackbone(
         stackwise_num_filters=[128, 256, 512, 1024],
         stackwise_depth=[3, 9, 9, 3],
         include_rescaling=False,
diff --git a/keras_nlp/src/models/csp_darknet/csp_darknet_backbone_test.py b/keras_hub/src/models/csp_darknet/csp_darknet_backbone_test.py
similarity index 91%
rename from keras_nlp/src/models/csp_darknet/csp_darknet_backbone_test.py
rename to keras_hub/src/models/csp_darknet/csp_darknet_backbone_test.py
index ed6dc7b525..5eb960e3ba 100644
--- a/keras_nlp/src/models/csp_darknet/csp_darknet_backbone_test.py
+++ b/keras_hub/src/models/csp_darknet/csp_darknet_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.csp_darknet.csp_darknet_backbone import (
+from keras_hub.src.models.csp_darknet.csp_darknet_backbone import (
     CSPDarkNetBackbone,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class CSPDarkNetBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/csp_darknet/csp_darknet_image_classifier.py b/keras_hub/src/models/csp_darknet/csp_darknet_image_classifier.py
similarity index 85%
rename from keras_nlp/src/models/csp_darknet/csp_darknet_image_classifier.py
rename to keras_hub/src/models/csp_darknet/csp_darknet_image_classifier.py
index 09a7022122..28069e7d9f 100644
--- a/keras_nlp/src/models/csp_darknet/csp_darknet_image_classifier.py
+++ b/keras_hub/src/models/csp_darknet/csp_darknet_image_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,19 +13,19 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.csp_darknet.csp_darknet_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.csp_darknet.csp_darknet_backbone import (
     CSPDarkNetBackbone,
 )
-from keras_nlp.src.models.image_classifier import ImageClassifier
+from keras_hub.src.models.image_classifier import ImageClassifier
 
 
-@keras_nlp_export("keras_nlp.models.CSPDarkNetImageClassifier")
+@keras_hub_export("keras_hub.models.CSPDarkNetImageClassifier")
 class CSPDarkNetImageClassifier(ImageClassifier):
     """CSPDarkNet image classifier task model.
 
     Args:
-        backbone: A `keras_nlp.models.CSPDarkNetBackbone` instance.
+        backbone: A `keras_hub.models.CSPDarkNetBackbone` instance.
         num_classes: int. The number of classes to predict.
         activation: `None`, str or callable. The activation function to use on
             the `Dense` layer. Set `activation=None` to return the output
@@ -42,7 +42,7 @@ class CSPDarkNetImageClassifier(ImageClassifier):
     ```python
-    # Load preset and train
+    # Load preset and predict
     images = np.ones((2, 224, 224, 3), dtype="float32")
-    classifier = keras_nlp.models.CSPDarkNetImageClassifier.from_preset(
+    classifier = keras_hub.models.CSPDarkNetImageClassifier.from_preset(
         "csp_darknet_tiny_imagenet")
     classifier.predict(images)
     ```
@@ -52,14 +52,14 @@ class CSPDarkNetImageClassifier(ImageClassifier):
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    classifier = keras_nlp.models.CSPDarkNetImageClassifier.from_preset(
+    classifier = keras_hub.models.CSPDarkNetImageClassifier.from_preset(
         "csp_darknet_tiny_imagenet")
     classifier.fit(x=images, y=labels, batch_size=2)
     ```
 
     Call `fit()` with custom loss, optimizer and backbone.
     ```python
-    classifier = keras_nlp.models.CSPDarkNetImageClassifier.from_preset(
+    classifier = keras_hub.models.CSPDarkNetImageClassifier.from_preset(
         "csp_darknet_tiny_imagenet")
     classifier.compile(
         loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
@@ -73,14 +73,14 @@ class CSPDarkNetImageClassifier(ImageClassifier):
     ```python
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    backbone = keras_nlp.models.CSPDarkNetBackbone(
+    backbone = keras_hub.models.CSPDarkNetBackbone(
         stackwise_num_filters=[128, 256, 512, 1024],
         stackwise_depth=[3, 9, 9, 3],
         include_rescaling=False,
         block_type="basic_block",
         image_shape = (224, 224, 3),
     )
-    classifier = keras_nlp.models.CSPDarkNetImageClassifier(
+    classifier = keras_hub.models.CSPDarkNetImageClassifier(
         backbone=backbone,
         num_classes=4,
     )
diff --git a/keras_nlp/src/models/csp_darknet/csp_darknet_image_classifier_test.py b/keras_hub/src/models/csp_darknet/csp_darknet_image_classifier_test.py
similarity index 89%
rename from keras_nlp/src/models/csp_darknet/csp_darknet_image_classifier_test.py
rename to keras_hub/src/models/csp_darknet/csp_darknet_image_classifier_test.py
index 33261c25b6..f3735be2fe 100644
--- a/keras_nlp/src/models/csp_darknet/csp_darknet_image_classifier_test.py
+++ b/keras_hub/src/models/csp_darknet/csp_darknet_image_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,13 +14,13 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.csp_darknet.csp_darknet_backbone import (
+from keras_hub.src.models.csp_darknet.csp_darknet_backbone import (
     CSPDarkNetBackbone,
 )
-from keras_nlp.src.models.csp_darknet.csp_darknet_image_classifier import (
+from keras_hub.src.models.csp_darknet.csp_darknet_image_classifier import (
     CSPDarkNetImageClassifier,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class CSPDarkNetImageClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/deberta_v3/__init__.py b/keras_hub/src/models/deberta_v3/__init__.py
similarity index 73%
rename from keras_nlp/src/models/deberta_v3/__init__.py
rename to keras_hub/src/models/deberta_v3/__init__.py
index 332fd35a10..2d099d5d43 100644
--- a/keras_nlp/src/models/deberta_v3/__init__.py
+++ b/keras_hub/src/models/deberta_v3/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,10 +12,10 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.deberta_v3.deberta_v3_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, DebertaV3Backbone)
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_backbone.py b/keras_hub/src/models/deberta_v3/deberta_v3_backbone.py
similarity index 93%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_backbone.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_backbone.py
index eca4101641..57ac9717da 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_backbone.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,22 +15,22 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.deberta_v3.disentangled_attention_encoder import (
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.deberta_v3.disentangled_attention_encoder import (
     DisentangledAttentionEncoder,
 )
-from keras_nlp.src.models.deberta_v3.relative_embedding import RelativeEmbedding
+from keras_hub.src.models.deberta_v3.relative_embedding import RelativeEmbedding
 
 
 def deberta_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.DebertaV3Backbone")
+@keras_hub_export("keras_hub.models.DebertaV3Backbone")
 class DebertaV3Backbone(Backbone):
     """DeBERTa encoder network.
 
@@ -80,13 +80,13 @@ class DebertaV3Backbone(Backbone):
     }
 
     # Pretrained DeBERTa encoder.
-    model = keras_nlp.models.DebertaV3Backbone.from_preset(
+    model = keras_hub.models.DebertaV3Backbone.from_preset(
         "deberta_v3_base_en",
     )
     model(input_data)
 
     # Randomly initialized DeBERTa encoder with custom config
-    model = keras_nlp.models.DebertaV3Backbone(
+    model = keras_hub.models.DebertaV3Backbone(
         vocabulary_size=128100,
         num_layers=12,
         num_heads=6,
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_backbone_test.py b/keras_hub/src/models/deberta_v3/deberta_v3_backbone_test.py
similarity index 93%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_backbone_test.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_backbone_test.py
index 72bf6ef7cd..9f3a4d4f7e 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_backbone_test.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DebertaV3BackboneTest(TestCase):
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm.py b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm.py
similarity index 85%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_masked_lm.py
index a35928094e..e705f3a056 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,21 +15,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     deberta_kernel_initializer,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
+from keras_hub.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
     DebertaV3MaskedLMPreprocessor,
 )
-from keras_nlp.src.models.masked_lm import MaskedLM
+from keras_hub.src.models.masked_lm import MaskedLM
 
 
-@keras_nlp_export("keras_nlp.models.DebertaV3MaskedLM")
+@keras_hub_export("keras_hub.models.DebertaV3MaskedLM")
 class DebertaV3MaskedLM(MaskedLM):
     """An end-to-end DeBERTaV3 model for the masked language modeling task.
 
@@ -50,8 +50,8 @@ class DebertaV3MaskedLM(MaskedLM):
     [here](https://github.com/microsoft/DeBERTa).
 
     Args:
-        backbone: A `keras_nlp.models.DebertaV3Backbone` instance.
-        preprocessor: A `keras_nlp.models.DebertaV3MaskedLMPreprocessor` or
+        backbone: A `keras_hub.models.DebertaV3Backbone` instance.
+        preprocessor: A `keras_hub.models.DebertaV3MaskedLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -62,7 +62,7 @@ class DebertaV3MaskedLM(MaskedLM):
     features = ["The quick brown fox jumped.", "I forgot my homework."]
 
     # Pretrained language model.
-    masked_lm = keras_nlp.models.DebertaV3MaskedLM.from_preset(
+    masked_lm = keras_hub.models.DebertaV3MaskedLM.from_preset(
         "deberta_v3_base_en",
     )
     masked_lm.fit(x=features, batch_size=2)
@@ -90,7 +90,7 @@ class DebertaV3MaskedLM(MaskedLM):
     # Labels are the original masked values.
     labels = [[3, 5]] * 2
 
-    masked_lm = keras_nlp.models.DebertaV3MaskedLM.from_preset(
+    masked_lm = keras_hub.models.DebertaV3MaskedLM.from_preset(
         "deberta_v3_base_en",
         preprocessor=None,
     )
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor.py b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor.py
similarity index 86%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor.py
index 6823d1a8eb..7244c5ffcd 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,24 +14,24 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
     DebertaV3Tokenizer,
 )
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.DebertaV3MaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.DebertaV3MaskedLMPreprocessor")
 class DebertaV3MaskedLMPreprocessor(MaskedLMPreprocessor):
     """DeBERTa preprocessing for the masked language modeling task.
 
     This preprocessing layer will prepare inputs for a masked language modeling
     task. It is primarily intended for use with the
-    `keras_nlp.models.DebertaV3MaskedLM` task model. Preprocessing will occur in
+    `keras_hub.models.DebertaV3MaskedLM` task model. Preprocessing will occur in
     multiple steps.
 
     - Tokenize any number of input segments using the `tokenizer`.
@@ -42,10 +42,10 @@ class DebertaV3MaskedLMPreprocessor(MaskedLMPreprocessor):
     - Randomly select non-special tokens to mask, controlled by
       `mask_selection_rate`.
     - Construct a `(x, y, sample_weight)` tuple suitable for training with a
-      `keras_nlp.models.DebertaV3MaskedLM` task model.
+      `keras_hub.models.DebertaV3MaskedLM` task model.
 
     Args:
-        tokenizer: A `keras_nlp.models.DebertaV3Tokenizer` instance.
+        tokenizer: A `keras_hub.models.DebertaV3Tokenizer` instance.
         sequence_length: The length of the packed inputs.
         mask_selection_rate: The probability an input token will be dynamically
             masked.
@@ -75,7 +75,7 @@ class DebertaV3MaskedLMPreprocessor(MaskedLMPreprocessor):
     Examples:
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.DebertaV3MaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.DebertaV3MaskedLMPreprocessor.from_preset(
         "deberta_v3_base_en"
     )
 
@@ -94,7 +94,7 @@ class DebertaV3MaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.DebertaV3MaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.DebertaV3MaskedLMPreprocessor.from_preset(
         "deberta_v3_base_en"
     )
 
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor_test.py b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor_test.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor_test.py
index 85dea22740..390a7ce606 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 
 import pytest
 
-from keras_nlp.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
+from keras_hub.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
     DebertaV3MaskedLMPreprocessor,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
     DebertaV3Tokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DebertaV3MaskedLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_test.py b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_test.py
similarity index 88%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_test.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_test.py
index 6d2c73115e..1d68edfc6a 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_masked_lm_test.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_masked_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,19 +16,19 @@
 
 import pytest
 
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_masked_lm import (
+from keras_hub.src.models.deberta_v3.deberta_v3_masked_lm import (
     DebertaV3MaskedLM,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
+from keras_hub.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
     DebertaV3MaskedLMPreprocessor,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
     DebertaV3Tokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DebertaV3MaskedLMTest(TestCase):
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_presets.py b/keras_hub/src/models/deberta_v3/deberta_v3_presets.py
similarity index 98%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_presets.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_presets.py
index 3b7d4a7b2c..8e074fb7b1 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_presets.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier.py b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier.py
similarity index 87%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_text_classifier.py
index a888398690..aaad4122f7 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,30 +15,30 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     deberta_kernel_initializer,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
     DebertaV3TextClassifierPreprocessor,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.DebertaV3TextClassifier",
-        "keras_nlp.models.DebertaV3Classifier",
+        "keras_hub.models.DebertaV3TextClassifier",
+        "keras_hub.models.DebertaV3Classifier",
     ]
 )
 class DebertaV3TextClassifier(TextClassifier):
     """An end-to-end DeBERTa model for classification tasks.
 
     This model attaches a classification head to a
-    `keras_nlp.model.DebertaV3Backbone` model, mapping from the backbone
+    `keras_hub.models.DebertaV3Backbone` model, mapping from the backbone
     outputs to logit output suitable for a classification task. For usage of
     this model with pre-trained weights, see the `from_preset()` method.
 
@@ -56,9 +56,9 @@ class DebertaV3TextClassifier(TextClassifier):
     [here](https://github.com/microsoft/DeBERTa).
 
     Args:
-        backbone: A `keras_nlp.models.DebertaV3` instance.
+        backbone: A `keras_hub.models.DebertaV3Backbone` instance.
         num_classes: int. Number of classes to predict.
-        preprocessor: A `keras_nlp.models.DebertaV3TextClassifierPreprocessor` or `None`. If
+        preprocessor: A `keras_hub.models.DebertaV3TextClassifierPreprocessor` or `None`. If
             `None`, this model will not apply preprocessing, and inputs should
             be preprocessed before calling the model.
         activation: Optional `str` or callable. The
@@ -77,7 +77,7 @@ class DebertaV3TextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier.
-    classifier = keras_nlp.models.DebertaV3TextClassifier.from_preset(
+    classifier = keras_hub.models.DebertaV3TextClassifier.from_preset(
         "deberta_v3_base_en",
         num_classes=4,
     )
@@ -105,7 +105,7 @@ class DebertaV3TextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier without preprocessing.
-    classifier = keras_nlp.models.DebertaV3TextClassifier.from_preset(
+    classifier = keras_hub.models.DebertaV3TextClassifier.from_preset(
         "deberta_v3_base_en",
         num_classes=4,
         preprocessor=None,
@@ -134,14 +134,14 @@ class DebertaV3TextClassifier(TextClassifier):
         eos_piece="[SEP]",
         unk_piece="[UNK]",
     )
-    tokenizer = keras_nlp.models.DebertaV3Tokenizer(
+    tokenizer = keras_hub.models.DebertaV3Tokenizer(
         proto=bytes_io.getvalue(),
     )
-    preprocessor = keras_nlp.models.DebertaV3TextClassifierPreprocessor(
+    preprocessor = keras_hub.models.DebertaV3TextClassifierPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.DebertaV3Backbone(
+    backbone = keras_hub.models.DebertaV3Backbone(
         vocabulary_size=30552,
         num_layers=4,
         num_heads=4,
@@ -149,7 +149,7 @@ class DebertaV3TextClassifier(TextClassifier):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    classifier = keras_nlp.models.DebertaV3TextClassifier(
+    classifier = keras_hub.models.DebertaV3TextClassifier(
         backbone=backbone,
         preprocessor=preprocessor,
         num_classes=4,
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor.py b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor.py
similarity index 86%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor.py
index 9634b77238..e19c6bc346 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,23 +14,23 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
     DebertaV3Tokenizer,
 )
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.DebertaV3TextClassifierPreprocessor",
-        "keras_nlp.models.DebertaV3Preprocessor",
+        "keras_hub.models.DebertaV3TextClassifierPreprocessor",
+        "keras_hub.models.DebertaV3Preprocessor",
     ]
 )
 class DebertaV3TextClassifierPreprocessor(TextClassifierPreprocessor):
@@ -39,7 +39,7 @@ class DebertaV3TextClassifierPreprocessor(TextClassifierPreprocessor):
     This preprocessing layer will do three things:
 
      - Tokenize any number of input segments using the `tokenizer`.
-     - Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
+     - Pack the inputs together using a `keras_hub.layers.MultiSegmentPacker`
        with the appropriate `"[CLS]"`, `"[SEP]"` and `"[PAD]"` tokens.
      - Construct a dictionary with keys `"token_ids"` and `"padding_mask"`, that
        can be passed directly to a DeBERTa model.
@@ -62,7 +62,7 @@ class DebertaV3TextClassifierPreprocessor(TextClassifierPreprocessor):
     the layer, e.g. `ds.map(lambda seg1, seg2: preprocessor(x=(seg1, seg2)))`.
 
     Args:
-        tokenizer: A `keras_nlp.models.DebertaV3Tokenizer` instance.
+        tokenizer: A `keras_hub.models.DebertaV3Tokenizer` instance.
         sequence_length: The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -78,7 +78,7 @@ class DebertaV3TextClassifierPreprocessor(TextClassifierPreprocessor):
     Examples:
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "deberta_v3_base_en"
     )
 
@@ -111,10 +111,10 @@ class DebertaV3TextClassifierPreprocessor(TextClassifierPreprocessor):
         eos_piece="[SEP]",
         unk_piece="[UNK]",
     )
-    tokenizer = keras_nlp.models.DebertaV3Tokenizer(
+    tokenizer = keras_hub.models.DebertaV3Tokenizer(
         proto=bytes_io.getvalue(),
     )
-    preprocessor = keras_nlp.models.DebertaV3TextClassifierPreprocessor(
+    preprocessor = keras_hub.models.DebertaV3TextClassifierPreprocessor(
         tokenizer
     )
     preprocessor("The quick brown fox jumped.")
@@ -122,7 +122,7 @@ class DebertaV3TextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "deberta_v3_base_en"
     )
 
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor_test.py b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor_test.py
similarity index 91%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor_test.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor_test.py
index 29a55b19f5..58da798dc2 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 
 import pytest
 
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
     DebertaV3TextClassifierPreprocessor,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
     DebertaV3Tokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DebertaV3TextClassifierPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_test.py b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_test.py
similarity index 88%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_test.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_test.py
index 8b3bd06fb4..caffb40711 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_text_classifier_test.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_text_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,19 +16,19 @@
 
 import pytest
 
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier import (
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier import (
     DebertaV3TextClassifier,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
+from keras_hub.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
     DebertaV3TextClassifierPreprocessor,
 )
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
     DebertaV3Tokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DebertaV3TextClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_tokenizer.py b/keras_hub/src/models/deberta_v3/deberta_v3_tokenizer.py
similarity index 90%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_tokenizer.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_tokenizer.py
index 168df159ec..e6b1936967 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_tokenizer.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.deberta_v3.deberta_v3_backbone import (
     DebertaV3Backbone,
 )
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
@@ -27,17 +27,17 @@
     tf = None
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.DebertaV3Tokenizer",
-        "keras_nlp.models.DebertaV3Tokenizer",
+        "keras_hub.tokenizers.DebertaV3Tokenizer",
+        "keras_hub.models.DebertaV3Tokenizer",
     ]
 )
 class DebertaV3Tokenizer(SentencePieceTokenizer):
     """DeBERTa tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     DeBERTa models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a DeBERTa preset.
@@ -63,7 +63,7 @@ class DebertaV3Tokenizer(SentencePieceTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.DebertaV3Tokenizer.from_preset(
+    tokenizer = keras_hub.models.DebertaV3Tokenizer.from_preset(
         "deberta_v3_base_en",
     )
     tokenizer("The quick brown fox jumped.")
@@ -91,7 +91,7 @@ class DebertaV3Tokenizer(SentencePieceTokenizer):
         eos_piece="[SEP]",
         unk_piece="[UNK]",
     )
-    tokenizer = keras_nlp.models.DebertaV3Tokenizer(
+    tokenizer = keras_hub.models.DebertaV3Tokenizer(
         proto=bytes_io.getvalue(),
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/deberta_v3/deberta_v3_tokenizer_test.py b/keras_hub/src/models/deberta_v3/deberta_v3_tokenizer_test.py
similarity index 94%
rename from keras_nlp/src/models/deberta_v3/deberta_v3_tokenizer_test.py
rename to keras_hub/src/models/deberta_v3/deberta_v3_tokenizer_test.py
index 4ffa632523..f507e55ade 100644
--- a/keras_nlp/src/models/deberta_v3/deberta_v3_tokenizer_test.py
+++ b/keras_hub/src/models/deberta_v3/deberta_v3_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,10 +16,10 @@
 
 import pytest
 
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
+from keras_hub.src.models.deberta_v3.deberta_v3_tokenizer import (
     DebertaV3Tokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DebertaV3TokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/deberta_v3/disentangled_attention_encoder.py b/keras_hub/src/models/deberta_v3/disentangled_attention_encoder.py
similarity index 96%
rename from keras_nlp/src/models/deberta_v3/disentangled_attention_encoder.py
rename to keras_hub/src/models/deberta_v3/disentangled_attention_encoder.py
index 0d3d64d6e0..6855f8aa96 100644
--- a/keras_nlp/src/models/deberta_v3/disentangled_attention_encoder.py
+++ b/keras_hub/src/models/deberta_v3/disentangled_attention_encoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,12 +14,12 @@
 
 import keras
 
-from keras_nlp.src.models.deberta_v3.disentangled_self_attention import (
+from keras_hub.src.models.deberta_v3.disentangled_self_attention import (
     DisentangledSelfAttention,
 )
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (  # isort:skip
+from keras_hub.src.layers.modeling.transformer_layer_utils import (  # isort:skip
     merge_padding_and_attention_mask,
 )
 
@@ -34,7 +34,7 @@ class DisentangledAttentionEncoder(keras.layers.Layer):
     an encoder model which has disentangled self-attention.
 
     `DisentangledAttentionEncoder` is similar to
-    `keras_nlp.layers.TransformerEncoder`, except for the attention layer - it
+    `keras_hub.layers.TransformerEncoder`, except for the attention layer - it
     uses disentangled self-attention instead of multi-head attention.
 
     Args:
diff --git a/keras_nlp/src/models/deberta_v3/disentangled_self_attention.py b/keras_hub/src/models/deberta_v3/disentangled_self_attention.py
similarity index 99%
rename from keras_nlp/src/models/deberta_v3/disentangled_self_attention.py
rename to keras_hub/src/models/deberta_v3/disentangled_self_attention.py
index 570a0e64d3..f95d40b3fd 100644
--- a/keras_nlp/src/models/deberta_v3/disentangled_self_attention.py
+++ b/keras_hub/src/models/deberta_v3/disentangled_self_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,7 +17,7 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class DisentangledSelfAttention(keras.layers.Layer):
diff --git a/keras_nlp/src/models/deberta_v3/relative_embedding.py b/keras_hub/src/models/deberta_v3/relative_embedding.py
similarity index 98%
rename from keras_nlp/src/models/deberta_v3/relative_embedding.py
rename to keras_hub/src/models/deberta_v3/relative_embedding.py
index c437ccd770..7b19ba6b5a 100644
--- a/keras_nlp/src/models/deberta_v3/relative_embedding.py
+++ b/keras_hub/src/models/deberta_v3/relative_embedding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_hub/src/models/densenet/__init__.py b/keras_hub/src/models/densenet/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/densenet/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/densenet/densenet_backbone.py b/keras_hub/src/models/densenet/densenet_backbone.py
similarity index 95%
rename from keras_nlp/src/models/densenet/densenet_backbone.py
rename to keras_hub/src/models/densenet/densenet_backbone.py
index 13e3d8597f..d5de88f2ae 100644
--- a/keras_nlp/src/models/densenet/densenet_backbone.py
+++ b/keras_hub/src/models/densenet/densenet_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,13 +13,13 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
 
 BN_EPSILON = 1.001e-5
 
 
-@keras_nlp_export("keras_nlp.models.DenseNetBackbone")
+@keras_hub_export("keras_hub.models.DenseNetBackbone")
 class DenseNetBackbone(FeaturePyramidBackbone):
     """Instantiates the DenseNet architecture.
 
@@ -45,11 +45,11 @@ class DenseNetBackbone(FeaturePyramidBackbone):
     input_data = np.ones(shape=(8, 224, 224, 3))
 
     # Pretrained backbone
-    model = keras_nlp.models.DenseNetBackbone.from_preset("densenet121_imagenet")
+    model = keras_hub.models.DenseNetBackbone.from_preset("densenet121_imagenet")
     model(input_data)
 
     # Randomly initialized backbone with a custom config
-    model = keras_nlp.models.DenseNetBackbone(
+    model = keras_hub.models.DenseNetBackbone(
         stackwise_num_repeats=[6, 12, 24, 16],
         include_rescaling=False,
     )
diff --git a/keras_nlp/src/models/densenet/densenet_backbone_test.py b/keras_hub/src/models/densenet/densenet_backbone_test.py
similarity index 91%
rename from keras_nlp/src/models/densenet/densenet_backbone_test.py
rename to keras_hub/src/models/densenet/densenet_backbone_test.py
index 7720411319..f8120f4c55 100644
--- a/keras_nlp/src/models/densenet/densenet_backbone_test.py
+++ b/keras_hub/src/models/densenet/densenet_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.densenet.densenet_backbone import DenseNetBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.densenet.densenet_backbone import DenseNetBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DenseNetBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/densenet/densenet_image_classifier.py b/keras_hub/src/models/densenet/densenet_image_classifier.py
similarity index 85%
rename from keras_nlp/src/models/densenet/densenet_image_classifier.py
rename to keras_hub/src/models/densenet/densenet_image_classifier.py
index 130904be70..6bd7bbbaa1 100644
--- a/keras_nlp/src/models/densenet/densenet_image_classifier.py
+++ b/keras_hub/src/models/densenet/densenet_image_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.densenet.densenet_backbone import DenseNetBackbone
-from keras_nlp.src.models.image_classifier import ImageClassifier
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.densenet.densenet_backbone import DenseNetBackbone
+from keras_hub.src.models.image_classifier import ImageClassifier
 
 
-@keras_nlp_export("keras_nlp.models.DenseNetImageClassifier")
+@keras_hub_export("keras_hub.models.DenseNetImageClassifier")
 class DenseNetImageClassifier(ImageClassifier):
     """DenseNet image classifier task model.
 
@@ -28,7 +28,7 @@ class DenseNetImageClassifier(ImageClassifier):
     be used to load a pre-trained config and weights.
 
     Args:
-        backbone: A `keras_nlp.models.DenseNetBackbone` instance.
+        backbone: A `keras_hub.models.DenseNetBackbone` instance.
         num_classes: int. The number of classes to predict.
         activation: `None`, str or callable. The activation function to use on
             the `Dense` layer. Set `activation=None` to return the output
@@ -40,7 +40,7 @@ class DenseNetImageClassifier(ImageClassifier):
     ```python
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
-    classifier = keras_nlp.models.DenseNetImageClassifier.from_preset(
+    classifier = keras_hub.models.DenseNetImageClassifier.from_preset(
         "densenet121_imagenet")
     classifier.predict(images)
     ```
@@ -50,14 +50,14 @@ class DenseNetImageClassifier(ImageClassifier):
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    classifier = keras_nlp.models.DenseNetImageClassifier.from_preset(
+    classifier = keras_hub.models.DenseNetImageClassifier.from_preset(
         "densenet121_imagenet")
     classifier.fit(x=images, y=labels, batch_size=2)
     ```
 
     Call `fit()` with custom loss, optimizer and backbone.
     ```python
-    classifier = keras_nlp.models.DenseNetImageClassifier.from_preset(
+    classifier = keras_hub.models.DenseNetImageClassifier.from_preset(
         "densenet121_imagenet")
     classifier.compile(
         loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
@@ -71,14 +71,14 @@ class DenseNetImageClassifier(ImageClassifier):
     ```python
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    backbone = keras_nlp.models.DenseNetBackbone(
+    backbone = keras_hub.models.DenseNetBackbone(
         stackwise_num_filters=[128, 256, 512, 1024],
         stackwise_depth=[3, 9, 9, 3],
         include_rescaling=False,
         block_type="basic_block",
         image_shape = (224, 224, 3),
     )
-    classifier = keras_nlp.models.DenseNetImageClassifier(
+    classifier = keras_hub.models.DenseNetImageClassifier(
         backbone=backbone,
         num_classes=4,
     )
diff --git a/keras_nlp/src/models/densenet/densenet_image_classifier_test.py b/keras_hub/src/models/densenet/densenet_image_classifier_test.py
similarity index 89%
rename from keras_nlp/src/models/densenet/densenet_image_classifier_test.py
rename to keras_hub/src/models/densenet/densenet_image_classifier_test.py
index 439a60008d..b4bb19d35a 100644
--- a/keras_nlp/src/models/densenet/densenet_image_classifier_test.py
+++ b/keras_hub/src/models/densenet/densenet_image_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.densenet.densenet_backbone import DenseNetBackbone
-from keras_nlp.src.models.densenet.densenet_image_classifier import (
+from keras_hub.src.models.densenet.densenet_backbone import DenseNetBackbone
+from keras_hub.src.models.densenet.densenet_image_classifier import (
     DenseNetImageClassifier,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DenseNetImageClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/distil_bert/__init__.py b/keras_hub/src/models/distil_bert/__init__.py
similarity index 74%
rename from keras_nlp/src/models/distil_bert/__init__.py
rename to keras_hub/src/models/distil_bert/__init__.py
index e65dccc0f7..da9a8ed9c3 100644
--- a/keras_nlp/src/models/distil_bert/__init__.py
+++ b/keras_hub/src/models/distil_bert/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,12 +12,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_presets import (
+from keras_hub.src.models.distil_bert.distil_bert_presets import (
     backbone_presets,
 )
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, DistilBertBackbone)
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_backbone.py b/keras_hub/src/models/distil_bert/distil_bert_backbone.py
similarity index 93%
rename from keras_nlp/src/models/distil_bert/distil_bert_backbone.py
rename to keras_hub/src/models/distil_bert/distil_bert_backbone.py
index 8fb5603a33..6e31d38cad 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_backbone.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,19 +15,19 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.token_and_position_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.token_and_position_embedding import (
     TokenAndPositionEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.models.backbone import Backbone
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.models.backbone import Backbone
 
 
 def distilbert_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.DistilBertBackbone")
+@keras_hub_export("keras_hub.models.DistilBertBackbone")
 class DistilBertBackbone(Backbone):
     """A DistilBERT encoder network.
 
@@ -73,13 +73,13 @@ class DistilBertBackbone(Backbone):
     }
 
     # Pretrained DistilBERT encoder.
-    model = keras_nlp.models.DistilBertBackbone.from_preset(
+    model = keras_hub.models.DistilBertBackbone.from_preset(
         "distil_bert_base_en_uncased"
     )
     model(input_data)
 
     # Randomly initialized DistilBERT encoder with custom config.
-    model = keras_nlp.models.DistilBertBackbone(
+    model = keras_hub.models.DistilBertBackbone(
         vocabulary_size=30552,
         num_layers=4,
         num_heads=4,
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_backbone_test.py b/keras_hub/src/models/distil_bert/distil_bert_backbone_test.py
similarity index 94%
rename from keras_nlp/src/models/distil_bert/distil_bert_backbone_test.py
rename to keras_hub/src/models/distil_bert/distil_bert_backbone_test.py
index 0d10b24b42..240564c7bb 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_backbone_test.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DistilBertBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm.py b/keras_hub/src/models/distil_bert/distil_bert_masked_lm.py
similarity index 85%
rename from keras_nlp/src/models/distil_bert/distil_bert_masked_lm.py
rename to keras_hub/src/models/distil_bert/distil_bert_masked_lm.py
index 293c6656db..0a848ece06 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,21 +15,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     distilbert_kernel_initializer,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
+from keras_hub.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
     DistilBertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.masked_lm import MaskedLM
+from keras_hub.src.models.masked_lm import MaskedLM
 
 
-@keras_nlp_export("keras_nlp.models.DistilBertMaskedLM")
+@keras_hub_export("keras_hub.models.DistilBertMaskedLM")
 class DistilBertMaskedLM(MaskedLM):
     """An end-to-end DistilBERT model for the masked language modeling task.
 
@@ -50,8 +50,8 @@ class DistilBertMaskedLM(MaskedLM):
     [here](https://github.com/huggingface/transformers).
 
     Args:
-        backbone: A `keras_nlp.models.DistilBertBackbone` instance.
-        preprocessor: A `keras_nlp.models.DistilBertMaskedLMPreprocessor` or
+        backbone: A `keras_hub.models.DistilBertBackbone` instance.
+        preprocessor: A `keras_hub.models.DistilBertMaskedLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -62,7 +62,7 @@ class DistilBertMaskedLM(MaskedLM):
     features = ["The quick brown fox jumped.", "I forgot my homework."]
 
     # Pretrained language model.
-    masked_lm = keras_nlp.models.DistilBertMaskedLM.from_preset(
+    masked_lm = keras_hub.models.DistilBertMaskedLM.from_preset(
         "distil_bert_base_en_uncased",
     )
     masked_lm.fit(x=features, batch_size=2)
@@ -90,7 +90,7 @@ class DistilBertMaskedLM(MaskedLM):
     # Labels are the original masked values.
     labels = [[3, 5]] * 2
 
-    masked_lm = keras_nlp.models.DistilBertMaskedLM.from_preset(
+    masked_lm = keras_hub.models.DistilBertMaskedLM.from_preset(
         "distil_bert_base_en_uncased",
         preprocessor=None,
     )
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm_preprocessor.py b/keras_hub/src/models/distil_bert/distil_bert_masked_lm_preprocessor.py
similarity index 86%
rename from keras_nlp/src/models/distil_bert/distil_bert_masked_lm_preprocessor.py
rename to keras_hub/src/models/distil_bert/distil_bert_masked_lm_preprocessor.py
index 1a69de4e50..52be2615b8 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm_preprocessor.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,36 +14,36 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
     DistilBertTokenizer,
 )
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.DistilBertMaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.DistilBertMaskedLMPreprocessor")
 class DistilBertMaskedLMPreprocessor(MaskedLMPreprocessor):
     """DistilBERT preprocessing for the masked language modeling task.
 
     This preprocessing layer will prepare inputs for a masked language modeling
     task. It is primarily intended for use with the
-    `keras_nlp.models.DistilBertMaskedLM` task model. Preprocessing will occur in
+    `keras_hub.models.DistilBertMaskedLM` task model. Preprocessing will occur in
     multiple steps.
 
     1. Tokenize any number of input segments using the `tokenizer`.
-    2. Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
+    2. Pack the inputs together using a `keras_hub.layers.MultiSegmentPacker`
        with the appropriate `"[CLS]"`, `"[SEP]"` and `"[PAD]"` tokens.
     3. Randomly select non-special tokens to mask, controlled by
       `mask_selection_rate`.
     4. Construct a `(x, y, sample_weight)` tuple suitable for training with a
-      `keras_nlp.models.DistilBertMaskedLM` task model.
+      `keras_hub.models.DistilBertMaskedLM` task model.
 
     Args:
-        tokenizer: A `keras_nlp.models.DistilBertTokenizer` instance.
+        tokenizer: A `keras_hub.models.DistilBertTokenizer` instance.
         sequence_length: int. The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -79,7 +79,7 @@ class DistilBertMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.DistilBertMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.DistilBertMaskedLMPreprocessor.from_preset(
         "distil_bert_base_en_uncased"
     )
 
@@ -98,7 +98,7 @@ class DistilBertMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.DistilBertMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.DistilBertMaskedLMPreprocessor.from_preset(
         "distil_bert_base_en_uncased"
     )
 
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm_preprocessor_test.py b/keras_hub/src/models/distil_bert/distil_bert_masked_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/distil_bert/distil_bert_masked_lm_preprocessor_test.py
rename to keras_hub/src/models/distil_bert/distil_bert_masked_lm_preprocessor_test.py
index a9fb2c68ca..d51900b534 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,13 +14,13 @@
 
 import pytest
 
-from keras_nlp.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
+from keras_hub.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
     DistilBertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
     DistilBertTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DistilBertMaskedLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm_test.py b/keras_hub/src/models/distil_bert/distil_bert_masked_lm_test.py
similarity index 88%
rename from keras_nlp/src/models/distil_bert/distil_bert_masked_lm_test.py
rename to keras_hub/src/models/distil_bert/distil_bert_masked_lm_test.py
index 7626db55ab..be1dca5ffd 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_masked_lm_test.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_masked_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,19 +14,19 @@
 
 import pytest
 
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_masked_lm import (
+from keras_hub.src.models.distil_bert.distil_bert_masked_lm import (
     DistilBertMaskedLM,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
+from keras_hub.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
     DistilBertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
     DistilBertTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DistilBertMaskedLMTest(TestCase):
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_presets.py b/keras_hub/src/models/distil_bert/distil_bert_presets.py
similarity index 98%
rename from keras_nlp/src/models/distil_bert/distil_bert_presets.py
rename to keras_hub/src/models/distil_bert/distil_bert_presets.py
index e0ab213699..97a94007d9 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_presets.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier.py b/keras_hub/src/models/distil_bert/distil_bert_text_classifier.py
similarity index 85%
rename from keras_nlp/src/models/distil_bert/distil_bert_text_classifier.py
rename to keras_hub/src/models/distil_bert/distil_bert_text_classifier.py
index f5a612a7d6..723de49945 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,30 +15,30 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     distilbert_kernel_initializer,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
     DistilBertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.DistilBertTextClassifier",
-        "keras_nlp.models.DistilBertClassifier",
+        "keras_hub.models.DistilBertTextClassifier",
+        "keras_hub.models.DistilBertClassifier",
     ]
 )
 class DistilBertTextClassifier(TextClassifier):
     """An end-to-end DistilBERT model for classification tasks.
 
     This model attaches a classification head to a
-    `keras_nlp.model.DistilBertBackbone` instance, mapping from the backbone
+    `keras_hub.models.DistilBertBackbone` instance, mapping from the backbone
     outputs to logits suitable for a classification task. For usage of
     this model with pre-trained weights, see the `from_preset()` constructor.
 
@@ -53,9 +53,9 @@ class DistilBertTextClassifier(TextClassifier):
     [here](https://github.com/huggingface/transformers).
 
     Args:
-        backbone: A `keras_nlp.models.DistilBert` instance.
+        backbone: A `keras_hub.models.DistilBertBackbone` instance.
         num_classes: int. Number of classes to predict.
-        preprocessor: A `keras_nlp.models.DistilBertTextClassifierPreprocessor` or `None`. If
+        preprocessor: A `keras_hub.models.DistilBertTextClassifierPreprocessor` or `None`. If
             `None`, this model will not apply preprocessing, and inputs should
             be preprocessed before calling the model.
         activation: Optional `str` or callable. The
@@ -74,12 +74,12 @@ class DistilBertTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Use a shorter sequence length.
-    preprocessor = keras_nlp.models.DistilBertTextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.DistilBertTextClassifierPreprocessor.from_preset(
         "distil_bert_base_en_uncased",
         sequence_length=128,
     )
     # Pretrained classifier.
-    classifier = keras_nlp.models.DistilBertTextClassifier.from_preset(
+    classifier = keras_hub.models.DistilBertTextClassifier.from_preset(
         "distil_bert_base_en_uncased",
         num_classes=4,
         preprocessor=preprocessor,
@@ -107,7 +107,7 @@ class DistilBertTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier without preprocessing.
-    classifier = keras_nlp.models.DistilBertTextClassifier.from_preset(
+    classifier = keras_hub.models.DistilBertTextClassifier.from_preset(
         "distil_bert_base_en_uncased",
         num_classes=4,
         preprocessor=None,
@@ -121,14 +121,14 @@ class DistilBertTextClassifier(TextClassifier):
     labels = [0, 3]
     vocab = ["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"]
     vocab += ["The", "quick", "brown", "fox", "jumped", "."]
-    tokenizer = keras_nlp.models.DistilBertTokenizer(
+    tokenizer = keras_hub.models.DistilBertTokenizer(
         vocabulary=vocab,
     )
-    preprocessor = keras_nlp.models.DistilBertTextClassifierPreprocessor(
+    preprocessor = keras_hub.models.DistilBertTextClassifierPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.DistilBertBackbone(
+    backbone = keras_hub.models.DistilBertBackbone(
         vocabulary_size=30552,
         num_layers=4,
         num_heads=4,
@@ -136,7 +136,7 @@ class DistilBertTextClassifier(TextClassifier):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    classifier = keras_nlp.models.DistilBertTextClassifier(
+    classifier = keras_hub.models.DistilBertTextClassifier(
         backbone=backbone,
         preprocessor=preprocessor,
         num_classes=4,
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier_preprocessor.py b/keras_hub/src/models/distil_bert/distil_bert_text_classifier_preprocessor.py
similarity index 84%
rename from keras_nlp/src/models/distil_bert/distil_bert_text_classifier_preprocessor.py
rename to keras_hub/src/models/distil_bert/distil_bert_text_classifier_preprocessor.py
index f4bda84a2c..434da4624e 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier_preprocessor.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,23 +15,23 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
     DistilBertTokenizer,
 )
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.DistilBertTextClassifierPreprocessor",
-        "keras_nlp.models.DistilBertPreprocessor",
+        "keras_hub.models.DistilBertTextClassifierPreprocessor",
+        "keras_hub.models.DistilBertPreprocessor",
     ]
 )
 class DistilBertTextClassifierPreprocessor(TextClassifierPreprocessor):
@@ -40,7 +40,7 @@ class DistilBertTextClassifierPreprocessor(TextClassifierPreprocessor):
     This preprocessing layer will do three things:
 
      1. Tokenize any number of input segments using the `tokenizer`.
-     2. Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
+     2. Pack the inputs together using a `keras_hub.layers.MultiSegmentPacker`
        with the appropriate `"[CLS]"`, `"[SEP]"` and `"[PAD]"` tokens.
      3. Construct a dictionary of with keys `"token_ids"` and `"padding_mask"`,
        that can be passed directly to a DistilBERT model.
@@ -50,7 +50,7 @@ class DistilBertTextClassifierPreprocessor(TextClassifierPreprocessor):
     `keras.Model.fit`.
 
     Args:
-        tokenizer: A `keras_nlp.models.DistilBertTokenizer` instance.
+        tokenizer: A `keras_hub.models.DistilBertTokenizer` instance.
         sequence_length: The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -75,7 +75,7 @@ class DistilBertTextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "distil_bert_base_en_uncased"
     )
     preprocessor(["The quick brown fox jumped.", "Call me Ishmael."])
@@ -83,8 +83,8 @@ class DistilBertTextClassifierPreprocessor(TextClassifierPreprocessor):
     # Custom vocabulary.
     vocab = ["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"]
     vocab += ["The", "quick", "brown", "fox", "jumped", "."]
-    tokenizer = keras_nlp.models.DistilBertTokenizer(vocabulary=vocab)
-    preprocessor = keras_nlp.models.DistilBertTextClassifierPreprocessor(
+    tokenizer = keras_hub.models.DistilBertTokenizer(vocabulary=vocab)
+    preprocessor = keras_hub.models.DistilBertTextClassifierPreprocessor(
         tokenizer
     )
     preprocessor("The quick brown fox jumped.")
@@ -92,7 +92,7 @@ class DistilBertTextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "distil_bert_base_en_uncased"
     )
 
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier_preprocessor_test.py b/keras_hub/src/models/distil_bert/distil_bert_text_classifier_preprocessor_test.py
similarity index 91%
rename from keras_nlp/src/models/distil_bert/distil_bert_text_classifier_preprocessor_test.py
rename to keras_hub/src/models/distil_bert/distil_bert_text_classifier_preprocessor_test.py
index 3a0e183e11..57f37ff098 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,13 +14,13 @@
 
 import pytest
 
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
     DistilBertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
     DistilBertTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DistilBertTextClassifierPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier_test.py b/keras_hub/src/models/distil_bert/distil_bert_text_classifier_test.py
similarity index 88%
rename from keras_nlp/src/models/distil_bert/distil_bert_text_classifier_test.py
rename to keras_hub/src/models/distil_bert/distil_bert_text_classifier_test.py
index ab4dd489af..d0f6f75295 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_text_classifier_test.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_text_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,19 +14,19 @@
 
 import pytest
 
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier import (
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier import (
     DistilBertTextClassifier,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
     DistilBertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
     DistilBertTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DistilBertTextClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_tokenizer.py b/keras_hub/src/models/distil_bert/distil_bert_tokenizer.py
similarity index 85%
rename from keras_nlp/src/models/distil_bert/distil_bert_tokenizer.py
rename to keras_hub/src/models/distil_bert/distil_bert_tokenizer.py
index f99a41069b..72f4ad50a5 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_tokenizer.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,24 +13,24 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
+from keras_hub.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.DistilBertTokenizer",
-        "keras_nlp.models.DistilBertTokenizer",
+        "keras_hub.tokenizers.DistilBertTokenizer",
+        "keras_hub.models.DistilBertTokenizer",
     ]
 )
 class DistilBertTokenizer(WordPieceTokenizer):
     """A DistilBERT tokenizer using WordPiece subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.WordPieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.WordPieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by DistilBERT
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a DistilBERT preset.
@@ -56,7 +56,7 @@ class DistilBertTokenizer(WordPieceTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.DistilBertTokenizer.from_preset(
+    tokenizer = keras_hub.models.DistilBertTokenizer.from_preset(
         "distil_bert_base_en_uncased",
     )
     tokenizer("The quick brown fox jumped.")
@@ -70,7 +70,7 @@ class DistilBertTokenizer(WordPieceTokenizer):
     # Custom vocabulary.
     vocab = ["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"]
     vocab += ["The", "quick", "brown", "fox", "jumped", "."]
-    tokenizer = keras_nlp.models.DistilBertTokenizer(vocabulary=vocab)
+    tokenizer = keras_hub.models.DistilBertTokenizer(vocabulary=vocab)
     tokenizer("The quick brown fox jumped.")
     ```
     """
diff --git a/keras_nlp/src/models/distil_bert/distil_bert_tokenizer_test.py b/keras_hub/src/models/distil_bert/distil_bert_tokenizer_test.py
similarity index 94%
rename from keras_nlp/src/models/distil_bert/distil_bert_tokenizer_test.py
rename to keras_hub/src/models/distil_bert/distil_bert_tokenizer_test.py
index 377b608ca6..5b1ac95940 100644
--- a/keras_nlp/src/models/distil_bert/distil_bert_tokenizer_test.py
+++ b/keras_hub/src/models/distil_bert/distil_bert_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,10 +14,10 @@
 
 import pytest
 
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
+from keras_hub.src.models.distil_bert.distil_bert_tokenizer import (
     DistilBertTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class DistilBertTokenizerTest(TestCase):
diff --git a/keras_hub/src/models/efficientnet/__init__.py b/keras_hub/src/models/efficientnet/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/efficientnet/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/efficientnet/efficientnet_backbone.py b/keras_hub/src/models/efficientnet/efficientnet_backbone.py
similarity index 98%
rename from keras_nlp/src/models/efficientnet/efficientnet_backbone.py
rename to keras_hub/src/models/efficientnet/efficientnet_backbone.py
index 2d7940d6df..2cb7a82f8b 100644
--- a/keras_nlp/src/models/efficientnet/efficientnet_backbone.py
+++ b/keras_hub/src/models/efficientnet/efficientnet_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,13 +15,13 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.efficientnet.fusedmbconv import FusedMBConvBlock
-from keras_nlp.src.models.efficientnet.mbconv import MBConvBlock
-from keras_nlp.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.efficientnet.fusedmbconv import FusedMBConvBlock
+from keras_hub.src.models.efficientnet.mbconv import MBConvBlock
+from keras_hub.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
 
 
-@keras_nlp_export("keras_nlp.models.EfficientNetBackbone")
+@keras_hub_export("keras_hub.models.EfficientNetBackbone")
 class EfficientNetBackbone(FeaturePyramidBackbone):
     """An EfficientNet backbone model.
 
diff --git a/keras_nlp/src/models/efficientnet/efficientnet_backbone_test.py b/keras_hub/src/models/efficientnet/efficientnet_backbone_test.py
similarity index 97%
rename from keras_nlp/src/models/efficientnet/efficientnet_backbone_test.py
rename to keras_hub/src/models/efficientnet/efficientnet_backbone_test.py
index 8705ed7af1..aab9f6dc69 100644
--- a/keras_nlp/src/models/efficientnet/efficientnet_backbone_test.py
+++ b/keras_hub/src/models/efficientnet/efficientnet_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,10 +16,10 @@
 import pytest
 from absl.testing import parameterized
 
-from keras_nlp.src.models.efficientnet.efficientnet_backbone import (
+from keras_hub.src.models.efficientnet.efficientnet_backbone import (
     EfficientNetBackbone,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class EfficientNetBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/efficientnet/fusedmbconv.py b/keras_hub/src/models/efficientnet/fusedmbconv.py
similarity index 99%
rename from keras_nlp/src/models/efficientnet/fusedmbconv.py
rename to keras_hub/src/models/efficientnet/fusedmbconv.py
index 5c3817c30e..53d4227f50 100644
--- a/keras_nlp/src/models/efficientnet/fusedmbconv.py
+++ b/keras_hub/src/models/efficientnet/fusedmbconv.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/efficientnet/fusedmbconv_test.py b/keras_hub/src/models/efficientnet/fusedmbconv_test.py
similarity index 91%
rename from keras_nlp/src/models/efficientnet/fusedmbconv_test.py
rename to keras_hub/src/models/efficientnet/fusedmbconv_test.py
index e59e251156..1d47082f98 100644
--- a/keras_nlp/src/models/efficientnet/fusedmbconv_test.py
+++ b/keras_hub/src/models/efficientnet/fusedmbconv_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import keras
 
-from keras_nlp.src.models.efficientnet.fusedmbconv import FusedMBConvBlock
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.efficientnet.fusedmbconv import FusedMBConvBlock
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FusedMBConvBlockTest(TestCase):
diff --git a/keras_nlp/src/models/efficientnet/mbconv.py b/keras_hub/src/models/efficientnet/mbconv.py
similarity index 99%
rename from keras_nlp/src/models/efficientnet/mbconv.py
rename to keras_hub/src/models/efficientnet/mbconv.py
index 4889606f8f..ff38162fef 100644
--- a/keras_nlp/src/models/efficientnet/mbconv.py
+++ b/keras_hub/src/models/efficientnet/mbconv.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/efficientnet/mbconv_test.py b/keras_hub/src/models/efficientnet/mbconv_test.py
similarity index 90%
rename from keras_nlp/src/models/efficientnet/mbconv_test.py
rename to keras_hub/src/models/efficientnet/mbconv_test.py
index d4ba2b1f73..9e84defd70 100644
--- a/keras_nlp/src/models/efficientnet/mbconv_test.py
+++ b/keras_hub/src/models/efficientnet/mbconv_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import keras
 
-from keras_nlp.src.models.efficientnet.mbconv import MBConvBlock
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.efficientnet.mbconv import MBConvBlock
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MBConvTest(TestCase):
diff --git a/keras_nlp/src/models/electra/__init__.py b/keras_hub/src/models/electra/__init__.py
similarity index 73%
rename from keras_nlp/src/models/electra/__init__.py
rename to keras_hub/src/models/electra/__init__.py
index fe2f145755..8b1baa41af 100644
--- a/keras_nlp/src/models/electra/__init__.py
+++ b/keras_hub/src/models/electra/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.electra.electra_backbone import ElectraBackbone
-from keras_nlp.src.models.electra.electra_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.electra.electra_backbone import ElectraBackbone
+from keras_hub.src.models.electra.electra_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, ElectraBackbone)
diff --git a/keras_nlp/src/models/electra/electra_backbone.py b/keras_hub/src/models/electra/electra_backbone.py
similarity index 94%
rename from keras_nlp/src/models/electra/electra_backbone.py
rename to keras_hub/src/models/electra/electra_backbone.py
index baf15e66bf..d4c7571661 100644
--- a/keras_nlp/src/models/electra/electra_backbone.py
+++ b/keras_hub/src/models/electra/electra_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,21 +14,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
 def electra_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.ElectraBackbone")
+@keras_hub_export("keras_hub.models.ElectraBackbone")
 class ElectraBackbone(Backbone):
-    """A Electra encoder network.
+    """An Electra encoder network.
 
@@ -76,13 +76,13 @@ class ElectraBackbone(Backbone):
         }
 
         # Pre-trained ELECTRA encoder.
-        model = keras_nlp.models.ElectraBackbone.from_preset(
+        model = keras_hub.models.ElectraBackbone.from_preset(
             "electra_base_discriminator_en"
         )
         model(input_data)
 
         # Randomly initialized Electra encoder
-        backbone = keras_nlp.models.ElectraBackbone(
+        backbone = keras_hub.models.ElectraBackbone(
             vocabulary_size=1000,
             num_layers=2,
             num_heads=2,
diff --git a/keras_nlp/src/models/electra/electra_backbone_test.py b/keras_hub/src/models/electra/electra_backbone_test.py
similarity index 95%
rename from keras_nlp/src/models/electra/electra_backbone_test.py
rename to keras_hub/src/models/electra/electra_backbone_test.py
index 259885c8dd..4d6e77c7f0 100644
--- a/keras_nlp/src/models/electra/electra_backbone_test.py
+++ b/keras_hub/src/models/electra/electra_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.electra.electra_backbone import ElectraBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.electra.electra_backbone import ElectraBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ElectraBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/electra/electra_presets.py b/keras_hub/src/models/electra/electra_presets.py
similarity index 99%
rename from keras_nlp/src/models/electra/electra_presets.py
rename to keras_hub/src/models/electra/electra_presets.py
index 375acb002a..51e011e25b 100644
--- a/keras_nlp/src/models/electra/electra_presets.py
+++ b/keras_hub/src/models/electra/electra_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/electra/electra_tokenizer.py b/keras_hub/src/models/electra/electra_tokenizer.py
similarity index 86%
rename from keras_nlp/src/models/electra/electra_tokenizer.py
rename to keras_hub/src/models/electra/electra_tokenizer.py
index aa73e6e0b9..1d64463a65 100644
--- a/keras_nlp/src/models/electra/electra_tokenizer.py
+++ b/keras_hub/src/models/electra/electra_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,22 +12,22 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.electra.electra_backbone import ElectraBackbone
-from keras_nlp.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.electra.electra_backbone import ElectraBackbone
+from keras_hub.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.ElectraTokenizer",
-        "keras_nlp.models.ElectraTokenizer",
+        "keras_hub.tokenizers.ElectraTokenizer",
+        "keras_hub.models.ElectraTokenizer",
     ]
 )
 class ElectraTokenizer(WordPieceTokenizer):
-    """A ELECTRA tokenizer using WordPiece subword segmentation.
+    """An ELECTRA tokenizer using WordPiece subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.WordPieceTokenizer`.
+    is based on `keras_hub.tokenizers.WordPieceTokenizer`.
 
     If input is a batch of strings (rank > 0), the layer will output a
     `tf.RaggedTensor` where the last dimension of the output is ragged.
@@ -53,7 +53,7 @@ class ElectraTokenizer(WordPieceTokenizer):
     vocab += ["The", "quick", "brown", "fox", "jumped", "."]
 
     # Instantiate the tokenizer.
-    tokenizer = keras_nlp.models.ElectraTokenizer(vocabulary=vocab)
+    tokenizer = keras_hub.models.ElectraTokenizer(vocabulary=vocab)
 
     # Unbatched input.
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/electra/electra_tokenizer_test.py b/keras_hub/src/models/electra/electra_tokenizer_test.py
similarity index 94%
rename from keras_nlp/src/models/electra/electra_tokenizer_test.py
rename to keras_hub/src/models/electra/electra_tokenizer_test.py
index 839382dcae..1c2c2a4ea2 100644
--- a/keras_nlp/src/models/electra/electra_tokenizer_test.py
+++ b/keras_hub/src/models/electra/electra_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.electra.electra_tokenizer import ElectraTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.electra.electra_tokenizer import ElectraTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ElectraTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/f_net/__init__.py b/keras_hub/src/models/f_net/__init__.py
similarity index 72%
rename from keras_nlp/src/models/f_net/__init__.py
rename to keras_hub/src/models/f_net/__init__.py
index 4fc6ac0ee2..be13d2bbd2 100644
--- a/keras_nlp/src/models/f_net/__init__.py
+++ b/keras_hub/src/models/f_net/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, FNetBackbone)
diff --git a/keras_nlp/src/models/f_net/f_net_backbone.py b/keras_hub/src/models/f_net/f_net_backbone.py
similarity index 93%
rename from keras_nlp/src/models/f_net/f_net_backbone.py
rename to keras_hub/src/models/f_net/f_net_backbone.py
index 97e8f6304a..193d87fe1b 100644
--- a/keras_nlp/src/models/f_net/f_net_backbone.py
+++ b/keras_hub/src/models/f_net/f_net_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,14 +15,14 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.f_net_encoder import FNetEncoder
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.f_net_encoder import FNetEncoder
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
 def f_net_kernel_initializer(stddev=0.02):
@@ -33,13 +33,13 @@ def f_net_bias_initializer(stddev=0.02):
     return keras.initializers.RandomNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.FNetBackbone")
+@keras_hub_export("keras_hub.models.FNetBackbone")
 class FNetBackbone(Backbone):
-    """A FNet encoder network.
+    """An FNet encoder network.
 
     This class implements a bi-directional Fourier Transform-based encoder as
     described in ["FNet: Mixing Tokens with Fourier Transforms"](https://arxiv.org/abs/2105.03824).
-    It includes the embedding lookups and `keras_nlp.layers.FNetEncoder` layers,
+    It includes the embedding lookups and `keras_hub.layers.FNetEncoder` layers,
     but not the masked language model or next sentence prediction heads.
 
     The default constructor gives a fully customizable, randomly initialized
@@ -79,11 +79,11 @@ class FNetBackbone(Backbone):
     }
 
-    # Pretrained BERT encoder.
+    # Pretrained FNet encoder.
-    model = keras_nlp.models.FNetBackbone.from_preset("f_net_base_en")
+    model = keras_hub.models.FNetBackbone.from_preset("f_net_base_en")
     model(input_data)
 
     # Randomly initialized FNet encoder with a custom config.
-    model = keras_nlp.models.FNetBackbone(
+    model = keras_hub.models.FNetBackbone(
         vocabulary_size=32000,
         num_layers=4,
         hidden_dim=256,
diff --git a/keras_nlp/src/models/f_net/f_net_backbone_test.py b/keras_hub/src/models/f_net/f_net_backbone_test.py
similarity index 94%
rename from keras_nlp/src/models/f_net/f_net_backbone_test.py
rename to keras_hub/src/models/f_net/f_net_backbone_test.py
index f5b9e1e67b..c86629cd06 100644
--- a/keras_nlp/src/models/f_net/f_net_backbone_test.py
+++ b/keras_hub/src/models/f_net/f_net_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FNetBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/f_net/f_net_masked_lm.py b/keras_hub/src/models/f_net/f_net_masked_lm.py
similarity index 86%
rename from keras_nlp/src/models/f_net/f_net_masked_lm.py
rename to keras_hub/src/models/f_net/f_net_masked_lm.py
index 6bd0ac84f3..799ef810aa 100644
--- a/keras_nlp/src/models/f_net/f_net_masked_lm.py
+++ b/keras_hub/src/models/f_net/f_net_masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,17 +15,17 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_backbone import f_net_kernel_initializer
-from keras_nlp.src.models.f_net.f_net_masked_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_backbone import f_net_kernel_initializer
+from keras_hub.src.models.f_net.f_net_masked_lm_preprocessor import (
     FNetMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.masked_lm import MaskedLM
+from keras_hub.src.models.masked_lm import MaskedLM
 
 
-@keras_nlp_export("keras_nlp.models.FNetMaskedLM")
+@keras_hub_export("keras_hub.models.FNetMaskedLM")
 class FNetMaskedLM(MaskedLM):
     """An end-to-end FNet model for the masked language modeling task.
 
@@ -44,8 +44,8 @@ class FNetMaskedLM(MaskedLM):
     warranties or conditions of any kind.
 
     Args:
-        backbone: A `keras_nlp.models.FNetBackbone` instance.
-        preprocessor: A `keras_nlp.models.FNetMaskedLMPreprocessor` or
+        backbone: A `keras_hub.models.FNetBackbone` instance.
+        preprocessor: A `keras_hub.models.FNetMaskedLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -57,7 +57,7 @@ class FNetMaskedLM(MaskedLM):
     features = ["The quick brown fox jumped.", "I forgot my homework."]
 
     # Pretrained language model.
-    masked_lm = keras_nlp.models.FNetMaskedLM.from_preset(
+    masked_lm = keras_hub.models.FNetMaskedLM.from_preset(
         "f_net_base_en",
     )
     masked_lm.fit(x=features, batch_size=2)
@@ -85,7 +85,7 @@ class FNetMaskedLM(MaskedLM):
     # Labels are the original masked values.
     labels = [[3, 5]] * 2
 
-    masked_lm = keras_nlp.models.FNetMaskedLM.from_preset(
+    masked_lm = keras_hub.models.FNetMaskedLM.from_preset(
         "f_net_base_en",
         preprocessor=None,
     )
diff --git a/keras_nlp/src/models/f_net/f_net_masked_lm_preprocessor.py b/keras_hub/src/models/f_net/f_net_masked_lm_preprocessor.py
similarity index 87%
rename from keras_nlp/src/models/f_net/f_net_masked_lm_preprocessor.py
rename to keras_hub/src/models/f_net/f_net_masked_lm_preprocessor.py
index 3d0b625ef1..5af51e6135 100644
--- a/keras_nlp/src/models/f_net/f_net_masked_lm_preprocessor.py
+++ b/keras_hub/src/models/f_net/f_net_masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,20 +14,20 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.FNetMaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.FNetMaskedLMPreprocessor")
 class FNetMaskedLMPreprocessor(MaskedLMPreprocessor):
     """FNet preprocessing for the masked language modeling task.
 
     This preprocessing layer will prepare inputs for a masked language modeling
     task. It is primarily intended for use with the
-    `keras_nlp.models.FNetMaskedLM` task model. Preprocessing will occur in
+    `keras_hub.models.FNetMaskedLM` task model. Preprocessing will occur in
     multiple steps.
 
     1. Tokenize any number of input segments using the `tokenizer`.
@@ -38,10 +38,10 @@ class FNetMaskedLMPreprocessor(MaskedLMPreprocessor):
     3. Randomly select non-special tokens to mask, controlled by
       `mask_selection_rate`.
     4. Construct a `(x, y, sample_weight)` tuple suitable for training with a
-      `keras_nlp.models.FNetMaskedLM` task model.
+      `keras_hub.models.FNetMaskedLM` task model.
 
     Args:
-        tokenizer: A `keras_nlp.models.FNetTokenizer` instance.
+        tokenizer: A `keras_hub.models.FNetTokenizer` instance.
         sequence_length: The length of the packed inputs.
         mask_selection_rate: The probability an input token will be dynamically
             masked.
@@ -72,7 +72,7 @@ class FNetMaskedLMPreprocessor(MaskedLMPreprocessor):
     Directly calling the layer on data.
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.FNetMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.FNetMaskedLMPreprocessor.from_preset(
         "f_net_base_en"
     )
 
@@ -91,7 +91,7 @@ class FNetMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.FNetMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.FNetMaskedLMPreprocessor.from_preset(
         "f_net_base_en"
     )
 
diff --git a/keras_nlp/src/models/f_net/f_net_masked_lm_preprocessor_test.py b/keras_hub/src/models/f_net/f_net_masked_lm_preprocessor_test.py
similarity index 92%
rename from keras_nlp/src/models/f_net/f_net_masked_lm_preprocessor_test.py
rename to keras_hub/src/models/f_net/f_net_masked_lm_preprocessor_test.py
index 7c5a517c19..32af388bba 100644
--- a/keras_nlp/src/models/f_net/f_net_masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/f_net/f_net_masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.f_net.f_net_masked_lm_preprocessor import (
+from keras_hub.src.models.f_net.f_net_masked_lm_preprocessor import (
     FNetMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FNetMaskedLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/f_net/f_net_masked_lm_test.py b/keras_hub/src/models/f_net/f_net_masked_lm_test.py
similarity index 87%
rename from keras_nlp/src/models/f_net/f_net_masked_lm_test.py
rename to keras_hub/src/models/f_net/f_net_masked_lm_test.py
index 482b2ae71a..78da71c2f8 100644
--- a/keras_nlp/src/models/f_net/f_net_masked_lm_test.py
+++ b/keras_hub/src/models/f_net/f_net_masked_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 
 import pytest
 
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_masked_lm import FNetMaskedLM
-from keras_nlp.src.models.f_net.f_net_masked_lm_preprocessor import (
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_masked_lm import FNetMaskedLM
+from keras_hub.src.models.f_net.f_net_masked_lm_preprocessor import (
     FNetMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FNetMaskedLMTest(TestCase):
diff --git a/keras_nlp/src/models/f_net/f_net_presets.py b/keras_hub/src/models/f_net/f_net_presets.py
similarity index 97%
rename from keras_nlp/src/models/f_net/f_net_presets.py
rename to keras_hub/src/models/f_net/f_net_presets.py
index c053eb346d..18c2832927 100644
--- a/keras_nlp/src/models/f_net/f_net_presets.py
+++ b/keras_hub/src/models/f_net/f_net_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/f_net/f_net_text_classifier.py b/keras_hub/src/models/f_net/f_net_text_classifier.py
similarity index 85%
rename from keras_nlp/src/models/f_net/f_net_text_classifier.py
rename to keras_hub/src/models/f_net/f_net_text_classifier.py
index fa18b5ea38..40f30d9984 100644
--- a/keras_nlp/src/models/f_net/f_net_text_classifier.py
+++ b/keras_hub/src/models/f_net/f_net_text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,26 +15,26 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_backbone import f_net_kernel_initializer
-from keras_nlp.src.models.f_net.f_net_text_classifier_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_backbone import f_net_kernel_initializer
+from keras_hub.src.models.f_net.f_net_text_classifier_preprocessor import (
     FNetTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.FNetTextClassifier",
-        "keras_nlp.models.FNetClassifier",
+        "keras_hub.models.FNetTextClassifier",
+        "keras_hub.models.FNetClassifier",
     ]
 )
 class FNetTextClassifier(TextClassifier):
     """An end-to-end f_net model for classification tasks.
 
     This model attaches a classification head to a
-    `keras_nlp.model.FNetBackbone` instance, mapping from the backbone outputs
+    `keras_hub.models.FNetBackbone` instance, mapping from the backbone outputs
     to logits suitable for a classification task. For usage of this model with
     pre-trained weights, use the `from_preset()` constructor.
 
@@ -47,9 +47,9 @@ class FNetTextClassifier(TextClassifier):
     warranties or conditions of any kind.
 
     Args:
-        backbone: A `keras_nlp.models.FNetBackbone` instance.
+        backbone: A `keras_hub.models.FNetBackbone` instance.
         num_classes: int. Number of classes to predict.
-        preprocessor: A `keras_nlp.models.FNetTextClassifierPreprocessor` or `None`. If
+        preprocessor: A `keras_hub.models.FNetTextClassifierPreprocessor` or `None`. If
             `None`, this model will not apply preprocessing, and inputs should
             be preprocessed before calling the model.
         activation: Optional `str` or callable. The
@@ -68,7 +68,7 @@ class FNetTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier.
-    classifier = keras_nlp.models.FNetTextClassifier.from_preset(
+    classifier = keras_hub.models.FNetTextClassifier.from_preset(
         "f_net_base_en",
         num_classes=4,
     )
@@ -96,7 +96,7 @@ class FNetTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier without preprocessing.
-    classifier = keras_nlp.models.FNetTextClassifier.from_preset(
+    classifier = keras_hub.models.FNetTextClassifier.from_preset(
         "f_net_base_en",
         num_classes=4,
         preprocessor=None,
diff --git a/keras_nlp/src/models/f_net/f_net_text_classifier_preprocessor.py b/keras_hub/src/models/f_net/f_net_text_classifier_preprocessor.py
similarity index 85%
rename from keras_nlp/src/models/f_net/f_net_text_classifier_preprocessor.py
rename to keras_hub/src/models/f_net/f_net_text_classifier_preprocessor.py
index 124fc803e2..3ea7e8f27f 100644
--- a/keras_nlp/src/models/f_net/f_net_text_classifier_preprocessor.py
+++ b/keras_hub/src/models/f_net/f_net_text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,19 +15,19 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.FNetTextClassifierPreprocessor",
-        "keras_nlp.models.FNetPreprocessor",
+        "keras_hub.models.FNetTextClassifierPreprocessor",
+        "keras_hub.models.FNetPreprocessor",
     ]
 )
 class FNetTextClassifierPreprocessor(TextClassifierPreprocessor):
@@ -36,17 +36,17 @@ class FNetTextClassifierPreprocessor(TextClassifierPreprocessor):
     This preprocessing layer will do three things:
 
      1. Tokenize any number of input segments using the `tokenizer`.
-     2. Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
+     2. Pack the inputs together using a `keras_hub.layers.MultiSegmentPacker`
        with the appropriate `"[CLS]"`, `"[SEP]"` and `"<pad>"` tokens.
      3. Construct a dictionary with keys `"token_ids"`, and `"segment_ids"`  that
-       can be passed directly to `keras_nlp.models.FNetBackbone`.
+       can be passed directly to `keras_hub.models.FNetBackbone`.
 
     This layer can be used directly with `tf.data.Dataset.map` to preprocess
     string data in the `(x, y, sample_weight)` format used by
     `keras.Model.fit`.
 
     Args:
-        tokenizer: A `keras_nlp.models.FNetTokenizer` instance.
+        tokenizer: A `keras_hub.models.FNetTokenizer` instance.
         sequence_length: The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -71,7 +71,7 @@ class FNetTextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Directly calling the from_preset().
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "f_net_base_en"
     )
 
@@ -90,7 +90,7 @@ class FNetTextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "f_net_base_en"
     )
     first = tf.constant(["The quick brown fox jumped.", "Call me Ishmael."])
diff --git a/keras_nlp/src/models/f_net/f_net_text_classifier_preprocessor_test.py b/keras_hub/src/models/f_net/f_net_text_classifier_preprocessor_test.py
similarity index 91%
rename from keras_nlp/src/models/f_net/f_net_text_classifier_preprocessor_test.py
rename to keras_hub/src/models/f_net/f_net_text_classifier_preprocessor_test.py
index 7d7c6371fd..c6cec7afe4 100644
--- a/keras_nlp/src/models/f_net/f_net_text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/f_net/f_net_text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.f_net.f_net_text_classifier_preprocessor import (
+from keras_hub.src.models.f_net.f_net_text_classifier_preprocessor import (
     FNetTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FNetTextClassifierPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/f_net/f_net_text_classifier_test.py b/keras_hub/src/models/f_net/f_net_text_classifier_test.py
similarity index 87%
rename from keras_nlp/src/models/f_net/f_net_text_classifier_test.py
rename to keras_hub/src/models/f_net/f_net_text_classifier_test.py
index 31a700f5b3..2e2911e9fe 100644
--- a/keras_nlp/src/models/f_net/f_net_text_classifier_test.py
+++ b/keras_hub/src/models/f_net/f_net_text_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 
 import pytest
 
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_text_classifier import FNetTextClassifier
-from keras_nlp.src.models.f_net.f_net_text_classifier_preprocessor import (
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.models.f_net.f_net_text_classifier import FNetTextClassifier
+from keras_hub.src.models.f_net.f_net_text_classifier_preprocessor import (
     FNetTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FNetTextClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/f_net/f_net_tokenizer.py b/keras_hub/src/models/f_net/f_net_tokenizer.py
similarity index 84%
rename from keras_nlp/src/models/f_net/f_net_tokenizer.py
rename to keras_hub/src/models/f_net/f_net_tokenizer.py
index 3b4d48fec3..9f49bc4a28 100644
--- a/keras_nlp/src/models/f_net/f_net_tokenizer.py
+++ b/keras_hub/src/models/f_net/f_net_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,24 +13,24 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.f_net.f_net_backbone import FNetBackbone
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.FNetTokenizer",
-        "keras_nlp.models.FNetTokenizer",
+        "keras_hub.tokenizers.FNetTokenizer",
+        "keras_hub.models.FNetTokenizer",
     ]
 )
 class FNetTokenizer(SentencePieceTokenizer):
     """FNet tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     FNet models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a FNet preset.
@@ -50,7 +50,7 @@ class FNetTokenizer(SentencePieceTokenizer):
     Examples:
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.FNetTokenizer.from_preset(
+    tokenizer = keras_hub.models.FNetTokenizer.from_preset(
         "f_net_base_en",
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/f_net/f_net_tokenizer_test.py b/keras_hub/src/models/f_net/f_net_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/f_net/f_net_tokenizer_test.py
rename to keras_hub/src/models/f_net/f_net_tokenizer_test.py
index 0e8ff6620f..f9e74c1231 100644
--- a/keras_nlp/src/models/f_net/f_net_tokenizer_test.py
+++ b/keras_hub/src/models/f_net/f_net_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import pytest
 
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.f_net.f_net_tokenizer import FNetTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FNetTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/falcon/__init__.py b/keras_hub/src/models/falcon/__init__.py
similarity index 72%
rename from keras_nlp/src/models/falcon/__init__.py
rename to keras_hub/src/models/falcon/__init__.py
index 12adc32feb..07e9ae03c1 100644
--- a/keras_nlp/src/models/falcon/__init__.py
+++ b/keras_hub/src/models/falcon/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.falcon.falcon_backbone import FalconBackbone
-from keras_nlp.src.models.falcon.falcon_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.falcon.falcon_backbone import FalconBackbone
+from keras_hub.src.models.falcon.falcon_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, FalconBackbone)
diff --git a/keras_nlp/src/models/falcon/falcon_attention.py b/keras_hub/src/models/falcon/falcon_attention.py
similarity index 99%
rename from keras_nlp/src/models/falcon/falcon_attention.py
rename to keras_hub/src/models/falcon/falcon_attention.py
index b1bcaec47e..775bc3b928 100644
--- a/keras_nlp/src/models/falcon/falcon_attention.py
+++ b/keras_hub/src/models/falcon/falcon_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/falcon/falcon_backbone.py b/keras_hub/src/models/falcon/falcon_backbone.py
similarity index 92%
rename from keras_nlp/src/models/falcon/falcon_backbone.py
rename to keras_hub/src/models/falcon/falcon_backbone.py
index 3703dcff43..0235371570 100644
--- a/keras_nlp/src/models/falcon/falcon_backbone.py
+++ b/keras_hub/src/models/falcon/falcon_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,17 +13,17 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.falcon.falcon_transformer_decoder import (
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.falcon.falcon_transformer_decoder import (
     FalconTransformerDecoder,
 )
 
 
-@keras_nlp_export("keras_nlp.models.FalconBackbone")
+@keras_hub_export("keras_hub.models.FalconBackbone")
 class FalconBackbone(Backbone):
     """The Falcon core architecure.
 
@@ -56,11 +56,11 @@ class FalconBackbone(Backbone):
 
     # Pretrained Falcon decoder.
     # TODO: Update the preset.
-    model = keras_nlp.models.FalconBackbone.from_preset("falcon_preset")
+    model = keras_hub.models.FalconBackbone.from_preset("falcon_preset")
     model(input_data)
 
     # Randomly initialized Falcon decoder with a custom config.
-    model = keras_nlp.models.FalconBackbone(
+    model = keras_hub.models.FalconBackbone(
         vocabulary_size=10,
         num_layers=2,
         num_attention_heads=2,
diff --git a/keras_nlp/src/models/falcon/falcon_backbone_test.py b/keras_hub/src/models/falcon/falcon_backbone_test.py
similarity index 90%
rename from keras_nlp/src/models/falcon/falcon_backbone_test.py
rename to keras_hub/src/models/falcon/falcon_backbone_test.py
index e7104b37fd..dc56268700 100644
--- a/keras_nlp/src/models/falcon/falcon_backbone_test.py
+++ b/keras_hub/src/models/falcon/falcon_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.falcon.falcon_backbone import FalconBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.falcon.falcon_backbone import FalconBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FalconBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/falcon/falcon_causal_lm.py b/keras_hub/src/models/falcon/falcon_causal_lm.py
similarity index 89%
rename from keras_nlp/src/models/falcon/falcon_causal_lm.py
rename to keras_hub/src/models/falcon/falcon_causal_lm.py
index 5257b3799d..13374d6f90 100644
--- a/keras_nlp/src/models/falcon/falcon_causal_lm.py
+++ b/keras_hub/src/models/falcon/falcon_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,16 +14,16 @@
 
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.falcon.falcon_backbone import FalconBackbone
-from keras_nlp.src.models.falcon.falcon_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.falcon.falcon_backbone import FalconBackbone
+from keras_hub.src.models.falcon.falcon_causal_lm_preprocessor import (
     FalconCausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.FalconCausalLM")
+@keras_hub_export("keras_hub.models.FalconCausalLM")
 class FalconCausalLM(CausalLM):
     """An end-to-end Falcon model for causal language modeling.
 
@@ -36,7 +36,7 @@ class FalconCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"greedy"` sampling will be used.
 
     This model can optionally be configured with a `preprocessor` layer, in
@@ -45,8 +45,8 @@ class FalconCausalLM(CausalLM):
     when creating the model with `from_preset()`.
 
     Args:
-        backbone: A `keras_nlp.models.FalconBackbone` instance.
-        preprocessor: A `keras_nlp.models.FalconCausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.FalconBackbone` instance.
+        preprocessor: A `keras_hub.models.FalconCausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
 
@@ -54,7 +54,7 @@ class FalconCausalLM(CausalLM):
 
     Use `generate()` to do text generation.
     ```python
-    falcon_lm = keras_nlp.models.FalconCausalLM.from_preset("falcon_refinedweb_1b_en")
+    falcon_lm = keras_hub.models.FalconCausalLM.from_preset("falcon_refinedweb_1b_en")
     falcon_lm.generate("I want to say", max_length=30)
 
     # Generate with batched prompts.
@@ -63,11 +63,11 @@ class FalconCausalLM(CausalLM):
 
     Compile the `generate()` function with a custom sampler.
     ```python
-    falcon_lm = keras_nlp.models.FalconCausalLM.from_preset("falcon_refinedweb_1b_en")
+    falcon_lm = keras_hub.models.FalconCausalLM.from_preset("falcon_refinedweb_1b_en")
     falcon_lm.compile(sampler="top_k")
     falcon_lm.generate("I want to say", max_length=30)
 
-    falcon_lm.compile(sampler=keras_nlp.samplers.BeamSampler(num_beams=2))
+    falcon_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
     falcon_lm.generate("I want to say", max_length=30)
     ```
 
@@ -80,7 +80,7 @@ class FalconCausalLM(CausalLM):
         "padding_mask": np.array([[1, 1, 1, 1]] * 2),
     }
 
-    falcon_lm = keras_nlp.models.FalconCausalLM.from_preset(
+    falcon_lm = keras_hub.models.FalconCausalLM.from_preset(
         "falcon_refinedweb_1b_en",
         preprocessor=None,
     )
@@ -90,7 +90,7 @@ class FalconCausalLM(CausalLM):
     Call `fit()` on a single batch.
     ```python
     features = ["The quick brown fox jumped.", "I forgot my homework."]
-    falcon_lm = keras_nlp.models.FalconCausalLM.from_preset("falcon_refinedweb_1b_en")
+    falcon_lm = keras_hub.models.FalconCausalLM.from_preset("falcon_refinedweb_1b_en")
     falcon_lm.fit(x=features, batch_size=2)
     ```
 
@@ -104,7 +104,7 @@ class FalconCausalLM(CausalLM):
     y = np.array([[17337,   292,   318,  2769,  4673,  5888, 50256, 0, 0]] * 2)
     sw = np.array([[1, 1, 1, 1, 1, 1, 1, 0, 0]] * 2)
 
-    falcon_lm = keras_nlp.models.FalconCausalLM.from_preset(
+    falcon_lm = keras_hub.models.FalconCausalLM.from_preset(
         "falcon_refinedweb_1b_en",
         preprocessor=None,
     )
@@ -116,22 +116,22 @@ class FalconCausalLM(CausalLM):
     vocab = {"<|endoftext|>": 0, "a": 4, "Ġquick": 5, "Ġfox": 6}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.FalconTokenizer(
+    tokenizer = keras_hub.models.FalconTokenizer(
         vocabulary=vocab,
         merges=merges,
     )
-    preprocessor = keras_nlp.models.FalconCausalLMPreprocessor(
+    preprocessor = keras_hub.models.FalconCausalLMPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.FalconBackbone(
+    backbone = keras_hub.models.FalconBackbone(
         vocabulary_size=50304,
         num_layers=24,
         num_attention_heads=64,
         hidden_dim=2048,
         intermediate_dim=4*2048,
     )
-    falcon_lm = keras_nlp.models.FalconCausalLM(
+    falcon_lm = keras_hub.models.FalconCausalLM(
         backbone=backbone,
         preprocessor=preprocessor,
     )
diff --git a/keras_nlp/src/models/falcon/falcon_causal_lm_preprocessor.py b/keras_hub/src/models/falcon/falcon_causal_lm_preprocessor.py
similarity index 83%
rename from keras_nlp/src/models/falcon/falcon_causal_lm_preprocessor.py
rename to keras_hub/src/models/falcon/falcon_causal_lm_preprocessor.py
index 2b6ae080ec..7567c92b5b 100644
--- a/keras_nlp/src/models/falcon/falcon_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/falcon/falcon_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,30 +12,30 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.falcon.falcon_backbone import FalconBackbone
-from keras_nlp.src.models.falcon.falcon_tokenizer import FalconTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.falcon.falcon_backbone import FalconBackbone
+from keras_hub.src.models.falcon.falcon_tokenizer import FalconTokenizer
 
 
-@keras_nlp_export("keras_nlp.models.FalconCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.FalconCausalLMPreprocessor")
 class FalconCausalLMPreprocessor(CausalLMPreprocessor):
     """Falcon Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.FalconCausalLM`. By default, it will take in batches of
+    `keras_hub.models.FalconCausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.FalconCausalLM` instance, these methods
+    is attached to a `keras_hub.models.FalconCausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.FalconTokenizer` instance.
+        tokenizer: A `keras_hub.models.FalconTokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence.
@@ -53,7 +53,7 @@ class FalconCausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.FalconCausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.FalconCausalLMPreprocessor.from_preset(
         "falcon_refinedweb_1b_en"
     )
 
diff --git a/keras_nlp/src/models/falcon/falcon_causal_lm_preprocessor_test.py b/keras_hub/src/models/falcon/falcon_causal_lm_preprocessor_test.py
similarity index 94%
rename from keras_nlp/src/models/falcon/falcon_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/falcon/falcon_causal_lm_preprocessor_test.py
index f768544a7f..aaf945e2f9 100644
--- a/keras_nlp/src/models/falcon/falcon_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/falcon/falcon_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.falcon.falcon_causal_lm_preprocessor import (
+from keras_hub.src.models.falcon.falcon_causal_lm_preprocessor import (
     FalconCausalLMPreprocessor,
 )
-from keras_nlp.src.models.falcon.falcon_tokenizer import FalconTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.falcon.falcon_tokenizer import FalconTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FalconCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/falcon/falcon_causal_lm_test.py b/keras_hub/src/models/falcon/falcon_causal_lm_test.py
similarity index 95%
rename from keras_nlp/src/models/falcon/falcon_causal_lm_test.py
rename to keras_hub/src/models/falcon/falcon_causal_lm_test.py
index c2d62998d6..5b434bae76 100644
--- a/keras_nlp/src/models/falcon/falcon_causal_lm_test.py
+++ b/keras_hub/src/models/falcon/falcon_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.falcon.falcon_backbone import FalconBackbone
-from keras_nlp.src.models.falcon.falcon_causal_lm import FalconCausalLM
-from keras_nlp.src.models.falcon.falcon_causal_lm_preprocessor import (
+from keras_hub.src.models.falcon.falcon_backbone import FalconBackbone
+from keras_hub.src.models.falcon.falcon_causal_lm import FalconCausalLM
+from keras_hub.src.models.falcon.falcon_causal_lm_preprocessor import (
     FalconCausalLMPreprocessor,
 )
-from keras_nlp.src.models.falcon.falcon_tokenizer import FalconTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.falcon.falcon_tokenizer import FalconTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FalconCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/falcon/falcon_presets.py b/keras_hub/src/models/falcon/falcon_presets.py
similarity index 96%
rename from keras_nlp/src/models/falcon/falcon_presets.py
rename to keras_hub/src/models/falcon/falcon_presets.py
index b0bb6aa54e..b562902c86 100644
--- a/keras_nlp/src/models/falcon/falcon_presets.py
+++ b/keras_hub/src/models/falcon/falcon_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/falcon/falcon_tokenizer.py b/keras_hub/src/models/falcon/falcon_tokenizer.py
similarity index 83%
rename from keras_nlp/src/models/falcon/falcon_tokenizer.py
rename to keras_hub/src/models/falcon/falcon_tokenizer.py
index c59e0ce8da..bd6730bc83 100644
--- a/keras_nlp/src/models/falcon/falcon_tokenizer.py
+++ b/keras_hub/src/models/falcon/falcon_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,22 +13,22 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.falcon.falcon_backbone import FalconBackbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.falcon.falcon_backbone import FalconBackbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.FalconTokenizer",
-        "keras_nlp.models.FalconTokenizer",
+        "keras_hub.tokenizers.FalconTokenizer",
+        "keras_hub.models.FalconTokenizer",
     ]
 )
 class FalconTokenizer(BytePairTokenizer):
     """Falcon tokenizer based on BytePairTokenizer.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.BytePairTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by Falcon
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a Falcon preset.
@@ -51,7 +51,7 @@ class FalconTokenizer(BytePairTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.FalconTokenizer.from_preset("falcon_refinedweb_1b_en")
+    tokenizer = keras_hub.models.FalconTokenizer.from_preset("falcon_refinedweb_1b_en")
     tokenizer("The quick brown fox jumped.")
 
     # Batched input.
@@ -64,7 +64,7 @@ class FalconTokenizer(BytePairTokenizer):
     vocab = {"<|endoftext|>": 0, "a": 4, "Ġquick": 5, "Ġfox": 6}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.FalconTokenizer(vocabulary=vocab, merges=merges)
+    tokenizer = keras_hub.models.FalconTokenizer(vocabulary=vocab, merges=merges)
     tokenizer("a quick fox.")
     ```
     """
diff --git a/keras_nlp/src/models/falcon/falcon_tokenizer_test.py b/keras_hub/src/models/falcon/falcon_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/falcon/falcon_tokenizer_test.py
rename to keras_hub/src/models/falcon/falcon_tokenizer_test.py
index 51d30b975d..1a8f7d4dfb 100644
--- a/keras_nlp/src/models/falcon/falcon_tokenizer_test.py
+++ b/keras_hub/src/models/falcon/falcon_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.falcon.falcon_tokenizer import FalconTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.falcon.falcon_tokenizer import FalconTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class FalconTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/falcon/falcon_transformer_decoder.py b/keras_hub/src/models/falcon/falcon_transformer_decoder.py
similarity index 97%
rename from keras_nlp/src/models/falcon/falcon_transformer_decoder.py
rename to keras_hub/src/models/falcon/falcon_transformer_decoder.py
index b9d4fb4321..b49011f9b7 100644
--- a/keras_nlp/src/models/falcon/falcon_transformer_decoder.py
+++ b/keras_hub/src/models/falcon/falcon_transformer_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     merge_padding_and_attention_mask,
 )
-from keras_nlp.src.models.falcon.falcon_attention import FalconAttention
+from keras_hub.src.models.falcon.falcon_attention import FalconAttention
 
 
 class FalconTransformerDecoder(keras.layers.Layer):
diff --git a/keras_nlp/src/models/feature_pyramid_backbone.py b/keras_hub/src/models/feature_pyramid_backbone.py
similarity index 92%
rename from keras_nlp/src/models/feature_pyramid_backbone.py
rename to keras_hub/src/models/feature_pyramid_backbone.py
index 989d9fbd64..1d0ad54cc2 100644
--- a/keras_nlp/src/models/feature_pyramid_backbone.py
+++ b/keras_hub/src/models/feature_pyramid_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.backbone import Backbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.backbone import Backbone
 
 
-@keras_nlp_export("keras_nlp.models.FeaturePyramidBackbone")
+@keras_hub_export("keras_hub.models.FeaturePyramidBackbone")
 class FeaturePyramidBackbone(Backbone):
     """A backbone with feature pyramid outputs.
 
diff --git a/keras_nlp/src/models/gemma/__init__.py b/keras_hub/src/models/gemma/__init__.py
similarity index 72%
rename from keras_nlp/src/models/gemma/__init__.py
rename to keras_hub/src/models/gemma/__init__.py
index 4e47f22c49..951bb6194e 100644
--- a/keras_nlp/src/models/gemma/__init__.py
+++ b/keras_hub/src/models/gemma/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.models.gemma.gemma_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.models.gemma.gemma_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, GemmaBackbone)
diff --git a/keras_nlp/src/models/gemma/gemma_attention.py b/keras_hub/src/models/gemma/gemma_attention.py
similarity index 98%
rename from keras_nlp/src/models/gemma/gemma_attention.py
rename to keras_hub/src/models/gemma/gemma_attention.py
index a01c8fc2fc..5a54cebcd0 100644
--- a/keras_nlp/src/models/gemma/gemma_attention.py
+++ b/keras_hub/src/models/gemma/gemma_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 from keras import ops
 
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class CachedGemmaAttention(keras.layers.Layer):
diff --git a/keras_nlp/src/models/gemma/gemma_backbone.py b/keras_hub/src/models/gemma/gemma_backbone.py
similarity index 95%
rename from keras_nlp/src/models/gemma/gemma_backbone.py
rename to keras_hub/src/models/gemma/gemma_backbone.py
index 624622fd5e..b62263b911 100644
--- a/keras_nlp/src/models/gemma/gemma_backbone.py
+++ b/keras_hub/src/models/gemma/gemma_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,16 +16,16 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.gemma.gemma_decoder_block import GemmaDecoderBlock
-from keras_nlp.src.models.gemma.rms_normalization import RMSNormalization
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.gemma.gemma_decoder_block import GemmaDecoderBlock
+from keras_hub.src.models.gemma.rms_normalization import RMSNormalization
 
 
-@keras_nlp_export("keras_nlp.models.GemmaBackbone")
+@keras_hub_export("keras_hub.models.GemmaBackbone")
 class GemmaBackbone(Backbone):
     """Gemma core network with hyperparameters.
 
@@ -33,7 +33,7 @@ class GemmaBackbone(Backbone):
     It includes the embedding lookups and transformer layers. This backbone
     will output the final hidden states for each token, not generative
     predictions over the vocabulary space. For a higher-level object for text
-    generation, see `keras_nlp.models.GemmaCausalLM`.
+    generation, see `keras_hub.models.GemmaCausalLM`.
 
     The default constructor gives a fully customizable, randomly initialized
     Gemma model with any number of layers, heads, and embedding dimensions. To
@@ -82,11 +82,11 @@ class GemmaBackbone(Backbone):
     }
 
     # Pretrained Gemma decoder.
-    model = keras_nlp.models.GemmaBackbone.from_preset("gemma_2b_en")
+    model = keras_hub.models.GemmaBackbone.from_preset("gemma_2b_en")
     model(input_data)
 
     # Randomly initialized Gemma decoder with custom config.
-    model = keras_nlp.models.GemmaBackbone(
+    model = keras_hub.models.GemmaBackbone(
         vocabulary_size=50257,
         num_layers=12,
         num_query_heads=12,
@@ -249,7 +249,7 @@ def get_layout_map(
         distribution = keras.distribution.ModelParallel(
             mesh, layout_map, batch_dim_name='batch')
         with distribution.scope():
-           gemma_model = keras_nlp.models.GemmaCausalLM.from_preset()
+           gemma_model = keras_hub.models.GemmaCausalLM.from_preset()
         ```
 
         Args:
diff --git a/keras_nlp/src/models/gemma/gemma_backbone_test.py b/keras_hub/src/models/gemma/gemma_backbone_test.py
similarity index 98%
rename from keras_nlp/src/models/gemma/gemma_backbone_test.py
rename to keras_hub/src/models/gemma/gemma_backbone_test.py
index 703f51271d..d756565968 100644
--- a/keras_nlp/src/models/gemma/gemma_backbone_test.py
+++ b/keras_hub/src/models/gemma/gemma_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GemmaBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/gemma/gemma_causal_lm.py b/keras_hub/src/models/gemma/gemma_causal_lm.py
similarity index 92%
rename from keras_nlp/src/models/gemma/gemma_causal_lm.py
rename to keras_hub/src/models/gemma/gemma_causal_lm.py
index 89a53a7b05..4a4f9ebd12 100644
--- a/keras_nlp/src/models/gemma/gemma_causal_lm.py
+++ b/keras_hub/src/models/gemma/gemma_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,16 +16,16 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.models.gemma.gemma_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.models.gemma.gemma_causal_lm_preprocessor import (
     GemmaCausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.GemmaCausalLM")
+@keras_hub_export("keras_hub.models.GemmaCausalLM")
 class GemmaCausalLM(CausalLM):
     """An end-to-end Gemma model for causal language modeling.
 
@@ -38,7 +38,7 @@ class GemmaCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"greedy"` sampling will be used.
 
     This model can optionally be configured with a `preprocessor` layer, in
@@ -47,8 +47,8 @@ class GemmaCausalLM(CausalLM):
     when creating the model with `from_preset()`.
 
     Args:
-        backbone: A `keras_nlp.models.GemmaBackbone` instance.
-        preprocessor: A `keras_nlp.models.GemmaCausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.GemmaBackbone` instance.
+        preprocessor: A `keras_hub.models.GemmaCausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
 
@@ -56,7 +56,7 @@ class GemmaCausalLM(CausalLM):
 
     Use `generate()` to do text generation.
     ```python
-    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
+    gemma_lm = keras_hub.models.GemmaCausalLM.from_preset("gemma_2b_en")
     gemma_lm.generate("I want to say", max_length=30)
 
     # Generate with batched prompts.
@@ -65,11 +65,11 @@ class GemmaCausalLM(CausalLM):
 
     Compile the `generate()` function with a custom sampler.
     ```python
-    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
+    gemma_lm = keras_hub.models.GemmaCausalLM.from_preset("gemma_2b_en")
     gemma_lm.compile(sampler="top_k")
     gemma_lm.generate("I want to say", max_length=30)
 
-    gemma_lm.compile(sampler=keras_nlp.samplers.BeamSampler(num_beams=2))
+    gemma_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
     gemma_lm.generate("I want to say", max_length=30)
     ```
 
@@ -82,7 +82,7 @@ class GemmaCausalLM(CausalLM):
         "padding_mask": np.array([[1, 1, 1, 0, 0, 0, 0]] * 2),
     }
 
-    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset(
+    gemma_lm = keras_hub.models.GemmaCausalLM.from_preset(
         "gemma_2b_en",
         preprocessor=None,
     )
@@ -92,14 +92,14 @@ class GemmaCausalLM(CausalLM):
     Call `fit()` on a single batch.
     ```python
     features = ["The quick brown fox jumped.", "I forgot my homework."]
-    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
+    gemma_lm = keras_hub.models.GemmaCausalLM.from_preset("gemma_2b_en")
     gemma_lm.fit(x=features, batch_size=2)
     ```
 
     Call `fit()` with LoRA fine-tuning enabled.
     ```python
     features = ["The quick brown fox jumped.", "I forgot my homework."]
-    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
+    gemma_lm = keras_hub.models.GemmaCausalLM.from_preset("gemma_2b_en")
-    gemma.backbone.enable_lora(rank=4)
+    gemma_lm.backbone.enable_lora(rank=4)
     gemma_lm.fit(x=features, batch_size=2)
     ```
@@ -114,7 +114,7 @@ class GemmaCausalLM(CausalLM):
     y = np.array([[214064, 603, 5271, 6044, 9581, 3, 0, 0]] * 2)
     sw = np.array([[1, 1, 1, 1, 1, 1, 0, 0]] * 2)
 
-    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset(
+    gemma_lm = keras_hub.models.GemmaCausalLM.from_preset(
         "gemma_2b_en",
         preprocessor=None,
     )
@@ -123,14 +123,14 @@ class GemmaCausalLM(CausalLM):
 
     Custom backbone and vocabulary.
     ```python
-    tokenizer = keras_nlp.models.GemmaTokenizer(
+    tokenizer = keras_hub.models.GemmaTokenizer(
         proto="proto.spm",
     )
-    preprocessor = keras_nlp.models.GemmaCausalLMPreprocessor(
+    preprocessor = keras_hub.models.GemmaCausalLMPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.GemmaBackbone(
+    backbone = keras_hub.models.GemmaBackbone(
         vocabulary_size=30552,
         num_layers=4,
         num_heads=4,
@@ -138,7 +138,7 @@ class GemmaCausalLM(CausalLM):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    gemma_lm = keras_nlp.models.GemmaCausalLM(
+    gemma_lm = keras_hub.models.GemmaCausalLM(
         backbone=backbone,
         preprocessor=preprocessor,
     )
@@ -370,7 +370,7 @@ def score(
 
         Compute gradients between embeddings and loss scores with TensorFlow:
         ```python
-        gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset(
+        gemma_lm = keras_hub.models.GemmaCausalLM.from_preset(
             "gemma_2b_en"
         )
         generations = gemma_lm.generate(
diff --git a/keras_nlp/src/models/gemma/gemma_causal_lm_preprocessor.py b/keras_hub/src/models/gemma/gemma_causal_lm_preprocessor.py
similarity index 82%
rename from keras_nlp/src/models/gemma/gemma_causal_lm_preprocessor.py
rename to keras_hub/src/models/gemma/gemma_causal_lm_preprocessor.py
index 9a35f9baba..1fe5d3f0d7 100644
--- a/keras_nlp/src/models/gemma/gemma_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/gemma/gemma_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,30 +12,30 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.models.gemma.gemma_tokenizer import GemmaTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.models.gemma.gemma_tokenizer import GemmaTokenizer
 
 
-@keras_nlp_export("keras_nlp.models.GemmaCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.GemmaCausalLMPreprocessor")
 class GemmaCausalLMPreprocessor(CausalLMPreprocessor):
     """Gemma Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.GemmaCausalLM`. By default, it will take in batches of
+    `keras_hub.models.GemmaCausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.GemmaCausalLM` instance, these methods
+    is attached to a `keras_hub.models.GemmaCausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.GemmaTokenizer` instance.
+        tokenizer: A `keras_hub.models.GemmaTokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence.
@@ -53,7 +53,7 @@ class GemmaCausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.GemmaCausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.GemmaCausalLMPreprocessor.from_preset(
         "gemma_2b_en"
     )
 
diff --git a/keras_nlp/src/models/gemma/gemma_causal_lm_preprocessor_test.py b/keras_hub/src/models/gemma/gemma_causal_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/gemma/gemma_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/gemma/gemma_causal_lm_preprocessor_test.py
index 41f1577827..15253cd7a5 100644
--- a/keras_nlp/src/models/gemma/gemma_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/gemma/gemma_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.gemma.gemma_causal_lm_preprocessor import (
+from keras_hub.src.models.gemma.gemma_causal_lm_preprocessor import (
     GemmaCausalLMPreprocessor,
 )
-from keras_nlp.src.models.gemma.gemma_tokenizer import GemmaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gemma.gemma_tokenizer import GemmaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GemmaCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/gemma/gemma_causal_lm_test.py b/keras_hub/src/models/gemma/gemma_causal_lm_test.py
similarity index 97%
rename from keras_nlp/src/models/gemma/gemma_causal_lm_test.py
rename to keras_hub/src/models/gemma/gemma_causal_lm_test.py
index 42887f7518..a517e333fb 100644
--- a/keras_nlp/src/models/gemma/gemma_causal_lm_test.py
+++ b/keras_hub/src/models/gemma/gemma_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -19,13 +19,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.models.gemma.gemma_causal_lm import GemmaCausalLM
-from keras_nlp.src.models.gemma.gemma_causal_lm_preprocessor import (
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.models.gemma.gemma_causal_lm import GemmaCausalLM
+from keras_hub.src.models.gemma.gemma_causal_lm_preprocessor import (
     GemmaCausalLMPreprocessor,
 )
-from keras_nlp.src.models.gemma.gemma_tokenizer import GemmaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gemma.gemma_tokenizer import GemmaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GemmaCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/gemma/gemma_decoder_block.py b/keras_hub/src/models/gemma/gemma_decoder_block.py
similarity index 96%
rename from keras_nlp/src/models/gemma/gemma_decoder_block.py
rename to keras_hub/src/models/gemma/gemma_decoder_block.py
index 860e6a93a3..134fdc29e5 100644
--- a/keras_nlp/src/models/gemma/gemma_decoder_block.py
+++ b/keras_hub/src/models/gemma/gemma_decoder_block.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,14 +14,14 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     merge_padding_and_attention_mask,
 )
-from keras_nlp.src.models.gemma.gemma_attention import CachedGemmaAttention
-from keras_nlp.src.models.gemma.rms_normalization import RMSNormalization
+from keras_hub.src.models.gemma.gemma_attention import CachedGemmaAttention
+from keras_hub.src.models.gemma.rms_normalization import RMSNormalization
 
 
 class GemmaDecoderBlock(keras.layers.Layer):
diff --git a/keras_nlp/src/models/gemma/gemma_lora_test.py b/keras_hub/src/models/gemma/gemma_lora_test.py
similarity index 96%
rename from keras_nlp/src/models/gemma/gemma_lora_test.py
rename to keras_hub/src/models/gemma/gemma_lora_test.py
index 019922dbf0..823f0b0cc1 100644
--- a/keras_nlp/src/models/gemma/gemma_lora_test.py
+++ b/keras_hub/src/models/gemma/gemma_lora_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 
 import numpy as np
 
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GemmaLoraTest(TestCase):
diff --git a/keras_nlp/src/models/gemma/gemma_presets.py b/keras_hub/src/models/gemma/gemma_presets.py
similarity index 99%
rename from keras_nlp/src/models/gemma/gemma_presets.py
rename to keras_hub/src/models/gemma/gemma_presets.py
index 1a9ad13b0e..2ea4bca646 100644
--- a/keras_nlp/src/models/gemma/gemma_presets.py
+++ b/keras_hub/src/models/gemma/gemma_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/gemma/gemma_tokenizer.py b/keras_hub/src/models/gemma/gemma_tokenizer.py
similarity index 84%
rename from keras_nlp/src/models/gemma/gemma_tokenizer.py
rename to keras_hub/src/models/gemma/gemma_tokenizer.py
index b66fc8df68..18e90423a3 100644
--- a/keras_nlp/src/models/gemma/gemma_tokenizer.py
+++ b/keras_hub/src/models/gemma/gemma_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,24 +12,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.GemmaTokenizer",
-        "keras_nlp.models.GemmaTokenizer",
+        "keras_hub.tokenizers.GemmaTokenizer",
+        "keras_hub.models.GemmaTokenizer",
     ]
 )
 class GemmaTokenizer(SentencePieceTokenizer):
     """Gemma tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     Gemma models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a Gemma preset.
@@ -50,7 +50,7 @@ class GemmaTokenizer(SentencePieceTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.GemmaTokenizer.from_preset("gemma_2b_en")
+    tokenizer = keras_hub.models.GemmaTokenizer.from_preset("gemma_2b_en")
     tokenizer("The quick brown fox jumped.")
 
     # Batched input.
@@ -76,7 +76,7 @@ class GemmaTokenizer(SentencePieceTokenizer):
         eos_piece="<eos>",
         unk_piece="<unk>",
     )
-    tokenizer = keras_nlp.models.GemmaTokenizer(
+    tokenizer = keras_hub.models.GemmaTokenizer(
         proto=bytes_io.getvalue(),
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/gemma/gemma_tokenizer_test.py b/keras_hub/src/models/gemma/gemma_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/gemma/gemma_tokenizer_test.py
rename to keras_hub/src/models/gemma/gemma_tokenizer_test.py
index 80c6158f0b..518ffdbb01 100644
--- a/keras_nlp/src/models/gemma/gemma_tokenizer_test.py
+++ b/keras_hub/src/models/gemma/gemma_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import pytest
 
-from keras_nlp.src.models.gemma.gemma_tokenizer import GemmaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gemma.gemma_tokenizer import GemmaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GemmaTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/gemma/rms_normalization.py b/keras_hub/src/models/gemma/rms_normalization.py
similarity index 97%
rename from keras_nlp/src/models/gemma/rms_normalization.py
rename to keras_hub/src/models/gemma/rms_normalization.py
index d82ab53c7d..cdf09572ea 100644
--- a/keras_nlp/src/models/gemma/rms_normalization.py
+++ b/keras_hub/src/models/gemma/rms_normalization.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/gpt2/__init__.py b/keras_hub/src/models/gpt2/__init__.py
similarity index 72%
rename from keras_nlp/src/models/gpt2/__init__.py
rename to keras_hub/src/models/gpt2/__init__.py
index afffa0e825..2d6f2e8156 100644
--- a/keras_nlp/src/models/gpt2/__init__.py
+++ b/keras_hub/src/models/gpt2/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.models.gpt2.gpt2_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.models.gpt2.gpt2_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, GPT2Backbone)
diff --git a/keras_nlp/src/models/gpt2/gpt2_backbone.py b/keras_hub/src/models/gpt2/gpt2_backbone.py
similarity index 92%
rename from keras_nlp/src/models/gpt2/gpt2_backbone.py
rename to keras_hub/src/models/gpt2/gpt2_backbone.py
index c704c726a6..feed682afc 100644
--- a/keras_nlp/src/models/gpt2/gpt2_backbone.py
+++ b/keras_hub/src/models/gpt2/gpt2_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,21 +15,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_decoder import TransformerDecoder
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.layers.modeling.transformer_decoder import TransformerDecoder
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
 def _gpt_2_kernel_initializer(stddev=0.02):
     return keras.initializers.RandomNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.GPT2Backbone")
+@keras_hub_export("keras_hub.models.GPT2Backbone")
 class GPT2Backbone(Backbone):
     """GPT-2 core network with hyperparameters.
 
@@ -74,11 +74,11 @@ class GPT2Backbone(Backbone):
     }
 
     # Pretrained GPT-2 decoder.
-    model = keras_nlp.models.GPT2Backbone.from_preset("gpt2_base_en")
+    model = keras_hub.models.GPT2Backbone.from_preset("gpt2_base_en")
     model(input_data)
 
     # Randomly initialized GPT-2 decoder with custom config.
-    model = keras_nlp.models.GPT2Backbone(
+    model = keras_hub.models.GPT2Backbone(
         vocabulary_size=50257,
         num_layers=12,
         num_heads=12,
diff --git a/keras_nlp/src/models/gpt2/gpt2_backbone_test.py b/keras_hub/src/models/gpt2/gpt2_backbone_test.py
similarity index 93%
rename from keras_nlp/src/models/gpt2/gpt2_backbone_test.py
rename to keras_hub/src/models/gpt2/gpt2_backbone_test.py
index 1240fc06fe..504dcbfe36 100644
--- a/keras_nlp/src/models/gpt2/gpt2_backbone_test.py
+++ b/keras_hub/src/models/gpt2/gpt2_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPT2BackboneTest(TestCase):
diff --git a/keras_nlp/src/models/gpt2/gpt2_causal_lm.py b/keras_hub/src/models/gpt2/gpt2_causal_lm.py
similarity index 93%
rename from keras_nlp/src/models/gpt2/gpt2_causal_lm.py
rename to keras_hub/src/models/gpt2/gpt2_causal_lm.py
index 21cd6f3893..43eaec8c2d 100644
--- a/keras_nlp/src/models/gpt2/gpt2_causal_lm.py
+++ b/keras_hub/src/models/gpt2/gpt2_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,16 +16,16 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.models.gpt2.gpt2_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.models.gpt2.gpt2_causal_lm_preprocessor import (
     GPT2CausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.GPT2CausalLM")
+@keras_hub_export("keras_hub.models.GPT2CausalLM")
 class GPT2CausalLM(CausalLM):
     """An end-to-end GPT2 model for causal language modeling.
 
@@ -38,7 +38,7 @@ class GPT2CausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"top_k"` sampling will be used.
 
     This model can optionally be configured with a `preprocessor` layer, in
@@ -52,8 +52,8 @@ class GPT2CausalLM(CausalLM):
     [here](https://github.com/openai/gpt-2).
 
     Args:
-        backbone: A `keras_nlp.models.GPT2Backbone` instance.
-        preprocessor: A `keras_nlp.models.GPT2CausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.GPT2Backbone` instance.
+        preprocessor: A `keras_hub.models.GPT2CausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
 
@@ -61,7 +61,7 @@ class GPT2CausalLM(CausalLM):
 
     Use `generate()` to do text generation.
     ```python
-    gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    gpt2_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
     gpt2_lm.generate("I want to say", max_length=30)
 
     # Generate with batched prompts.
@@ -70,11 +70,11 @@ class GPT2CausalLM(CausalLM):
 
     Compile the `generate()` function with a custom sampler.
     ```python
-    gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    gpt2_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
     gpt2_lm.compile(sampler="greedy")
     gpt2_lm.generate("I want to say", max_length=30)
 
-    gpt2_lm.compile(sampler=keras_nlp.samplers.BeamSampler(num_beams=2))
+    gpt2_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
     gpt2_lm.generate("I want to say", max_length=30)
     ```
 
@@ -87,7 +87,7 @@ class GPT2CausalLM(CausalLM):
         "padding_mask": np.array([[1, 1, 0, 0, 0]] * 2),
     }
 
-    gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset(
+    gpt2_lm = keras_hub.models.GPT2CausalLM.from_preset(
         "gpt2_base_en",
         preprocessor=None,
     )
@@ -97,7 +97,7 @@ class GPT2CausalLM(CausalLM):
     Call `fit()` on a single batch.
     ```python
     features = ["The quick brown fox jumped.", "I forgot my homework."]
-    gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    gpt2_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
     gpt2_lm.fit(x=features, batch_size=2)
     ```
 
@@ -110,7 +110,7 @@ class GPT2CausalLM(CausalLM):
     y = np.array([[1, 2, 3, 4, 50256]] * 2)
     sw = np.array([[1, 1, 1, 1, 1]] * 2)
 
-    gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset(
+    gpt2_lm = keras_hub.models.GPT2CausalLM.from_preset(
         "gpt2_base_en",
         preprocessor=None,
     )
@@ -124,15 +124,15 @@ class GPT2CausalLM(CausalLM):
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
 
-    tokenizer = keras_nlp.models.GPT2Tokenizer(
+    tokenizer = keras_hub.models.GPT2Tokenizer(
         vocabulary=vocab,
         merges=merges,
     )
-    preprocessor = keras_nlp.models.GPT2CausalLMPreprocessor(
+    preprocessor = keras_hub.models.GPT2CausalLMPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.GPT2Backbone(
+    backbone = keras_hub.models.GPT2Backbone(
         vocabulary_size=30552,
         num_layers=4,
         num_heads=4,
@@ -140,7 +140,7 @@ class GPT2CausalLM(CausalLM):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    gpt2_lm = keras_nlp.models.GPT2CausalLM(
+    gpt2_lm = keras_hub.models.GPT2CausalLM(
         backbone=backbone,
         preprocessor=preprocessor,
     )
@@ -358,7 +358,7 @@ def score(
 
         Compute gradients between embeddings and loss scores with TensorFlow:
         ```python
-        gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+        gpt2_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
         generations = gpt2_lm.generate(
             ["This is a", "Where are you"],
             max_length=30
diff --git a/keras_nlp/src/models/gpt2/gpt2_causal_lm_preprocessor.py b/keras_hub/src/models/gpt2/gpt2_causal_lm_preprocessor.py
similarity index 83%
rename from keras_nlp/src/models/gpt2/gpt2_causal_lm_preprocessor.py
rename to keras_hub/src/models/gpt2/gpt2_causal_lm_preprocessor.py
index 1855031706..3e51d7cf49 100644
--- a/keras_nlp/src/models/gpt2/gpt2_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/gpt2/gpt2_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,30 +12,30 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
 
 
-@keras_nlp_export("keras_nlp.models.GPT2CausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.GPT2CausalLMPreprocessor")
 class GPT2CausalLMPreprocessor(CausalLMPreprocessor):
     """GPT2 Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.GPT2CausalLM`. By default, it will take in batches of
+    `keras_hub.models.GPT2CausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.GPT2CausalLM` instance, these methods
+    is attached to a `keras_hub.models.GPT2CausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.GPT2Tokenizer` instance.
+        tokenizer: A `keras_hub.models.GPT2Tokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence.
@@ -53,7 +53,7 @@ class GPT2CausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.GPT2CausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.GPT2CausalLMPreprocessor.from_preset(
         "gpt2_base_en"
     )
 
diff --git a/keras_nlp/src/models/gpt2/gpt2_causal_lm_preprocessor_test.py b/keras_hub/src/models/gpt2/gpt2_causal_lm_preprocessor_test.py
similarity index 94%
rename from keras_nlp/src/models/gpt2/gpt2_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/gpt2/gpt2_causal_lm_preprocessor_test.py
index 6dfacbd912..df6ef9b96e 100644
--- a/keras_nlp/src/models/gpt2/gpt2_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/gpt2/gpt2_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.gpt2.gpt2_causal_lm_preprocessor import (
+from keras_hub.src.models.gpt2.gpt2_causal_lm_preprocessor import (
     GPT2CausalLMPreprocessor,
 )
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPT2CausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/gpt2/gpt2_causal_lm_test.py b/keras_hub/src/models/gpt2/gpt2_causal_lm_test.py
similarity index 95%
rename from keras_nlp/src/models/gpt2/gpt2_causal_lm_test.py
rename to keras_hub/src/models/gpt2/gpt2_causal_lm_test.py
index 2ec9863e49..ba3d9b404b 100644
--- a/keras_nlp/src/models/gpt2/gpt2_causal_lm_test.py
+++ b/keras_hub/src/models/gpt2/gpt2_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
-from keras_nlp.src.models.gpt2.gpt2_causal_lm_preprocessor import (
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
+from keras_hub.src.models.gpt2.gpt2_causal_lm_preprocessor import (
     GPT2CausalLMPreprocessor,
 )
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPT2CausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/gpt2/gpt2_preprocessor.py b/keras_hub/src/models/gpt2/gpt2_preprocessor.py
similarity index 87%
rename from keras_nlp/src/models/gpt2/gpt2_preprocessor.py
rename to keras_hub/src/models/gpt2/gpt2_preprocessor.py
index 720b052568..8ac34f2287 100644
--- a/keras_nlp/src/models/gpt2/gpt2_preprocessor.py
+++ b/keras_hub/src/models/gpt2/gpt2_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,15 +15,15 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.GPT2Preprocessor")
+@keras_hub_export("keras_hub.models.GPT2Preprocessor")
 class GPT2Preprocessor(Preprocessor):
     """Legacy preprocessing layer for GPT2.
 
diff --git a/keras_nlp/src/models/gpt2/gpt2_preprocessor_test.py b/keras_hub/src/models/gpt2/gpt2_preprocessor_test.py
similarity index 92%
rename from keras_nlp/src/models/gpt2/gpt2_preprocessor_test.py
rename to keras_hub/src/models/gpt2/gpt2_preprocessor_test.py
index 3362831576..f917ce4f99 100644
--- a/keras_nlp/src/models/gpt2/gpt2_preprocessor_test.py
+++ b/keras_hub/src/models/gpt2/gpt2_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,9 +14,9 @@
 
 import pytest
 
-from keras_nlp.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPT2PreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/gpt2/gpt2_presets.py b/keras_hub/src/models/gpt2/gpt2_presets.py
similarity index 98%
rename from keras_nlp/src/models/gpt2/gpt2_presets.py
rename to keras_hub/src/models/gpt2/gpt2_presets.py
index 92b699ee49..86331210f3 100644
--- a/keras_nlp/src/models/gpt2/gpt2_presets.py
+++ b/keras_hub/src/models/gpt2/gpt2_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/gpt2/gpt2_tokenizer.py b/keras_hub/src/models/gpt2/gpt2_tokenizer.py
similarity index 83%
rename from keras_nlp/src/models/gpt2/gpt2_tokenizer.py
rename to keras_hub/src/models/gpt2/gpt2_tokenizer.py
index a55e86d716..71ef409453 100644
--- a/keras_nlp/src/models/gpt2/gpt2_tokenizer.py
+++ b/keras_hub/src/models/gpt2/gpt2_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,22 +13,22 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.GPT2Tokenizer",
-        "keras_nlp.models.GPT2Tokenizer",
+        "keras_hub.tokenizers.GPT2Tokenizer",
+        "keras_hub.models.GPT2Tokenizer",
     ]
 )
 class GPT2Tokenizer(BytePairTokenizer):
     """A GPT-2 tokenizer using Byte-Pair Encoding subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.BytePairTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by GPT-2
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a GPT-2 preset.
@@ -51,7 +51,7 @@ class GPT2Tokenizer(BytePairTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.GPT2Tokenizer.from_preset("gpt2_base_en")
+    tokenizer = keras_hub.models.GPT2Tokenizer.from_preset("gpt2_base_en")
     tokenizer("The quick brown fox jumped.")
 
     # Batched input.
@@ -64,7 +64,7 @@ class GPT2Tokenizer(BytePairTokenizer):
     vocab = {"<|endoftext|>": 0, "a": 4, "Ġquick": 5, "Ġfox": 6}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.GPT2Tokenizer(vocabulary=vocab, merges=merges)
+    tokenizer = keras_hub.models.GPT2Tokenizer(vocabulary=vocab, merges=merges)
     tokenizer("a quick fox.")
     ```
     """
diff --git a/keras_nlp/src/models/gpt2/gpt2_tokenizer_test.py b/keras_hub/src/models/gpt2/gpt2_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/gpt2/gpt2_tokenizer_test.py
rename to keras_hub/src/models/gpt2/gpt2_tokenizer_test.py
index 818393ae98..3131e7a93e 100644
--- a/keras_nlp/src/models/gpt2/gpt2_tokenizer_test.py
+++ b/keras_hub/src/models/gpt2/gpt2_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPT2TokenizerTest(TestCase):
diff --git a/keras_hub/src/models/gpt_neo_x/__init__.py b/keras_hub/src/models/gpt_neo_x/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/gpt_neo_x/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_attention.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_attention.py
similarity index 98%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_attention.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_attention.py
index c2f5bf4d61..fbbe1362b0 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_attention.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class GPTNeoXAttention(keras.layers.Layer):
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_backbone.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_backbone.py
similarity index 94%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_backbone.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_backbone.py
index cbf4da42dd..d16efbb0e9 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_backbone.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,20 +14,20 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_decoder import GPTNeoXDecoder
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_decoder import GPTNeoXDecoder
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
 def _gpt_neo_x_kernel_initializer(stddev=0.02):
     return keras.initializers.RandomNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.GPTNeoXBackbone")
+@keras_hub_export("keras_hub.models.GPTNeoXBackbone")
 class GPTNeoXBackbone(Backbone):
     """GPT-NeoX core network with hyperparameters.
 
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_backbone_test.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_backbone_test.py
similarity index 90%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_backbone_test.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_backbone_test.py
index 823f927bd7..c1105fc9c3 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_backbone_test.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPTNeoXBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm.py
similarity index 93%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm.py
index 49186879d2..d512624348 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,16 +14,16 @@
 
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
     GPTNeoXCausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.GPTNeoXCausalLM")
+@keras_hub_export("keras_hub.models.GPTNeoXCausalLM")
 class GPTNeoXCausalLM(CausalLM):
     """An end-to-end GPTNeoX model for causal language modeling.
 
@@ -36,12 +36,12 @@ class GPTNeoXCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"top_k"` sampling will be used.
 
     Args:
-        backbone: A `keras_nlp.models.GPTNeoXBackbone` instance.
-        preprocessor: A `keras_nlp.models.GPTNeoXCausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.GPTNeoXBackbone` instance.
+        preprocessor: A `keras_hub.models.GPTNeoXCausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
     """
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor.py
similarity index 78%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor.py
index a3c5efe4b7..489223fc26 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,30 +12,30 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
 
 
-@keras_nlp_export("keras_nlp.models.GPTNeoXCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.GPTNeoXCausalLMPreprocessor")
 class GPTNeoXCausalLMPreprocessor(CausalLMPreprocessor):
     """GPT-NeoX Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.GPTNeoXCausalLM`. By default, it will take in batches of
+    `keras_hub.models.GPTNeoXCausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.GPTNeoXCausalLM` instance, these methods
+    is attached to a `keras_hub.models.GPTNeoXCausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.GPTNeoXTokenizer` instance.
+        tokenizer: A `keras_hub.models.GPTNeoXTokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence.
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor_test.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor_test.py
index d32fe7764e..635d74cd8f 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 from keras import ops
 
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
     GPTNeoXCausalLMPreprocessor,
 )
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPTNeoXCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_test.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_test.py
similarity index 93%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_test.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_test.py
index db2cf4ae2e..f1456489ec 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_causal_lm_test.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_causal_lm import GPTNeoXCausalLM
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_causal_lm import GPTNeoXCausalLM
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
     GPTNeoXCausalLMPreprocessor,
 )
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPTNeoXCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_decoder.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_decoder.py
similarity index 97%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_decoder.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_decoder.py
index 150951336a..c8029ce4c2 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_decoder.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,14 +15,14 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     merge_padding_and_attention_mask,
 )
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_attention import GPTNeoXAttention
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_attention import GPTNeoXAttention
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class GPTNeoXDecoder(keras.layers.Layer):
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_tokenizer.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_tokenizer.py
similarity index 84%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_tokenizer.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_tokenizer.py
index 1fcb44b539..55d3a3668f 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_tokenizer.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,22 +12,22 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.GPTNeoXTokenizer",
-        "keras_nlp.models.GPTNeoXTokenizer",
+        "keras_hub.tokenizers.GPTNeoXTokenizer",
+        "keras_hub.models.GPTNeoXTokenizer",
     ]
 )
 class GPTNeoXTokenizer(BytePairTokenizer):
     """A GPTNeoX tokenizer using Byte-Pair Encoding subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.BytePairTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by GPTNeoX
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a GPTNeoX preset.
diff --git a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_tokenizer_test.py b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_tokenizer_test.py
similarity index 91%
rename from keras_nlp/src/models/gpt_neo_x/gpt_neo_x_tokenizer_test.py
rename to keras_hub/src/models/gpt_neo_x/gpt_neo_x_tokenizer_test.py
index e3384cc3ae..0296d8f58a 100644
--- a/keras_nlp/src/models/gpt_neo_x/gpt_neo_x_tokenizer_test.py
+++ b/keras_hub/src/models/gpt_neo_x/gpt_neo_x_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GPTNeoXTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/image_classifier.py b/keras_hub/src/models/image_classifier.py
similarity index 91%
rename from keras_nlp/src/models/image_classifier.py
rename to keras_hub/src/models/image_classifier.py
index 0606a29821..189ecec270 100644
--- a/keras_nlp/src/models/image_classifier.py
+++ b/keras_hub/src/models/image_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,16 +13,16 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.task import Task
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.task import Task
 
 
-@keras_nlp_export("keras_nlp.models.ImageClassifier")
+@keras_hub_export("keras_hub.models.ImageClassifier")
 class ImageClassifier(Task):
     """Base class for all image classification tasks.
 
-    `ImageClassifier` tasks wrap a `keras_nlp.models.Backbone` and
-    a `keras_nlp.models.Preprocessor` to create a model that can be used for
+    `ImageClassifier` tasks wrap a `keras_hub.models.Backbone` and
+    a `keras_hub.models.Preprocessor` to create a model that can be used for
     image classification. `ImageClassifier` tasks take an additional
     `num_classes` argument, controlling the number of predicted output classes.
 
diff --git a/keras_nlp/src/models/image_classifier_preprocessor.py b/keras_hub/src/models/image_classifier_preprocessor.py
similarity index 85%
rename from keras_nlp/src/models/image_classifier_preprocessor.py
rename to keras_hub/src/models/image_classifier_preprocessor.py
index c354169893..bf171d16fa 100644
--- a/keras_nlp/src/models/image_classifier_preprocessor.py
+++ b/keras_hub/src/models/image_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,19 +13,19 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.ImageClassifierPreprocessor")
+@keras_hub_export("keras_hub.models.ImageClassifierPreprocessor")
 class ImageClassifierPreprocessor(Preprocessor):
     """Base class for image classification preprocessing layers.
 
     `ImageClassifierPreprocessor` tasks wraps a
-    `keras_nlp.layers.ImageConverter` to create a preprocessing layer for
+    `keras_hub.layers.ImageConverter` to create a preprocessing layer for
     image classification tasks. It is intended to be paired with a
-    `keras_nlp.models.ImageClassifier` task.
+    `keras_hub.models.ImageClassifier` task.
 
     All `ImageClassifierPreprocessor` take inputs three inputs, `x`, `y`, and
     `sample_weight`. `x`, the first input, should always be included. It can
@@ -46,7 +46,7 @@ class ImageClassifierPreprocessor(Preprocessor):
 
     Examples.
     ```python
-    preprocessor = keras_nlp.models.ImageClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.ImageClassifierPreprocessor.from_preset(
         "resnet_50",
     )
 
diff --git a/keras_nlp/src/models/llama/__init__.py b/keras_hub/src/models/llama/__init__.py
similarity index 72%
rename from keras_nlp/src/models/llama/__init__.py
rename to keras_hub/src/models/llama/__init__.py
index 31e189db1a..cd47190f14 100644
--- a/keras_nlp/src/models/llama/__init__.py
+++ b/keras_hub/src/models/llama/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
-from keras_nlp.src.models.llama.llama_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.models.llama.llama_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, LlamaBackbone)
diff --git a/keras_nlp/src/models/llama/llama_attention.py b/keras_hub/src/models/llama/llama_attention.py
similarity index 97%
rename from keras_nlp/src/models/llama/llama_attention.py
rename to keras_hub/src/models/llama/llama_attention.py
index 858d4076b1..80e81f8687 100644
--- a/keras_nlp/src/models/llama/llama_attention.py
+++ b/keras_hub/src/models/llama/llama_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class LlamaAttention(keras.layers.Layer):
diff --git a/keras_nlp/src/models/llama/llama_backbone.py b/keras_hub/src/models/llama/llama_backbone.py
similarity index 93%
rename from keras_nlp/src/models/llama/llama_backbone.py
rename to keras_hub/src/models/llama/llama_backbone.py
index 9165d718f2..616b377f23 100644
--- a/keras_nlp/src/models/llama/llama_backbone.py
+++ b/keras_hub/src/models/llama/llama_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,20 +14,20 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.llama.llama_decoder import LlamaTransformerDecoder
-from keras_nlp.src.models.llama.llama_layernorm import LlamaLayerNorm
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.llama.llama_decoder import LlamaTransformerDecoder
+from keras_hub.src.models.llama.llama_layernorm import LlamaLayerNorm
 
 
 def _llama_kernel_initializer(stddev=0.02):
     return keras.initializers.RandomNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.LlamaBackbone")
+@keras_hub_export("keras_hub.models.LlamaBackbone")
 class LlamaBackbone(Backbone):
     """
     The Llama Transformer core architecture with hyperparameters.
@@ -72,11 +72,11 @@ class LlamaBackbone(Backbone):
     }
 
     # Pretrained Llama decoder.
-    model = keras_nlp.models.LlamaBackbone.from_preset("llama7b_base_en")
+    model = keras_hub.models.LlamaBackbone.from_preset("llama7b_base_en")
     model(input_data)
 
     # Randomly initialized Llama decoder with custom config.
-    model = keras_nlp.models.LlamaBackbone(
+    model = keras_hub.models.LlamaBackbone(
         vocabulary_size=10,
         hidden_dim=512,
         num_layers=2,
diff --git a/keras_nlp/src/models/llama/llama_backbone_test.py b/keras_hub/src/models/llama/llama_backbone_test.py
similarity index 94%
rename from keras_nlp/src/models/llama/llama_backbone_test.py
rename to keras_hub/src/models/llama/llama_backbone_test.py
index b75bb5eb63..2b11994141 100644
--- a/keras_nlp/src/models/llama/llama_backbone_test.py
+++ b/keras_hub/src/models/llama/llama_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class LlamaTest(TestCase):
diff --git a/keras_nlp/src/models/llama/llama_causal_lm.py b/keras_hub/src/models/llama/llama_causal_lm.py
similarity index 95%
rename from keras_nlp/src/models/llama/llama_causal_lm.py
rename to keras_hub/src/models/llama/llama_causal_lm.py
index b2a5fecc8b..150a6b0d12 100644
--- a/keras_nlp/src/models/llama/llama_causal_lm.py
+++ b/keras_hub/src/models/llama/llama_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,16 +14,16 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
-from keras_nlp.src.models.llama.llama_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.models.llama.llama_causal_lm_preprocessor import (
     LlamaCausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.LlamaCausalLM")
+@keras_hub_export("keras_hub.models.LlamaCausalLM")
 class LlamaCausalLM(CausalLM):
     """An end-to-end Llama model for causal language modeling.
 
@@ -36,12 +36,12 @@ class LlamaCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"top_k"` sampling will be used.
 
     Args:
-        backbone: A `keras_nlp.models.LlamaBackbone` instance.
-        preprocessor: A `keras_nlp.models.LlamaCausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.LlamaBackbone` instance.
+        preprocessor: A `keras_hub.models.LlamaCausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
     """
@@ -253,7 +253,7 @@ def score(
 
         Compute gradients between embeddings and loss scores with TensorFlow:
         ```python
-        llama_lm = keras_nlp.models.LlamaCausalLM.from_preset("llama2_7b_en")
+        llama_lm = keras_hub.models.LlamaCausalLM.from_preset("llama2_7b_en")
         generations = llama_lm.generate(
             ["This is a", "Where are you"],
             max_length=30
diff --git a/keras_nlp/src/models/llama/llama_causal_lm_preprocessor.py b/keras_hub/src/models/llama/llama_causal_lm_preprocessor.py
similarity index 83%
rename from keras_nlp/src/models/llama/llama_causal_lm_preprocessor.py
rename to keras_hub/src/models/llama/llama_causal_lm_preprocessor.py
index ac7c444bbd..efb664ed68 100644
--- a/keras_nlp/src/models/llama/llama_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/llama/llama_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,30 +13,30 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
-from keras_nlp.src.models.llama.llama_tokenizer import LlamaTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.models.llama.llama_tokenizer import LlamaTokenizer
 
 
-@keras_nlp_export("keras_nlp.models.LlamaCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.LlamaCausalLMPreprocessor")
 class LlamaCausalLMPreprocessor(CausalLMPreprocessor):
     """Llama Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.LlamaCausalLM`. By default, it will take in batches of
+    `keras_hub.models.LlamaCausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.LlamaCausalLM` instance, these methods
+    is attached to a `keras_hub.models.LlamaCausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.LlamaTokenizer` instance.
+        tokenizer: A `keras_hub.models.LlamaTokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence. Default is `True`.
@@ -54,7 +54,7 @@ class LlamaCausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.LlamaCausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.LlamaCausalLMPreprocessor.from_preset(
         "llama_base_en"
     )
 
diff --git a/keras_nlp/src/models/llama/llama_causal_lm_preprocessor_test.py b/keras_hub/src/models/llama/llama_causal_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/llama/llama_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/llama/llama_causal_lm_preprocessor_test.py
index 5cb902baed..f2ac0494b1 100644
--- a/keras_nlp/src/models/llama/llama_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/llama/llama_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.llama.llama_causal_lm_preprocessor import (
+from keras_hub.src.models.llama.llama_causal_lm_preprocessor import (
     LlamaCausalLMPreprocessor,
 )
-from keras_nlp.src.models.llama.llama_tokenizer import LlamaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.llama.llama_tokenizer import LlamaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class LlamaCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/llama/llama_causal_lm_test.py b/keras_hub/src/models/llama/llama_causal_lm_test.py
similarity index 95%
rename from keras_nlp/src/models/llama/llama_causal_lm_test.py
rename to keras_hub/src/models/llama/llama_causal_lm_test.py
index 0fac88ea03..c222942823 100644
--- a/keras_nlp/src/models/llama/llama_causal_lm_test.py
+++ b/keras_hub/src/models/llama/llama_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,13 +18,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
-from keras_nlp.src.models.llama.llama_causal_lm import LlamaCausalLM
-from keras_nlp.src.models.llama.llama_causal_lm_preprocessor import (
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.models.llama.llama_causal_lm import LlamaCausalLM
+from keras_hub.src.models.llama.llama_causal_lm_preprocessor import (
     LlamaCausalLMPreprocessor,
 )
-from keras_nlp.src.models.llama.llama_tokenizer import LlamaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.llama.llama_tokenizer import LlamaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class LlamaCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/llama/llama_decoder.py b/keras_hub/src/models/llama/llama_decoder.py
similarity index 96%
rename from keras_nlp/src/models/llama/llama_decoder.py
rename to keras_hub/src/models/llama/llama_decoder.py
index 2f509298da..b15b26bb09 100644
--- a/keras_nlp/src/models/llama/llama_decoder.py
+++ b/keras_hub/src/models/llama/llama_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,15 +14,15 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     merge_padding_and_attention_mask,
 )
-from keras_nlp.src.models.llama.llama_attention import LlamaAttention
-from keras_nlp.src.models.llama.llama_layernorm import LlamaLayerNorm
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.models.llama.llama_attention import LlamaAttention
+from keras_hub.src.models.llama.llama_layernorm import LlamaLayerNorm
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class LlamaTransformerDecoder(keras.layers.Layer):
diff --git a/keras_nlp/src/models/llama/llama_layernorm.py b/keras_hub/src/models/llama/llama_layernorm.py
similarity index 97%
rename from keras_nlp/src/models/llama/llama_layernorm.py
rename to keras_hub/src/models/llama/llama_layernorm.py
index 6c98bd3c36..fe7ea03712 100644
--- a/keras_nlp/src/models/llama/llama_layernorm.py
+++ b/keras_hub/src/models/llama/llama_layernorm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/llama/llama_presets.py b/keras_hub/src/models/llama/llama_presets.py
similarity index 98%
rename from keras_nlp/src/models/llama/llama_presets.py
rename to keras_hub/src/models/llama/llama_presets.py
index ea8a361231..0403bb6dd7 100644
--- a/keras_nlp/src/models/llama/llama_presets.py
+++ b/keras_hub/src/models/llama/llama_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/llama/llama_tokenizer.py b/keras_hub/src/models/llama/llama_tokenizer.py
similarity index 82%
rename from keras_nlp/src/models/llama/llama_tokenizer.py
rename to keras_hub/src/models/llama/llama_tokenizer.py
index ad2240bda9..7b84fe7d40 100644
--- a/keras_nlp/src/models/llama/llama_tokenizer.py
+++ b/keras_hub/src/models/llama/llama_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,24 +12,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.LlamaTokenizer",
-        "keras_nlp.models.LlamaTokenizer",
+        "keras_hub.tokenizers.LlamaTokenizer",
+        "keras_hub.models.LlamaTokenizer",
     ]
 )
 class LlamaTokenizer(SentencePieceTokenizer):
     """Llama tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     Llama models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a Llama preset.
@@ -49,7 +49,7 @@ class LlamaTokenizer(SentencePieceTokenizer):
     Examples:
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.LlamaTokenizer.from_preset(
+    tokenizer = keras_hub.models.LlamaTokenizer.from_preset(
         "llama_7b_en",
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/llama/llama_tokenizer_test.py b/keras_hub/src/models/llama/llama_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/llama/llama_tokenizer_test.py
rename to keras_hub/src/models/llama/llama_tokenizer_test.py
index 6a4e98174d..caa2b958f7 100644
--- a/keras_nlp/src/models/llama/llama_tokenizer_test.py
+++ b/keras_hub/src/models/llama/llama_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import pytest
 
-from keras_nlp.src.models.llama.llama_tokenizer import LlamaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.llama.llama_tokenizer import LlamaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class LlamaTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/llama3/__init__.py b/keras_hub/src/models/llama3/__init__.py
similarity index 72%
rename from keras_nlp/src/models/llama3/__init__.py
rename to keras_hub/src/models/llama3/__init__.py
index 7fd8bf1e35..8b9eb47375 100644
--- a/keras_nlp/src/models/llama3/__init__.py
+++ b/keras_hub/src/models/llama3/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.models.llama3.llama3_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.models.llama3.llama3_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, Llama3Backbone)
diff --git a/keras_nlp/src/models/llama3/llama3_backbone.py b/keras_hub/src/models/llama3/llama3_backbone.py
similarity index 90%
rename from keras_nlp/src/models/llama3/llama3_backbone.py
rename to keras_hub/src/models/llama3/llama3_backbone.py
index 52255fb138..10d79612da 100644
--- a/keras_nlp/src/models/llama3/llama3_backbone.py
+++ b/keras_hub/src/models/llama3/llama3_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,13 +12,13 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.llama.llama_backbone import LlamaBackbone
 
 
 # LLaMA 3 shares the same architecture as its predecessors
 # So, we simply create an alias for API consistency
-@keras_nlp_export("keras_nlp.models.Llama3Backbone")
+@keras_hub_export("keras_hub.models.Llama3Backbone")
 class Llama3Backbone(LlamaBackbone):
     """
     The Llama Transformer core architecture with hyperparameters.
@@ -63,11 +63,11 @@ class Llama3Backbone(LlamaBackbone):
     }
 
     # Pretrained Llama decoder.
-    model = keras_nlp.models.Llama3Backbone.from_preset("llama3_8b_en")
+    model = keras_hub.models.Llama3Backbone.from_preset("llama3_8b_en")
     model(input_data)
 
     # Randomly initialized Llama decoder with custom config.
-    model = keras_nlp.models.Llama3Backbone(
+    model = keras_hub.models.Llama3Backbone(
         vocabulary_size=10,
         hidden_dim=512,
         num_layers=2,
diff --git a/keras_nlp/src/models/llama3/llama3_causal_lm.py b/keras_hub/src/models/llama3/llama3_causal_lm.py
similarity index 75%
rename from keras_nlp/src/models/llama3/llama3_causal_lm.py
rename to keras_hub/src/models/llama3/llama3_causal_lm.py
index 16b76103fa..eb79716512 100644
--- a/keras_nlp/src/models/llama3/llama3_causal_lm.py
+++ b/keras_hub/src/models/llama3/llama3_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,15 +11,15 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.models.llama3.llama3_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.models.llama3.llama3_causal_lm_preprocessor import (
     Llama3CausalLMPreprocessor,
 )
-from keras_nlp.src.models.llama.llama_causal_lm import LlamaCausalLM
+from keras_hub.src.models.llama.llama_causal_lm import LlamaCausalLM
 
 
-@keras_nlp_export("keras_nlp.models.Llama3CausalLM")
+@keras_hub_export("keras_hub.models.Llama3CausalLM")
 class Llama3CausalLM(LlamaCausalLM):
     """An end-to-end Llama 3 model for causal language modeling.
 
@@ -32,12 +32,12 @@ class Llama3CausalLM(LlamaCausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"top_k"` sampling will be used.
 
     Args:
-        backbone: A `keras_nlp.models.Llama3Backbone` instance.
-        preprocessor: A `keras_nlp.models.Llama3CausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.Llama3Backbone` instance.
+        preprocessor: A `keras_hub.models.Llama3CausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
     """
diff --git a/keras_nlp/src/models/llama3/llama3_causal_lm_preprocessor.py b/keras_hub/src/models/llama3/llama3_causal_lm_preprocessor.py
similarity index 83%
rename from keras_nlp/src/models/llama3/llama3_causal_lm_preprocessor.py
rename to keras_hub/src/models/llama3/llama3_causal_lm_preprocessor.py
index 88bb06456b..4967fce67d 100644
--- a/keras_nlp/src/models/llama3/llama3_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/llama3/llama3_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,30 +12,30 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
 
 
-@keras_nlp_export("keras_nlp.models.Llama3CausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.Llama3CausalLMPreprocessor")
 class Llama3CausalLMPreprocessor(CausalLMPreprocessor):
     """Llama 3 Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.Llama3CausalLM`. By default, it will take in batches of
+    `keras_hub.models.Llama3CausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.Llama3CausalLM` instance, these methods
+    is attached to a `keras_hub.models.Llama3CausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.Llama3Tokenizer` instance.
+        tokenizer: A `keras_hub.models.Llama3Tokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence. Default is `False`.
@@ -53,7 +53,7 @@ class Llama3CausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.Llama3CausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.Llama3CausalLMPreprocessor.from_preset(
         "llama_base_en"
     )
 
diff --git a/keras_nlp/src/models/llama3/llama3_causal_lm_preprocessor_test.py b/keras_hub/src/models/llama3/llama3_causal_lm_preprocessor_test.py
similarity index 94%
rename from keras_nlp/src/models/llama3/llama3_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/llama3/llama3_causal_lm_preprocessor_test.py
index 8c7d8a15bf..be3d5963f5 100644
--- a/keras_nlp/src/models/llama3/llama3_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/llama3/llama3_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.llama3.llama3_causal_lm_preprocessor import (
+from keras_hub.src.models.llama3.llama3_causal_lm_preprocessor import (
     Llama3CausalLMPreprocessor,
 )
-from keras_nlp.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class Llama3CausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/llama3/llama3_causal_lm_test.py b/keras_hub/src/models/llama3/llama3_causal_lm_test.py
similarity index 93%
rename from keras_nlp/src/models/llama3/llama3_causal_lm_test.py
rename to keras_hub/src/models/llama3/llama3_causal_lm_test.py
index f2ffd2686b..123535413c 100644
--- a/keras_nlp/src/models/llama3/llama3_causal_lm_test.py
+++ b/keras_hub/src/models/llama3/llama3_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.models.llama3.llama3_causal_lm import Llama3CausalLM
-from keras_nlp.src.models.llama3.llama3_causal_lm_preprocessor import (
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.models.llama3.llama3_causal_lm import Llama3CausalLM
+from keras_hub.src.models.llama3.llama3_causal_lm_preprocessor import (
     Llama3CausalLMPreprocessor,
 )
-from keras_nlp.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class Llama3CausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/llama3/llama3_presets.py b/keras_hub/src/models/llama3/llama3_presets.py
similarity index 98%
rename from keras_nlp/src/models/llama3/llama3_presets.py
rename to keras_hub/src/models/llama3/llama3_presets.py
index 3648b218e0..628a61aa85 100644
--- a/keras_nlp/src/models/llama3/llama3_presets.py
+++ b/keras_hub/src/models/llama3/llama3_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/llama3/llama3_tokenizer.py b/keras_hub/src/models/llama3/llama3_tokenizer.py
similarity index 75%
rename from keras_nlp/src/models/llama3/llama3_tokenizer.py
rename to keras_hub/src/models/llama3/llama3_tokenizer.py
index b4793312b3..2dceeb8bbf 100644
--- a/keras_nlp/src/models/llama3/llama3_tokenizer.py
+++ b/keras_hub/src/models/llama3/llama3_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,15 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.Llama3Tokenizer",
-        "keras_nlp.models.Llama3Tokenizer",
+        "keras_hub.tokenizers.Llama3Tokenizer",
+        "keras_hub.models.Llama3Tokenizer",
     ]
 )
 class Llama3Tokenizer(BytePairTokenizer):
diff --git a/keras_nlp/src/models/llama3/llama3_tokenizer_test.py b/keras_hub/src/models/llama3/llama3_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/llama3/llama3_tokenizer_test.py
rename to keras_hub/src/models/llama3/llama3_tokenizer_test.py
index fe3279664d..a54337e658 100644
--- a/keras_nlp/src/models/llama3/llama3_tokenizer_test.py
+++ b/keras_hub/src/models/llama3/llama3_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class Llama3TokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/masked_lm.py b/keras_hub/src/models/masked_lm.py
similarity index 91%
rename from keras_nlp/src/models/masked_lm.py
rename to keras_hub/src/models/masked_lm.py
index 52703cdb7c..b7225da71d 100644
--- a/keras_nlp/src/models/masked_lm.py
+++ b/keras_hub/src/models/masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,16 +13,16 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.task import Task
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.task import Task
 
 
-@keras_nlp_export("keras_nlp.models.MaskedLM")
+@keras_hub_export("keras_hub.models.MaskedLM")
 class MaskedLM(Task):
     """Base class for masked language modeling tasks.
 
-    `MaskedLM` tasks wrap a `keras_nlp.models.Backbone` and
-    a `keras_nlp.models.Preprocessor` to create a model that can be used for
+    `MaskedLM` tasks wrap a `keras_hub.models.Backbone` and
+    a `keras_hub.models.Preprocessor` to create a model that can be used for
     unsupervised fine-tuning with a masked language modeling loss.
 
     When calling `fit()`, all input will be tokenized, and random tokens in
@@ -36,7 +36,7 @@ class MaskedLM(Task):
     Example:
     ```python
     # Load a Bert MaskedLM with pre-trained weights.
-    masked_lm = keras_nlp.models.MaskedLM.from_preset(
+    masked_lm = keras_hub.models.MaskedLM.from_preset(
         "bert_base_en",
     )
     masked_lm.fit(train_ds)
diff --git a/keras_nlp/src/models/masked_lm_preprocessor.py b/keras_hub/src/models/masked_lm_preprocessor.py
similarity index 91%
rename from keras_nlp/src/models/masked_lm_preprocessor.py
rename to keras_hub/src/models/masked_lm_preprocessor.py
index c4dab122ab..618c55b7b1 100644
--- a/keras_nlp/src/models/masked_lm_preprocessor.py
+++ b/keras_hub/src/models/masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,22 +13,22 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.masked_lm_mask_generator import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.masked_lm_mask_generator import (
     MaskedLMMaskGenerator,
 )
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.MaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.MaskedLMPreprocessor")
 class MaskedLMPreprocessor(Preprocessor):
     """Base class for masked language modeling preprocessing layers.
 
-    `MaskedLMPreprocessor` tasks wrap a `keras_nlp.tokenizer.Tokenizer` to
+    `MaskedLMPreprocessor` tasks wrap a `keras_hub.tokenizers.Tokenizer` to
     create a preprocessing layer for masked language modeling tasks. It is
     intended to be paired with a `keras.models.MaskedLM` task.
 
@@ -50,7 +50,7 @@ class MaskedLMPreprocessor(Preprocessor):
 
     Examples.
     ```python
-    preprocessor = keras_nlp.models.MaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.MaskedLMPreprocessor.from_preset(
         "bert_base_en_uncased",
         sequence_length=256, # Optional.
     )
diff --git a/keras_nlp/src/models/masked_lm_preprocessor_test.py b/keras_hub/src/models/masked_lm_preprocessor_test.py
similarity index 87%
rename from keras_nlp/src/models/masked_lm_preprocessor_test.py
rename to keras_hub/src/models/masked_lm_preprocessor_test.py
index 2e75ed5166..897b809b9a 100644
--- a/keras_nlp/src/models/masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.bert.bert_masked_lm_preprocessor import (
+from keras_hub.src.models.bert.bert_masked_lm_preprocessor import (
     BertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestMaskedLMPreprocessor(TestCase):
diff --git a/keras_nlp/src/models/mistral/__init__.py b/keras_hub/src/models/mistral/__init__.py
similarity index 73%
rename from keras_nlp/src/models/mistral/__init__.py
rename to keras_hub/src/models/mistral/__init__.py
index 54036b4534..d7a0fedfa4 100644
--- a/keras_nlp/src/models/mistral/__init__.py
+++ b/keras_hub/src/models/mistral/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.models.mistral.mistral_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.models.mistral.mistral_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, MistralBackbone)
diff --git a/keras_nlp/src/models/mistral/mistral_attention.py b/keras_hub/src/models/mistral/mistral_attention.py
similarity index 97%
rename from keras_nlp/src/models/mistral/mistral_attention.py
rename to keras_hub/src/models/mistral/mistral_attention.py
index f1c31762d9..925a6d36c5 100644
--- a/keras_nlp/src/models/mistral/mistral_attention.py
+++ b/keras_hub/src/models/mistral/mistral_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,12 +14,12 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 # This is just a self-attention layer in Mistral. But it can be generalized
-# to use the `keras_nlp.layers.CachedMultiHeadAttention` API. Since this layer
+# to use the `keras_hub.layers.CachedMultiHeadAttention` API. Since this layer
 # implements grouped-query attention and sliding window attention, it might be
 # useful outside of Mistral itself.
 # TODO(tirthasheshpatel): Generalize the attention layer
diff --git a/keras_nlp/src/models/mistral/mistral_backbone.py b/keras_hub/src/models/mistral/mistral_backbone.py
similarity index 93%
rename from keras_nlp/src/models/mistral/mistral_backbone.py
rename to keras_hub/src/models/mistral/mistral_backbone.py
index 80c1dee42c..952ff4f1a6 100644
--- a/keras_nlp/src/models/mistral/mistral_backbone.py
+++ b/keras_hub/src/models/mistral/mistral_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,15 +15,15 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.mistral.mistral_layer_norm import (
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.mistral.mistral_layer_norm import (
     MistralLayerNormalization,
 )
-from keras_nlp.src.models.mistral.mistral_transformer_decoder import (
+from keras_hub.src.models.mistral.mistral_transformer_decoder import (
     MistralTransformerDecoder,
 )
 
@@ -32,7 +32,7 @@ def _mistral_kernel_initializer(stddev=0.02):
     return keras.initializers.RandomNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.MistralBackbone")
+@keras_hub_export("keras_hub.models.MistralBackbone")
 class MistralBackbone(Backbone):
     """
     The Mistral Transformer core architecture with hyperparameters.
@@ -82,11 +82,11 @@ class MistralBackbone(Backbone):
     }
 
     # Pretrained Mistral decoder.
-    model = keras_nlp.models.MistralBackbone.from_preset("mistral7b_base_en")
+    model = keras_hub.models.MistralBackbone.from_preset("mistral7b_base_en")
     model(input_data)
 
     # Randomly initialized Mistral decoder with custom config.
-    model = keras_nlp.models.MistralBackbone(
+    model = keras_hub.models.MistralBackbone(
         vocabulary_size=10,
         hidden_dim=512,
         num_layers=2,
diff --git a/keras_nlp/src/models/mistral/mistral_backbone_test.py b/keras_hub/src/models/mistral/mistral_backbone_test.py
similarity index 94%
rename from keras_nlp/src/models/mistral/mistral_backbone_test.py
rename to keras_hub/src/models/mistral/mistral_backbone_test.py
index cd9d8f92d5..3c70a217c3 100644
--- a/keras_nlp/src/models/mistral/mistral_backbone_test.py
+++ b/keras_hub/src/models/mistral/mistral_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MistralBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/mistral/mistral_causal_lm.py b/keras_hub/src/models/mistral/mistral_causal_lm.py
similarity index 95%
rename from keras_nlp/src/models/mistral/mistral_causal_lm.py
rename to keras_hub/src/models/mistral/mistral_causal_lm.py
index c06a52b004..d7c7fd753c 100644
--- a/keras_nlp/src/models/mistral/mistral_causal_lm.py
+++ b/keras_hub/src/models/mistral/mistral_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,16 +15,16 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.models.mistral.mistral_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.models.mistral.mistral_causal_lm_preprocessor import (
     MistralCausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.MistralCausalLM")
+@keras_hub_export("keras_hub.models.MistralCausalLM")
 class MistralCausalLM(CausalLM):
     """An end-to-end Mistral model for causal language modeling.
 
@@ -37,12 +37,12 @@ class MistralCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"top_k"` sampling will be used.
 
     Args:
-        backbone: A `keras_nlp.models.MistralBackbone` instance.
-        preprocessor: A `keras_nlp.models.MistralCausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.MistralBackbone` instance.
+        preprocessor: A `keras_hub.models.MistralCausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
     """
@@ -255,7 +255,7 @@ def score(
 
         Compute gradients between embeddings and loss scores with TensorFlow:
         ```python
-        mistral_lm = keras_nlp.models.MistralCausalLM.from_preset(
+        mistral_lm = keras_hub.models.MistralCausalLM.from_preset(
             "mistral_7b_en"
         )
         generations = mistral_lm.generate(
diff --git a/keras_nlp/src/models/mistral/mistral_causal_lm_preprocessor.py b/keras_hub/src/models/mistral/mistral_causal_lm_preprocessor.py
similarity index 83%
rename from keras_nlp/src/models/mistral/mistral_causal_lm_preprocessor.py
rename to keras_hub/src/models/mistral/mistral_causal_lm_preprocessor.py
index fe74de6c6b..45cf29be76 100644
--- a/keras_nlp/src/models/mistral/mistral_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/mistral/mistral_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,30 +12,30 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.models.mistral.mistral_tokenizer import MistralTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.models.mistral.mistral_tokenizer import MistralTokenizer
 
 
-@keras_nlp_export("keras_nlp.models.MistralCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.MistralCausalLMPreprocessor")
 class MistralCausalLMPreprocessor(CausalLMPreprocessor):
     """Mistral Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.MistralCausalLM`. By default, it will take in batches of
+    `keras_hub.models.MistralCausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.MistralCausalLM` instance, these methods
+    is attached to a `keras_hub.models.MistralCausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.MistralTokenizer` instance.
+        tokenizer: A `keras_hub.models.MistralTokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence. Default is `True`.
@@ -53,7 +53,7 @@ class MistralCausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.MistralCausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.MistralCausalLMPreprocessor.from_preset(
         "mistral_base_en"
     )
 
diff --git a/keras_nlp/src/models/mistral/mistral_causal_lm_preprocessor_test.py b/keras_hub/src/models/mistral/mistral_causal_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/mistral/mistral_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/mistral/mistral_causal_lm_preprocessor_test.py
index e2b9bf185b..cafda72ca1 100644
--- a/keras_nlp/src/models/mistral/mistral_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/mistral/mistral_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.mistral.mistral_causal_lm_preprocessor import (
+from keras_hub.src.models.mistral.mistral_causal_lm_preprocessor import (
     MistralCausalLMPreprocessor,
 )
-from keras_nlp.src.models.mistral.mistral_tokenizer import MistralTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.mistral.mistral_tokenizer import MistralTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MistralCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/mistral/mistral_causal_lm_test.py b/keras_hub/src/models/mistral/mistral_causal_lm_test.py
similarity index 95%
rename from keras_nlp/src/models/mistral/mistral_causal_lm_test.py
rename to keras_hub/src/models/mistral/mistral_causal_lm_test.py
index 03e5ec8a43..104af37998 100644
--- a/keras_nlp/src/models/mistral/mistral_causal_lm_test.py
+++ b/keras_hub/src/models/mistral/mistral_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,13 +18,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.models.mistral.mistral_causal_lm import MistralCausalLM
-from keras_nlp.src.models.mistral.mistral_causal_lm_preprocessor import (
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.models.mistral.mistral_causal_lm import MistralCausalLM
+from keras_hub.src.models.mistral.mistral_causal_lm_preprocessor import (
     MistralCausalLMPreprocessor,
 )
-from keras_nlp.src.models.mistral.mistral_tokenizer import MistralTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.mistral.mistral_tokenizer import MistralTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MistralCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/mistral/mistral_layer_norm.py b/keras_hub/src/models/mistral/mistral_layer_norm.py
similarity index 97%
rename from keras_nlp/src/models/mistral/mistral_layer_norm.py
rename to keras_hub/src/models/mistral/mistral_layer_norm.py
index 82bec0f78e..3aadf78918 100644
--- a/keras_nlp/src/models/mistral/mistral_layer_norm.py
+++ b/keras_hub/src/models/mistral/mistral_layer_norm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/mistral/mistral_presets.py b/keras_hub/src/models/mistral/mistral_presets.py
similarity index 98%
rename from keras_nlp/src/models/mistral/mistral_presets.py
rename to keras_hub/src/models/mistral/mistral_presets.py
index 50691ce67d..0466380229 100644
--- a/keras_nlp/src/models/mistral/mistral_presets.py
+++ b/keras_hub/src/models/mistral/mistral_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/mistral/mistral_tokenizer.py b/keras_hub/src/models/mistral/mistral_tokenizer.py
similarity index 82%
rename from keras_nlp/src/models/mistral/mistral_tokenizer.py
rename to keras_hub/src/models/mistral/mistral_tokenizer.py
index 42895adc4f..a5a35b0709 100644
--- a/keras_nlp/src/models/mistral/mistral_tokenizer.py
+++ b/keras_hub/src/models/mistral/mistral_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,24 +12,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.MistralTokenizer",
-        "keras_nlp.models.MistralTokenizer",
+        "keras_hub.tokenizers.MistralTokenizer",
+        "keras_hub.models.MistralTokenizer",
     ]
 )
 class MistralTokenizer(SentencePieceTokenizer):
     """Mistral tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     Mistral models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a Mistral preset.
@@ -49,7 +49,7 @@ class MistralTokenizer(SentencePieceTokenizer):
     Examples:
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.MistralTokenizer.from_preset(
+    tokenizer = keras_hub.models.MistralTokenizer.from_preset(
         "mistral_7b_en",
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/mistral/mistral_tokenizer_test.py b/keras_hub/src/models/mistral/mistral_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/mistral/mistral_tokenizer_test.py
rename to keras_hub/src/models/mistral/mistral_tokenizer_test.py
index e6654dad34..39a8c1d970 100644
--- a/keras_nlp/src/models/mistral/mistral_tokenizer_test.py
+++ b/keras_hub/src/models/mistral/mistral_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import pytest
 
-from keras_nlp.src.models.mistral.mistral_tokenizer import MistralTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.mistral.mistral_tokenizer import MistralTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MistralTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/mistral/mistral_transformer_decoder.py b/keras_hub/src/models/mistral/mistral_transformer_decoder.py
similarity index 96%
rename from keras_nlp/src/models/mistral/mistral_transformer_decoder.py
rename to keras_hub/src/models/mistral/mistral_transformer_decoder.py
index c9bd1156d2..215af4257c 100644
--- a/keras_nlp/src/models/mistral/mistral_transformer_decoder.py
+++ b/keras_hub/src/models/mistral/mistral_transformer_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,19 +14,19 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     merge_padding_and_attention_mask,
 )
-from keras_nlp.src.models.mistral.mistral_attention import (
+from keras_hub.src.models.mistral.mistral_attention import (
     CachedMistralAttention,
 )
-from keras_nlp.src.models.mistral.mistral_layer_norm import (
+from keras_hub.src.models.mistral.mistral_layer_norm import (
     MistralLayerNormalization,
 )
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class MistralTransformerDecoder(keras.layers.Layer):
diff --git a/keras_hub/src/models/mix_transformer/__init__.py b/keras_hub/src/models/mix_transformer/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/mix_transformer/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/mix_transformer/mix_transformer_backbone.py b/keras_hub/src/models/mix_transformer/mix_transformer_backbone.py
similarity index 94%
rename from keras_nlp/src/models/mix_transformer/mix_transformer_backbone.py
rename to keras_hub/src/models/mix_transformer/mix_transformer_backbone.py
index 35c5f7fd5a..5127bd357b 100644
--- a/keras_nlp/src/models/mix_transformer/mix_transformer_backbone.py
+++ b/keras_hub/src/models/mix_transformer/mix_transformer_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,17 +15,17 @@
 import numpy as np
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
-from keras_nlp.src.models.mix_transformer.mix_transformer_layers import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
+from keras_hub.src.models.mix_transformer.mix_transformer_layers import (
     HierarchicalTransformerEncoder,
 )
-from keras_nlp.src.models.mix_transformer.mix_transformer_layers import (
+from keras_hub.src.models.mix_transformer.mix_transformer_layers import (
     OverlappingPatchingAndEmbedding,
 )
 
 
-@keras_nlp_export("keras_nlp.models.MiTBackbone")
+@keras_hub_export("keras_hub.models.MiTBackbone")
 class MiTBackbone(FeaturePyramidBackbone):
     def __init__(
         self,
@@ -76,7 +76,7 @@ def __init__(
         ```python
         images = np.ones(shape=(1, 96, 96, 3))
         labels = np.zeros(shape=(1, 96, 96, 1))
-        backbone = keras_nlp.models.MiTBackbone.from_preset("mit_b0_imagenet")
+        backbone = keras_hub.models.MiTBackbone.from_preset("mit_b0_imagenet")
 
         # Evaluate model
         model(images)
diff --git a/keras_nlp/src/models/mix_transformer/mix_transformer_backbone_test.py b/keras_hub/src/models/mix_transformer/mix_transformer_backbone_test.py
similarity index 92%
rename from keras_nlp/src/models/mix_transformer/mix_transformer_backbone_test.py
rename to keras_hub/src/models/mix_transformer/mix_transformer_backbone_test.py
index 280adca065..9cab12b7bb 100644
--- a/keras_nlp/src/models/mix_transformer/mix_transformer_backbone_test.py
+++ b/keras_hub/src/models/mix_transformer/mix_transformer_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.mix_transformer.mix_transformer_backbone import (
+from keras_hub.src.models.mix_transformer.mix_transformer_backbone import (
     MiTBackbone,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MiTBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/mix_transformer/mix_transformer_classifier.py b/keras_hub/src/models/mix_transformer/mix_transformer_classifier.py
similarity index 85%
rename from keras_nlp/src/models/mix_transformer/mix_transformer_classifier.py
rename to keras_hub/src/models/mix_transformer/mix_transformer_classifier.py
index a9a51b63ba..c6ff3fba1e 100644
--- a/keras_nlp/src/models/mix_transformer/mix_transformer_classifier.py
+++ b/keras_hub/src/models/mix_transformer/mix_transformer_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,19 +13,19 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.image_classifier import ImageClassifier
-from keras_nlp.src.models.mix_transformer.mix_transformer_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.image_classifier import ImageClassifier
+from keras_hub.src.models.mix_transformer.mix_transformer_backbone import (
     MiTBackbone,
 )
 
 
-@keras_nlp_export("keras_nlp.models.MiTImageClassifier")
+@keras_hub_export("keras_hub.models.MiTImageClassifier")
 class MiTImageClassifier(ImageClassifier):
     """MiTImageClassifier image classifier model.
 
     Args:
-        backbone: A `keras_nlp.models.MiTBackbone` instance.
+        backbone: A `keras_hub.models.MiTBackbone` instance.
         num_classes: int. The number of classes to predict.
         activation: `None`, str or callable. The activation function to use on
             the `Dense` layer. Set `activation=None` to return the output
@@ -42,7 +42,7 @@ class MiTImageClassifier(ImageClassifier):
     ```python
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
-    classifier = keras_nlp.models.MiTImageClassifier.from_preset(
+    classifier = keras_hub.models.MiTImageClassifier.from_preset(
         "mit_b0_imagenet")
     classifier.predict(images)
     ```
@@ -52,14 +52,14 @@ class MiTImageClassifier(ImageClassifier):
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    classifier = keras_nlp.models.MixTransformerImageClassifier.from_preset(
+    classifier = keras_hub.models.MiTImageClassifier.from_preset(
         "mit_b0_imagenet")
     classifier.fit(x=images, y=labels, batch_size=2)
     ```
 
     Call `fit()` with custom loss, optimizer and backbone.
     ```python
-    classifier = keras_nlp.models.MiTImageClassifier.from_preset(
+    classifier = keras_hub.models.MiTImageClassifier.from_preset(
         "mit_b0_imagenet")
     classifier.compile(
         loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
@@ -73,14 +73,14 @@ class MiTImageClassifier(ImageClassifier):
     ```python
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    backbone = keras_nlp.models.MiTBackbone(
+    backbone = keras_hub.models.MiTBackbone(
         stackwise_num_filters=[128, 256, 512, 1024],
         stackwise_depth=[3, 9, 9, 3],
         include_rescaling=False,
         block_type="basic_block",
         image_shape = (224, 224, 3),
     )
-    classifier = keras_nlp.models.MiTImageClassifier(
+    classifier = keras_hub.models.MiTImageClassifier(
         backbone=backbone,
         num_classes=4,
     )
diff --git a/keras_nlp/src/models/mix_transformer/mix_transformer_classifier_test.py b/keras_hub/src/models/mix_transformer/mix_transformer_classifier_test.py
similarity index 90%
rename from keras_nlp/src/models/mix_transformer/mix_transformer_classifier_test.py
rename to keras_hub/src/models/mix_transformer/mix_transformer_classifier_test.py
index 57b0671be2..e17071229a 100644
--- a/keras_nlp/src/models/mix_transformer/mix_transformer_classifier_test.py
+++ b/keras_hub/src/models/mix_transformer/mix_transformer_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,13 +14,13 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.mix_transformer.mix_transformer_backbone import (
+from keras_hub.src.models.mix_transformer.mix_transformer_backbone import (
     MiTBackbone,
 )
-from keras_nlp.src.models.mix_transformer.mix_transformer_classifier import (
+from keras_hub.src.models.mix_transformer.mix_transformer_classifier import (
     MiTImageClassifier,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MiTImageClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/mix_transformer/mix_transformer_layers.py b/keras_hub/src/models/mix_transformer/mix_transformer_layers.py
similarity index 99%
rename from keras_nlp/src/models/mix_transformer/mix_transformer_layers.py
rename to keras_hub/src/models/mix_transformer/mix_transformer_layers.py
index 53d99fe484..57ffde94e7 100644
--- a/keras_nlp/src/models/mix_transformer/mix_transformer_layers.py
+++ b/keras_hub/src/models/mix_transformer/mix_transformer_layers.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_hub/src/models/mobilenet/__init__.py b/keras_hub/src/models/mobilenet/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/mobilenet/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/mobilenet/mobilenet_backbone.py b/keras_hub/src/models/mobilenet/mobilenet_backbone.py
similarity index 98%
rename from keras_nlp/src/models/mobilenet/mobilenet_backbone.py
rename to keras_hub/src/models/mobilenet/mobilenet_backbone.py
index 4054b6d76f..27072ddf37 100644
--- a/keras_nlp/src/models/mobilenet/mobilenet_backbone.py
+++ b/keras_hub/src/models/mobilenet/mobilenet_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,14 +14,14 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.backbone import Backbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.backbone import Backbone
 
 BN_EPSILON = 1e-3
 BN_MOMENTUM = 0.999
 
 
-@keras_nlp_export("keras_nlp.models.MobileNetBackbone")
+@keras_hub_export("keras_hub.models.MobileNetBackbone")
 class MobileNetBackbone(Backbone):
     """Instantiates the MobileNet architecture.
 
diff --git a/keras_nlp/src/models/mobilenet/mobilenet_backbone_test.py b/keras_hub/src/models/mobilenet/mobilenet_backbone_test.py
similarity index 92%
rename from keras_nlp/src/models/mobilenet/mobilenet_backbone_test.py
rename to keras_hub/src/models/mobilenet/mobilenet_backbone_test.py
index 80225abe04..32d1c27c47 100644
--- a/keras_nlp/src/models/mobilenet/mobilenet_backbone_test.py
+++ b/keras_hub/src/models/mobilenet/mobilenet_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MobileNetBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/mobilenet/mobilenet_image_classifier.py b/keras_hub/src/models/mobilenet/mobilenet_image_classifier.py
similarity index 87%
rename from keras_nlp/src/models/mobilenet/mobilenet_image_classifier.py
rename to keras_hub/src/models/mobilenet/mobilenet_image_classifier.py
index 3e08f3482c..b744e7c40f 100644
--- a/keras_nlp/src/models/mobilenet/mobilenet_image_classifier.py
+++ b/keras_hub/src/models/mobilenet/mobilenet_image_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.image_classifier import ImageClassifier
-from keras_nlp.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.image_classifier import ImageClassifier
+from keras_hub.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
 
 
-@keras_nlp_export("keras_nlp.models.MobileNetImageClassifier")
+@keras_hub_export("keras_hub.models.MobileNetImageClassifier")
 class MobileNetImageClassifier(ImageClassifier):
     """MobileNetV3 image classifier task model.
 
@@ -28,7 +28,7 @@ class MobileNetImageClassifier(ImageClassifier):
     be used to load a pre-trained config and weights.
 
     Args:
-        backbone: A `keras_nlp.models.MobileNetBackbone` instance.
+        backbone: A `keras_hub.models.MobileNetBackbone` instance.
         num_classes: int. The number of classes to predict.
         activation: `None`, str or callable. The activation function to use on
             the `Dense` layer. Set `activation=None` to return the output
@@ -40,7 +40,7 @@ class MobileNetImageClassifier(ImageClassifier):
     ```python
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
-    classifier = keras_nlp.models.MobileNetImageClassifier.from_preset(
+    classifier = keras_hub.models.MobileNetImageClassifier.from_preset(
         "mobilenet_v3_small_imagenet")
     classifier.predict(images)
     ```
@@ -61,7 +61,7 @@ class MobileNetImageClassifier(ImageClassifier):
         activation="hard_swish",
         inverted_res_block=True,
     )
-    classifier = keras_nlp.models.MobileNetImageClassifier(
+    classifier = keras_hub.models.MobileNetImageClassifier(
         backbone=backbone,
         num_classes=4,
     )
diff --git a/keras_nlp/src/models/mobilenet/mobilenet_image_classifier_test.py b/keras_hub/src/models/mobilenet/mobilenet_image_classifier_test.py
similarity index 91%
rename from keras_nlp/src/models/mobilenet/mobilenet_image_classifier_test.py
rename to keras_hub/src/models/mobilenet/mobilenet_image_classifier_test.py
index 29d00e6d24..0fbcca7675 100644
--- a/keras_nlp/src/models/mobilenet/mobilenet_image_classifier_test.py
+++ b/keras_hub/src/models/mobilenet/mobilenet_image_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
-from keras_nlp.src.models.mobilenet.mobilenet_image_classifier import (
+from keras_hub.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
+from keras_hub.src.models.mobilenet.mobilenet_image_classifier import (
     MobileNetImageClassifier,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class MobileNetImageClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/opt/__init__.py b/keras_hub/src/models/opt/__init__.py
similarity index 72%
rename from keras_nlp/src/models/opt/__init__.py
rename to keras_hub/src/models/opt/__init__.py
index 3c9e8dc7ad..77c30aab28 100644
--- a/keras_nlp/src/models/opt/__init__.py
+++ b/keras_hub/src/models/opt/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.opt.opt_backbone import OPTBackbone
-from keras_nlp.src.models.opt.opt_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.opt.opt_backbone import OPTBackbone
+from keras_hub.src.models.opt.opt_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, OPTBackbone)
diff --git a/keras_nlp/src/models/opt/opt_backbone.py b/keras_hub/src/models/opt/opt_backbone.py
similarity index 93%
rename from keras_nlp/src/models/opt/opt_backbone.py
rename to keras_hub/src/models/opt/opt_backbone.py
index 10892ae3a2..5dda2002ab 100644
--- a/keras_nlp/src/models/opt/opt_backbone.py
+++ b/keras_hub/src/models/opt/opt_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,19 +15,19 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.token_and_position_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.token_and_position_embedding import (
     TokenAndPositionEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_decoder import TransformerDecoder
-from keras_nlp.src.models.backbone import Backbone
+from keras_hub.src.layers.modeling.transformer_decoder import TransformerDecoder
+from keras_hub.src.models.backbone import Backbone
 
 
 def opt_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.OPTBackbone")
+@keras_hub_export("keras_hub.models.OPTBackbone")
 class OPTBackbone(Backbone):
     """An OPT decoder network.
 
@@ -68,11 +68,11 @@ class OPTBackbone(Backbone):
     }
 
     # Pretrained OPT decoder
-    model = keras_nlp.models.OPTBackbone.from_preset("opt_125m_en")
+    model = keras_hub.models.OPTBackbone.from_preset("opt_125m_en")
     model(input_data)
 
     # Randomly initialized OPT decoder model with a custom config
-    model = keras_nlp.models.OPTBackbone(
+    model = keras_hub.models.OPTBackbone(
         vocabulary_size=50265,
         num_layers=4,
         num_heads=4,
diff --git a/keras_nlp/src/models/opt/opt_backbone_test.py b/keras_hub/src/models/opt/opt_backbone_test.py
similarity index 93%
rename from keras_nlp/src/models/opt/opt_backbone_test.py
rename to keras_hub/src/models/opt/opt_backbone_test.py
index a6fd3b0920..b73f74e887 100644
--- a/keras_nlp/src/models/opt/opt_backbone_test.py
+++ b/keras_hub/src/models/opt/opt_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.opt.opt_backbone import OPTBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.opt.opt_backbone import OPTBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class OPTBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/opt/opt_causal_lm.py b/keras_hub/src/models/opt/opt_causal_lm.py
similarity index 90%
rename from keras_nlp/src/models/opt/opt_causal_lm.py
rename to keras_hub/src/models/opt/opt_causal_lm.py
index 0deae0e6aa..8f51a7812f 100644
--- a/keras_nlp/src/models/opt/opt_causal_lm.py
+++ b/keras_hub/src/models/opt/opt_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,16 +15,16 @@
 
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.opt.opt_backbone import OPTBackbone
-from keras_nlp.src.models.opt.opt_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.opt.opt_backbone import OPTBackbone
+from keras_hub.src.models.opt.opt_causal_lm_preprocessor import (
     OPTCausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.OPTCausalLM")
+@keras_hub_export("keras_hub.models.OPTCausalLM")
 class OPTCausalLM(CausalLM):
     """An end-to-end OPT model for causal language modeling.
 
@@ -37,7 +37,7 @@ class OPTCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"top_k"` sampling will be used.
 
     This model can optionally be configured with a `preprocessor` layer, in
@@ -51,8 +51,8 @@ class OPTCausalLM(CausalLM):
     [here](https://github.com/facebookresearch/fairseq/).
 
     Args:
-        backbone: A `keras_nlp.models.OPTBackbone` instance.
-        preprocessor: A `keras_nlp.models.OPTCausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.OPTBackbone` instance.
+        preprocessor: A `keras_hub.models.OPTCausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
 
@@ -60,7 +60,7 @@ class OPTCausalLM(CausalLM):
 
     Use `generate()` to do text generation.
     ```python
-    opt_lm = keras_nlp.models.OPTCausalLM.from_preset("opt_125m_en")
+    opt_lm = keras_hub.models.OPTCausalLM.from_preset("opt_125m_en")
     opt_lm.generate("I want to say", max_length=30)
 
     # Generate with batched prompts.
@@ -69,11 +69,11 @@ class OPTCausalLM(CausalLM):
 
     Compile the `generate()` function with a custom sampler.
     ```python
-    opt_lm = keras_nlp.models.OPTCausalLM.from_preset("opt_125m_en")
+    opt_lm = keras_hub.models.OPTCausalLM.from_preset("opt_125m_en")
     opt_lm.compile(sampler="greedy")
     opt_lm.generate("I want to say", max_length=30)
 
-    opt_lm.compile(sampler=keras_nlp.samplers.BeamSampler(num_beams=2))
+    opt_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
     opt_lm.generate("I want to say", max_length=30)
     ```
 
@@ -86,7 +86,7 @@ class OPTCausalLM(CausalLM):
         "padding_mask": np.array([[1, 1, 0, 0, 0]] * 2),
     }
 
-    opt_lm = keras_nlp.models.OPTCausalLM.from_preset(
+    opt_lm = keras_hub.models.OPTCausalLM.from_preset(
         "opt_125m_en",
         preprocessor=None,
     )
@@ -96,7 +96,7 @@ class OPTCausalLM(CausalLM):
     Call `fit()` on a single batch.
     ```python
     features = ["The quick brown fox jumped.", "I forgot my homework."]
-    opt_lm = keras_nlp.models.OPTCausalLM.from_preset("opt_125m_en")
+    opt_lm = keras_hub.models.OPTCausalLM.from_preset("opt_125m_en")
     opt_lm.fit(x=features, batch_size=2)
     ```
 
@@ -109,7 +109,7 @@ class OPTCausalLM(CausalLM):
     y = np.array([[2, 3, 4, 5, 0]] * 2)
     sw = np.array([[1, 1, 1, 1, 1]] * 2)
 
-    opt_lm = keras_nlp.models.OPTCausalLM.from_preset(
+    opt_lm = keras_hub.models.OPTCausalLM.from_preset(
         "opt_base_en",
         preprocessor=None,
     )
@@ -123,15 +123,15 @@ class OPTCausalLM(CausalLM):
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
 
-    tokenizer = keras_nlp.models.OPTTokenizer(
+    tokenizer = keras_hub.models.OPTTokenizer(
         vocabulary=vocab,
         merges=merges,
     )
-    preprocessor = keras_nlp.models.OPTCausalLMPreprocessor(
+    preprocessor = keras_hub.models.OPTCausalLMPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    model = keras_nlp.models.OPTBackbone(
+    backbone = keras_hub.models.OPTBackbone(
         vocabulary_size=50265,
         num_layers=4,
         num_heads=4,
@@ -139,7 +139,7 @@ class OPTCausalLM(CausalLM):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    opt_lm = keras_nlp.models.OPTCausalLM(
+    opt_lm = keras_hub.models.OPTCausalLM(
         backbone=backbone,
         preprocessor=preprocessor,
     )
diff --git a/keras_nlp/src/models/opt/opt_causal_lm_preprocessor.py b/keras_hub/src/models/opt/opt_causal_lm_preprocessor.py
similarity index 85%
rename from keras_nlp/src/models/opt/opt_causal_lm_preprocessor.py
rename to keras_hub/src/models/opt/opt_causal_lm_preprocessor.py
index 1823b1acc7..b671f16004 100644
--- a/keras_nlp/src/models/opt/opt_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/opt/opt_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,25 +11,25 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.opt.opt_backbone import OPTBackbone
-from keras_nlp.src.models.opt.opt_tokenizer import OPTTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.opt.opt_backbone import OPTBackbone
+from keras_hub.src.models.opt.opt_tokenizer import OPTTokenizer
 
 
-@keras_nlp_export("keras_nlp.models.OPTCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.OPTCausalLMPreprocessor")
 class OPTCausalLMPreprocessor(CausalLMPreprocessor):
     """OPT Causal LM preprocessor.
 
     This preprocessing layer is primarily meant to be used with
-    `keras_nlp.models.OPTCausalLM`. By default, it will take in batches of
+    `keras_hub.models.OPTCausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence. For use with generation,
     pass `return_labels=False` in which case the output will simply be the
     encoded string features.
 
     Args:
-        tokenizer: A `keras_nlp.models.OPTTokenizer` instance.
+        tokenizer: A `keras_hub.models.OPTTokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence.
@@ -53,7 +53,7 @@ class OPTCausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.OPTCausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.OPTCausalLMPreprocessor.from_preset(
         "opt_125m_en"
     )
 
diff --git a/keras_nlp/src/models/opt/opt_causal_lm_preprocessor_test.py b/keras_hub/src/models/opt/opt_causal_lm_preprocessor_test.py
similarity index 94%
rename from keras_nlp/src/models/opt/opt_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/opt/opt_causal_lm_preprocessor_test.py
index ff8ae9235c..959804a40c 100644
--- a/keras_nlp/src/models/opt/opt_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/opt/opt_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.opt.opt_causal_lm_preprocessor import (
+from keras_hub.src.models.opt.opt_causal_lm_preprocessor import (
     OPTCausalLMPreprocessor,
 )
-from keras_nlp.src.models.opt.opt_tokenizer import OPTTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.opt.opt_tokenizer import OPTTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class OPTCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/opt/opt_causal_lm_test.py b/keras_hub/src/models/opt/opt_causal_lm_test.py
similarity index 93%
rename from keras_nlp/src/models/opt/opt_causal_lm_test.py
rename to keras_hub/src/models/opt/opt_causal_lm_test.py
index 94ee5ada04..e4b4dc67c9 100644
--- a/keras_nlp/src/models/opt/opt_causal_lm_test.py
+++ b/keras_hub/src/models/opt/opt_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.opt.opt_backbone import OPTBackbone
-from keras_nlp.src.models.opt.opt_causal_lm import OPTCausalLM
-from keras_nlp.src.models.opt.opt_causal_lm_preprocessor import (
+from keras_hub.src.models.opt.opt_backbone import OPTBackbone
+from keras_hub.src.models.opt.opt_causal_lm import OPTCausalLM
+from keras_hub.src.models.opt.opt_causal_lm_preprocessor import (
     OPTCausalLMPreprocessor,
 )
-from keras_nlp.src.models.opt.opt_tokenizer import OPTTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.opt.opt_tokenizer import OPTTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class OPTCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/opt/opt_presets.py b/keras_hub/src/models/opt/opt_presets.py
similarity index 98%
rename from keras_nlp/src/models/opt/opt_presets.py
rename to keras_hub/src/models/opt/opt_presets.py
index 7b5750e0c8..d50c7b29a5 100644
--- a/keras_nlp/src/models/opt/opt_presets.py
+++ b/keras_hub/src/models/opt/opt_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/opt/opt_tokenizer.py b/keras_hub/src/models/opt/opt_tokenizer.py
similarity index 83%
rename from keras_nlp/src/models/opt/opt_tokenizer.py
rename to keras_hub/src/models/opt/opt_tokenizer.py
index 0565c24404..1840996f7f 100644
--- a/keras_nlp/src/models/opt/opt_tokenizer.py
+++ b/keras_hub/src/models/opt/opt_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,22 +13,22 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.opt.opt_backbone import OPTBackbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.opt.opt_backbone import OPTBackbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.OPTTokenizer",
-        "keras_nlp.models.OPTTokenizer",
+        "keras_hub.tokenizers.OPTTokenizer",
+        "keras_hub.models.OPTTokenizer",
     ]
 )
 class OPTTokenizer(BytePairTokenizer):
     """An OPT tokenizer using Byte-Pair Encoding subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.BytePairTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by OPT
     models and provides a `from_preset()` method to automatically download
-    a matching vocabulary for a OPT preset.
+    a matching vocabulary for an OPT preset.
@@ -49,7 +49,7 @@ class OPTTokenizer(BytePairTokenizer):
     Examples:
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.OPTTokenizer.from_preset(
+    tokenizer = keras_hub.models.OPTTokenizer.from_preset(
         "opt_125m_en",
     )
     tokenizer("The quick brown fox jumped.")
@@ -64,7 +64,7 @@ class OPTTokenizer(BytePairTokenizer):
     vocab = {"<pad>": 1, "</s>": 2, "Ġquick": 4, "Ġfox": 5}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.OPTTokenizer(vocabulary=vocab, merges=merges)
+    tokenizer = keras_hub.models.OPTTokenizer(vocabulary=vocab, merges=merges)
     tokenizer("The quick brown fox jumped.")
     ```
     """
diff --git a/keras_nlp/src/models/opt/opt_tokenizer_test.py b/keras_hub/src/models/opt/opt_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/opt/opt_tokenizer_test.py
rename to keras_hub/src/models/opt/opt_tokenizer_test.py
index ffdfe63cf0..81853d74dd 100644
--- a/keras_nlp/src/models/opt/opt_tokenizer_test.py
+++ b/keras_hub/src/models/opt/opt_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.opt.opt_tokenizer import OPTTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.opt.opt_tokenizer import OPTTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class OPTTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/pali_gemma/__init__.py b/keras_hub/src/models/pali_gemma/__init__.py
similarity index 73%
rename from keras_nlp/src/models/pali_gemma/__init__.py
rename to keras_hub/src/models/pali_gemma/__init__.py
index 1c60953d75..0e7f2548b4 100644
--- a/keras_nlp/src/models/pali_gemma/__init__.py
+++ b/keras_hub/src/models/pali_gemma/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,10 +11,10 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.pali_gemma.pali_gemma_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, PaliGemmaBackbone)
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_backbone.py b/keras_hub/src/models/pali_gemma/pali_gemma_backbone.py
similarity index 94%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_backbone.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_backbone.py
index cd4bc76569..a6440b962a 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_backbone.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,19 +14,19 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.gemma.rms_normalization import RMSNormalization
-from keras_nlp.src.models.pali_gemma.pali_gemma_decoder_block import (
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.gemma.rms_normalization import RMSNormalization
+from keras_hub.src.models.pali_gemma.pali_gemma_decoder_block import (
     PaliGemmaDecoderBlock,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVit
+from keras_hub.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVit
 
 
-@keras_nlp_export("keras_nlp.models.PaliGemmaBackbone")
+@keras_hub_export("keras_hub.models.PaliGemmaBackbone")
 class PaliGemmaBackbone(Backbone):
     """PaliGemma core network with hyperparameters.
 
@@ -39,7 +39,7 @@ class PaliGemmaBackbone(Backbone):
     represents probabilistic values for output tokens.
 
     For a higher-level object for text-generation,
-    see `keras_nlp.models.PaliGemmaCausalLM`.
+    see `keras_hub.models.PaliGemmaCausalLM`.
 
     The default constructor gives a fully customizable, randomly initialized
     PaliGemma model with any number of vit layers, heads, embedding
@@ -93,11 +93,11 @@ class PaliGemmaBackbone(Backbone):
     }
 
     # Pretrained PaliGemma decoder.
-    model = keras_nlp.models.PaliGemmaBackbone.from_preset("pali_gemma_mix_224")
+    model = keras_hub.models.PaliGemmaBackbone.from_preset("pali_gemma_mix_224")
     model(input_data)
 
     # Randomly initialized PaliGemma decoder with custom config.
-    model = keras_nlp.models.PaliGemmaBackbone(
+    model = keras_hub.models.PaliGemmaBackbone(
         vocabulary_size=50257,
         images_size=224,
         num_layers=12,
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_backbone_test.py b/keras_hub/src/models/pali_gemma/pali_gemma_backbone_test.py
similarity index 96%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_backbone_test.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_backbone_test.py
index 44d80b307d..e0e4642a55 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_backbone_test.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class PaliGemmaBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm.py b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm.py
similarity index 92%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_causal_lm.py
index 66c6a85b9a..875c5d616c 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,18 +13,18 @@
 # limitations under the License.
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
+from keras_hub.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
     PaliGemmaCausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.PaliGemmaCausalLM")
+@keras_hub_export("keras_hub.models.PaliGemmaCausalLM")
 class PaliGemmaCausalLM(CausalLM):
     """An end-to-end multi modal PaliGemma model for causal language modeling.
 
@@ -36,7 +36,7 @@ class PaliGemmaCausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"greedy"` sampling will be used.
 
     This model can optionally be configured with a `preprocessor` layer, in
@@ -45,8 +45,8 @@ class PaliGemmaCausalLM(CausalLM):
     when creating the model with `from_preset()`.
 
     Args:
-        backbone: A `keras_nlp.models.PaliGemmaBackbone` instance.
-        preprocessor: A `keras_nlp.models.PaliGemmaCausalLMPreprocessor` or
+        backbone: A `keras_hub.models.PaliGemmaBackbone` instance.
+        preprocessor: A `keras_hub.models.PaliGemmaCausalLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -55,7 +55,7 @@ class PaliGemmaCausalLM(CausalLM):
     Use `generate()` to do text generation.
     ```python
     image = np.random.rand(224, 224, 3)
-    pali_gemma_lm = keras_nlp.models.PaliGemmaCausalLM.from_preset(
+    pali_gemma_lm = keras_hub.models.PaliGemmaCausalLM.from_preset(
         "pali_gemma_3b_mix_224"
     )
     pali_gemma_lm.generate(
@@ -85,7 +85,7 @@ class PaliGemmaCausalLM(CausalLM):
         "padding_mask": np.array([[1, 1, 1, 0, 0, 0, 0]] * 2),
     }
 
-    pali_gemma_lm = keras_nlp.models.PaliGemmaCausalLM.from_preset(
+    pali_gemma_lm = keras_hub.models.PaliGemmaCausalLM.from_preset(
         "pali_gemma_3b_mix_224",
         preprocessor=None,
     )
@@ -94,15 +94,15 @@ class PaliGemmaCausalLM(CausalLM):
 
     Custom backbone and vocabulary.
     ```python
-    tokenizer = keras_nlp.models.PaliGemmaTokenizer(
+    tokenizer = keras_hub.models.PaliGemmaTokenizer(
         proto="proto.spm",
     )
-    preprocessor = keras_nlp.models.PaliGemmaCausalLMPreprocessor(
+    preprocessor = keras_hub.models.PaliGemmaCausalLMPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.PaliGemmaBackbone()
-    pali_gemma_lm = keras_nlp.models.PaliGemmaCausalLM(
+    backbone = keras_hub.models.PaliGemmaBackbone()
+    pali_gemma_lm = keras_hub.models.PaliGemmaCausalLM(
         backbone=backbone,
         preprocessor=preprocessor,
     )
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor.py b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor.py
similarity index 89%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor.py
index 01493ef454..04d68769e6 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,24 +13,24 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_image_converter import (
+from keras_hub.src.models.pali_gemma.pali_gemma_image_converter import (
     PaliGemmaImageConverter,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_tokenizer import (
+from keras_hub.src.models.pali_gemma.pali_gemma_tokenizer import (
     PaliGemmaTokenizer,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.PaliGemmaCausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.PaliGemmaCausalLMPreprocessor")
 class PaliGemmaCausalLMPreprocessor(CausalLMPreprocessor):
     backbone_cls = PaliGemmaBackbone
     tokenizer_cls = PaliGemmaTokenizer
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor_test.py b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor_test.py
index 5553bcd71a..0e1363e33d 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,16 +17,16 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
+from keras_hub.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
     PaliGemmaCausalLMPreprocessor,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_image_converter import (
+from keras_hub.src.models.pali_gemma.pali_gemma_image_converter import (
     PaliGemmaImageConverter,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_tokenizer import (
+from keras_hub.src.models.pali_gemma.pali_gemma_tokenizer import (
     PaliGemmaTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class PaliGemmaCausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_test.py b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_test.py
similarity index 91%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_test.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_test.py
index 3009cbf944..cc342f8665 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_causal_lm_test.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,22 +16,22 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_causal_lm import (
+from keras_hub.src.models.pali_gemma.pali_gemma_causal_lm import (
     PaliGemmaCausalLM,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
+from keras_hub.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
     PaliGemmaCausalLMPreprocessor,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_image_converter import (
+from keras_hub.src.models.pali_gemma.pali_gemma_image_converter import (
     PaliGemmaImageConverter,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_tokenizer import (
+from keras_hub.src.models.pali_gemma.pali_gemma_tokenizer import (
     PaliGemmaTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class PaliGemmaCausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_decoder_block.py b/keras_hub/src/models/pali_gemma/pali_gemma_decoder_block.py
similarity index 97%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_decoder_block.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_decoder_block.py
index ec3f4749dc..8778505c8f 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_decoder_block.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_decoder_block.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.models.gemma.gemma_decoder_block import GemmaDecoderBlock
+from keras_hub.src.models.gemma.gemma_decoder_block import GemmaDecoderBlock
 
 
 class PaliGemmaDecoderBlock(GemmaDecoderBlock):
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_decoder_block_test.py b/keras_hub/src/models/pali_gemma/pali_gemma_decoder_block_test.py
similarity index 95%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_decoder_block_test.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_decoder_block_test.py
index 3f28bb407f..51bcb5a419 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_decoder_block_test.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_decoder_block_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,10 +14,10 @@
 
 import numpy as np
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_decoder_block import (
+from keras_hub.src.models.pali_gemma.pali_gemma_decoder_block import (
     PaliGemmaDecoderBlock,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class PaliGemmaDecoderBlockTest(TestCase):
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_image_converter.py b/keras_hub/src/models/pali_gemma/pali_gemma_image_converter.py
similarity index 71%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_image_converter.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_image_converter.py
index 6bef25d8b4..e75e446e14 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_image_converter.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_image_converter.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,15 +11,15 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.resizing_image_converter import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.resizing_image_converter import (
     ResizingImageConverter,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
 
 
-@keras_nlp_export("keras_nlp.layers.PaliGemmaImageConverter")
+@keras_hub_export("keras_hub.layers.PaliGemmaImageConverter")
 class PaliGemmaImageConverter(ResizingImageConverter):
     backbone_cls = PaliGemmaBackbone
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_presets.py b/keras_hub/src/models/pali_gemma/pali_gemma_presets.py
similarity index 98%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_presets.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_presets.py
index 8cabcc7038..2a13c855e9 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_presets.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_tokenizer.py b/keras_hub/src/models/pali_gemma/pali_gemma_tokenizer.py
similarity index 82%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_tokenizer.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_tokenizer.py
index d52c1de4d4..a01a162b7d 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_tokenizer.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,24 +11,24 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.gemma.gemma_tokenizer import GemmaTokenizer
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.gemma.gemma_tokenizer import GemmaTokenizer
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.PaliGemmaTokenizer",
-        "keras_nlp.models.PaliGemmaTokenizer",
+        "keras_hub.tokenizers.PaliGemmaTokenizer",
+        "keras_hub.models.PaliGemmaTokenizer",
     ]
 )
 class PaliGemmaTokenizer(GemmaTokenizer):
     """PaliGemma tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     PaliGemma models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a PaliGemma preset.
@@ -49,7 +49,7 @@ class PaliGemmaTokenizer(GemmaTokenizer):
 
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.PaliGemmaTokenizer.from_preset(
+    tokenizer = keras_hub.models.PaliGemmaTokenizer.from_preset(
         "pali_gemma_3b_224"
     )
     tokenizer("The quick brown fox jumped.")
@@ -77,7 +77,7 @@ class PaliGemmaTokenizer(GemmaTokenizer):
         eos_piece="<eos>",
         unk_piece="<unk>",
     )
-    tokenizer = keras_nlp.models.PaliGemmaTokenizer(
+    tokenizer = keras_hub.models.PaliGemmaTokenizer(
         proto=bytes_io.getvalue(),
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_vit.py b/keras_hub/src/models/pali_gemma/pali_gemma_vit.py
similarity index 99%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_vit.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_vit.py
index 7810a030b1..e9da150c08 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_vit.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_vit.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/pali_gemma/pali_gemma_vit_test.py b/keras_hub/src/models/pali_gemma/pali_gemma_vit_test.py
similarity index 90%
rename from keras_nlp/src/models/pali_gemma/pali_gemma_vit_test.py
rename to keras_hub/src/models/pali_gemma/pali_gemma_vit_test.py
index bab7af6d85..2049f61c4a 100644
--- a/keras_nlp/src/models/pali_gemma/pali_gemma_vit_test.py
+++ b/keras_hub/src/models/pali_gemma/pali_gemma_vit_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVit
-from keras_nlp.src.models.pali_gemma.pali_gemma_vit import (
+from keras_hub.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVit
+from keras_hub.src.models.pali_gemma.pali_gemma_vit import (
     PaliGemmaVitEmbeddings,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVitEncoder
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVitEncoder
+from keras_hub.src.tests.test_case import TestCase
 
 
 class PaliGemmaVitTest(TestCase):
diff --git a/keras_nlp/src/models/phi3/__init__.py b/keras_hub/src/models/phi3/__init__.py
similarity index 72%
rename from keras_nlp/src/models/phi3/__init__.py
rename to keras_hub/src/models/phi3/__init__.py
index bf0b3c7124..ae23828539 100644
--- a/keras_nlp/src/models/phi3/__init__.py
+++ b/keras_hub/src/models/phi3/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.phi3.phi3_backbone import Phi3Backbone
-from keras_nlp.src.models.phi3.phi3_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.phi3.phi3_backbone import Phi3Backbone
+from keras_hub.src.models.phi3.phi3_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, Phi3Backbone)
diff --git a/keras_nlp/src/models/phi3/phi3_attention.py b/keras_hub/src/models/phi3/phi3_attention.py
similarity index 97%
rename from keras_nlp/src/models/phi3/phi3_attention.py
rename to keras_hub/src/models/phi3/phi3_attention.py
index f57150e78f..e3a43bb46f 100644
--- a/keras_nlp/src/models/phi3/phi3_attention.py
+++ b/keras_hub/src/models/phi3/phi3_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
-from keras_nlp.src.models.phi3.phi3_rotary_embedding import (
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.models.phi3.phi3_rotary_embedding import (
     Phi3SuScaledRotaryEmbedding,
 )
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class Phi3Attention(keras.layers.Layer):
diff --git a/keras_nlp/src/models/phi3/phi3_backbone.py b/keras_hub/src/models/phi3/phi3_backbone.py
similarity index 94%
rename from keras_nlp/src/models/phi3/phi3_backbone.py
rename to keras_hub/src/models/phi3/phi3_backbone.py
index a9bfb8ed88..506ac56ec8 100644
--- a/keras_nlp/src/models/phi3/phi3_backbone.py
+++ b/keras_hub/src/models/phi3/phi3_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,20 +13,20 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.phi3.phi3_decoder import Phi3Decoder
-from keras_nlp.src.models.phi3.phi3_layernorm import Phi3LayerNorm
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.phi3.phi3_decoder import Phi3Decoder
+from keras_hub.src.models.phi3.phi3_layernorm import Phi3LayerNorm
 
 
 def _phi3_kernel_initializer(stddev=0.02):
     return keras.initializers.RandomNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.Phi3Backbone")
+@keras_hub_export("keras_hub.models.Phi3Backbone")
 class Phi3Backbone(Backbone):
     """Phi-3 core network with hyperparameters.
 
@@ -88,13 +88,13 @@ class Phi3Backbone(Backbone):
     }
 
     # Pretrained Phi3 decoder.
-    model = keras_nlp.models.Phi3Backbone.from_preset(
+    model = keras_hub.models.Phi3Backbone.from_preset(
         "phi3_mini_4k_instruct_en"
     )
     model(input_data)
 
     # Randomly initialized Phi3 decoder with custom config.
-    model = keras_nlp.models.Phi3Backbone(
+    model = keras_hub.models.Phi3Backbone(
         vocabulary_size=10,
         num_layers=2,
         hidden_dim=512,
diff --git a/keras_nlp/src/models/phi3/phi3_backbone_test.py b/keras_hub/src/models/phi3/phi3_backbone_test.py
similarity index 95%
rename from keras_nlp/src/models/phi3/phi3_backbone_test.py
rename to keras_hub/src/models/phi3/phi3_backbone_test.py
index 5d04ade0f7..df4424e70f 100644
--- a/keras_nlp/src/models/phi3/phi3_backbone_test.py
+++ b/keras_hub/src/models/phi3/phi3_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.phi3.phi3_backbone import Phi3Backbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.phi3.phi3_backbone import Phi3Backbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class Phi3Test(TestCase):
diff --git a/keras_nlp/src/models/phi3/phi3_causal_lm.py b/keras_hub/src/models/phi3/phi3_causal_lm.py
similarity index 93%
rename from keras_nlp/src/models/phi3/phi3_causal_lm.py
rename to keras_hub/src/models/phi3/phi3_causal_lm.py
index a567782b38..8684545c7f 100644
--- a/keras_nlp/src/models/phi3/phi3_causal_lm.py
+++ b/keras_hub/src/models/phi3/phi3_causal_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,16 +13,16 @@
 # limitations under the License.
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.phi3.phi3_backbone import Phi3Backbone
-from keras_nlp.src.models.phi3.phi3_causal_lm_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.phi3.phi3_backbone import Phi3Backbone
+from keras_hub.src.models.phi3.phi3_causal_lm_preprocessor import (
     Phi3CausalLMPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.models.Phi3CausalLM")
+@keras_hub_export("keras_hub.models.Phi3CausalLM")
 class Phi3CausalLM(CausalLM):
     """An end-to-end Phi3 model for causal language modeling.
 
@@ -35,12 +35,12 @@ class Phi3CausalLM(CausalLM):
     This model has a `generate()` method, which generates text based on a
     prompt. The generation strategy used is controlled by an additional
     `sampler` argument on `compile()`. You can recompile the model with
-    different `keras_nlp.samplers` objects to control the generation. By
+    different `keras_hub.samplers` objects to control the generation. By
     default, `"top_k"` sampling will be used.
 
     Args:
-        backbone: A `keras_nlp.models.Phi3Backbone` instance.
-        preprocessor: A `keras_nlp.models.Phi3CausalLMPreprocessor` or `None`.
+        backbone: A `keras_hub.models.Phi3Backbone` instance.
+        preprocessor: A `keras_hub.models.Phi3CausalLMPreprocessor` or `None`.
             If `None`, this model will not apply preprocessing, and inputs
             should be preprocessed before calling the model.
     """
diff --git a/keras_nlp/src/models/phi3/phi3_causal_lm_preprocessor.py b/keras_hub/src/models/phi3/phi3_causal_lm_preprocessor.py
similarity index 83%
rename from keras_nlp/src/models/phi3/phi3_causal_lm_preprocessor.py
rename to keras_hub/src/models/phi3/phi3_causal_lm_preprocessor.py
index 2dcbc778c5..5207180778 100644
--- a/keras_nlp/src/models/phi3/phi3_causal_lm_preprocessor.py
+++ b/keras_hub/src/models/phi3/phi3_causal_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,30 +12,30 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.phi3.phi3_backbone import Phi3Backbone
-from keras_nlp.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm_preprocessor import CausalLMPreprocessor
+from keras_hub.src.models.phi3.phi3_backbone import Phi3Backbone
+from keras_hub.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
 
 
-@keras_nlp_export("keras_nlp.models.Phi3CausalLMPreprocessor")
+@keras_hub_export("keras_hub.models.Phi3CausalLMPreprocessor")
 class Phi3CausalLMPreprocessor(CausalLMPreprocessor):
     """Phi3 Causal LM preprocessor.
 
     This preprocessing layer is meant for use with
-    `keras_nlp.models.Phi3CausalLM`. By default, it will take in batches of
+    `keras_hub.models.Phi3CausalLM`. By default, it will take in batches of
     strings, and return outputs in a `(x, y, sample_weight)` format, where the
     `y` label is the next token id in the `x` sequence.
 
     For use with generation, the layer also exposes two methods
     `generate_preprocess()` and `generate_postprocess()`. When this preprocessor
-    is attached to a `keras_nlp.models.Phi3CausalLM` instance, these methods
+    is attached to a `keras_hub.models.Phi3CausalLM` instance, these methods
     will be called implicitly in `generate()`. They can also be called
     standalone (e.g. to precompute preprocessing inputs for generation in a
     separate process).
 
     Args:
-        tokenizer: A `keras_nlp.models.Phi3Tokenizer` instance.
+        tokenizer: A `keras_hub.models.Phi3Tokenizer` instance.
         sequence_length: The length of the packed inputs.
         add_start_token: If `True`, the preprocessor will prepend the tokenizer
             start token to each input sequence. Default is `True`.
@@ -53,7 +53,7 @@ class Phi3CausalLMPreprocessor(CausalLMPreprocessor):
     Examples:
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.Phi3CausalLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.Phi3CausalLMPreprocessor.from_preset(
         "phi3_mini_4k_instruct_en"
     )
 
diff --git a/keras_nlp/src/models/phi3/phi3_causal_lm_preprocessor_test.py b/keras_hub/src/models/phi3/phi3_causal_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/phi3/phi3_causal_lm_preprocessor_test.py
rename to keras_hub/src/models/phi3/phi3_causal_lm_preprocessor_test.py
index a09f268c10..fdd4b95277 100644
--- a/keras_nlp/src/models/phi3/phi3_causal_lm_preprocessor_test.py
+++ b/keras_hub/src/models/phi3/phi3_causal_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 
 import pytest
 
-from keras_nlp.src.models.phi3.phi3_causal_lm_preprocessor import (
+from keras_hub.src.models.phi3.phi3_causal_lm_preprocessor import (
     Phi3CausalLMPreprocessor,
 )
-from keras_nlp.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class Phi3CausalLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/phi3/phi3_causal_lm_test.py b/keras_hub/src/models/phi3/phi3_causal_lm_test.py
similarity index 92%
rename from keras_nlp/src/models/phi3/phi3_causal_lm_test.py
rename to keras_hub/src/models/phi3/phi3_causal_lm_test.py
index 897dab42f4..741da19bdf 100644
--- a/keras_nlp/src/models/phi3/phi3_causal_lm_test.py
+++ b/keras_hub/src/models/phi3/phi3_causal_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,13 +18,13 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.phi3.phi3_backbone import Phi3Backbone
-from keras_nlp.src.models.phi3.phi3_causal_lm import Phi3CausalLM
-from keras_nlp.src.models.phi3.phi3_causal_lm_preprocessor import (
+from keras_hub.src.models.phi3.phi3_backbone import Phi3Backbone
+from keras_hub.src.models.phi3.phi3_causal_lm import Phi3CausalLM
+from keras_hub.src.models.phi3.phi3_causal_lm_preprocessor import (
     Phi3CausalLMPreprocessor,
 )
-from keras_nlp.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class Phi3CausalLMTest(TestCase):
diff --git a/keras_nlp/src/models/phi3/phi3_decoder.py b/keras_hub/src/models/phi3/phi3_decoder.py
similarity index 96%
rename from keras_nlp/src/models/phi3/phi3_decoder.py
rename to keras_hub/src/models/phi3/phi3_decoder.py
index e2c7713ace..077adf615a 100644
--- a/keras_nlp/src/models/phi3/phi3_decoder.py
+++ b/keras_hub/src/models/phi3/phi3_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,15 +14,15 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     merge_padding_and_attention_mask,
 )
-from keras_nlp.src.models.phi3.phi3_attention import Phi3Attention
-from keras_nlp.src.models.phi3.phi3_layernorm import Phi3LayerNorm
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.models.phi3.phi3_attention import Phi3Attention
+from keras_hub.src.models.phi3.phi3_layernorm import Phi3LayerNorm
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class Phi3Decoder(keras.layers.Layer):
diff --git a/keras_nlp/src/models/phi3/phi3_layernorm.py b/keras_hub/src/models/phi3/phi3_layernorm.py
similarity index 97%
rename from keras_nlp/src/models/phi3/phi3_layernorm.py
rename to keras_hub/src/models/phi3/phi3_layernorm.py
index 3ff62b386f..bf0b97b826 100644
--- a/keras_nlp/src/models/phi3/phi3_layernorm.py
+++ b/keras_hub/src/models/phi3/phi3_layernorm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/phi3/phi3_presets.py b/keras_hub/src/models/phi3/phi3_presets.py
similarity index 98%
rename from keras_nlp/src/models/phi3/phi3_presets.py
rename to keras_hub/src/models/phi3/phi3_presets.py
index 0f935d3371..76a3cee19f 100644
--- a/keras_nlp/src/models/phi3/phi3_presets.py
+++ b/keras_hub/src/models/phi3/phi3_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/phi3/phi3_rotary_embedding.py b/keras_hub/src/models/phi3/phi3_rotary_embedding.py
similarity index 98%
rename from keras_nlp/src/models/phi3/phi3_rotary_embedding.py
rename to keras_hub/src/models/phi3/phi3_rotary_embedding.py
index 7d1cf40513..bd3242e901 100644
--- a/keras_nlp/src/models/phi3/phi3_rotary_embedding.py
+++ b/keras_hub/src/models/phi3/phi3_rotary_embedding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,7 +15,7 @@
 
 from keras import ops
 
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
+from keras_hub.src.layers.modeling.rotary_embedding import RotaryEmbedding
 
 
 class Phi3SuScaledRotaryEmbedding(RotaryEmbedding):
diff --git a/keras_nlp/src/models/phi3/phi3_tokenizer.py b/keras_hub/src/models/phi3/phi3_tokenizer.py
similarity index 82%
rename from keras_nlp/src/models/phi3/phi3_tokenizer.py
rename to keras_hub/src/models/phi3/phi3_tokenizer.py
index 8a1db63442..4cfc21f3b7 100644
--- a/keras_nlp/src/models/phi3/phi3_tokenizer.py
+++ b/keras_hub/src/models/phi3/phi3_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,24 +11,24 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.phi3.phi3_backbone import Phi3Backbone
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.phi3.phi3_backbone import Phi3Backbone
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.Phi3Tokenizer",
-        "keras_nlp.models.Phi3Tokenizer",
+        "keras_hub.tokenizers.Phi3Tokenizer",
+        "keras_hub.models.Phi3Tokenizer",
     ]
 )
 class Phi3Tokenizer(SentencePieceTokenizer):
     """Phi3 tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     Phi3 models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a Phi3 preset.
@@ -48,7 +48,7 @@ class Phi3Tokenizer(SentencePieceTokenizer):
     Examples:
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.Phi3Tokenizer.from_preset(
+    tokenizer = keras_hub.models.Phi3Tokenizer.from_preset(
         "phi3_mini_4k_instruct_en",
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/phi3/phi3_tokenizer_test.py b/keras_hub/src/models/phi3/phi3_tokenizer_test.py
similarity index 94%
rename from keras_nlp/src/models/phi3/phi3_tokenizer_test.py
rename to keras_hub/src/models/phi3/phi3_tokenizer_test.py
index 2c823acfb7..31c6abf574 100644
--- a/keras_nlp/src/models/phi3/phi3_tokenizer_test.py
+++ b/keras_hub/src/models/phi3/phi3_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import pytest
 
-from keras_nlp.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class Phi3TokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/preprocessor.py b/keras_hub/src/models/preprocessor.py
similarity index 86%
rename from keras_nlp/src/models/preprocessor.py
rename to keras_hub/src/models/preprocessor.py
index 20bec93b9a..01ed2b6bfc 100644
--- a/keras_nlp/src/models/preprocessor.py
+++ b/keras_hub/src/models/preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,23 +14,23 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.preset_utils import PREPROCESSOR_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import builtin_presets
-from keras_nlp.src.utils.preset_utils import find_subclass
-from keras_nlp.src.utils.preset_utils import get_preset_loader
-from keras_nlp.src.utils.preset_utils import save_serialized_object
-from keras_nlp.src.utils.python_utils import classproperty
+from keras_hub.src.utils.preset_utils import PREPROCESSOR_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import builtin_presets
+from keras_hub.src.utils.preset_utils import find_subclass
+from keras_hub.src.utils.preset_utils import get_preset_loader
+from keras_hub.src.utils.preset_utils import save_serialized_object
+from keras_hub.src.utils.python_utils import classproperty
 
 
-@keras_nlp_export("keras_nlp.models.Preprocessor")
+@keras_hub_export("keras_hub.models.Preprocessor")
 class Preprocessor(PreprocessingLayer):
     """Base class for preprocessing layers.
 
-    A `Preprocessor` layer wraps a `keras_nlp.tokenizer.Tokenizer` to provide a
+    A `Preprocessor` layer wraps a `keras_hub.tokenizers.Tokenizer` to provide a
     complete preprocessing setup for a given task. For example a masked language
     modeling preprocessor will take in raw input strings, and output
     `(x, y, sample_weight)` tuples. Where `x` contains token id sequences with
@@ -128,7 +128,7 @@ def from_preset(
         preset,
         **kwargs,
     ):
-        """Instantiate a `keras_nlp.models.Preprocessor` from a model preset.
+        """Instantiate a `keras_hub.models.Preprocessor` from a model preset.
 
         A preset is a directory of configs, weights and other file assets used
         to save and load a pre-trained model. The `preset` can be passed as
@@ -144,7 +144,7 @@ def from_preset(
 
         As there are usually multiple preprocessing classes for a given model,
         this method should be called on a specific subclass like
-        `keras_nlp.models.BertTextClassifierPreprocessor.from_preset()`.
+        `keras_hub.models.BertTextClassifierPreprocessor.from_preset()`.
 
         Args:
             preset: string. A built-in preset identifier, a Kaggle Models
@@ -153,12 +153,12 @@ def from_preset(
         Examples:
         ```python
         # Load a preprocessor for Gemma generation.
-        preprocessor = keras_nlp.models.GemmaCausalLMPreprocessor.from_preset(
+        preprocessor = keras_hub.models.GemmaCausalLMPreprocessor.from_preset(
             "gemma_2b_en",
         )
 
         # Load a preprocessor for Bert classification.
-        preprocessor = keras_nlp.models.BertTextClassifierPreprocessor.from_preset(
+        preprocessor = keras_hub.models.BertTextClassifierPreprocessor.from_preset(
             "bert_base_en",
         )
         ```
@@ -167,7 +167,7 @@ def from_preset(
             raise ValueError(
                 "Do not call `Preprocessor.from_preset()` directly. Instead "
                 "choose a particular task preprocessing class, e.g. "
-                "`keras_nlp.models.TextClassifierPreprocessor.from_preset()`."
+                "`keras_hub.models.TextClassifierPreprocessor.from_preset()`."
             )
 
         loader = get_preset_loader(preset)
diff --git a/keras_nlp/src/models/preprocessor_test.py b/keras_hub/src/models/preprocessor_test.py
similarity index 87%
rename from keras_nlp/src/models/preprocessor_test.py
rename to keras_hub/src/models/preprocessor_test.py
index 42de5c22b6..6c36f64eae 100644
--- a/keras_nlp/src/models/preprocessor_test.py
+++ b/keras_hub/src/models/preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,26 +17,26 @@
 import pytest
 from absl.testing import parameterized
 
-from keras_nlp.src.models.albert.albert_text_classifier_preprocessor import (
+from keras_hub.src.models.albert.albert_text_classifier_preprocessor import (
     AlbertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.bert.bert_masked_lm_preprocessor import (
+from keras_hub.src.models.bert.bert_masked_lm_preprocessor import (
     BertMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.bert.bert_text_classifier_preprocessor import (
+from keras_hub.src.models.bert.bert_text_classifier_preprocessor import (
     BertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.models.roberta.roberta_text_classifier_preprocessor import (
+from keras_hub.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.models.roberta.roberta_text_classifier_preprocessor import (
     RobertaTextClassifierPreprocessor,
 )
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
-from keras_nlp.src.utils.preset_utils import TOKENIZER_ASSET_DIR
+from keras_hub.src.utils.preset_utils import TOKENIZER_ASSET_DIR
 
 
 class TestPreprocessor(TestCase):
diff --git a/keras_nlp/src/models/resnet/__init__.py b/keras_hub/src/models/resnet/__init__.py
similarity index 72%
rename from keras_nlp/src/models/resnet/__init__.py
rename to keras_hub/src/models/resnet/__init__.py
index a09d7a80bb..9d166a8c25 100644
--- a/keras_nlp/src/models/resnet/__init__.py
+++ b/keras_hub/src/models/resnet/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
-from keras_nlp.src.models.resnet.resnet_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.models.resnet.resnet_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, ResNetBackbone)
diff --git a/keras_nlp/src/models/resnet/resnet_backbone.py b/keras_hub/src/models/resnet/resnet_backbone.py
similarity index 98%
rename from keras_nlp/src/models/resnet/resnet_backbone.py
rename to keras_hub/src/models/resnet/resnet_backbone.py
index c9a794b3c8..7f585ba1f2 100644
--- a/keras_nlp/src/models/resnet/resnet_backbone.py
+++ b/keras_hub/src/models/resnet/resnet_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,12 +15,12 @@
 from keras import layers
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
-from keras_nlp.src.utils.keras_utils import standardize_data_format
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
+from keras_hub.src.utils.keras_utils import standardize_data_format
 
 
-@keras_nlp_export("keras_nlp.models.ResNetBackbone")
+@keras_hub_export("keras_hub.models.ResNetBackbone")
 class ResNetBackbone(FeaturePyramidBackbone):
     """ResNet and ResNetV2 core network with hyperparameters.
 
@@ -97,11 +97,11 @@ class ResNetBackbone(FeaturePyramidBackbone):
     input_data = np.random.uniform(0, 255, size=(2, 224, 224, 3))
 
     # Pretrained ResNet backbone.
-    model = keras_nlp.models.ResNetBackbone.from_preset("resnet50")
+    model = keras_hub.models.ResNetBackbone.from_preset("resnet50")
     model(input_data)
 
     # Randomly initialized ResNetV2 backbone with a custom config.
-    model = keras_nlp.models.ResNetBackbone(
+    model = keras_hub.models.ResNetBackbone(
         input_conv_filters=[64],
         input_conv_kernel_sizes=[7],
         stackwise_num_filters=[64, 64, 64],
diff --git a/keras_nlp/src/models/resnet/resnet_backbone_test.py b/keras_hub/src/models/resnet/resnet_backbone_test.py
similarity index 96%
rename from keras_nlp/src/models/resnet/resnet_backbone_test.py
rename to keras_hub/src/models/resnet/resnet_backbone_test.py
index c40959afdc..306ec923c8 100644
--- a/keras_nlp/src/models/resnet/resnet_backbone_test.py
+++ b/keras_hub/src/models/resnet/resnet_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 from absl.testing import parameterized
 from keras import ops
 
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ResNetBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/resnet/resnet_image_classifier.py b/keras_hub/src/models/resnet/resnet_image_classifier.py
similarity index 86%
rename from keras_nlp/src/models/resnet/resnet_image_classifier.py
rename to keras_hub/src/models/resnet/resnet_image_classifier.py
index c9844b12b0..314f273179 100644
--- a/keras_nlp/src/models/resnet/resnet_image_classifier.py
+++ b/keras_hub/src/models/resnet/resnet_image_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,20 +13,20 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.image_classifier import ImageClassifier
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
-from keras_nlp.src.models.resnet.resnet_image_classifier_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.image_classifier import ImageClassifier
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.models.resnet.resnet_image_classifier_preprocessor import (
     ResNetImageClassifierPreprocessor,
 )
 
 
-@keras_nlp_export("keras_nlp.models.ResNetImageClassifier")
+@keras_hub_export("keras_hub.models.ResNetImageClassifier")
 class ResNetImageClassifier(ImageClassifier):
     """ResNet image classifier task model.
 
     Args:
-        backbone: A `keras_nlp.models.ResNetBackbone` instance.
+        backbone: A `keras_hub.models.ResNetBackbone` instance.
         num_classes: int. The number of classes to predict.
         activation: `None`, str or callable. The activation function to use on
             the `Dense` layer. Set `activation=None` to return the output
@@ -45,7 +45,7 @@ class ResNetImageClassifier(ImageClassifier):
     ```python
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
-    classifier = keras_nlp.models.ResNetImageClassifier.from_preset("resnet50")
+    classifier = keras_hub.models.ResNetImageClassifier.from_preset("resnet50")
     classifier.predict(images)
     ```
 
@@ -54,13 +54,13 @@ class ResNetImageClassifier(ImageClassifier):
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    classifier = keras_nlp.models.ResNetImageClassifier.from_preset("resnet50")
+    classifier = keras_hub.models.ResNetImageClassifier.from_preset("resnet50")
     classifier.fit(x=images, y=labels, batch_size=2)
     ```
 
     Call `fit()` with custom loss, optimizer and backbone.
     ```python
-    classifier = keras_nlp.models.ResNetImageClassifier.from_preset("resnet50")
+    classifier = keras_hub.models.ResNetImageClassifier.from_preset("resnet50")
     classifier.compile(
         loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
         optimizer=keras.optimizers.Adam(5e-5),
@@ -73,7 +73,7 @@ class ResNetImageClassifier(ImageClassifier):
     ```python
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    backbone = keras_nlp.models.ResNetBackbone(
+    backbone = keras_hub.models.ResNetBackbone(
         stackwise_num_filters=[64, 64, 64],
         stackwise_num_blocks=[2, 2, 2],
         stackwise_num_strides=[1, 2, 2],
@@ -82,7 +82,7 @@ class ResNetImageClassifier(ImageClassifier):
         include_rescaling=False,
         pooling="avg",
     )
-    classifier = keras_nlp.models.ResNetImageClassifier(
+    classifier = keras_hub.models.ResNetImageClassifier(
         backbone=backbone,
         num_classes=4,
     )
diff --git a/keras_nlp/src/models/resnet/resnet_image_classifier_preprocessor.py b/keras_hub/src/models/resnet/resnet_image_classifier_preprocessor.py
similarity index 69%
rename from keras_nlp/src/models/resnet/resnet_image_classifier_preprocessor.py
rename to keras_hub/src/models/resnet/resnet_image_classifier_preprocessor.py
index 0da2c67deb..f7c24d4b33 100644
--- a/keras_nlp/src/models/resnet/resnet_image_classifier_preprocessor.py
+++ b/keras_hub/src/models/resnet/resnet_image_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,17 +12,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.image_classifier_preprocessor import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.image_classifier_preprocessor import (
     ImageClassifierPreprocessor,
 )
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
-from keras_nlp.src.models.resnet.resnet_image_converter import (
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.models.resnet.resnet_image_converter import (
     ResNetImageConverter,
 )
 
 
-@keras_nlp_export("keras_nlp.models.ResNetImageClassifierPreprocessor")
+@keras_hub_export("keras_hub.models.ResNetImageClassifierPreprocessor")
 class ResNetImageClassifierPreprocessor(ImageClassifierPreprocessor):
     backbone_cls = ResNetBackbone
     image_converter_cls = ResNetImageConverter
diff --git a/keras_nlp/src/models/resnet/resnet_image_classifier_test.py b/keras_hub/src/models/resnet/resnet_image_classifier_test.py
similarity index 93%
rename from keras_nlp/src/models/resnet/resnet_image_classifier_test.py
rename to keras_hub/src/models/resnet/resnet_image_classifier_test.py
index a9afccf3f2..d9de3719ac 100644
--- a/keras_nlp/src/models/resnet/resnet_image_classifier_test.py
+++ b/keras_hub/src/models/resnet/resnet_image_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
-from keras_nlp.src.models.resnet.resnet_image_classifier import (
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.models.resnet.resnet_image_classifier import (
     ResNetImageClassifier,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ResNetImageClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/resnet/resnet_image_converter.py b/keras_hub/src/models/resnet/resnet_image_converter.py
similarity index 70%
rename from keras_nlp/src/models/resnet/resnet_image_converter.py
rename to keras_hub/src/models/resnet/resnet_image_converter.py
index 876dffc96d..64dab4e302 100644
--- a/keras_nlp/src/models/resnet/resnet_image_converter.py
+++ b/keras_hub/src/models/resnet/resnet_image_converter.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,13 +11,13 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.resizing_image_converter import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.resizing_image_converter import (
     ResizingImageConverter,
 )
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
 
 
-@keras_nlp_export("keras_nlp.layers.ResNetImageConverter")
+@keras_hub_export("keras_hub.layers.ResNetImageConverter")
 class ResNetImageConverter(ResizingImageConverter):
     backbone_cls = ResNetBackbone
diff --git a/keras_nlp/src/models/resnet/resnet_presets.py b/keras_hub/src/models/resnet/resnet_presets.py
similarity index 98%
rename from keras_nlp/src/models/resnet/resnet_presets.py
rename to keras_hub/src/models/resnet/resnet_presets.py
index 8030e7257f..99e448f24d 100644
--- a/keras_nlp/src/models/resnet/resnet_presets.py
+++ b/keras_hub/src/models/resnet/resnet_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/roberta/__init__.py b/keras_hub/src/models/roberta/__init__.py
similarity index 73%
rename from keras_nlp/src/models/roberta/__init__.py
rename to keras_hub/src/models/roberta/__init__.py
index 8376c4634e..d54b418a67 100644
--- a/keras_nlp/src/models/roberta/__init__.py
+++ b/keras_hub/src/models/roberta/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, RobertaBackbone)
diff --git a/keras_nlp/src/models/roberta/roberta_backbone.py b/keras_hub/src/models/roberta/roberta_backbone.py
similarity index 93%
rename from keras_nlp/src/models/roberta/roberta_backbone.py
rename to keras_hub/src/models/roberta/roberta_backbone.py
index 6fb4b80558..00cb2e9af9 100644
--- a/keras_nlp/src/models/roberta/roberta_backbone.py
+++ b/keras_hub/src/models/roberta/roberta_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,19 +15,19 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.token_and_position_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.token_and_position_embedding import (
     TokenAndPositionEmbedding,
 )
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.models.backbone import Backbone
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.models.backbone import Backbone
 
 
 def roberta_kernel_initializer(stddev=0.02):
     return keras.initializers.TruncatedNormal(stddev=stddev)
 
 
-@keras_nlp_export("keras_nlp.models.RobertaBackbone")
+@keras_hub_export("keras_hub.models.RobertaBackbone")
 class RobertaBackbone(Backbone):
     """A RoBERTa encoder network.
 
@@ -73,11 +73,11 @@ class RobertaBackbone(Backbone):
     }
 
     # Pretrained RoBERTa encoder
-    model = keras_nlp.models.RobertaBackbone.from_preset("roberta_base_en")
+    model = keras_hub.models.RobertaBackbone.from_preset("roberta_base_en")
     model(input_data)
 
     # Randomly initialized RoBERTa model with custom config
-    model = keras_nlp.models.RobertaBackbone(
+    model = keras_hub.models.RobertaBackbone(
         vocabulary_size=50265,
         num_layers=4,
         num_heads=4,
diff --git a/keras_nlp/src/models/roberta/roberta_backbone_test.py b/keras_hub/src/models/roberta/roberta_backbone_test.py
similarity index 93%
rename from keras_nlp/src/models/roberta/roberta_backbone_test.py
rename to keras_hub/src/models/roberta/roberta_backbone_test.py
index 833188312a..68ffe549eb 100644
--- a/keras_nlp/src/models/roberta/roberta_backbone_test.py
+++ b/keras_hub/src/models/roberta/roberta_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RobertaBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/roberta/roberta_masked_lm.py b/keras_hub/src/models/roberta/roberta_masked_lm.py
similarity index 85%
rename from keras_nlp/src/models/roberta/roberta_masked_lm.py
rename to keras_hub/src/models/roberta/roberta_masked_lm.py
index 9206f0509e..49553e90b8 100644
--- a/keras_nlp/src/models/roberta/roberta_masked_lm.py
+++ b/keras_hub/src/models/roberta/roberta_masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,19 +15,19 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.models.masked_lm import MaskedLM
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.models.masked_lm import MaskedLM
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_backbone import (
     roberta_kernel_initializer,
 )
-from keras_nlp.src.models.roberta.roberta_masked_lm_preprocessor import (
+from keras_hub.src.models.roberta.roberta_masked_lm_preprocessor import (
     RobertaMaskedLMPreprocessor,
 )
 
 
-@keras_nlp_export("keras_nlp.models.RobertaMaskedLM")
+@keras_hub_export("keras_hub.models.RobertaMaskedLM")
 class RobertaMaskedLM(MaskedLM):
     """An end-to-end RoBERTa model for the masked language modeling task.
 
@@ -48,8 +48,8 @@ class RobertaMaskedLM(MaskedLM):
     [here](https://github.com/facebookresearch/fairseq).
 
     Args:
-        backbone: A `keras_nlp.models.RobertaBackbone` instance.
-        preprocessor: A `keras_nlp.models.RobertaMaskedLMPreprocessor` or
+        backbone: A `keras_hub.models.RobertaBackbone` instance.
+        preprocessor: A `keras_hub.models.RobertaMaskedLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -60,7 +60,7 @@ class RobertaMaskedLM(MaskedLM):
     features = ["The quick brown fox jumped.", "I forgot my homework."]
 
     # Pretrained language model.
-    masked_lm = keras_nlp.models.RobertaMaskedLM.from_preset(
+    masked_lm = keras_hub.models.RobertaMaskedLM.from_preset(
         "roberta_base_en",
     )
     masked_lm.fit(x=features, batch_size=2)
@@ -88,7 +88,7 @@ class RobertaMaskedLM(MaskedLM):
     # Labels are the original masked values.
     labels = [[3, 5]] * 2
 
-    masked_lm = keras_nlp.models.RobertaMaskedLM.from_preset(
+    masked_lm = keras_hub.models.RobertaMaskedLM.from_preset(
         "roberta_base_en",
         preprocessor=None,
     )
diff --git a/keras_nlp/src/models/roberta/roberta_masked_lm_preprocessor.py b/keras_hub/src/models/roberta/roberta_masked_lm_preprocessor.py
similarity index 87%
rename from keras_nlp/src/models/roberta/roberta_masked_lm_preprocessor.py
rename to keras_hub/src/models/roberta/roberta_masked_lm_preprocessor.py
index 369a34ce61..17da64734d 100644
--- a/keras_nlp/src/models/roberta/roberta_masked_lm_preprocessor.py
+++ b/keras_hub/src/models/roberta/roberta_masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,23 +14,23 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.RobertaMaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.RobertaMaskedLMPreprocessor")
 class RobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
     """RoBERTa preprocessing for the masked language modeling task.
 
     This preprocessing layer will prepare inputs for a masked language modeling
     task. It is primarily intended for use with the
-    `keras_nlp.models.RobertaMaskedLM` task model. Preprocessing will occur in
+    `keras_hub.models.RobertaMaskedLM` task model. Preprocessing will occur in
     multiple steps.
 
     1. Tokenize any number of input segments using the `tokenizer`.
@@ -41,10 +41,10 @@ class RobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
     3. Randomly select non-special tokens to mask, controlled by
       `mask_selection_rate`.
     4. Construct a `(x, y, sample_weight)` tuple suitable for training with a
-      `keras_nlp.models.RobertaMaskedLM` task model.
+      `keras_hub.models.RobertaMaskedLM` task model.
 
     Args:
-        tokenizer: A `keras_nlp.models.RobertaTokenizer` instance.
+        tokenizer: A `keras_hub.models.RobertaTokenizer` instance.
         sequence_length: int. The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -81,7 +81,7 @@ class RobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
     Directly calling the layer on data.
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.RobertaMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.RobertaMaskedLMPreprocessor.from_preset(
         "roberta_base_en"
     )
 
@@ -100,7 +100,7 @@ class RobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.RobertaMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.RobertaMaskedLMPreprocessor.from_preset(
         "roberta_base_en"
     )
 
diff --git a/keras_nlp/src/models/roberta/roberta_masked_lm_preprocessor_test.py b/keras_hub/src/models/roberta/roberta_masked_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/roberta/roberta_masked_lm_preprocessor_test.py
rename to keras_hub/src/models/roberta/roberta_masked_lm_preprocessor_test.py
index 5880b141da..7bd9c25fc9 100644
--- a/keras_nlp/src/models/roberta/roberta_masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/roberta/roberta_masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.roberta.roberta_masked_lm_preprocessor import (
+from keras_hub.src.models.roberta.roberta_masked_lm_preprocessor import (
     RobertaMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RobertaMaskedLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/roberta/roberta_masked_lm_test.py b/keras_hub/src/models/roberta/roberta_masked_lm_test.py
similarity index 89%
rename from keras_nlp/src/models/roberta/roberta_masked_lm_test.py
rename to keras_hub/src/models/roberta/roberta_masked_lm_test.py
index b3293e2f49..409f91daf4 100644
--- a/keras_nlp/src/models/roberta/roberta_masked_lm_test.py
+++ b/keras_hub/src/models/roberta/roberta_masked_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,13 +14,13 @@
 
 import pytest
 
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_masked_lm import RobertaMaskedLM
-from keras_nlp.src.models.roberta.roberta_masked_lm_preprocessor import (
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_masked_lm import RobertaMaskedLM
+from keras_hub.src.models.roberta.roberta_masked_lm_preprocessor import (
     RobertaMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RobertaMaskedLMTest(TestCase):
diff --git a/keras_nlp/src/models/roberta/roberta_presets.py b/keras_hub/src/models/roberta/roberta_presets.py
similarity index 97%
rename from keras_nlp/src/models/roberta/roberta_presets.py
rename to keras_hub/src/models/roberta/roberta_presets.py
index 79e58a1494..45dabf83e6 100644
--- a/keras_nlp/src/models/roberta/roberta_presets.py
+++ b/keras_hub/src/models/roberta/roberta_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/roberta/roberta_text_classifier.py b/keras_hub/src/models/roberta/roberta_text_classifier.py
similarity index 86%
rename from keras_nlp/src/models/roberta/roberta_text_classifier.py
rename to keras_hub/src/models/roberta/roberta_text_classifier.py
index f61c6649a8..7ab02460a3 100644
--- a/keras_nlp/src/models/roberta/roberta_text_classifier.py
+++ b/keras_hub/src/models/roberta/roberta_text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,28 +15,28 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_backbone import (
     roberta_kernel_initializer,
 )
-from keras_nlp.src.models.roberta.roberta_text_classifier_preprocessor import (
+from keras_hub.src.models.roberta.roberta_text_classifier_preprocessor import (
     RobertaTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.RobertaTextClassifier",
-        "keras_nlp.models.RobertaClassifier",
+        "keras_hub.models.RobertaTextClassifier",
+        "keras_hub.models.RobertaClassifier",
     ]
 )
 class RobertaTextClassifier(TextClassifier):
     """An end-to-end RoBERTa model for classification tasks.
 
     This model attaches a classification head to a
-    `keras_nlp.model.RobertaBackbone` instance, mapping from the backbone
+    `keras_hub.models.RobertaBackbone` instance, mapping from the backbone
     outputs to logits suitable for a classification task. For usage of this
     model with pre-trained weights, see the `from_preset()` constructor.
 
@@ -51,9 +51,9 @@ class RobertaTextClassifier(TextClassifier):
     [here](https://github.com/facebookresearch/fairseq).
 
     Args:
-        backbone: A `keras_nlp.models.RobertaBackbone` instance.
+        backbone: A `keras_hub.models.RobertaBackbone` instance.
         num_classes: int. Number of classes to predict.
-        preprocessor: A `keras_nlp.models.RobertaTextClassifierPreprocessor` or `None`. If
+        preprocessor: A `keras_hub.models.RobertaTextClassifierPreprocessor` or `None`. If
             `None`, this model will not apply preprocessing, and inputs should
             be preprocessed before calling the model.
         activation: Optional `str` or callable. The activation function to use
@@ -71,7 +71,7 @@ class RobertaTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier.
-    classifier = keras_nlp.models.RobertaTextClassifier.from_preset(
+    classifier = keras_hub.models.RobertaTextClassifier.from_preset(
         "roberta_base_en",
         num_classes=4,
     )
@@ -99,7 +99,7 @@ class RobertaTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier without preprocessing.
-    classifier = keras_nlp.models.RobertaTextClassifier.from_preset(
+    classifier = keras_hub.models.RobertaTextClassifier.from_preset(
         "roberta_base_en",
         num_classes=4,
         preprocessor=None,
@@ -116,15 +116,15 @@ class RobertaTextClassifier(TextClassifier):
     vocab = {**vocab, "a": 4, "Ġquick": 5, "Ġfox": 6}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.RobertaTokenizer(
+    tokenizer = keras_hub.models.RobertaTokenizer(
         vocabulary=vocab,
         merges=merges
     )
-    preprocessor = keras_nlp.models.RobertaTextClassifierPreprocessor(
+    preprocessor = keras_hub.models.RobertaTextClassifierPreprocessor(
         tokenizer=tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.RobertaBackbone(
+    backbone = keras_hub.models.RobertaBackbone(
         vocabulary_size=20,
         num_layers=4,
         num_heads=4,
@@ -132,7 +132,7 @@ class RobertaTextClassifier(TextClassifier):
         intermediate_dim=512,
         max_sequence_length=128
     )
-    classifier = keras_nlp.models.RobertaTextClassifier(
+    classifier = keras_hub.models.RobertaTextClassifier(
         backbone=backbone,
         preprocessor=preprocessor,
         num_classes=4,
diff --git a/keras_nlp/src/models/roberta/roberta_text_classifier_preprocessor.py b/keras_hub/src/models/roberta/roberta_text_classifier_preprocessor.py
similarity index 87%
rename from keras_nlp/src/models/roberta/roberta_text_classifier_preprocessor.py
rename to keras_hub/src/models/roberta/roberta_text_classifier_preprocessor.py
index a905156615..71355bc619 100644
--- a/keras_nlp/src/models/roberta/roberta_text_classifier_preprocessor.py
+++ b/keras_hub/src/models/roberta/roberta_text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,22 +14,22 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.RobertaTextClassifierPreprocessor",
-        "keras_nlp.models.RobertaPreprocessor",
+        "keras_hub.models.RobertaTextClassifierPreprocessor",
+        "keras_hub.models.RobertaPreprocessor",
     ]
 )
 class RobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
@@ -50,7 +50,7 @@ class RobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
     `keras.Model.fit`.
 
     Args:
-        tokenizer: A `keras_nlp.models.RobertaTokenizer` instance.
+        tokenizer: A `keras_hub.models.RobertaTokenizer` instance.
         sequence_length: The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -76,7 +76,7 @@ class RobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "roberta_base_en"
     )
 
@@ -96,16 +96,16 @@ class RobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
     vocab = {"<s>": 0, "<pad>": 1, "</s>": 2, "<mask>": 3}
     vocab = {**vocab, "a": 4, "Ġquick": 5, "Ġfox": 6}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick", "Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.RobertaTokenizer(
+    tokenizer = keras_hub.models.RobertaTokenizer(
         vocabulary=vocab,
         merges=merges
     )
-    preprocessor = keras_nlp.models.RobertaTextClassifierPreprocessor(tokenizer)
+    preprocessor = keras_hub.models.RobertaTextClassifierPreprocessor(tokenizer)
     preprocessor("a quick fox")
     ```
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "roberta_base_en"
     )
 
diff --git a/keras_nlp/src/models/roberta/roberta_text_classifier_preprocessor_test.py b/keras_hub/src/models/roberta/roberta_text_classifier_preprocessor_test.py
similarity index 92%
rename from keras_nlp/src/models/roberta/roberta_text_classifier_preprocessor_test.py
rename to keras_hub/src/models/roberta/roberta_text_classifier_preprocessor_test.py
index e9b9b6cab9..adaf11f043 100644
--- a/keras_nlp/src/models/roberta/roberta_text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/roberta/roberta_text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 import pytest
 
-from keras_nlp.src.models.roberta.roberta_text_classifier_preprocessor import (
+from keras_hub.src.models.roberta.roberta_text_classifier_preprocessor import (
     RobertaTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RobertaTextClassifierPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/roberta/roberta_text_classifier_test.py b/keras_hub/src/models/roberta/roberta_text_classifier_test.py
similarity index 89%
rename from keras_nlp/src/models/roberta/roberta_text_classifier_test.py
rename to keras_hub/src/models/roberta/roberta_text_classifier_test.py
index 72f2cd343f..f71994ff6d 100644
--- a/keras_nlp/src/models/roberta/roberta_text_classifier_test.py
+++ b/keras_hub/src/models/roberta/roberta_text_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,15 +14,15 @@
 
 import pytest
 
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_text_classifier import (
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.models.roberta.roberta_text_classifier import (
     RobertaTextClassifier,
 )
-from keras_nlp.src.models.roberta.roberta_text_classifier_preprocessor import (
+from keras_hub.src.models.roberta.roberta_text_classifier_preprocessor import (
     RobertaTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RobertaTextClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/roberta/roberta_tokenizer.py b/keras_hub/src/models/roberta/roberta_tokenizer.py
similarity index 84%
rename from keras_nlp/src/models/roberta/roberta_tokenizer.py
rename to keras_hub/src/models/roberta/roberta_tokenizer.py
index 1097e3ba09..a03cee272f 100644
--- a/keras_nlp/src/models/roberta/roberta_tokenizer.py
+++ b/keras_hub/src/models/roberta/roberta_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,22 +13,22 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.roberta.roberta_backbone import RobertaBackbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.RobertaTokenizer",
-        "keras_nlp.models.RobertaTokenizer",
+        "keras_hub.tokenizers.RobertaTokenizer",
+        "keras_hub.models.RobertaTokenizer",
     ]
 )
 class RobertaTokenizer(BytePairTokenizer):
     """A RoBERTa tokenizer using Byte-Pair Encoding subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.BytePairTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by RoBERTa
     models and provides a `from_preset()` method to automatically download
     a matching vocabulary for a RoBERTa preset.
@@ -49,7 +49,7 @@ class RobertaTokenizer(BytePairTokenizer):
     Examples:
     ```python
     # Unbatched input.
-    tokenizer = keras_nlp.models.RobertaTokenizer.from_preset(
+    tokenizer = keras_hub.models.RobertaTokenizer.from_preset(
         "roberta_base_en",
     )
     tokenizer("The quick brown fox jumped.")
@@ -66,7 +66,7 @@ class RobertaTokenizer(BytePairTokenizer):
     vocab = {**vocab, "a": 4, "Ġquick": 5, "Ġfox": 6}
     merges = ["Ġ q", "u i", "c k", "ui ck", "Ġq uick"]
     merges += ["Ġ f", "o x", "Ġf ox"]
-    tokenizer = keras_nlp.models.RobertaTokenizer(
+    tokenizer = keras_hub.models.RobertaTokenizer(
         vocabulary=vocab,
         merges=merges
     )
diff --git a/keras_nlp/src/models/roberta/roberta_tokenizer_test.py b/keras_hub/src/models/roberta/roberta_tokenizer_test.py
similarity index 94%
rename from keras_nlp/src/models/roberta/roberta_tokenizer_test.py
rename to keras_hub/src/models/roberta/roberta_tokenizer_test.py
index 35c7b628e2..6deb56be97 100644
--- a/keras_nlp/src/models/roberta/roberta_tokenizer_test.py
+++ b/keras_hub/src/models/roberta/roberta_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RobertaTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/seq_2_seq_lm.py b/keras_hub/src/models/seq_2_seq_lm.py
similarity index 79%
rename from keras_nlp/src/models/seq_2_seq_lm.py
rename to keras_hub/src/models/seq_2_seq_lm.py
index c3bf0d4cd0..6eb52540c3 100644
--- a/keras_nlp/src/models/seq_2_seq_lm.py
+++ b/keras_hub/src/models/seq_2_seq_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,28 +11,28 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.causal_lm import CausalLM
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.causal_lm import CausalLM
 
 
-@keras_nlp_export("keras_nlp.models.Seq2SeqLM")
+@keras_hub_export("keras_hub.models.Seq2SeqLM")
 class Seq2SeqLM(CausalLM):
     """Base class for sequence to sequence language modeling tasks.
 
-    `Seq2SeqLM` tasks wrap a `keras_nlp.models.Backbone` and
-    a `keras_nlp.models.Preprocessor` to create a model that can be used for
+    `Seq2SeqLM` tasks wrap a `keras_hub.models.Backbone` and
+    a `keras_hub.models.Preprocessor` to create a model that can be used for
     generation and generative fine-tuning, when generation is conditioned on
     additional input sequence in a sequence-to-sequence setting.
 
     `Seq2SeqLM` tasks provide an additional, high-level `generate()` function
     which can be used to auto-regressively sample an output sequence token by
     token. The `compile()` method of `Seq2SeqLM` classes contains an additional
-    `sampler` argument, which can be used to pass a `keras_nlp.samplers.Sampler`
+    `sampler` argument, which can be used to pass a `keras_hub.samplers.Sampler`
     to control how the predicted distribution will be sampled.
 
     When calling `fit()`, each input should contain an input and output
     sequence. The model will be trained to predict the output sequence
-    token-by-token using a causal mask, similar to a `keras_nlp.models.CausalLM`
+    token-by-token using a causal mask, similar to a `keras_hub.models.CausalLM`
     task. Unlike the `CausalLM` task, an input sequence must be passed, and
     can be attended to in full by all tokens in the output sequence.
 
@@ -42,7 +42,7 @@ class Seq2SeqLM(CausalLM):
     Example:
     ```python
     # Load a Bart backbone with pre-trained weights.
-    seq_2_seq_lm = keras_nlp.models.Seq2SeqLM.from_preset(
+    seq_2_seq_lm = keras_hub.models.Seq2SeqLM.from_preset(
         "bart_base_en",
     )
     seq_2_seq_lm.compile(sampler="top_k")
diff --git a/keras_nlp/src/models/seq_2_seq_lm_preprocessor.py b/keras_hub/src/models/seq_2_seq_lm_preprocessor.py
similarity index 94%
rename from keras_nlp/src/models/seq_2_seq_lm_preprocessor.py
rename to keras_hub/src/models/seq_2_seq_lm_preprocessor.py
index 27405f99d0..d5ea9342fc 100644
--- a/keras_nlp/src/models/seq_2_seq_lm_preprocessor.py
+++ b/keras_hub/src/models/seq_2_seq_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
-from keras_nlp.src.utils.tensor_utils import strip_to_ragged
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import strip_to_ragged
 
 try:
     import tensorflow as tf
@@ -25,11 +25,11 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.models.Seq2SeqLMPreprocessor")
+@keras_hub_export("keras_hub.models.Seq2SeqLMPreprocessor")
 class Seq2SeqLMPreprocessor(Preprocessor):
     """Base class for seq2seq language modeling preprocessing layers.
 
-    `Seq2SeqLMPreprocessor` tasks wrap a `keras_nlp.tokenizer.Tokenizer` to
+    `Seq2SeqLMPreprocessor` tasks wrap a `keras_hub.tokenizers.Tokenizer` to
     create a preprocessing layer for seq2seq language modeling tasks. It is
-    intended to be paired with a `keras.models.Seq2SeqLM` task.
+    intended to be paired with a `keras_hub.models.Seq2SeqLM` task.
 
@@ -52,7 +52,7 @@ class Seq2SeqLMPreprocessor(Preprocessor):
 
     Examples.
     ```python
-    preprocessor = keras_nlp.models.Seq2SeqLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.Seq2SeqLMPreprocessor.from_preset(
         "bart_base_en",
         encoder_sequence_length=256,
         decoder_sequence_length=256,
diff --git a/keras_nlp/src/models/seq_2_seq_lm_preprocessor_test.py b/keras_hub/src/models/seq_2_seq_lm_preprocessor_test.py
similarity index 87%
rename from keras_nlp/src/models/seq_2_seq_lm_preprocessor_test.py
rename to keras_hub/src/models/seq_2_seq_lm_preprocessor_test.py
index f784b571fe..3cd6977967 100644
--- a/keras_nlp/src/models/seq_2_seq_lm_preprocessor_test.py
+++ b/keras_hub/src/models/seq_2_seq_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
+from keras_hub.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
     BartSeq2SeqLMPreprocessor,
 )
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.models.seq_2_seq_lm_preprocessor import Seq2SeqLMPreprocessor
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.models.seq_2_seq_lm_preprocessor import Seq2SeqLMPreprocessor
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestSeq2SeqLMPreprocessor(TestCase):
diff --git a/keras_hub/src/models/stable_diffusion_v3/__init__.py b/keras_hub/src/models/stable_diffusion_v3/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/stable_diffusion_v3/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/stable_diffusion_v3/clip_encoder_block.py b/keras_hub/src/models/stable_diffusion_v3/clip_encoder_block.py
similarity index 98%
rename from keras_nlp/src/models/stable_diffusion_v3/clip_encoder_block.py
rename to keras_hub/src/models/stable_diffusion_v3/clip_encoder_block.py
index c4e16f8626..6fe4bc9b59 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/clip_encoder_block.py
+++ b/keras_hub/src/models/stable_diffusion_v3/clip_encoder_block.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/stable_diffusion_v3/clip_preprocessor.py b/keras_hub/src/models/stable_diffusion_v3/clip_preprocessor.py
similarity index 90%
rename from keras_nlp/src/models/stable_diffusion_v3/clip_preprocessor.py
rename to keras_hub/src/models/stable_diffusion_v3/clip_preprocessor.py
index 3ffefab05d..f7cfa461d5 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/clip_preprocessor.py
+++ b/keras_hub/src/models/stable_diffusion_v3/clip_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.models.stable_diffusion_v3.clip_tokenizer import (
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.models.stable_diffusion_v3.clip_tokenizer import (
     CLIPTokenizer,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
diff --git a/keras_nlp/src/models/stable_diffusion_v3/clip_preprocessor_test.py b/keras_hub/src/models/stable_diffusion_v3/clip_preprocessor_test.py
similarity index 92%
rename from keras_nlp/src/models/stable_diffusion_v3/clip_preprocessor_test.py
rename to keras_hub/src/models/stable_diffusion_v3/clip_preprocessor_test.py
index 4365a14673..8585752b84 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/clip_preprocessor_test.py
+++ b/keras_hub/src/models/stable_diffusion_v3/clip_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,13 +13,13 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.stable_diffusion_v3.clip_preprocessor import (
+from keras_hub.src.models.stable_diffusion_v3.clip_preprocessor import (
     CLIPPreprocessor,
 )
-from keras_nlp.src.models.stable_diffusion_v3.clip_tokenizer import (
+from keras_hub.src.models.stable_diffusion_v3.clip_tokenizer import (
     CLIPTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class CLIPPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/clip_text_encoder.py b/keras_hub/src/models/stable_diffusion_v3/clip_text_encoder.py
similarity index 96%
rename from keras_nlp/src/models/stable_diffusion_v3/clip_text_encoder.py
rename to keras_hub/src/models/stable_diffusion_v3/clip_text_encoder.py
index 899ae665c7..77cfc7e98e 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/clip_text_encoder.py
+++ b/keras_hub/src/models/stable_diffusion_v3/clip_text_encoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 from keras import layers
 from keras import ops
 
-from keras_nlp.src.layers.modeling.token_and_position_embedding import (
+from keras_hub.src.layers.modeling.token_and_position_embedding import (
     TokenAndPositionEmbedding,
 )
-from keras_nlp.src.models.stable_diffusion_v3.clip_encoder_block import (
+from keras_hub.src.models.stable_diffusion_v3.clip_encoder_block import (
     CLIPEncoderBlock,
 )
 
diff --git a/keras_nlp/src/models/stable_diffusion_v3/clip_tokenizer.py b/keras_hub/src/models/stable_diffusion_v3/clip_tokenizer.py
similarity index 96%
rename from keras_nlp/src/models/stable_diffusion_v3/clip_tokenizer.py
rename to keras_hub/src/models/stable_diffusion_v3/clip_tokenizer.py
index 59c046d9f5..a9e17e8bda 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/clip_tokenizer.py
+++ b/keras_hub/src/models/stable_diffusion_v3/clip_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,9 +11,9 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import convert_to_ragged_batch
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import split_strings_for_bpe
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.tokenizers.byte_pair_tokenizer import convert_to_ragged_batch
+from keras_hub.src.tokenizers.byte_pair_tokenizer import split_strings_for_bpe
 
 try:
     import tensorflow as tf
diff --git a/keras_nlp/src/models/stable_diffusion_v3/clip_tokenizer_test.py b/keras_hub/src/models/stable_diffusion_v3/clip_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/stable_diffusion_v3/clip_tokenizer_test.py
rename to keras_hub/src/models/stable_diffusion_v3/clip_tokenizer_test.py
index 4ceaea8057..77a5db8780 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/clip_tokenizer_test.py
+++ b/keras_hub/src/models/stable_diffusion_v3/clip_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,10 +13,10 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.stable_diffusion_v3.clip_tokenizer import (
+from keras_hub.src.models.stable_diffusion_v3.clip_tokenizer import (
     CLIPTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class CLIPTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/mmdit.py b/keras_hub/src/models/stable_diffusion_v3/mmdit.py
similarity index 98%
rename from keras_nlp/src/models/stable_diffusion_v3/mmdit.py
rename to keras_hub/src/models/stable_diffusion_v3/mmdit.py
index b26f0d04b3..619888baf1 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/mmdit.py
+++ b/keras_hub/src/models/stable_diffusion_v3/mmdit.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,9 +18,9 @@
 from keras import models
 from keras import ops
 
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.models.stable_diffusion_v3.mmdit_block import MMDiTBlock
-from keras_nlp.src.utils.keras_utils import standardize_data_format
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.models.stable_diffusion_v3.mmdit_block import MMDiTBlock
+from keras_hub.src.utils.keras_utils import standardize_data_format
 
 
 class PatchEmbedding(layers.Layer):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/mmdit_block.py b/keras_hub/src/models/stable_diffusion_v3/mmdit_block.py
similarity index 99%
rename from keras_nlp/src/models/stable_diffusion_v3/mmdit_block.py
rename to keras_hub/src/models/stable_diffusion_v3/mmdit_block.py
index d537e856ef..a46d4c1b86 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/mmdit_block.py
+++ b/keras_hub/src/models/stable_diffusion_v3/mmdit_block.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,7 +17,7 @@
 from keras import models
 from keras import ops
 
-from keras_nlp.src.utils.keras_utils import gelu_approximate
+from keras_hub.src.utils.keras_utils import gelu_approximate
 
 
 class DismantledBlock(layers.Layer):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/t5_xxl_preprocessor.py b/keras_hub/src/models/stable_diffusion_v3/t5_xxl_preprocessor.py
similarity index 89%
rename from keras_nlp/src/models/stable_diffusion_v3/t5_xxl_preprocessor.py
rename to keras_hub/src/models/stable_diffusion_v3/t5_xxl_preprocessor.py
index b559d5faad..31fe9f38f6 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/t5_xxl_preprocessor.py
+++ b/keras_hub/src/models/stable_diffusion_v3/t5_xxl_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,10 +13,10 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.models.t5.t5_tokenizer import T5Tokenizer
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.layers.preprocessing.start_end_packer import StartEndPacker
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.models.t5.t5_tokenizer import T5Tokenizer
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
 class T5XXLPreprocessor(Preprocessor):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/t5_xxl_preprocessor_test.py b/keras_hub/src/models/stable_diffusion_v3/t5_xxl_preprocessor_test.py
similarity index 91%
rename from keras_nlp/src/models/stable_diffusion_v3/t5_xxl_preprocessor_test.py
rename to keras_hub/src/models/stable_diffusion_v3/t5_xxl_preprocessor_test.py
index 90b7dfaf9c..acf97ab357 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/t5_xxl_preprocessor_test.py
+++ b/keras_hub/src/models/stable_diffusion_v3/t5_xxl_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,11 +15,11 @@
 
 import pytest
 
-from keras_nlp.src.models.stable_diffusion_v3.t5_xxl_preprocessor import (
+from keras_hub.src.models.stable_diffusion_v3.t5_xxl_preprocessor import (
     T5XXLPreprocessor,
 )
-from keras_nlp.src.models.t5.t5_tokenizer import T5Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.t5.t5_tokenizer import T5Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class T5XXLPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/t5_xxl_text_encoder.py b/keras_hub/src/models/stable_diffusion_v3/t5_xxl_text_encoder.py
similarity index 96%
rename from keras_nlp/src/models/stable_diffusion_v3/t5_xxl_text_encoder.py
rename to keras_hub/src/models/stable_diffusion_v3/t5_xxl_text_encoder.py
index 5c44395489..0a137551d0 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/t5_xxl_text_encoder.py
+++ b/keras_hub/src/models/stable_diffusion_v3/t5_xxl_text_encoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.t5.t5_layer_norm import T5LayerNorm
-from keras_nlp.src.models.t5.t5_transformer_layer import T5TransformerLayer
+from keras_hub.src.models.t5.t5_layer_norm import T5LayerNorm
+from keras_hub.src.models.t5.t5_transformer_layer import T5TransformerLayer
 
 
 class T5XXLTextEncoder(keras.Model):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/vae_attention.py b/keras_hub/src/models/stable_diffusion_v3/vae_attention.py
similarity index 97%
rename from keras_nlp/src/models/stable_diffusion_v3/vae_attention.py
rename to keras_hub/src/models/stable_diffusion_v3/vae_attention.py
index 1fba90d681..f9f2239f05 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/vae_attention.py
+++ b/keras_hub/src/models/stable_diffusion_v3/vae_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,7 +16,7 @@
 from keras import layers
 from keras import ops
 
-from keras_nlp.src.utils.keras_utils import standardize_data_format
+from keras_hub.src.utils.keras_utils import standardize_data_format
 
 
 class VAEAttention(layers.Layer):
diff --git a/keras_nlp/src/models/stable_diffusion_v3/vae_image_decoder.py b/keras_hub/src/models/stable_diffusion_v3/vae_image_decoder.py
similarity index 97%
rename from keras_nlp/src/models/stable_diffusion_v3/vae_image_decoder.py
rename to keras_hub/src/models/stable_diffusion_v3/vae_image_decoder.py
index 9cfd6d4ff6..09afbef614 100644
--- a/keras_nlp/src/models/stable_diffusion_v3/vae_image_decoder.py
+++ b/keras_hub/src/models/stable_diffusion_v3/vae_image_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 import keras
 from keras import layers
 
-from keras_nlp.src.models.stable_diffusion_v3.vae_attention import VAEAttention
-from keras_nlp.src.utils.keras_utils import standardize_data_format
+from keras_hub.src.models.stable_diffusion_v3.vae_attention import VAEAttention
+from keras_hub.src.utils.keras_utils import standardize_data_format
 
 
 class VAEImageDecoder(keras.Model):
diff --git a/keras_nlp/src/models/t5/__init__.py b/keras_hub/src/models/t5/__init__.py
similarity index 72%
rename from keras_nlp/src/models/t5/__init__.py
rename to keras_hub/src/models/t5/__init__.py
index c8f612f62b..63c98610f9 100644
--- a/keras_nlp/src/models/t5/__init__.py
+++ b/keras_hub/src/models/t5/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.t5.t5_backbone import T5Backbone
-from keras_nlp.src.models.t5.t5_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.t5.t5_backbone import T5Backbone
+from keras_hub.src.models.t5.t5_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, T5Backbone)
diff --git a/keras_nlp/src/models/t5/t5_backbone.py b/keras_hub/src/models/t5/t5_backbone.py
similarity index 96%
rename from keras_nlp/src/models/t5/t5_backbone.py
rename to keras_hub/src/models/t5/t5_backbone.py
index 90f322acb4..49a84cf92e 100644
--- a/keras_nlp/src/models/t5/t5_backbone.py
+++ b/keras_hub/src/models/t5/t5_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,16 +14,16 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.t5.t5_layer_norm import T5LayerNorm
-from keras_nlp.src.models.t5.t5_transformer_layer import T5TransformerLayer
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.t5.t5_layer_norm import T5LayerNorm
+from keras_hub.src.models.t5.t5_transformer_layer import T5TransformerLayer
 
 
-@keras_nlp_export("keras_nlp.models.T5Backbone")
+@keras_hub_export("keras_hub.models.T5Backbone")
 class T5Backbone(Backbone):
     """T5 encoder-decoder backbone model.
 
diff --git a/keras_nlp/src/models/t5/t5_backbone_test.py b/keras_hub/src/models/t5/t5_backbone_test.py
similarity index 94%
rename from keras_nlp/src/models/t5/t5_backbone_test.py
rename to keras_hub/src/models/t5/t5_backbone_test.py
index 040cffdb84..ef15dc23e0 100644
--- a/keras_nlp/src/models/t5/t5_backbone_test.py
+++ b/keras_hub/src/models/t5/t5_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.t5.t5_backbone import T5Backbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.t5.t5_backbone import T5Backbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class T5BackboneTest(TestCase):
diff --git a/keras_nlp/src/models/t5/t5_layer_norm.py b/keras_hub/src/models/t5/t5_layer_norm.py
similarity index 96%
rename from keras_nlp/src/models/t5/t5_layer_norm.py
rename to keras_hub/src/models/t5/t5_layer_norm.py
index 2ff24c3fef..70a9d9f079 100644
--- a/keras_nlp/src/models/t5/t5_layer_norm.py
+++ b/keras_hub/src/models/t5/t5_layer_norm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/t5/t5_multi_head_attention.py b/keras_hub/src/models/t5/t5_multi_head_attention.py
similarity index 99%
rename from keras_nlp/src/models/t5/t5_multi_head_attention.py
rename to keras_hub/src/models/t5/t5_multi_head_attention.py
index 04e6b495d2..961be4709c 100644
--- a/keras_nlp/src/models/t5/t5_multi_head_attention.py
+++ b/keras_hub/src/models/t5/t5_multi_head_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/t5/t5_presets.py b/keras_hub/src/models/t5/t5_presets.py
similarity index 98%
rename from keras_nlp/src/models/t5/t5_presets.py
rename to keras_hub/src/models/t5/t5_presets.py
index 8c760c8223..e256794477 100644
--- a/keras_nlp/src/models/t5/t5_presets.py
+++ b/keras_hub/src/models/t5/t5_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/t5/t5_tokenizer.py b/keras_hub/src/models/t5/t5_tokenizer.py
similarity index 86%
rename from keras_nlp/src/models/t5/t5_tokenizer.py
rename to keras_hub/src/models/t5/t5_tokenizer.py
index 7a1987f275..8e67c918e1 100644
--- a/keras_nlp/src/models/t5/t5_tokenizer.py
+++ b/keras_hub/src/models/t5/t5_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,24 +12,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.t5.t5_backbone import T5Backbone
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.t5.t5_backbone import T5Backbone
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.T5Tokenizer",
-        "keras_nlp.models.T5Tokenizer",
+        "keras_hub.tokenizers.T5Tokenizer",
+        "keras_hub.models.T5Tokenizer",
     ]
 )
 class T5Tokenizer(SentencePieceTokenizer):
     """T5 tokenizer layer based on SentencePiece.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     T5 models and provides a `from_preset()` method to automatically
     download a matching vocabulary for a T5 preset.
@@ -64,7 +64,7 @@ class T5Tokenizer(SentencePieceTokenizer):
         eos_piece="</s>",
         unk_piece="<unk>",
     )
-    tokenizer = keras_nlp.models.T5Tokenizer(
+    tokenizer = keras_hub.models.T5Tokenizer(
         proto=bytes_io.getvalue(),
     )
     tokenizer("The quick brown fox jumped.")
diff --git a/keras_nlp/src/models/t5/t5_tokenizer_test.py b/keras_hub/src/models/t5/t5_tokenizer_test.py
similarity index 93%
rename from keras_nlp/src/models/t5/t5_tokenizer_test.py
rename to keras_hub/src/models/t5/t5_tokenizer_test.py
index b8e4835643..95f40eb887 100644
--- a/keras_nlp/src/models/t5/t5_tokenizer_test.py
+++ b/keras_hub/src/models/t5/t5_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import pytest
 
-from keras_nlp.src.models.t5.t5_tokenizer import T5Tokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.t5.t5_tokenizer import T5Tokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class T5TokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/t5/t5_transformer_layer.py b/keras_hub/src/models/t5/t5_transformer_layer.py
similarity index 96%
rename from keras_nlp/src/models/t5/t5_transformer_layer.py
rename to keras_hub/src/models/t5/t5_transformer_layer.py
index b3dfa44bc0..cce369cda6 100644
--- a/keras_nlp/src/models/t5/t5_transformer_layer.py
+++ b/keras_hub/src/models/t5/t5_transformer_layer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,11 +15,11 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.layers.modeling.transformer_layer_utils import (
+from keras_hub.src.layers.modeling.transformer_layer_utils import (
     compute_causal_mask,
 )
-from keras_nlp.src.models.t5.t5_layer_norm import T5LayerNorm
-from keras_nlp.src.models.t5.t5_multi_head_attention import T5MultiHeadAttention
+from keras_hub.src.models.t5.t5_layer_norm import T5LayerNorm
+from keras_hub.src.models.t5.t5_multi_head_attention import T5MultiHeadAttention
 
 
 class T5TransformerLayer(keras.layers.Layer):
diff --git a/keras_nlp/src/models/task.py b/keras_hub/src/models/task.py
similarity index 90%
rename from keras_nlp/src/models/task.py
rename to keras_hub/src/models/task.py
index 703bee764c..0cf206f02c 100644
--- a/keras_nlp/src/models/task.py
+++ b/keras_hub/src/models/task.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -19,24 +19,24 @@
 from rich import markup
 from rich import table as rich_table
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.keras_utils import print_msg
-from keras_nlp.src.utils.pipeline_model import PipelineModel
-from keras_nlp.src.utils.preset_utils import TASK_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import TASK_WEIGHTS_FILE
-from keras_nlp.src.utils.preset_utils import builtin_presets
-from keras_nlp.src.utils.preset_utils import find_subclass
-from keras_nlp.src.utils.preset_utils import get_preset_loader
-from keras_nlp.src.utils.preset_utils import save_serialized_object
-from keras_nlp.src.utils.python_utils import classproperty
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.keras_utils import print_msg
+from keras_hub.src.utils.pipeline_model import PipelineModel
+from keras_hub.src.utils.preset_utils import TASK_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import TASK_WEIGHTS_FILE
+from keras_hub.src.utils.preset_utils import builtin_presets
+from keras_hub.src.utils.preset_utils import find_subclass
+from keras_hub.src.utils.preset_utils import get_preset_loader
+from keras_hub.src.utils.preset_utils import save_serialized_object
+from keras_hub.src.utils.python_utils import classproperty
 
 
-@keras_nlp_export("keras_nlp.models.Task")
+@keras_hub_export("keras_hub.models.Task")
 class Task(PipelineModel):
     """Base class for all Task models.
 
-    A `Task` wraps a `keras_nlp.models.Backbone` and
-    a `keras_nlp.models.Preprocessor` to create a model that can be directly
+    A `Task` wraps a `keras_hub.models.Backbone` and
+    a `keras_hub.models.Preprocessor` to create a model that can be directly
     used for training, fine-tuning, and prediction for a given text problem.
 
     All `Task` models have `backbone` and `preprocessor` properties. By
@@ -47,8 +47,8 @@ class Task(PipelineModel):
 
     All `Task` classes include a `from_preset()` constructor which can be used
     to load a pre-trained config and weights. Calling `from_preset()` on a task
-    will automatically instantiate a `keras_nlp.models.Backbone` and
-    `keras_nlp.models.Preprocessor`.
+    will automatically instantiate a `keras_hub.models.Backbone` and
+    `keras_hub.models.Preprocessor`.
 
     Args:
         compile: boolean, defaults to `True`. If `True` will compile the model
@@ -90,7 +90,7 @@ def __setattr__(self, name, value):
 
     @property
     def backbone(self):
-        """A `keras_nlp.models.Backbone` model with the core architecture."""
+        """A `keras_hub.models.Backbone` model with the core architecture."""
         return getattr(self, "_backbone", None)
 
     @backbone.setter
@@ -99,7 +99,7 @@ def backbone(self, value):
 
     @property
     def preprocessor(self):
-        """A `keras_nlp.models.Preprocessor` layer used to preprocess input."""
+        """A `keras_hub.models.Preprocessor` layer used to preprocess input."""
         return getattr(self, "_preprocessor", None)
 
     @preprocessor.setter
@@ -141,7 +141,7 @@ def from_preset(
         load_weights=True,
         **kwargs,
     ):
-        """Instantiate a `keras_nlp.models.Task` from a model preset.
+        """Instantiate a `keras_hub.models.Task` from a model preset.
 
         A preset is a directory of configs, weights and other file assets used
         to save and load a pre-trained model. The `preset` can be passed as
@@ -156,8 +156,8 @@ def from_preset(
         built-in presets available on the class.
 
         This constructor can be called in one of two ways. Either from a task
-        specific base class like `keras_nlp.models.CausalLM.from_preset()`, or
-        from a model class like `keras_nlp.models.BertTextClassifier.from_preset()`.
+        specific base class like `keras_hub.models.CausalLM.from_preset()`, or
+        from a model class like `keras_hub.models.BertTextClassifier.from_preset()`.
         If calling from the a base class, the subclass of the returning object
         will be inferred from the config in the preset directory.
 
@@ -171,12 +171,12 @@ def from_preset(
         Examples:
         ```python
         # Load a Gemma generative task.
-        causal_lm = keras_nlp.models.CausalLM.from_preset(
+        causal_lm = keras_hub.models.CausalLM.from_preset(
             "gemma_2b_en",
         )
 
         # Load a Bert classification task.
-        model = keras_nlp.models.TextClassifier.from_preset(
+        model = keras_hub.models.TextClassifier.from_preset(
             "bert_base_en",
             num_classes=2,
         )
@@ -186,7 +186,7 @@ def from_preset(
             raise ValueError(
                 "Do not call `Task.from_preset()` directly. Instead call a "
                 "particular task class, e.g. "
-                "`keras_nlp.models.TextClassifier.from_preset()`."
+                "`keras_hub.models.TextClassifier.from_preset()`."
             )
 
         loader = get_preset_loader(preset)
diff --git a/keras_nlp/src/models/task_test.py b/keras_hub/src/models/task_test.py
similarity index 86%
rename from keras_nlp/src/models/task_test.py
rename to keras_hub/src/models/task_test.py
index 2eba398452..9fefcca050 100644
--- a/keras_nlp/src/models/task_test.py
+++ b/keras_hub/src/models/task_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,21 +18,21 @@
 import keras
 import pytest
 
-from keras_nlp.src.models.bert.bert_text_classifier import BertTextClassifier
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.models.task import Task
-from keras_nlp.src.models.text_classifier import TextClassifier
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.tokenizer import Tokenizer
-from keras_nlp.src.utils.preset_utils import CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import METADATA_FILE
-from keras_nlp.src.utils.preset_utils import MODEL_WEIGHTS_FILE
-from keras_nlp.src.utils.preset_utils import TASK_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import TASK_WEIGHTS_FILE
-from keras_nlp.src.utils.preset_utils import check_config_class
-from keras_nlp.src.utils.preset_utils import load_json
+from keras_hub.src.models.bert.bert_text_classifier import BertTextClassifier
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.models.task import Task
+from keras_hub.src.models.text_classifier import TextClassifier
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.tokenizer import Tokenizer
+from keras_hub.src.utils.preset_utils import CONFIG_FILE
+from keras_hub.src.utils.preset_utils import METADATA_FILE
+from keras_hub.src.utils.preset_utils import MODEL_WEIGHTS_FILE
+from keras_hub.src.utils.preset_utils import TASK_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import TASK_WEIGHTS_FILE
+from keras_hub.src.utils.preset_utils import check_config_class
+from keras_hub.src.utils.preset_utils import load_json
 
 
 class SimpleTokenizer(Tokenizer):
diff --git a/keras_nlp/src/models/text_classifier.py b/keras_hub/src/models/text_classifier.py
similarity index 91%
rename from keras_nlp/src/models/text_classifier.py
rename to keras_hub/src/models/text_classifier.py
index fd84cac3a3..54e042c2bd 100644
--- a/keras_nlp/src/models/text_classifier.py
+++ b/keras_hub/src/models/text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,21 +13,21 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.task import Task
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.task import Task
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.TextClassifier",
-        "keras_nlp.models.Classifier",
+        "keras_hub.models.TextClassifier",
+        "keras_hub.models.Classifier",
     ]
 )
 class TextClassifier(Task):
     """Base class for all classification tasks.
 
-    `TextClassifier` tasks wrap a `keras_nlp.models.Backbone` and
-    a `keras_nlp.models.Preprocessor` to create a model that can be used for
+    `TextClassifier` tasks wrap a `keras_hub.models.Backbone` and
+    a `keras_hub.models.Preprocessor` to create a model that can be used for
     sequence classification. `TextClassifier` tasks take an additional
     `num_classes` argument, controlling the number of predicted output classes.
 
@@ -46,7 +46,7 @@ class TextClassifier(Task):
     Example:
     ```python
     # Load a BERT classifier with pre-trained weights.
-    classifier = keras_nlp.models.TextClassifier.from_preset(
+    classifier = keras_hub.models.TextClassifier.from_preset(
         "bert_base_en",
         num_classes=2,
     )
diff --git a/keras_nlp/src/models/text_classifier_preprocessor.py b/keras_hub/src/models/text_classifier_preprocessor.py
similarity index 89%
rename from keras_nlp/src/models/text_classifier_preprocessor.py
rename to keras_hub/src/models/text_classifier_preprocessor.py
index 9b34353938..a0c8996d71 100644
--- a/keras_nlp/src/models/text_classifier_preprocessor.py
+++ b/keras_hub/src/models/text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,21 +13,21 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.models.preprocessor import Preprocessor
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.TextClassifierPreprocessor")
+@keras_hub_export("keras_hub.models.TextClassifierPreprocessor")
 class TextClassifierPreprocessor(Preprocessor):
     """Base class for text classification preprocessing layers.
 
-    `TextClassifierPreprocessor` tasks wrap a `keras_nlp.tokenizer.Tokenizer` to
+    `TextClassifierPreprocessor` tasks wrap a `keras_hub.tokenizers.Tokenizer` to
     create a preprocessing layer for text classification tasks. It is intended
-    to be paired with a `keras_nlp.models.TextClassifier` task.
+    to be paired with a `keras_hub.models.TextClassifier` task.
 
     All `TextClassifierPreprocessor` take inputs three ordered inputs, `x`, `y`,
     and `sample_weight`. `x`, the first input, should always be included. It can
@@ -49,7 +49,7 @@ class TextClassifierPreprocessor(Preprocessor):
 
     Examples.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "bert_base_en_uncased",
         sequence_length=256, # Optional.
     )
diff --git a/keras_nlp/src/models/text_classifier_preprocessor_test.py b/keras_hub/src/models/text_classifier_preprocessor_test.py
similarity index 88%
rename from keras_nlp/src/models/text_classifier_preprocessor_test.py
rename to keras_hub/src/models/text_classifier_preprocessor_test.py
index e4b3480367..036563097f 100644
--- a/keras_nlp/src/models/text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,14 +13,14 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.bert.bert_text_classifier_preprocessor import (
+from keras_hub.src.models.bert.bert_text_classifier_preprocessor import (
     BertTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTextClassifierPreprocessor(TestCase):
diff --git a/keras_hub/src/models/vgg/__init__.py b/keras_hub/src/models/vgg/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/vgg/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/vgg/vgg_backbone.py b/keras_hub/src/models/vgg/vgg_backbone.py
similarity index 94%
rename from keras_nlp/src/models/vgg/vgg_backbone.py
rename to keras_hub/src/models/vgg/vgg_backbone.py
index b215261fed..541b3600ef 100644
--- a/keras_nlp/src/models/vgg/vgg_backbone.py
+++ b/keras_hub/src/models/vgg/vgg_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 import keras
 from keras import layers
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.backbone import Backbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.backbone import Backbone
 
 
-@keras_nlp_export("keras_nlp.models.VGGBackbone")
+@keras_hub_export("keras_hub.models.VGGBackbone")
 class VGGBackbone(Backbone):
     """This class represents Keras Backbone of VGG model.
 
@@ -53,11 +53,11 @@ class VGGBackbone(Backbone):
     input_data = np.ones((2, 224, 224, 3), dtype="float32")
 
     # Pretrained VGG backbone.
-    model = keras_nlp.models.VGGBackbone.from_preset("vgg16")
+    model = keras_hub.models.VGGBackbone.from_preset("vgg16")
     model(input_data)
 
     # Randomly initialized VGG backbone with a custom config.
-    model = keras_nlp.models.VGGBackbone(
+    model = keras_hub.models.VGGBackbone(
         stackwise_num_repeats = [2, 2, 3, 3, 3],
         stackwise_num_filters = [64, 128, 256, 512, 512],
         image_shape = (224, 224, 3),
diff --git a/keras_nlp/src/models/vgg/vgg_backbone_test.py b/keras_hub/src/models/vgg/vgg_backbone_test.py
similarity index 90%
rename from keras_nlp/src/models/vgg/vgg_backbone_test.py
rename to keras_hub/src/models/vgg/vgg_backbone_test.py
index d5521ca92d..38f7d03606 100644
--- a/keras_nlp/src/models/vgg/vgg_backbone_test.py
+++ b/keras_hub/src/models/vgg/vgg_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.vgg.vgg_backbone import VGGBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.vgg.vgg_backbone import VGGBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class VGGBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/vgg/vgg_image_classifier.py b/keras_hub/src/models/vgg/vgg_image_classifier.py
similarity index 87%
rename from keras_nlp/src/models/vgg/vgg_image_classifier.py
rename to keras_hub/src/models/vgg/vgg_image_classifier.py
index d849586ed8..6b9733c250 100644
--- a/keras_nlp/src/models/vgg/vgg_image_classifier.py
+++ b/keras_hub/src/models/vgg/vgg_image_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,17 +13,17 @@
 # limitations under the License.
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.image_classifier import ImageClassifier
-from keras_nlp.src.models.vgg.vgg_backbone import VGGBackbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.image_classifier import ImageClassifier
+from keras_hub.src.models.vgg.vgg_backbone import VGGBackbone
 
 
-@keras_nlp_export("keras_nlp.models.VGGImageClassifier")
+@keras_hub_export("keras_hub.models.VGGImageClassifier")
 class VGGImageClassifier(ImageClassifier):
     """VGG16 image classifier task model.
 
     Args:
-      backbone: A `keras_nlp.models.VGGBackbone` instance.
+      backbone: A `keras_hub.models.VGGBackbone` instance.
       num_classes: int, number of classes to predict.
       pooling: str, type of pooling layer. Must be one of "avg", "max".
       activation: Optional `str` or callable, defaults to "softmax". The
@@ -41,7 +41,7 @@ class VGGImageClassifier(ImageClassifier):
     # Load preset and train
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
-    classifier = keras_nlp.models.VGGImageClassifier.from_preset(
+    classifier = keras_hub.models.VGGImageClassifier.from_preset(
         'vgg_16_image_classifier')
     classifier.fit(x=images, y=labels, batch_size=2)
 
@@ -62,14 +62,14 @@ class VGGImageClassifier(ImageClassifier):
     images = np.ones((2, 224, 224, 3), dtype="float32")
     labels = [0, 3]
 
-    backbone = keras_nlp.models.VGGBackbone(
+    backbone = keras_hub.models.VGGBackbone(
         stackwise_num_repeats = [2, 2, 3, 3, 3],
         stackwise_num_filters = [64, 128, 256, 512, 512],
         image_shape = (224, 224, 3),
         include_rescaling = False,
         pooling = "avg",
     )
-    classifier = keras_nlp.models.VGGImageClassifier(
+    classifier = keras_hub.models.VGGImageClassifier(
         backbone=backbone,
         num_classes=4,
     )
diff --git a/keras_nlp/src/models/vgg/vgg_image_classifier_test.py b/keras_hub/src/models/vgg/vgg_image_classifier_test.py
similarity index 89%
rename from keras_nlp/src/models/vgg/vgg_image_classifier_test.py
rename to keras_hub/src/models/vgg/vgg_image_classifier_test.py
index 20d855cb66..83ec811bbf 100644
--- a/keras_nlp/src/models/vgg/vgg_image_classifier_test.py
+++ b/keras_hub/src/models/vgg/vgg_image_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 The KerasNLP Authors
+# Copyright 2023 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,9 +14,9 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.vgg.vgg_backbone import VGGBackbone
-from keras_nlp.src.models.vgg.vgg_image_classifier import VGGImageClassifier
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.vgg.vgg_backbone import VGGBackbone
+from keras_hub.src.models.vgg.vgg_image_classifier import VGGImageClassifier
+from keras_hub.src.tests.test_case import TestCase
 
 
 class VGGImageClassifierTest(TestCase):
diff --git a/keras_hub/src/models/vit_det/__init__.py b/keras_hub/src/models/vit_det/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/vit_det/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/vit_det/vit_det_backbone.py b/keras_hub/src/models/vit_det/vit_det_backbone.py
similarity index 94%
rename from keras_nlp/src/models/vit_det/vit_det_backbone.py
rename to keras_hub/src/models/vit_det/vit_det_backbone.py
index 1e83e94b05..0aed62fd11 100644
--- a/keras_nlp/src/models/vit_det/vit_det_backbone.py
+++ b/keras_hub/src/models/vit_det/vit_det_backbone.py
@@ -15,14 +15,14 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.vit_det.vit_layers import AddPositionalEmbedding
-from keras_nlp.src.models.vit_det.vit_layers import ViTDetPatchingAndEmbedding
-from keras_nlp.src.models.vit_det.vit_layers import WindowedTransformerEncoder
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.vit_det.vit_layers import AddPositionalEmbedding
+from keras_hub.src.models.vit_det.vit_layers import ViTDetPatchingAndEmbedding
+from keras_hub.src.models.vit_det.vit_layers import WindowedTransformerEncoder
 
 
-@keras_nlp_export("keras_nlp.models.ViTDetBackbone")
+@keras_hub_export("keras_hub.models.ViTDetBackbone")
 class ViTDetBackbone(Backbone):
     """An implementation of ViT image encoder.
 
@@ -70,11 +70,11 @@ class ViTDetBackbone(Backbone):
     input_data = np.ones((2, 224, 224, 3), dtype="float32")
 
     # Pretrained ViTDetBackbone backbone.
-    model = keras_nlp.models.ViTDetBackbone.from_preset("vit_det")
+    model = keras_hub.models.ViTDetBackbone.from_preset("vit_det")
     model(input_data)
 
     # Randomly initialized ViTDetBackbone backbone with a custom config.
-    model = keras_nlp.models.ViTDetBackbone(
+    model = keras_hub.models.ViTDetBackbone(
             image_shape = (16, 16, 3),
             patch_size = 2,
             hidden_size = 4,
diff --git a/keras_nlp/src/models/vit_det/vit_det_backbone_test.py b/keras_hub/src/models/vit_det/vit_det_backbone_test.py
similarity index 91%
rename from keras_nlp/src/models/vit_det/vit_det_backbone_test.py
rename to keras_hub/src/models/vit_det/vit_det_backbone_test.py
index 0ae277d122..d8c1b2d24c 100644
--- a/keras_nlp/src/models/vit_det/vit_det_backbone_test.py
+++ b/keras_hub/src/models/vit_det/vit_det_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.vit_det.vit_det_backbone import ViTDetBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.vit_det.vit_det_backbone import ViTDetBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ViTDetBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/vit_det/vit_layers.py b/keras_hub/src/models/vit_det/vit_layers.py
similarity index 100%
rename from keras_nlp/src/models/vit_det/vit_layers.py
rename to keras_hub/src/models/vit_det/vit_layers.py
diff --git a/keras_nlp/src/models/whisper/__init__.py b/keras_hub/src/models/whisper/__init__.py
similarity index 73%
rename from keras_nlp/src/models/whisper/__init__.py
rename to keras_hub/src/models/whisper/__init__.py
index 8223325ed2..c0bb8d319e 100644
--- a/keras_nlp/src/models/whisper/__init__.py
+++ b/keras_hub/src/models/whisper/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.whisper.whisper_backbone import WhisperBackbone
-from keras_nlp.src.models.whisper.whisper_presets import backbone_presets
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.models.whisper.whisper_backbone import WhisperBackbone
+from keras_hub.src.models.whisper.whisper_presets import backbone_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, WhisperBackbone)
diff --git a/keras_nlp/src/models/whisper/whisper_audio_converter.py b/keras_hub/src/models/whisper/whisper_audio_converter.py
similarity index 96%
rename from keras_nlp/src/models/whisper/whisper_audio_converter.py
rename to keras_hub/src/models/whisper/whisper_audio_converter.py
index d7cead0094..4647fe9f13 100644
--- a/keras_nlp/src/models/whisper/whisper_audio_converter.py
+++ b/keras_hub/src/models/whisper/whisper_audio_converter.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,9 +15,9 @@
 
 import numpy as np
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.audio_converter import AudioConverter
-from keras_nlp.src.models.whisper.whisper_backbone import WhisperBackbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.audio_converter import AudioConverter
+from keras_hub.src.models.whisper.whisper_backbone import WhisperBackbone
 
 try:
     import tensorflow as tf
@@ -25,7 +25,7 @@
     tf = None
 
 
-@keras_nlp_export("keras_nlp.layers.WhisperAudioConverter")
+@keras_hub_export("keras_hub.layers.WhisperAudioConverter")
 class WhisperAudioConverter(AudioConverter):
     """Whisper audio converter layer.
 
@@ -54,7 +54,7 @@ class WhisperAudioConverter(AudioConverter):
     audio_tensor = tf.ones((8000,), dtype="float32")
 
     # Compute the log-mel spectrogram.
-    audio_converter = keras_nlp.models.WhisperAudioConverter.from_preset(
+    audio_converter = keras_hub.layers.WhisperAudioConverter.from_preset(
         "whisper_base_en",
     )
     audio_converter(audio_tensor)
diff --git a/keras_nlp/src/models/whisper/whisper_audio_converter_test.py b/keras_hub/src/models/whisper/whisper_audio_converter_test.py
similarity index 91%
rename from keras_nlp/src/models/whisper/whisper_audio_converter_test.py
rename to keras_hub/src/models/whisper/whisper_audio_converter_test.py
index 16923787f4..bcadbd409c 100644
--- a/keras_nlp/src/models/whisper/whisper_audio_converter_test.py
+++ b/keras_hub/src/models/whisper/whisper_audio_converter_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,10 +14,10 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.models.whisper.whisper_audio_converter import (
+from keras_hub.src.models.whisper.whisper_audio_converter import (
     WhisperAudioConverter,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class WhisperAudioConverterTest(TestCase):
diff --git a/keras_nlp/src/models/whisper/whisper_backbone.py b/keras_hub/src/models/whisper/whisper_backbone.py
similarity index 95%
rename from keras_nlp/src/models/whisper/whisper_backbone.py
rename to keras_hub/src/models/whisper/whisper_backbone.py
index e104494b2b..ec0b9c17e0 100644
--- a/keras_nlp/src/models/whisper/whisper_backbone.py
+++ b/keras_hub/src/models/whisper/whisper_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,14 +16,14 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.token_and_position_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.position_embedding import PositionEmbedding
+from keras_hub.src.layers.modeling.token_and_position_embedding import (
     TokenAndPositionEmbedding,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.whisper.whisper_decoder import WhisperDecoder
-from keras_nlp.src.models.whisper.whisper_encoder import WhisperEncoder
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.whisper.whisper_decoder import WhisperDecoder
+from keras_hub.src.models.whisper.whisper_encoder import WhisperEncoder
 
 
 def whisper_kernel_initializer(stddev=0.02):
@@ -35,7 +35,7 @@ def call(self, x):
         return ops.pad(x, [[0, 0], [1, 1], [0, 0]])
 
 
-@keras_nlp_export("keras_nlp.models.WhisperBackbone")
+@keras_hub_export("keras_hub.models.WhisperBackbone")
 class WhisperBackbone(Backbone):
     """A Whisper encoder-decoder network for speech.
 
@@ -89,7 +89,7 @@ class WhisperBackbone(Backbone):
     }
 
     # Randomly initialized Whisper encoder-decoder model with a custom config.
-    model = keras_nlp.models.WhisperBackbone(
+    model = keras_hub.models.WhisperBackbone(
         vocabulary_size=51864,
         num_layers=4,
         num_heads=4,
@@ -240,7 +240,7 @@ def __init__(
         # The position embedding layer for the encoder is a sinusoidal embedding
         # layer: https://github.com/openai/whisper/blob/v20230124/whisper/model.py#L137.
         # Hence, we set it to be non-trainable.
-        # TODO: We can use `keras_nlp.layers.SinePositionEncoding` layer.
+        # TODO: We can use `keras_hub.layers.SinePositionEncoding` layer.
         positions = self.encoder_position_embedding(embedded_features)
         x = self.encoder_embeddings_add((embedded_features, positions))
         x = self.encoder_embeddings_dropout(x)
diff --git a/keras_nlp/src/models/whisper/whisper_backbone_test.py b/keras_hub/src/models/whisper/whisper_backbone_test.py
similarity index 96%
rename from keras_nlp/src/models/whisper/whisper_backbone_test.py
rename to keras_hub/src/models/whisper/whisper_backbone_test.py
index 33bc0e2871..9cce1cb1b4 100644
--- a/keras_nlp/src/models/whisper/whisper_backbone_test.py
+++ b/keras_hub/src/models/whisper/whisper_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.whisper.whisper_backbone import WhisperBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.whisper.whisper_backbone import WhisperBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class WhisperBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/whisper/whisper_cached_multi_head_attention.py b/keras_hub/src/models/whisper/whisper_cached_multi_head_attention.py
similarity index 95%
rename from keras_nlp/src/models/whisper/whisper_cached_multi_head_attention.py
rename to keras_hub/src/models/whisper/whisper_cached_multi_head_attention.py
index c7566c9d2f..6236df5798 100644
--- a/keras_nlp/src/models/whisper/whisper_cached_multi_head_attention.py
+++ b/keras_hub/src/models/whisper/whisper_cached_multi_head_attention.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,7 +18,7 @@
 
 import keras
 
-from keras_nlp.src.layers.modeling.cached_multi_head_attention import (
+from keras_hub.src.layers.modeling.cached_multi_head_attention import (
     CachedMultiHeadAttention,
 )
 
@@ -64,11 +64,11 @@ def _get_output_shape(output_rank, known_last_dims):
     return [None] * (output_rank - len(known_last_dims)) + list(known_last_dims)
 
 
-@keras.saving.register_keras_serializable(package="keras_nlp")
+@keras.saving.register_keras_serializable(package="keras_hub")
 class WhisperCachedMultiHeadAttention(CachedMultiHeadAttention):
     """Whisper Cached Multi-Head Attention layer.
 
-    Inherits from `keras_nlp.layers.CachedMultiHeadAttention`, and overrides the
+    Inherits from `keras_hub.layers.CachedMultiHeadAttention`, and overrides the
     `build` method so that Q, V projection layers have bias
     whereas K projection layer does not.
     """
diff --git a/keras_nlp/src/models/whisper/whisper_decoder.py b/keras_hub/src/models/whisper/whisper_decoder.py
similarity index 90%
rename from keras_nlp/src/models/whisper/whisper_decoder.py
rename to keras_hub/src/models/whisper/whisper_decoder.py
index 126a082967..bea5b46a0f 100644
--- a/keras_nlp/src/models/whisper/whisper_decoder.py
+++ b/keras_hub/src/models/whisper/whisper_decoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,23 +16,23 @@
 
 import keras
 
-from keras_nlp.src.layers.modeling.transformer_decoder import TransformerDecoder
-from keras_nlp.src.models.whisper.whisper_cached_multi_head_attention import (
+from keras_hub.src.layers.modeling.transformer_decoder import TransformerDecoder
+from keras_hub.src.models.whisper.whisper_cached_multi_head_attention import (
     WhisperCachedMultiHeadAttention,
 )
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
-@keras.saving.register_keras_serializable(package="keras_nlp")
+@keras.saving.register_keras_serializable(package="keras_hub")
 class WhisperDecoder(TransformerDecoder):
     """Whisper decoder.
 
-    Inherits from `keras_nlp.layers.TransformerDecoder`, and overrides the
+    Inherits from `keras_hub.layers.TransformerDecoder`, and overrides the
     `build` method to use the
-    `keras_nlp.models.whisper.whisper_multi_head_attention.WhisperMultiHeadAttention`
+    `keras_hub.models.whisper.whisper_multi_head_attention.WhisperMultiHeadAttention`
     layer instead of `keras.layers.MultiHeadAttention` and
-    `keras_nlp.models.whisper.whisper_cached_multi_head_attention.WhisperCachedMultiHeadAttention`
-    instead of `keras_nlp.layers.cached_multi_head_attention.CachedMultiHeadAttention`.
+    `keras_hub.models.whisper.whisper_cached_multi_head_attention.WhisperCachedMultiHeadAttention`
+    instead of `keras_hub.layers.cached_multi_head_attention.CachedMultiHeadAttention`.
     """
 
     def build(
diff --git a/keras_nlp/src/models/whisper/whisper_encoder.py b/keras_hub/src/models/whisper/whisper_encoder.py
similarity index 90%
rename from keras_nlp/src/models/whisper/whisper_encoder.py
rename to keras_hub/src/models/whisper/whisper_encoder.py
index 71ebf48757..72e316ef39 100644
--- a/keras_nlp/src/models/whisper/whisper_encoder.py
+++ b/keras_hub/src/models/whisper/whisper_encoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,20 +16,20 @@
 
 import keras
 
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.models.whisper.whisper_cached_multi_head_attention import (
+from keras_hub.src.layers.modeling.transformer_encoder import TransformerEncoder
+from keras_hub.src.models.whisper.whisper_cached_multi_head_attention import (
     WhisperCachedMultiHeadAttention,
 )
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
-@keras.saving.register_keras_serializable(package="keras_nlp")
+@keras.saving.register_keras_serializable(package="keras_hub")
 class WhisperEncoder(TransformerEncoder):
     """Whisper encoder.
 
-    Inherits from `keras_nlp.layers.TransformerEncoder`, and overrides the
+    Inherits from `keras_hub.layers.TransformerEncoder`, and overrides the
     `_build` method to use the
-    `keras_nlp.models.whisper.whisper_multi_head_attention.WhisperCachedMultiHeadAttention`
+    `keras_hub.models.whisper.whisper_cached_multi_head_attention.WhisperCachedMultiHeadAttention`
     layer instead of `keras.layers.MultiHeadAttention`.
     """
 
diff --git a/keras_nlp/src/models/whisper/whisper_presets.py b/keras_hub/src/models/whisper/whisper_presets.py
similarity index 99%
rename from keras_nlp/src/models/whisper/whisper_presets.py
rename to keras_hub/src/models/whisper/whisper_presets.py
index 1a7844bbbc..3e306584df 100644
--- a/keras_nlp/src/models/whisper/whisper_presets.py
+++ b/keras_hub/src/models/whisper/whisper_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/whisper/whisper_tokenizer.py b/keras_hub/src/models/whisper/whisper_tokenizer.py
similarity index 93%
rename from keras_nlp/src/models/whisper/whisper_tokenizer.py
rename to keras_hub/src/models/whisper/whisper_tokenizer.py
index 972c33502d..39d1e46cef 100644
--- a/keras_nlp/src/models/whisper/whisper_tokenizer.py
+++ b/keras_hub/src/models/whisper/whisper_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,9 +14,9 @@
 
 import json
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.whisper.whisper_backbone import WhisperBackbone
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.whisper.whisper_backbone import WhisperBackbone
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 
 def _load_dict(dict_or_path):
@@ -26,17 +26,17 @@ def _load_dict(dict_or_path):
     return dict_or_path
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.WhisperTokenizer",
-        "keras_nlp.models.WhisperTokenizer",
+        "keras_hub.tokenizers.WhisperTokenizer",
+        "keras_hub.models.WhisperTokenizer",
     ]
 )
 class WhisperTokenizer(BytePairTokenizer):
     """Whisper text tokenizer using Byte-Pair Encoding subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.BytePairTokenizer`.
+    is based on `keras_hub.tokenizers.BytePairTokenizer`.
     This tokenizer does not provide truncation or padding of inputs.
 
     Args:
diff --git a/keras_nlp/src/models/whisper/whisper_tokenizer_test.py b/keras_hub/src/models/whisper/whisper_tokenizer_test.py
similarity index 95%
rename from keras_nlp/src/models/whisper/whisper_tokenizer_test.py
rename to keras_hub/src/models/whisper/whisper_tokenizer_test.py
index 03cc856b3a..df9f25f3e0 100644
--- a/keras_nlp/src/models/whisper/whisper_tokenizer_test.py
+++ b/keras_hub/src/models/whisper/whisper_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import pytest
 
-from keras_nlp.src.models.whisper.whisper_tokenizer import WhisperTokenizer
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.whisper.whisper_tokenizer import WhisperTokenizer
+from keras_hub.src.tests.test_case import TestCase
 
 
 class WhisperTokenizerTest(TestCase):
diff --git a/keras_nlp/src/models/xlm_roberta/__init__.py b/keras_hub/src/models/xlm_roberta/__init__.py
similarity index 74%
rename from keras_nlp/src/models/xlm_roberta/__init__.py
rename to keras_hub/src/models/xlm_roberta/__init__.py
index e2249aa81b..7832950edb 100644
--- a/keras_nlp/src/models/xlm_roberta/__init__.py
+++ b/keras_hub/src/models/xlm_roberta/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,12 +12,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_presets import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_presets import (
     backbone_presets,
 )
-from keras_nlp.src.utils.preset_utils import register_presets
+from keras_hub.src.utils.preset_utils import register_presets
 
 register_presets(backbone_presets, XLMRobertaBackbone)
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_backbone.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_backbone.py
similarity index 90%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_backbone.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_backbone.py
index acda2b8baf..1d532d22fa 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_backbone.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.roberta import roberta_backbone
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.roberta import roberta_backbone
 
 
-@keras_nlp_export("keras_nlp.models.XLMRobertaBackbone")
+@keras_hub_export("keras_hub.models.XLMRobertaBackbone")
 class XLMRobertaBackbone(roberta_backbone.RobertaBackbone):
     """An XLM-RoBERTa encoder network.
 
@@ -62,13 +62,13 @@ class XLMRobertaBackbone(roberta_backbone.RobertaBackbone):
     }
 
     # Pretrained XLM-R encoder.
-    model = keras_nlp.models.XLMRobertaBackbone.from_preset(
+    model = keras_hub.models.XLMRobertaBackbone.from_preset(
         "xlm_roberta_base_multi",
     )
     model(input_data)
 
     # Randomly initialized XLM-R model with custom config.
-    model = keras_nlp.models.XLMRobertaBackbone(
+    model = keras_hub.models.XLMRobertaBackbone(
         vocabulary_size=250002,
         num_layers=4,
         num_heads=4,
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_backbone_test.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_backbone_test.py
similarity index 94%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_backbone_test.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_backbone_test.py
index 404201d662..d79fe672c5 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_backbone_test.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,10 +15,10 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class XLMRobertaBackboneTest(TestCase):
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm.py
similarity index 85%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm.py
index dd9a85db9e..b51e23f9f6 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,21 +15,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.models.masked_lm import MaskedLM
-from keras_nlp.src.models.roberta.roberta_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.modeling.masked_lm_head import MaskedLMHead
+from keras_hub.src.models.masked_lm import MaskedLM
+from keras_hub.src.models.roberta.roberta_backbone import (
     roberta_kernel_initializer,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
     XLMRobertaMaskedLMPreprocessor,
 )
 
 
-@keras_nlp_export("keras_nlp.models.XLMRobertaMaskedLM")
+@keras_hub_export("keras_hub.models.XLMRobertaMaskedLM")
 class XLMRobertaMaskedLM(MaskedLM):
     """An end-to-end XLM-RoBERTa model for the masked language modeling task.
 
@@ -50,8 +50,8 @@ class XLMRobertaMaskedLM(MaskedLM):
     [here](https://github.com/facebookresearch/fairseq).
 
     Args:
-        backbone: A `keras_nlp.models.XLMRobertaBackbone` instance.
-        preprocessor: A `keras_nlp.models.XLMRobertaMaskedLMPreprocessor` or
+        backbone: A `keras_hub.models.XLMRobertaBackbone` instance.
+        preprocessor: A `keras_hub.models.XLMRobertaMaskedLMPreprocessor` or
             `None`. If `None`, this model will not apply preprocessing, and
             inputs should be preprocessed before calling the model.
 
@@ -64,7 +64,7 @@ class XLMRobertaMaskedLM(MaskedLM):
 
     # Pretrained language model
     # on an MLM task.
-    masked_lm = keras_nlp.models.XLMRobertaMaskedLM.from_preset(
+    masked_lm = keras_hub.models.XLMRobertaMaskedLM.from_preset(
         "xlm_roberta_base_multi",
     )
     masked_lm.fit(x=features, batch_size=2)
@@ -93,7 +93,7 @@ class XLMRobertaMaskedLM(MaskedLM):
     # Labels are the original masked values.
     labels = [[3, 5]] * 2
 
-    masked_lm = keras_nlp.models.XLMRobertaMaskedLM.from_preset(
+    masked_lm = keras_hub.models.XLMRobertaMaskedLM.from_preset(
         "xlm_roberta_base_multi",
         preprocessor=None,
     )
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor.py
similarity index 88%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor.py
index d1f028aac4..c7d0bf64de 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,27 +14,27 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
     XLMRobertaTokenizer,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export("keras_nlp.models.XLMRobertaMaskedLMPreprocessor")
+@keras_hub_export("keras_hub.models.XLMRobertaMaskedLMPreprocessor")
 class XLMRobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
     """XLM-RoBERTa preprocessing for the masked language modeling task.
 
     This preprocessing layer will prepare inputs for a masked language modeling
     task. It is primarily intended for use with the
-    `keras_nlp.models.XLMRobertaMaskedLM` task model. Preprocessing will occur in
+    `keras_hub.models.XLMRobertaMaskedLM` task model. Preprocessing will occur in
     multiple steps.
 
     1. Tokenize any number of input segments using the `tokenizer`.
@@ -45,10 +45,10 @@ class XLMRobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
     3. Randomly select non-special tokens to mask, controlled by
       `mask_selection_rate`.
     4. Construct a `(x, y, sample_weight)` tuple suitable for training with a
-      `keras_nlp.models.XLMRobertaMaskedLM` task model.
+      `keras_hub.models.XLMRobertaMaskedLM` task model.
 
     Args:
-        tokenizer: A `keras_nlp.models.XLMRobertaTokenizer` instance.
+        tokenizer: A `keras_hub.models.XLMRobertaTokenizer` instance.
         sequence_length: int. The length of the packed inputs.
         truncate: string. The algorithm to truncate a list of batched segments
             to fit within `sequence_length`. The value can be either
@@ -85,7 +85,7 @@ class XLMRobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
     Directly calling the layer on data.
     ```python
     # Load the preprocessor from a preset.
-    preprocessor = keras_nlp.models.XLMRobertaMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.XLMRobertaMaskedLMPreprocessor.from_preset(
         "xlm_roberta_base_multi"
     )
 
@@ -102,7 +102,7 @@ class XLMRobertaMaskedLMPreprocessor(MaskedLMPreprocessor):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.XLMRobertaMaskedLMPreprocessor.from_preset(
+    preprocessor = keras_hub.models.XLMRobertaMaskedLMPreprocessor.from_preset(
         "xlm_roberta_base_multi"
     )
     first = tf.constant(["The quick brown fox jumped.", "Call me Ishmael."])
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor_test.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor_test.py
similarity index 93%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor_test.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor_test.py
index 4a743efcd0..b878207622 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor_test.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 
 import pytest
 
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
     XLMRobertaMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
     XLMRobertaTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class XLMRobertaMaskedLMPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_test.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_test.py
similarity index 88%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_test.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_test.py
index b2b8226fdc..a21f0fe408 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_masked_lm_test.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_masked_lm_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,19 +16,19 @@
 
 import pytest
 
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_masked_lm import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_masked_lm import (
     XLMRobertaMaskedLM,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
     XLMRobertaMaskedLMPreprocessor,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
     XLMRobertaTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class XLMRobertaMaskedLMTest(TestCase):
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_presets.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_presets.py
similarity index 97%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_presets.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_presets.py
index 482b9ce1cb..d34d90d860 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_presets.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier.py
similarity index 87%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier.py
index 904fa0f4fd..5a140b4c2a 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,30 +15,30 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.roberta.roberta_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.roberta.roberta_backbone import (
     roberta_kernel_initializer,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.text_classifier import TextClassifier
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
     XLMRobertaTextClassifierPreprocessor,
 )
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.XLMRobertaTextClassifier",
-        "keras_nlp.models.XLMRobertaClassifier",
+        "keras_hub.models.XLMRobertaTextClassifier",
+        "keras_hub.models.XLMRobertaClassifier",
     ]
 )
 class XLMRobertaTextClassifier(TextClassifier):
     """An end-to-end XLM-RoBERTa model for classification tasks.
 
     This model attaches a classification head to a
-    `keras_nlp.model.XLMRobertaBackbone` instance, mapping from the backbone
+    `keras_hub.models.XLMRobertaBackbone` instance, mapping from the backbone
     outputs to logits suitable for a classification task. For usage of
     this model with pre-trained weights, see the `from_preset()` constructor.
 
@@ -53,9 +53,9 @@ class XLMRobertaTextClassifier(TextClassifier):
     [here](https://github.com/facebookresearch/fairseq).
 
     Args:
-        backbone: A `keras_nlp.models.XLMRobertaBackbone` instance.
+        backbone: A `keras_hub.models.XLMRobertaBackbone` instance.
         num_classes: int. Number of classes to predict.
-        preprocessor: A `keras_nlp.models.XLMRobertaTextClassifierPreprocessor` or `None`. If
+        preprocessor: A `keras_hub.models.XLMRobertaTextClassifierPreprocessor` or `None`. If
             `None`, this model will not apply preprocessing, and inputs should
             be preprocessed before calling the model.
         activation: Optional `str` or callable. The activation function to use
@@ -73,7 +73,7 @@ class XLMRobertaTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier.
-    classifier = keras_nlp.models.XLMRobertaTextClassifier.from_preset(
+    classifier = keras_hub.models.XLMRobertaTextClassifier.from_preset(
         "xlm_roberta_base_multi",
         num_classes=4,
     )
@@ -101,7 +101,7 @@ class XLMRobertaTextClassifier(TextClassifier):
     labels = [0, 3]
 
     # Pretrained classifier without preprocessing.
-    classifier = keras_nlp.models.XLMRobertaTextClassifier.from_preset(
+    classifier = keras_hub.models.XLMRobertaTextClassifier.from_preset(
         "xlm_roberta_base_multi",
         num_classes=4,
         preprocessor=None,
@@ -130,14 +130,14 @@ def train_sentencepiece(ds, vocab_size):
         ["the quick brown fox", "the earth is round"]
     )
     proto = train_sentencepiece(ds, vocab_size=10)
-    tokenizer = keras_nlp.models.XLMRobertaTokenizer(
+    tokenizer = keras_hub.models.XLMRobertaTokenizer(
         proto=proto
     )
-    preprocessor = keras_nlp.models.XLMRobertaTextClassifierPreprocessor(
+    preprocessor = keras_hub.models.XLMRobertaTextClassifierPreprocessor(
         tokenizer,
         sequence_length=128,
     )
-    backbone = keras_nlp.models.XLMRobertaBackbone(
+    backbone = keras_hub.models.XLMRobertaBackbone(
         vocabulary_size=250002,
         num_layers=4,
         num_heads=4,
@@ -145,7 +145,7 @@ def train_sentencepiece(ds, vocab_size):
         intermediate_dim=512,
         max_sequence_length=128,
     )
-    classifier = keras_nlp.models.XLMRobertaTextClassifier(
+    classifier = keras_hub.models.XLMRobertaTextClassifier(
         backbone=backbone,
         preprocessor=preprocessor,
         num_classes=4,
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor.py
similarity index 86%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor.py
index 52984e9ad7..921b06cbf9 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,26 +15,26 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.multi_segment_packer import (
     MultiSegmentPacker,
 )
-from keras_nlp.src.models.text_classifier_preprocessor import (
+from keras_hub.src.models.text_classifier_preprocessor import (
     TextClassifierPreprocessor,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
     XLMRobertaTokenizer,
 )
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.models.XLMRobertaTextClassifierPreprocessor",
-        "keras_nlp.models.XLMRobertaPreprocessor",
+        "keras_hub.models.XLMRobertaTextClassifierPreprocessor",
+        "keras_hub.models.XLMRobertaPreprocessor",
     ]
 )
 class XLMRobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
@@ -43,7 +43,7 @@ class XLMRobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
     This preprocessing layer will do three things:
 
     1. Tokenize any number of input segments using the `tokenizer`.
-    2. Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
+    2. Pack the inputs together using a `keras_hub.layers.MultiSegmentPacker`
       with the appropriate `"<s>"`, `"</s>"` and `"<pad>"` tokens, i.e., adding
       a single `"<s>"` at the start of the entire sequence, `"</s></s>"` at the
       end of each segment, save the last and a `"</s>"` at the end of the
@@ -56,7 +56,7 @@ class XLMRobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
     `keras.Model.fit`.
 
     Args:
-        tokenizer: A `keras_nlp.tokenizers.XLMRobertaTokenizer` instance.
+        tokenizer: A `keras_hub.tokenizers.XLMRobertaTokenizer` instance.
         sequence_length: The length of the packed inputs.
         truncate: The algorithm to truncate a list of batched segments to fit
             within `sequence_length`. The value can be either `round_robin` or
@@ -81,7 +81,7 @@ class XLMRobertaTextClassifierPreprocessor(TextClassifierPreprocessor):
 
     Directly calling the layer on data.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "xlm_roberta_base_multi"
     )
 
@@ -114,8 +114,8 @@ def train_sentencepiece(ds, vocab_size):
         ["the quick brown fox", "the earth is round"]
     )
     proto = train_sentencepiece(ds, vocab_size=10)
-    tokenizer = keras_nlp.models.XLMRobertaTokenizer(proto=proto)
-    preprocessor = keras_nlp.models.XLMRobertaTextClassifierPreprocessor(
+    tokenizer = keras_hub.models.XLMRobertaTokenizer(proto=proto)
+    preprocessor = keras_hub.models.XLMRobertaTextClassifierPreprocessor(
         tokenizer
     )
     preprocessor("The quick brown fox jumped.")
@@ -123,7 +123,7 @@ def train_sentencepiece(ds, vocab_size):
 
     Mapping with `tf.data.Dataset`.
     ```python
-    preprocessor = keras_nlp.models.TextClassifierPreprocessor.from_preset(
+    preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
         "xlm_roberta_base_multi"
     )
 
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor_test.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor_test.py
similarity index 91%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor_test.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor_test.py
index 103e0ae8a7..652e76fb9d 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor_test.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_preprocessor_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,13 +16,13 @@
 
 import pytest
 
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
     XLMRobertaTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
     XLMRobertaTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class XLMRobertaTextClassifierPreprocessorTest(TestCase):
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_test.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_test.py
similarity index 88%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_test.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_test.py
index a618da07a2..5ff67a1c5a 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_text_classifier_test.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_text_classifier_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,19 +16,19 @@
 
 import pytest
 
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier import (
     XLMRobertaTextClassifier,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
     XLMRobertaTextClassifierPreprocessor,
 )
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
     XLMRobertaTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class XLMRobertaTextClassifierTest(TestCase):
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_tokenizer.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_tokenizer.py
similarity index 91%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_tokenizer.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_tokenizer.py
index 1fba910f10..4f60139bc6 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_tokenizer.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,14 +13,14 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.xlm_roberta.xlm_roberta_backbone import (
     XLMRobertaBackbone,
 )
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
-from keras_nlp.src.utils.tensor_utils import tensor_to_list
+from keras_hub.src.utils.tensor_utils import tensor_to_list
 
 try:
     import tensorflow as tf
@@ -28,17 +28,17 @@
     tf = None
 
 
-@keras_nlp_export(
+@keras_hub_export(
     [
-        "keras_nlp.tokenizers.XLMRobertaTokenizer",
-        "keras_nlp.models.XLMRobertaTokenizer",
+        "keras_hub.tokenizers.XLMRobertaTokenizer",
+        "keras_hub.models.XLMRobertaTokenizer",
     ]
 )
 class XLMRobertaTokenizer(SentencePieceTokenizer):
     """An XLM-RoBERTa tokenizer using SentencePiece subword segmentation.
 
     This tokenizer class will tokenize raw strings into integer sequences and
-    is based on `keras_nlp.tokenizers.SentencePieceTokenizer`. Unlike the
+    is based on `keras_hub.tokenizers.SentencePieceTokenizer`. Unlike the
     underlying tokenizer, it will check for all special tokens needed by
     XLM-RoBERTa models and provides a `from_preset()` method to automatically
     download a matching vocabulary for an XLM-RoBERTa preset.
@@ -62,7 +62,7 @@ class XLMRobertaTokenizer(SentencePieceTokenizer):
 
     Examples:
     ```python
-    tokenizer = keras_nlp.models.XLMRobertaTokenizer.from_preset(
+    tokenizer = keras_hub.models.XLMRobertaTokenizer.from_preset(
         "xlm_roberta_base_multi",
     )
 
@@ -93,7 +93,7 @@ def train_sentencepiece(ds, vocab_size):
         ["the quick brown fox", "the earth is round"]
     )
     proto = train_sentencepiece(ds, vocab_size=10)
-    tokenizer = keras_nlp.models.XLMRobertaTokenizer(proto=proto)
+    tokenizer = keras_hub.models.XLMRobertaTokenizer(proto=proto)
     ```
     """
 
diff --git a/keras_nlp/src/models/xlm_roberta/xlm_roberta_tokenizer_test.py b/keras_hub/src/models/xlm_roberta/xlm_roberta_tokenizer_test.py
similarity index 92%
rename from keras_nlp/src/models/xlm_roberta/xlm_roberta_tokenizer_test.py
rename to keras_hub/src/models/xlm_roberta/xlm_roberta_tokenizer_test.py
index 827744f1d9..ea39cdbd08 100644
--- a/keras_nlp/src/models/xlm_roberta/xlm_roberta_tokenizer_test.py
+++ b/keras_hub/src/models/xlm_roberta/xlm_roberta_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,10 +16,10 @@
 
 import pytest
 
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
+from keras_hub.src.models.xlm_roberta.xlm_roberta_tokenizer import (
     XLMRobertaTokenizer,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class XLMRobertaTokenizerTest(TestCase):
diff --git a/keras_hub/src/models/xlnet/__init__.py b/keras_hub/src/models/xlnet/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/models/xlnet/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/models/xlnet/relative_attention.py b/keras_hub/src/models/xlnet/relative_attention.py
similarity index 100%
rename from keras_nlp/src/models/xlnet/relative_attention.py
rename to keras_hub/src/models/xlnet/relative_attention.py
diff --git a/keras_nlp/src/models/xlnet/xlnet_backbone.py b/keras_hub/src/models/xlnet/xlnet_backbone.py
similarity index 93%
rename from keras_nlp/src/models/xlnet/xlnet_backbone.py
rename to keras_hub/src/models/xlnet/xlnet_backbone.py
index c61755b203..0b78060928 100644
--- a/keras_nlp/src/models/xlnet/xlnet_backbone.py
+++ b/keras_hub/src/models/xlnet/xlnet_backbone.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,17 +14,17 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.xlnet.xlnet_content_and_query_embedding import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.xlnet.xlnet_content_and_query_embedding import (
     ContentAndQueryEmbedding,
 )
-from keras_nlp.src.models.xlnet.xlnet_encoder import XLNetAttentionMaskLayer
-from keras_nlp.src.models.xlnet.xlnet_encoder import XLNetEncoder
-from keras_nlp.src.models.xlnet.xlnet_encoder import XLNetSegmentMatrixLayer
+from keras_hub.src.models.xlnet.xlnet_encoder import XLNetAttentionMaskLayer
+from keras_hub.src.models.xlnet.xlnet_encoder import XLNetEncoder
+from keras_hub.src.models.xlnet.xlnet_encoder import XLNetSegmentMatrixLayer
 
 
-@keras_nlp_export("keras_nlp.models.XLNetBackbone")
+@keras_hub_export("keras_hub.models.XLNetBackbone")
 class XLNetBackbone(Backbone):
     """XLNet encoder network.
 
@@ -69,7 +69,7 @@ class XLNetBackbone(Backbone):
     Example:
     ```python
     import numpy as np
-    from keras_nlp.src.models import XLNetBackbone
+    import keras_hub
 
     input_data = {
         "token_ids": np.array(
@@ -84,7 +84,7 @@ class XLNetBackbone(Backbone):
     }
 
     # Randomly initialized XLNet encoder with a custom config
-    model = keras_nlp.models.XLNetBackbone(
+    model = keras_hub.models.XLNetBackbone(
         vocabulary_size=32000,
         num_layers=12,
         num_heads=12,
diff --git a/keras_nlp/src/models/xlnet/xlnet_backbone_test.py b/keras_hub/src/models/xlnet/xlnet_backbone_test.py
similarity index 90%
rename from keras_nlp/src/models/xlnet/xlnet_backbone_test.py
rename to keras_hub/src/models/xlnet/xlnet_backbone_test.py
index 0f32771f70..6e8000facf 100644
--- a/keras_nlp/src/models/xlnet/xlnet_backbone_test.py
+++ b/keras_hub/src/models/xlnet/xlnet_backbone_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.xlnet.xlnet_backbone import XLNetBackbone
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.xlnet.xlnet_backbone import XLNetBackbone
+from keras_hub.src.tests.test_case import TestCase
 
 
 class XLNetTest(TestCase):
diff --git a/keras_nlp/src/models/xlnet/xlnet_content_and_query_embedding.py b/keras_hub/src/models/xlnet/xlnet_content_and_query_embedding.py
similarity index 99%
rename from keras_nlp/src/models/xlnet/xlnet_content_and_query_embedding.py
rename to keras_hub/src/models/xlnet/xlnet_content_and_query_embedding.py
index fb4bbd1dd1..1bd7c0bcd2 100644
--- a/keras_nlp/src/models/xlnet/xlnet_content_and_query_embedding.py
+++ b/keras_hub/src/models/xlnet/xlnet_content_and_query_embedding.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/models/xlnet/xlnet_encoder.py b/keras_hub/src/models/xlnet/xlnet_encoder.py
similarity index 99%
rename from keras_nlp/src/models/xlnet/xlnet_encoder.py
rename to keras_hub/src/models/xlnet/xlnet_encoder.py
index a2f0755e3f..270fb6c510 100644
--- a/keras_nlp/src/models/xlnet/xlnet_encoder.py
+++ b/keras_hub/src/models/xlnet/xlnet_encoder.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,7 +15,7 @@
 import keras
 from keras import ops
 
-from keras_nlp.src.models.xlnet.relative_attention import (
+from keras_hub.src.models.xlnet.relative_attention import (
     TwoStreamRelativeAttention,
 )
 
diff --git a/keras_hub/src/samplers/__init__.py b/keras_hub/src/samplers/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/samplers/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/samplers/beam_sampler.py b/keras_hub/src/samplers/beam_sampler.py
similarity index 95%
rename from keras_nlp/src/samplers/beam_sampler.py
rename to keras_hub/src/samplers/beam_sampler.py
index d8992ef448..477125b0b7 100644
--- a/keras_nlp/src/samplers/beam_sampler.py
+++ b/keras_hub/src/samplers/beam_sampler.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,12 +16,12 @@
 from keras import ops
 from keras import tree
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.samplers.sampler import Sampler
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.samplers.sampler import Sampler
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.samplers.BeamSampler")
+@keras_hub_export("keras_hub.samplers.BeamSampler")
 class BeamSampler(Sampler):
     """Beam Sampler class.
 
@@ -41,14 +41,14 @@ class BeamSampler(Sampler):
 
     Examples:
     ```python
-    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
 
     # Pass by name to compile.
     causal_lm.compile(sampler="beam")
     causal_lm.generate(["Keras is a"])
 
     # Pass by object to compile.
-    sampler = keras_nlp.samplers.BeamSampler(num_beams=5)
+    sampler = keras_hub.samplers.BeamSampler(num_beams=5)
     causal_lm.compile(sampler=sampler)
     causal_lm.generate(["Keras is a"])
     ```
diff --git a/keras_nlp/src/samplers/beam_sampler_test.py b/keras_hub/src/samplers/beam_sampler_test.py
similarity index 96%
rename from keras_nlp/src/samplers/beam_sampler_test.py
rename to keras_hub/src/samplers/beam_sampler_test.py
index 0faae66e4a..71323d5e18 100644
--- a/keras_nlp/src/samplers/beam_sampler_test.py
+++ b/keras_hub/src/samplers/beam_sampler_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 from keras import ops
 
-from keras_nlp.src.samplers.beam_sampler import BeamSampler
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.samplers.beam_sampler import BeamSampler
+from keras_hub.src.tests.test_case import TestCase
 
 
 class BeamSamplerTest(TestCase):
diff --git a/keras_nlp/src/samplers/contrastive_sampler.py b/keras_hub/src/samplers/contrastive_sampler.py
similarity index 95%
rename from keras_nlp/src/samplers/contrastive_sampler.py
rename to keras_hub/src/samplers/contrastive_sampler.py
index f8bfaf003e..8efe8fa9ff 100644
--- a/keras_nlp/src/samplers/contrastive_sampler.py
+++ b/keras_hub/src/samplers/contrastive_sampler.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,12 +15,12 @@
 from keras import ops
 from keras import tree
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.samplers.sampler import Sampler
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.samplers.sampler import Sampler
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.samplers.ContrastiveSampler")
+@keras_hub_export("keras_hub.samplers.ContrastiveSampler")
 class ContrastiveSampler(Sampler):
     """Contrastive Sampler class.
 
@@ -41,14 +41,14 @@ class ContrastiveSampler(Sampler):
 
     Examples:
     ```python
-    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
 
     # Pass by name to compile.
     causal_lm.compile(sampler="contrastive")
     causal_lm.generate(["Keras is a"])
 
     # Pass by object to compile.
-    sampler = keras_nlp.samplers.ContrastiveSampler(k=5)
+    sampler = keras_hub.samplers.ContrastiveSampler(k=5)
     causal_lm.compile(sampler=sampler)
     causal_lm.generate(["Keras is a"])
     ```
diff --git a/keras_nlp/src/samplers/contrastive_sampler_test.py b/keras_hub/src/samplers/contrastive_sampler_test.py
similarity index 97%
rename from keras_nlp/src/samplers/contrastive_sampler_test.py
rename to keras_hub/src/samplers/contrastive_sampler_test.py
index 598a04b216..7cdecc3dcf 100644
--- a/keras_nlp/src/samplers/contrastive_sampler_test.py
+++ b/keras_hub/src/samplers/contrastive_sampler_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 from keras import ops
 
-from keras_nlp.src.samplers.contrastive_sampler import ContrastiveSampler
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.samplers.contrastive_sampler import ContrastiveSampler
+from keras_hub.src.tests.test_case import TestCase
 
 
 class ContrastiveSamplerTest(TestCase):
diff --git a/keras_nlp/src/samplers/greedy_sampler.py b/keras_hub/src/samplers/greedy_sampler.py
similarity index 79%
rename from keras_nlp/src/samplers/greedy_sampler.py
rename to keras_hub/src/samplers/greedy_sampler.py
index 711e6dc28a..1e6b422f18 100644
--- a/keras_nlp/src/samplers/greedy_sampler.py
+++ b/keras_hub/src/samplers/greedy_sampler.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,11 +14,11 @@
 
 from keras import ops
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.samplers.sampler import Sampler
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.samplers.sampler import Sampler
 
 
-@keras_nlp_export("keras_nlp.samplers.GreedySampler")
+@keras_hub_export("keras_hub.samplers.GreedySampler")
 class GreedySampler(Sampler):
     """Greedy sampler class.
 
@@ -27,14 +27,14 @@ class GreedySampler(Sampler):
 
     Examples:
     ```python
-    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
 
     # Pass by name to compile.
     causal_lm.compile(sampler="greedy")
     causal_lm.generate(["Keras is a"])
 
     # Pass by object to compile.
-    sampler = keras_nlp.samplers.GreedySampler()
+    sampler = keras_hub.samplers.GreedySampler()
     causal_lm.compile(sampler=sampler)
     causal_lm.generate(["Keras is a"])
     ```
diff --git a/keras_nlp/src/samplers/greedy_sampler_test.py b/keras_hub/src/samplers/greedy_sampler_test.py
similarity index 96%
rename from keras_nlp/src/samplers/greedy_sampler_test.py
rename to keras_hub/src/samplers/greedy_sampler_test.py
index 81933aec17..3450f7b594 100644
--- a/keras_nlp/src/samplers/greedy_sampler_test.py
+++ b/keras_hub/src/samplers/greedy_sampler_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 from keras import ops
 
-from keras_nlp.src.samplers.greedy_sampler import GreedySampler
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.samplers.greedy_sampler import GreedySampler
+from keras_hub.src.tests.test_case import TestCase
 
 
 class GreedySamplerTest(TestCase):
diff --git a/keras_nlp/src/samplers/random_sampler.py b/keras_hub/src/samplers/random_sampler.py
similarity index 85%
rename from keras_nlp/src/samplers/random_sampler.py
rename to keras_hub/src/samplers/random_sampler.py
index 5e95b1dbbb..2ceb9d363c 100644
--- a/keras_nlp/src/samplers/random_sampler.py
+++ b/keras_hub/src/samplers/random_sampler.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,11 +15,11 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.samplers.sampler import Sampler
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.samplers.sampler import Sampler
 
 
-@keras_nlp_export("keras_nlp.samplers.RandomSampler")
+@keras_hub_export("keras_hub.samplers.RandomSampler")
 class RandomSampler(Sampler):
     """Random Sampler class.
 
@@ -35,14 +35,14 @@ class RandomSampler(Sampler):
 
     Examples:
     ```python
-    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
 
     # Pass by name to compile.
     causal_lm.compile(sampler="random")
     causal_lm.generate(["Keras is a"])
 
     # Pass by object to compile.
-    sampler = keras_nlp.samplers.RandomSampler(temperature=0.7)
+    sampler = keras_hub.samplers.RandomSampler(temperature=0.7)
     causal_lm.compile(sampler=sampler)
     causal_lm.generate(["Keras is a"])
     ```
diff --git a/keras_nlp/src/samplers/random_sampler_test.py b/keras_hub/src/samplers/random_sampler_test.py
similarity index 96%
rename from keras_nlp/src/samplers/random_sampler_test.py
rename to keras_hub/src/samplers/random_sampler_test.py
index 1ed83dd774..cd6c919681 100644
--- a/keras_nlp/src/samplers/random_sampler_test.py
+++ b/keras_hub/src/samplers/random_sampler_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 from keras import ops
 
-from keras_nlp.src.samplers.random_sampler import RandomSampler
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.samplers.random_sampler import RandomSampler
+from keras_hub.src.tests.test_case import TestCase
 
 
 class RandomSamplerTest(TestCase):
diff --git a/keras_nlp/src/samplers/sampler.py b/keras_hub/src/samplers/sampler.py
similarity index 96%
rename from keras_nlp/src/samplers/sampler.py
rename to keras_hub/src/samplers/sampler.py
index 4987ff0b09..41514c4662 100644
--- a/keras_nlp/src/samplers/sampler.py
+++ b/keras_hub/src/samplers/sampler.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,11 +16,11 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.tensor_utils import any_equal
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.tensor_utils import any_equal
 
 
-@keras_nlp_export("keras_nlp.samplers.Sampler")
+@keras_hub_export("keras_hub.samplers.Sampler")
 class Sampler:
     """Base sampler class.
 
@@ -40,10 +40,10 @@ class Sampler:
     Example:
 
     ```python
-    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
 
     # Greedy search with some tokens forbidden.
-    class CustomSampler(keras_nlp.samplers.Sampler):
+    class CustomSampler(keras_hub.samplers.Sampler):
         def __init__(self, forbidden_tokens, **kwargs):
             super().__init__(**kwargs)
             self.forbidden_tokens = forbidden_tokens
diff --git a/keras_nlp/src/samplers/serialization.py b/keras_hub/src/samplers/serialization.py
similarity index 73%
rename from keras_nlp/src/samplers/serialization.py
rename to keras_hub/src/samplers/serialization.py
index 1a8b273455..10639f14ab 100644
--- a/keras_nlp/src/samplers/serialization.py
+++ b/keras_hub/src/samplers/serialization.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,21 +14,21 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.samplers.beam_sampler import BeamSampler
-from keras_nlp.src.samplers.contrastive_sampler import ContrastiveSampler
-from keras_nlp.src.samplers.greedy_sampler import GreedySampler
-from keras_nlp.src.samplers.random_sampler import RandomSampler
-from keras_nlp.src.samplers.top_k_sampler import TopKSampler
-from keras_nlp.src.samplers.top_p_sampler import TopPSampler
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.samplers.beam_sampler import BeamSampler
+from keras_hub.src.samplers.contrastive_sampler import ContrastiveSampler
+from keras_hub.src.samplers.greedy_sampler import GreedySampler
+from keras_hub.src.samplers.random_sampler import RandomSampler
+from keras_hub.src.samplers.top_k_sampler import TopKSampler
+from keras_hub.src.samplers.top_p_sampler import TopPSampler
 
 
-@keras_nlp_export("keras_nlp.samplers.serialize")
+@keras_hub_export("keras_hub.samplers.serialize")
 def serialize(sampler):
     return keras.saving.serialize_keras_object(sampler)
 
 
-@keras_nlp_export("keras_nlp.samplers.deserialize")
+@keras_hub_export("keras_hub.samplers.deserialize")
 def deserialize(config, custom_objects=None):
     """Return a `Sampler` object from its config."""
     all_classes = {
@@ -47,21 +47,21 @@ def deserialize(config, custom_objects=None):
     )
 
 
-@keras_nlp_export("keras_nlp.samplers.get")
+@keras_hub_export("keras_hub.samplers.get")
 def get(identifier):
-    """Retrieve a KerasNLP sampler by the identifier.
+    """Retrieve a KerasHub sampler by the identifier.
 
     The `identifier` may be the string name of a sampler class or class.
 
     >>> identifier = 'greedy'
-    >>> sampler = keras_nlp.samplers.get(identifier)
+    >>> sampler = keras_hub.samplers.get(identifier)
 
     You can also specify `config` of the sampler to this function by passing
     dict containing `class_name` and `config` as an identifier. Also note that
     the `class_name` must map to a `Sampler` class.
 
-    >>> cfg = {'class_name': 'keras_nlp>GreedySampler', 'config': {}}
-    >>> sampler = keras_nlp.samplers.get(cfg)
+    >>> cfg = {'class_name': 'keras_hub>GreedySampler', 'config': {}}
+    >>> sampler = keras_hub.samplers.get(cfg)
 
     In the case that the `identifier` is a class, this method will return a new
     instance of the class by its constructor.
@@ -85,7 +85,7 @@ def get(identifier):
     elif isinstance(identifier, str):
         if not identifier.islower():
             raise KeyError(
-                "`keras_nlp.samplers.get()` must take a lowercase string "
+                "`keras_hub.samplers.get()` must take a lowercase string "
                 f"identifier, but received: {identifier}."
             )
         return deserialize(identifier)
diff --git a/keras_nlp/src/samplers/serialization_test.py b/keras_hub/src/samplers/serialization_test.py
similarity index 81%
rename from keras_nlp/src/samplers/serialization_test.py
rename to keras_hub/src/samplers/serialization_test.py
index e66640cbc9..e12e501904 100644
--- a/keras_nlp/src/samplers/serialization_test.py
+++ b/keras_hub/src/samplers/serialization_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,11 +12,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.samplers.serialization import deserialize
-from keras_nlp.src.samplers.serialization import get
-from keras_nlp.src.samplers.serialization import serialize
-from keras_nlp.src.samplers.top_k_sampler import TopKSampler
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.samplers.serialization import deserialize
+from keras_hub.src.samplers.serialization import get
+from keras_hub.src.samplers.serialization import serialize
+from keras_hub.src.samplers.top_k_sampler import TopKSampler
+from keras_hub.src.tests.test_case import TestCase
 
 
 class SerializationTest(TestCase):
diff --git a/keras_nlp/src/samplers/top_k_sampler.py b/keras_hub/src/samplers/top_k_sampler.py
similarity index 88%
rename from keras_nlp/src/samplers/top_k_sampler.py
rename to keras_hub/src/samplers/top_k_sampler.py
index 53a9e6fcb4..6ac742cafe 100644
--- a/keras_nlp/src/samplers/top_k_sampler.py
+++ b/keras_hub/src/samplers/top_k_sampler.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,11 +15,11 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.samplers.sampler import Sampler
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.samplers.sampler import Sampler
 
 
-@keras_nlp_export("keras_nlp.samplers.TopKSampler")
+@keras_hub_export("keras_hub.samplers.TopKSampler")
 class TopKSampler(Sampler):
     """Top-K Sampler class.
 
@@ -36,14 +36,14 @@ class TopKSampler(Sampler):
 
     Examples:
     ```python
-    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
 
     # Pass by name to compile.
     causal_lm.compile(sampler="top_k")
     causal_lm.generate(["Keras is a"])
 
     # Pass by object to compile.
-    sampler = keras_nlp.samplers.TopKSampler(k=5, temperature=0.7)
+    sampler = keras_hub.samplers.TopKSampler(k=5, temperature=0.7)
     causal_lm.compile(sampler=sampler)
     causal_lm.generate(["Keras is a"])
     ```
diff --git a/keras_nlp/src/samplers/top_k_sampler_test.py b/keras_hub/src/samplers/top_k_sampler_test.py
similarity index 96%
rename from keras_nlp/src/samplers/top_k_sampler_test.py
rename to keras_hub/src/samplers/top_k_sampler_test.py
index ed74b7b64b..f28cef0987 100644
--- a/keras_nlp/src/samplers/top_k_sampler_test.py
+++ b/keras_hub/src/samplers/top_k_sampler_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 from keras import ops
 
-from keras_nlp.src.samplers.top_k_sampler import TopKSampler
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.samplers.top_k_sampler import TopKSampler
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TopKSamplerTest(TestCase):
diff --git a/keras_nlp/src/samplers/top_p_sampler.py b/keras_hub/src/samplers/top_p_sampler.py
similarity index 91%
rename from keras_nlp/src/samplers/top_p_sampler.py
rename to keras_hub/src/samplers/top_p_sampler.py
index e9cfd5df68..f71c49ef71 100644
--- a/keras_nlp/src/samplers/top_p_sampler.py
+++ b/keras_hub/src/samplers/top_p_sampler.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,11 +15,11 @@
 from keras import ops
 from keras import random
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.samplers.sampler import Sampler
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.samplers.sampler import Sampler
 
 
-@keras_nlp_export("keras_nlp.samplers.TopPSampler")
+@keras_hub_export("keras_hub.samplers.TopPSampler")
 class TopPSampler(Sampler):
     """Top-P Sampler class.
 
@@ -44,14 +44,14 @@ class TopPSampler(Sampler):
 
     Examples:
     ```python
-    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
+    causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")
 
     # Pass by name to compile.
     causal_lm.compile(sampler="top_p")
     causal_lm.generate(["Keras is a"])
 
     # Pass by object to compile.
-    sampler = keras_nlp.samplers.TopPSampler(p=0.1, k=1_000)
+    sampler = keras_hub.samplers.TopPSampler(p=0.1, k=1_000)
     causal_lm.compile(sampler=sampler)
     causal_lm.generate(["Keras is a"])
     ```
diff --git a/keras_nlp/src/samplers/top_p_sampler_test.py b/keras_hub/src/samplers/top_p_sampler_test.py
similarity index 97%
rename from keras_nlp/src/samplers/top_p_sampler_test.py
rename to keras_hub/src/samplers/top_p_sampler_test.py
index 74e6ed1a75..41141b0486 100644
--- a/keras_nlp/src/samplers/top_p_sampler_test.py
+++ b/keras_hub/src/samplers/top_p_sampler_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -15,8 +15,8 @@
 import numpy as np
 from keras import ops
 
-from keras_nlp.src.samplers.top_p_sampler import TopPSampler
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.samplers.top_p_sampler import TopPSampler
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TopPSamplerTest(TestCase):
diff --git a/keras_hub/src/tests/__init__.py b/keras_hub/src/tests/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/tests/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_hub/src/tests/doc_tests/__init__.py b/keras_hub/src/tests/doc_tests/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/tests/doc_tests/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/tests/doc_tests/docstring_lib.py b/keras_hub/src/tests/doc_tests/docstring_lib.py
similarity index 99%
rename from keras_nlp/src/tests/doc_tests/docstring_lib.py
rename to keras_hub/src/tests/doc_tests/docstring_lib.py
index 5feb1b1eb1..7b4611b917 100644
--- a/keras_nlp/src/tests/doc_tests/docstring_lib.py
+++ b/keras_hub/src/tests/doc_tests/docstring_lib.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/tests/doc_tests/docstring_test.py b/keras_hub/src/tests/doc_tests/docstring_test.py
similarity index 85%
rename from keras_nlp/src/tests/doc_tests/docstring_test.py
rename to keras_hub/src/tests/doc_tests/docstring_test.py
index 145b85bc6b..03c3f72c3d 100644
--- a/keras_nlp/src/tests/doc_tests/docstring_test.py
+++ b/keras_hub/src/tests/doc_tests/docstring_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -24,23 +24,23 @@
 import sentencepiece
 import tensorflow as tf
 
-import keras_nlp
-from keras_nlp.src.tests.doc_tests import docstring_lib
-from keras_nlp.src.tests.doc_tests import fenced_docstring_lib
-from keras_nlp.src.tests.doc_tests.fenced_docstring_lib import (
+import keras_hub
+from keras_hub.src.tests.doc_tests import docstring_lib
+from keras_hub.src.tests.doc_tests import fenced_docstring_lib
+from keras_hub.src.tests.doc_tests.fenced_docstring_lib import (
     astor,  # For checking conditional import.
 )
 
-PACKAGE = "keras_nlp."
+PACKAGE = "keras_hub."
 
 
 def find_modules():
-    keras_nlp_modules = []
+    keras_hub_modules = []
     for name, module in sys.modules.items():
         if name.startswith(PACKAGE):
-            keras_nlp_modules.append(module)
+            keras_hub_modules.append(module)
 
-    return keras_nlp_modules
+    return keras_hub_modules
 
 
 @pytest.fixture(scope="session")
@@ -50,14 +50,14 @@ def docstring_module(pytestconfig):
 
 @pytest.mark.tf_only
 def test_docstrings(docstring_module):
-    keras_nlp_modules = find_modules()
+    keras_hub_modules = find_modules()
     # As of this writing, it doesn't seem like pytest support load_tests
     # protocol for unittest:
     #     https://docs.pytest.org/en/7.1.x/how-to/unittest.html
     # So we run the unittest.TestSuite manually and report the results back.
     runner = unittest.TextTestRunner()
     suite = unittest.TestSuite()
-    for module in keras_nlp_modules:
+    for module in keras_hub_modules:
         if docstring_module and docstring_module not in module.__name__:
             continue
         print(f"Adding tests for docstrings in {module.__name__}")
@@ -70,7 +70,7 @@ def test_docstrings(docstring_module):
                     "np": np,
                     "os": os,
                     "keras": keras,
-                    "keras_nlp": keras_nlp,
+                    "keras_hub": keras_hub,
                 },
                 checker=docstring_lib.DoctestOutputChecker(),
                 optionflags=(
@@ -97,17 +97,17 @@ def test_fenced_docstrings(docstring_module):
     """Tests fenced code blocks in docstrings.
 
     This can only be run manually and will take many minutes. Run with:
-    `pytest keras_nlp/tests/doc_tests/docstring_test.py --run_extra_large`
+    `pytest keras_hub/tests/doc_tests/docstring_test.py --run_extra_large`
 
     To restrict the docstring you test, you can pass an additional
     --docstring_module flag. For example, to run only "bert" module tests:
-    `pytest keras_nlp/tests/doc_tests/docstring_test.py --run_extra_large --docstring_module "models.bert"`
+    `pytest keras_hub/tests/doc_tests/docstring_test.py --run_extra_large --docstring_module "models.bert"`
     """
-    keras_nlp_modules = find_modules()
+    keras_hub_modules = find_modules()
 
     runner = unittest.TextTestRunner()
     suite = unittest.TestSuite()
-    for module in keras_nlp_modules:
+    for module in keras_hub_modules:
         if docstring_module and docstring_module not in module.__name__:
             continue
         print(f"Adding tests for fenced docstrings in {module.__name__}")
@@ -128,7 +128,7 @@ def test_fenced_docstrings(docstring_module):
                     "np": np,
                     "os": os,
                     "keras": keras,
-                    "keras_nlp": keras_nlp,
+                    "keras_hub": keras_hub,
                     "io": io,
                     "sentencepiece": sentencepiece,
                 },
diff --git a/keras_nlp/src/tests/doc_tests/fenced_docstring_lib.py b/keras_hub/src/tests/doc_tests/fenced_docstring_lib.py
similarity index 99%
rename from keras_nlp/src/tests/doc_tests/fenced_docstring_lib.py
rename to keras_hub/src/tests/doc_tests/fenced_docstring_lib.py
index 46b3f0d0e3..a6e73eb1b6 100644
--- a/keras_nlp/src/tests/doc_tests/fenced_docstring_lib.py
+++ b/keras_hub/src/tests/doc_tests/fenced_docstring_lib.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/tests/test_case.py b/keras_hub/src/tests/test_case.py
similarity index 98%
rename from keras_nlp/src/tests/test_case.py
rename to keras_hub/src/tests/test_case.py
index cb612cb13e..e88ed79709 100644
--- a/keras_nlp/src/tests/test_case.py
+++ b/keras_hub/src/tests/test_case.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -24,12 +24,12 @@
 from keras import ops
 from keras import tree
 
-from keras_nlp.src.layers.modeling.reversible_embedding import (
+from keras_hub.src.layers.modeling.reversible_embedding import (
     ReversibleEmbedding,
 )
-from keras_nlp.src.tokenizers.tokenizer import Tokenizer
-from keras_nlp.src.utils.keras_utils import has_quantization_support
-from keras_nlp.src.utils.tensor_utils import is_float_dtype
+from keras_hub.src.tokenizers.tokenizer import Tokenizer
+from keras_hub.src.utils.keras_utils import has_quantization_support
+from keras_hub.src.utils.tensor_utils import is_float_dtype
 
 
 def convert_to_comparible_type(x):
@@ -52,7 +52,7 @@ def convert_to_comparible_type(x):
 
 
 class TestCase(tf.test.TestCase, parameterized.TestCase):
-    """Base test case class for KerasNLP."""
+    """Base test case class for KerasHub."""
 
     def assertAllClose(self, x1, x2, atol=1e-6, rtol=1e-6, msg=None):
         # This metric dict hack is only needed for tf.keras, and can be
diff --git a/keras_nlp/src/tests/test_data/albert_test_vocab.spm b/keras_hub/src/tests/test_data/albert_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/albert_test_vocab.spm
rename to keras_hub/src/tests/test_data/albert_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/deberta_v3_test_vocab.spm b/keras_hub/src/tests/test_data/deberta_v3_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/deberta_v3_test_vocab.spm
rename to keras_hub/src/tests/test_data/deberta_v3_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/f_net_test_vocab.spm b/keras_hub/src/tests/test_data/f_net_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/f_net_test_vocab.spm
rename to keras_hub/src/tests/test_data/f_net_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/gemma_test_vocab.spm b/keras_hub/src/tests/test_data/gemma_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/gemma_test_vocab.spm
rename to keras_hub/src/tests/test_data/gemma_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/llama_test_vocab.spm b/keras_hub/src/tests/test_data/llama_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/llama_test_vocab.spm
rename to keras_hub/src/tests/test_data/llama_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/mistral_test_vocab.spm b/keras_hub/src/tests/test_data/mistral_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/mistral_test_vocab.spm
rename to keras_hub/src/tests/test_data/mistral_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/no_special_token_vocab.spm b/keras_hub/src/tests/test_data/no_special_token_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/no_special_token_vocab.spm
rename to keras_hub/src/tests/test_data/no_special_token_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/phi3_test_vocab.spm b/keras_hub/src/tests/test_data/phi3_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/phi3_test_vocab.spm
rename to keras_hub/src/tests/test_data/phi3_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/t5_test_vocab.spm b/keras_hub/src/tests/test_data/t5_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/t5_test_vocab.spm
rename to keras_hub/src/tests/test_data/t5_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/test_image.jpg b/keras_hub/src/tests/test_data/test_image.jpg
similarity index 100%
rename from keras_nlp/src/tests/test_data/test_image.jpg
rename to keras_hub/src/tests/test_data/test_image.jpg
diff --git a/keras_nlp/src/tests/test_data/tokenizer_test_vocab.spm b/keras_hub/src/tests/test_data/tokenizer_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/tokenizer_test_vocab.spm
rename to keras_hub/src/tests/test_data/tokenizer_test_vocab.spm
diff --git a/keras_nlp/src/tests/test_data/xlm_roberta_test_vocab.spm b/keras_hub/src/tests/test_data/xlm_roberta_test_vocab.spm
similarity index 100%
rename from keras_nlp/src/tests/test_data/xlm_roberta_test_vocab.spm
rename to keras_hub/src/tests/test_data/xlm_roberta_test_vocab.spm
diff --git a/keras_hub/src/tokenizers/__init__.py b/keras_hub/src/tokenizers/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/tokenizers/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/tokenizers/byte_pair_tokenizer.py b/keras_hub/src/tokenizers/byte_pair_tokenizer.py
similarity index 97%
rename from keras_nlp/src/tokenizers/byte_pair_tokenizer.py
rename to keras_hub/src/tokenizers/byte_pair_tokenizer.py
index 5eecf4cbfc..531ba55b9f 100644
--- a/keras_nlp/src/tokenizers/byte_pair_tokenizer.py
+++ b/keras_hub/src/tokenizers/byte_pair_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -26,12 +26,12 @@
 import keras
 import regex as re
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.tokenizers import tokenizer
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_int_dtype
-from keras_nlp.src.utils.tensor_utils import is_string_dtype
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.tokenizers import tokenizer
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_int_dtype
+from keras_hub.src.utils.tensor_utils import is_string_dtype
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -205,7 +205,7 @@ def create_static_hashtable(keys, values, default):
     )
 
 
-@keras_nlp_export("keras_nlp.tokenizers.BytePairTokenizer")
+@keras_hub_export("keras_hub.tokenizers.BytePairTokenizer")
 class BytePairTokenizer(tokenizer.Tokenizer):
     """Bype-pair encoding tokenizer layer.
 
@@ -252,7 +252,7 @@ class BytePairTokenizer(tokenizer.Tokenizer):
     Tokenize
     >>> vocab = {"butter": 1, "fly": 2}
     >>> merge = ["b u", "t t", "e r", "bu tt", "butt er", "f l", "fl y"]
-    >>> tokenizer = keras_nlp.tokenizers.BytePairTokenizer(vocab, merge)
+    >>> tokenizer = keras_hub.tokenizers.BytePairTokenizer(vocab, merge)
     >>> outputs = tokenizer("butterfly")
     >>> np.array(outputs)
     array([1, 2], dtype=int32)
@@ -261,7 +261,7 @@ class BytePairTokenizer(tokenizer.Tokenizer):
     array([1, 2])
     >>> np.array(seq2)
     array([1])
-    >>> tokenizer = keras_nlp.tokenizers.BytePairTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.BytePairTokenizer(
     ...     vocab, merge, sequence_length=2)
     >>> seq1, seq2 = tokenizer(["butterfly", "butter"])
     >>> np.array(seq1)
@@ -272,7 +272,7 @@ class BytePairTokenizer(tokenizer.Tokenizer):
     Detokenize
     >>> vocab = {"butter": 1, "fly": 2}
     >>> merge = ["b u", "t t", "e r", "bu tt", "butt er", "f l", "fl y"]
-    >>> tokenizer = keras_nlp.tokenizers.BytePairTokenizer(vocab, merge)
+    >>> tokenizer = keras_hub.tokenizers.BytePairTokenizer(vocab, merge)
     >>> tokenizer.detokenize([[1, 2]])
     ['butterfly']
     """
diff --git a/keras_nlp/src/tokenizers/byte_pair_tokenizer_test.py b/keras_hub/src/tokenizers/byte_pair_tokenizer_test.py
similarity index 95%
rename from keras_nlp/src/tokenizers/byte_pair_tokenizer_test.py
rename to keras_hub/src/tokenizers/byte_pair_tokenizer_test.py
index e3cf1ccced..fdec7a00a9 100644
--- a/keras_nlp/src/tokenizers/byte_pair_tokenizer_test.py
+++ b/keras_hub/src/tokenizers/byte_pair_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,16 +16,16 @@
 import pytest
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
 
 VOCAB_PATH = keras.utils.get_file(
     None,
-    "https://storage.googleapis.com/keras-nlp/models/roberta_base/vocab.json",
+    "https://storage.googleapis.com/keras-hub/models/roberta_base/vocab.json",
 )
 MERGE_PATH = keras.utils.get_file(
     None,
-    "https://storage.googleapis.com/keras-nlp/models/roberta_base/merges.txt",
+    "https://storage.googleapis.com/keras-hub/models/roberta_base/merges.txt",
 )
 
 
diff --git a/keras_nlp/src/tokenizers/byte_tokenizer.py b/keras_hub/src/tokenizers/byte_tokenizer.py
similarity index 91%
rename from keras_nlp/src/tokenizers/byte_tokenizer.py
rename to keras_hub/src/tokenizers/byte_tokenizer.py
index 6b70df35ce..5c9bb2d793 100644
--- a/keras_nlp/src/tokenizers/byte_tokenizer.py
+++ b/keras_hub/src/tokenizers/byte_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,15 +18,15 @@
     import tensorflow as tf
 except ImportError:
     raise ImportError(
-        "To use `keras_nlp`, please install Tensorflow: `pip install tensorflow`. "
+        "To use `keras_hub`, please install Tensorflow: `pip install tensorflow`. "
         "The TensorFlow package is required for data preprocessing with any backend."
     )
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.tokenizers import tokenizer
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_int_dtype
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.tokenizers import tokenizer
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_int_dtype
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow_text as tf_text
@@ -34,7 +34,7 @@
     tf_text = None
 
 
-@keras_nlp_export("keras_nlp.tokenizers.ByteTokenizer")
+@keras_hub_export("keras_hub.tokenizers.ByteTokenizer")
 class ByteTokenizer(tokenizer.Tokenizer):
     """Raw byte tokenizer.
 
@@ -86,14 +86,14 @@ class ByteTokenizer(tokenizer.Tokenizer):
     Examples:
 
     Basic usage.
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer()
     >>> outputs = tokenizer("hello")
     >>> np.array(outputs)
     array([104, 101, 108, 108, 111], dtype=int32)
 
     Ragged outputs.
     >>> inputs = ["hello", "hi"]
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer()
     >>> seq1, seq2 = tokenizer(inputs)
     >>> np.array(seq1)
     array([104, 101, 108, 108, 111])
@@ -102,7 +102,7 @@ class ByteTokenizer(tokenizer.Tokenizer):
 
     Dense outputs.
     >>> inputs = ["hello", "hi"]
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer(sequence_length=8)
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer(sequence_length=8)
     >>> seq1, seq2 = tokenizer(inputs)
     >>> np.array(seq1)
     array([104, 101, 108, 108, 111,   0,   0,   0], dtype=int32)
@@ -110,7 +110,7 @@ class ByteTokenizer(tokenizer.Tokenizer):
     array([104, 105,   0,   0,   0,   0,   0,   0], dtype=int32)
 
     Tokenize, then batch for ragged outputs.
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer()
     >>> ds = tf.data.Dataset.from_tensor_slices(["hello", "fun"])
     >>> ds = ds.map(tokenizer)
     >>> ds = ds.apply(tf.data.experimental.dense_to_ragged_batch(2))
@@ -118,14 +118,14 @@ class ByteTokenizer(tokenizer.Tokenizer):
     <tf.RaggedTensor [[104, 101, 108, 108, 111], [102, 117, 110]]>
 
     Batch, then tokenize for ragged outputs.
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer()
     >>> ds = tf.data.Dataset.from_tensor_slices(["hello", "fun"])
     >>> ds = ds.batch(2).map(tokenizer)
     >>> ds.take(1).get_single_element()
     <tf.RaggedTensor [[104, 101, 108, 108, 111], [102, 117, 110]]>
 
     Tokenize, then batch for dense outputs (`sequence_length` provided).
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer(sequence_length=5)
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer(sequence_length=5)
     >>> ds = tf.data.Dataset.from_tensor_slices(["hello", "fun"])
     >>> ds = ds.map(tokenizer)
     >>> ds = ds.apply(tf.data.experimental.dense_to_ragged_batch(2))
@@ -135,7 +135,7 @@ class ByteTokenizer(tokenizer.Tokenizer):
            [102, 117, 110,   0,   0]], dtype=int32)>
 
     Batch, then tokenize for dense outputs. (`sequence_length` provided).
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer(sequence_length=5)
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer(sequence_length=5)
     >>> ds = tf.data.Dataset.from_tensor_slices(["hello", "fun"])
     >>> ds = ds.batch(2).map(tokenizer)
     >>> ds.take(1).get_single_element()
@@ -145,14 +145,14 @@ class ByteTokenizer(tokenizer.Tokenizer):
 
     Detokenization.
     >>> inputs = [104, 101, 108, 108, 111]
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer()
     >>> tokenizer.detokenize(inputs)
     'hello'
 
     Detokenization with invalid bytes.
     >>> # The 255 below is invalid utf-8.
     >>> inputs = [104, 101, 255, 108, 108, 111]
-    >>> tokenizer = keras_nlp.tokenizers.ByteTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.ByteTokenizer(
     ...     errors="replace", replacement_char=88)
     >>> tokenizer.detokenize(inputs)
     'heXllo'
diff --git a/keras_nlp/src/tokenizers/byte_tokenizer_test.py b/keras_hub/src/tokenizers/byte_tokenizer_test.py
similarity index 98%
rename from keras_nlp/src/tokenizers/byte_tokenizer_test.py
rename to keras_hub/src/tokenizers/byte_tokenizer_test.py
index df7372625c..c3d130fb2d 100644
--- a/keras_nlp/src/tokenizers/byte_tokenizer_test.py
+++ b/keras_hub/src/tokenizers/byte_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.byte_tokenizer import ByteTokenizer
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.byte_tokenizer import ByteTokenizer
 
 
 class ByteTokenizerTest(TestCase):
diff --git a/keras_nlp/src/tokenizers/sentence_piece_tokenizer.py b/keras_hub/src/tokenizers/sentence_piece_tokenizer.py
similarity index 93%
rename from keras_nlp/src/tokenizers/sentence_piece_tokenizer.py
rename to keras_hub/src/tokenizers/sentence_piece_tokenizer.py
index ea9f4d6f4a..ecf1eee7d9 100644
--- a/keras_nlp/src/tokenizers/sentence_piece_tokenizer.py
+++ b/keras_hub/src/tokenizers/sentence_piece_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -22,17 +22,17 @@
     import tensorflow as tf
 except ImportError:
     raise ImportError(
-        "To use `keras_nlp`, please install Tensorflow: `pip install tensorflow`. "
+        "To use `keras_hub`, please install TensorFlow: `pip install tensorflow`. "
         "The TensorFlow package is required for data preprocessing with any backend."
     )
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.tokenizers import tokenizer
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_int_dtype
-from keras_nlp.src.utils.tensor_utils import is_string_dtype
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
-from keras_nlp.src.utils.tensor_utils import tensor_to_list
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.tokenizers import tokenizer
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_int_dtype
+from keras_hub.src.utils.tensor_utils import is_string_dtype
+from keras_hub.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.utils.tensor_utils import tensor_to_list
 
 try:
     import tensorflow_text as tf_text
@@ -43,7 +43,7 @@
 VOCAB_FILENAME = "vocabulary.spm"
 
 
-@keras_nlp_export("keras_nlp.tokenizers.SentencePieceTokenizer")
+@keras_hub_export("keras_hub.tokenizers.SentencePieceTokenizer")
 class SentencePieceTokenizer(tokenizer.Tokenizer):
     """A SentencePiece tokenizer layer.
 
@@ -91,7 +91,7 @@ def train_sentence_piece_bytes(ds, size):
     ds = tf.data.Dataset.from_tensor_slices(["the quick brown fox."])
     proto = train_sentence_piece_bytes(ds, 20)
     # Tokenize inputs.
-    tokenizer = keras_nlp.tokenizers.SentencePieceTokenizer(proto=proto)
+    tokenizer = keras_hub.tokenizers.SentencePieceTokenizer(proto=proto)
     ds = ds.map(tokenizer)
     ```
 
@@ -109,7 +109,7 @@ def train_sentence_piece_file(ds, path, size):
     ds = tf.data.Dataset.from_tensor_slices(["the quick brown fox."])
     proto = train_sentence_piece_file(ds, "model.spm", 20)
     # Tokenize inputs.
-    tokenizer = keras_nlp.tokenizers.SentencePieceTokenizer(proto="model.spm")
+    tokenizer = keras_hub.tokenizers.SentencePieceTokenizer(proto="model.spm")
     ds = ds.map(tokenizer)
     ```
     """
diff --git a/keras_nlp/src/tokenizers/sentence_piece_tokenizer_test.py b/keras_hub/src/tokenizers/sentence_piece_tokenizer_test.py
similarity index 97%
rename from keras_nlp/src/tokenizers/sentence_piece_tokenizer_test.py
rename to keras_hub/src/tokenizers/sentence_piece_tokenizer_test.py
index 16a76f810c..9e35e3b399 100644
--- a/keras_nlp/src/tokenizers/sentence_piece_tokenizer_test.py
+++ b/keras_hub/src/tokenizers/sentence_piece_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
 
diff --git a/keras_nlp/src/tokenizers/sentence_piece_tokenizer_trainer.py b/keras_hub/src/tokenizers/sentence_piece_tokenizer_trainer.py
similarity index 88%
rename from keras_nlp/src/tokenizers/sentence_piece_tokenizer_trainer.py
rename to keras_hub/src/tokenizers/sentence_piece_tokenizer_trainer.py
index 2088b9d2ee..fe8dea99b8 100644
--- a/keras_nlp/src/tokenizers/sentence_piece_tokenizer_trainer.py
+++ b/keras_hub/src/tokenizers/sentence_piece_tokenizer_trainer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,7 +18,7 @@
     import tensorflow as tf
 except ImportError:
     raise ImportError(
-        "To use `keras_nlp`, please install Tensorflow: `pip install tensorflow`. "
+        "To use `keras_hub`, please install TensorFlow: `pip install tensorflow`. "
         "The TensorFlow package is required for data preprocessing with any backend."
     )
 
@@ -27,10 +27,10 @@
 except ImportError:
     spm = None
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 
-@keras_nlp_export("keras_nlp.tokenizers.compute_sentence_piece_proto")
+@keras_hub_export("keras_hub.tokenizers.compute_sentence_piece_proto")
 def compute_sentence_piece_proto(
     data,
     vocabulary_size,
@@ -66,8 +66,8 @@ def compute_sentence_piece_proto(
 
     Basic Usage (from Dataset).
     >>> inputs = tf.data.Dataset.from_tensor_slices(["Drifting Along"])
-    >>> proto = keras_nlp.tokenizers.compute_sentence_piece_proto(inputs, vocabulary_size=15)
-    >>> tokenizer = keras_nlp.tokenizers.SentencePieceTokenizer(proto=proto)
+    >>> proto = keras_hub.tokenizers.compute_sentence_piece_proto(inputs, vocabulary_size=15)
+    >>> tokenizer = keras_hub.tokenizers.SentencePieceTokenizer(proto=proto)
     >>> outputs = inputs.map(tokenizer)
     >>> for output in outputs:
     ...     print(output)
@@ -78,18 +78,18 @@ def compute_sentence_piece_proto(
     ``` python
     with open("test.txt", "w+") as f: f.write("Drifting Along\n")
     inputs = ["test.txt"]
-    proto = keras_nlp.tokenizers.compute_sentence_piece_proto(
+    proto = keras_hub.tokenizers.compute_sentence_piece_proto(
          inputs, vocabulary_size=15, proto_output_file="model.spm")
-    tokenizer = keras_nlp.tokenizers.SentencePieceTokenizer(proto="model.spm")
+    tokenizer = keras_hub.tokenizers.SentencePieceTokenizer(proto="model.spm")
     ds = tf.data.Dataset.from_tensor_slices(["the quick brown fox."])
     ds = ds.map(tokenizer)
     ```
 
     Usage with lowercase
     >>> inputs = tf.data.Dataset.from_tensor_slices(["Drifting Along"])
-    >>> proto = keras_nlp.tokenizers.compute_sentence_piece_proto(
+    >>> proto = keras_hub.tokenizers.compute_sentence_piece_proto(
     ...     inputs, vocabulary_size=15, lowercase=True)
-    >>> tokenizer = keras_nlp.tokenizers.SentencePieceTokenizer(proto=proto)
+    >>> tokenizer = keras_hub.tokenizers.SentencePieceTokenizer(proto=proto)
     >>> outputs = inputs.map(tokenizer)
     >>> for output in outputs:
     ...     print(output)
diff --git a/keras_nlp/src/tokenizers/sentence_piece_tokenizer_trainer_test.py b/keras_hub/src/tokenizers/sentence_piece_tokenizer_trainer_test.py
similarity index 94%
rename from keras_nlp/src/tokenizers/sentence_piece_tokenizer_trainer_test.py
rename to keras_hub/src/tokenizers/sentence_piece_tokenizer_trainer_test.py
index d9402fa446..363fbc7d52 100644
--- a/keras_nlp/src/tokenizers/sentence_piece_tokenizer_trainer_test.py
+++ b/keras_hub/src/tokenizers/sentence_piece_tokenizer_trainer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,11 +17,11 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.sentence_piece_tokenizer import (
     SentencePieceTokenizer,
 )
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer_trainer import (
+from keras_hub.src.tokenizers.sentence_piece_tokenizer_trainer import (
     compute_sentence_piece_proto,
 )
 
diff --git a/keras_nlp/src/tokenizers/tokenizer.py b/keras_hub/src/tokenizers/tokenizer.py
similarity index 87%
rename from keras_nlp/src/tokenizers/tokenizer.py
rename to keras_hub/src/tokenizers/tokenizer.py
index ee92a680c1..7856b79ca6 100644
--- a/keras_nlp/src/tokenizers/tokenizer.py
+++ b/keras_hub/src/tokenizers/tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,32 +13,32 @@
 # limitations under the License.
 import os
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.layers.preprocessing.preprocessing_layer import (
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.layers.preprocessing.preprocessing_layer import (
     PreprocessingLayer,
 )
-from keras_nlp.src.utils.preset_utils import TOKENIZER_ASSET_DIR
-from keras_nlp.src.utils.preset_utils import TOKENIZER_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import builtin_presets
-from keras_nlp.src.utils.preset_utils import find_subclass
-from keras_nlp.src.utils.preset_utils import get_file
-from keras_nlp.src.utils.preset_utils import get_preset_loader
-from keras_nlp.src.utils.preset_utils import save_serialized_object
-from keras_nlp.src.utils.preset_utils import save_tokenizer_assets
-from keras_nlp.src.utils.python_utils import classproperty
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
-
-
-@keras_nlp_export(
+from keras_hub.src.utils.preset_utils import TOKENIZER_ASSET_DIR
+from keras_hub.src.utils.preset_utils import TOKENIZER_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import builtin_presets
+from keras_hub.src.utils.preset_utils import find_subclass
+from keras_hub.src.utils.preset_utils import get_file
+from keras_hub.src.utils.preset_utils import get_preset_loader
+from keras_hub.src.utils.preset_utils import save_serialized_object
+from keras_hub.src.utils.preset_utils import save_tokenizer_assets
+from keras_hub.src.utils.python_utils import classproperty
+from keras_hub.src.utils.tensor_utils import preprocessing_function
+
+
+@keras_hub_export(
     [
-        "keras_nlp.models.Tokenizer",
-        "keras_nlp.tokenizers.Tokenizer",
+        "keras_hub.models.Tokenizer",
+        "keras_hub.tokenizers.Tokenizer",
     ]
 )
 class Tokenizer(PreprocessingLayer):
     """A base class for tokenizer layers.
 
-    Tokenizers in the KerasNLP library should all subclass this layer.
+    Tokenizers in the KerasHub library should all subclass this layer.
     The class provides two core methods `tokenize()` and `detokenize()` for
     going from plain text to sequences and back. A tokenizer is a subclass of
     `keras.layers.Layer` and can be combined into a `keras.Model`.
@@ -57,7 +57,7 @@ class Tokenizer(PreprocessingLayer):
     Example:
 
     ```python
-    class WhitespaceSplitterTokenizer(keras_nlp.tokenizers.Tokenizer):
+    class WhitespaceSplitterTokenizer(keras_hub.tokenizers.Tokenizer):
         def tokenize(self, inputs):
             return tf.strings.split(inputs)
 
@@ -224,7 +224,7 @@ def from_preset(
         preset,
         **kwargs,
     ):
-        """Instantiate a `keras_nlp.models.Tokenizer` from a model preset.
+        """Instantiate a `keras_hub.models.Tokenizer` from a model preset.
 
         A preset is a directory of configs, weights and other file assets used
         to save and load a pre-trained model. The `preset` can be passed as
@@ -239,8 +239,8 @@ def from_preset(
         all built-in presets available on the class.
 
         This constructor can be called in one of two ways. Either from the base
-        class like `keras_nlp.models.Tokenizer.from_preset()`, or from
-        a model class like `keras_nlp.models.GemmaTokenizer.from_preset()`.
+        class like `keras_hub.models.Tokenizer.from_preset()`, or from
+        a model class like `keras_hub.models.GemmaTokenizer.from_preset()`.
         If calling from the base class, the subclass of the returning object
         will be inferred from the config in the preset directory.
 
@@ -254,7 +254,7 @@ class like `keras_nlp.models.Tokenizer.from_preset()`, or from
         Examples:
         ```python
         # Load a preset tokenizer.
-        tokenizer = keras_nlp.tokenizer.Tokenizer.from_preset("bert_base_en")
+        tokenizer = keras_hub.tokenizers.Tokenizer.from_preset("bert_base_en")
 
         # Tokenize some input.
         tokenizer("The quick brown fox tripped.")
diff --git a/keras_nlp/src/tokenizers/tokenizer_test.py b/keras_hub/src/tokenizers/tokenizer_test.py
similarity index 87%
rename from keras_nlp/src/tokenizers/tokenizer_test.py
rename to keras_hub/src/tokenizers/tokenizer_test.py
index e237e183c6..093ae5fc7e 100644
--- a/keras_nlp/src/tokenizers/tokenizer_test.py
+++ b/keras_hub/src/tokenizers/tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,16 +18,16 @@
 import tensorflow as tf
 from absl.testing import parameterized
 
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.tokenizer import Tokenizer
-from keras_nlp.src.utils.preset_utils import TOKENIZER_ASSET_DIR
-from keras_nlp.src.utils.preset_utils import TOKENIZER_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import check_config_class
-from keras_nlp.src.utils.preset_utils import load_json
+from keras_hub.src.models.albert.albert_tokenizer import AlbertTokenizer
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.src.models.roberta.roberta_tokenizer import RobertaTokenizer
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.tokenizer import Tokenizer
+from keras_hub.src.utils.preset_utils import TOKENIZER_ASSET_DIR
+from keras_hub.src.utils.preset_utils import TOKENIZER_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import check_config_class
+from keras_hub.src.utils.preset_utils import load_json
 
 
 class SimpleTokenizer(Tokenizer):
diff --git a/keras_nlp/src/tokenizers/unicode_codepoint_tokenizer.py b/keras_hub/src/tokenizers/unicode_codepoint_tokenizer.py
similarity index 91%
rename from keras_nlp/src/tokenizers/unicode_codepoint_tokenizer.py
rename to keras_hub/src/tokenizers/unicode_codepoint_tokenizer.py
index 30bf31afc1..727b015b55 100644
--- a/keras_nlp/src/tokenizers/unicode_codepoint_tokenizer.py
+++ b/keras_hub/src/tokenizers/unicode_codepoint_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.tokenizers import tokenizer
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_int_dtype
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.tokenizers import tokenizer
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_int_dtype
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -27,7 +27,7 @@
     tf_text = None
 
 
-@keras_nlp_export("keras_nlp.tokenizers.UnicodeCodepointTokenizer")
+@keras_hub_export("keras_hub.tokenizers.UnicodeCodepointTokenizer")
 class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
     """A unicode character tokenizer layer.
 
@@ -84,7 +84,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Basic Usage.
     >>> inputs = "Unicode Tokenizer"
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer()
     >>> outputs = tokenizer(inputs)
     >>> np.array(outputs)
     array([117, 110, 105,  99, 111, 100, 101,  32, 116, 111, 107, 101, 110,
@@ -92,7 +92,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Ragged outputs.
     >>> inputs = ["पुस्तक", "کتاب"]
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer()
     >>> seq1, seq2 = tokenizer(inputs)
     >>> np.array(seq1)
     array([2346, 2369, 2360, 2381, 2340, 2325])
@@ -101,7 +101,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Dense outputs.
     >>> inputs = ["पुस्तक", "کتاب"]
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer(
     ...     sequence_length=8)
     >>> seq1, seq2 = tokenizer(inputs)
     >>> np.array(seq1)
@@ -111,7 +111,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Tokenize, then batch for ragged outputs.
     >>> inputs = ["Book", "पुस्तक", "کتاب"]
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer()
     >>> ds = tf.data.Dataset.from_tensor_slices(inputs)
     >>> ds = ds.map(tokenizer)
     >>> ds = ds.apply(tf.data.experimental.dense_to_ragged_batch(3))
@@ -122,7 +122,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Batch, then tokenize for ragged outputs.
     >>> inputs = ["Book", "पुस्तक", "کتاب"]
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer()
     >>> ds = tf.data.Dataset.from_tensor_slices(inputs)
     >>> ds = ds.batch(3).map(tokenizer)
     >>> ds.take(1).get_single_element()
@@ -132,7 +132,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Tokenize, then batch for dense outputs (`sequence_length` provided).
     >>> inputs = ["Book", "पुस्तक", "کتاب"]
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer(
     ...     sequence_length=5)
     >>> ds = tf.data.Dataset.from_tensor_slices(inputs)
     >>> ds = ds.map(tokenizer)
@@ -145,7 +145,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Batch, then tokenize for dense outputs (`sequence_length` provided).
     >>> inputs = ["Book", "पुस्तक", "کتاب"]
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer(
     ...     sequence_length=5)
     >>> ds = tf.data.Dataset.from_tensor_slices(inputs)
     >>> ds = ds.batch(3).map(tokenizer)
@@ -157,7 +157,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Tokenization with truncation.
     >>> inputs = ["I Like to Travel a Lot", "मैं किताबें पढ़ना पसंद करता हूं"]
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer(
     ...     sequence_length=5)
     >>> outputs = tokenizer(inputs)
     >>> np.array(outputs)
@@ -166,7 +166,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Tokenization with vocabulary_size.
     >>> latin_ext_cutoff = 592
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer(
     ...     vocabulary_size=latin_ext_cutoff)
     >>> outputs = tokenizer("¿Cómo estás?")
     >>> np.array(outputs)
@@ -179,12 +179,12 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Detokenization.
     >>> inputs = tf.constant([110, 105, 110, 106,  97], dtype="int32")
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer()
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer()
     >>> tokenizer.detokenize(inputs)
     'ninja'
 
     Detokenization with padding.
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer(
     ...     sequence_length=7)
     >>> dataset = tf.data.Dataset.from_tensor_slices(["a b c", "b c", "a"])
     >>> dataset = dataset.map(tokenizer)
@@ -197,7 +197,7 @@ class UnicodeCodepointTokenizer(tokenizer.Tokenizer):
 
     Detokenization with invalid bytes.
     >>> inputs = tf.constant([110, 105, 10000000, 110, 106,  97])
-    >>> tokenizer = keras_nlp.tokenizers.UnicodeCodepointTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.UnicodeCodepointTokenizer(
     ...     errors="replace", replacement_char=88)
     >>> tokenizer.detokenize(inputs)
     'niXnja'
diff --git a/keras_nlp/src/tokenizers/unicode_codepoint_tokenizer_test.py b/keras_hub/src/tokenizers/unicode_codepoint_tokenizer_test.py
similarity index 98%
rename from keras_nlp/src/tokenizers/unicode_codepoint_tokenizer_test.py
rename to keras_hub/src/tokenizers/unicode_codepoint_tokenizer_test.py
index 0b3ee0dddc..8112d2d633 100644
--- a/keras_nlp/src/tokenizers/unicode_codepoint_tokenizer_test.py
+++ b/keras_hub/src/tokenizers/unicode_codepoint_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.unicode_codepoint_tokenizer import (
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.unicode_codepoint_tokenizer import (
     UnicodeCodepointTokenizer,
 )
 
diff --git a/keras_nlp/src/tokenizers/word_piece_tokenizer.py b/keras_hub/src/tokenizers/word_piece_tokenizer.py
similarity index 96%
rename from keras_nlp/src/tokenizers/word_piece_tokenizer.py
rename to keras_hub/src/tokenizers/word_piece_tokenizer.py
index f228afae2a..fee24b3b80 100644
--- a/keras_nlp/src/tokenizers/word_piece_tokenizer.py
+++ b/keras_hub/src/tokenizers/word_piece_tokenizer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,12 +18,12 @@
 
 import keras
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.tokenizers import tokenizer
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_int_dtype
-from keras_nlp.src.utils.tensor_utils import is_string_dtype
-from keras_nlp.src.utils.tensor_utils import preprocessing_function
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.tokenizers import tokenizer
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_int_dtype
+from keras_hub.src.utils.tensor_utils import is_string_dtype
+from keras_hub.src.utils.tensor_utils import preprocessing_function
 
 try:
     import tensorflow as tf
@@ -201,7 +201,7 @@ def pretokenize(
     return text
 
 
-@keras_nlp_export("keras_nlp.tokenizers.WordPieceTokenizer")
+@keras_hub_export("keras_hub.tokenizers.WordPieceTokenizer")
 class WordPieceTokenizer(tokenizer.Tokenizer):
     """A WordPiece tokenizer layer.
 
@@ -277,7 +277,7 @@ class WordPieceTokenizer(tokenizer.Tokenizer):
     Ragged outputs.
     >>> vocab = ["[UNK]", "the", "qu", "##ick", "br", "##own", "fox", "."]
     >>> inputs = "The quick brown fox."
-    >>> tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.WordPieceTokenizer(
     ...     vocabulary=vocab,
     ...     lowercase=True,
     ... )
@@ -288,7 +288,7 @@ class WordPieceTokenizer(tokenizer.Tokenizer):
     Dense outputs.
     >>> vocab = ["[UNK]", "the", "qu", "##ick", "br", "##own", "fox", "."]
     >>> inputs = ["The quick brown fox."]
-    >>> tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.WordPieceTokenizer(
     ...     vocabulary=vocab,
     ...     sequence_length=10,
     ...     lowercase=True,
@@ -300,7 +300,7 @@ class WordPieceTokenizer(tokenizer.Tokenizer):
     String output.
     >>> vocab = ["[UNK]", "the", "qu", "##ick", "br", "##own", "fox", "."]
     >>> inputs = "The quick brown fox."
-    >>> tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.WordPieceTokenizer(
     ...     vocabulary=vocab,
     ...     lowercase=True,
     ...     dtype="string",
@@ -311,7 +311,7 @@ class WordPieceTokenizer(tokenizer.Tokenizer):
     Detokenization.
     >>> vocab = ["[UNK]", "the", "qu", "##ick", "br", "##own", "fox", "."]
     >>> inputs = "The quick brown fox."
-    >>> tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.WordPieceTokenizer(
     ...     vocabulary=vocab,
     ...     lowercase=True,
     ... )
@@ -321,7 +321,7 @@ class WordPieceTokenizer(tokenizer.Tokenizer):
     Custom splitting.
     >>> vocab = ["[UNK]", "the", "qu", "##ick", "br", "##own", "fox", "."]
     >>> inputs = "The$quick$brown$fox"
-    >>> tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
+    >>> tokenizer = keras_hub.tokenizers.WordPieceTokenizer(
     ...     vocabulary=vocab,
     ...     split=False,
     ...     lowercase=True,
diff --git a/keras_nlp/src/tokenizers/word_piece_tokenizer_test.py b/keras_hub/src/tokenizers/word_piece_tokenizer_test.py
similarity index 98%
rename from keras_nlp/src/tokenizers/word_piece_tokenizer_test.py
rename to keras_hub/src/tokenizers/word_piece_tokenizer_test.py
index c8ccdc9afb..e0234d26f8 100644
--- a/keras_nlp/src/tokenizers/word_piece_tokenizer_test.py
+++ b/keras_hub/src/tokenizers/word_piece_tokenizer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
 
 
 class WordPieceTokenizerTest(TestCase):
diff --git a/keras_nlp/src/tokenizers/word_piece_tokenizer_trainer.py b/keras_hub/src/tokenizers/word_piece_tokenizer_trainer.py
similarity index 92%
rename from keras_nlp/src/tokenizers/word_piece_tokenizer_trainer.py
rename to keras_hub/src/tokenizers/word_piece_tokenizer_trainer.py
index b47c7ddd59..4ecbe5aca4 100644
--- a/keras_nlp/src/tokenizers/word_piece_tokenizer_trainer.py
+++ b/keras_hub/src/tokenizers/word_piece_tokenizer_trainer.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.tokenizers.word_piece_tokenizer import pretokenize
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.tokenizers.word_piece_tokenizer import pretokenize
 
 try:
     import tensorflow as tf
@@ -26,7 +26,7 @@
     learner = None
 
 
-@keras_nlp_export("keras_nlp.tokenizers.compute_word_piece_vocabulary")
+@keras_hub_export("keras_hub.tokenizers.compute_word_piece_vocabulary")
 def compute_word_piece_vocabulary(
     data,
     vocabulary_size,
@@ -82,7 +82,7 @@ def compute_word_piece_vocabulary(
     >>> vocab = compute_word_piece_vocabulary(inputs, 13)
     >>> vocab
     ['[PAD]', '[CLS]', '[SEP]', '[UNK]', '[MASK]', 'a', 'b', 'm', 'p', 'r', 's', 't', '##at']
-    >>> tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(vocabulary=vocab, oov_token="[UNK]")
+    >>> tokenizer = keras_hub.tokenizers.WordPieceTokenizer(vocabulary=vocab, oov_token="[UNK]")
     >>> outputs = inputs.map(tokenizer.tokenize)
     >>> for x in outputs:
     ...     print(x)
@@ -93,7 +93,7 @@ def compute_word_piece_vocabulary(
     with open("test.txt", "w+") as f:
         f.write("bat sat pat mat rat\n")
     inputs = ["test.txt"]
-    vocab = keras_nlp.tokenizers.compute_word_piece_vocabulary(inputs, 13)
+    vocab = keras_hub.tokenizers.compute_word_piece_vocabulary(inputs, 13)
     ```
 
     Custom Split Usage (from Dataset).
@@ -108,7 +108,7 @@ def compute_word_piece_vocabulary(
     ... )
     >>> vocab
     ['[PAD]', '[CLS]', '[SEP]', '[UNK]', '[MASK]', 'a', 'b', 'm', 'p', 'r', 's', 't', '##at']
-    >>> tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(vocabulary=vocab)
+    >>> tokenizer = keras_hub.tokenizers.WordPieceTokenizer(vocabulary=vocab)
     >>> inputs.map(tokenizer.tokenize)
 
     Custom Split Usage (from filenames).
@@ -121,10 +121,10 @@ def normalize_and_split(x):
         f.write("bat sat: pat mat rat.\n")
     inputs = tf.data.TextLineDataset(["test.txt"])
     split_inputs = inputs.map(normalize_and_split)
-    vocab = keras_nlp.tokenizers.compute_word_piece_vocabulary(
+    vocab = keras_hub.tokenizers.compute_word_piece_vocabulary(
         split_inputs, 13, split=False
     )
-    tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(vocabulary=vocab)
+    tokenizer = keras_hub.tokenizers.WordPieceTokenizer(vocabulary=vocab)
     inputs.map(tokenizer.tokenize)
     ```
     """
diff --git a/keras_nlp/src/tokenizers/word_piece_tokenizer_trainer_test.py b/keras_hub/src/tokenizers/word_piece_tokenizer_trainer_test.py
similarity index 98%
rename from keras_nlp/src/tokenizers/word_piece_tokenizer_trainer_test.py
rename to keras_hub/src/tokenizers/word_piece_tokenizer_trainer_test.py
index 699de8851b..9601e1376f 100644
--- a/keras_nlp/src/tokenizers/word_piece_tokenizer_trainer_test.py
+++ b/keras_hub/src/tokenizers/word_piece_tokenizer_trainer_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,8 +16,8 @@
 
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.tokenizers.word_piece_tokenizer_trainer import (
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.tokenizers.word_piece_tokenizer_trainer import (
     compute_word_piece_vocabulary,
 )
 
diff --git a/keras_hub/src/utils/__init__.py b/keras_hub/src/utils/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/utils/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/utils/keras_utils.py b/keras_hub/src/utils/keras_utils.py
similarity index 96%
rename from keras_nlp/src/utils/keras_utils.py
rename to keras_hub/src/utils/keras_utils.py
index 9d67f5abc2..dd2bd5ef9e 100644
--- a/keras_nlp/src/utils/keras_utils.py
+++ b/keras_hub/src/utils/keras_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -50,7 +50,7 @@ def print_msg(message, line_break=True):
         logging.info(message)
 
 
-@keras.saving.register_keras_serializable(package="keras_nlp")
+@keras.saving.register_keras_serializable(package="keras_hub")
 def gelu_approximate(x):
     return keras.activations.gelu(x, approximate=True)
 
diff --git a/keras_nlp/src/utils/keras_utils_test.py b/keras_hub/src/utils/keras_utils_test.py
similarity index 89%
rename from keras_nlp/src/utils/keras_utils_test.py
rename to keras_hub/src/utils/keras_utils_test.py
index 0e31ad9698..1c2fa94215 100644
--- a/keras_nlp/src/utils/keras_utils_test.py
+++ b/keras_hub/src/utils/keras_utils_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,8 +14,8 @@
 
 import keras
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.utils.keras_utils import clone_initializer
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.utils.keras_utils import clone_initializer
 
 
 class CloneInitializerTest(TestCase):
diff --git a/keras_nlp/src/utils/pipeline_model.py b/keras_hub/src/utils/pipeline_model.py
similarity index 98%
rename from keras_nlp/src/utils/pipeline_model.py
rename to keras_hub/src/utils/pipeline_model.py
index 5fc5a6c381..4c215f717b 100644
--- a/keras_nlp/src/utils/pipeline_model.py
+++ b/keras_hub/src/utils/pipeline_model.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -19,7 +19,7 @@
 from keras import ops
 from keras import tree
 
-from keras_nlp.src.utils.tensor_utils import is_tensor_type
+from keras_hub.src.utils.tensor_utils import is_tensor_type
 
 try:
     import tensorflow as tf
@@ -141,7 +141,7 @@ def _split(t, start, end):
     return train_arrays, val_arrays
 
 
-@keras.saving.register_keras_serializable(package="keras_nlp")
+@keras.saving.register_keras_serializable(package="keras_hub")
 class PipelineModel(keras.Model):
     """A model which allows automatically applying preprocessing."""
 
diff --git a/keras_nlp/src/utils/pipeline_model_test.py b/keras_hub/src/utils/pipeline_model_test.py
similarity index 99%
rename from keras_nlp/src/utils/pipeline_model_test.py
rename to keras_hub/src/utils/pipeline_model_test.py
index 423af3d83c..8e20a1b7ad 100644
--- a/keras_nlp/src/utils/pipeline_model_test.py
+++ b/keras_hub/src/utils/pipeline_model_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,8 +18,8 @@
 import numpy as np
 import tensorflow as tf
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.utils.pipeline_model import PipelineModel
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.utils.pipeline_model import PipelineModel
 
 
 class NoopPipeline(PipelineModel):
diff --git a/keras_nlp/src/utils/preset_utils.py b/keras_hub/src/utils/preset_utils.py
similarity index 96%
rename from keras_nlp/src/utils/preset_utils.py
rename to keras_hub/src/utils/preset_utils.py
index a935fd160b..bfe343ca23 100644
--- a/keras_nlp/src/utils/preset_utils.py
+++ b/keras_hub/src/utils/preset_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -23,14 +23,14 @@
 from absl import logging
 from packaging.version import parse
 
-from keras_nlp.src.api_export import keras_nlp_export
-from keras_nlp.src.utils.keras_utils import print_msg
+from keras_hub.src.api_export import keras_hub_export
+from keras_hub.src.utils.keras_utils import print_msg
 
 try:
     import tensorflow as tf
 except ImportError:
     raise ImportError(
-        "To use `keras_nlp`, please install Tensorflow: `pip install tensorflow`. "
+        "To use `keras_hub`, please install TensorFlow: `pip install tensorflow`. "
         "The TensorFlow package is required for data preprocessing with any backend."
     )
 
@@ -281,9 +281,9 @@ def check_file_exists(preset, path):
 
 
 def get_tokenizer(layer):
-    """Get the tokenizer from any KerasNLP model or layer."""
+    """Get the tokenizer from any KerasHub model or layer."""
     # Avoid circular import.
-    from keras_nlp.src.tokenizers.tokenizer import Tokenizer
+    from keras_hub.src.tokenizers.tokenizer import Tokenizer
 
     if isinstance(layer, Tokenizer):
         return layer
@@ -331,12 +331,12 @@ def save_serialized_object(
 
 
 def save_metadata(layer, preset):
-    from keras_nlp.src.version_utils import __version__ as keras_nlp_version
+    from keras_hub.src.version_utils import __version__ as keras_hub_version
 
     keras_version = keras.version() if hasattr(keras, "version") else None
     metadata = {
         "keras_version": keras_version,
-        "keras_nlp_version": keras_nlp_version,
+        "keras_hub_version": keras_hub_version,
         "parameter_count": layer.count_params(),
         "date_saved": datetime.datetime.now().strftime("%Y-%m-%d@%H:%M:%S"),
     }
@@ -425,7 +425,7 @@ def create_model_card(preset):
 
     # YAML
     markdown_content += "---\n"
-    markdown_content += "library_name: keras-nlp\n"
+    markdown_content += "library_name: keras-hub\n"
     if task_type == "CausalLM":
         markdown_content += "pipeline_tag: text-generation\n"
     elif task_type == "TextClassifier":
@@ -433,11 +433,11 @@ def create_model_card(preset):
     markdown_content += "---\n"
 
     model_link = (
-        f"https://keras.io/api/keras_nlp/models/{to_snake_case(model_name)}"
+        f"https://keras.io/api/keras_hub/models/{to_snake_case(model_name)}"
     )
     markdown_content += (
         f"This is a [`{model_name}` model]({model_link}) "
-        "uploaded using the KerasNLP library and can be used with JAX, "
+        "uploaded using the KerasHub library and can be used with JAX, "
         "TensorFlow, and PyTorch backends.\n"
     )
     if task_type:
@@ -471,7 +471,7 @@ def delete_model_card(preset):
         )
 
 
-@keras_nlp_export("keras_nlp.upload_preset")
+@keras_hub_export("keras_hub.upload_preset")
 def upload_preset(
     uri,
     preset,
@@ -608,7 +608,7 @@ def get_preset_loader(preset):
     if not check_file_exists(preset, CONFIG_FILE):
         raise ValueError(
             f"Preset {preset} has no {CONFIG_FILE}. Make sure the URI or "
-            "directory you are trying to load is a valid KerasNLP preset and "
+            "directory you are trying to load is a valid KerasHub preset "
             "and that you have permissions to read/download from this location."
         )
     # We currently assume all formats we support have a `config.json`, this is
@@ -620,7 +620,7 @@ def get_preset_loader(preset):
         return KerasPresetLoader(preset, config)
     elif "model_type" in config:
         # Avoid circular import.
-        from keras_nlp.src.utils.transformers.preset_loader import (
+        from keras_hub.src.utils.transformers.preset_loader import (
             TransformersPresetLoader,
         )
 
@@ -628,7 +628,7 @@ def get_preset_loader(preset):
         return TransformersPresetLoader(preset, config)
     elif "architecture" in config:
         # Avoid circular import.
-        from keras_nlp.src.utils.timm.preset_loader import TimmPresetLoader
+        from keras_hub.src.utils.timm.preset_loader import TimmPresetLoader
 
         # If we see "architecture", we assume a timm config. We could make this
         # more robust later on if we need to.
@@ -638,7 +638,7 @@ def get_preset_loader(preset):
         contents = json.dumps(config, indent=4)
         raise ValueError(
             f"Unrecognized format for {CONFIG_FILE} in {preset}. "
-            "Create a preset with the `save_to_preset` utility on KerasNLP "
+            "Create a preset with the `save_to_preset` utility on KerasHub "
             f"models. Contents of {CONFIG_FILE}:\n{contents}"
         )
 
diff --git a/keras_nlp/src/utils/preset_utils_test.py b/keras_hub/src/utils/preset_utils_test.py
similarity index 86%
rename from keras_nlp/src/utils/preset_utils_test.py
rename to keras_hub/src/utils/preset_utils_test.py
index 728e5a6967..30e4f5e140 100644
--- a/keras_nlp/src/utils/preset_utils_test.py
+++ b/keras_hub/src/utils/preset_utils_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,17 +17,17 @@
 import pytest
 from absl.testing import parameterized
 
-from keras_nlp.src.models.albert.albert_text_classifier import (
+from keras_hub.src.models.albert.albert_text_classifier import (
     AlbertTextClassifier,
 )
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.utils.keras_utils import has_quantization_support
-from keras_nlp.src.utils.preset_utils import CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import TOKENIZER_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import load_serialized_object
-from keras_nlp.src.utils.preset_utils import upload_preset
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_tokenizer import BertTokenizer
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.utils.keras_utils import has_quantization_support
+from keras_hub.src.utils.preset_utils import CONFIG_FILE
+from keras_hub.src.utils.preset_utils import TOKENIZER_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import load_serialized_object
+from keras_hub.src.utils.preset_utils import upload_preset
 
 
 class PresetUtilsTest(TestCase):
diff --git a/keras_nlp/src/utils/python_utils.py b/keras_hub/src/utils/python_utils.py
similarity index 95%
rename from keras_nlp/src/utils/python_utils.py
rename to keras_hub/src/utils/python_utils.py
index c4cc28aaee..61de0bf6b5 100644
--- a/keras_nlp/src/utils/python_utils.py
+++ b/keras_hub/src/utils/python_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/keras_nlp/src/utils/python_utils_test.py b/keras_hub/src/utils/python_utils_test.py
similarity index 84%
rename from keras_nlp/src/utils/python_utils_test.py
rename to keras_hub/src/utils/python_utils_test.py
index 647dfa0c4a..75dd808c69 100644
--- a/keras_nlp/src/utils/python_utils_test.py
+++ b/keras_hub/src/utils/python_utils_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.utils.python_utils import classproperty
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.utils.python_utils import classproperty
 
 
 class ClassPropertyTest(TestCase):
diff --git a/keras_nlp/src/utils/tensor_utils.py b/keras_hub/src/utils/tensor_utils.py
similarity index 95%
rename from keras_nlp/src/utils/tensor_utils.py
rename to keras_hub/src/utils/tensor_utils.py
index 94594202ab..3eb4556b4a 100644
--- a/keras_nlp/src/utils/tensor_utils.py
+++ b/keras_hub/src/utils/tensor_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -96,11 +96,11 @@ def convert_preprocessing_inputs(x):
     ```python
     # Two ragged arrays of token ids.
     x = ([[1, 2, 3], [4, 5]], [[1, 2], [3, 4, 5]])
-    keras_nlp.utils.convert_preprocessing_inputs(x)
+    keras_hub.utils.convert_preprocessing_inputs(x)
 
     # A batch of three samples each with two string segments.
     x = (["hi", "hello", "hey"], ["bye", "later", "so long"])
-    keras_nlp.utils.convert_preprocessing_inputs(x)
+    keras_hub.utils.convert_preprocessing_inputs(x)
 
     # A batch of features in a dictionary.
     x = {
@@ -108,7 +108,7 @@ def convert_preprocessing_inputs(x):
         "images": np.ones((3, 64, 64, 3)),
         "labels": [1, 0, 1],
     }
-    keras_nlp.utils.convert_preprocessing_inputs(x)
+    keras_hub.utils.convert_preprocessing_inputs(x)
     ```
     """
     if not tf.executing_eagerly() or in_no_convert_scope():
@@ -167,7 +167,7 @@ def convert_preprocessing_outputs(x):
     - Python lists, in the case of ragged and string data.
 
-    This will automatically be called when on the output of preprocessing
+    This will automatically be called on the output of preprocessing
-    layers or `keras_nlp.models.Task`s with preprocessing included. It could be
+    layers or `keras_hub.models.Task`s with preprocessing included. It could be
     used directly to convert a `tf.data.Dataset` output to a backend agnostic
     type.
 
@@ -175,11 +175,11 @@ def convert_preprocessing_outputs(x):
     ```python
     # Two ragged arrays of token ids.
     x = tf.ragged.constant([[1, 2, 3], [4, 5]])
-    keras_nlp.utils.convert_preprocessing_outputs(x)
+    keras_hub.utils.convert_preprocessing_outputs(x)
 
     # A batch of three samples each with two string segments.
-    x = (tf.constant["hi", "yo", "hey"]), tf.constant(["bye", "ciao", ""]))
+    x = (tf.constant(["hi", "yo", "hey"]), tf.constant(["bye", "ciao", ""]))
-    keras_nlp.utils.convert_preprocessing_outputs(x)
+    keras_hub.utils.convert_preprocessing_outputs(x)
 
     # A batch of features in a dictionary.
     x = {
@@ -187,7 +187,7 @@ def convert_preprocessing_outputs(x):
         "images": tf.ones((3, 64, 64, 3)),
         "labels": tf.constant([1, 0, 1]),
     }
-    keras_nlp.utils.convert_preprocessing_outputs(x)
+    keras_hub.utils.convert_preprocessing_outputs(x)
     ```
     """
     if not tf.executing_eagerly() or in_no_convert_scope():
@@ -270,7 +270,7 @@ def assert_tf_libs_installed(symbol_name):
             "both packages or visit https://www.tensorflow.org/install\n\n"
             "If `tensorflow-text` is already installed, try importing it "
             "in a clean python session. Your installation may have errors.\n\n"
-            "KerasNLP uses `tf.data` and `tensorflow-text` to preprocess text "
+            "KerasHub uses `tf.data` and `tensorflow-text` to preprocess text "
             "on all Keras backends. If you are running on Jax or Torch, this "
             "installation does not need GPU support."
         )
diff --git a/keras_nlp/src/utils/tensor_utils_test.py b/keras_hub/src/utils/tensor_utils_test.py
similarity index 94%
rename from keras_nlp/src/utils/tensor_utils_test.py
rename to keras_hub/src/utils/tensor_utils_test.py
index c0f34595c2..226b796eac 100644
--- a/keras_nlp/src/utils/tensor_utils_test.py
+++ b/keras_hub/src/utils/tensor_utils_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,13 +17,13 @@
 from keras import ops
 from keras import tree
 
-from keras_nlp.src.tests.test_case import TestCase
-from keras_nlp.src.utils.tensor_utils import any_equal
-from keras_nlp.src.utils.tensor_utils import convert_preprocessing_inputs
-from keras_nlp.src.utils.tensor_utils import convert_preprocessing_outputs
-from keras_nlp.src.utils.tensor_utils import convert_to_ragged_batch
-from keras_nlp.src.utils.tensor_utils import is_tensor_type
-from keras_nlp.src.utils.tensor_utils import tensor_to_list
+from keras_hub.src.tests.test_case import TestCase
+from keras_hub.src.utils.tensor_utils import any_equal
+from keras_hub.src.utils.tensor_utils import convert_preprocessing_inputs
+from keras_hub.src.utils.tensor_utils import convert_preprocessing_outputs
+from keras_hub.src.utils.tensor_utils import convert_to_ragged_batch
+from keras_hub.src.utils.tensor_utils import is_tensor_type
+from keras_hub.src.utils.tensor_utils import tensor_to_list
 
 
 class ConvertHelpers(TestCase):
diff --git a/keras_hub/src/utils/timm/__init__.py b/keras_hub/src/utils/timm/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/utils/timm/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/utils/timm/convert_resnet.py b/keras_hub/src/utils/timm/convert_resnet.py
similarity index 98%
rename from keras_nlp/src/utils/timm/convert_resnet.py
rename to keras_hub/src/utils/timm/convert_resnet.py
index 9ede5e8b73..8042d5f5f1 100644
--- a/keras_nlp/src/utils/timm/convert_resnet.py
+++ b/keras_hub/src/utils/timm/convert_resnet.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,7 +13,7 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
+from keras_hub.src.models.resnet.resnet_backbone import ResNetBackbone
 
 backbone_cls = ResNetBackbone
 
diff --git a/keras_nlp/src/utils/timm/convert_resnet_test.py b/keras_hub/src/utils/timm/convert_resnet_test.py
similarity index 85%
rename from keras_nlp/src/utils/timm/convert_resnet_test.py
rename to keras_hub/src/utils/timm/convert_resnet_test.py
index 78f8feacf3..70cb69b9d5 100644
--- a/keras_nlp/src/utils/timm/convert_resnet_test.py
+++ b/keras_hub/src/utils/timm/convert_resnet_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,9 +14,9 @@
 import pytest
 from keras import ops
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.image_classifier import ImageClassifier
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.image_classifier import ImageClassifier
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TimmResNetBackboneTest(TestCase):
diff --git a/keras_nlp/src/utils/timm/preset_loader.py b/keras_hub/src/utils/timm/preset_loader.py
similarity index 84%
rename from keras_nlp/src/utils/timm/preset_loader.py
rename to keras_hub/src/utils/timm/preset_loader.py
index 69476f378c..123cdf9674 100644
--- a/keras_nlp/src/utils/timm/preset_loader.py
+++ b/keras_hub/src/utils/timm/preset_loader.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,13 +11,13 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""Convert timm models to KerasNLP."""
+"""Convert timm models to KerasHub."""
 
-from keras_nlp.src.models.image_classifier import ImageClassifier
-from keras_nlp.src.utils.preset_utils import PresetLoader
-from keras_nlp.src.utils.preset_utils import jax_memory_cleanup
-from keras_nlp.src.utils.timm import convert_resnet
-from keras_nlp.src.utils.transformers.safetensor_utils import SafetensorLoader
+from keras_hub.src.models.image_classifier import ImageClassifier
+from keras_hub.src.utils.preset_utils import PresetLoader
+from keras_hub.src.utils.preset_utils import jax_memory_cleanup
+from keras_hub.src.utils.timm import convert_resnet
+from keras_hub.src.utils.transformers.safetensor_utils import SafetensorLoader
 
 
 class TimmPresetLoader(PresetLoader):
@@ -28,7 +28,7 @@ def __init__(self, preset, config):
             self.converter = convert_resnet
         else:
             raise ValueError(
-                "KerasNLP has no converter for timm models "
+                "KerasHub has no converter for timm models "
                 f"with architecture `'{architecture}'`."
             )
 
diff --git a/keras_hub/src/utils/transformers/__init__.py b/keras_hub/src/utils/transformers/__init__.py
new file mode 100644
index 0000000000..fd48fde00f
--- /dev/null
+++ b/keras_hub/src/utils/transformers/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 The KerasHub Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/keras_nlp/src/utils/transformers/convert_albert.py b/keras_hub/src/utils/transformers/convert_albert.py
similarity index 98%
rename from keras_nlp/src/utils/transformers/convert_albert.py
rename to keras_hub/src/utils/transformers/convert_albert.py
index 171749fe2d..78c25f4503 100644
--- a/keras_nlp/src/utils/transformers/convert_albert.py
+++ b/keras_hub/src/utils/transformers/convert_albert.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.utils.preset_utils import get_file
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.utils.preset_utils import get_file
 
 backbone_cls = AlbertBackbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_albert_test.py b/keras_hub/src/utils/transformers/convert_albert_test.py
similarity index 80%
rename from keras_nlp/src/utils/transformers/convert_albert_test.py
rename to keras_hub/src/utils/transformers/convert_albert_test.py
index a5fe3a24ce..2a4253a44d 100644
--- a/keras_nlp/src/utils/transformers/convert_albert_test.py
+++ b/keras_hub/src/utils/transformers/convert_albert_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,13 +13,13 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_text_classifier import (
+from keras_hub.src.models.albert.albert_backbone import AlbertBackbone
+from keras_hub.src.models.albert.albert_text_classifier import (
     AlbertTextClassifier,
 )
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.text_classifier import TextClassifier
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.text_classifier import TextClassifier
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_bart.py b/keras_hub/src/utils/transformers/convert_bart.py
similarity index 98%
rename from keras_nlp/src/utils/transformers/convert_bart.py
rename to keras_hub/src/utils/transformers/convert_bart.py
index c004c2898b..9c0a2f048e 100644
--- a/keras_nlp/src/utils/transformers/convert_bart.py
+++ b/keras_hub/src/utils/transformers/convert_bart.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.utils.preset_utils import get_file
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.utils.preset_utils import get_file
 
 backbone_cls = BartBackbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_bart_test.py b/keras_hub/src/utils/transformers/convert_bart_test.py
similarity index 80%
rename from keras_nlp/src/utils/transformers/convert_bart_test.py
rename to keras_hub/src/utils/transformers/convert_bart_test.py
index 0e7aa37f37..bc174f4105 100644
--- a/keras_nlp/src/utils/transformers/convert_bart_test.py
+++ b/keras_hub/src/utils/transformers/convert_bart_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm import BartSeq2SeqLM
-from keras_nlp.src.models.seq_2_seq_lm import Seq2SeqLM
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.bart.bart_backbone import BartBackbone
+from keras_hub.src.models.bart.bart_seq_2_seq_lm import BartSeq2SeqLM
+from keras_hub.src.models.seq_2_seq_lm import Seq2SeqLM
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_bert.py b/keras_hub/src/utils/transformers/convert_bert.py
similarity index 95%
rename from keras_nlp/src/utils/transformers/convert_bert.py
rename to keras_hub/src/utils/transformers/convert_bert.py
index 02caca3ba1..e57fa051aa 100644
--- a/keras_nlp/src/utils/transformers/convert_bert.py
+++ b/keras_hub/src/utils/transformers/convert_bert.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,10 +13,10 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.utils.preset_utils import HF_TOKENIZER_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import get_file
-from keras_nlp.src.utils.preset_utils import load_json
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.utils.preset_utils import HF_TOKENIZER_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import get_file
+from keras_hub.src.utils.preset_utils import load_json
 
 backbone_cls = BertBackbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_bert_test.py b/keras_hub/src/utils/transformers/convert_bert_test.py
similarity index 80%
rename from keras_nlp/src/utils/transformers/convert_bert_test.py
rename to keras_hub/src/utils/transformers/convert_bert_test.py
index 16699aa0cc..cbd8803a9b 100644
--- a/keras_nlp/src/utils/transformers/convert_bert_test.py
+++ b/keras_hub/src/utils/transformers/convert_bert_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_text_classifier import BertTextClassifier
-from keras_nlp.src.models.text_classifier import TextClassifier
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.bert.bert_backbone import BertBackbone
+from keras_hub.src.models.bert.bert_text_classifier import BertTextClassifier
+from keras_hub.src.models.text_classifier import TextClassifier
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_distilbert.py b/keras_hub/src/utils/transformers/convert_distilbert.py
similarity index 96%
rename from keras_nlp/src/utils/transformers/convert_distilbert.py
rename to keras_hub/src/utils/transformers/convert_distilbert.py
index 240763bd8c..c4a31ce2eb 100644
--- a/keras_nlp/src/utils/transformers/convert_distilbert.py
+++ b/keras_hub/src/utils/transformers/convert_distilbert.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,12 +13,12 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.utils.preset_utils import HF_TOKENIZER_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import get_file
-from keras_nlp.src.utils.preset_utils import load_json
+from keras_hub.src.utils.preset_utils import HF_TOKENIZER_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import get_file
+from keras_hub.src.utils.preset_utils import load_json
 
 backbone_cls = DistilBertBackbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_distilbert_test.py b/keras_hub/src/utils/transformers/convert_distilbert_test.py
similarity index 81%
rename from keras_nlp/src/utils/transformers/convert_distilbert_test.py
rename to keras_hub/src/utils/transformers/convert_distilbert_test.py
index 0aa037d54a..0463c9eaf7 100644
--- a/keras_nlp/src/utils/transformers/convert_distilbert_test.py
+++ b/keras_hub/src/utils/transformers/convert_distilbert_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,15 +13,15 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.distil_bert.distil_bert_backbone import (
     DistilBertBackbone,
 )
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier import (
+from keras_hub.src.models.distil_bert.distil_bert_text_classifier import (
     DistilBertTextClassifier,
 )
-from keras_nlp.src.models.text_classifier import TextClassifier
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.text_classifier import TextClassifier
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_gemma.py b/keras_hub/src/utils/transformers/convert_gemma.py
similarity index 97%
rename from keras_nlp/src/utils/transformers/convert_gemma.py
rename to keras_hub/src/utils/transformers/convert_gemma.py
index 7eab62b17c..e26151d5a9 100644
--- a/keras_nlp/src/utils/transformers/convert_gemma.py
+++ b/keras_hub/src/utils/transformers/convert_gemma.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.utils.preset_utils import get_file
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.utils.preset_utils import get_file
 
 backbone_cls = GemmaBackbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_gemma_test.py b/keras_hub/src/utils/transformers/convert_gemma_test.py
similarity index 82%
rename from keras_nlp/src/utils/transformers/convert_gemma_test.py
rename to keras_hub/src/utils/transformers/convert_gemma_test.py
index fd0900b9f7..bfa9b6d7df 100644
--- a/keras_nlp/src/utils/transformers/convert_gemma_test.py
+++ b/keras_hub/src/utils/transformers/convert_gemma_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.models.gemma.gemma_causal_lm import GemmaCausalLM
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
+from keras_hub.src.models.gemma.gemma_causal_lm import GemmaCausalLM
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_gpt2.py b/keras_hub/src/utils/transformers/convert_gpt2.py
similarity index 97%
rename from keras_nlp/src/utils/transformers/convert_gpt2.py
rename to keras_hub/src/utils/transformers/convert_gpt2.py
index 73bc596905..d4a6de9898 100644
--- a/keras_nlp/src/utils/transformers/convert_gpt2.py
+++ b/keras_hub/src/utils/transformers/convert_gpt2.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.utils.preset_utils import get_file
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.utils.preset_utils import get_file
 
 backbone_cls = GPT2Backbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_gpt2_test.py b/keras_hub/src/utils/transformers/convert_gpt2_test.py
similarity index 80%
rename from keras_nlp/src/utils/transformers/convert_gpt2_test.py
rename to keras_hub/src/utils/transformers/convert_gpt2_test.py
index 68fd0ef77a..9ea2df33b2 100644
--- a/keras_nlp/src/utils/transformers/convert_gpt2_test.py
+++ b/keras_hub/src/utils/transformers/convert_gpt2_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_llama3.py b/keras_hub/src/utils/transformers/convert_llama3.py
similarity index 96%
rename from keras_nlp/src/utils/transformers/convert_llama3.py
rename to keras_hub/src/utils/transformers/convert_llama3.py
index 3402aff55d..9675ab42a0 100644
--- a/keras_nlp/src/utils/transformers/convert_llama3.py
+++ b/keras_hub/src/utils/transformers/convert_llama3.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.utils.preset_utils import load_json
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.utils.preset_utils import load_json
 
 backbone_cls = Llama3Backbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_llama3_test.py b/keras_hub/src/utils/transformers/convert_llama3_test.py
similarity index 80%
rename from keras_nlp/src/utils/transformers/convert_llama3_test.py
rename to keras_hub/src/utils/transformers/convert_llama3_test.py
index 8b78988c49..e24e100f0f 100644
--- a/keras_nlp/src/utils/transformers/convert_llama3_test.py
+++ b/keras_hub/src/utils/transformers/convert_llama3_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.models.llama3.llama3_causal_lm import Llama3CausalLM
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.llama3.llama3_backbone import Llama3Backbone
+from keras_hub.src.models.llama3.llama3_causal_lm import Llama3CausalLM
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_mistral.py b/keras_hub/src/utils/transformers/convert_mistral.py
similarity index 97%
rename from keras_nlp/src/utils/transformers/convert_mistral.py
rename to keras_hub/src/utils/transformers/convert_mistral.py
index df1ec79824..5630c0bbdb 100644
--- a/keras_nlp/src/utils/transformers/convert_mistral.py
+++ b/keras_hub/src/utils/transformers/convert_mistral.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.utils.preset_utils import get_file
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.utils.preset_utils import get_file
 
 backbone_cls = MistralBackbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_mistral_test.py b/keras_hub/src/utils/transformers/convert_mistral_test.py
similarity index 80%
rename from keras_nlp/src/utils/transformers/convert_mistral_test.py
rename to keras_hub/src/utils/transformers/convert_mistral_test.py
index 56982faf3b..76601c5db9 100644
--- a/keras_nlp/src/utils/transformers/convert_mistral_test.py
+++ b/keras_hub/src/utils/transformers/convert_mistral_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.models.mistral.mistral_causal_lm import MistralCausalLM
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.mistral.mistral_backbone import MistralBackbone
+from keras_hub.src.models.mistral.mistral_causal_lm import MistralCausalLM
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/convert_pali_gemma.py b/keras_hub/src/utils/transformers/convert_pali_gemma.py
similarity index 98%
rename from keras_nlp/src/utils/transformers/convert_pali_gemma.py
rename to keras_hub/src/utils/transformers/convert_pali_gemma.py
index 95f47e0102..dc5eb5a068 100644
--- a/keras_nlp/src/utils/transformers/convert_pali_gemma.py
+++ b/keras_hub/src/utils/transformers/convert_pali_gemma.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,10 +13,10 @@
 # limitations under the License.
 import numpy as np
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.utils.preset_utils import get_file
+from keras_hub.src.utils.preset_utils import get_file
 
 backbone_cls = PaliGemmaBackbone
 
diff --git a/keras_nlp/src/utils/transformers/convert_pali_gemma_test.py b/keras_hub/src/utils/transformers/convert_pali_gemma_test.py
similarity index 82%
rename from keras_nlp/src/utils/transformers/convert_pali_gemma_test.py
rename to keras_hub/src/utils/transformers/convert_pali_gemma_test.py
index dbd5405f53..f225caf11e 100644
--- a/keras_nlp/src/utils/transformers/convert_pali_gemma_test.py
+++ b/keras_hub/src/utils/transformers/convert_pali_gemma_test.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,15 +14,15 @@
 import numpy as np
 import pytest
 
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
+from keras_hub.src.models.backbone import Backbone
+from keras_hub.src.models.causal_lm import CausalLM
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (
     PaliGemmaBackbone,
 )
-from keras_nlp.src.models.pali_gemma.pali_gemma_causal_lm import (
+from keras_hub.src.models.pali_gemma.pali_gemma_causal_lm import (
     PaliGemmaCausalLM,
 )
-from keras_nlp.src.tests.test_case import TestCase
+from keras_hub.src.tests.test_case import TestCase
 
 
 class TestTask(TestCase):
diff --git a/keras_nlp/src/utils/transformers/preset_loader.py b/keras_hub/src/utils/transformers/preset_loader.py
similarity index 73%
rename from keras_nlp/src/utils/transformers/preset_loader.py
rename to keras_hub/src/utils/transformers/preset_loader.py
index 7e93140979..593792d08e 100644
--- a/keras_nlp/src/utils/transformers/preset_loader.py
+++ b/keras_hub/src/utils/transformers/preset_loader.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,21 +11,21 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""Convert huggingface models to KerasNLP."""
+"""Convert huggingface models to KerasHub."""
 
 
-from keras_nlp.src.utils.preset_utils import PresetLoader
-from keras_nlp.src.utils.preset_utils import jax_memory_cleanup
-from keras_nlp.src.utils.transformers import convert_albert
-from keras_nlp.src.utils.transformers import convert_bart
-from keras_nlp.src.utils.transformers import convert_bert
-from keras_nlp.src.utils.transformers import convert_distilbert
-from keras_nlp.src.utils.transformers import convert_gemma
-from keras_nlp.src.utils.transformers import convert_gpt2
-from keras_nlp.src.utils.transformers import convert_llama3
-from keras_nlp.src.utils.transformers import convert_mistral
-from keras_nlp.src.utils.transformers import convert_pali_gemma
-from keras_nlp.src.utils.transformers.safetensor_utils import SafetensorLoader
+from keras_hub.src.utils.preset_utils import PresetLoader
+from keras_hub.src.utils.preset_utils import jax_memory_cleanup
+from keras_hub.src.utils.transformers import convert_albert
+from keras_hub.src.utils.transformers import convert_bart
+from keras_hub.src.utils.transformers import convert_bert
+from keras_hub.src.utils.transformers import convert_distilbert
+from keras_hub.src.utils.transformers import convert_gemma
+from keras_hub.src.utils.transformers import convert_gpt2
+from keras_hub.src.utils.transformers import convert_llama3
+from keras_hub.src.utils.transformers import convert_mistral
+from keras_hub.src.utils.transformers import convert_pali_gemma
+from keras_hub.src.utils.transformers.safetensor_utils import SafetensorLoader
 
 
 class TransformersPresetLoader(PresetLoader):
@@ -53,7 +53,7 @@ def __init__(self, preset, config):
             self.converter = convert_pali_gemma
         else:
             raise ValueError(
-                "KerasNLP has no converter for huggingface/transformers models "
+                "KerasHub has no converter for huggingface/transformers models "
                 f"with model type `'{model_type}'`."
             )
 
diff --git a/keras_nlp/src/utils/transformers/safetensor_utils.py b/keras_hub/src/utils/transformers/safetensor_utils.py
similarity index 90%
rename from keras_nlp/src/utils/transformers/safetensor_utils.py
rename to keras_hub/src/utils/transformers/safetensor_utils.py
index 27a0aac593..1f7fd80d25 100644
--- a/keras_nlp/src/utils/transformers/safetensor_utils.py
+++ b/keras_hub/src/utils/transformers/safetensor_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -13,11 +13,11 @@
 # limitations under the License.
 import contextlib
 
-from keras_nlp.src.utils.preset_utils import SAFETENSOR_CONFIG_FILE
-from keras_nlp.src.utils.preset_utils import SAFETENSOR_FILE
-from keras_nlp.src.utils.preset_utils import check_file_exists
-from keras_nlp.src.utils.preset_utils import get_file
-from keras_nlp.src.utils.preset_utils import load_json
+from keras_hub.src.utils.preset_utils import SAFETENSOR_CONFIG_FILE
+from keras_hub.src.utils.preset_utils import SAFETENSOR_FILE
+from keras_hub.src.utils.preset_utils import check_file_exists
+from keras_hub.src.utils.preset_utils import get_file
+from keras_hub.src.utils.preset_utils import load_json
 
 try:
     import safetensors
diff --git a/keras_nlp/src/version_utils.py b/keras_hub/src/version_utils.py
similarity index 83%
rename from keras_nlp/src/version_utils.py
rename to keras_hub/src/version_utils.py
index 6cee944157..b1d4f043d1 100644
--- a/keras_nlp/src/version_utils.py
+++ b/keras_hub/src/version_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,12 +12,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from keras_nlp.src.api_export import keras_nlp_export
+from keras_hub.src.api_export import keras_hub_export
 
 # Unique source of truth for the version number.
 __version__ = "0.16.0"
 
 
-@keras_nlp_export("keras_nlp.version")
+@keras_hub_export("keras_hub.version")
 def version():
     return __version__
diff --git a/keras_nlp/api/bounding_box/__init__.py b/keras_nlp/api/bounding_box/__init__.py
deleted file mode 100644
index 8488f76e6f..0000000000
--- a/keras_nlp/api/bounding_box/__init__.py
+++ /dev/null
@@ -1,36 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""DO NOT EDIT.
-
-This file was autogenerated. Do not edit it by hand,
-since your modifications would be overwritten.
-"""
-
-from keras_nlp.src.bounding_box.converters import convert_format
-from keras_nlp.src.bounding_box.formats import CENTER_XYWH
-from keras_nlp.src.bounding_box.formats import REL_XYWH
-from keras_nlp.src.bounding_box.formats import REL_XYXY
-from keras_nlp.src.bounding_box.formats import REL_YXYX
-from keras_nlp.src.bounding_box.formats import XYWH
-from keras_nlp.src.bounding_box.formats import XYXY
-from keras_nlp.src.bounding_box.formats import YXYX
-from keras_nlp.src.bounding_box.iou import compute_ciou
-from keras_nlp.src.bounding_box.iou import compute_iou
-from keras_nlp.src.bounding_box.to_dense import to_dense
-from keras_nlp.src.bounding_box.to_ragged import to_ragged
-from keras_nlp.src.bounding_box.utils import as_relative
-from keras_nlp.src.bounding_box.utils import clip_boxes
-from keras_nlp.src.bounding_box.utils import clip_to_image
-from keras_nlp.src.bounding_box.utils import is_relative
-from keras_nlp.src.bounding_box.validate_format import validate_format
diff --git a/keras_nlp/api/layers/__init__.py b/keras_nlp/api/layers/__init__.py
deleted file mode 100644
index 7def279b19..0000000000
--- a/keras_nlp/api/layers/__init__.py
+++ /dev/null
@@ -1,61 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""DO NOT EDIT.
-
-This file was autogenerated. Do not edit it by hand,
-since your modifications would be overwritten.
-"""
-
-from keras_nlp.src.layers.modeling.alibi_bias import AlibiBias
-from keras_nlp.src.layers.modeling.cached_multi_head_attention import (
-    CachedMultiHeadAttention,
-)
-from keras_nlp.src.layers.modeling.f_net_encoder import FNetEncoder
-from keras_nlp.src.layers.modeling.masked_lm_head import MaskedLMHead
-from keras_nlp.src.layers.modeling.position_embedding import PositionEmbedding
-from keras_nlp.src.layers.modeling.reversible_embedding import (
-    ReversibleEmbedding,
-)
-from keras_nlp.src.layers.modeling.rotary_embedding import RotaryEmbedding
-from keras_nlp.src.layers.modeling.sine_position_encoding import (
-    SinePositionEncoding,
-)
-from keras_nlp.src.layers.modeling.token_and_position_embedding import (
-    TokenAndPositionEmbedding,
-)
-from keras_nlp.src.layers.modeling.transformer_decoder import TransformerDecoder
-from keras_nlp.src.layers.modeling.transformer_encoder import TransformerEncoder
-from keras_nlp.src.layers.preprocessing.audio_converter import AudioConverter
-from keras_nlp.src.layers.preprocessing.image_converter import ImageConverter
-from keras_nlp.src.layers.preprocessing.masked_lm_mask_generator import (
-    MaskedLMMaskGenerator,
-)
-from keras_nlp.src.layers.preprocessing.multi_segment_packer import (
-    MultiSegmentPacker,
-)
-from keras_nlp.src.layers.preprocessing.random_deletion import RandomDeletion
-from keras_nlp.src.layers.preprocessing.random_swap import RandomSwap
-from keras_nlp.src.layers.preprocessing.resizing_image_converter import (
-    ResizingImageConverter,
-)
-from keras_nlp.src.layers.preprocessing.start_end_packer import StartEndPacker
-from keras_nlp.src.models.pali_gemma.pali_gemma_image_converter import (
-    PaliGemmaImageConverter,
-)
-from keras_nlp.src.models.resnet.resnet_image_converter import (
-    ResNetImageConverter,
-)
-from keras_nlp.src.models.whisper.whisper_audio_converter import (
-    WhisperAudioConverter,
-)
diff --git a/keras_nlp/api/models/__init__.py b/keras_nlp/api/models/__init__.py
deleted file mode 100644
index 2b3ffbb30b..0000000000
--- a/keras_nlp/api/models/__init__.py
+++ /dev/null
@@ -1,298 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""DO NOT EDIT.
-
-This file was autogenerated. Do not edit it by hand,
-since your modifications would be overwritten.
-"""
-
-from keras_nlp.src.models.albert.albert_backbone import AlbertBackbone
-from keras_nlp.src.models.albert.albert_masked_lm import AlbertMaskedLM
-from keras_nlp.src.models.albert.albert_masked_lm_preprocessor import (
-    AlbertMaskedLMPreprocessor,
-)
-from keras_nlp.src.models.albert.albert_text_classifier import (
-    AlbertTextClassifier,
-)
-from keras_nlp.src.models.albert.albert_text_classifier import (
-    AlbertTextClassifier as AlbertClassifier,
-)
-from keras_nlp.src.models.albert.albert_text_classifier_preprocessor import (
-    AlbertTextClassifierPreprocessor,
-)
-from keras_nlp.src.models.albert.albert_text_classifier_preprocessor import (
-    AlbertTextClassifierPreprocessor as AlbertPreprocessor,
-)
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.models.backbone import Backbone
-from keras_nlp.src.models.bart.bart_backbone import BartBackbone
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm import BartSeq2SeqLM
-from keras_nlp.src.models.bart.bart_seq_2_seq_lm_preprocessor import (
-    BartSeq2SeqLMPreprocessor,
-)
-from keras_nlp.src.models.bart.bart_tokenizer import BartTokenizer
-from keras_nlp.src.models.bert.bert_backbone import BertBackbone
-from keras_nlp.src.models.bert.bert_masked_lm import BertMaskedLM
-from keras_nlp.src.models.bert.bert_masked_lm_preprocessor import (
-    BertMaskedLMPreprocessor,
-)
-from keras_nlp.src.models.bert.bert_text_classifier import BertTextClassifier
-from keras_nlp.src.models.bert.bert_text_classifier import (
-    BertTextClassifier as BertClassifier,
-)
-from keras_nlp.src.models.bert.bert_text_classifier_preprocessor import (
-    BertTextClassifierPreprocessor,
-)
-from keras_nlp.src.models.bert.bert_text_classifier_preprocessor import (
-    BertTextClassifierPreprocessor as BertPreprocessor,
-)
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.models.bloom.bloom_backbone import BloomBackbone
-from keras_nlp.src.models.bloom.bloom_causal_lm import BloomCausalLM
-from keras_nlp.src.models.bloom.bloom_causal_lm_preprocessor import (
-    BloomCausalLMPreprocessor,
-)
-from keras_nlp.src.models.bloom.bloom_tokenizer import BloomTokenizer
-from keras_nlp.src.models.causal_lm import CausalLM
-from keras_nlp.src.models.causal_lm_preprocessor import CausalLMPreprocessor
-from keras_nlp.src.models.csp_darknet.csp_darknet_backbone import (
-    CSPDarkNetBackbone,
-)
-from keras_nlp.src.models.csp_darknet.csp_darknet_image_classifier import (
-    CSPDarkNetImageClassifier,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_backbone import (
-    DebertaV3Backbone,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_masked_lm import (
-    DebertaV3MaskedLM,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_masked_lm_preprocessor import (
-    DebertaV3MaskedLMPreprocessor,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier import (
-    DebertaV3TextClassifier,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier import (
-    DebertaV3TextClassifier as DebertaV3Classifier,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
-    DebertaV3TextClassifierPreprocessor,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_text_classifier_preprocessor import (
-    DebertaV3TextClassifierPreprocessor as DebertaV3Preprocessor,
-)
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
-    DebertaV3Tokenizer,
-)
-from keras_nlp.src.models.densenet.densenet_backbone import DenseNetBackbone
-from keras_nlp.src.models.densenet.densenet_image_classifier import (
-    DenseNetImageClassifier,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_backbone import (
-    DistilBertBackbone,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_masked_lm import (
-    DistilBertMaskedLM,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_masked_lm_preprocessor import (
-    DistilBertMaskedLMPreprocessor,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier import (
-    DistilBertTextClassifier,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier import (
-    DistilBertTextClassifier as DistilBertClassifier,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
-    DistilBertTextClassifierPreprocessor,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_text_classifier_preprocessor import (
-    DistilBertTextClassifierPreprocessor as DistilBertPreprocessor,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
-    DistilBertTokenizer,
-)
-from keras_nlp.src.models.efficientnet.efficientnet_backbone import (
-    EfficientNetBackbone,
-)
-from keras_nlp.src.models.electra.electra_backbone import ElectraBackbone
-from keras_nlp.src.models.electra.electra_tokenizer import ElectraTokenizer
-from keras_nlp.src.models.f_net.f_net_backbone import FNetBackbone
-from keras_nlp.src.models.f_net.f_net_masked_lm import FNetMaskedLM
-from keras_nlp.src.models.f_net.f_net_masked_lm_preprocessor import (
-    FNetMaskedLMPreprocessor,
-)
-from keras_nlp.src.models.f_net.f_net_text_classifier import FNetTextClassifier
-from keras_nlp.src.models.f_net.f_net_text_classifier import (
-    FNetTextClassifier as FNetClassifier,
-)
-from keras_nlp.src.models.f_net.f_net_text_classifier_preprocessor import (
-    FNetTextClassifierPreprocessor,
-)
-from keras_nlp.src.models.f_net.f_net_text_classifier_preprocessor import (
-    FNetTextClassifierPreprocessor as FNetPreprocessor,
-)
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.models.falcon.falcon_backbone import FalconBackbone
-from keras_nlp.src.models.falcon.falcon_causal_lm import FalconCausalLM
-from keras_nlp.src.models.falcon.falcon_causal_lm_preprocessor import (
-    FalconCausalLMPreprocessor,
-)
-from keras_nlp.src.models.falcon.falcon_tokenizer import FalconTokenizer
-from keras_nlp.src.models.feature_pyramid_backbone import FeaturePyramidBackbone
-from keras_nlp.src.models.gemma.gemma_backbone import GemmaBackbone
-from keras_nlp.src.models.gemma.gemma_causal_lm import GemmaCausalLM
-from keras_nlp.src.models.gemma.gemma_causal_lm_preprocessor import (
-    GemmaCausalLMPreprocessor,
-)
-from keras_nlp.src.models.gemma.gemma_tokenizer import GemmaTokenizer
-from keras_nlp.src.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.src.models.gpt2.gpt2_causal_lm import GPT2CausalLM
-from keras_nlp.src.models.gpt2.gpt2_causal_lm_preprocessor import (
-    GPT2CausalLMPreprocessor,
-)
-from keras_nlp.src.models.gpt2.gpt2_preprocessor import GPT2Preprocessor
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_backbone import GPTNeoXBackbone
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_causal_lm import GPTNeoXCausalLM
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_causal_lm_preprocessor import (
-    GPTNeoXCausalLMPreprocessor,
-)
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
-from keras_nlp.src.models.image_classifier import ImageClassifier
-from keras_nlp.src.models.image_classifier_preprocessor import (
-    ImageClassifierPreprocessor,
-)
-from keras_nlp.src.models.llama3.llama3_backbone import Llama3Backbone
-from keras_nlp.src.models.llama3.llama3_causal_lm import Llama3CausalLM
-from keras_nlp.src.models.llama3.llama3_causal_lm_preprocessor import (
-    Llama3CausalLMPreprocessor,
-)
-from keras_nlp.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
-from keras_nlp.src.models.llama.llama_backbone import LlamaBackbone
-from keras_nlp.src.models.llama.llama_causal_lm import LlamaCausalLM
-from keras_nlp.src.models.llama.llama_causal_lm_preprocessor import (
-    LlamaCausalLMPreprocessor,
-)
-from keras_nlp.src.models.llama.llama_tokenizer import LlamaTokenizer
-from keras_nlp.src.models.masked_lm import MaskedLM
-from keras_nlp.src.models.masked_lm_preprocessor import MaskedLMPreprocessor
-from keras_nlp.src.models.mistral.mistral_backbone import MistralBackbone
-from keras_nlp.src.models.mistral.mistral_causal_lm import MistralCausalLM
-from keras_nlp.src.models.mistral.mistral_causal_lm_preprocessor import (
-    MistralCausalLMPreprocessor,
-)
-from keras_nlp.src.models.mistral.mistral_tokenizer import MistralTokenizer
-from keras_nlp.src.models.mix_transformer.mix_transformer_backbone import (
-    MiTBackbone,
-)
-from keras_nlp.src.models.mix_transformer.mix_transformer_classifier import (
-    MiTImageClassifier,
-)
-from keras_nlp.src.models.mobilenet.mobilenet_backbone import MobileNetBackbone
-from keras_nlp.src.models.mobilenet.mobilenet_image_classifier import (
-    MobileNetImageClassifier,
-)
-from keras_nlp.src.models.opt.opt_backbone import OPTBackbone
-from keras_nlp.src.models.opt.opt_causal_lm import OPTCausalLM
-from keras_nlp.src.models.opt.opt_causal_lm_preprocessor import (
-    OPTCausalLMPreprocessor,
-)
-from keras_nlp.src.models.opt.opt_tokenizer import OPTTokenizer
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (
-    PaliGemmaBackbone,
-)
-from keras_nlp.src.models.pali_gemma.pali_gemma_causal_lm import (
-    PaliGemmaCausalLM,
-)
-from keras_nlp.src.models.pali_gemma.pali_gemma_causal_lm_preprocessor import (
-    PaliGemmaCausalLMPreprocessor,
-)
-from keras_nlp.src.models.pali_gemma.pali_gemma_tokenizer import (
-    PaliGemmaTokenizer,
-)
-from keras_nlp.src.models.phi3.phi3_backbone import Phi3Backbone
-from keras_nlp.src.models.phi3.phi3_causal_lm import Phi3CausalLM
-from keras_nlp.src.models.phi3.phi3_causal_lm_preprocessor import (
-    Phi3CausalLMPreprocessor,
-)
-from keras_nlp.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
-from keras_nlp.src.models.preprocessor import Preprocessor
-from keras_nlp.src.models.resnet.resnet_backbone import ResNetBackbone
-from keras_nlp.src.models.resnet.resnet_image_classifier import (
-    ResNetImageClassifier,
-)
-from keras_nlp.src.models.resnet.resnet_image_classifier_preprocessor import (
-    ResNetImageClassifierPreprocessor,
-)
-from keras_nlp.src.models.roberta.roberta_backbone import RobertaBackbone
-from keras_nlp.src.models.roberta.roberta_masked_lm import RobertaMaskedLM
-from keras_nlp.src.models.roberta.roberta_masked_lm_preprocessor import (
-    RobertaMaskedLMPreprocessor,
-)
-from keras_nlp.src.models.roberta.roberta_text_classifier import (
-    RobertaTextClassifier,
-)
-from keras_nlp.src.models.roberta.roberta_text_classifier import (
-    RobertaTextClassifier as RobertaClassifier,
-)
-from keras_nlp.src.models.roberta.roberta_text_classifier_preprocessor import (
-    RobertaTextClassifierPreprocessor,
-)
-from keras_nlp.src.models.roberta.roberta_text_classifier_preprocessor import (
-    RobertaTextClassifierPreprocessor as RobertaPreprocessor,
-)
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.models.seq_2_seq_lm import Seq2SeqLM
-from keras_nlp.src.models.seq_2_seq_lm_preprocessor import Seq2SeqLMPreprocessor
-from keras_nlp.src.models.t5.t5_backbone import T5Backbone
-from keras_nlp.src.models.t5.t5_tokenizer import T5Tokenizer
-from keras_nlp.src.models.task import Task
-from keras_nlp.src.models.text_classifier import TextClassifier
-from keras_nlp.src.models.text_classifier import TextClassifier as Classifier
-from keras_nlp.src.models.text_classifier_preprocessor import (
-    TextClassifierPreprocessor,
-)
-from keras_nlp.src.models.vgg.vgg_backbone import VGGBackbone
-from keras_nlp.src.models.vgg.vgg_image_classifier import VGGImageClassifier
-from keras_nlp.src.models.vit_det.vit_det_backbone import ViTDetBackbone
-from keras_nlp.src.models.whisper.whisper_backbone import WhisperBackbone
-from keras_nlp.src.models.whisper.whisper_tokenizer import WhisperTokenizer
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_backbone import (
-    XLMRobertaBackbone,
-)
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_masked_lm import (
-    XLMRobertaMaskedLM,
-)
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_masked_lm_preprocessor import (
-    XLMRobertaMaskedLMPreprocessor,
-)
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier import (
-    XLMRobertaTextClassifier,
-)
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier import (
-    XLMRobertaTextClassifier as XLMRobertaClassifier,
-)
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
-    XLMRobertaTextClassifierPreprocessor,
-)
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_text_classifier_preprocessor import (
-    XLMRobertaTextClassifierPreprocessor as XLMRobertaPreprocessor,
-)
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
-    XLMRobertaTokenizer,
-)
-from keras_nlp.src.models.xlnet.xlnet_backbone import XLNetBackbone
-from keras_nlp.src.tokenizers.tokenizer import Tokenizer
diff --git a/keras_nlp/api/tokenizers/__init__.py b/keras_nlp/api/tokenizers/__init__.py
deleted file mode 100644
index 0971ef4185..0000000000
--- a/keras_nlp/api/tokenizers/__init__.py
+++ /dev/null
@@ -1,65 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""DO NOT EDIT.
-
-This file was autogenerated. Do not edit it by hand,
-since your modifications would be overwritten.
-"""
-
-from keras_nlp.src.models.albert.albert_tokenizer import AlbertTokenizer
-from keras_nlp.src.models.bart.bart_tokenizer import BartTokenizer
-from keras_nlp.src.models.bert.bert_tokenizer import BertTokenizer
-from keras_nlp.src.models.bloom.bloom_tokenizer import BloomTokenizer
-from keras_nlp.src.models.deberta_v3.deberta_v3_tokenizer import (
-    DebertaV3Tokenizer,
-)
-from keras_nlp.src.models.distil_bert.distil_bert_tokenizer import (
-    DistilBertTokenizer,
-)
-from keras_nlp.src.models.electra.electra_tokenizer import ElectraTokenizer
-from keras_nlp.src.models.f_net.f_net_tokenizer import FNetTokenizer
-from keras_nlp.src.models.falcon.falcon_tokenizer import FalconTokenizer
-from keras_nlp.src.models.gemma.gemma_tokenizer import GemmaTokenizer
-from keras_nlp.src.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
-from keras_nlp.src.models.gpt_neo_x.gpt_neo_x_tokenizer import GPTNeoXTokenizer
-from keras_nlp.src.models.llama3.llama3_tokenizer import Llama3Tokenizer
-from keras_nlp.src.models.llama.llama_tokenizer import LlamaTokenizer
-from keras_nlp.src.models.mistral.mistral_tokenizer import MistralTokenizer
-from keras_nlp.src.models.opt.opt_tokenizer import OPTTokenizer
-from keras_nlp.src.models.pali_gemma.pali_gemma_tokenizer import (
-    PaliGemmaTokenizer,
-)
-from keras_nlp.src.models.phi3.phi3_tokenizer import Phi3Tokenizer
-from keras_nlp.src.models.roberta.roberta_tokenizer import RobertaTokenizer
-from keras_nlp.src.models.t5.t5_tokenizer import T5Tokenizer
-from keras_nlp.src.models.whisper.whisper_tokenizer import WhisperTokenizer
-from keras_nlp.src.models.xlm_roberta.xlm_roberta_tokenizer import (
-    XLMRobertaTokenizer,
-)
-from keras_nlp.src.tokenizers.byte_pair_tokenizer import BytePairTokenizer
-from keras_nlp.src.tokenizers.byte_tokenizer import ByteTokenizer
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer import (
-    SentencePieceTokenizer,
-)
-from keras_nlp.src.tokenizers.sentence_piece_tokenizer_trainer import (
-    compute_sentence_piece_proto,
-)
-from keras_nlp.src.tokenizers.tokenizer import Tokenizer
-from keras_nlp.src.tokenizers.unicode_codepoint_tokenizer import (
-    UnicodeCodepointTokenizer,
-)
-from keras_nlp.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer
-from keras_nlp.src.tokenizers.word_piece_tokenizer_trainer import (
-    compute_word_piece_vocabulary,
-)
diff --git a/keras_nlp/src/layers/preprocessing/__init__.py b/keras_nlp/src/layers/preprocessing/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/layers/preprocessing/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/metrics/__init__.py b/keras_nlp/src/metrics/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/metrics/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/__init__.py b/keras_nlp/src/models/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/csp_darknet/__init__.py b/keras_nlp/src/models/csp_darknet/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/csp_darknet/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/densenet/__init__.py b/keras_nlp/src/models/densenet/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/densenet/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/efficientnet/__init__.py b/keras_nlp/src/models/efficientnet/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/efficientnet/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/gpt_neo_x/__init__.py b/keras_nlp/src/models/gpt_neo_x/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/gpt_neo_x/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/mix_transformer/__init__.py b/keras_nlp/src/models/mix_transformer/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/mix_transformer/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/mobilenet/__init__.py b/keras_nlp/src/models/mobilenet/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/mobilenet/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/stable_diffusion_v3/__init__.py b/keras_nlp/src/models/stable_diffusion_v3/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/stable_diffusion_v3/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/vgg/__init__.py b/keras_nlp/src/models/vgg/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/vgg/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/vit_det/__init__.py b/keras_nlp/src/models/vit_det/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/vit_det/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/models/xlnet/__init__.py b/keras_nlp/src/models/xlnet/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/models/xlnet/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/samplers/__init__.py b/keras_nlp/src/samplers/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/samplers/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/tests/__init__.py b/keras_nlp/src/tests/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/tests/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/tests/doc_tests/__init__.py b/keras_nlp/src/tests/doc_tests/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/tests/doc_tests/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/tokenizers/__init__.py b/keras_nlp/src/tokenizers/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/tokenizers/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/utils/__init__.py b/keras_nlp/src/utils/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/utils/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/utils/timm/__init__.py b/keras_nlp/src/utils/timm/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/utils/timm/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/keras_nlp/src/utils/transformers/__init__.py b/keras_nlp/src/utils/transformers/__init__.py
deleted file mode 100644
index 3364a6bd16..0000000000
--- a/keras_nlp/src/utils/transformers/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright 2024 The KerasNLP Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
diff --git a/pip_build.py b/pip_build.py
index 7fed385c71..fdc3e97fdc 100644
--- a/pip_build.py
+++ b/pip_build.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -11,7 +11,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""Script to create (and optionally install) a `.whl` archive for KerasNLP.
+"""Script to create (and optionally install) a `.whl` archive for KerasHub.
 
 Usage:
 
@@ -36,7 +36,7 @@
 import re
 import shutil
 
-package = "keras_nlp"
+package = "keras_hub"
 build_directory = "tmp_build_dir"
 dist_directory = "dist"
 to_copy = ["setup.py", "setup.cfg", "README.md"]
@@ -51,12 +51,12 @@ def export_version_string(version, is_nightly=False):
     if is_nightly:
         date = datetime.datetime.now()
         version += f".dev{date.strftime('%Y%m%d%H')}"
-        # Replaces `name="keras-nlp"` in `setup.py` with `keras-nlp-nightly`
+        # Replaces `name="keras-hub"` in `setup.py` with `keras-hub-nightly`
         with open("setup.py") as f:
             setup_contents = f.read()
         with open("setup.py", "w") as f:
             setup_contents = setup_contents.replace(
-                'name="keras-nlp"', 'name="keras-nlp-nightly"'
+                'name="keras-hub"', 'name="keras-hub-nightly"'
             )
             f.write(setup_contents)
 
@@ -73,7 +73,7 @@ def export_version_string(version, is_nightly=False):
 
 
 def copy_source_to_build_directory(root_path):
-    # Copy sources (`keras_nlp/` directory and setup files) to build
+    # Copy sources (`keras_hub/` directory and setup files) to build
     # directory
     os.chdir(root_path)
     os.mkdir(build_directory)
@@ -93,7 +93,7 @@ def build(root_path, is_nightly=False):
         copy_source_to_build_directory(root_path)
         print(os.getcwd())
 
-        from keras_nlp.src.version_utils import __version__  # noqa: E402
+        from keras_hub.src.version_utils import __version__  # noqa: E402
 
         export_version_string(__version__, is_nightly)
         return build_and_save_output(root_path, __version__)
diff --git a/pyproject.toml b/pyproject.toml
index c90dcc241c..c95800bdc4 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,6 +4,6 @@ line-length = 80
 [tool.isort]
 profile = "black"
 force_single_line = "True"
-known_first_party = ["keras_nlp", "tests"]
+known_first_party = ["keras_hub", "tests"]
 default_section = "THIRDPARTY"
 line_length = 80
diff --git a/setup.py b/setup.py
index afbae82675..c694137b81 100644
--- a/setup.py
+++ b/setup.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -37,13 +37,13 @@ def get_version(rel_path):
 
 HERE = pathlib.Path(__file__).parent
 README = (HERE / "README.md").read_text()
-if os.path.exists("keras_nlp/version_utils.py"):
-    VERSION = get_version("keras_nlp/version_utils.py")
+if os.path.exists("keras_hub/version_utils.py"):
+    VERSION = get_version("keras_hub/version_utils.py")
 else:
-    VERSION = get_version("keras_nlp/src/version_utils.py")
+    VERSION = get_version("keras_hub/src/version_utils.py")
 
 setup(
-    name="keras-nlp",
+    name="keras-hub",
     description=(
         "Industry-strength Natural Language Processing extensions for Keras."
     ),
@@ -52,7 +52,7 @@ def get_version(rel_path):
     version=VERSION,
     url="https://github.com/keras-team/keras-nlp",
     author="Keras team",
-    author_email="keras-nlp@google.com",
+    author_email="keras-hub@google.com",
     license="Apache License 2.0",
     install_requires=[
         "absl-py",
@@ -63,7 +63,7 @@ def get_version(rel_path):
         "kagglehub",
         # Don't require tensorflow-text on MacOS, there are no binaries for ARM.
         # Also, we rely on tensorflow *transitively* through tensorflow-text.
-        # This avoid a slowdown during `pip install keras-nlp` where pip would
+        # This avoids a slowdown during `pip install keras-hub` where pip would
         # download many version of both libraries to find compatible versions.
         "tensorflow-text; platform_system != 'Darwin'",
     ],
diff --git a/shell/copyright.txt b/shell/copyright.txt
index 3364a6bd16..fd48fde00f 100644
--- a/shell/copyright.txt
+++ b/shell/copyright.txt
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/__init__.py b/tools/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/tools/__init__.py
+++ b/tools/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/checkpoint_conversion/__init__.py b/tools/checkpoint_conversion/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/tools/checkpoint_conversion/__init__.py
+++ b/tools/checkpoint_conversion/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/checkpoint_conversion/checkpoint_conversion_utils.py b/tools/checkpoint_conversion/checkpoint_conversion_utils.py
index 9017058286..5281d8a856 100644
--- a/tools/checkpoint_conversion/checkpoint_conversion_utils.py
+++ b/tools/checkpoint_conversion/checkpoint_conversion_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/checkpoint_conversion/convert_albert_checkpoints.py b/tools/checkpoint_conversion/convert_albert_checkpoints.py
index 565680046b..0385879bcc 100644
--- a/tools/checkpoint_conversion/convert_albert_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_albert_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,7 +21,7 @@
 from absl import flags
 from checkpoint_conversion_utils import get_md5_checksum
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "albert_base_en_uncased": "albert-base-v2",
@@ -38,10 +38,10 @@
 
 
 def convert_checkpoints(hf_model):
-    print("\n-> Convert original weights to KerasNLP format.")
+    print("\n-> Convert original weights to KerasHub format.")
 
-    print("\n-> Load KerasNLP model.")
-    keras_nlp_model = keras_nlp.models.AlbertBackbone.from_preset(
+    print("\n-> Load KerasHub model.")
+    keras_hub_model = keras_hub.models.AlbertBackbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
@@ -49,36 +49,36 @@ def convert_checkpoints(hf_model):
     print("Original weights:")
     print(list(hf_wts.keys()))
 
-    num_heads = keras_nlp_model.num_heads
-    hidden_dim = keras_nlp_model.hidden_dim
+    num_heads = keras_hub_model.num_heads
+    hidden_dim = keras_hub_model.hidden_dim
 
-    keras_nlp_model.get_layer("token_embedding").embeddings.assign(
+    keras_hub_model.get_layer("token_embedding").embeddings.assign(
         hf_wts["embeddings.word_embeddings.weight"]
     )
-    keras_nlp_model.get_layer("position_embedding").position_embeddings.assign(
+    keras_hub_model.get_layer("position_embedding").position_embeddings.assign(
         hf_wts["embeddings.position_embeddings.weight"]
     )
-    keras_nlp_model.get_layer("segment_embedding").embeddings.assign(
+    keras_hub_model.get_layer("segment_embedding").embeddings.assign(
         hf_wts["embeddings.token_type_embeddings.weight"]
     )
 
-    keras_nlp_model.get_layer("embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").gamma.assign(
         hf_wts["embeddings.LayerNorm.weight"]
     )
-    keras_nlp_model.get_layer("embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").beta.assign(
         hf_wts["embeddings.LayerNorm.bias"]
     )
 
-    keras_nlp_model.get_layer("embedding_projection").kernel.assign(
+    keras_hub_model.get_layer("embedding_projection").kernel.assign(
         hf_wts["encoder.embedding_hidden_mapping_in.weight"].T
     )
-    keras_nlp_model.get_layer("embedding_projection").bias.assign(
+    keras_hub_model.get_layer("embedding_projection").bias.assign(
         hf_wts["encoder.embedding_hidden_mapping_in.bias"]
     )
 
-    for i in range(keras_nlp_model.num_groups):
-        for j in range(keras_nlp_model.num_inner_repetitions):
-            keras_nlp_model.get_layer(
+    for i in range(keras_hub_model.num_groups):
+        for j in range(keras_hub_model.num_inner_repetitions):
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._query_dense.kernel.assign(
                 hf_wts[
@@ -88,7 +88,7 @@ def convert_checkpoints(hf_model):
                 .reshape((hidden_dim, num_heads, -1))
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._query_dense.bias.assign(
                 hf_wts[
@@ -98,7 +98,7 @@ def convert_checkpoints(hf_model):
                 .numpy()
             )
 
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._key_dense.kernel.assign(
                 hf_wts[
@@ -108,7 +108,7 @@ def convert_checkpoints(hf_model):
                 .reshape((hidden_dim, num_heads, -1))
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._key_dense.bias.assign(
                 hf_wts[
@@ -118,7 +118,7 @@ def convert_checkpoints(hf_model):
                 .numpy()
             )
 
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._value_dense.kernel.assign(
                 hf_wts[
@@ -128,7 +128,7 @@ def convert_checkpoints(hf_model):
                 .reshape((hidden_dim, num_heads, -1))
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._value_dense.bias.assign(
                 hf_wts[
@@ -138,7 +138,7 @@ def convert_checkpoints(hf_model):
                 .numpy()
             )
 
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._output_dense.kernel.assign(
                 hf_wts[
@@ -148,7 +148,7 @@ def convert_checkpoints(hf_model):
                 .reshape((num_heads, -1, hidden_dim))
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer._output_dense.bias.assign(
                 hf_wts[
@@ -156,14 +156,14 @@ def convert_checkpoints(hf_model):
                 ].numpy()
             )
 
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer_norm.gamma.assign(
                 hf_wts[
                     f"encoder.albert_layer_groups.{i}.albert_layers.{j}.attention.LayerNorm.weight"
                 ].numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._self_attention_layer_norm.beta.assign(
                 hf_wts[
@@ -171,7 +171,7 @@ def convert_checkpoints(hf_model):
                 ].numpy()
             )
 
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._feedforward_intermediate_dense.kernel.assign(
                 hf_wts[
@@ -180,7 +180,7 @@ def convert_checkpoints(hf_model):
                 .transpose(1, 0)
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._feedforward_intermediate_dense.bias.assign(
                 hf_wts[
@@ -188,7 +188,7 @@ def convert_checkpoints(hf_model):
                 ].numpy()
             )
 
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._feedforward_output_dense.kernel.assign(
                 hf_wts[
@@ -197,7 +197,7 @@ def convert_checkpoints(hf_model):
                 .transpose(1, 0)
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._feedforward_output_dense.bias.assign(
                 hf_wts[
@@ -205,14 +205,14 @@ def convert_checkpoints(hf_model):
                 ].numpy()
             )
 
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._feedforward_layer_norm.gamma.assign(
                 hf_wts[
                     f"encoder.albert_layer_groups.{i}.albert_layers.{j}.full_layer_layer_norm.weight"
                 ].numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"group_{i}_inner_layer_{j}"
             )._feedforward_layer_norm.beta.assign(
                 hf_wts[
@@ -220,23 +220,23 @@ def convert_checkpoints(hf_model):
                 ].numpy()
             )
 
-    keras_nlp_model.get_layer("pooled_dense").kernel.assign(
+    keras_hub_model.get_layer("pooled_dense").kernel.assign(
         hf_wts["pooler.weight"].transpose(1, 0).numpy()
     )
-    keras_nlp_model.get_layer("pooled_dense").bias.assign(
+    keras_hub_model.get_layer("pooled_dense").bias.assign(
         hf_wts["pooler.bias"].numpy()
     )
 
     # Save the model.
-    print("\n-> Save KerasNLP model weights.")
-    keras_nlp_model.save_weights(os.path.join(FLAGS.preset, "model.h5"))
+    print("\n-> Save KerasHub model weights.")
+    keras_hub_model.save_weights(os.path.join(FLAGS.preset, "model.h5"))
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def extract_vocab(hf_tokenizer):
     spm_path = os.path.join(FLAGS.preset, "spiece.model")
-    print(f"\n-> Save KerasNLP SPM vocabulary file to `{spm_path}`.")
+    print(f"\n-> Save KerasHub SPM vocabulary file to `{spm_path}`.")
 
     shutil.copyfile(
         transformers.utils.hub.get_file_from_repo(
@@ -245,31 +245,31 @@ def extract_vocab(hf_tokenizer):
         spm_path,
     )
 
-    keras_nlp_tokenizer = keras_nlp.models.AlbertTokenizer(
+    keras_hub_tokenizer = keras_hub.models.AlbertTokenizer(
         proto=spm_path,
     )
-    keras_nlp_preprocessor = keras_nlp.models.AlbertTextClassifierPreprocessor(
-        keras_nlp_tokenizer
+    keras_hub_preprocessor = keras_hub.models.AlbertTextClassifierPreprocessor(
+        keras_hub_tokenizer
     )
 
     print("-> Print MD5 checksum of the vocab files.")
     print(f"`{spm_path}` md5sum: ", get_md5_checksum(spm_path))
 
-    return keras_nlp_preprocessor
+    return keras_hub_preprocessor
 
 
 def check_output(
-    keras_nlp_preprocessor,
-    keras_nlp_model,
+    keras_hub_preprocessor,
+    keras_hub_model,
     hf_tokenizer,
     hf_model,
 ):
     print("\n-> Check the outputs.")
     sample_text = ["cricket is awesome, easily the best sport in the world!"]
 
-    # KerasNLP
-    keras_nlp_inputs = keras_nlp_preprocessor(tf.constant(sample_text))
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)[
+    # KerasHub
+    keras_hub_inputs = keras_hub_preprocessor(tf.constant(sample_text))
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)[
         "sequence_output"
     ]
 
@@ -279,9 +279,9 @@ def check_output(
     )
     hf_output = hf_model(**hf_inputs).last_hidden_state
 
-    print("KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("KerasHub output:", keras_hub_output[0, 0, :10])
     print("HF output:", hf_output[0, 0, :10])
-    print("Difference:", np.mean(keras_nlp_output - hf_output.detach().numpy()))
+    print("Difference:", np.mean(keras_hub_output - hf_output.detach().numpy()))
 
     # Show the MD5 checksum of the model weights.
     print(
@@ -300,13 +300,13 @@ def main(_):
     hf_model.eval()
     hf_tokenizer = transformers.AutoTokenizer.from_pretrained(hf_model_name)
 
-    keras_nlp_model = convert_checkpoints(hf_model)
-    print("\n -> Load KerasNLP preprocessor.")
-    keras_nlp_preprocessor = extract_vocab(hf_tokenizer)
+    keras_hub_model = convert_checkpoints(hf_model)
+    print("\n -> Load KerasHub preprocessor.")
+    keras_hub_preprocessor = extract_vocab(hf_tokenizer)
 
     check_output(
-        keras_nlp_preprocessor,
-        keras_nlp_model,
+        keras_hub_preprocessor,
+        keras_hub_model,
         hf_tokenizer,
         hf_model,
     )
diff --git a/tools/checkpoint_conversion/convert_bart_checkpoints.py b/tools/checkpoint_conversion/convert_bart_checkpoints.py
index 8f7def2c20..7c86f11efb 100644
--- a/tools/checkpoint_conversion/convert_bart_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_bart_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,7 +21,7 @@
 from absl import flags
 from checkpoint_conversion_utils import get_md5_checksum
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "bart_base_en": "facebook/bart-base",
@@ -36,10 +36,10 @@
 
 
 def convert_checkpoints(hf_model):
-    print("\n-> Convert original weights to KerasNLP format.")
+    print("\n-> Convert original weights to KerasHub format.")
 
-    print("\n-> Load KerasNLP model.")
-    keras_nlp_model = keras_nlp.models.BartBackbone.from_preset(
+    print("\n-> Load KerasHub model.")
+    keras_hub_model = keras_hub.models.BartBackbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
@@ -47,29 +47,29 @@ def convert_checkpoints(hf_model):
     print("Original weights:")
     print(list(hf_wts.keys()))
 
-    hidden_dim = keras_nlp_model.hidden_dim
-    num_heads = keras_nlp_model.num_heads
+    hidden_dim = keras_hub_model.hidden_dim
+    num_heads = keras_hub_model.num_heads
 
     # Token embedding weights shared by encoder and decoder.
-    keras_nlp_model.get_layer("token_embedding").embeddings.assign(
+    keras_hub_model.get_layer("token_embedding").embeddings.assign(
         hf_wts["shared.weight"]
     )
 
     # Encoder weights.
-    keras_nlp_model.get_layer(
+    keras_hub_model.get_layer(
         "encoder_position_embedding"
     ).position_embeddings.assign(hf_wts["encoder.embed_positions.weight"][2:])
 
-    keras_nlp_model.get_layer("encoder_embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("encoder_embeddings_layer_norm").gamma.assign(
         hf_wts["encoder.layer_norm_embedding.weight"]
     )
-    keras_nlp_model.get_layer("encoder_embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("encoder_embeddings_layer_norm").beta.assign(
         hf_wts["encoder.layer_norm_embedding.bias"]
     )
 
-    for i in range(keras_nlp_model.num_layers):
+    for i in range(keras_hub_model.num_layers):
         # Self-attention.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.q_proj.weight"]
@@ -77,7 +77,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.q_proj.bias"]
@@ -85,7 +85,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.k_proj.weight"]
@@ -93,7 +93,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.k_proj.bias"]
@@ -101,7 +101,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.v_proj.weight"]
@@ -109,7 +109,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.v_proj.bias"]
@@ -117,7 +117,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.out_proj.weight"]
@@ -125,52 +125,52 @@ def convert_checkpoints(hf_model):
             .reshape((num_heads, -1, hidden_dim))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             hf_wts[f"encoder.layers.{i}.self_attn.out_proj.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(
             hf_wts[f"encoder.layers.{i}.self_attn_layer_norm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._self_attention_layer_norm.beta.assign(
             hf_wts[f"encoder.layers.{i}.self_attn_layer_norm.bias"].numpy()
         )
 
         # Post self-attention layers.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             hf_wts[f"encoder.layers.{i}.fc1.weight"].transpose(1, 0).numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             hf_wts[f"encoder.layers.{i}.fc1.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             hf_wts[f"encoder.layers.{i}.fc2.weight"].transpose(1, 0).numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             hf_wts[f"encoder.layers.{i}.fc2.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(
             hf_wts[f"encoder.layers.{i}.final_layer_norm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_encoder_layer_{i}"
         )._feedforward_layer_norm.beta.assign(
             hf_wts[f"encoder.layers.{i}.final_layer_norm.bias"].numpy()
@@ -178,20 +178,20 @@ def convert_checkpoints(hf_model):
 
     # Decoder weights.
 
-    keras_nlp_model.get_layer(
+    keras_hub_model.get_layer(
         "decoder_position_embedding"
     ).position_embeddings.assign(hf_wts["decoder.embed_positions.weight"][2:])
 
-    keras_nlp_model.get_layer("decoder_embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("decoder_embeddings_layer_norm").gamma.assign(
         hf_wts["decoder.layer_norm_embedding.weight"]
     )
-    keras_nlp_model.get_layer("decoder_embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("decoder_embeddings_layer_norm").beta.assign(
         hf_wts["decoder.layer_norm_embedding.bias"]
     )
 
-    for i in range(keras_nlp_model.num_layers):
+    for i in range(keras_hub_model.num_layers):
         # Self-attention.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.q_proj.weight"]
@@ -199,7 +199,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.q_proj.bias"]
@@ -207,7 +207,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.k_proj.weight"]
@@ -215,7 +215,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.k_proj.bias"]
@@ -223,7 +223,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.v_proj.weight"]
@@ -231,7 +231,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.v_proj.bias"]
@@ -239,7 +239,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.out_proj.weight"]
@@ -247,25 +247,25 @@ def convert_checkpoints(hf_model):
             .reshape((num_heads, -1, hidden_dim))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.self_attn.out_proj.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(
             hf_wts[f"decoder.layers.{i}.self_attn_layer_norm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._self_attention_layer_norm.beta.assign(
             hf_wts[f"decoder.layers.{i}.self_attn_layer_norm.bias"].numpy()
         )
 
         # Cross-attention.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._query_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.q_proj.weight"]
@@ -273,7 +273,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._query_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.q_proj.bias"]
@@ -281,7 +281,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._key_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.k_proj.weight"]
@@ -289,7 +289,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._key_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.k_proj.bias"]
@@ -297,7 +297,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._value_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.v_proj.weight"]
@@ -305,7 +305,7 @@ def convert_checkpoints(hf_model):
             .reshape((hidden_dim, num_heads, -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._value_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.v_proj.bias"]
@@ -313,7 +313,7 @@ def convert_checkpoints(hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._output_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.out_proj.weight"]
@@ -321,69 +321,69 @@ def convert_checkpoints(hf_model):
             .reshape((num_heads, -1, hidden_dim))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer._output_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn.out_proj.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer_norm.gamma.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn_layer_norm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._cross_attention_layer_norm.beta.assign(
             hf_wts[f"decoder.layers.{i}.encoder_attn_layer_norm.bias"].numpy()
         )
 
         # Post self-attention and cross-attention layers.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.fc1.weight"].transpose(1, 0).numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.fc1.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             hf_wts[f"decoder.layers.{i}.fc2.weight"].transpose(1, 0).numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             hf_wts[f"decoder.layers.{i}.fc2.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(
             hf_wts[f"decoder.layers.{i}.final_layer_norm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_decoder_layer_{i}"
         )._feedforward_layer_norm.beta.assign(
             hf_wts[f"decoder.layers.{i}.final_layer_norm.bias"].numpy()
         )
 
     # Save the model.
-    print("\n-> Save KerasNLP model weights.")
-    keras_nlp_model.save_weights(os.path.join(FLAGS.preset, "model.h5"))
+    print("\n-> Save KerasHub model weights.")
+    keras_hub_model.save_weights(os.path.join(FLAGS.preset, "model.h5"))
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def extract_vocab(hf_tokenizer):
     vocabulary_path = os.path.join(FLAGS.preset, "vocab.json")
     merges_path = os.path.join(FLAGS.preset, "merges.txt")
-    print(f"\n-> Save KerasNLP vocab to `{vocabulary_path}`.")
-    print(f"-> Save KerasNLP merges to `{merges_path}`.")
+    print(f"\n-> Save KerasHub vocab to `{vocabulary_path}`.")
+    print(f"-> Save KerasHub merges to `{merges_path}`.")
 
     # Huggingface has a save_vocabulary function but it's not byte-for-byte
     # with the source. Instead copy the original downloaded file directly.
@@ -400,7 +400,7 @@ def extract_vocab(hf_tokenizer):
         merges_path,
     )
 
-    keras_nlp_tokenizer = keras_nlp.models.BartTokenizer(
+    keras_hub_tokenizer = keras_hub.models.BartTokenizer(
         vocabulary=vocabulary_path, merges=merges_path
     )
 
@@ -408,12 +408,12 @@ def extract_vocab(hf_tokenizer):
     print(f"`{vocabulary_path}` md5sum: ", get_md5_checksum(vocabulary_path))
     print(f"`{merges_path}` md5sum: ", get_md5_checksum(merges_path))
 
-    return keras_nlp_tokenizer
+    return keras_hub_tokenizer
 
 
 def check_output(
-    keras_nlp_tokenizer,
-    keras_nlp_model,
+    keras_hub_tokenizer,
+    keras_hub_model,
     hf_tokenizer,
     hf_model,
 ):
@@ -425,38 +425,38 @@ def check_output(
         "football is good too, but nowhere near as good as cricket."
     ]
 
-    # KerasNLP
-    keras_nlp_enc_token_ids = keras_nlp_tokenizer(
+    # KerasHub
+    keras_hub_enc_token_ids = keras_hub_tokenizer(
         tf.constant(enc_sample_text)
     ).to_tensor()
-    keras_nlp_enc_token_ids = tf.concat(
+    keras_hub_enc_token_ids = tf.concat(
         [
-            tf.constant([[keras_nlp_tokenizer.start_token_id]]),
-            keras_nlp_enc_token_ids,
-            tf.constant([[keras_nlp_tokenizer.end_token_id]]),
+            tf.constant([[keras_hub_tokenizer.start_token_id]]),
+            keras_hub_enc_token_ids,
+            tf.constant([[keras_hub_tokenizer.end_token_id]]),
         ],
         axis=-1,
     )
-    keras_nlp_dec_token_ids = keras_nlp_tokenizer(
+    keras_hub_dec_token_ids = keras_hub_tokenizer(
         tf.constant(dec_sample_text)
     ).to_tensor()
-    keras_nlp_dec_token_ids = tf.concat(
+    keras_hub_dec_token_ids = tf.concat(
         [
-            tf.constant([[keras_nlp_tokenizer.start_token_id]]),
-            keras_nlp_dec_token_ids,
-            tf.constant([[keras_nlp_tokenizer.end_token_id]]),
+            tf.constant([[keras_hub_tokenizer.start_token_id]]),
+            keras_hub_dec_token_ids,
+            tf.constant([[keras_hub_tokenizer.end_token_id]]),
         ],
         axis=-1,
     )
-    keras_nlp_inputs = {
-        "encoder_token_ids": keras_nlp_enc_token_ids,
-        "encoder_padding_mask": keras_nlp_enc_token_ids
-        != keras_nlp_tokenizer.pad_token_id,
-        "decoder_token_ids": keras_nlp_dec_token_ids,
-        "decoder_padding_mask": keras_nlp_dec_token_ids
-        != keras_nlp_tokenizer.pad_token_id,
+    keras_hub_inputs = {
+        "encoder_token_ids": keras_hub_enc_token_ids,
+        "encoder_padding_mask": keras_hub_enc_token_ids
+        != keras_hub_tokenizer.pad_token_id,
+        "decoder_token_ids": keras_hub_dec_token_ids,
+        "decoder_padding_mask": keras_hub_dec_token_ids
+        != keras_hub_tokenizer.pad_token_id,
     }
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)
 
     # HF
     hf_enc_inputs = hf_tokenizer(enc_sample_text, return_tensors="pt")
@@ -470,28 +470,28 @@ def check_output(
 
     print("Encoder Outputs:")
     print(
-        "KerasNLP output:",
-        keras_nlp_output["encoder_sequence_output"][0, 0, :10],
+        "KerasHub output:",
+        keras_hub_output["encoder_sequence_output"][0, 0, :10],
     )
     print("HF output:", hf_output.encoder_last_hidden_state[0, 0, :10])
     print(
         "Difference:",
         np.mean(
-            keras_nlp_output["encoder_sequence_output"]
+            keras_hub_output["encoder_sequence_output"]
             - hf_output.encoder_last_hidden_state.detach().numpy()
         ),
     )
 
     print("Decoder Outputs:")
     print(
-        "KerasNLP output:",
-        keras_nlp_output["decoder_sequence_output"][0, 0, :10],
+        "KerasHub output:",
+        keras_hub_output["decoder_sequence_output"][0, 0, :10],
     )
     print("HF output:", hf_output.last_hidden_state[0, 0, :10])
     print(
         "Difference:",
         np.mean(
-            keras_nlp_output["decoder_sequence_output"]
+            keras_hub_output["decoder_sequence_output"]
             - hf_output.last_hidden_state.detach().numpy()
         ),
     )
@@ -513,13 +513,13 @@ def main(_):
     hf_model.eval()
     hf_tokenizer = transformers.AutoTokenizer.from_pretrained(hf_model_name)
 
-    keras_nlp_model = convert_checkpoints(hf_model)
-    print("\n -> Load KerasNLP tokenizer.")
-    keras_nlp_tokenizer = extract_vocab(hf_tokenizer)
+    keras_hub_model = convert_checkpoints(hf_model)
+    print("\n -> Load KerasHub tokenizer.")
+    keras_hub_tokenizer = extract_vocab(hf_tokenizer)
 
     check_output(
-        keras_nlp_tokenizer,
-        keras_nlp_model,
+        keras_hub_tokenizer,
+        keras_hub_model,
         hf_tokenizer,
         hf_model,
     )
diff --git a/tools/checkpoint_conversion/convert_bloom_checkpoints.py b/tools/checkpoint_conversion/convert_bloom_checkpoints.py
index ff0cb24344..8e3da4a97f 100644
--- a/tools/checkpoint_conversion/convert_bloom_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_bloom_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,10 +21,10 @@
 from absl import app
 from absl import flags
 
-import keras_nlp
-from keras_nlp.models import BloomBackbone
-from keras_nlp.models import BloomPreprocessor
-from keras_nlp.models import BloomTokenizer
+import keras_hub
+from keras_hub.models import BloomBackbone
+from keras_hub.models import BloomPreprocessor
+from keras_hub.models import BloomTokenizer
 
 FLAGS = flags.FLAGS
 
@@ -221,7 +221,7 @@ def validate_output(
     hf_model_outputs = hf_model(**hf_model_input).last_hidden_state
     hf_model_outputs = hf_model_outputs.detach().numpy()
 
-    # KerasNLP
+    # KerasHub
     preprocessor = BloomPreprocessor(
         tokenizer=keras_tokenizer,
         sequence_length=hf_model_outputs.shape[1],
@@ -232,7 +232,7 @@ def validate_output(
     keras_model_outputs = keras_model.predict(keras_model_input)
 
     # Comparing the outputs.
-    print("🔶 KerasNLP output:", keras_model_outputs[0, 0, :10])
+    print("🔶 KerasHub output:", keras_model_outputs[0, 0, :10])
     print("🔶 HF output:", hf_model_outputs[0, 0, :10])
     print("🔶 Difference:", np.mean(keras_model_outputs - hf_model_outputs))
 
@@ -280,7 +280,7 @@ def main(_):
         del hf_tokenizer
 
         # Save float32 keras preset
-        keras_nlp.src.utils.preset_utils.save_to_preset(keras_model, preset)
+        keras_hub.src.utils.preset_utils.save_to_preset(keras_model, preset)
 
         # Delete float32 Keras model
         del keras_model
@@ -290,8 +290,8 @@ def main(_):
         keras_model = BloomBackbone.from_preset(preset_path, dtype="float16")
 
         # Save float16 keras model
-        keras_nlp.src.utils.preset_utils.save_to_preset(keras_model, preset)
-        keras_nlp.src.utils.preset_utils.save_to_preset(
+        keras_hub.src.utils.preset_utils.save_to_preset(keras_model, preset)
+        keras_hub.src.utils.preset_utils.save_to_preset(
             keras_tokenizer, preset, config_filename="tokenizer.json"
         )
 
diff --git a/tools/checkpoint_conversion/convert_deberta_v3_checkpoints.py b/tools/checkpoint_conversion/convert_deberta_v3_checkpoints.py
index b56485f31e..d0f439f5f8 100644
--- a/tools/checkpoint_conversion/convert_deberta_v3_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_deberta_v3_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -22,11 +22,11 @@
 from absl import flags
 from checkpoint_conversion_utils import get_md5_checksum
 
-from keras_nlp.models.deberta_v3.deberta_v3_backbone import DebertaV3Backbone
-from keras_nlp.models.deberta_v3.deberta_v3_preprocessor import (
+from keras_hub.models.deberta_v3.deberta_v3_backbone import DebertaV3Backbone
+from keras_hub.models.deberta_v3.deberta_v3_preprocessor import (
     DebertaV3TextClassifierPreprocessor,
 )
-from keras_nlp.models.deberta_v3.deberta_v3_tokenizer import DebertaV3Tokenizer
+from keras_hub.models.deberta_v3.deberta_v3_tokenizer import DebertaV3Tokenizer
 
 PRESET_MAP = {
     "deberta_v3_extra_small_en": "microsoft/deberta-v3-xsmall",
@@ -73,7 +73,7 @@ def define_preprocessor(hf_model_name):
     extract_dir = EXTRACT_DIR.format(FLAGS.preset)
     spm_path = os.path.join(extract_dir, "spm.model")
 
-    keras_nlp_tokenizer = DebertaV3Tokenizer(proto=spm_path)
+    keras_hub_tokenizer = DebertaV3Tokenizer(proto=spm_path)
 
     # Avoid having padding tokens. This is because the representations of the
     # padding token may be vastly different from the representations computed in
@@ -81,8 +81,8 @@ def define_preprocessor(hf_model_name):
     sequence_length = 14
     if FLAGS.preset == "deberta_v3_base_multi":
         sequence_length = 17
-    keras_nlp_preprocessor = DebertaV3TextClassifierPreprocessor(
-        keras_nlp_tokenizer, sequence_length=sequence_length
+    keras_hub_preprocessor = DebertaV3TextClassifierPreprocessor(
+        keras_hub_tokenizer, sequence_length=sequence_length
     )
 
     hf_tokenizer = transformers.AutoTokenizer.from_pretrained(hf_model_name)
@@ -90,11 +90,11 @@ def define_preprocessor(hf_model_name):
     print("\n-> Print MD5 checksum of the vocab files.")
     print(f"`{spm_path}` md5sum: ", get_md5_checksum(spm_path))
 
-    return keras_nlp_preprocessor, hf_tokenizer
+    return keras_hub_preprocessor, hf_tokenizer
 
 
-def convert_checkpoints(keras_nlp_model, hf_model):
-    print("\n-> Convert original weights to KerasNLP format.")
+def convert_checkpoints(keras_hub_model, hf_model):
+    print("\n-> Convert original weights to KerasHub format.")
 
     extract_dir = EXTRACT_DIR.format(FLAGS.preset)
     config_path = os.path.join(extract_dir, "config.json")
@@ -123,35 +123,35 @@ def convert_checkpoints(keras_nlp_model, hf_model):
         .replace(")", "")
     )
 
-    keras_nlp_model.get_layer("token_embedding").embeddings.assign(
+    keras_hub_model.get_layer("token_embedding").embeddings.assign(
         hf_wts["embeddings.word_embeddings.weight"]
     )
-    keras_nlp_model.get_layer("embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").gamma.assign(
         hf_wts["embeddings.LayerNorm.weight"]
     )
-    keras_nlp_model.get_layer("embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").beta.assign(
         hf_wts["embeddings.LayerNorm.bias"]
     )
-    keras_nlp_model.get_layer("rel_embedding").rel_embeddings.assign(
+    keras_hub_model.get_layer("rel_embedding").rel_embeddings.assign(
         hf_wts["encoder.rel_embeddings.weight"]
     )
-    keras_nlp_model.get_layer("rel_embedding").layer_norm.gamma.assign(
+    keras_hub_model.get_layer("rel_embedding").layer_norm.gamma.assign(
         hf_wts["encoder.LayerNorm.weight"]
     )
-    keras_nlp_model.get_layer("rel_embedding").layer_norm.beta.assign(
+    keras_hub_model.get_layer("rel_embedding").layer_norm.beta.assign(
         hf_wts["encoder.LayerNorm.bias"]
     )
 
-    for i in range(keras_nlp_model.num_layers):
+    for i in range(keras_hub_model.num_layers):
         # Q,K,V
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.attention.self.query_proj.weight"]
             .numpy()
             .T.reshape((cfg["hidden_dim"], cfg["num_heads"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.attention.self.query_proj.bias"]
@@ -159,14 +159,14 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.attention.self.key_proj.weight"]
             .numpy()
             .T.reshape((cfg["hidden_dim"], cfg["num_heads"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.attention.self.key_proj.bias"]
@@ -174,14 +174,14 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.attention.self.value_proj.weight"]
             .numpy()
             .T.reshape((cfg["hidden_dim"], cfg["num_heads"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.attention.self.value_proj.bias"]
@@ -190,88 +190,88 @@ def convert_checkpoints(keras_nlp_model, hf_model):
         )
 
         # Attn output.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.attention.output.dense.weight"]
             .transpose(1, 0)
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.attention.output.dense.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(
             hf_wts[
                 f"encoder.layer.{i}.attention.output.LayerNorm.weight"
             ].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._self_attention_layer_norm.beta.assign(
             hf_wts[f"encoder.layer.{i}.attention.output.LayerNorm.bias"].numpy()
         )
 
         # Intermediate FF layer.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.intermediate.dense.weight"]
             .transpose(1, 0)
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.intermediate.dense.bias"].numpy()
         )
 
         # Output FF layer.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.output.dense.weight"].numpy().T
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.output.dense.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(
             hf_wts[f"encoder.layer.{i}.output.LayerNorm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"disentangled_attention_encoder_layer_{i}"
         )._feedforward_layer_norm.beta.assign(
             hf_wts[f"encoder.layer.{i}.output.LayerNorm.bias"].numpy()
         )
 
     # Save the model.
-    print(f"\n-> Save KerasNLP model weights to `{FLAGS.preset}.h5`.")
-    keras_nlp_model.save_weights(f"{FLAGS.preset}.h5")
+    print(f"\n-> Save KerasHub model weights to `{FLAGS.preset}.h5`.")
+    keras_hub_model.save_weights(f"{FLAGS.preset}.h5")
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def check_output(
-    keras_nlp_preprocessor,
-    keras_nlp_model,
+    keras_hub_preprocessor,
+    keras_hub_model,
     hf_tokenizer,
     hf_model,
 ):
     print("\n-> Check the outputs.")
     sample_text = ["cricket is awesome, easily the best sport in the world!"]
 
-    # KerasNLP
-    keras_nlp_inputs = keras_nlp_preprocessor(tf.constant(sample_text))
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)
+    # KerasHub
+    keras_hub_inputs = keras_hub_preprocessor(tf.constant(sample_text))
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)
 
     # HF
     hf_inputs = hf_tokenizer(
@@ -279,9 +279,9 @@ def check_output(
     )
     hf_output = hf_model(**hf_inputs).last_hidden_state
 
-    print("KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("KerasHub output:", keras_hub_output[0, 0, :10])
     print("HF output:", hf_output[0, 0, :10])
-    print("Difference:", np.mean(keras_nlp_output - hf_output.detach().numpy()))
+    print("Difference:", np.mean(keras_hub_output - hf_output.detach().numpy()))
 
     # Show the MD5 checksum of the model weights.
     print("Model md5sum: ", get_md5_checksum(f"./{FLAGS.preset}.h5"))
@@ -292,10 +292,10 @@ def main(_):
 
     download_files(hf_model_name)
 
-    keras_nlp_preprocessor, hf_tokenizer = define_preprocessor(hf_model_name)
+    keras_hub_preprocessor, hf_tokenizer = define_preprocessor(hf_model_name)
 
-    print("\n-> Load KerasNLP model.")
-    keras_nlp_model = DebertaV3Backbone.from_preset(
+    print("\n-> Load KerasHub model.")
+    keras_hub_model = DebertaV3Backbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
@@ -303,11 +303,11 @@ def main(_):
     hf_model = transformers.AutoModel.from_pretrained(hf_model_name)
     hf_model.eval()
 
-    keras_nlp_model = convert_checkpoints(keras_nlp_model, hf_model)
+    keras_hub_model = convert_checkpoints(keras_hub_model, hf_model)
 
     check_output(
-        keras_nlp_preprocessor,
-        keras_nlp_model,
+        keras_hub_preprocessor,
+        keras_hub_model,
         hf_tokenizer,
         hf_model,
     )
diff --git a/tools/checkpoint_conversion/convert_distilbert_checkpoints.py b/tools/checkpoint_conversion/convert_distilbert_checkpoints.py
index 55f055c8d5..9351a19912 100644
--- a/tools/checkpoint_conversion/convert_distilbert_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_distilbert_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,7 +21,7 @@
 from absl import app
 from absl import flags
 
-import keras_nlp
+import keras_hub
 from tools.checkpoint_conversion.checkpoint_conversion_utils import (
     get_md5_checksum,
 )
@@ -69,12 +69,12 @@ def define_preprocessor(hf_model_name):
     extract_dir = EXTRACT_DIR.format(FLAGS.preset)
     vocab_path = os.path.join(extract_dir, "vocab.txt")
 
-    keras_nlp_tokenizer = keras_nlp.models.DistilBertTokenizer(
+    keras_hub_tokenizer = keras_hub.models.DistilBertTokenizer(
         vocabulary=vocab_path,
     )
-    keras_nlp_preprocessor = (
-        keras_nlp.models.DistilBertTextClassifierPreprocessor(
-            keras_nlp_tokenizer
+    keras_hub_preprocessor = (
+        keras_hub.models.DistilBertTextClassifierPreprocessor(
+            keras_hub_tokenizer
         )
     )
 
@@ -83,11 +83,11 @@ def define_preprocessor(hf_model_name):
     print("\n-> Print MD5 checksum of the vocab files.")
     print(f"`{vocab_path}` md5sum: ", get_md5_checksum(vocab_path))
 
-    return keras_nlp_preprocessor, hf_tokenizer
+    return keras_hub_preprocessor, hf_tokenizer
 
 
-def convert_checkpoints(keras_nlp_model, hf_model):
-    print("\n-> Convert original weights to KerasNLP format.")
+def convert_checkpoints(keras_hub_model, hf_model):
+    print("\n-> Convert original weights to KerasHub format.")
 
     extract_dir = EXTRACT_DIR.format(FLAGS.preset)
     config_path = os.path.join(extract_dir, "config.json")
@@ -116,26 +116,26 @@ def convert_checkpoints(keras_nlp_model, hf_model):
         .replace(")", "")
     )
 
-    keras_nlp_model.get_layer(
+    keras_hub_model.get_layer(
         "token_and_position_embedding"
     ).token_embedding.embeddings.assign(
         hf_wts["embeddings.word_embeddings.weight"]
     )
-    keras_nlp_model.get_layer(
+    keras_hub_model.get_layer(
         "token_and_position_embedding"
     ).position_embedding.position_embeddings.assign(
         hf_wts["embeddings.position_embeddings.weight"]
     )
 
-    keras_nlp_model.get_layer("embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").gamma.assign(
         hf_wts["embeddings.LayerNorm.weight"]
     )
-    keras_nlp_model.get_layer("embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").beta.assign(
         hf_wts["embeddings.LayerNorm.bias"]
     )
 
-    for i in range(keras_nlp_model.num_layers):
-        keras_nlp_model.get_layer(
+    for i in range(keras_hub_model.num_layers):
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             hf_wts[f"transformer.layer.{i}.attention.q_lin.weight"]
@@ -143,7 +143,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .reshape((cfg["hidden_dim"], cfg["num_heads"], -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             hf_wts[f"transformer.layer.{i}.attention.q_lin.bias"]
@@ -151,7 +151,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             hf_wts[f"transformer.layer.{i}.attention.k_lin.weight"]
@@ -159,7 +159,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .reshape((cfg["hidden_dim"], cfg["num_heads"], -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             hf_wts[f"transformer.layer.{i}.attention.k_lin.bias"]
@@ -167,7 +167,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             hf_wts[f"transformer.layer.{i}.attention.v_lin.weight"]
@@ -175,7 +175,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .reshape((cfg["hidden_dim"], cfg["num_heads"], -1))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             hf_wts[f"transformer.layer.{i}.attention.v_lin.bias"]
@@ -183,7 +183,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             hf_wts[f"transformer.layer.{i}.attention.out_lin.weight"]
@@ -191,79 +191,79 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             .reshape((cfg["num_heads"], -1, cfg["hidden_dim"]))
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             hf_wts[f"transformer.layer.{i}.attention.out_lin.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(
             hf_wts[f"transformer.layer.{i}.sa_layer_norm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.beta.assign(
             hf_wts[f"transformer.layer.{i}.sa_layer_norm.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             hf_wts[f"transformer.layer.{i}.ffn.lin1.weight"]
             .transpose(1, 0)
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             hf_wts[f"transformer.layer.{i}.ffn.lin1.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             hf_wts[f"transformer.layer.{i}.ffn.lin2.weight"]
             .transpose(1, 0)
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             hf_wts[f"transformer.layer.{i}.ffn.lin2.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(
             hf_wts[f"transformer.layer.{i}.output_layer_norm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.beta.assign(
             hf_wts[f"transformer.layer.{i}.output_layer_norm.bias"].numpy()
         )
 
     # Save the model.
-    print(f"\n-> Save KerasNLP model weights to `{FLAGS.preset}.h5`.")
-    keras_nlp_model.save_weights(f"{FLAGS.preset}.h5")
+    print(f"\n-> Save KerasHub model weights to `{FLAGS.preset}.h5`.")
+    keras_hub_model.save_weights(f"{FLAGS.preset}.h5")
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def check_output(
-    keras_nlp_preprocessor,
-    keras_nlp_model,
+    keras_hub_preprocessor,
+    keras_hub_model,
     hf_tokenizer,
     hf_model,
 ):
     print("\n-> Check the outputs.")
     sample_text = ["cricket is awesome, easily the best sport in the world!"]
 
-    # KerasNLP
-    keras_nlp_inputs = keras_nlp_preprocessor(tf.constant(sample_text))
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)
+    # KerasHub
+    keras_hub_inputs = keras_hub_preprocessor(tf.constant(sample_text))
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)
 
     # HF
     hf_inputs = hf_tokenizer(
@@ -271,9 +271,9 @@ def check_output(
     )
     hf_output = hf_model(**hf_inputs).last_hidden_state
 
-    print("KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("KerasHub output:", keras_hub_output[0, 0, :10])
     print("HF output:", hf_output[0, 0, :10])
-    print("Difference:", np.mean(keras_nlp_output - hf_output.detach().numpy()))
+    print("Difference:", np.mean(keras_hub_output - hf_output.detach().numpy()))
 
     # Show the MD5 checksum of the model weights.
     print("Model md5sum: ", get_md5_checksum(f"./{FLAGS.preset}.h5"))
@@ -284,10 +284,10 @@ def main(_):
 
     download_files(hf_model_name)
 
-    keras_nlp_preprocessor, hf_tokenizer = define_preprocessor(hf_model_name)
+    keras_hub_preprocessor, hf_tokenizer = define_preprocessor(hf_model_name)
 
-    print("\n-> Load KerasNLP model.")
-    keras_nlp_model = keras_nlp.models.DistilBertBackbone.from_preset(
+    print("\n-> Load KerasHub model.")
+    keras_hub_model = keras_hub.models.DistilBertBackbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
@@ -295,11 +295,11 @@ def main(_):
     hf_model = transformers.AutoModel.from_pretrained(hf_model_name)
     hf_model.eval()
 
-    keras_nlp_model = convert_checkpoints(keras_nlp_model, hf_model)
+    keras_hub_model = convert_checkpoints(keras_hub_model, hf_model)
 
     check_output(
-        keras_nlp_preprocessor,
-        keras_nlp_model,
+        keras_hub_preprocessor,
+        keras_hub_model,
         hf_tokenizer,
         hf_model,
     )
diff --git a/tools/checkpoint_conversion/convert_electra_checkpoints.py b/tools/checkpoint_conversion/convert_electra_checkpoints.py
index ccc7da00b8..f676fcecc9 100644
--- a/tools/checkpoint_conversion/convert_electra_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_electra_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -29,8 +29,8 @@
 from absl import app  # noqa: E402
 from absl import flags  # noqa: E402
 
-import keras_nlp  # noqa: E402
-from keras_nlp.utils.preset_utils import save_to_preset  # noqa: E402
+import keras_hub  # noqa: E402
+from keras_hub.utils.preset_utils import save_to_preset  # noqa: E402
 
 PRESET_MAP = {
     "electra_base_generator_en": "google/electra-base-generator",
@@ -73,7 +73,7 @@ def convert_model(hf_model):
     cfg["intermediate_dim"] = hf_config["intermediate_size"]
     cfg["dropout"] = hf_config["hidden_dropout_prob"]
     cfg["max_sequence_length"] = hf_config["max_position_embeddings"]
-    return keras_nlp.models.ElectraBackbone(**cfg)
+    return keras_hub.models.ElectraBackbone(**cfg)
 
 
 def convert_tokenizer(hf_model_dir):
@@ -82,7 +82,7 @@ def convert_tokenizer(hf_model_dir):
         hf_tokenizer = json.load(f)
     vocab = hf_tokenizer["model"]["vocab"]
 
-    return keras_nlp.models.ElectraTokenizer(vocabulary=vocab)
+    return keras_hub.models.ElectraTokenizer(vocabulary=vocab)
 
 
 def convert_weights(keras_model, hf_model):
@@ -228,11 +228,11 @@ def convert_weights(keras_model, hf_model):
 def validate_output(keras_model, hf_model, keras_tokenizer, hf_tokenizer):
     input_str = ["The quick brown fox jumps over the lazy dog."]
 
-    keras_nlp_preprocessor = keras_nlp.models.ElectraPreprocessor(
+    keras_hub_preprocessor = keras_hub.models.ElectraPreprocessor(
         keras_tokenizer
     )
-    keras_nlp_inputs = keras_nlp_preprocessor(tf.constant(input_str))
-    keras_nlp_output = keras_model.predict(keras_nlp_inputs).get(
+    keras_hub_inputs = keras_hub_preprocessor(tf.constant(input_str))
+    keras_hub_output = keras_model.predict(keras_hub_inputs).get(
         "sequence_output"
     )
 
@@ -240,9 +240,9 @@ def validate_output(keras_model, hf_model, keras_tokenizer, hf_tokenizer):
         input_str, padding="max_length", return_tensors="pt"
     )
     hf_output = hf_model(**hf_inputs).last_hidden_state.detach().numpy()
-    print("🔶 KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("🔶 KerasHub output:", keras_hub_output[0, 0, :10])
     print("🔶 HF output:", hf_output[0, 0, :10])
-    print("Difference: ", np.mean(keras_nlp_output - hf_output))
+    print("Difference: ", np.mean(keras_hub_output - hf_output))
 
 
 def main(_):
diff --git a/tools/checkpoint_conversion/convert_f_net_checkpoints.py b/tools/checkpoint_conversion/convert_f_net_checkpoints.py
index 37910c8809..259db71829 100644
--- a/tools/checkpoint_conversion/convert_f_net_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_f_net_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,7 +21,7 @@
 from absl import flags
 from checkpoint_conversion_utils import get_md5_checksum
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "f_net_base_en": "google/fnet-base",
@@ -35,10 +35,10 @@
 
 
 def convert_checkpoints(hf_model):
-    print("\n-> Convert original weights to KerasNLP format.")
+    print("\n-> Convert original weights to KerasHub format.")
 
-    print("\n-> Load KerasNLP model.")
-    keras_nlp_model = keras_nlp.models.FNetBackbone.from_preset(
+    print("\n-> Load KerasHub model.")
+    keras_hub_model = keras_hub.models.FNetBackbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
@@ -46,94 +46,94 @@ def convert_checkpoints(hf_model):
     print("Original weights:")
     print(list(hf_wts.keys()))
 
-    keras_nlp_model.get_layer("token_embedding").embeddings.assign(
+    keras_hub_model.get_layer("token_embedding").embeddings.assign(
         hf_wts["embeddings.word_embeddings.weight"]
     )
-    keras_nlp_model.get_layer("position_embedding").position_embeddings.assign(
+    keras_hub_model.get_layer("position_embedding").position_embeddings.assign(
         hf_wts["embeddings.position_embeddings.weight"]
     )
-    keras_nlp_model.get_layer("segment_embedding").embeddings.assign(
+    keras_hub_model.get_layer("segment_embedding").embeddings.assign(
         hf_wts["embeddings.token_type_embeddings.weight"]
     )
 
-    keras_nlp_model.get_layer("embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").gamma.assign(
         hf_wts["embeddings.LayerNorm.weight"]
     )
-    keras_nlp_model.get_layer("embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").beta.assign(
         hf_wts["embeddings.LayerNorm.bias"]
     )
 
-    keras_nlp_model.get_layer("embedding_projection").kernel.assign(
+    keras_hub_model.get_layer("embedding_projection").kernel.assign(
         hf_wts["embeddings.projection.weight"].T
     )
-    keras_nlp_model.get_layer("embedding_projection").bias.assign(
+    keras_hub_model.get_layer("embedding_projection").bias.assign(
         hf_wts["embeddings.projection.bias"]
     )
 
-    for i in range(keras_nlp_model.num_layers):
-        keras_nlp_model.get_layer(
+    for i in range(keras_hub_model.num_layers):
+        keras_hub_model.get_layer(
             f"f_net_layer_{i}"
         )._mixing_layer_norm.gamma.assign(
             hf_wts[f"encoder.layer.{i}.fourier.output.LayerNorm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"f_net_layer_{i}"
         )._mixing_layer_norm.beta.assign(
             hf_wts[f"encoder.layer.{i}.fourier.output.LayerNorm.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"f_net_layer_{i}"
         )._intermediate_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.intermediate.dense.weight"]
             .transpose(1, 0)
             .numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"f_net_layer_{i}"
         )._intermediate_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.intermediate.dense.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"f_net_layer_{i}"
         )._output_dense.kernel.assign(
             hf_wts[f"encoder.layer.{i}.output.dense.weight"]
             .transpose(1, 0)
             .numpy()
         )
-        keras_nlp_model.get_layer(f"f_net_layer_{i}")._output_dense.bias.assign(
+        keras_hub_model.get_layer(f"f_net_layer_{i}")._output_dense.bias.assign(
             hf_wts[f"encoder.layer.{i}.output.dense.bias"].numpy()
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"f_net_layer_{i}"
         )._output_layer_norm.gamma.assign(
             hf_wts[f"encoder.layer.{i}.output.LayerNorm.weight"].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"f_net_layer_{i}"
         )._output_layer_norm.beta.assign(
             hf_wts[f"encoder.layer.{i}.output.LayerNorm.bias"].numpy()
         )
 
-    keras_nlp_model.get_layer("pooled_dense").kernel.assign(
+    keras_hub_model.get_layer("pooled_dense").kernel.assign(
         hf_wts["pooler.dense.weight"].transpose(1, 0).numpy()
     )
-    keras_nlp_model.get_layer("pooled_dense").bias.assign(
+    keras_hub_model.get_layer("pooled_dense").bias.assign(
         hf_wts["pooler.dense.bias"].numpy()
     )
 
     # Save the model.
-    print("\n-> Save KerasNLP model weights.")
-    keras_nlp_model.save_weights(os.path.join(FLAGS.preset, "model.h5"))
+    print("\n-> Save KerasHub model weights.")
+    keras_hub_model.save_weights(os.path.join(FLAGS.preset, "model.h5"))
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def extract_vocab(hf_tokenizer):
     spm_path = os.path.join(FLAGS.preset, "spiece.model")
-    print(f"\n-> Save KerasNLP SPM vocabulary file to `{spm_path}`.")
+    print(f"\n-> Save KerasHub SPM vocabulary file to `{spm_path}`.")
 
     shutil.copyfile(
         transformers.utils.hub.get_file_from_repo(
@@ -142,31 +142,31 @@ def extract_vocab(hf_tokenizer):
         spm_path,
     )
 
-    keras_nlp_tokenizer = keras_nlp.models.FNetTokenizer(
+    keras_hub_tokenizer = keras_hub.models.FNetTokenizer(
         proto=spm_path,
     )
-    keras_nlp_preprocessor = keras_nlp.models.FNetTextClassifierPreprocessor(
-        keras_nlp_tokenizer
+    keras_hub_preprocessor = keras_hub.models.FNetTextClassifierPreprocessor(
+        keras_hub_tokenizer
     )
 
     print("-> Print MD5 checksum of the vocab files.")
     print(f"`{spm_path}` md5sum: ", get_md5_checksum(spm_path))
 
-    return keras_nlp_preprocessor
+    return keras_hub_preprocessor
 
 
 def check_output(
-    keras_nlp_preprocessor,
-    keras_nlp_model,
+    keras_hub_preprocessor,
+    keras_hub_model,
     hf_tokenizer,
     hf_model,
 ):
     print("\n-> Check the outputs.")
     sample_text = ["cricket is awesome, easily the best sport in the world!"]
 
-    # KerasNLP
-    keras_nlp_inputs = keras_nlp_preprocessor(tf.constant(sample_text))
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)[
+    # KerasHub
+    keras_hub_inputs = keras_hub_preprocessor(tf.constant(sample_text))
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)[
         "sequence_output"
     ]
 
@@ -176,9 +176,9 @@ def check_output(
     )
     hf_output = hf_model(**hf_inputs).last_hidden_state
 
-    print("KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("KerasHub output:", keras_hub_output[0, 0, :10])
     print("HF output:", hf_output[0, 0, :10])
-    print("Difference:", np.mean(keras_nlp_output - hf_output.detach().numpy()))
+    print("Difference:", np.mean(keras_hub_output - hf_output.detach().numpy()))
 
     # Show the MD5 checksum of the model weights.
     print(
@@ -197,13 +197,13 @@ def main(_):
     hf_model.eval()
     hf_tokenizer = transformers.AutoTokenizer.from_pretrained(hf_model_name)
 
-    keras_nlp_model = convert_checkpoints(hf_model)
-    print("\n -> Load KerasNLP preprocessor.")
-    keras_nlp_preprocessor = extract_vocab(hf_tokenizer)
+    keras_hub_model = convert_checkpoints(hf_model)
+    print("\n -> Load KerasHub preprocessor.")
+    keras_hub_preprocessor = extract_vocab(hf_tokenizer)
 
     check_output(
-        keras_nlp_preprocessor,
-        keras_nlp_model,
+        keras_hub_preprocessor,
+        keras_hub_model,
         hf_tokenizer,
         hf_model,
     )
diff --git a/tools/checkpoint_conversion/convert_falcon_checkpoints.py b/tools/checkpoint_conversion/convert_falcon_checkpoints.py
index d62f5dd58b..9b249ce6d1 100644
--- a/tools/checkpoint_conversion/convert_falcon_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_falcon_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -44,7 +44,7 @@
 import torch  # noqa: E402
 import transformers  # noqa: E402
 
-import keras_nlp  # noqa: E402
+import keras_hub  # noqa: E402
 
 PRESET_MAP = {
     "falcon_refinedweb_1b_en": "tiiuae/falcon-rw-1b",
@@ -82,7 +82,7 @@ def convert_model(hf_model):
     kwargs["feedforward_dropout_rate"] = hf_config["hidden_dropout"]
     kwargs["attention_dropout_rate"] = hf_config["attention_dropout"]
 
-    return keras_nlp.models.FalconBackbone(**kwargs)
+    return keras_hub.models.FalconBackbone(**kwargs)
 
 
 def convert_tokenizer(hf_model_dir):
@@ -92,7 +92,7 @@ def convert_tokenizer(hf_model_dir):
 
     vocab = hf_tokenizer["model"]["vocab"]
     merges = hf_tokenizer["model"]["merges"]
-    return keras_nlp.models.FalconTokenizer(vocabulary=vocab, merges=merges)
+    return keras_hub.models.FalconTokenizer(vocabulary=vocab, merges=merges)
 
 
 def convert_weights(keras_model, hf_model):
@@ -236,7 +236,7 @@ def validate_output(
 ):
     input_str = ["the quick brown fox ran, galloped and jumped."]
 
-    # KerasNLP model.
+    # KerasHub model.
     token_ids = torch.tensor(keras_tokenizer(input_str))
     padding_mask = token_ids != 3
     keras_model_input = {
@@ -261,9 +261,9 @@ def hook(hf_model, input, output):
     hf_model_outputs = activation["ln_f"].detach().numpy()
 
     # Comparing the outputs.
-    print("🔶 KerasNLP tokens ids:", keras_model_input["token_ids"])
+    print("🔶 KerasHub tokens ids:", keras_model_input["token_ids"])
     print("🔶 HF tokens ids:", hf_model_input["input_ids"])
-    print("🔶 KerasNLP output:", keras_model_outputs[0, 1, :10])
+    print("🔶 KerasHub output:", keras_model_outputs[0, 1, :10])
     print("🔶 HF output:", hf_model_outputs[0, 1, :10])
     print("🔶 Difference:", np.mean(keras_model_outputs - hf_model_outputs))
 
@@ -296,8 +296,8 @@ def main(_):
     )
     print("✅ Numerics validated")
 
-    keras_nlp.src.utils.preset_utils.save_to_preset(keras_model, preset)
-    keras_nlp.src.utils.preset_utils.save_to_preset(
+    keras_hub.src.utils.preset_utils.save_to_preset(keras_model, preset)
+    keras_hub.src.utils.preset_utils.save_to_preset(
         keras_tokenizer, preset, config_filename="tokenizer.json"
     )
     print("✅ Preset saved")
diff --git a/tools/checkpoint_conversion/convert_gemma_checkpoints.py b/tools/checkpoint_conversion/convert_gemma_checkpoints.py
index afc50e4b30..a4c596606c 100644
--- a/tools/checkpoint_conversion/convert_gemma_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_gemma_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -55,7 +55,7 @@
 from gemma import sampler as sampler_lib  # noqa: E402
 from gemma import transformer as transformer_lib  # noqa: E402
 
-import keras_nlp  # noqa: E402
+import keras_hub  # noqa: E402
 
 FLAGS = flags.FLAGS
 
@@ -104,7 +104,7 @@ def convert_model(flax_config, flax_params, vocab_size):
             "use_sliding_window_attention": True,
             "sliding_window_size": 4096,
         }
-    return keras_nlp.models.GemmaBackbone(
+    return keras_hub.models.GemmaBackbone(
         vocabulary_size=vocab_size,
         num_layers=flax_config.num_layers,
         num_query_heads=flax_config.num_heads,
@@ -117,7 +117,7 @@ def convert_model(flax_config, flax_params, vocab_size):
 
 
 def convert_tokenizer(proto_path):
-    return keras_nlp.models.GemmaTokenizer(proto=proto_path)
+    return keras_hub.models.GemmaTokenizer(proto=proto_path)
 
 
 def convert_weights(keras_model, flax_config, flax_params):
@@ -195,15 +195,15 @@ def validate_output(
     input_str = "What is Keras?"
     length = 32
 
-    # KerasNLP
-    preprocessor = keras_nlp.models.GemmaCausalLMPreprocessor(keras_tokenizer)
-    gemma_lm = keras_nlp.models.GemmaCausalLM(
+    # KerasHub
+    preprocessor = keras_hub.models.GemmaCausalLMPreprocessor(keras_tokenizer)
+    gemma_lm = keras_hub.models.GemmaCausalLM(
         backbone=keras_model,
         preprocessor=preprocessor,
     )
     keras_output = gemma_lm.generate([input_str], max_length=length)
     keras_output = keras_output[0]
-    print("🔶 KerasNLP output:", keras_output)
+    print("🔶 KerasHub output:", keras_output)
 
     # Flax
     try:
diff --git a/tools/checkpoint_conversion/convert_gpt2_checkpoints.py b/tools/checkpoint_conversion/convert_gpt2_checkpoints.py
index ee15858cce..3396b0bb3f 100644
--- a/tools/checkpoint_conversion/convert_gpt2_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_gpt2_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -23,8 +23,8 @@
 from checkpoint_conversion_utils import get_md5_checksum
 
 # Temporarily directly import gpt2 until we expose it.
-from keras_nlp.models.gpt2.gpt2_backbone import GPT2Backbone
-from keras_nlp.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
+from keras_hub.models.gpt2.gpt2_backbone import GPT2Backbone
+from keras_hub.models.gpt2.gpt2_tokenizer import GPT2Tokenizer
 
 PRESET_MAP = {
     "gpt2_base_en": ("124M", "gpt2"),
@@ -54,7 +54,7 @@ def download_model(num_params):
 
 
 def convert_checkpoints(num_params):
-    print("\n-> Convert original weights to KerasNLP format.")
+    print("\n-> Convert original weights to KerasHub format.")
     # GPT-2 paths.
     extract_dir = EXTRACT_DIR.format(num_params)
     checkpoint_path = os.path.join(extract_dir, "model.ckpt")
@@ -73,15 +73,15 @@ def convert_checkpoints(num_params):
         weights[name] = weight
 
     # Temporary direct import, as we aren't exposing this quite yet.
-    keras_nlp_model = GPT2Backbone.from_preset(
+    keras_hub_model = GPT2Backbone.from_preset(
         FLAGS.preset,
         load_weights=False,
     )
 
-    keras_nlp_model.get_layer("token_embedding").embeddings.assign(
+    keras_hub_model.get_layer("token_embedding").embeddings.assign(
         weights["model/wte"]
     )
-    keras_nlp_model.get_layer("position_embedding").position_embeddings.assign(
+    keras_hub_model.get_layer("position_embedding").position_embeddings.assign(
         weights["model/wpe"]
     )
 
@@ -89,43 +89,43 @@ def convert_checkpoints(num_params):
     range_2 = (cfg["n_embd"], 2 * cfg["n_embd"])
     range_3 = (2 * cfg["n_embd"], 3 * cfg["n_embd"])
 
-    for i in range(keras_nlp_model.num_layers):
-        keras_nlp_model.get_layer(
+    for i in range(keras_hub_model.num_layers):
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             weights[f"model/h{i}/attn/c_attn/w"][
                 0, :, range_1[0] : range_1[1]
             ].reshape((cfg["n_embd"], cfg["n_head"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             weights[f"model/h{i}/attn/c_attn/b"][
                 range_1[0] : range_1[1]
             ].reshape((cfg["n_head"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             weights[f"model/h{i}/attn/c_attn/w"][
                 0, :, range_2[0] : range_2[1]
             ].reshape((cfg["n_embd"], cfg["n_head"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             weights[f"model/h{i}/attn/c_attn/b"][
                 range_2[0] : range_2[1]
             ].reshape((cfg["n_head"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             weights[f"model/h{i}/attn/c_attn/w"][
                 0, :, range_3[0] : range_3[1]
             ].reshape((cfg["n_embd"], cfg["n_head"], -1))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             weights[f"model/h{i}/attn/c_attn/b"][
@@ -133,66 +133,66 @@ def convert_checkpoints(num_params):
             ].reshape((cfg["n_head"], -1))
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             weights[f"model/h{i}/attn/c_proj/w"][0].reshape(
                 (cfg["n_head"], -1, cfg["n_embd"])
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             weights[f"model/h{i}/attn/c_proj/b"]
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(weights[f"model/h{i}/ln_1/g"])
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.beta.assign(weights[f"model/h{i}/ln_1/b"])
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             weights[f"model/h{i}/mlp/c_fc/w"][0]
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             weights[f"model/h{i}/mlp/c_fc/b"]
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             weights[f"model/h{i}/mlp/c_proj/w"][0]
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             weights[f"model/h{i}/mlp/c_proj/b"]
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(weights[f"model/h{i}/ln_2/g"])
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.beta.assign(weights[f"model/h{i}/ln_2/b"])
 
-    keras_nlp_model.get_layer("layer_norm").gamma.assign(
+    keras_hub_model.get_layer("layer_norm").gamma.assign(
         weights["model/ln_f/g"]
     )
 
-    keras_nlp_model.get_layer("layer_norm").beta.assign(weights["model/ln_f/b"])
+    keras_hub_model.get_layer("layer_norm").beta.assign(weights["model/ln_f/b"])
 
     # Save the model.
-    print(f"\n-> Save KerasNLP model weights to `{FLAGS.preset}.h5`.")
-    keras_nlp_model.save_weights(f"{FLAGS.preset}.h5")
+    print(f"\n-> Save KerasHub model weights to `{FLAGS.preset}.h5`.")
+    keras_hub_model.save_weights(f"{FLAGS.preset}.h5")
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def define_tokenizer(num_params, hf_model_name):
@@ -201,7 +201,7 @@ def define_tokenizer(num_params, hf_model_name):
     merges_path = os.path.join(extract_dir, "vocab.bpe")
     vocab_path = os.path.join(extract_dir, "encoder.json")
 
-    keras_nlp_tokenizer = GPT2Tokenizer(
+    keras_hub_tokenizer = GPT2Tokenizer(
         vocabulary=vocab_path,
         merges=merges_path,
     )
@@ -211,40 +211,40 @@ def define_tokenizer(num_params, hf_model_name):
     print(f"`{vocab_path}` md5sum: ", get_md5_checksum(vocab_path))
     print(f"`{merges_path}` md5sum: ", get_md5_checksum(merges_path))
 
-    return keras_nlp_tokenizer, hf_tokenizer
+    return keras_hub_tokenizer, hf_tokenizer
 
 
 def check_output(
-    keras_nlp_model,
-    keras_nlp_tokenizer,
+    keras_hub_model,
+    keras_hub_tokenizer,
     hf_model,
     hf_tokenizer,
 ):
     print("\n-> Check the outputs.")
     input_str = ["the quick brown fox ran, galloped and jumped."]
 
-    # KerasNLP
-    token_ids = keras_nlp_tokenizer(input_str)
+    # KerasHub
+    token_ids = keras_hub_tokenizer(input_str)
     padding_mask = token_ids != 0
 
-    keras_nlp_inputs = {
+    keras_hub_inputs = {
         "token_ids": token_ids.to_tensor(),
         "padding_mask": padding_mask.to_tensor(),
     }
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)
 
     # HF
     hf_inputs = hf_tokenizer(input_str, return_tensors="pt")
     hf_output = hf_model(**hf_inputs).last_hidden_state
 
-    print("KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("KerasHub output:", keras_hub_output[0, 0, :10])
     print("HF output:", hf_output[0, 0, :10])
-    print("Difference:", np.mean(keras_nlp_output - hf_output.detach().numpy()))
+    print("Difference:", np.mean(keras_hub_output - hf_output.detach().numpy()))
 
     # Show the MD5 checksum of the model weights.
     print("Model md5sum: ", get_md5_checksum(f"./{FLAGS.preset}.h5"))
 
-    return keras_nlp_output
+    return keras_hub_output
 
 
 def main(_):
@@ -256,19 +256,19 @@ def main(_):
 
     download_model(num_params)
 
-    keras_nlp_model = convert_checkpoints(num_params)
+    keras_hub_model = convert_checkpoints(num_params)
 
     print("\n-> Load HF model.")
     hf_model = transformers.AutoModel.from_pretrained(hf_model_name)
     hf_model.eval()
 
-    keras_nlp_tokenizer, hf_tokenizer = define_tokenizer(
+    keras_hub_tokenizer, hf_tokenizer = define_tokenizer(
         num_params, hf_model_name
     )
 
     check_output(
-        keras_nlp_model,
-        keras_nlp_tokenizer,
+        keras_hub_model,
+        keras_hub_tokenizer,
         hf_model,
         hf_tokenizer,
     )
diff --git a/tools/checkpoint_conversion/convert_gpt_neox_checkpoints.py b/tools/checkpoint_conversion/convert_gpt_neox_checkpoints.py
index 098a4bafc5..2ae568ab84 100644
--- a/tools/checkpoint_conversion/convert_gpt_neox_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_gpt_neox_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -19,8 +19,8 @@
 from transformers import AutoTokenizer
 from transformers import GPTNeoXModel
 
-from keras_nlp.models import GPTNeoXBackbone
-from keras_nlp.models import GPTNeoXTokenizer
+from keras_hub.models import GPTNeoXBackbone
+from keras_hub.models import GPTNeoXTokenizer
 
 PRESET_NAME = "pythia-70m"
 BASE_MODEL = "EleutherAI/gpt-neox-20b"
diff --git a/tools/checkpoint_conversion/convert_llama3_checkpoints.py b/tools/checkpoint_conversion/convert_llama3_checkpoints.py
index 61a4c6cad4..6f7227e93e 100644
--- a/tools/checkpoint_conversion/convert_llama3_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_llama3_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -23,10 +23,10 @@
 from transformers import AutoTokenizer
 from transformers import LlamaForCausalLM
 
-from keras_nlp import upload_preset
-from keras_nlp.models import Llama3Backbone
-from keras_nlp.models import Llama3CausalLMPreprocessor
-from keras_nlp.models import Llama3Tokenizer
+from keras_hub import upload_preset
+from keras_hub.models import Llama3Backbone
+from keras_hub.models import Llama3CausalLMPreprocessor
+from keras_hub.models import Llama3Tokenizer
 
 PRESET_MAP = {
     "llama3_8b_en": "meta-llama/Meta-Llama-3-8B",
@@ -39,15 +39,15 @@
 )
 
 
-def convert_checkpoints(keras_nlp_model, hf_model):
+def convert_checkpoints(keras_hub_model, hf_model):
     config = hf_model.config
 
-    keras_nlp_model.token_embedding.embeddings.assign(
+    keras_hub_model.token_embedding.embeddings.assign(
         hf_model.model.embed_tokens.weight.detach().cpu().float().numpy()
     )
 
-    for i in range(keras_nlp_model.num_layers):
-        keras_nlp_model.transformer_layers[
+    for i in range(keras_hub_model.num_layers):
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._key_dense.set_weights(
             [
@@ -63,7 +63,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._query_dense.set_weights(
             [
@@ -79,7 +79,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._value_dense.set_weights(
             [
@@ -95,7 +95,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._output_dense.set_weights(
             [
@@ -111,7 +111,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layernorm.set_weights(
             [
@@ -122,7 +122,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_intermediate_dense.set_weights(
             [
@@ -133,7 +133,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_output_dense.set_weights(
             [
@@ -144,7 +144,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_gate_dense.set_weights(
             [
@@ -155,7 +155,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_layernorm.set_weights(
             [
@@ -167,21 +167,21 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             ]
         )
 
-    keras_nlp_model.layer_norm.set_weights(
+    keras_hub_model.layer_norm.set_weights(
         [hf_model.model.norm.weight.detach().cpu().float().numpy()]
     )
-    keras_nlp_model.token_embedding.reverse_embeddings.assign(
+    keras_hub_model.token_embedding.reverse_embeddings.assign(
         hf_model.lm_head.weight.T.detach().cpu().float().numpy()
     )
 
 
 def test_model(
-    keras_nlp_model, keras_nlp_tokenizer, hf_model, hf_model_tokenizer
+    keras_hub_model, keras_hub_tokenizer, hf_model, hf_model_tokenizer
 ):
     # First, test that the number of parameters match
-    keras_nlp_params = keras_nlp_model.count_params()
+    keras_hub_params = keras_hub_model.count_params()
     hf_params = hf_model.num_parameters()
-    assert keras_nlp_params == hf_params
+    assert keras_hub_params == hf_params
 
     # Test the outputs of both the models
     hf_outputs = hf_model(
@@ -189,19 +189,19 @@ def test_model(
     )
     hf_output_logits = hf_outputs.logits.detach().cpu().numpy()
 
-    keras_nlp_preprocessor = Llama3CausalLMPreprocessor(keras_nlp_tokenizer)
-    keras_nlp_output = keras_nlp_model(
-        keras_nlp_preprocessor(["What is Keras?"], sequence_length=5)[0]
+    keras_hub_preprocessor = Llama3CausalLMPreprocessor(keras_hub_tokenizer)
+    keras_hub_output = keras_hub_model(
+        keras_hub_preprocessor(["What is Keras?"], sequence_length=5)[0]
     )
-    keras_nlp_logits = keras_nlp_model.token_embedding(
-        keras_nlp_output, reverse=True
+    keras_hub_logits = keras_hub_model.token_embedding(
+        keras_hub_output, reverse=True
     )
-    keras_nlp_logits = ops.convert_to_numpy(keras_nlp_logits)
+    keras_hub_logits = ops.convert_to_numpy(keras_hub_logits)
 
     # High tolerence since bfloat16 is used as the default dtype for Llama
     try:
         np.testing.assert_allclose(
-            keras_nlp_logits, hf_output_logits, atol=1e-4
+            keras_hub_logits, hf_output_logits, atol=1e-4
         )
     except AssertionError as err:
         print("\n")
@@ -210,16 +210,16 @@ def test_model(
         print("\n")
 
 
-def test_tokenizer(keras_nlp_tokenizer, hf_tokenizer):
+def test_tokenizer(keras_hub_tokenizer, hf_tokenizer):
     hf_output = hf_tokenizer(["What is Keras?"], return_tensors="pt")
     hf_output = hf_output["input_ids"].detach().cpu().numpy()
-    keras_nlp_preprocessor = Llama3CausalLMPreprocessor(keras_nlp_tokenizer)
-    keras_nlp_output = keras_nlp_preprocessor(
+    keras_hub_preprocessor = Llama3CausalLMPreprocessor(keras_hub_tokenizer)
+    keras_hub_output = keras_hub_preprocessor(
         ["What is Keras?"], sequence_length=5
     )
-    keras_nlp_output = ops.convert_to_numpy(keras_nlp_output[0]["token_ids"])
+    keras_hub_output = ops.convert_to_numpy(keras_hub_output[0]["token_ids"])
 
-    np.testing.assert_equal(keras_nlp_output, hf_output)
+    np.testing.assert_equal(keras_hub_output, hf_output)
 
 
 def main(_):
@@ -240,7 +240,7 @@ def main(_):
     hf_model.eval()
     print("\n-> Huggingface model and tokenizer loaded")
 
-    # === Load the KerasNLP model ===
+    # === Load the KerasHub model ===
     backbone_kwargs = dict(
         vocabulary_size=hf_model.config.vocab_size,
         hidden_dim=hf_model.config.hidden_size,
@@ -252,7 +252,7 @@ def main(_):
         rope_max_wavelength=hf_model.config.rope_theta,
         dtype="bfloat16",
     )
-    keras_nlp_model = Llama3Backbone(**backbone_kwargs)
+    keras_hub_model = Llama3Backbone(**backbone_kwargs)
 
     # === Get the tokenizer from the Huggingface model ===
     tokenizer_path = hf_hub_download(
@@ -262,23 +262,23 @@ def main(_):
         tokenizer_content = json.load(tokenizer_file)
     vocabulary = hf_tokenizer.vocab
     merges = tokenizer_content["model"]["merges"]
-    keras_nlp_tokenizer = Llama3Tokenizer(vocabulary, merges)
+    keras_hub_tokenizer = Llama3Tokenizer(vocabulary, merges)
     print("\n-> Keras 3 model and tokenizer loaded.")
 
     # === Port the weights ===
-    convert_checkpoints(keras_nlp_model, hf_model)
+    convert_checkpoints(keras_hub_model, hf_model)
     print("\n-> Weight transfer done.")
 
     # === Check that the models and tokenizers outputs match ===
-    test_tokenizer(keras_nlp_tokenizer, hf_tokenizer)
-    test_model(keras_nlp_model, keras_nlp_tokenizer, hf_model, hf_tokenizer)
+    test_tokenizer(keras_hub_tokenizer, hf_tokenizer)
+    test_model(keras_hub_model, keras_hub_tokenizer, hf_model, hf_tokenizer)
     print("\n-> Tests passed!")
 
-    keras_nlp_model.save_to_preset(preset)
+    keras_hub_model.save_to_preset(preset)
-    print("\n-> Saved the model preset in float16")
+    print("\n-> Saved the model preset in bfloat16")
 
     # === Save the tokenizer ===
-    keras_nlp_tokenizer.save_to_preset(preset)
+    keras_hub_tokenizer.save_to_preset(preset)
     print("\n-> Saved the tokenizer")
 
     # === Upload the preset ===
diff --git a/tools/checkpoint_conversion/convert_llama_checkpoints.py b/tools/checkpoint_conversion/convert_llama_checkpoints.py
index 3c4d3dcfc4..06033f7a9f 100644
--- a/tools/checkpoint_conversion/convert_llama_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_llama_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -28,10 +28,10 @@
 from transformers import AutoTokenizer  # noqa: E402
 from transformers import LlamaForCausalLM  # noqa: E402
 
-from keras_nlp import upload_preset  # noqa: E402
-from keras_nlp.models import LlamaBackbone  # noqa: E402
-from keras_nlp.models import LlamaCausalLMPreprocessor  # noqa: E402
-from keras_nlp.models import LlamaTokenizer  # noqa: E402
+from keras_hub import upload_preset  # noqa: E402
+from keras_hub.models import LlamaBackbone  # noqa: E402
+from keras_hub.models import LlamaCausalLMPreprocessor  # noqa: E402
+from keras_hub.models import LlamaTokenizer  # noqa: E402
 
 PRESET_MAP = {
     "llama2_7b_en": "meta-llama/Llama-2-7b-hf",
@@ -80,15 +80,15 @@
 )
 
 
-def convert_checkpoints(keras_nlp_model, hf_model):
+def convert_checkpoints(keras_hub_model, hf_model):
     config = hf_model.config
 
-    keras_nlp_model.token_embedding.embeddings.assign(
+    keras_hub_model.token_embedding.embeddings.assign(
         hf_model.model.embed_tokens.weight
     )
 
-    for i in range(keras_nlp_model.num_layers):
-        keras_nlp_model.transformer_layers[
+    for i in range(keras_hub_model.num_layers):
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._key_dense.set_weights(
             [
@@ -99,7 +99,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 )
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._query_dense.set_weights(
             [
@@ -110,7 +110,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 )
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._value_dense.set_weights(
             [
@@ -121,7 +121,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 )
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._output_dense.set_weights(
             [
@@ -132,45 +132,45 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 )
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layernorm.set_weights(
             [hf_model.model.layers[i].input_layernorm.weight]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_intermediate_dense.set_weights(
             [hf_model.model.layers[i].mlp.up_proj.weight.T]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_output_dense.set_weights(
             [hf_model.model.layers[i].mlp.down_proj.weight.T]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_gate_dense.set_weights(
             [hf_model.model.layers[i].mlp.gate_proj.weight.T]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_layernorm.set_weights(
             [hf_model.model.layers[i].post_attention_layernorm.weight.detach()]
         )
 
-    keras_nlp_model.layer_norm.set_weights([hf_model.model.norm.weight])
-    keras_nlp_model.token_embedding.reverse_embeddings.assign(
+    keras_hub_model.layer_norm.set_weights([hf_model.model.norm.weight])
+    keras_hub_model.token_embedding.reverse_embeddings.assign(
         hf_model.lm_head.weight.T
     )
 
 
 def test_model(
-    keras_nlp_model, keras_nlp_tokenizer, hf_model, hf_model_tokenizer
+    keras_hub_model, keras_hub_tokenizer, hf_model, hf_model_tokenizer
 ):
     # First, test that the number of parameters match
-    keras_nlp_params = keras_nlp_model.count_params()
+    keras_hub_params = keras_hub_model.count_params()
     hf_params = hf_model.num_parameters()
-    assert keras_nlp_params == hf_params
+    assert keras_hub_params == hf_params
 
     # Test the outputs of both the models
     hf_outputs = hf_model(
@@ -178,19 +178,19 @@ def test_model(
     )
     hf_output_logits = ops.convert_to_numpy(hf_outputs.logits)
 
-    keras_nlp_preprocessor = LlamaCausalLMPreprocessor(keras_nlp_tokenizer)
-    keras_nlp_output = keras_nlp_model(
-        keras_nlp_preprocessor(["What is Keras?"], sequence_length=6)[0]
+    keras_hub_preprocessor = LlamaCausalLMPreprocessor(keras_hub_tokenizer)
+    keras_hub_output = keras_hub_model(
+        keras_hub_preprocessor(["What is Keras?"], sequence_length=6)[0]
     )
-    keras_nlp_logits = keras_nlp_model.token_embedding(
-        keras_nlp_output, reverse=True
+    keras_hub_logits = keras_hub_model.token_embedding(
+        keras_hub_output, reverse=True
     )
-    keras_nlp_logits = ops.convert_to_numpy(keras_nlp_logits)
+    keras_hub_logits = ops.convert_to_numpy(keras_hub_logits)
 
-    # High tolerence when bfloat16 is used as the default dtype for Llama
+    # High tolerance when bfloat16 is used as the default dtype for Llama
     try:
         np.testing.assert_allclose(
-            keras_nlp_logits, hf_output_logits, atol=1e-4
+            keras_hub_logits, hf_output_logits, atol=1e-4
         )
     except AssertionError as err:
         print("\n")
@@ -199,16 +199,16 @@ def test_model(
         print("\n")
 
 
-def test_tokenizer(keras_nlp_tokenizer, hf_tokenizer):
+def test_tokenizer(keras_hub_tokenizer, hf_tokenizer):
     hf_output = hf_tokenizer(["What is Keras?"], return_tensors="pt")
     hf_output = ops.convert_to_numpy(hf_output["input_ids"])
-    keras_nlp_preprocessor = LlamaCausalLMPreprocessor(keras_nlp_tokenizer)
-    keras_nlp_output = keras_nlp_preprocessor(
+    keras_hub_preprocessor = LlamaCausalLMPreprocessor(keras_hub_tokenizer)
+    keras_hub_output = keras_hub_preprocessor(
         ["What is Keras?"], sequence_length=6
     )
-    keras_nlp_output = ops.convert_to_numpy(keras_nlp_output[0]["token_ids"])
+    keras_hub_output = ops.convert_to_numpy(keras_hub_output[0]["token_ids"])
 
-    np.testing.assert_equal(keras_nlp_output, hf_output)
+    np.testing.assert_equal(keras_hub_output, hf_output)
 
 
 def main(_):
@@ -237,7 +237,7 @@ def main(_):
             f"\n-> Huggingface model and tokenizer loaded with dtype: {FLAGS.validate_dtype}"
         )
 
-        # === Load the KerasNLP model ===
+        # === Load the KerasHub model ===
         backbone_kwargs = dict(
             vocabulary_size=hf_model.config.vocab_size,
             hidden_dim=hf_model.config.hidden_size,
@@ -249,39 +249,39 @@ def main(_):
             rope_max_wavelength=hf_model.config.rope_theta,
             dtype=FLAGS.validate_dtype,
         )
-        keras_nlp_model = LlamaBackbone(**backbone_kwargs)
+        keras_hub_model = LlamaBackbone(**backbone_kwargs)
 
         # === Get the tokenizer from the Huggingface model ===
         tokenizer_path = hf_tokenizer.vocab_file
-        keras_nlp_tokenizer = LlamaTokenizer(tokenizer_path)
+        keras_hub_tokenizer = LlamaTokenizer(tokenizer_path)
         print("\n-> Keras 3 model and tokenizer loaded.")
 
         # === Port the weights ===
-        convert_checkpoints(keras_nlp_model, hf_model)
+        convert_checkpoints(keras_hub_model, hf_model)
         print("\n-> Weight transfer done.")
 
         # === Check that the models and tokenizers outputs match ===
-        test_tokenizer(keras_nlp_tokenizer, hf_tokenizer)
-        test_model(keras_nlp_model, keras_nlp_tokenizer, hf_model, hf_tokenizer)
+        test_tokenizer(keras_hub_tokenizer, hf_tokenizer)
+        test_model(keras_hub_model, keras_hub_tokenizer, hf_model, hf_tokenizer)
         print("\n-> Tests passed!")
 
-        keras_nlp_model.save_weights(os.path.join(temp_dir, "model.weights.h5"))
+        keras_hub_model.save_weights(os.path.join(temp_dir, "model.weights.h5"))
         print(f"\n-> Saved the model weights in {FLAGS.validate_dtype}")
 
-        del keras_nlp_model, hf_model
+        del keras_hub_model, hf_model
         gc.collect()
 
         # === Save the weights again in user defined dtype ===
         backbone_kwargs["dtype"] = FLAGS.save_dtype
-        keras_nlp_model = LlamaBackbone(**backbone_kwargs)
-        keras_nlp_model.load_weights(os.path.join(temp_dir, "model.weights.h5"))
+        keras_hub_model = LlamaBackbone(**backbone_kwargs)
+        keras_hub_model.load_weights(os.path.join(temp_dir, "model.weights.h5"))
 
         # === Save the model ===
-        keras_nlp_model.save_to_preset(preset)
+        keras_hub_model.save_to_preset(preset)
         print(f"\n-> Saved the model preset in {FLAGS.save_dtype}")
 
         # === Save the tokenizer ===
-        keras_nlp_tokenizer.save_to_preset(preset)
+        keras_hub_tokenizer.save_to_preset(preset)
         print("\n-> Saved the tokenizer")
 
         # == Upload preset ==
diff --git a/tools/checkpoint_conversion/convert_mistral_checkpoints.py b/tools/checkpoint_conversion/convert_mistral_checkpoints.py
index dd43bfb8b8..e2ad921139 100644
--- a/tools/checkpoint_conversion/convert_mistral_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_mistral_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -25,10 +25,10 @@
 from transformers import AutoTokenizer
 from transformers import MistralForCausalLM
 
-from keras_nlp.models import MistralBackbone
-from keras_nlp.models import MistralCausalLMPreprocessor
-from keras_nlp.models import MistralTokenizer
-from keras_nlp.utils.preset_utils import save_to_preset
+from keras_hub.models import MistralBackbone
+from keras_hub.models import MistralCausalLMPreprocessor
+from keras_hub.models import MistralTokenizer
+from keras_hub.utils.preset_utils import save_to_preset
 
 PRESET_MAP = {
     "mistral_7b_en": "mistralai/Mistral-7B-v0.1",
@@ -42,15 +42,15 @@
 )
 
 
-def convert_checkpoints(keras_nlp_model, hf_model):
+def convert_checkpoints(keras_hub_model, hf_model):
     config = hf_model.config
 
-    keras_nlp_model.token_embedding.embeddings.assign(
+    keras_hub_model.token_embedding.embeddings.assign(
         hf_model.model.embed_tokens.weight.detach().cpu().numpy()
     )
 
-    for i in range(keras_nlp_model.num_layers):
-        keras_nlp_model.transformer_layers[
+    for i in range(keras_hub_model.num_layers):
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._key_dense.set_weights(
             [
@@ -65,7 +65,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._query_dense.set_weights(
             [
@@ -80,7 +80,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._value_dense.set_weights(
             [
@@ -95,7 +95,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layer._output_dense.set_weights(
             [
@@ -110,7 +110,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._self_attention_layernorm.set_weights(
             [
@@ -120,7 +120,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_intermediate_dense.set_weights(
             [
@@ -130,7 +130,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_output_dense.set_weights(
             [
@@ -140,7 +140,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_gate_dense.set_weights(
             [
@@ -150,7 +150,7 @@ def convert_checkpoints(keras_nlp_model, hf_model):
                 .numpy()
             ]
         )
-        keras_nlp_model.transformer_layers[
+        keras_hub_model.transformer_layers[
             i
         ]._feedforward_layernorm.set_weights(
             [
@@ -161,21 +161,21 @@ def convert_checkpoints(keras_nlp_model, hf_model):
             ]
         )
 
-    keras_nlp_model.layer_norm.set_weights(
+    keras_hub_model.layer_norm.set_weights(
         [hf_model.model.norm.weight.detach().cpu().numpy()]
     )
-    keras_nlp_model.token_embedding.reverse_embeddings.assign(
+    keras_hub_model.token_embedding.reverse_embeddings.assign(
         hf_model.lm_head.weight.T.detach().cpu().numpy()
     )
 
 
 def test_model(
-    keras_nlp_model, keras_nlp_tokenizer, hf_model, hf_model_tokenizer
+    keras_hub_model, keras_hub_tokenizer, hf_model, hf_model_tokenizer
 ):
     # First, test that the number of parameters match
-    keras_nlp_params = keras_nlp_model.count_params()
+    keras_hub_params = keras_hub_model.count_params()
     hf_params = hf_model.num_parameters()
-    assert keras_nlp_params == hf_params
+    assert keras_hub_params == hf_params
 
     # Test the outputs of both the models
     hf_outputs = hf_model(
@@ -183,19 +183,19 @@ def test_model(
     )
     hf_output_logits = hf_outputs.logits.detach().cpu().numpy()
 
-    keras_nlp_preprocessor = MistralCausalLMPreprocessor(keras_nlp_tokenizer)
-    keras_nlp_output = keras_nlp_model(
-        keras_nlp_preprocessor(["What is Keras?"], sequence_length=6)[0]
+    keras_hub_preprocessor = MistralCausalLMPreprocessor(keras_hub_tokenizer)
+    keras_hub_output = keras_hub_model(
+        keras_hub_preprocessor(["What is Keras?"], sequence_length=6)[0]
     )
-    keras_nlp_logits = keras_nlp_model.token_embedding(
-        keras_nlp_output, reverse=True
+    keras_hub_logits = keras_hub_model.token_embedding(
+        keras_hub_output, reverse=True
     )
-    keras_nlp_logits = ops.convert_to_numpy(keras_nlp_logits)
+    keras_hub_logits = ops.convert_to_numpy(keras_hub_logits)
 
-    # High tolerence since bfloat16 is used as the default dtype for Mistral
+    # High tolerance since bfloat16 is used as the default dtype for Mistral
     try:
         np.testing.assert_allclose(
-            keras_nlp_logits, hf_output_logits, atol=1e-4
+            keras_hub_logits, hf_output_logits, atol=1e-4
         )
     except AssertionError as err:
         print("\n")
@@ -204,16 +204,16 @@ def test_model(
         print("\n")
 
 
-def test_tokenizer(keras_nlp_tokenizer, hf_tokenizer):
+def test_tokenizer(keras_hub_tokenizer, hf_tokenizer):
     hf_output = hf_tokenizer(["What is Keras?"], return_tensors="pt")
     hf_output = hf_output["input_ids"].detach().cpu().numpy()
-    keras_nlp_preprocessor = MistralCausalLMPreprocessor(keras_nlp_tokenizer)
-    keras_nlp_output = keras_nlp_preprocessor(
+    keras_hub_preprocessor = MistralCausalLMPreprocessor(keras_hub_tokenizer)
+    keras_hub_output = keras_hub_preprocessor(
         ["What is Keras?"], sequence_length=6
     )
-    keras_nlp_output = ops.convert_to_numpy(keras_nlp_output[0]["token_ids"])
+    keras_hub_output = ops.convert_to_numpy(keras_hub_output[0]["token_ids"])
 
-    np.testing.assert_equal(keras_nlp_output, hf_output)
+    np.testing.assert_equal(keras_hub_output, hf_output)
 
 
 def main(_):
@@ -236,7 +236,7 @@ def main(_):
         hf_model.eval()
         print("\n-> Huggingface model and tokenizer loaded")
 
-        # === Load the KerasNLP model ===
+        # === Load the KerasHub model ===
         backbone_kwargs = dict(
             vocabulary_size=hf_model.config.vocab_size,
             hidden_dim=hf_model.config.hidden_size,
@@ -249,7 +249,7 @@ def main(_):
             rope_max_wavelength=hf_model.config.rope_theta,
             dtype="float32",
         )
-        keras_nlp_model = MistralBackbone(**backbone_kwargs)
+        keras_hub_model = MistralBackbone(**backbone_kwargs)
 
         # === Download the tokenizer from Huggingface model card ===
         spm_path = (
@@ -261,35 +261,35 @@ def main(_):
         tokenizer_path = os.path.join(temp_dir, "vocabulary.spm")
         with open(tokenizer_path, "wb") as tokenizer_file:
             tokenizer_file.write(response.content)
-        keras_nlp_tokenizer = MistralTokenizer(tokenizer_path)
+        keras_hub_tokenizer = MistralTokenizer(tokenizer_path)
         print("\n-> Keras 3 model and tokenizer loaded.")
 
         # === Port the weights ===
-        convert_checkpoints(keras_nlp_model, hf_model)
+        convert_checkpoints(keras_hub_model, hf_model)
         print("\n-> Weight transfer done.")
 
         # === Check that the models and tokenizers outputs match ===
-        test_tokenizer(keras_nlp_tokenizer, hf_tokenizer)
-        test_model(keras_nlp_model, keras_nlp_tokenizer, hf_model, hf_tokenizer)
+        test_tokenizer(keras_hub_tokenizer, hf_tokenizer)
+        test_model(keras_hub_model, keras_hub_tokenizer, hf_model, hf_tokenizer)
         print("\n-> Tests passed!")
 
         # === Save the model weights in float32 format ===
-        keras_nlp_model.save_weights(os.path.join(temp_dir, "model.weights.h5"))
+        keras_hub_model.save_weights(os.path.join(temp_dir, "model.weights.h5"))
         print("\n-> Saved the model weights in float32")
 
-        del keras_nlp_model, hf_model
+        del keras_hub_model, hf_model
         gc.collect()
 
         # === Save the weights again in float16 ===
         backbone_kwargs["dtype"] = "float16"
-        keras_nlp_model = MistralBackbone(**backbone_kwargs)
-        keras_nlp_model.load_weights(os.path.join(temp_dir, "model.weights.h5"))
-        save_to_preset(keras_nlp_model, preset)
+        keras_hub_model = MistralBackbone(**backbone_kwargs)
+        keras_hub_model.load_weights(os.path.join(temp_dir, "model.weights.h5"))
+        save_to_preset(keras_hub_model, preset)
         print("\n-> Saved the model preset in float16")
 
         # === Save the tokenizer ===
         save_to_preset(
-            keras_nlp_tokenizer, preset, config_filename="tokenizer.json"
+            keras_hub_tokenizer, preset, config_filename="tokenizer.json"
         )
         print("\n-> Saved the tokenizer")
     finally:
diff --git a/tools/checkpoint_conversion/convert_opt_checkpoints.py b/tools/checkpoint_conversion/convert_opt_checkpoints.py
index fb8de06531..a0856d6719 100644
--- a/tools/checkpoint_conversion/convert_opt_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_opt_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,7 +21,7 @@
 from absl import flags
 from checkpoint_conversion_utils import get_md5_checksum
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "opt_125m_en": "facebook/opt-125m",
@@ -37,31 +37,31 @@
 
 
 def convert_weights(hf_model):
-    print("\n-> Convert original weights to KerasNLP format.")
+    print("\n-> Convert original weights to KerasHub format.")
 
     # Load PyTorch OPT checkpoint.
-    keras_nlp_model = keras_nlp.models.OPTBackbone.from_preset(
+    keras_hub_model = keras_hub.models.OPTBackbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
     # Token embedding.
-    keras_nlp_model.get_layer("embeddings").token_embedding.embeddings.assign(
+    keras_hub_model.get_layer("embeddings").token_embedding.embeddings.assign(
         hf_model.model.decoder.embed_tokens.weight
     )
     # Position embedding.
-    keras_nlp_model.get_layer(
+    keras_hub_model.get_layer(
         "embeddings"
     ).position_embedding.position_embeddings.assign(
         hf_model.model.decoder.embed_positions.weight[2:, :]
     )
 
-    num_heads = keras_nlp_model.num_heads
-    hidden_dim = keras_nlp_model.hidden_dim
+    num_heads = keras_hub_model.num_heads
+    hidden_dim = keras_hub_model.hidden_dim
 
     # Transformer layers.
-    for i in range(keras_nlp_model.num_layers):
+    for i in range(keras_hub_model.num_layers):
         # Self-attention.
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             tf.reshape(
@@ -69,7 +69,7 @@ def convert_weights(hf_model):
                 (hidden_dim, num_heads, -1),
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             tf.reshape(
@@ -78,7 +78,7 @@ def convert_weights(hf_model):
             )
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             tf.reshape(
@@ -86,7 +86,7 @@ def convert_weights(hf_model):
                 (hidden_dim, num_heads, -1),
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             tf.reshape(
@@ -95,7 +95,7 @@ def convert_weights(hf_model):
             )
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             tf.reshape(
@@ -103,7 +103,7 @@ def convert_weights(hf_model):
                 (hidden_dim, num_heads, -1),
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             tf.reshape(
@@ -112,7 +112,7 @@ def convert_weights(hf_model):
             )
         )
 
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             tf.reshape(
@@ -120,83 +120,83 @@ def convert_weights(hf_model):
                 (num_heads, -1, hidden_dim),
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             hf_model.model.decoder.layers[i].self_attn.out_proj.bias,
         )
 
         # Attention LayerNorm
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(
             hf_model.model.decoder.layers[i].self_attn_layer_norm.gamma
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.beta.assign(
             hf_model.model.decoder.layers[i].self_attn_layer_norm.beta
         )
 
         # Intermediate FF layer
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             hf_model.model.decoder.layers[i].fc1.kernel
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             hf_model.model.decoder.layers[i].fc1.bias
         )
 
         # Output dense layer
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             hf_model.model.decoder.layers[i].fc2.kernel
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             hf_model.model.decoder.layers[i].fc2.bias
         )
 
         # FF LayerNorm
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(
             hf_model.model.decoder.layers[i].final_layer_norm.gamma
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.beta.assign(
             hf_model.model.decoder.layers[i].final_layer_norm.beta
         )
 
     # Output LayerNorm
-    keras_nlp_model.get_layer("layer_norm").gamma.assign(
+    keras_hub_model.get_layer("layer_norm").gamma.assign(
         hf_model.model.decoder.final_layer_norm.gamma
     )
-    keras_nlp_model.get_layer("layer_norm").beta.assign(
+    keras_hub_model.get_layer("layer_norm").beta.assign(
         hf_model.model.decoder.final_layer_norm.beta
     )
 
     # Save the model.
     model_path = f"./{FLAGS.preset}/model.h5"
-    print(f"-> Save KerasNLP model weights to `{model_path}`.")
-    keras_nlp_model.save_weights(model_path)
+    print(f"-> Save KerasHub model weights to `{model_path}`.")
+    keras_hub_model.save_weights(model_path)
     print("-> Print MD5 checksum of the model weights files.")
     print(f"`{model_path}` md5sum: ", get_md5_checksum(model_path))
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def extract_vocab(hf_tokenizer):
     vocabulary_path = f"./{FLAGS.preset}/vocab.json"
     merges_path = f"./{FLAGS.preset}/merges.txt"
-    print(f"\n-> Save KerasNLP vocab to `{vocabulary_path}`.")
-    print(f"-> Save KerasNLP merges to `{merges_path}`.")
+    print(f"\n-> Save KerasHub vocab to `{vocabulary_path}`.")
+    print(f"-> Save KerasHub merges to `{merges_path}`.")
 
     # Huggingface has a save_vocabulary function but it's not byte-for-byte
     # with the source. Instead copy the original downloaded file directly.
@@ -213,7 +213,7 @@ def extract_vocab(hf_tokenizer):
         merges_path,
     )
 
-    keras_nlp_tokenizer = keras_nlp.models.OPTTokenizer(
+    keras_hub_tokenizer = keras_hub.models.OPTTokenizer(
         vocabulary=vocabulary_path, merges=merges_path
     )
 
@@ -221,12 +221,12 @@ def extract_vocab(hf_tokenizer):
     print(f"`{vocabulary_path}` md5sum: ", get_md5_checksum(vocabulary_path))
     print(f"`{merges_path}` md5sum: ", get_md5_checksum(merges_path))
 
-    return keras_nlp_tokenizer
+    return keras_hub_tokenizer
 
 
 def check_output(
-    keras_nlp_model,
-    keras_nlp_tokenizer,
+    keras_hub_model,
+    keras_hub_tokenizer,
     hf_model,
     hf_tokenizer,
 ):
@@ -234,20 +234,20 @@ def check_output(
     input_str = ["the quick brown fox ran, galloped and jumped."]
 
     sequence_length = 16
-    packer = keras_nlp.layers.StartEndPacker(
+    packer = keras_hub.layers.StartEndPacker(
         sequence_length=sequence_length,
-        start_value=keras_nlp_tokenizer.start_token_id,
-        pad_value=keras_nlp_tokenizer.pad_token_id,
+        start_value=keras_hub_tokenizer.start_token_id,
+        pad_value=keras_hub_tokenizer.pad_token_id,
     )
 
-    # KerasNLP
-    token_ids = packer(keras_nlp_tokenizer(input_str))
-    padding_mask = token_ids != keras_nlp_tokenizer.pad_token_id
-    keras_nlp_inputs = {
+    # KerasHub
+    token_ids = packer(keras_hub_tokenizer(input_str))
+    padding_mask = token_ids != keras_hub_tokenizer.pad_token_id
+    keras_hub_inputs = {
         "token_ids": token_ids,
         "padding_mask": padding_mask,
     }
-    keras_nlp_output = keras_nlp_model(keras_nlp_inputs)
+    keras_hub_output = keras_hub_model(keras_hub_inputs)
 
     # HF
     hf_inputs = hf_tokenizer(
@@ -261,14 +261,14 @@ def check_output(
     )
 
-    # Compare tokenized inputs. This should be a compete match.
+    # Compare tokenized inputs. This should be a complete match.
-    print("KerasNLP inputs:", keras_nlp_inputs)
+    print("KerasHub inputs:", keras_hub_inputs)
     print("HF inputs:", hf_inputs)
 
     # Compare outputs, this should match closely, though not exactly.
     hf_output = hf_output.last_hidden_state
-    print("KerasNLP output:", keras_nlp_output[0, 0, :5])
+    print("KerasHub output:", keras_hub_output[0, 0, :5])
     print("HF output:", hf_output[0, 0, :5])
-    difference = keras_nlp_output - hf_output
+    difference = keras_hub_output - hf_output
     difference_non_padding = tf.gather_nd(difference, tf.where(padding_mask))
     print("Difference:", np.mean(difference_non_padding))
 
@@ -281,12 +281,12 @@ def main(_):
     hf_tokenizer = transformers.AutoTokenizer.from_pretrained(hf_id)
     hf_model = transformers.TFAutoModel.from_pretrained(hf_id)
 
-    keras_nlp_tokenizer = extract_vocab(hf_tokenizer)
-    keras_nlp_model = convert_weights(hf_model)
+    keras_hub_tokenizer = extract_vocab(hf_tokenizer)
+    keras_hub_model = convert_weights(hf_model)
 
     check_output(
-        keras_nlp_model,
-        keras_nlp_tokenizer,
+        keras_hub_model,
+        keras_hub_tokenizer,
         hf_model,
         hf_tokenizer,
     )
diff --git a/tools/checkpoint_conversion/convert_pali_gemma_checkpoints.py b/tools/checkpoint_conversion/convert_pali_gemma_checkpoints.py
index 84b5a041e0..cad414e4c3 100644
--- a/tools/checkpoint_conversion/convert_pali_gemma_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_pali_gemma_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,7 +21,7 @@
 import keras  # noqa: E402
 from keras import ops  # noqa: E402
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_backbone import (  # noqa: E402
+from keras_hub.src.models.pali_gemma.pali_gemma_backbone import (  # noqa: E402
     PaliGemmaBackbone,
 )
 
diff --git a/tools/checkpoint_conversion/convert_pali_gemma_vit_checkpoints.py b/tools/checkpoint_conversion/convert_pali_gemma_vit_checkpoints.py
index 64122a6319..cc6bb3d5c0 100644
--- a/tools/checkpoint_conversion/convert_pali_gemma_vit_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_pali_gemma_vit_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,7 +18,7 @@
 from absl import app  # noqa: E402
 from keras import ops
 
-from keras_nlp.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVit
+from keras_hub.src.models.pali_gemma.pali_gemma_vit import PaliGemmaVit
 
 os.environ["KERAS_BACKEND"] = "jax"
 # No GPU for conversion, makes memory management easier.
diff --git a/tools/checkpoint_conversion/convert_phi3_checkpoints.py b/tools/checkpoint_conversion/convert_phi3_checkpoints.py
index ccd3f21d9e..3846ab8713 100644
--- a/tools/checkpoint_conversion/convert_phi3_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_phi3_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -26,10 +26,10 @@
 import torch  # noqa: E402
 import transformers  # noqa: E402
 
-from keras_nlp import upload_preset  # noqa: E402
-from keras_nlp.src.models import Phi3Backbone  # noqa: E402
-from keras_nlp.src.models import Phi3Preprocessor  # noqa: E402
-from keras_nlp.src.models import Phi3Tokenizer  # noqa: E402
+from keras_hub import upload_preset  # noqa: E402
+from keras_hub.src.models import Phi3Backbone  # noqa: E402
+from keras_hub.src.models import Phi3Preprocessor  # noqa: E402
+from keras_hub.src.models import Phi3Tokenizer  # noqa: E402
 
 PRESET_MAP = {
     "phi3_mini_4k_instruct_en": "microsoft/Phi-3-mini-4k-instruct",
@@ -226,7 +226,7 @@ def validate_output(
 
     hf_model_outputs = hf_model(**hf_model_input)[0]
 
-    # KerasNLP
+    # KerasHub
     keras_model_input = keras_preprocessor(
         ["<|user|>How to win?<|end|><|assistant|>"]
     )
@@ -236,7 +236,7 @@ def validate_output(
     keras_model_outputs = keras_model(keras_model_input)
 
     # Comparing the outputs.
-    print("🔶 KerasNLP output:", keras_model_outputs[0, 0, :10])
+    print("🔶 KerasHub output:", keras_model_outputs[0, 0, :10])
     print("🔶 HF output:", hf_model_outputs[0, 0, :10])
     print(
         "🔶 Difference:",
diff --git a/tools/checkpoint_conversion/convert_resnet_checkpoints.py b/tools/checkpoint_conversion/convert_resnet_checkpoints.py
index 5ac72874e8..eae4554256 100644
--- a/tools/checkpoint_conversion/convert_resnet_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_resnet_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -38,7 +38,7 @@
 from absl import app
 from absl import flags
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "resnet_18_imagenet": "timm/resnet18.a1_in1k",
@@ -54,7 +54,7 @@
 flags.DEFINE_string(
     "preset",
     None,
-    "Must be a valid `ResNet` preset from KerasNLP",
+    "Must be a valid `ResNet` preset from KerasHub",
     required=True,
 )
 flags.DEFINE_string(
@@ -104,8 +104,8 @@ def main(_):
     timm_model = timm.create_model(timm_name, pretrained=True)
     timm_model = timm_model.eval()
 
-    print("✅ Loaded KerasNLP model.")
-    keras_model = keras_nlp.models.ImageClassifier.from_preset(
+    print("✅ Loaded KerasHub model.")
+    keras_model = keras_hub.models.ImageClassifier.from_preset(
         "hf://" + timm_name,
     )
 
@@ -116,7 +116,7 @@ def main(_):
 
     upload_uri = FLAGS.upload_uri
     if upload_uri:
-        keras_nlp.upload_preset(uri=upload_uri, preset=f"./{preset}")
+        keras_hub.upload_preset(uri=upload_uri, preset=f"./{preset}")
         print(f"🏁 Preset uploaded to {upload_uri}")
 
 
diff --git a/tools/checkpoint_conversion/convert_roberta_checkpoints.py b/tools/checkpoint_conversion/convert_roberta_checkpoints.py
index 768a21260d..20ab51465e 100644
--- a/tools/checkpoint_conversion/convert_roberta_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_roberta_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -24,7 +24,7 @@
 from checkpoint_conversion_utils import get_md5_checksum
 from tensorflow import keras
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "roberta_base_en": ("roberta.base", "roberta-base"),
@@ -67,7 +67,7 @@ def download_model(size, hf_model_name):
 
 
 def convert_checkpoints(size):
-    print("\n-> Convert original weights to KerasNLP format.")
+    print("\n-> Convert original weights to KerasHub format.")
     # RoBERTa paths.
     extract_dir = EXTRACT_DIR.format(size)
     checkpoint_path = os.path.join(extract_dir, "model.pt")
@@ -92,15 +92,15 @@ def convert_checkpoints(size):
     }
     print("Config:", cfg)
 
-    keras_nlp_model = keras_nlp.models.RobertaBackbone.from_preset(
+    keras_hub_model = keras_hub.models.RobertaBackbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
     # Embedding Layer.
-    keras_nlp_model.get_layer("embeddings").token_embedding.embeddings.assign(
+    keras_hub_model.get_layer("embeddings").token_embedding.embeddings.assign(
         pt_model["decoder.sentence_encoder.embed_tokens.weight"].numpy()
     )
-    keras_nlp_model.get_layer(
+    keras_hub_model.get_layer(
         "embeddings"
     ).position_embedding.position_embeddings.assign(
         pt_model["decoder.sentence_encoder.embed_positions.weight"].numpy()[
@@ -109,10 +109,10 @@ def convert_checkpoints(size):
     )
 
     # Embedding LayerNorm.
-    keras_nlp_model.get_layer("embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").gamma.assign(
         pt_model["decoder.sentence_encoder.emb_layer_norm.weight"].numpy()
     )
-    keras_nlp_model.get_layer("embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").beta.assign(
         pt_model["decoder.sentence_encoder.emb_layer_norm.bias"].numpy()
     )
 
@@ -124,7 +124,7 @@ def convert_checkpoints(size):
     range_2 = (cfg["hidden_dim"], 2 * cfg["hidden_dim"])
     range_3 = (2 * cfg["hidden_dim"], 3 * cfg["hidden_dim"])
     # Transformer layers.
-    for i in range(keras_nlp_model.num_layers):
+    for i in range(keras_hub_model.num_layers):
         q_k_v_wts = (
             pt_model[
                 f"decoder.sentence_encoder.layers.{i}.self_attn.in_proj_weight"
@@ -141,42 +141,42 @@ def convert_checkpoints(size):
         )
 
         # Query
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             q_k_v_wts[:, range_1[0] : range_1[1]].reshape(
                 (cfg["hidden_dim"], cfg["num_heads"], -1)
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             q_k_v_bias[range_1[0] : range_1[1]].reshape((cfg["num_heads"], -1))
         )
 
         # Key
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             q_k_v_wts[:, range_2[0] : range_2[1]].reshape(
                 (cfg["hidden_dim"], cfg["num_heads"], -1)
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             q_k_v_bias[range_2[0] : range_2[1]].reshape((cfg["num_heads"], -1))
         )
 
         # Value
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             q_k_v_wts[:, range_3[0] : range_3[1]].reshape(
                 (cfg["hidden_dim"], cfg["num_heads"], -1)
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             q_k_v_bias[range_3[0] : range_3[1]].reshape((cfg["num_heads"], -1))
@@ -190,12 +190,12 @@ def convert_checkpoints(size):
             .numpy()
             .T
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             attn_output_wts.reshape((cfg["num_heads"], -1, cfg["hidden_dim"]))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             pt_model[
@@ -204,14 +204,14 @@ def convert_checkpoints(size):
         )
 
         # Attention LayerNorm
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(
             pt_model[
                 f"decoder.sentence_encoder.layers.{i}.self_attn_layer_norm.weight"
             ].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.beta.assign(
             pt_model[
@@ -220,42 +220,42 @@ def convert_checkpoints(size):
         )
 
         # Intermediate FF layer
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc1.weight"]
             .numpy()
             .T
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc1.bias"].numpy()
         )
 
         # Output dense layer
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc2.weight"]
             .numpy()
             .T
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc2.bias"].numpy()
         )
 
         # FF LayerNorm
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(
             pt_model[
                 f"decoder.sentence_encoder.layers.{i}.final_layer_norm.weight"
             ].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.beta.assign(
             pt_model[
@@ -264,10 +264,10 @@ def convert_checkpoints(size):
         )
 
     # Save the model.
-    print(f"\n-> Save KerasNLP model weights to `{FLAGS.preset}.h5`.")
-    keras_nlp_model.save_weights(f"{FLAGS.preset}.h5")
+    print(f"\n-> Save KerasHub model weights to `{FLAGS.preset}.h5`.")
+    keras_hub_model.save_weights(f"{FLAGS.preset}.h5")
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def define_preprocessor(hf_model_name, size):
@@ -276,11 +276,11 @@ def define_preprocessor(hf_model_name, size):
     vocabulary_path = os.path.join(extract_dir, "vocab.json")
     merges_path = os.path.join(extract_dir, "merges.txt")
 
-    keras_nlp_tokenizer = keras_nlp.models.RobertaTokenizer(
+    keras_hub_tokenizer = keras_hub.models.RobertaTokenizer(
         vocabulary=vocabulary_path, merges=merges_path
     )
-    keras_nlp_preprocessor = keras_nlp.models.RobertaTextClassifierPreprocessor(
-        keras_nlp_tokenizer
+    keras_hub_preprocessor = keras_hub.models.RobertaTextClassifierPreprocessor(
+        keras_hub_tokenizer
     )
 
     hf_tokenizer = transformers.AutoTokenizer.from_pretrained(hf_model_name)
@@ -289,21 +289,21 @@ def define_preprocessor(hf_model_name, size):
     print(f"`{vocabulary_path}` md5sum: ", get_md5_checksum(vocabulary_path))
     print(f"`{merges_path}` md5sum: ", get_md5_checksum(merges_path))
 
-    return keras_nlp_preprocessor, hf_tokenizer
+    return keras_hub_preprocessor, hf_tokenizer
 
 
 def check_output(
-    keras_nlp_model,
-    keras_nlp_preprocessor,
+    keras_hub_model,
+    keras_hub_preprocessor,
     hf_model,
     hf_tokenizer,
 ):
     print("\n-> Check the outputs.")
     input_str = ["the quick brown fox ran, galloped and jumped."]
 
-    # KerasNLP
-    keras_nlp_inputs = keras_nlp_preprocessor(tf.constant(input_str))
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)
+    # KerasHub
+    keras_hub_inputs = keras_hub_preprocessor(tf.constant(input_str))
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)
 
     # HF
     hf_inputs = hf_tokenizer(
@@ -311,14 +311,14 @@ def check_output(
     )
     hf_output = hf_model(**hf_inputs).last_hidden_state
 
-    print("KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("KerasHub output:", keras_hub_output[0, 0, :10])
     print("HF output:", hf_output[0, 0, :10])
-    print("Difference:", np.mean(keras_nlp_output - hf_output.detach().numpy()))
+    print("Difference:", np.mean(keras_hub_output - hf_output.detach().numpy()))
 
     # Show the MD5 checksum of the model weights.
     print("Model md5sum: ", get_md5_checksum(f"./{FLAGS.preset}.h5"))
 
-    return keras_nlp_output
+    return keras_hub_output
 
 
 def main(_):
@@ -330,19 +330,19 @@ def main(_):
 
     download_model(size, hf_model_name)
 
-    keras_nlp_model = convert_checkpoints(size)
+    keras_hub_model = convert_checkpoints(size)
 
     print("\n-> Load HF model.")
     hf_model = transformers.AutoModel.from_pretrained(hf_model_name)
     hf_model.eval()
 
-    keras_nlp_preprocessor, hf_tokenizer = define_preprocessor(
+    keras_hub_preprocessor, hf_tokenizer = define_preprocessor(
         hf_model_name, size
     )
 
     check_output(
-        keras_nlp_model,
-        keras_nlp_preprocessor,
+        keras_hub_model,
+        keras_hub_preprocessor,
         hf_model,
         hf_tokenizer,
     )
diff --git a/tools/checkpoint_conversion/convert_t5_checkpoints.py b/tools/checkpoint_conversion/convert_t5_checkpoints.py
index 2f9b74b86e..97cf874f73 100644
--- a/tools/checkpoint_conversion/convert_t5_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_t5_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -22,7 +22,7 @@
 from checkpoint_conversion_utils import get_md5_checksum
 from keras import ops
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "t5_small_multi": "t5-small",
@@ -43,7 +43,7 @@
 
 def extract_vocab(hf_tokenizer):
     proto_path = f"./{FLAGS.preset}/vocab.spm"
-    print(f"\n-> Save KerasNLP vocab to `{proto_path}`.")
+    print(f"\n-> Save KerasHub vocab to `{proto_path}`.")
 
     # Huggingface has a save_vocabulary function but it's not byte-for-byte
     # with the source. Instead copy the original downloaded file directly.
@@ -54,7 +54,7 @@ def extract_vocab(hf_tokenizer):
         proto_path,
     )
 
-    keras_tokenizer = keras_nlp.models.T5Tokenizer(
+    keras_tokenizer = keras_hub.models.T5Tokenizer(
         proto=proto_path,
     )
 
@@ -65,7 +65,7 @@ def extract_vocab(hf_tokenizer):
 
 
 def convert_checkpoints(hf_model):
-    keras_nlp_model = keras_nlp.models.T5Backbone.from_preset(
+    keras_hub_model = keras_hub.models.T5Backbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
@@ -73,44 +73,44 @@ def convert_checkpoints(hf_model):
     print("Original weights:")
     print(list(hf_wts.keys()))
 
-    for i in range(keras_nlp_model.num_layers):
+    for i in range(keras_hub_model.num_layers):
         for section in ["encoder", "decoder"]:
             n = 0
 
             # Token embedding layer
-            keras_nlp_model.get_layer("token_embedding").embeddings.assign(
+            keras_hub_model.get_layer("token_embedding").embeddings.assign(
                 hf_wts[f"{section}.embed_tokens.weight"]
             )
-            if not keras_nlp_model.tie_embedding_weights:
-                keras_nlp_model.get_layer(
+            if not keras_hub_model.tie_embedding_weights:
+                keras_hub_model.get_layer(
                     "token_embedding"
                 ).reverse_embeddings.assign(
                     hf_wts["lm_head.weight"].transpose(1, 0).numpy()
                 )
 
             # Query, key, value, and output projectors in self-attention
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).self_attention.query_projector.kernel.assign(
                 hf_wts[f"{section}.block.{i}.layer.{n}.SelfAttention.q.weight"]
                 .transpose(1, 0)
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).self_attention.key_projector.kernel.assign(
                 hf_wts[f"{section}.block.{i}.layer.{n}.SelfAttention.k.weight"]
                 .transpose(1, 0)
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).self_attention.value_projector.kernel.assign(
                 hf_wts[f"{section}.block.{i}.layer.{n}.SelfAttention.v.weight"]
                 .transpose(1, 0)
                 .numpy()
             )
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).self_attention.output_projector.kernel.assign(
                 hf_wts[f"{section}.block.{i}.layer.{n}.SelfAttention.o.weight"]
@@ -119,10 +119,10 @@ def convert_checkpoints(hf_model):
             )
 
             # Add relative attention bias
-            if keras_nlp_model.get_layer(
+            if keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).self_attention.use_relative_attention_bias:
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).self_attention.relative_attention_bias.assign(
                     hf_wts[
@@ -131,7 +131,7 @@ def convert_checkpoints(hf_model):
                 )
 
             # Self-attention norm
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).self_attention_layer_norm.weight.assign(
                 hf_wts[
@@ -144,7 +144,7 @@ def convert_checkpoints(hf_model):
 
             if section == "decoder":
                 # Cross-attention QKV and output proj (one between encoder and decoder)
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).cross_attention.query_projector.kernel.assign(
                     hf_wts[
@@ -153,7 +153,7 @@ def convert_checkpoints(hf_model):
                     .transpose(1, 0)
                     .numpy()
                 )
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).cross_attention.key_projector.kernel.assign(
                     hf_wts[
@@ -162,7 +162,7 @@ def convert_checkpoints(hf_model):
                     .transpose(1, 0)
                     .numpy()
                 )
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).cross_attention.value_projector.kernel.assign(
                     hf_wts[
@@ -171,7 +171,7 @@ def convert_checkpoints(hf_model):
                     .transpose(1, 0)
                     .numpy()
                 )
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).cross_attention.output_projector.kernel.assign(
                     hf_wts[
@@ -182,7 +182,7 @@ def convert_checkpoints(hf_model):
                 )
 
                 # Cross-attention layer norm
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).cross_attention_layer_norm.weight.assign(
                     hf_wts[
@@ -192,11 +192,11 @@ def convert_checkpoints(hf_model):
                 # Increment for next layer
                 n += 1
 
-            if keras_nlp_model.get_layer(
+            if keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).use_gated_activation:
                 # Input projection layer
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).input_projector.weights[0].assign(
                     hf_wts[
@@ -207,7 +207,7 @@ def convert_checkpoints(hf_model):
                 )
 
                 # Gated activation layer
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).gate_projector.weights[0].assign(
                     hf_wts[
@@ -218,7 +218,7 @@ def convert_checkpoints(hf_model):
                 )
             else:
                 # Input projection layer
-                keras_nlp_model.get_layer(
+                keras_hub_model.get_layer(
                     f"transformer_{section}_layer_{i}"
                 ).input_projector.weights[0].assign(
                     hf_wts[
@@ -229,7 +229,7 @@ def convert_checkpoints(hf_model):
                 )
 
             # Output projection layer
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).output_projector.weights[0].assign(
                 hf_wts[
@@ -240,7 +240,7 @@ def convert_checkpoints(hf_model):
             )
 
             # Layer norm
-            keras_nlp_model.get_layer(
+            keras_hub_model.get_layer(
                 f"transformer_{section}_layer_{i}"
             ).layer_norm.weight.assign(
                 hf_wts[
@@ -249,11 +249,11 @@ def convert_checkpoints(hf_model):
             )
 
             # Final normalization
-            keras_nlp_model.get_layer(f"{section}_output_layer_norm").weights[
+            keras_hub_model.get_layer(f"{section}_output_layer_norm").weights[
                 -1
             ].assign(hf_wts[f"{section}.final_layer_norm.weight"].numpy())
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def check_output(
@@ -268,8 +268,8 @@ def check_output(
 
     sequence_length = 12
 
-    # KerasNLP Tokenization
-    packer = keras_nlp.layers.StartEndPacker(
+    # KerasHub Tokenization
+    packer = keras_hub.layers.StartEndPacker(
         sequence_length=sequence_length,
         pad_value=keras_tokenizer.pad_token_id,
         end_value=keras_tokenizer.end_token_id,
@@ -306,7 +306,7 @@ def check_output(
     }
 
     # Compare tokenized inputs. This should be a compete match.
-    print("-> KerasNLP inputs:")
+    print("-> KerasHub inputs:")
     for k, v in keras_inputs.items():
         print(k, v)
     print("-> HF inputs:")
@@ -328,7 +328,7 @@ def check_output(
         hf_hidden_states, ops.where(decoder_padding_mask)
     )
 
-    print("-> KerasNLP output:", keras_outputs[0:5])
+    print("-> KerasHub output:", keras_outputs[0:5])
     print("-> HF output:", hf_outputs[0:5])
     np.testing.assert_allclose(
         keras_outputs.detach().numpy(), hf_outputs.detach().numpy(), atol=1e-5
@@ -343,7 +343,7 @@ def check_output(
         keras_hidden_states, reverse=True
     )
     hf_logits = hf_out.logits
-    print("-> KerasNLP logits:", keras_logits[0:5])
+    print("-> KerasHub logits:", keras_logits[0:5])
     print("-> HF logits:", hf_logits[0:5])
     np.testing.assert_allclose(
         keras_logits.detach().numpy(), hf_logits.detach().numpy(), atol=1e-3
@@ -366,7 +366,7 @@ def main(_):
 
     # Save the model.
     model_path = f"./{FLAGS.preset}/model.weights.h5"
-    print(f"\n-> Save KerasNLP model weights to `{model_path}`.")
+    print(f"\n-> Save KerasHub model weights to `{model_path}`.")
     keras_model.save_weights(model_path)
     print("-> Print MD5 checksum of the model weights files.")
     print(f"`{model_path}` md5sum: ", get_md5_checksum(model_path))
diff --git a/tools/checkpoint_conversion/convert_xlm_roberta_checkpoints.py b/tools/checkpoint_conversion/convert_xlm_roberta_checkpoints.py
index 5937141f0a..2da622c870 100644
--- a/tools/checkpoint_conversion/convert_xlm_roberta_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_xlm_roberta_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -23,7 +23,7 @@
 from checkpoint_conversion_utils import get_md5_checksum
 from tensorflow import keras
 
-import keras_nlp
+import keras_hub
 
 PRESET_MAP = {
     "xlm_roberta_base_multi": ("xlmr.base", "xlm-roberta-base"),
@@ -52,7 +52,7 @@ def download_model(size):
 
 
 def convert_checkpoints(size):
-    print("\n-> Convert original weights to KerasNLP format.")
+    print("\n-> Convert original weights to KerasHub format.")
     # XLM-RoBERTa paths.
     extract_dir = EXTRACT_DIR.format(size)
     checkpoint_path = os.path.join(extract_dir, "model.pt")
@@ -77,15 +77,15 @@ def convert_checkpoints(size):
     }
     print("Config:", cfg)
 
-    keras_nlp_model = keras_nlp.models.XLMRobertaBackbone.from_preset(
+    keras_hub_model = keras_hub.models.XLMRobertaBackbone.from_preset(
         FLAGS.preset, load_weights=False
     )
 
     # Embedding Layer.
-    keras_nlp_model.get_layer("embeddings").token_embedding.embeddings.assign(
+    keras_hub_model.get_layer("embeddings").token_embedding.embeddings.assign(
         pt_model["decoder.sentence_encoder.embed_tokens.weight"].numpy()
     )
-    keras_nlp_model.get_layer(
+    keras_hub_model.get_layer(
         "embeddings"
     ).position_embedding.position_embeddings.assign(
         pt_model["decoder.sentence_encoder.embed_positions.weight"].numpy()[
@@ -94,10 +94,10 @@ def convert_checkpoints(size):
     )
 
     # Embedding LayerNorm.
-    keras_nlp_model.get_layer("embeddings_layer_norm").gamma.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").gamma.assign(
         pt_model["decoder.sentence_encoder.emb_layer_norm.weight"].numpy()
     )
-    keras_nlp_model.get_layer("embeddings_layer_norm").beta.assign(
+    keras_hub_model.get_layer("embeddings_layer_norm").beta.assign(
         pt_model["decoder.sentence_encoder.emb_layer_norm.bias"].numpy()
     )
 
@@ -105,7 +105,7 @@ def convert_checkpoints(size):
     range_2 = (cfg["hidden_dim"], 2 * cfg["hidden_dim"])
     range_3 = (2 * cfg["hidden_dim"], 3 * cfg["hidden_dim"])
     # Transformer layers.
-    for i in range(keras_nlp_model.num_layers):
+    for i in range(keras_hub_model.num_layers):
         q_k_v_wts = (
             pt_model[
                 f"decoder.sentence_encoder.layers.{i}.self_attn.in_proj_weight"
@@ -122,42 +122,42 @@ def convert_checkpoints(size):
         )
 
         # Query
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.kernel.assign(
             q_k_v_wts[:, range_1[0] : range_1[1]].reshape(
                 (cfg["hidden_dim"], cfg["num_heads"], -1)
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._query_dense.bias.assign(
             q_k_v_bias[range_1[0] : range_1[1]].reshape((cfg["num_heads"], -1))
         )
 
         # Key
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.kernel.assign(
             q_k_v_wts[:, range_2[0] : range_2[1]].reshape(
                 (cfg["hidden_dim"], cfg["num_heads"], -1)
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._key_dense.bias.assign(
             q_k_v_bias[range_2[0] : range_2[1]].reshape((cfg["num_heads"], -1))
         )
 
         # Value
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.kernel.assign(
             q_k_v_wts[:, range_3[0] : range_3[1]].reshape(
                 (cfg["hidden_dim"], cfg["num_heads"], -1)
             )
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._value_dense.bias.assign(
             q_k_v_bias[range_3[0] : range_3[1]].reshape((cfg["num_heads"], -1))
@@ -171,12 +171,12 @@ def convert_checkpoints(size):
             .numpy()
             .T
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.kernel.assign(
             attn_output_wts.reshape((cfg["num_heads"], -1, cfg["hidden_dim"]))
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer._output_dense.bias.assign(
             pt_model[
@@ -185,14 +185,14 @@ def convert_checkpoints(size):
         )
 
         # Attention LayerNorm
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.gamma.assign(
             pt_model[
                 f"decoder.sentence_encoder.layers.{i}.self_attn_layer_norm.weight"
             ].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._self_attention_layer_norm.beta.assign(
             pt_model[
@@ -201,42 +201,42 @@ def convert_checkpoints(size):
         )
 
         # Intermediate FF layer
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.kernel.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc1.weight"]
             .numpy()
             .T
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_intermediate_dense.bias.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc1.bias"].numpy()
         )
 
         # Output dense layer
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.kernel.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc2.weight"]
             .numpy()
             .T
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_output_dense.bias.assign(
             pt_model[f"decoder.sentence_encoder.layers.{i}.fc2.bias"].numpy()
         )
 
         # FF LayerNorm
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.gamma.assign(
             pt_model[
                 f"decoder.sentence_encoder.layers.{i}.final_layer_norm.weight"
             ].numpy()
         )
-        keras_nlp_model.get_layer(
+        keras_hub_model.get_layer(
             f"transformer_layer_{i}"
         )._feedforward_layer_norm.beta.assign(
             pt_model[
@@ -245,10 +245,10 @@ def convert_checkpoints(size):
         )
 
     # Save the model.
-    print(f"\n-> Save KerasNLP model weights to `{FLAGS.preset}.h5`.")
-    keras_nlp_model.save_weights(f"{FLAGS.preset}.h5")
+    print(f"\n-> Save KerasHub model weights to `{FLAGS.preset}.h5`.")
+    keras_hub_model.save_weights(f"{FLAGS.preset}.h5")
 
-    return keras_nlp_model
+    return keras_hub_model
 
 
 def define_preprocessor(hf_model_name, size):
@@ -256,12 +256,12 @@ def define_preprocessor(hf_model_name, size):
     extract_dir = EXTRACT_DIR.format(size)
     spm_path = os.path.join(extract_dir, "sentencepiece.bpe.model")
 
-    keras_nlp_tokenizer = keras_nlp.models.XLMRobertaTokenizer(
+    keras_hub_tokenizer = keras_hub.models.XLMRobertaTokenizer(
         proto=spm_path,
     )
-    keras_nlp_preprocessor = (
-        keras_nlp.models.XLMRobertaTextClassifierPreprocessor(
-            keras_nlp_tokenizer
+    keras_hub_preprocessor = (
+        keras_hub.models.XLMRobertaTextClassifierPreprocessor(
+            keras_hub_tokenizer
         )
     )
 
@@ -270,21 +270,21 @@ def define_preprocessor(hf_model_name, size):
     print("\n-> Print MD5 checksum of the vocab files.")
     print(f"`{spm_path}` md5sum: ", get_md5_checksum(spm_path))
 
-    return keras_nlp_preprocessor, hf_tokenizer
+    return keras_hub_preprocessor, hf_tokenizer
 
 
 def check_output(
-    keras_nlp_model,
-    keras_nlp_preprocessor,
+    keras_hub_model,
+    keras_hub_preprocessor,
     hf_model,
     hf_tokenizer,
 ):
     print("\n-> Check the outputs.")
     input_str = ["the quick brown fox ran, galloped and jumped."]
 
-    # KerasNLP
-    keras_nlp_inputs = keras_nlp_preprocessor(tf.constant(input_str))
-    keras_nlp_output = keras_nlp_model.predict(keras_nlp_inputs)
+    # KerasHub
+    keras_hub_inputs = keras_hub_preprocessor(tf.constant(input_str))
+    keras_hub_output = keras_hub_model.predict(keras_hub_inputs)
 
     # HF
     hf_inputs = hf_tokenizer(
@@ -292,14 +292,14 @@ def check_output(
     )
     hf_output = hf_model(**hf_inputs).last_hidden_state
 
-    print("KerasNLP output:", keras_nlp_output[0, 0, :10])
+    print("KerasHub output:", keras_hub_output[0, 0, :10])
     print("HF output:", hf_output[0, 0, :10])
-    print("Difference:", np.mean(keras_nlp_output - hf_output.detach().numpy()))
+    print("Difference:", np.mean(keras_hub_output - hf_output.detach().numpy()))
 
     # Show the MD5 checksum of the model weights.
     print("Model md5sum: ", get_md5_checksum(f"./{FLAGS.preset}.h5"))
 
-    return keras_nlp_output
+    return keras_hub_output
 
 
 def main(_):
@@ -311,19 +311,19 @@ def main(_):
 
     download_model(size)
 
-    keras_nlp_model = convert_checkpoints(size)
+    keras_hub_model = convert_checkpoints(size)
 
     print("\n-> Load HF model.")
     hf_model = transformers.AutoModel.from_pretrained(hf_model_name)
     hf_model.eval()
 
-    keras_nlp_preprocessor, hf_tokenizer = define_preprocessor(
+    keras_hub_preprocessor, hf_tokenizer = define_preprocessor(
         hf_model_name, size
     )
 
     check_output(
-        keras_nlp_model,
-        keras_nlp_preprocessor,
+        keras_hub_model,
+        keras_hub_preprocessor,
         hf_model,
         hf_tokenizer,
     )
diff --git a/tools/checkpoint_conversion/convert_xlnet_checkpoints.py b/tools/checkpoint_conversion/convert_xlnet_checkpoints.py
index 6c36a626be..f43a130463 100644
--- a/tools/checkpoint_conversion/convert_xlnet_checkpoints.py
+++ b/tools/checkpoint_conversion/convert_xlnet_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -22,7 +22,7 @@
 from transformers import TFXLNetModel
 from transformers import XLNetTokenizer
 
-from keras_nlp.models import XLNetBackbone
+from keras_hub.models import XLNetBackbone
 
 check_mems = False
 
@@ -55,7 +55,7 @@
 del tokenized_knlp["input_ids"]
 del tokenized_knlp["token_type_ids"]
 
-# create keras_nlp model
+# create keras_hub model
 knlp_model = XLNetBackbone(
     vocabulary_size=hf_model.config.vocab_size,
     num_layers=hf_model.config.n_layer,
@@ -65,7 +65,7 @@
     dropout=0.0,
     kernel_initializer_range=hf_model.config.initializer_range,
 )
-# Load weights for keras_nlp model
+# Load weights for keras_hub model
 file_hf = h5py.File("./tf_weights.h5", "r")
 
 try:
diff --git a/tools/checkpoint_training/bert_tiny_uncased_en_sst2_training.ipynb b/tools/checkpoint_training/bert_tiny_uncased_en_sst2_training.ipynb
index 218a40f9ed..0ef7a128c8 100644
--- a/tools/checkpoint_training/bert_tiny_uncased_en_sst2_training.ipynb
+++ b/tools/checkpoint_training/bert_tiny_uncased_en_sst2_training.ipynb
@@ -19,7 +19,7 @@
   {
    "cell_type": "markdown",
    "source": [
-    "# keras-nlp installation"
+    "# keras-hub installation"
    ],
    "metadata": {
     "id": "FKTVkreu3MGG"
@@ -60,7 +60,7 @@
   {
    "cell_type": "code",
    "source": [
-    "import keras_nlp\n",
+    "import keras_hub\n",
     "import tensorflow as tf\n",
     "from tensorflow import keras\n",
     "import tensorflow_datasets as tfds"
@@ -171,9 +171,9 @@
    "cell_type": "code",
    "source": [
     "# Create the TextClassifier Model\n",
-    "# For more details please look https://keras.io/guides/keras_nlp/getting_started/\n",
+    "# For more details please look https://keras.io/guides/keras_hub/getting_started/\n",
     "\n",
-    "classifier = keras_nlp.models.BertTextClassifier.from_preset(\n",
+    "classifier = keras_hub.models.BertTextClassifier.from_preset(\n",
     "    \"bert_tiny_en_uncased\", num_classes=2, dropout=0.1\n",
     ")\n",
     "\n",
diff --git a/tools/convert_legacy_presets.py b/tools/convert_legacy_presets.py
index e6a8e39cf4..1bdf4661f9 100644
--- a/tools/convert_legacy_presets.py
+++ b/tools/convert_legacy_presets.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -24,10 +24,10 @@
 
 os.environ["KERAS_HOME"] = os.getcwd()
 
-from keras_nlp import models  # noqa: E402
-from keras_nlp.src.utils.preset_utils import save_to_preset  # noqa: E402
+from keras_hub import models  # noqa: E402
+from keras_hub.src.utils.preset_utils import save_to_preset  # noqa: E402
 
-BUCKET = "keras-nlp-kaggle"
+BUCKET = "keras-hub-kaggle"
 
 
 def to_snake_case(name):
diff --git a/tools/count_preset_params.py b/tools/count_preset_params.py
index 48f87a971e..2bb3745ba8 100644
--- a/tools/count_preset_params.py
+++ b/tools/count_preset_params.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -27,7 +27,7 @@
 from keras.utils.layer_utils import count_params
 from tensorflow import keras
 
-import keras_nlp
+import keras_hub
 
 FLAGS = flags.FLAGS
 flags.DEFINE_string("model", None, "The name of a model, e.g. BertBackbone.")
@@ -37,7 +37,7 @@
 
 
 def main(_):
-    for name, symbol in keras_nlp.models.__dict__.items():
+    for name, symbol in keras_hub.models.__dict__.items():
         if inspect.isclass(symbol) and issubclass(symbol, keras.Model):
             if FLAGS.model and name != FLAGS.model:
                 continue
diff --git a/tools/gemma/export_gemma_to_hf.py b/tools/gemma/export_gemma_to_hf.py
index 6f1fdf24d2..f0e4641bf3 100644
--- a/tools/gemma/export_gemma_to_hf.py
+++ b/tools/gemma/export_gemma_to_hf.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -20,7 +20,7 @@
 from absl import app
 from absl import flags
 
-import keras_nlp
+import keras_hub
 
 os.environ["KERAS_BACKEND"] = "torch"
 
@@ -134,28 +134,28 @@ def _set_default_tensor_type(dtype: torch.dtype):
 def convert_checkpoints(preset, weights_file, size, output_dir, vocab_path):
     if preset is not None:
         hf_id = PRESET_MAP[preset]
-        print(f"\n-> Loading KerasNLP Gemma model with preset `{preset}`...")
-        keras_nlp_model = keras_nlp.models.GemmaCausalLM.from_preset(preset)
+        print(f"\n-> Loading KerasHub Gemma model with preset `{preset}`...")
+        keras_hub_model = keras_hub.models.GemmaCausalLM.from_preset(preset)
     else:
         hf_id, keras_preset = SIZE_MAP[size.lower()]
         print(f"\n-> Loading Keras weights from file `{weights_file}`...")
-        keras_nlp_model = keras_nlp.models.GemmaCausalLM.from_preset(
+        keras_hub_model = keras_hub.models.GemmaCausalLM.from_preset(
             keras_preset
         )
-        keras_nlp_model.load_weights(weights_file)
+        keras_hub_model.load_weights(weights_file)
 
     print(f"\n-> Loading HuggingFace Gemma `{size.upper()}` model...")
     hf_model = transformers.GemmaForCausalLM(CONFIG_MAPPING[size.lower()])
 
     print("\n✅ Model loading complete.")
-    print("\n-> Converting weights from KerasNLP Gemma to HuggingFace Gemma...")
+    print("\n-> Converting weights from KerasHub Gemma to HuggingFace Gemma...")
 
     # Token embedding (with vocab size difference handling)
-    keras_embedding = keras_nlp_model.backbone.token_embedding.weights[0]
+    keras_embedding = keras_hub_model.backbone.token_embedding.weights[0]
     hf_vocab_size = hf_model.model.embed_tokens.weight.shape[0]
-    keras_nlp_vocab_size = keras_embedding.value.shape[0]
-    if hf_vocab_size < keras_nlp_vocab_size:
-        diff = keras_nlp_vocab_size - hf_vocab_size
+    keras_hub_vocab_size = keras_embedding.value.shape[0]
+    if hf_vocab_size < keras_hub_vocab_size:
+        diff = keras_hub_vocab_size - hf_vocab_size
         update_state_dict(
             hf_model.model.embed_tokens,
             "weight",
@@ -169,8 +169,8 @@ def convert_checkpoints(preset, weights_file, size, output_dir, vocab_path):
         )
 
     # Decoder blocks
-    for i in range(keras_nlp_model.backbone.num_layers):
-        decoder_block = keras_nlp_model.backbone.get_layer(f"decoder_block_{i}")
+    for i in range(keras_hub_model.backbone.num_layers):
+        decoder_block = keras_hub_model.backbone.get_layer(f"decoder_block_{i}")
 
         # Pre-attention norm
         update_state_dict(
@@ -247,7 +247,7 @@ def convert_checkpoints(preset, weights_file, size, output_dir, vocab_path):
     update_state_dict(
         hf_model.model.norm,
         "weight",
-        keras_nlp_model.backbone.layers[-1].weights[0].value,
+        keras_hub_model.backbone.layers[-1].weights[0].value,
     )
 
     print("\n✅ Weights converted successfully.")
@@ -264,14 +264,14 @@ def convert_checkpoints(preset, weights_file, size, output_dir, vocab_path):
     if not vocab_path:
         tokenizer_preset = preset or SIZE_MAP[size.lower()]
         print(
-            "\n-> Loading KerasNLP Gemma tokenizer with "
+            "\n-> Loading KerasHub Gemma tokenizer with "
             f"preset `{tokenizer_preset}`..."
         )
-        keras_nlp_tokenizer = keras_nlp.models.GemmaTokenizer.from_preset(
+        keras_hub_tokenizer = keras_hub.models.GemmaTokenizer.from_preset(
             tokenizer_preset
         )
         # Save tokenizer state
-        keras_nlp_tokenizer.save_assets(output_dir)
+        keras_hub_tokenizer.save_assets(output_dir)
         vocab_path = os.path.join(output_dir, "vocabulary.spm")
         print("\n✅ Tokenizer loading complete.")
 
diff --git a/tools/gemma/export_gemma_to_torch_xla.py b/tools/gemma/export_gemma_to_torch_xla.py
index 08d4b3ac98..d4ca432d8d 100644
--- a/tools/gemma/export_gemma_to_torch_xla.py
+++ b/tools/gemma/export_gemma_to_torch_xla.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -33,7 +33,7 @@
 from absl import flags
 from gemma import model_xla as gemma_model
 
-import keras_nlp
+import keras_hub
 
 os.environ["KERAS_BACKEND"] = "torch"
 
@@ -66,7 +66,7 @@
 functionality of the converted checkpoint:
 
 ```
-python keras-nlp-gemma/tools/gemma/run_gemma_xla.py \
+python keras-hub-gemma/tools/gemma/run_gemma_xla.py \
   --size 2b \
   --checkpoint_file fine_tuned_imdb.ckpt \
   --vocab_file gemma_tokenizer/vocabulary.spm \
@@ -146,8 +146,8 @@ def convert_checkpoints(preset, weights_file, size, output_file, vocab_dir):
         model = gemma_model.GemmaForCausalLM(
             PRESET_MAP[preset], world_size=1, rank=0, device=device
         )
-        print(f"\n-> Loading KerasNLP Gemma model with preset `{preset}`...")
-        keras_nlp_model = keras_nlp.models.GemmaCausalLM.from_preset(preset)
+        print(f"\n-> Loading KerasHub Gemma model with preset `{preset}`...")
+        keras_hub_model = keras_hub.models.GemmaCausalLM.from_preset(preset)
     else:
         print(f"\n-> Loading PyTorch Gemma model config for `{size}` model...")
         config, size_preset = SIZE_MAP[size.lower()]
@@ -155,20 +155,20 @@ def convert_checkpoints(preset, weights_file, size, output_file, vocab_dir):
             config, world_size=1, rank=0, device=device
         )
         print(f"\n-> Loading Keras weights from file `{weights_file}`...")
-        keras_nlp_model = keras_nlp.models.GemmaCausalLM.from_preset(
+        keras_hub_model = keras_hub.models.GemmaCausalLM.from_preset(
             size_preset
         )
-        keras_nlp_model.load_weights(weights_file)
+        keras_hub_model.load_weights(weights_file)
 
     print("\n✅ Model loading complete.")
-    print("\n-> Converting weights from KerasNLP Gemma to PyTorch Gemma...")
+    print("\n-> Converting weights from KerasHub Gemma to PyTorch Gemma...")
 
     # Token embedding (with vocab size difference handling)
-    keras_embedding = keras_nlp_model.backbone.token_embedding.weights[0]
+    keras_embedding = keras_hub_model.backbone.token_embedding.weights[0]
     torch_vocab_size = model.embedder.weight.shape[0]
-    keras_nlp_vocab_size = keras_embedding.value.shape[0]
-    if torch_vocab_size < keras_nlp_vocab_size:
-        diff = keras_nlp_vocab_size - torch_vocab_size
+    keras_hub_vocab_size = keras_embedding.value.shape[0]
+    if torch_vocab_size < keras_hub_vocab_size:
+        diff = keras_hub_vocab_size - torch_vocab_size
         update_state_dict(
             model.embedder,
             "weight",
@@ -182,8 +182,8 @@ def convert_checkpoints(preset, weights_file, size, output_file, vocab_dir):
         )
 
     # Decoder blocks
-    for i in range(keras_nlp_model.backbone.num_layers):
-        decoder_block = keras_nlp_model.backbone.get_layer(f"decoder_block_{i}")
+    for i in range(keras_hub_model.backbone.num_layers):
+        decoder_block = keras_hub_model.backbone.get_layer(f"decoder_block_{i}")
         # Pre-attention norm
         update_state_dict(
             model.model.layers[i].input_layernorm,
@@ -246,7 +246,7 @@ def convert_checkpoints(preset, weights_file, size, output_file, vocab_dir):
     update_state_dict(
         model.model.norm,
         "weight",
-        keras_nlp_model.backbone.layers[-1].weights[0].value,
+        keras_hub_model.backbone.layers[-1].weights[0].value,
     )
 
     print("\n✅ Weights converted successfully.")
@@ -262,9 +262,9 @@ def convert_checkpoints(preset, weights_file, size, output_file, vocab_dir):
     if preset is not None:
         # Tokenizer
         print(
-            f"\n-> Loading KerasNLP Gemma tokenizer with preset `{preset}`..."
+            f"\n-> Loading KerasHub Gemma tokenizer with preset `{preset}`..."
         )
-        keras_nlp_tokenizer = keras_nlp.models.GemmaTokenizer.from_preset(
+        keras_hub_tokenizer = keras_hub.models.GemmaTokenizer.from_preset(
             preset
         )
         print("\n✅ Model loading complete.")
@@ -272,7 +272,7 @@ def convert_checkpoints(preset, weights_file, size, output_file, vocab_dir):
 
         # Save tokenizer state
         os.makedirs(vocab_dir, exist_ok=True)
-        keras_nlp_tokenizer.save_assets(vocab_dir)
+        keras_hub_tokenizer.save_assets(vocab_dir)
 
         print(
             "\n✅ Saving complete. Tokenizer state "
diff --git a/tools/gemma/run_gemma_xla.py b/tools/gemma/run_gemma_xla.py
index f212154c99..57987db60c 100644
--- a/tools/gemma/run_gemma_xla.py
+++ b/tools/gemma/run_gemma_xla.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -53,7 +53,7 @@
 vocabulary file, and test prompt.
 
 ```
-python keras-nlp-gemma/tools/gemma/run_gemma_xla.py \
+python keras-hub-gemma/tools/gemma/run_gemma_xla.py \
   --size 2b \
   --checkpoint_file fine_tuned_imdb.ckpt \
   --vocab_file gemma_tokenizer/vocabulary.spm \
@@ -72,7 +72,7 @@
 the associated preset name:
 
 ```
-python keras-nlp-gemma/tools/gemma/run_gemma_xla.py \
+python keras-hub-gemma/tools/gemma/run_gemma_xla.py \
     --preset gemma_2b_en \
     --checkpoint_file gemma_2b.ckpt \
     --prompt "California is the largest"
diff --git a/tools/glue.py b/tools/glue.py
index 50f9852fd7..a43b5badf6 100644
--- a/tools/glue.py
+++ b/tools/glue.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -23,7 +23,7 @@
 from absl import flags
 from tensorflow import keras
 
-import keras_nlp
+import keras_hub
 
 FLAGS = flags.FLAGS
 
@@ -228,7 +228,7 @@ def main(_):
     train_ds, test_ds, val_ds, idx_order = load_data(FLAGS.task_name)
     # ----- Custom code block starts -----
     bert_preprocessor = (
-        keras_nlp.models.BertTextClassifierPreprocessor.from_preset(
+        keras_hub.models.BertTextClassifierPreprocessor.from_preset(
             "bert_base_en_uncased"
         )
     )
@@ -272,10 +272,10 @@ def preprocess_fn(feature, label):
             # Commonly the classifier is simply your model + several dense layers,
             # please refer to "Make the Finetuning Model" section in README for
             # detailed instructions.
-            bert_model = keras_nlp.models.BertBackbone.from_preset(
+            bert_model = keras_hub.models.BertBackbone.from_preset(
                 "bert_base_en_uncased"
             )
-            finetuning_model = keras_nlp.models.BertTextClassifier(
+            finetuning_model = keras_hub.models.BertTextClassifier(
                 backbone=bert_model,
                 num_classes=num_classes,
             )
diff --git a/tools/pretrained_tokenizers/word_piece_cleaning_script.py b/tools/pretrained_tokenizers/word_piece_cleaning_script.py
index 7ee257ac81..9c4f8bec96 100644
--- a/tools/pretrained_tokenizers/word_piece_cleaning_script.py
+++ b/tools/pretrained_tokenizers/word_piece_cleaning_script.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/pretrained_tokenizers/word_piece_training_script.py b/tools/pretrained_tokenizers/word_piece_training_script.py
index 485e30a868..5b82290341 100644
--- a/tools/pretrained_tokenizers/word_piece_training_script.py
+++ b/tools/pretrained_tokenizers/word_piece_training_script.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -14,7 +14,7 @@
 import os
 import time
 
-import keras_nlp
+import keras_hub
 
 # List directories of parsed Wikipedia articles and vocab sizes
 directories = [
@@ -45,7 +45,7 @@
             raise ValueError("already done.")
 
         start = time.time()
-        keras_nlp.tokenizers.compute_word_piece_vocabulary(
+        keras_hub.tokenizers.compute_word_piece_vocabulary(
             files,
             vocabulary_size=vocab_size,
             lowercase=False,
diff --git a/tools/quantize_checkpoints.py b/tools/quantize_checkpoints.py
index 68f5e57ba4..1f28bb35d9 100644
--- a/tools/quantize_checkpoints.py
+++ b/tools/quantize_checkpoints.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -24,7 +24,7 @@
 from absl import app
 from absl import flags
 
-import keras_nlp
+import keras_hub
 
 FLAGS = flags.FLAGS
 
@@ -32,7 +32,7 @@
 flags.DEFINE_string(
     "preset",
     None,
-    "Must be a valid `CausalLM` preset from KerasNLP",
+    "Must be a valid `CausalLM` preset from KerasHub",
     required=True,
 )
 flags.DEFINE_string(
@@ -49,7 +49,7 @@ def validate_output(causal_lm):
 
     keras_output = causal_lm.generate([input_str], max_length=length)
     keras_output = keras_output[0]
-    print("🔶 KerasNLP output:", keras_output)
+    print("🔶 KerasHub output:", keras_output)
 
 
 def main(_):
@@ -59,7 +59,7 @@ def main(_):
 
     keras.config.set_floatx("bfloat16")
 
-    causal_lm = keras_nlp.models.CausalLM.from_preset(preset, dtype="bfloat16")
+    causal_lm = keras_hub.models.CausalLM.from_preset(preset, dtype="bfloat16")
     backbone = causal_lm.backbone
     tokenizer = causal_lm.preprocessor.tokenizer
 
@@ -76,7 +76,7 @@ def main(_):
     print(f"🏁 Preset saved to ./{quantized_preset}")
 
     if upload_uri:
-        keras_nlp.upload_preset(uri=upload_uri, preset=quantized_preset)
+        keras_hub.upload_preset(uri=upload_uri, preset=quantized_preset)
         print(f"🏁 Preset uploaded to {upload_uri}")
 
 
diff --git a/tools/sentencepiece_testing/__init__.py b/tools/sentencepiece_testing/__init__.py
index 3364a6bd16..fd48fde00f 100644
--- a/tools/sentencepiece_testing/__init__.py
+++ b/tools/sentencepiece_testing/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_albert_test_proto.py b/tools/sentencepiece_testing/create_albert_test_proto.py
index 5799734424..d7a85d3778 100644
--- a/tools/sentencepiece_testing/create_albert_test_proto.py
+++ b/tools/sentencepiece_testing/create_albert_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_deberta_v3_test_proto.py b/tools/sentencepiece_testing/create_deberta_v3_test_proto.py
index c9562f0f43..ddafb33d4d 100644
--- a/tools/sentencepiece_testing/create_deberta_v3_test_proto.py
+++ b/tools/sentencepiece_testing/create_deberta_v3_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_f_net_test_proto.py b/tools/sentencepiece_testing/create_f_net_test_proto.py
index a836a3c9c8..9dcbb930e4 100644
--- a/tools/sentencepiece_testing/create_f_net_test_proto.py
+++ b/tools/sentencepiece_testing/create_f_net_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_gemma_test_proto.py b/tools/sentencepiece_testing/create_gemma_test_proto.py
index 8031edc5d5..f8ffc078cc 100644
--- a/tools/sentencepiece_testing/create_gemma_test_proto.py
+++ b/tools/sentencepiece_testing/create_gemma_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_llama_test_proto.py b/tools/sentencepiece_testing/create_llama_test_proto.py
index 6787f320ac..2028daa1f6 100644
--- a/tools/sentencepiece_testing/create_llama_test_proto.py
+++ b/tools/sentencepiece_testing/create_llama_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_mistral_test_proto.py b/tools/sentencepiece_testing/create_mistral_test_proto.py
index 8d8fb27a39..682ca1fbdf 100644
--- a/tools/sentencepiece_testing/create_mistral_test_proto.py
+++ b/tools/sentencepiece_testing/create_mistral_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_no_special_token_proto.py b/tools/sentencepiece_testing/create_no_special_token_proto.py
index 38af2cdc58..0706b7aca5 100644
--- a/tools/sentencepiece_testing/create_no_special_token_proto.py
+++ b/tools/sentencepiece_testing/create_no_special_token_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_phi3_test_proto.py b/tools/sentencepiece_testing/create_phi3_test_proto.py
index a9e02ff92e..eb4375c8af 100644
--- a/tools/sentencepiece_testing/create_phi3_test_proto.py
+++ b/tools/sentencepiece_testing/create_phi3_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -30,7 +30,7 @@
 def add_added_tokens(filename):
     with open(
         pathlib.Path(__file__).parent.parent.parent
-        / "keras_nlp"
+        / "keras_hub"
         / "src"
         / "tests"
         / "test_data"
@@ -47,7 +47,7 @@ def add_added_tokens(filename):
         model_proto.pieces.append(new_token)
     with open(
         pathlib.Path(__file__).parent.parent.parent
-        / "keras_nlp"
+        / "keras_hub"
         / "src"
         / "tests"
         / "test_data"
diff --git a/tools/sentencepiece_testing/create_sentence_piece_tokenizer_proto.py b/tools/sentencepiece_testing/create_sentence_piece_tokenizer_proto.py
index d2abbc67a1..40f5ce88bf 100644
--- a/tools/sentencepiece_testing/create_sentence_piece_tokenizer_proto.py
+++ b/tools/sentencepiece_testing/create_sentence_piece_tokenizer_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_t5_test_proto.py b/tools/sentencepiece_testing/create_t5_test_proto.py
index 0133d6cff8..7e4a9c3984 100644
--- a/tools/sentencepiece_testing/create_t5_test_proto.py
+++ b/tools/sentencepiece_testing/create_t5_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/create_xlm_roberta_test_proto.py b/tools/sentencepiece_testing/create_xlm_roberta_test_proto.py
index f59f5339d6..82751d8d1a 100644
--- a/tools/sentencepiece_testing/create_xlm_roberta_test_proto.py
+++ b/tools/sentencepiece_testing/create_xlm_roberta_test_proto.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
diff --git a/tools/sentencepiece_testing/utils.py b/tools/sentencepiece_testing/utils.py
index b6c0a84261..646c9854c1 100644
--- a/tools/sentencepiece_testing/utils.py
+++ b/tools/sentencepiece_testing/utils.py
@@ -1,4 +1,4 @@
-# Copyright 2024 The KerasNLP Authors
+# Copyright 2024 The KerasHub Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -24,7 +24,7 @@ def train_sentencepiece(data, filename, *args, **kwargs):
     )
     with open(
         pathlib.Path(__file__).parent.parent.parent
-        / "keras_nlp"
+        / "keras_hub"
         / "src"
         / "tests"
         / "test_data"