
Dockerize API and Integrate BERT/FastText Models for Easy Deployment #1

Open · 4 tasks
ajamous opened this issue on Feb 12, 2024 · 3 comments
Labels: good first issue (Good for newcomers)

ajamous (Collaborator) commented on Feb 12, 2024

Description

We aim to simplify the deployment process of our FastAPI application, which serves as an interface to our BERT and FastText models for SMS classification. The current setup process is manual and requires several steps, including setting up Python environments, installing dependencies, and loading pre-trained models. To enhance usability and facilitate a smoother setup for developers and users alike, we propose dockerizing the application along with the BERT and FastText models.

Objective

  • Dockerize the FastAPI application: Package the FastAPI application into a Docker container, encapsulating its dependencies and runtime environment to ensure consistency across different setups.
  • Integrate OTS BERT and FastText models: Ensure the Docker container has access to the pre-trained BERT and FastText models, enabling the API to perform SMS classification without additional setup.
  • Simplify deployment: Allow users to deploy the application with minimal setup, ideally with a single command or a few simple steps.

Tasks

  • Create a Dockerfile that specifies the environment, installs dependencies, and sets up the application (see the sketch after this list).
  • Ensure the Docker container can access the BERT and FastText models, either by bundling them with the container or by implementing a mechanism to load them on startup.
  • Write documentation explaining how to build and run the Docker container, including how to access the API and perform classifications.
  • Test the Docker setup on different platforms (e.g., Linux, macOS, Windows) to ensure compatibility and ease of use.
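
A minimal sketch of what such a Dockerfile could look like. The layout (app/main.py, models/, requirements.txt) is assumed for illustration and is not necessarily the repo's actual structure; python:3.11-slim is used rather than Alpine because PyTorch does not publish musl wheels:

# Sketch only; paths and file names are assumptions, not the repo's actual layout.
FROM python:3.11-slim

WORKDIR /ots

# Install dependencies first so this layer is cached between builds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Bundle the application code and the pre-trained BERT/FastText models.
COPY app/ app/
COPY models/ models/

EXPOSE 8002

# Serve the FastAPI app (assumes uvicorn is in requirements.txt) on the port used elsewhere in this issue.
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8002"]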

Requirements

  • The Docker container should be based on a lightweight, secure base image (e.g., Python Alpine).
  • Include comments in the Dockerfile and documentation to guide new users through the Docker setup and deployment process.
  • Ensure the API's performance and responsiveness are not negatively impacted by containerization.
  • Consider security best practices for Docker deployment, especially regarding data handling and network configurations.

Discussion Points

  • Model Storage: Discuss whether to bundle the models within the Docker image or to download them dynamically upon container startup. Consider trade-offs in terms of image size, startup time, and flexibility.
  • Configuration Management: Explore options for configuring the API and models within the Docker environment, possibly using environment variables or external configuration files (see the sketch below).
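
As a concrete sketch of the environment-variable route, configuration could be read once at startup; the OTS_* variable names below are hypothetical, not an existing interface:

import os

# Overridable at deploy time, e.g. `docker run -e OTS_MODELS_DIR=/models ...`
MODELS_DIR = os.environ.get("OTS_MODELS_DIR", "/ots/models")            # where bundled model files live
DEFAULT_MODEL = os.environ.get("OTS_DEFAULT_MODEL", "bert")             # "bert" or "fasttext"
DOWNLOAD_ON_START = os.environ.get("OTS_DOWNLOAD_MODELS", "0") == "1"   # dynamic fetch vs. bundled, per the Model Storage question above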

Contributions

Contributions are welcome, and this issue serves as a starting point for discussion, planning, and implementation. If you have experience with Docker, Python environments, or have insights into deploying machine learning models in production, your input would be highly valued.

ajamous added the good first issue label on Feb 12, 2024
ajamous (Collaborator, Author) commented on Sep 13, 2024

Update on Dockerization Plan

Hello everyone,

We wanted to provide an update on our plans for dockerizing OTS (Open Text Shield). After reviewing our objectives and considering community feedback, we've decided to expand the scope:

  • Complete Dockerization: We're now aiming to dockerize the entire OTS, including both the API and the pre-trained model (latest version 2.1). This means you can have the Docker container up and running with just a few commands, ready to start processing requests immediately.

  • Ease of Deployment for Telcos and Solution Providers: This enhancement enables Telcos and Solution Providers to easily deploy OTS within their own networks. Whether you want to complement your existing rules-based firewall to make it smarter with OTS or deploy OTS as a standalone messaging firewall, the process will be straightforward.

Model Retraining Support

  • Custom Training: If you wish to retrain the model, you can do so within the Docker environment. You'll need to allocate additional resources and make some configuration changes to the training script.

  • Hardware Optimization:

    • Apple Silicon (M1 to M4): The training script is designed to auto-detect and leverage Apple Silicon chips for improved performance (see the sketch after this list).
    • Other GPUs: If you're using another type of GPU, you may need to make some code changes to fully utilize it. We're open to contributions that help automate this for various hardware setups.
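
For reference, PyTorch device auto-detection along these lines would prefer Apple Silicon's MPS backend, then CUDA, then CPU; whether the actual training script does exactly this is not shown in this issue, so treat it as a sketch:

import torch

def pick_device() -> torch.device:
    """Prefer Apple Silicon (MPS), then NVIDIA (CUDA), then CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

print(pick_device())  # e.g. "mps" on an M1-M4 Mac

Other accelerator types would need their own branch here, which is where community contributions could help.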

Code Refactoring

  • Current Focus: Our primary focus remains on training and research to improve OTS's capabilities.
  • Future Plans: Code refactoring is on our roadmap but isn't a priority right now. We acknowledge that the codebase could be cleaner, and we welcome contributions from anyone interested in improving it.

Next Steps

We'll be updating the repository soon with these changes, along with detailed documentation to guide you through the setup and deployment process.

Community Contributions

We highly value community input and contributions. If you have experience with Docker, GPU optimization, or code refactoring, we'd love to hear from you.

Thank you for your continued support!

ajamous added three commits that referenced this issue on Sep 14, 2024
ajamous (Collaborator, Author) commented on Sep 14, 2024

Good news! OTS is now dockerized with the pre-trained models bundled in. You can deploy a working OTS in just a few minutes and start making predictions on any text you provide.

Quick Start with Docker

Setting up Open Text Shield is quick and easy with Docker. Follow these steps to get started:

1. Pull the Latest Docker Image

docker pull telecomsxchange/opentextshield:latest

2. Run the Docker Container

docker run -d -p 8002:8002 telecomsxchange/opentextshield:latest
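
Before sending traffic, you can confirm the container is up with standard Docker commands (nothing OTS-specific):

docker ps --filter "ancestor=telecomsxchange/opentextshield:latest"
docker logs <container-id>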

3. Send a Message for Prediction

Once the container is running, you can send HTTP requests to the API to classify messages.

Example curl request:

curl -X POST "http://localhost:8002/predict/" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d "{\"text\":\"Your SMS content here\",\"model\":\"bert\"}"

Example Response:

{
  "label": "ham",
  "probability": 0.9971883893013,
  "processing_time": 0.6801116466522217,
  "Model_Name": "OTS_mBERT",
  "Model_Version": "bert-base-uncased",
  "Model_Author": "TelecomsXChange (TCXC)",
  "Last_Training": "2024-03-20"
}
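
The same call from Python, for programmatic use; this assumes only the endpoint and payload shown above, with the requests library used for illustration:

import requests

resp = requests.post(
    "http://localhost:8002/predict/",
    json={"text": "Your SMS content here", "model": "bert"},
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["label"])  # e.g. "ham"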

ajamous self-assigned this on Sep 14, 2024
ajamous (Collaborator, Author) commented on Sep 16, 2024

The x86 architecture Docker image has been released. To use it, run:

docker pull telecomsxchange/opentextshield:2.1-x86
