diff --git a/README.md b/README.md
index 03e2c71..9d18cf6 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,7 @@
 Official repository for the paper [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/). RVM is specifically designed for robust human video matting. Unlike existing neural models that process frames as independent images, RVM uses a recurrent neural network to process videos with temporal memory. RVM can perform matting in real-time on any videos without additional inputs. It achieves **4K 76FPS** and **HD 104FPS** on an Nvidia GTX 1080 Ti GPU. The project was developed at [ByteDance Inc.](https://www.bytedance.com/)
 
 <br>
+<a href="https://replicate.com/arielreplicate/robust_video_matting"><img src="https://replicate.com/arielreplicate/robust_video_matting/badge" alt="Replicate"></a>
 
 ## News
 
@@ -34,7 +35,7 @@ All footage in the video are available in [Google Drive](https://drive.google.co
 ## Demo
 * [Webcam Demo](https://peterl1n.github.io/RobustVideoMatting/#/demo): Run the model live in your browser. Visualize recurrent states.
 * [Colab Demo](https://colab.research.google.com/drive/10z-pNKRnVNsp0Lq9tH1J_XPZ7CBC_uHm?usp=sharing): Test our model on your own videos with free GPU. 
-
+* [Replicate Demo](https://replicate.com/arielreplicate/robust_video_matting): Test our model through the Replicate web UI or Python API.
 <br>
 
 ## Download
diff --git a/cog.yaml b/cog.yaml
new file mode 100644
index 0000000..e7a4f3f
--- /dev/null
+++ b/cog.yaml
@@ -0,0 +1,14 @@
+build:
+  gpu: true
+  python_version: "3.8"
+  system_packages:
+    - libgl1-mesa-glx
+    - libglib2.0-0
+  python_packages:
+    - torch==1.9.0
+    - torchvision==0.10.0
+    - av==8.0.3
+    - tqdm==4.61.1
+    - pims==0.5
+
+predict: "predict.py:Predictor"
diff --git a/predict.py b/predict.py
new file mode 100644
index 0000000..d7a9707
--- /dev/null
+++ b/predict.py
@@ -0,0 +1,32 @@
+import torch
+from model import MattingNetwork
+from inference import convert_video
+
+from cog import BasePredictor, Path, Input
+
+
+class Predictor(BasePredictor):
+    def setup(self):
+        self.model = MattingNetwork('resnet50').eval().cuda()
+        self.model.load_state_dict(torch.load('rvm_resnet50.pth'))
+
+    def predict(
+            self,
+            input_video: Path = Input(description="Video to segment."),
+            output_type: str = Input(description="Which rendered output to return.", default="green-screen", choices=["green-screen", "alpha-mask", "foreground-mask"]),
+
+    ) -> Path:
+
+        convert_video(
+            self.model,  # The model; setup() has already moved it to the GPU.
+            input_source=str(input_video),  # A video file or an image sequence directory.
+            output_type='video',  # Choose "video" or "png_sequence"
+            output_composition='green-screen.mp4',  # File path if video; directory path if png sequence.
+            output_alpha="alpha-mask.mp4",  # [Optional] Output the raw alpha prediction.
+            output_foreground="foreground-mask.mp4",  # [Optional] Output the raw foreground prediction.
+            output_video_mbps=4,  # Output video mbps. Not needed for png sequence.
+            downsample_ratio=None,  # A hyperparameter to adjust or use None for auto.
+            seq_chunk=12,  # Process n frames at once for better parallelism.
+        )
+        # convert_video wrote all three files above; return only the requested one.
+        return Path(f'{output_type}.mp4')