diff --git a/README.md b/README.md index 70285409a..79b707bd3 100644 --- a/README.md +++ b/README.md @@ -63,7 +63,7 @@ Running Platform: - [DSSM](docs/source/models/dssm.md) / [MIND](docs/source/models/mind.md) / [DropoutNet](docs/source/models/dropoutnet.md) / [CoMetricLearningI2I](docs/source/models/co_metric_learning_i2i.md) / [PDN](docs/source/models/pdn.md) - [W&D](docs/source/models/wide_and_deep.md) / [DeepFM](docs/source/models/deepfm.md) / [MultiTower](docs/source/models/multi_tower.md) / [DCN](docs/source/models/dcn.md) / [FiBiNet](docs/source/models/fibinet.md) / [MaskNet](docs/source/models/masknet.md) / [PPNet](docs/source/models/ppnet.md) / [CDN](docs/source/models/cdn.md) - [DIN](docs/source/models/din.md) / [BST](docs/source/models/bst.md) / [CL4SRec](docs/source/models/cl4srec.md) -- [MMoE](docs/source/models/mmoe.md) / [ESMM](docs/source/models/esmm.md) / [DBMTL](docs/source/models/dbmtl.md) / [PLE](docs/source/models/ple.md) +- [MMoE](docs/source/models/mmoe.md) / [ESMM](docs/source/models/esmm.md) / [DBMTL](docs/source/models/dbmtl.md) / [AITM](docs/source/models/aitm.md) / [PLE](docs/source/models/ple.md) - [HighwayNetwork](docs/source/models/highway.md) / [CMBF](docs/source/models/cmbf.md) / [UNITER](docs/source/models/uniter.md) - More models in development diff --git a/docs/images/models/aitm.jpg b/docs/images/models/aitm.jpg new file mode 100644 index 000000000..4eab9af17 Binary files /dev/null and b/docs/images/models/aitm.jpg differ diff --git a/docs/source/benchmark.md b/docs/source/benchmark.md index 8e2d20c6f..8a2c5348e 100644 --- a/docs/source/benchmark.md +++ b/docs/source/benchmark.md @@ -9,6 +9,7 @@ - 该数据集是淘宝展示广告点击率预估数据集,包含用户、广告特征和行为日志。[天池比赛链接](https://tianchi.aliyun.com/dataset/dataDetail?dataId=56) - 训练数据表:pai_online_project.easyrec_demo_taobao_train_data - 测试数据表:pai_online_project.easyrec_demo_taobao_test_data +- 其中pai_online_project是一个公共读的MaxCompute project,里面写入了一些数据表做测试,不需要申请权限。 - 在PAI上面测试使用的资源包括2个parameter 
server,9个worker,其中一个worker做评估: ```json {"ps":{"count":2, diff --git a/docs/source/component/backbone.md b/docs/source/component/backbone.md index 2a0ec03a5..d9edd2ef7 100644 --- a/docs/source/component/backbone.md +++ b/docs/source/component/backbone.md @@ -1111,13 +1111,14 @@ MovieLens-1M数据集效果: ## 2.特征交叉组件 -| 类名 | 功能 | 说明 | 示例 | -| -------------- | ---------------- | ------------ | -------------------------------------------------------------------------------------------------------------------------- | -| FM | 二阶交叉 | DeepFM模型的组件 | [案例2](#deepfm) | -| DotInteraction | 二阶内积交叉 | DLRM模型的组件 | [案例4](#dlrm) | -| Cross | bit-wise交叉 | DCN v2模型的组件 | [案例3](#dcn) | -| BiLinear | 双线性 | FiBiNet模型的组件 | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) | -| FiBiNet | SENet & BiLinear | FiBiNet模型 | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) | +| 类名 | 功能 | 说明 | 示例 | +| -------------- | --------------------- | ---------------- | -------------------------------------------------------------------------------------------------------------------------- | +| FM | 二阶交叉 | DeepFM模型的组件 | [案例2](#deepfm) | +| DotInteraction | 二阶内积交叉 | DLRM模型的组件 | [案例4](#dlrm) | +| Cross | bit-wise交叉 | DCN v2模型的组件 | [案例3](#dcn) | +| BiLinear | 双线性 | FiBiNet模型的组件 | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) | +| FiBiNet | SENet & BiLinear | FiBiNet模型 | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) | +| Attention | Dot-product attention | Transformer模型的组件 | | ## 3.特征重要度学习组件 diff --git a/docs/source/component/component.md b/docs/source/component/component.md index 897e53162..731e95759 100644 --- a/docs/source/component/component.md +++ b/docs/source/component/component.md @@ -79,6 +79,33 @@ | senet | SENet 
| | protobuf message | | mlp | MLP | | protobuf message | +- Attention + +Dot-product attention layer, a.k.a. Luong-style attention. + +The calculation follows the steps: + +1. Calculate attention scores using query and key with shape (batch_size, Tq, Tv). +1. Use scores to calculate a softmax distribution with shape (batch_size, Tq, Tv). +1. Use the softmax distribution to create a linear combination of value with shape (batch_size, Tq, dim). + +| 参数 | 类型 | 默认值 | 说明 | +| ----------------------- | ------ | ----- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| use_scale | bool | False | If True, will create a scalar variable to scale the attention scores. | +| score_mode | string | dot | Function to use to compute attention scores, one of {"dot", "concat"}. "dot" refers to the dot product between the query and key vectors. "concat" refers to the hyperbolic tangent of the concatenation of the query and key vectors. | +| dropout | float | 0.0 | Float between 0 and 1. Fraction of the units to drop for the attention scores. | +| seed | int | None | A Python integer to use as random seed incase of dropout. | +| return_attention_scores | bool | False | if True, returns the attention scores (after masking and softmax) as an additional output argument. | +| use_causal_mask | bool | False | Set to True for decoder self-attention. Adds a mask such that position i cannot attend to positions j > i. This prevents the flow of information from the future towards the past. | + +- inputs: List of the following tensors: + - query: Query tensor of shape (batch_size, Tq, dim). + - value: Value tensor of shape (batch_size, Tv, dim). + - key: Optional key tensor of shape (batch_size, Tv, dim). If not given, will use value for both key and value, which is the most common case. 
+- output: + - Attention outputs of shape (batch_size, Tq, dim). + - (Optional) Attention scores after masking and softmax with shape (batch_size, Tq, Tv). + ## 3.特征重要度学习组件 - SENet diff --git a/docs/source/feature/data.md b/docs/source/feature/data.md index 827791ffb..169902b78 100644 --- a/docs/source/feature/data.md +++ b/docs/source/feature/data.md @@ -2,7 +2,7 @@ EasyRec作为阿里云PAI的推荐算法包,可以无缝对接MaxCompute的数据表,也可以读取OSS中的大文件,还支持E-MapReduce环境中的HDFS文件,也支持local环境中的csv文件。 -为了识别这些输入数据中的字段信息,需要设置相应的字段名称和字段类型、设置默认值,帮助EasyRec去读取相应的数据。设置label字段,作为训练的目标。为了适应多目标模型,label字段可以设置多个。 +为了识别这些输入数据中的字段信息,需要设置相应的字段名称和字段类型、设置默认值,帮助EasyRec去读取相应的数据。设置label字段,作为训练的目标。为了适配多目标模型,label字段可设置多个。 另外还有一些参数如prefetch_size,是tensorflow中读取数据需要设置的参数。 @@ -10,7 +10,7 @@ EasyRec作为阿里云PAI的推荐算法包,可以无缝对接MaxCompute的数 这个配置里面,只有三个字段,用户ID(uid)、物品ID(item_id)、label字段(click)。 -OdpsInputV2表示读取MaxCompute的表作为输入数据。 +OdpsInputV2表示读取MaxCompute的表作为输入数据。如果是本地机器上训练,注意使用CSVInput类型。 ```protobuf data_config { @@ -160,7 +160,7 @@ def remap_lbl(labels): ### prefetch_size - data prefetch,以batch为单位,默认是32 -- 设置prefetch size可以提高数据加载的速度,防止数据瓶颈 +- 设置prefetch size可以提高数据加载的速度,防止数据瓶颈。但是当batchsize较小的时候,该值可适当调小。 ### shard && file_shard diff --git a/docs/source/feature/feature.rst b/docs/source/feature/feature.rst index a41b42a53..901fe6673 100644 --- a/docs/source/feature/feature.rst +++ b/docs/source/feature/feature.rst @@ -3,7 +3,7 @@ 在上一节介绍了输入数据包括MaxCompute表、csv文件、hdfs文件、OSS文件等,表或文件的一列对应一个特征。 -在数据中可以有一个或者多个label字段,而特征比较丰富,支持的类型包括IdFeature,RawFeature,TagFeature,SequenceFeature, ComboFeature. +在数据中可以有一个或者多个label字段,在多目标模型中,需要多个label字段。而特征比较丰富,支持的类型包括IdFeature,RawFeature,TagFeature,SequenceFeature, ComboFeature。 各种特征共用字段 ---------------------------------------------------------------- @@ -71,12 +71,12 @@ IdFeature: 离散值特征/ID类特征 .. 
math:: - embedding\_dim=8+x^{0.25} - - 其中,x 为不同特征取值的个数 + embedding\_dim=8+n^{0.25} + - 其中,n 是特征的唯一值的个数(如gender特征的取值是男、女,则n=2) - hash\_bucket\_size: hash bucket的大小。适用于category_id, user_id等 -- 对于user\_id等规模比较大的,hash冲突影响比较小的特征, +- 对于user\_id等规模比较大的,hash冲突影响比较小的特征,用户行为日志不够丰富可通过hash压缩id数量, .. math:: @@ -91,7 +91,8 @@ IdFeature: 离散值特征/ID类特征 - num\_buckets: buckets number, - 仅仅当输入是integer类型时,可以使用num\_buckets + 仅仅当输入是integer类型时,可以使用num\_buckets。 + 但是当使用fg特征的时候,不要用integer特征用num\_buckets的方式来变换,注意要用hash\_bucket\_size的方式。 - vocab\_list: 指定词表,适合取值比较少可以枚举的特征,如星期,月份,星座等 diff --git a/docs/source/feature/pai_rec_callback_conf.md b/docs/source/feature/pai_rec_callback_conf.md index 151c07b1d..5679222d7 100644 --- a/docs/source/feature/pai_rec_callback_conf.md +++ b/docs/source/feature/pai_rec_callback_conf.md @@ -1,5 +1,9 @@ # PAI-REC 全埋点配置 +## PAI-Rec引擎的callback服务文档 + +- [文档](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/pairec/docs/pairec/html/intro/callback_api.html) + ## 模板 ```json diff --git a/docs/source/feature/rtp_fg.md b/docs/source/feature/rtp_fg.md index baeaf078b..40da4852e 100644 --- a/docs/source/feature/rtp_fg.md +++ b/docs/source/feature/rtp_fg.md @@ -2,7 +2,7 @@ - RTP FG: RealTime Predict Feature Generation, 解决实时预测需要的特征工程需求. 特征工程在推荐链路里面也占用了比较长的时间. -- RTP FG能够以比较高的效率生成一些复杂的交叉特征,如match feature和lookup feature, 通过使用同一套c++代码保证离线在线的一致性. +- RTP FG能够以比较高的效率生成一些复杂的交叉特征,如match feature和lookup feature.离线训练和在线预测的时候通过使用同一套c++代码保证离线在线的一致性. - 其生成的特征可以接入EasyRec进行训练,从RTP FG的配置(fg.json)可以生成EasyRec的配置文件(pipeline.config). diff --git a/docs/source/feature/rtp_native.md b/docs/source/feature/rtp_native.md index d2524079a..8774041c7 100644 --- a/docs/source/feature/rtp_native.md +++ b/docs/source/feature/rtp_native.md @@ -1,6 +1,6 @@ # RTP部署 -本文档介绍将EasyRec模型部署到RTP上的流程. +本文档介绍将EasyRec模型部署到RTP(Real Time Prediction,实时打分服务)上的流程. 
- RTP目前仅支持checkpoint形式的模型部署,因此需要将EasyRec模型导出为checkpoint形式 diff --git a/docs/source/intro.md b/docs/source/intro.md index f4dabcb76..b91c0e7cf 100644 --- a/docs/source/intro.md +++ b/docs/source/intro.md @@ -63,4 +63,5 @@ EasyRec implements state of the art machine learning models used in common recom ### Contact +- DingDing Group: 32260796. (EasyRec usage general discussion.) - DingDing Group: 37930014162, click [this url](https://qr.dingtalk.com/action/joingroup?code=v1,k1,oHNqtNObbu+xUClHh77gCuKdGGH8AYoQ8AjKU23zTg4=&_dt_no_comment=1&origin=11) or scan QrCode to join![new_group.jpg](../images/qrcode/new_group.jpg) diff --git a/docs/source/models/aitm.md b/docs/source/models/aitm.md new file mode 100644 index 000000000..a15ea0489 --- /dev/null +++ b/docs/source/models/aitm.md @@ -0,0 +1,118 @@ +# AITM + +### 简介 + +在推荐场景里,用户的转化链路往往有多个中间步骤(曝光->点击->转化),AITM是一种多任务模型框架,充分利用了链路上各个节点的样本,提升模型对后端节点转化率的预估。 + +![AITM](../../images/models/aitm.jpg) + +1. (a) Expert-Bottom pattern。如 [MMoE](mmoe.md) +1. (b) Probability-Transfer pattern。如 [ESMM](esmm.md) +1. (c) Adaptive Information Transfer Multi-task (AITM) framework. + +两个特点: + +1. 使用Attention机制来融合多个目标对应的特征表征; +1. 引入了行为校正的辅助损失函数。 + +### 配置说明 + +```protobuf +model_config { + model_name: "AITM" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + ... 
+ feature_names: "tag_brand_list" + wide_deep: DEEP + } + backbone { + blocks { + name: "mlp" + inputs { + feature_group_name: "all" + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [512, 256] + } + } + } + } + model_params { + task_towers { + tower_name: "ctr" + label_name: "clk" + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128] + } + use_ait_module: true + weight: 1.0 + } + task_towers { + tower_name: "cvr" + label_name: "buy" + losses { + loss_type: CLASSIFICATION + } + losses { + loss_type: ORDER_CALIBRATE_LOSS + } + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128] + } + relation_tower_names: ["ctr"] + use_ait_module: true + ait_project_dim: 128 + weight: 1.0 + } + l2_regularization: 1e-6 + } + embedding_regularization: 5e-6 +} +``` + +- model_name: 任意自定义字符串,仅有注释作用 + +- model_class: 'MultiTaskModel', 不需要修改, 通过组件化方式搭建的多目标排序模型都叫这个名字 + +- feature_groups: 配置一组特征。 + +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - keras_layer: 加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - mlp: MLP模型的参数,详见[参考文档](../component/component.md#id1) + +- model_params: AITM相关的参数 + + - task_towers 根据任务数配置task_towers + - tower_name + - dnn deep part的参数配置 + - hidden_units: dnn每一层的channel数目,即神经元的数目 + - use_ait_module: if true 使用`AITM`模型;否则,使用[DBMTL](dbmtl.md)模型 + - ait_project_dim: 每个tower对应的表征向量的维度,一般设为最后一个隐藏的维度即可 + - 默认为二分类任务,即num_class默认为1,weight默认为1.0,loss_type默认为CLASSIFICATION,metrics_set为auc + - loss_type: ORDER_CALIBRATE_LOSS 使用目标依赖关系校正预测结果的辅助损失函数,详见原始论文 + - 注:label_fields需与task_towers一一对齐。 + - embedding_regularization: 对embedding部分加regularization,防止overfit + +### 示例Config + +- [AITM_demo.config](https://github.com/alibaba/EasyRec/blob/master/samples/model_config/aitm_on_taobao.config) + +### 参考论文 + +[AITM: 
Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising](https://arxiv.org/pdf/2105.08489.pdf) diff --git a/docs/source/models/loss.md b/docs/source/models/loss.md index f1246299f..881794e6a 100644 --- a/docs/source/models/loss.md +++ b/docs/source/models/loss.md @@ -19,6 +19,7 @@ EasyRec支持两种损失函数配置方式:1)使用单个损失函数;2 | PAIRWISE_LOGISTIC_LOSS | pair粒度的logistic loss, 支持自定义pair分组 | | JRC_LOSS | 二分类 + listwise ranking loss | | F1_REWEIGHTED_LOSS | 可以调整二分类召回率和准确率相对权重的损失函数,可有效对抗正负样本不平衡问题 | +| ORDER_CALIBRATE_LOSS | 使用目标依赖关系校正预测结果的辅助损失函数,详见[AITM](aitm.md)模型 | - 说明:SOFTMAX_CROSS_ENTROPY_WITH_NEGATIVE_MINING - 支持参数配置,升级为 [support vector guided softmax loss](https://128.84.21.199/abs/1812.11317) , @@ -71,9 +72,9 @@ EasyRec支持两种损失函数配置方式:1)使用单个损失函数;2 - f1_beta_square: 大于1的值会导致模型更关注recall,小于1的值会导致模型更关注precision - F1 分数,又称平衡F分数(balanced F Score),它被定义为精确率和召回率的调和平均数。 - - ![f1 score](../images/other/f1_score.svg) + - ![f1 score](../../images/other/f1_score.svg) - 更一般的,我们定义 F_beta 分数为: - - ![f_beta score](../images/other/f_beta_score.svg) + - ![f_beta score](../../images/other/f_beta_score.svg) - f1_beta_square 即为 上述公式中的 beta 系数的平方。 - PAIRWISE_FOCAL_LOSS 的参数配置 @@ -211,3 +212,4 @@ task_towers { - 《 Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics 》 - 《 [Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning](https://arxiv.org/abs/2111.10603) 》 +- [AITM: Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising](https://arxiv.org/pdf/2105.08489.pdf) diff --git a/docs/source/models/multi_target.rst b/docs/source/models/multi_target.rst index 9012aca9b..2263a27c0 100644 --- a/docs/source/models/multi_target.rst +++ b/docs/source/models/multi_target.rst @@ -7,5 +7,6 @@ esmm mmoe dbmtl + aitm ple simple_multi_task diff --git "a/docs/source/predict/MaxCompute 
\347\246\273\347\272\277\351\242\204\346\265\213.md" "b/docs/source/predict/MaxCompute \347\246\273\347\272\277\351\242\204\346\265\213.md" index 7f0b9e675..dd867a165 100644 --- "a/docs/source/predict/MaxCompute \347\246\273\347\272\277\351\242\204\346\265\213.md" +++ "b/docs/source/predict/MaxCompute \347\246\273\347\272\277\351\242\204\346\265\213.md" @@ -11,7 +11,7 @@ drop table if exists ctr_test_output; pai -name easy_rec_ext -Dcmd=predict --Dcluster='{"worker" : {"count":5, "cpu":1600, "memory":40000, "gpu":100}}' +-Dcluster='{"worker" : {"count":5, "cpu":1000, "memory":40000, "gpu":0}}' -Darn=acs:ram::xxx:role/aliyunodpspaidefaultrole -Dbuckets=oss://easyrec/ -Dsaved_model_dir=oss://easyrec/easy_rec_test/experiment/export/1597299619 @@ -23,6 +23,7 @@ pai -name easy_rec_ext -DossHost=oss-cn-beijing-internal.aliyuncs.com; ``` +- cluster: 这里cpu:1000表示是10个cpu核;核与内存的关系设置1:4000,一般不超过40000;gpu设置为0,表示不用GPU推理。 - saved_model_dir: 导出的模型目录 - output_table: 输出表,不需要提前创建,会自动创建 - excluded_cols: 预测模型不需要的columns,比如labels @@ -55,6 +56,8 @@ pai -name easy_rec_ext - 多分类模型(num_class > 1),导出字段: - logits: string(json), softmax之前的vector, shape\[num_class\] - probs: string(json), softmax之后的vector, shape\[num_class\] + - 如果一个分类目标是is_click, 输出概率的变量名称是probs_is_click + - 多目标模型中有一个回归目标是paytime,那么输出回归预测分的变量名称是:y_paytime - logits_y: logits\[y\], float, 类别y对应的softmax之前的概率 - probs_y: probs\[y\], float, 类别y对应的概率 - y: 类别id, = argmax(probs_y), int, 概率最大的类别 diff --git a/docs/source/predict/processor.md b/docs/source/predict/processor.md index 0ce0b4bd8..dabdb7aa1 100644 --- a/docs/source/predict/processor.md +++ b/docs/source/predict/processor.md @@ -1,17 +1,17 @@ # EasyRec Processor -EasyRec Processor, 是EasyRec对应的高性能在线打分引擎, 包含特征处理和模型推理功能. EasyRecProcessor运行在PAI-EAS之上, 可以充分利用PAI-EAS多种优化特性. +EasyRec Processor([阿里云上的EasyRec Processor详细文档,包括版本、使用方式](https://help.aliyun.com/zh/pai/user-guide/easyrec)), 是EasyRec对应的高性能在线打分引擎, 包含特征处理和模型推理功能. EasyRecProcessor运行在PAI-EAS之上, 可以充分利用PAI-EAS多种优化特性. 
## 架构设计 -EasyRec Processor包含三个部分: Item特征缓存, 特征处理(Feature Generator), TFModel(tensorflow model). +EasyRec Processor包含三个部分: Item特征缓存(支持通过[FeatureStore](https://help.aliyun.com/zh/pai/user-guide/featurestore-overview)加载MaxCompute表做初始化), 特征生成(Feature Generator), TFModel(tensorflow model). ![image.png](../../images/processor/easy_rec_processor_1.png) ## 性能优化 ### 基础实现 -将FeatureGenerator和TFModel分开, 先做特征生成,然后再Run TFModel. +将FeatureGenerator和TFModel分开, 先做特征生成(即fg),然后再Run TFModel得到预测结果. ### 优化实现 diff --git "a/docs/source/predict/\345\234\250\347\272\277\351\242\204\346\265\213.md" "b/docs/source/predict/\345\234\250\347\272\277\351\242\204\346\265\213.md" index 56f496945..8cb7db1ca 100644 --- "a/docs/source/predict/\345\234\250\347\272\277\351\242\204\346\265\213.md" +++ "b/docs/source/predict/\345\234\250\347\272\277\351\242\204\346\265\213.md" @@ -1,6 +1,6 @@ # Model Serving -推荐使用阿里云上的[模型在线服务(PAI-EAS)](https://help.aliyun.com/document_detail/113696.html)预置的EasyRecProcessor 来部署在线推理服务。EasyRecProcessor针对推荐模型做了多种优化, 相比tensorflow serving和TensorRT方式部署具有显著的[性能优势](./processor.md)。 +推荐使用阿里云上的[模型在线服务(PAI-EAS)](https://help.aliyun.com/document_detail/113696.html)预置的EasyRecProcessor 来部署在线推理服务。EasyRec Processor([阿里云文档](https://help.aliyun.com/zh/pai/user-guide/easyrec))针对推荐模型做了多种优化, 相比tensorflow serving和TensorRT方式部署具有显著的[性能优势](./processor.md)。 ## 命令行部署 diff --git a/docs/source/quick_start/designer_tutorial.md b/docs/source/quick_start/designer_tutorial.md index 95d9899b2..66a22d15c 100644 --- a/docs/source/quick_start/designer_tutorial.md +++ b/docs/source/quick_start/designer_tutorial.md @@ -94,3 +94,7 @@ PAI-Designer(Studio 2.0)是基于云原生架构Pipeline Service(PAIFlow `pai -name easy_rec_ext -project algo_public -Dcmd=predict` - 具体命令及详细[参数说明](../train.md#on-pai) + +### 推荐算法定制的方案 + +- 在Designer中做推荐算法特征工程、排序模型训练、向量召回等案例的阿里云官网[文档链接](https://help.aliyun.com/zh/pai/use-cases/overview-18) diff --git a/docs/source/quick_start/dlc_tutorial.md b/docs/source/quick_start/dlc_tutorial.md index 
22e067daa..f766a5f93 100644 --- a/docs/source/quick_start/dlc_tutorial.md +++ b/docs/source/quick_start/dlc_tutorial.md @@ -88,16 +88,16 @@ dlc submit tfjob \ --workspace_id=67849 \ --priority=1 \ --workers=1 \ - --worker_image=mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.4.9 \ + --worker_image=mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 \ --worker_spec=ecs.g6.2xlarge \ --ps=1 \ - --ps_image=mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.4.9 \ + --ps_image=mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 \ --ps_spec=ecs.g6.2xlarge \ --chief=true \ - --chief_image=mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.4.9 \ + --chief_image=mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 \ --chief_spec=ecs.g6.2xlarge \ --evaluators=1 \ - --evaluator_image=mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.4.9 \ + --evaluator_image=mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 \ --evaluator_spec=ecs.g6.2xlarge ``` diff --git a/docs/source/quick_start/local_tutorial.md b/docs/source/quick_start/local_tutorial.md index 8074c1218..443312ce9 100644 --- a/docs/source/quick_start/local_tutorial.md +++ b/docs/source/quick_start/local_tutorial.md @@ -4,6 +4,8 @@ 我们提供了`本地Anaconda安装`和`Docker镜像启动`两种方式。 +有技术问题可加钉钉群:37930014162 + #### 本地Anaconda安装 Demo实验中使用的环境为 `python=3.6.8` + `tenserflow=1.12.0` @@ -31,8 +33,8 @@ Docker的环境为`python=3.6.9` + `tenserflow=1.15.5` ```bash git clone https://github.com/alibaba/EasyRec.git cd EasyRec -docker pull mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.6.3 -docker run -td --network host -v /local_path/EasyRec:/docker_path/EasyRec mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.6.3 +docker pull 
mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 +docker run -td --network host -v /local_path/EasyRec:/docker_path/EasyRec mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 docker exec -it bash ``` @@ -42,7 +44,7 @@ docker exec -it bash git clone https://github.com/alibaba/EasyRec.git cd EasyRec bash scripts/build_docker.sh -sudo docker run -td --network host -v /local_path:/docker_path mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15- +sudo docker run -td --network host -v /local_path:/docker_path mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15- sudo docker exec -it bash ``` @@ -52,7 +54,7 @@ sudo docker exec -it bash 输入一般是csv格式的文件。 -#### 示例数据 +#### 示例数据(点击下载) - train: [dwd_avazu_ctr_deepmodel_train.csv](http://easyrec.oss-cn-beijing.aliyuncs.com/data/dwd_avazu_ctr_deepmodel_train.csv) - test: [dwd_avazu_ctr_deepmodel_test.csv](http://easyrec.oss-cn-beijing.aliyuncs.com/data/dwd_avazu_ctr_deepmodel_test.csv) diff --git a/docs/source/quick_start/mc_tutorial_inner.md b/docs/source/quick_start/mc_tutorial_inner.md index 04940ad8c..d04dc2e8d 100644 --- a/docs/source/quick_start/mc_tutorial_inner.md +++ b/docs/source/quick_start/mc_tutorial_inner.md @@ -34,7 +34,7 @@ pai -name easy_rec_ext -project algo_public -Dconfig=oss://easyrec/config/MultiTower/dwd_avazu_ctr_deepmodel_ext.config -Dtrain_tables='odps://pai_online_project/tables/dwd_avazu_ctr_deepmodel_train' -Deval_tables='odps://pai_online_project/tables/dwd_avazu_ctr_deepmodel_test' --Dcluster='{"ps":{"count":1, "cpu":1000}, "worker" : {"count":3, "cpu":1000, "gpu":100, "memory":40000}}' +-Dcluster='{"ps":{"count":1, "cpu":1000}, "worker" : {"count":3, "cpu":1000, "gpu":0, "memory":40000}}' -Deval_method=separate -Dmodel_dir=oss://easyrec/ckpt/MultiTower -Dbuckets=oss://easyrec/?role_arn=acs:ram::xxx:role/xxx&host=oss-cn-beijing-internal.aliyuncs.com; diff --git a/docs/source/train.md 
b/docs/source/train.md index 843955e81..85dd4af0b 100644 --- a/docs/source/train.md +++ b/docs/source/train.md @@ -194,9 +194,9 @@ pai -name easy_rec_ext -project algo_public ### 依赖 - 混合并行使用Horovod做底层的通信, 因此需要安装Horovod, 可以直接使用下面的镜像 -- mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:sok-tf212-gpus-v5 +- mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:sok-tf212-gpus-v5 ``` - sudo docker run --gpus=all --privileged -v /home/easyrec/:/home/easyrec/ -ti mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:sok-tf212-gpus-v5 bash + sudo docker run --gpus=all --privileged -v /home/easyrec/:/home/easyrec/ -ti mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:sok-tf212-gpus-v5 bash ``` ### 配置 diff --git a/easy_rec/python/core/sampler.py b/easy_rec/python/core/sampler.py index 6baee406f..779b30b48 100644 --- a/easy_rec/python/core/sampler.py +++ b/easy_rec/python/core/sampler.py @@ -268,6 +268,7 @@ def __init__(self, def _get_impl(self, ids): ids = np.array(ids, dtype=np.int64) + ids = np.pad(ids, (0, self._batch_size - len(ids)), 'edge') nodes = self._sampler.get(ids) features = self._parse_nodes(nodes) return features @@ -491,7 +492,9 @@ def __init__(self, def _get_impl(self, src_ids, dst_ids): src_ids = np.array(src_ids, dtype=np.int64) + src_ids = np.pad(src_ids, (0, self._batch_size - len(src_ids)), 'edge') dst_ids = np.array(dst_ids, dtype=np.int64) + dst_ids = np.pad(dst_ids, (0, self._batch_size - len(dst_ids)), 'edge') nodes = self._sampler.get(src_ids, dst_ids) features = self._parse_nodes(nodes) return features @@ -571,6 +574,7 @@ def __init__(self, def _get_impl(self, src_ids, dst_ids): src_ids = np.array(src_ids, dtype=np.int64) dst_ids = np.array(dst_ids, dtype=np.int64) + dst_ids = np.pad(dst_ids, (0, self._batch_size - len(dst_ids)), 'edge') nodes = self._neg_sampler.get(dst_ids) neg_features = self._parse_nodes(nodes) sparse_nodes = self._hard_neg_sampler.get(src_ids).layer_nodes(1) @@ -669,8 +673,11 
@@ def __init__(self, def _get_impl(self, src_ids, dst_ids): src_ids = np.array(src_ids, dtype=np.int64) + src_ids_padded = np.pad(src_ids, (0, self._batch_size - len(src_ids)), + 'edge') dst_ids = np.array(dst_ids, dtype=np.int64) - nodes = self._neg_sampler.get(src_ids, dst_ids) + dst_ids = np.pad(dst_ids, (0, self._batch_size - len(dst_ids)), 'edge') + nodes = self._neg_sampler.get(src_ids_padded, dst_ids) neg_features = self._parse_nodes(nodes) sparse_nodes = self._hard_neg_sampler.get(src_ids).layer_nodes(1) hard_neg_features, hard_neg_indices = self._parse_sparse_nodes(sparse_nodes) diff --git a/easy_rec/python/layers/keras/__init__.py b/easy_rec/python/layers/keras/__init__.py index 0e59090ce..f029b9c66 100644 --- a/easy_rec/python/layers/keras/__init__.py +++ b/easy_rec/python/layers/keras/__init__.py @@ -1,3 +1,4 @@ +from .attention import Attention from .auxiliary_loss import AuxiliaryLoss from .blocks import MLP from .blocks import Gate diff --git a/easy_rec/python/layers/keras/attention.py b/easy_rec/python/layers/keras/attention.py new file mode 100644 index 000000000..d7f717cb5 --- /dev/null +++ b/easy_rec/python/layers/keras/attention.py @@ -0,0 +1,268 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +"""Attention layers that can be used in sequence DNN/CNN models. + +This file follows the terminology of https://arxiv.org/abs/1706.03762 Figure 2. +Attention is formed by three tensors: Query, Key and Value. +""" +import tensorflow as tf +from tensorflow.python.keras.layers import Layer + + +class Attention(Layer): + """Dot-product attention layer, a.k.a. Luong-style attention. + + Inputs are a list with 2 or 3 elements: + 1. A `query` tensor of shape `(batch_size, Tq, dim)`. + 2. A `value` tensor of shape `(batch_size, Tv, dim)`. + 3. A optional `key` tensor of shape `(batch_size, Tv, dim)`. If none + supplied, `value` will be used as a `key`. + + The calculation follows the steps: + 1. 
Calculate attention scores using `query` and `key` with shape + `(batch_size, Tq, Tv)`. + 2. Use scores to calculate a softmax distribution with shape + `(batch_size, Tq, Tv)`. + 3. Use the softmax distribution to create a linear combination of `value` + with shape `(batch_size, Tq, dim)`. + + Args: + use_scale: If `True`, will create a scalar variable to scale the + attention scores. + dropout: Float between 0 and 1. Fraction of the units to drop for the + attention scores. Defaults to `0.0`. + seed: A Python integer to use as random seed in case of `dropout`. + score_mode: Function to use to compute attention scores, one of + `{"dot", "concat"}`. `"dot"` refers to the dot product between the + query and key vectors. `"concat"` refers to the hyperbolic tangent + of the concatenation of the `query` and `key` vectors. + + Call Args: + inputs: List of the following tensors: + - `query`: Query tensor of shape `(batch_size, Tq, dim)`. + - `value`: Value tensor of shape `(batch_size, Tv, dim)`. + - `key`: Optional key tensor of shape `(batch_size, Tv, dim)`. If + not given, will use `value` for both `key` and `value`, which is + the most common case. + mask: List of the following tensors: + - `query_mask`: A boolean mask tensor of shape `(batch_size, Tq)`. + If given, the output will be zero at the positions where + `mask==False`. + - `value_mask`: A boolean mask tensor of shape `(batch_size, Tv)`. + If given, will apply the mask such that values at positions + where `mask==False` do not contribute to the result. + return_attention_scores: bool, if `True`, returns the attention scores + (after masking and softmax) as an additional output argument. + training: Python boolean indicating whether the layer should behave in + training mode (adding dropout) or in inference mode (no dropout). + use_causal_mask: Boolean. Set to `True` for decoder self-attention. Adds + a mask such that position `i` cannot attend to positions `j > i`. 
+      This prevents the flow of information from the future towards the
+      past. Defaults to `False`.
+
+  Output:
+    Attention outputs of shape `(batch_size, Tq, dim)`.
+    (Optional) Attention scores after masking and softmax with shape
+      `(batch_size, Tq, Tv)`.
+  """
+
+  def __init__(self, params, name='attention', reuse=None, **kwargs):
+    super(Attention, self).__init__(name=name, **kwargs)
+    self.use_scale = params.get_or_default('use_scale', False)
+    self.scale_by_dim = params.get_or_default('scale_by_dim', False)
+    self.score_mode = params.get_or_default('score_mode', 'dot')
+    if self.score_mode not in ['dot', 'concat']:
+      raise ValueError('Invalid value for argument score_mode. '
+                       "Expected one of {'dot', 'concat'}. "
+                       'Received: score_mode=%s' % self.score_mode)
+    self.dropout = params.get_or_default('dropout', 0.0)
+    self.seed = params.get_or_default('seed', None)
+    self.scale = None
+    self.concat_score_weight = None
+    self.return_attention_scores = params.get_or_default(
+        'return_attention_scores', False)
+    self.use_causal_mask = params.get_or_default('use_causal_mask', False)
+
+  def build(self, input_shape):
+    self._validate_inputs(input_shape)
+    if self.use_scale:
+      self.scale = self.add_weight(
+          name='scale',
+          shape=(),
+          initializer='ones',
+          dtype=self.dtype,
+          trainable=True,
+      )
+    if self.score_mode == 'concat':
+      self.concat_score_weight = self.add_weight(
+          name='concat_score_weight',
+          shape=(),
+          initializer='ones',
+          dtype=self.dtype,
+          trainable=True,
+      )
+    self.built = True
+
+  def _calculate_scores(self, query, key):
+    """Calculates attention scores as a query-key dot product.
+
+    Args:
+      query: Query tensor of shape `(batch_size, Tq, dim)`.
+      key: Key tensor of shape `(batch_size, Tv, dim)`.
+
+    Returns:
+      Tensor of shape `(batch_size, Tq, Tv)`.
+    """
+    if self.score_mode == 'dot':
+      scores = tf.matmul(query, tf.transpose(key, [0, 2, 1]))
+      if self.scale is not None:
+        scores *= self.scale
+      elif self.scale_by_dim:
+        dk = tf.cast(tf.shape(key)[-1], tf.float32)
+        scores /= tf.math.sqrt(dk)
+    elif self.score_mode == 'concat':
+      # Reshape tensors to enable broadcasting.
+      # Reshape into [batch_size, Tq, 1, dim].
+      q_reshaped = tf.expand_dims(query, axis=-2)
+      # Reshape into [batch_size, 1, Tv, dim].
+      k_reshaped = tf.expand_dims(key, axis=-3)
+      if self.scale is not None:
+        scores = self.concat_score_weight * tf.reduce_sum(
+            tf.tanh(self.scale * (q_reshaped + k_reshaped)), axis=-1)
+      else:
+        scores = self.concat_score_weight * tf.reduce_sum(
+            tf.tanh(q_reshaped + k_reshaped), axis=-1)
+    return scores
+
+  def _apply_scores(self, scores, value, scores_mask=None, training=False):
+    """Applies attention scores to the given value tensor.
+
+    To use this method in your attention layer, follow the steps:
+
+    * Use `query` tensor of shape `(batch_size, Tq)` and `key` tensor of
+      shape `(batch_size, Tv)` to calculate the attention `scores`.
+    * Pass `scores` and `value` tensors to this method. The method applies
+      `scores_mask`, calculates
+      `attention_distribution = softmax(scores)`, then returns
+      `matmul(attention_distribution, value)`.
+    * Apply `query_mask` and return the result.
+
+    Args:
+      scores: Scores float tensor of shape `(batch_size, Tq, Tv)`.
+      value: Value tensor of shape `(batch_size, Tv, dim)`.
+      scores_mask: A boolean mask tensor of shape `(batch_size, 1, Tv)`
+        or `(batch_size, Tq, Tv)`. If given, scores at positions where
+        `scores_mask==False` do not contribute to the result. It must
+        contain at least one `True` value in each line along the last
+        dimension.
+      training: Python boolean indicating whether the layer should behave
+        in training mode (adding dropout) or in inference mode
+        (no dropout).
+
+    Returns:
+      Tensor of shape `(batch_size, Tq, dim)`.
+      Attention scores after masking and softmax with shape
+        `(batch_size, Tq, Tv)`.
+    """
+    if scores_mask is not None:
+      padding_mask = tf.logical_not(scores_mask)
+      # Bias so padding positions do not contribute to attention
+      # distribution. Note 65504. is the max float16 value.
+      max_value = 65504.0 if scores.dtype == 'float16' else 1.0e9
+      scores -= max_value * tf.cast(padding_mask, dtype=scores.dtype)
+
+    weights = tf.nn.softmax(scores, axis=-1)
+    if training and self.dropout > 0:
+      weights = tf.nn.dropout(weights, 1.0 - self.dropout, seed=self.seed)
+    return tf.matmul(weights, value), weights
+
+  def _calculate_score_mask(self, scores, v_mask, use_causal_mask):
+    if use_causal_mask:
+      # Creates a lower triangular mask, so position i cannot attend to
+      # positions j > i. This prevents the flow of information from the
+      # future into the past.
+      score_shape = tf.shape(scores)
+      # causal_mask_shape = [1, Tq, Tv].
+      mask_shape = (1, score_shape[-2], score_shape[-1])
+      ones_mask = tf.ones(shape=mask_shape, dtype='int32')
+      row_index = tf.cumsum(ones_mask, axis=-2)
+      col_index = tf.cumsum(ones_mask, axis=-1)
+      causal_mask = tf.greater_equal(row_index, col_index)
+
+      if v_mask is not None:
+        # Mask of shape [batch_size, 1, Tv].
+        v_mask = tf.expand_dims(v_mask, axis=-2)
+        return tf.logical_and(v_mask, causal_mask)
+      return causal_mask
+    else:
+      # If not using causal mask, return the value mask as is,
+      # or None if the value mask is not provided.
+      return v_mask
+
+  def call(
+      self,
+      inputs,
+      mask=None,
+      training=False,
+  ):
+    self._validate_inputs(inputs=inputs, mask=mask)
+    q = inputs[0]
+    v = inputs[1]
+    k = inputs[2] if len(inputs) > 2 else v
+    q_mask = mask[0] if mask else None
+    v_mask = mask[1] if mask else None
+    scores = self._calculate_scores(query=q, key=k)
+    scores_mask = self._calculate_score_mask(scores, v_mask,
+                                             self.use_causal_mask)
+    result, attention_scores = self._apply_scores(
+        scores=scores, value=v, scores_mask=scores_mask, training=training)
+    if q_mask is not None:
+      # Mask of shape [batch_size, Tq, 1].
+      q_mask = tf.expand_dims(q_mask, axis=-1)
+      result *= tf.cast(q_mask, dtype=result.dtype)
+    if self.return_attention_scores:
+      return result, attention_scores
+    return result
+
+  def compute_mask(self, inputs, mask=None):
+    self._validate_inputs(inputs=inputs, mask=mask)
+    if mask is None or mask[0] is None:
+      return None
+    return tf.convert_to_tensor(mask[0])
+
+  def compute_output_shape(self, input_shape):
+    """Returns shape of value tensor dim, but for query tensor length."""
+    return list(input_shape[0][:-1]), input_shape[1][-1]
+
+  def _validate_inputs(self, inputs, mask=None):
+    """Validates arguments of the call method."""
+    class_name = self.__class__.__name__
+    if not isinstance(inputs, list):
+      raise ValueError('{class_name} layer must be called on a list of inputs, '
+                       'namely [query, value] or [query, value, key]. '
+                       'Received: inputs={inputs}.'.format(
+                           class_name=class_name, inputs=inputs))
+    if len(inputs) < 2 or len(inputs) > 3:
+      raise ValueError('%s layer accepts inputs list of length 2 or 3, '
+                       'namely [query, value] or [query, value, key]. '
+                       'Received length: %d.' % (class_name, len(inputs)))
+    if mask is not None:
+      if not isinstance(mask, list):
+        raise ValueError(
+            '{class_name} layer mask must be a list, '
+            'namely [query_mask, value_mask]. Received: mask={mask}.'.format(
+                class_name=class_name, mask=mask))
+      if len(mask) < 2 or len(mask) > 3:
+        raise ValueError(
+            '{class_name} layer accepts mask list of length 2 or 3. '
+            'Received: inputs={inputs}, mask={mask}.'.format(
+                class_name=class_name, inputs=inputs, mask=mask))
+
+  def get_config(self):
+    base_config = super(Attention, self).get_config()
+    config = {
+        'use_scale': self.use_scale,
+        'score_mode': self.score_mode,
+        'dropout': self.dropout,
+    }
+    return dict(list(base_config.items()) + list(config.items()))
diff --git a/easy_rec/python/layers/keras/blocks.py b/easy_rec/python/layers/keras/blocks.py
index 06ce11cbf..13cd14612 100644
--- a/easy_rec/python/layers/keras/blocks.py
+++ b/easy_rec/python/layers/keras/blocks.py
@@ -4,6 +4,11 @@
 import logging
 import tensorflow as tf
+from tensorflow.python.keras.initializers import Constant
+from tensorflow.python.keras.layers import Dense
+from tensorflow.python.keras.layers import Dropout
+from tensorflow.python.keras.layers import Lambda
+from tensorflow.python.keras.layers import Layer
 from easy_rec.python.layers.keras.activation import activation_layer
 from easy_rec.python.layers.utils import Parameter
@@ -14,7 +19,7 @@
 tf = tf.compat.v1
-class MLP(tf.keras.layers.Layer):
+class MLP(Layer):
   """Sequential multi-layer perceptron (MLP) block.
   Attributes:
@@ -74,7 +79,7 @@ def add_rich_layer(self,
                      l2_reg=None):
     act_layer = activation_layer(activation)
     if use_bn and not use_bn_after_activation:
-      dense = tf.keras.layers.Dense(
+      dense = Dense(
           units=num_units,
           use_bias=use_bias,
           kernel_initializer=initializer,
@@ -86,7 +91,7 @@ def add_rich_layer(self,
       self._sub_layers.append(bn)
       self._sub_layers.append(act_layer)
     else:
-      dense = tf.keras.layers.Dense(
+      dense = Dense(
           num_units,
           use_bias=use_bias,
           kernel_initializer=initializer,
@@ -99,7 +104,7 @@
       self._sub_layers.append(bn)
     if 0.0 < dropout_rate < 1.0:
-      dropout = tf.keras.layers.Dropout(dropout_rate, name='%s/dropout' % name)
+      dropout = Dropout(dropout_rate, name='%s/dropout' % name)
       self._sub_layers.append(dropout)
     elif dropout_rate >= 1.0:
       raise ValueError('invalid dropout_ratio: %.3f' % dropout_rate)
@@ -117,31 +122,56 @@ def call(self, x, training=None, **kwargs):
     return x
-class Highway(tf.keras.layers.Layer):
+class Highway(Layer):
   def __init__(self, params, name='highway', reuse=None, **kwargs):
     super(Highway, self).__init__(name, **kwargs)
     self.emb_size = params.get_or_default('emb_size', None)
     self.num_layers = params.get_or_default('num_layers', 1)
-    self.activation = params.get_or_default('activation', 'gelu')
+    self.activation = params.get_or_default('activation', 'relu')
     self.dropout_rate = params.get_or_default('dropout_rate', 0.0)
     self.init_gate_bias = params.get_or_default('init_gate_bias', -3.0)
-    self.reuse = reuse
+    self.act_layer = activation_layer(self.activation)
+    self.dropout_layer = Dropout(
+        self.dropout_rate) if self.dropout_rate > 0.0 else None
+    self.project_layer = None
+    self.gate_bias_initializer = Constant(self.init_gate_bias)
+    self.gates = []  # T
+    self.transforms = []  # H
+    self.multiply_layer = tf.keras.layers.Multiply()
+    self.add_layer = tf.keras.layers.Add()
+
+  def build(self, input_shape):
+    dim = input_shape[-1]
+    if self.emb_size is not None and dim != self.emb_size:
+      self.project_layer = Dense(self.emb_size, name='input_projection')
+      dim = self.emb_size
+    self.carry_gate = Lambda(lambda x: 1.0 - x, output_shape=(dim,))
+    for i in range(self.num_layers):
+      gate = Dense(
+          units=dim,
+          bias_initializer=self.gate_bias_initializer,
+          activation='sigmoid',
+          name='gate_%d' % i)
+      self.gates.append(gate)
+      self.transforms.append(Dense(units=dim))
   def call(self, inputs, training=None, **kwargs):
-    from easy_rec.python.layers.common_layers import highway
-    return highway(
-        inputs,
-        self.emb_size,
-        activation=self.activation,
-        num_layers=self.num_layers,
-        dropout=self.dropout_rate if training else 0.0,
-        init_gate_bias=self.init_gate_bias,
-        scope=self.name,
-        reuse=self.reuse)
-
-
-class Gate(tf.keras.layers.Layer):
+    value = inputs
+    if self.project_layer is not None:
+      value = self.project_layer(inputs)
+    for i in range(self.num_layers):
+      gate = self.gates[i](value)
+      transformed = self.act_layer(self.transforms[i](value))
+      if self.dropout_layer is not None:
+        transformed = self.dropout_layer(transformed, training=training)
+      transformed_gated = self.multiply_layer([gate, transformed])
+      identity_gated = self.multiply_layer([self.carry_gate(gate), value])
+      value = self.add_layer([transformed_gated, identity_gated])
+    return value
+
+
+class Gate(Layer):
   """Weighted sum gate."""
   def __init__(self, params, name='gate', reuse=None, **kwargs):
@@ -165,7 +195,7 @@ def call(self, inputs, **kwargs):
     return output
-class TextCNN(tf.keras.layers.Layer):
+class TextCNN(Layer):
   """Text CNN Model.
   References
diff --git a/easy_rec/python/model/easy_rec_model.py b/easy_rec/python/model/easy_rec_model.py
index e45010553..f2408ba47 100644
--- a/easy_rec/python/model/easy_rec_model.py
+++ b/easy_rec/python/model/easy_rec_model.py
@@ -120,6 +120,8 @@ def backbone(self):
     kwargs = {
         'loss_dict': self._loss_dict,
         'metric_dict': self._metric_dict,
+        'prediction_dict': self._prediction_dict,
+        'labels': self._labels,
         constant.SAMPLE_WEIGHT: self._sample_weight
     }
     return self._backbone_net(self._is_training, **kwargs)
diff --git a/easy_rec/python/model/multi_task_model.py b/easy_rec/python/model/multi_task_model.py
index f35148a65..f38d825a1 100644
--- a/easy_rec/python/model/multi_task_model.py
+++ b/easy_rec/python/model/multi_task_model.py
@@ -4,9 +4,13 @@
 from collections import OrderedDict
 import tensorflow as tf
+from google.protobuf import struct_pb2
+from tensorflow.python.keras.layers import Dense
 from easy_rec.python.builders import loss_builder
 from easy_rec.python.layers.dnn import DNN
+from easy_rec.python.layers.keras.attention import Attention
+from easy_rec.python.layers.utils import Parameter
 from easy_rec.python.model.rank_model import RankModel
 from easy_rec.python.protos import tower_pb2
 from easy_rec.python.protos.easy_rec_model_pb2 import EasyRecModel
@@ -82,6 +86,28 @@ def build_predict_graph(self):
             tower_inputs, axis=-1, name=tower_name + '/relation_input')
         relation_fea = relation_dnn(relation_input)
         relation_features[tower_name] = relation_fea
+      elif task_tower_cfg.use_ait_module:
+        tower_inputs = [tower_features[tower_name]]
+        for relation_tower_name in task_tower_cfg.relation_tower_names:
+          tower_inputs.append(relation_features[relation_tower_name])
+        if len(tower_inputs) == 1:
+          relation_fea = tower_inputs[0]
+          relation_features[tower_name] = relation_fea
+        else:
+          if task_tower_cfg.HasField('ait_project_dim'):
+            dim = task_tower_cfg.ait_project_dim
+          else:
+            dim = int(tower_inputs[0].shape[-1])
+          queries = tf.stack([Dense(dim)(x) for x in tower_inputs], axis=1)
+          keys = tf.stack([Dense(dim)(x) for x in tower_inputs], axis=1)
+          values = tf.stack([Dense(dim)(x) for x in tower_inputs], axis=1)
+          st_params = struct_pb2.Struct()
+          st_params.update({'scale_by_dim': True})
+          params = Parameter(st_params, True)
+          attention_layer = Attention(params, name='AITM_%s' % tower_name)
+          result = attention_layer([queries, values, keys])
+          relation_fea = result[:, 0, :]
+          relation_features[tower_name] = relation_fea
       else:
         relation_fea = tower_features[tower_name]
@@ -224,7 +250,17 @@ def build_loss_graph(self):
         for loss_name in loss_dict.keys():
           loss_dict[loss_name] = loss_dict[loss_name] * task_loss_weight[0]
       else:
+        calibrate_loss = []
         for loss in losses:
+          if loss.loss_type == LossType.ORDER_CALIBRATE_LOSS:
+            y_t = self._prediction_dict['probs_%s' % tower_name]
+            for relation_tower_name in task_tower_cfg.relation_tower_names:
+              y_rt = self._prediction_dict['probs_%s' % relation_tower_name]
+              cali_loss = tf.reduce_mean(tf.nn.relu(y_t - y_rt))
+              calibrate_loss.append(cali_loss * loss.weight)
+              logging.info('calibrate loss: %s -> %s' %
+                           (relation_tower_name, tower_name))
+            continue
           loss_param = loss.WhichOneof('loss_param')
           if loss_param is not None:
             loss_param = getattr(loss, loss_param)
@@ -243,6 +279,10 @@
                 loss.loss_type, loss_name, loss_value)
           else:
             loss_dict[loss_name] = loss_value * task_loss_weight[i]
+        if calibrate_loss:
+          cali_loss = tf.add_n(calibrate_loss)
+          loss_dict['order_calibrate_loss'] = cali_loss
+          tf.summary.scalar('loss/order_calibrate_loss', cali_loss)
     self._loss_dict.update(loss_dict)
     kd_loss_dict = loss_builder.build_kd_loss(self.kd, self._prediction_dict,
@@ -263,6 +303,8 @@ def get_outputs(self):
                 suffix='_%s' % tower_name))
       else:
         for loss in task_tower_cfg.losses:
+          if loss.loss_type == LossType.ORDER_CALIBRATE_LOSS:
+            continue
           outputs.extend(
               self._get_outputs_impl(
                   loss.loss_type,
diff --git a/easy_rec/python/protos/keras_layer.proto
b/easy_rec/python/protos/keras_layer.proto
index 3b7c0d34d..a8b92d1a7 100644
--- a/easy_rec/python/protos/keras_layer.proto
+++ b/easy_rec/python/protos/keras_layer.proto
@@ -26,5 +26,6 @@ message KerasLayer {
     SequenceAugment seq_aug = 15;
     PPNet ppnet = 16;
     TextCNN text_cnn = 17;
+    HighWayTower highway = 18;
   }
 }
diff --git a/easy_rec/python/protos/layer.proto b/easy_rec/python/protos/layer.proto
index df51009bc..c0a01686a 100644
--- a/easy_rec/python/protos/layer.proto
+++ b/easy_rec/python/protos/layer.proto
@@ -6,8 +6,10 @@ import "easy_rec/python/protos/dnn.proto";
 message HighWayTower {
   optional string input = 1;
   required uint32 emb_size = 2;
-  required string activation = 3 [default = 'gelu'];
+  required string activation = 3 [default = 'relu'];
   optional float dropout_rate = 4;
+  optional float init_gate_bias = 5 [default = -3.0];
+  optional uint32 num_layers = 6 [default = 1];
 }
 message PeriodicEmbedding {
diff --git a/easy_rec/python/protos/loss.proto b/easy_rec/python/protos/loss.proto
index 5c913bf6e..5098518b3 100644
--- a/easy_rec/python/protos/loss.proto
+++ b/easy_rec/python/protos/loss.proto
@@ -17,6 +17,7 @@ enum LossType {
   PAIRWISE_FOCAL_LOSS = 11;
   PAIRWISE_LOGISTIC_LOSS = 12;
   JRC_LOSS = 13;
+  ORDER_CALIBRATE_LOSS = 14;
 }
 message Loss {
diff --git a/easy_rec/python/protos/tower.proto b/easy_rec/python/protos/tower.proto
index 3cd6f6253..73df3e6f7 100644
--- a/easy_rec/python/protos/tower.proto
+++ b/easy_rec/python/protos/tower.proto
@@ -60,7 +60,7 @@ message BayesTaskTower {
   optional DNN relation_dnn = 8;
   // training loss weights
   optional float weight = 9 [default = 1.0];
-  // label name for indcating the sample space for the task tower
+  // label name for indicating the sample space for the task tower
   optional string task_space_indicator_label = 10;
   // the loss weight for sample in the task space
   optional float in_task_space_weight = 11 [default = 1.0];
@@ -74,6 +74,10 @@ message BayesTaskTower {
   repeated Loss losses = 15;
   // whether to use sample weight in this tower
   required bool use_sample_weight = 16 [default = true];
+  // whether to use AIT module
+  optional bool use_ait_module = 17 [default = false];
+  // set this when the dimensions of last layer of towers are not equal
+  optional uint32 ait_project_dim = 18;
   // training loss label dynamic weights
-  optional string dynamic_weight = 17;
+  optional string dynamic_weight = 19;
 };
diff --git a/easy_rec/python/test/train_eval_test.py b/easy_rec/python/test/train_eval_test.py
index f689dcd01..a682e91bc 100644
--- a/easy_rec/python/test/train_eval_test.py
+++ b/easy_rec/python/test/train_eval_test.py
@@ -650,6 +650,11 @@ def test_tag_kv_input(self):
         'samples/model_config/kv_tag.config', self._test_dir)
     self.assertTrue(self._success)
+  def test_aitm(self):
+    self._success = test_utils.test_single_train_eval(
+        'samples/model_config/aitm_on_taobao.config', self._test_dir)
+    self.assertTrue(self._success)
+
   def test_dbmtl(self):
     self._success = test_utils.test_single_train_eval(
         'samples/model_config/dbmtl_on_taobao.config', self._test_dir)
diff --git a/easy_rec/version.py b/easy_rec/version.py
index 68e35a53c..2ae7769a6 100644
--- a/easy_rec/version.py
+++ b/easy_rec/version.py
@@ -1,4 +1,4 @@
 # -*- encoding:utf-8 -*-
 # Copyright (c) Alibaba, Inc. and its affiliates.
-__version__ = '0.8.0' +__version__ = '0.8.1' diff --git a/examples/readme.md b/examples/readme.md index fd02b6825..bf936cf21 100644 --- a/examples/readme.md +++ b/examples/readme.md @@ -36,12 +36,14 @@ cd EasyRec -- Docker环境可选 (1) `python=3.6.9` + `tenserflow=1.15.5` -docker pull mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.6.3 -docker run -td --network host -v /local_path/EasyRec:/docker_path/EasyRec mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.6.3 +docker pull mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 +docker run -td --network host -v /local_path/EasyRec:/docker_path/EasyRec mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.7.4 +docker exec -it bash + (2) `python=3.8.10` + `tenserflow=2.10.0` -docker pull mybigpai-registry-vpc.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10-0.6.4 -docker run -td --network host -v /local_path/EasyRec:/docker_path/EasyRec mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10-0.6.4 +docker pull mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10-0.7.4 +docker run -td --network host -v /local_path/EasyRec:/docker_path/EasyRec mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10-0.7.4 docker exec -it bash ``` @@ -55,11 +57,11 @@ cd EasyRec -- Docker环境可选 (1) `python=3.6.9` + `tenserflow=1.15.5` bash scripts/build_docker.sh -sudo docker run -td --network host -v /local_path:/docker_path mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15- +sudo docker run -td --network host -v /local_path:/docker_path mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15- (2) `python=3.8.10` + `tenserflow=2.10.0` bash scripts/build_docker_tf210.sh -sudo docker run -td --network host -v /local_path:/docker_path mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10- +sudo 
docker run -td --network host -v /local_path:/docker_path mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10- sudo docker exec -it bash ``` diff --git a/samples/model_config/aitm_on_taobao.config b/samples/model_config/aitm_on_taobao.config new file mode 100644 index 000000000..c67f1d677 --- /dev/null +++ b/samples/model_config/aitm_on_taobao.config @@ -0,0 +1,295 @@ +train_input_path: "data/test/tb_data/taobao_train_data" +eval_input_path: "data/test/tb_data/taobao_test_data" +model_dir: "experiments/aitm_taobao_ckpt" + +train_config { + optimizer_config { + adam_optimizer { + learning_rate { + constant_learning_rate { + learning_rate: 0.0001 + } + } + } + use_moving_average: false + } + num_steps: 500 + sync_replicas: true + save_checkpoints_steps: 100 + log_step_count_steps: 100 +} +data_config { + batch_size: 4096 + label_fields: "clk" + label_fields: "buy" + prefetch_size: 32 + input_type: CSVInput + input_fields { + input_name: "clk" + input_type: INT32 + } + input_fields { + input_name: "buy" + input_type: INT32 + } + input_fields { + input_name: "pid" + input_type: STRING + } + input_fields { + input_name: "adgroup_id" + input_type: STRING + } + input_fields { + input_name: "cate_id" + input_type: STRING + } + input_fields { + input_name: "campaign_id" + input_type: STRING + } + input_fields { + input_name: "customer" + input_type: STRING + } + input_fields { + input_name: "brand" + input_type: STRING + } + input_fields { + input_name: "user_id" + input_type: STRING + } + input_fields { + input_name: "cms_segid" + input_type: STRING + } + input_fields { + input_name: "cms_group_id" + input_type: STRING + } + input_fields { + input_name: "final_gender_code" + input_type: STRING + } + input_fields { + input_name: "age_level" + input_type: STRING + } + input_fields { + input_name: "pvalue_level" + input_type: STRING + } + input_fields { + input_name: "shopping_level" + input_type: STRING + } + input_fields { + input_name: 
"occupation" + input_type: STRING + } + input_fields { + input_name: "new_user_class_level" + input_type: STRING + } + input_fields { + input_name: "tag_category_list" + input_type: STRING + } + input_fields { + input_name: "tag_brand_list" + input_type: STRING + } + input_fields { + input_name: "price" + input_type: INT32 + } +} +feature_config: { + features { + input_names: "pid" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "adgroup_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "cate_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10000 + } + features { + input_names: "campaign_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "customer" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "brand" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "user_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "cms_segid" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100 + } + features { + input_names: "cms_group_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100 + } + features { + input_names: "final_gender_code" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "age_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "pvalue_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "shopping_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "occupation" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: 
"new_user_class_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "tag_category_list" + feature_type: TagFeature + embedding_dim: 16 + hash_bucket_size: 100000 + separator: "|" + } + features { + input_names: "tag_brand_list" + feature_type: TagFeature + embedding_dim: 16 + hash_bucket_size: 100000 + separator: "|" + } + features { + input_names: "price" + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 50 + } +} +model_config { + model_name: "AITM" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + feature_names: "cms_group_id" + feature_names: "age_level" + feature_names: "pvalue_level" + feature_names: "shopping_level" + feature_names: "occupation" + feature_names: "new_user_class_level" + feature_names: "adgroup_id" + feature_names: "cate_id" + feature_names: "campaign_id" + feature_names: "customer" + feature_names: "brand" + feature_names: "price" + feature_names: "pid" + feature_names: "tag_category_list" + feature_names: "tag_brand_list" + wide_deep: DEEP + } + backbone { + blocks { + name: "mlp" + inputs { + feature_group_name: "all" + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [512, 256] + } + } + } + } + model_params { + task_towers { + tower_name: "ctr" + label_name: "clk" + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128] + } + use_ait_module: true + weight: 1.0 + } + task_towers { + tower_name: "cvr" + label_name: "buy" + losses { + loss_type: CLASSIFICATION + } + losses { + loss_type: ORDER_CALIBRATE_LOSS + } + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128] + } + relation_tower_names: ["ctr"] + use_ait_module: true + ait_project_dim: 128 + weight: 1.0 + } + l2_regularization: 1e-6 + } + embedding_regularization: 5e-6 +} diff --git a/scripts/build_docker.sh b/scripts/build_docker.sh index be4113257..16a80775a 100644 --- 
a/scripts/build_docker.sh +++ b/scripts/build_docker.sh @@ -18,4 +18,4 @@ then exit 1 fi -sudo docker build --network=host . -f docker/Dockerfile -t mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-${version} +sudo docker build --network=host . -f docker/Dockerfile -t mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-${version} diff --git a/scripts/build_docker_tf210.sh b/scripts/build_docker_tf210.sh index 876d6dd06..33bc1a11d 100644 --- a/scripts/build_docker_tf210.sh +++ b/scripts/build_docker_tf210.sh @@ -18,4 +18,4 @@ then exit 1 fi -sudo docker build --progress=plain --network=host . -f docker/Dockerfile_tf210 -t mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10-${version} +sudo docker build --progress=plain --network=host . -f docker/Dockerfile_tf210 -t mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py38-tf2.10-${version}
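
For intuition, the dot-product path of the new `Attention` layer above (scores `Q·Kᵀ`, optionally scaled by `1/sqrt(dk)` when `scale_by_dim` is set, a large negative bias on masked positions, softmax, then a weighted sum over values, with the causal mask used by AITM-style sequence transfer) can be sketched in plain NumPy. This is an illustrative sketch under those assumptions, not EasyRec code; `masked_dot_attention` and its arguments are hypothetical names:

```python
import numpy as np

def masked_dot_attention(query, key, value, use_causal_mask=False,
                         scale_by_dim=True):
    """Minimal NumPy sketch of scaled dot-product attention with masking."""
    # Scores of shape (B, Tq, Tv) = query @ key^T, scaled by sqrt(dk).
    scores = query @ np.swapaxes(key, -1, -2)
    if scale_by_dim:
        scores = scores / np.sqrt(key.shape[-1])
    if use_causal_mask:
        tq, tv = scores.shape[-2:]
        # Lower-triangular mask: position i may only attend to j <= i.
        causal = np.tril(np.ones((tq, tv), dtype=bool))
        # Large negative bias so masked positions vanish after softmax.
        scores = scores - 1.0e9 * (~causal)
    # Softmax over the value axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ value, weights

q = np.random.RandomState(0).randn(1, 3, 4)
out, w = masked_dot_attention(q, q, q, use_causal_mask=True)
# First query position can only attend to the first value position.
print(w[0, 0])  # -> [1. 0. 0.]
```

The `- 1.0e9 * mask` trick mirrors `_apply_scores` in the layer (which also lowers the constant to 65504 for float16), and the row/column-index causal mask in `_calculate_score_mask` produces the same lower-triangular pattern as `np.tril` here.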