[feature] add a DSSM derivative model: integrate SENet into DSSM #489

Merged: 10 commits, Oct 15, 2024
Binary file added docs/images/models/dssm+senet.png
83 changes: 83 additions & 0 deletions docs/source/models/dssm_derivatives.md
@@ -0,0 +1,83 @@
# DSSM Derivative Models

## DSSM + SENet

### Overview

Recommendation scenarios typically involve many user and item features of varying types. Each feature passes through an embedding layer before entering the DNN layers of the two-tower model; some scenarios even introduce multimodal embedding features, such as image and text embeddings.
However, features differ in how strongly they influence the target: some are highly important and significantly affect overall model performance, while others matter little. As the number of features grows, SENet can be used to automatically learn a weight for each feature, strengthening the flow of important information to the top of each tower.

![dssm+senet](../../images/models/dssm+senet.png)
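The squeeze-and-excitation re-weighting described above can be sketched in NumPy. This is an illustrative sketch, not the EasyRec implementation: `senet_reweight`, the weight matrices `w1`/`w2`, and all shapes are assumptions for the example.

```python
import numpy as np


def senet_reweight(field_embs, w1, w2, num_groups=2):
  """Sketch of SENet over a list of [batch, dim] field embeddings."""
  # Squeeze: split each field embedding into groups and pool every
  # group with max and mean, giving 2 * num_groups stats per field.
  stats = []
  for emb in field_embs:
    b, d = emb.shape
    grouped = emb.reshape(b, num_groups, d // num_groups)
    stats.append(grouped.max(axis=-1))
    stats.append(grouped.mean(axis=-1))
  z = np.concatenate(stats, axis=1)  # [batch, fields * groups * 2]

  # Excite: a two-layer bottleneck emits one weight per embedding dim.
  hidden = np.maximum(z @ w1, 0.0)  # ReLU
  weights = hidden @ w2  # [batch, total_emb_size]

  # Re-weight the concatenated embeddings element-wise.
  return np.concatenate(field_embs, axis=-1) * weights
```

The learned `weights` act as per-dimension gates over the concatenated field embeddings, which is how unimportant features get suppressed before the tower DNN.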

### Configuration

```protobuf
model_config: {
  model_class: "DSSM_SENet"
  feature_groups: {
    group_name: 'user'
    feature_names: 'user_id'
    feature_names: 'cms_segid'
    feature_names: 'cms_group_id'
    feature_names: 'age_level'
    feature_names: 'pvalue_level'
    feature_names: 'shopping_level'
    feature_names: 'occupation'
    feature_names: 'new_user_class_level'
    feature_names: 'tag_category_list'
    feature_names: 'tag_brand_list'
    wide_deep: DEEP
  }
  feature_groups: {
    group_name: "item"
    feature_names: 'adgroup_id'
    feature_names: 'cate_id'
    feature_names: 'campaign_id'
    feature_names: 'customer'
    feature_names: 'brand'
    #feature_names: 'price'
    #feature_names: 'pid'
    wide_deep: DEEP
  }
  dssm_senet {
    user_tower {
      id: "user_id"
      senet {
        num_squeeze_group: 2
        reduction_ratio: 4
      }
      dnn {
        hidden_units: [128, 32]
      }
    }
    item_tower {
      id: "adgroup_id"
      senet {
        num_squeeze_group: 2
        reduction_ratio: 4
      }
      dnn {
        hidden_units: [128, 32]
      }
    }
    simi_func: COSINE
    scale_simi: false
    temperature: 0.01
    l2_regularization: 1e-6
  }
  loss_type: SOFTMAX_CROSS_ENTROPY
  embedding_regularization: 5e-5
}
```

- SENet parameter configuration:
  - num_squeeze_group: number of squeeze groups per feature embedding; default 2
  - reduction_ratio: reduction ratio of the squeeze bottleneck dimension; default 4
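These two parameters determine the layer sizes inside SENet. A small sketch of the arithmetic (`senet_sizes` is a hypothetical helper, mirroring the `reduction_size` computation in `easy_rec/python/layers/senet.py`):

```python
def senet_sizes(num_fields, num_squeeze_group=2, reduction_ratio=4):
  # Squeeze: each field yields max- and mean-pooled stats per group.
  squeeze_size = num_fields * num_squeeze_group * 2
  # Excite bottleneck width after reduction (kept at least 1).
  reduction_size = max(1, squeeze_size // reduction_ratio)
  return squeeze_size, reduction_size
```

For the 10 user features in the example config with the defaults, the squeeze vector is 40-dim and the bottleneck is 10-dim.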

### Example Config

[dssm_senet_on_taobao.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/dssm_senet_on_taobao.config)

### Reference

[Squeeze-and-Excitation Networks](https://arxiv.org/abs/1709.01507)
1 change: 1 addition & 0 deletions docs/source/models/recall.rst
@@ -6,6 +6,7 @@

   dssm
   dssm_neg_sampler
   dssm_derivatives
   mind
   co_metric_learning_i2i
   pdn
73 changes: 73 additions & 0 deletions easy_rec/python/layers/senet.py
@@ -0,0 +1,73 @@
# -*- encoding:utf-8 -*-
# Copyright (c) Alibaba, Inc. and its affiliates.
import tensorflow as tf

if tf.__version__ >= '2.0':
  tf = tf.compat.v1


class SENet:
  """Squeeze and Excite Network.

  Input shape
    - A list of 2D tensors with shape: ``(batch_size, embedding_size)``.
      The ``embedding_size`` of each field can have a different value.

  Args:
    num_fields: int, number of fields.
    num_squeeze_group: int, number of groups for squeeze.
    reduction_ratio: int, reduction ratio for squeeze.
    l2_reg: float, l2 regularizer for embedding.
    name: str, name of the layer.
  """

  def __init__(self,
               num_fields,
               num_squeeze_group,
               reduction_ratio,
               l2_reg,
               name='SENet'):
    self.num_fields = num_fields
    self.num_squeeze_group = num_squeeze_group
    self.reduction_ratio = reduction_ratio
    self._l2_reg = l2_reg
    self._name = name

  def __call__(self, inputs):
    g = self.num_squeeze_group
    f = self.num_fields
    r = self.reduction_ratio
    reduction_size = max(1, f * g * 2 // r)

    emb_size = 0
    for emb in inputs:
      emb_size += int(emb.shape[-1])

    # each field embedding size must be divisible by the group count g
    group_embs = [
        tf.reshape(emb, [-1, g, int(emb.shape[-1]) // g]) for emb in inputs
    ]

    squeezed = []
    for emb in group_embs:
      squeezed.append(tf.reduce_max(emb, axis=-1))  # [B, g]
      squeezed.append(tf.reduce_mean(emb, axis=-1))  # [B, g]
    z = tf.concat(squeezed, axis=1)  # [B, f * g * 2]

    reduced = tf.layers.dense(
        inputs=z,
        units=reduction_size,
        kernel_regularizer=self._l2_reg,
        activation='relu',
        name='%s/reduce' % self._name)

    excited_weights = tf.layers.dense(
        inputs=reduced,
        units=emb_size,
        kernel_initializer='glorot_normal',
        name='%s/excite' % self._name)

    # re-weight the concatenated input embeddings element-wise
    output = tf.concat(inputs, axis=-1) * excited_weights

    return output
143 changes: 143 additions & 0 deletions easy_rec/python/model/dssm_senet.py
@@ -0,0 +1,143 @@
# -*- encoding:utf-8 -*-
# Copyright (c) Alibaba, Inc. and its affiliates.
import tensorflow as tf

from easy_rec.python.layers import dnn
from easy_rec.python.layers import senet
from easy_rec.python.model.dssm import DSSM
from easy_rec.python.model.match_model import MatchModel
from easy_rec.python.protos.loss_pb2 import LossType
from easy_rec.python.protos.simi_pb2 import Similarity
from easy_rec.python.utils.proto_util import copy_obj

from easy_rec.python.protos.dssm_senet_pb2 import DSSM_SENet as DSSM_SENet_Config  # NOQA

if tf.__version__ >= '2.0':
  tf = tf.compat.v1
losses = tf.losses


class DSSM_SENet(DSSM):

  def __init__(self,
               model_config,
               feature_configs,
               features,
               labels=None,
               is_training=False):
    MatchModel.__init__(self, model_config, feature_configs, features, labels,
                        is_training)

    assert self._model_config.WhichOneof('model') == 'dssm_senet', \
        'invalid model config: %s' % self._model_config.WhichOneof('model')
    self._model_config = self._model_config.dssm_senet
    assert isinstance(self._model_config, DSSM_SENet_Config)

    # copy_obj so that any modification will not affect the original config
    self.user_tower = copy_obj(self._model_config.user_tower)
    self.user_seq_features, self.user_plain_features, self.user_feature_list = self._input_layer(
        self._feature_dict, 'user', is_combine=False)
    self.user_num_fields = len(self.user_feature_list)

    # copy_obj so that any modification will not affect the original config
    self.item_tower = copy_obj(self._model_config.item_tower)
    self.item_seq_features, self.item_plain_features, self.item_feature_list = self._input_layer(
        self._feature_dict, 'item', is_combine=False)
    self.item_num_fields = len(self.item_feature_list)

    self._user_tower_emb = None
    self._item_tower_emb = None

  def build_predict_graph(self):
    user_senet = senet.SENet(
        num_fields=self.user_num_fields,
        num_squeeze_group=self.user_tower.senet.num_squeeze_group,
        reduction_ratio=self.user_tower.senet.reduction_ratio,
        l2_reg=self._l2_reg,
        name='user_senet')
    # SENet.__call__ already returns the concatenated, re-weighted embeddings
    user_senet_output = user_senet(self.user_feature_list)

    num_user_dnn_layer = len(self.user_tower.dnn.hidden_units)
    last_user_hidden = self.user_tower.dnn.hidden_units.pop()
    user_dnn = dnn.DNN(self.user_tower.dnn, self._l2_reg, 'user_dnn',
                       self._is_training)
    user_tower_emb = user_dnn(user_senet_output)
    # final linear projection to the embedding size (no activation)
    user_tower_emb = tf.layers.dense(
        inputs=user_tower_emb,
        units=last_user_hidden,
        kernel_regularizer=self._l2_reg,
        name='user_dnn/dnn_%d' % (num_user_dnn_layer - 1))

    item_senet = senet.SENet(
        num_fields=self.item_num_fields,
        num_squeeze_group=self.item_tower.senet.num_squeeze_group,
        reduction_ratio=self.item_tower.senet.reduction_ratio,
        l2_reg=self._l2_reg,
        name='item_senet')
    item_senet_output = item_senet(self.item_feature_list)

    num_item_dnn_layer = len(self.item_tower.dnn.hidden_units)
    last_item_hidden = self.item_tower.dnn.hidden_units.pop()
    item_dnn = dnn.DNN(self.item_tower.dnn, self._l2_reg, 'item_dnn',
                       self._is_training)
    item_tower_emb = item_dnn(item_senet_output)
    item_tower_emb = tf.layers.dense(
        inputs=item_tower_emb,
        units=last_item_hidden,
        kernel_regularizer=self._l2_reg,
        name='item_dnn/dnn_%d' % (num_item_dnn_layer - 1))

    if self._model_config.simi_func == Similarity.COSINE:
      user_tower_emb = self.norm(user_tower_emb)
      item_tower_emb = self.norm(item_tower_emb)
      temperature = self._model_config.temperature
    else:
      temperature = 1.0

    user_item_sim = self.sim(user_tower_emb, item_tower_emb) / temperature
    if self._model_config.scale_simi:
      sim_w = tf.get_variable(
          'sim_w',
          dtype=tf.float32,
          shape=(1),
          initializer=tf.ones_initializer())
      sim_b = tf.get_variable(
          'sim_b',
          dtype=tf.float32,
          shape=(1),
          initializer=tf.zeros_initializer())
      y_pred = user_item_sim * tf.abs(sim_w) + sim_b
    else:
      y_pred = user_item_sim

    if self._is_point_wise:
      y_pred = tf.reshape(y_pred, [-1])

    if self._loss_type == LossType.CLASSIFICATION:
      self._prediction_dict['logits'] = y_pred
      self._prediction_dict['probs'] = tf.nn.sigmoid(y_pred)
    elif self._loss_type == LossType.SOFTMAX_CROSS_ENTROPY:
      y_pred = self._mask_in_batch(y_pred)
      self._prediction_dict['logits'] = y_pred
      self._prediction_dict['probs'] = tf.nn.softmax(y_pred)
    else:
      self._prediction_dict['y'] = y_pred

    self._prediction_dict['user_tower_emb'] = user_tower_emb
    self._prediction_dict['item_tower_emb'] = item_tower_emb
    self._prediction_dict['user_emb'] = tf.reduce_join(
        tf.as_string(user_tower_emb), axis=-1, separator=',')
    self._prediction_dict['item_emb'] = tf.reduce_join(
        tf.as_string(item_tower_emb), axis=-1, separator=',')
    return self._prediction_dict

  def build_output_dict(self):
    output_dict = MatchModel.build_output_dict(self)
    return output_dict
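One detail worth noting in `build_predict_graph` above: the last entry of the configured `hidden_units` is popped off and realized as a separate linear `tf.layers.dense` call, so each tower's output embedding comes from a projection with no activation. A minimal sketch of that split (`split_tower_layers` is a hypothetical helper, not part of EasyRec):

```python
def split_tower_layers(hidden_units):
  """Split hidden_units into activated DNN layer widths plus the
  final linear output width, as build_predict_graph does per tower."""
  units = list(hidden_units)  # copy so the caller's config is untouched
  output_dim = units.pop()  # last width becomes a linear projection
  return units, output_dim
```

With `hidden_units: [128, 32]` from the example config, the DNN keeps one 128-unit hidden layer and the 32-dim embedding is produced by the trailing activation-free dense layer.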
27 changes: 27 additions & 0 deletions easy_rec/python/protos/dssm_senet.proto
@@ -0,0 +1,27 @@
syntax = "proto2";
package protos;

import "easy_rec/python/protos/dnn.proto";
import "easy_rec/python/protos/simi.proto";
import "easy_rec/python/protos/layer.proto";

message DSSM_SENet_Tower {
  required string id = 1;
  required SENet senet = 2;
  required DNN dnn = 3;
}

message DSSM_SENet {
  required DSSM_SENet_Tower user_tower = 1;
  required DSSM_SENet_Tower item_tower = 2;
  required float l2_regularization = 3 [default = 1e-4];
  optional Similarity simi_func = 4 [default = COSINE];
  // add a layer for scaling the similarity
  optional bool scale_simi = 5 [default = true];
  optional string item_id = 9;
  required bool ignore_in_batch_neg_sam = 10 [default = false];
  // normalize user_tower_embedding and item_tower_embedding
  optional float temperature = 11 [default = 1.0];
}
2 changes: 2 additions & 0 deletions easy_rec/python/protos/easy_rec_model.proto
@@ -27,6 +27,7 @@ import "easy_rec/python/protos/variational_dropout.proto";
import "easy_rec/python/protos/multi_tower_recall.proto";
import "easy_rec/python/protos/tower.proto";
import "easy_rec/python/protos/pdn.proto";
import "easy_rec/python/protos/dssm_senet.proto";

// for input performance test
message DummyModel {
@@ -106,6 +107,7 @@ message EasyRecModel
    DropoutNet dropoutnet = 203;
    CoMetricLearningI2I metric_learning = 204;
    PDN pdn = 205;
    DSSM_SENet dssm_senet = 206;

    MMoE mmoe = 301;
    ESMM esmm = 302;
6 changes: 6 additions & 0 deletions easy_rec/python/test/train_eval_test.py
@@ -1248,6 +1248,12 @@ def test_pdn(self):
        'samples/model_config/pdn_on_taobao.config', self._test_dir)
    self.assertTrue(self._success)

  @unittest.skipIf(gl is None, 'graphlearn is not installed')
  def test_dssm_senet(self):
    self._success = test_utils.test_single_train_eval(
        'samples/model_config/dssm_senet_on_taobao.config', self._test_dir)
    self.assertTrue(self._success)


if __name__ == '__main__':
  tf.test.main()