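A high-concurrency, high-performance, high-availability recommender system inference microservice built on Golang, Docker, and microservice principles. It includes multiple recall and ranking services and offers several access interfaces (REST, gRPC, and Dubbo), handling more than ten million inference requests per day.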
Large-scale deep learning recommender system inference services / microservices based on TFServing, Faiss, Redis, Dubbo, Nacos, gRPC, and Golang. The system provides real-time inference services (Dubbo API, gRPC API, and REST API) and can withstand millions of inference requests per day.
The deep learning model inference microservices mainly use the following components:
Type | Component | Description |
---|---|---|
Data | Hive / Spark | Run ETL over the behavior data of millions of users, then build the feature data warehouse. |
Data | Redis | Save the training samples in TFRecord format and store them in Redis Cluster. |
Model | TensorFlow | Train deep learning recall / rank models. Other deep learning frameworks also work, but models must be saved in *.pb format. |
Model | TensorFlow Serving | Deploy models and provide a gRPC service. |
Model | FAISS | Fast ANN search that retrieves thousands of items from millions. |
Microservices | Nacos | Manage config files and services. |
Microservices | Dubbo | Build Dubbo-protocol RPC services and register them with Nacos. |
Microservices | Hystrix | Control traffic during peak load (latency and fault tolerance). |
Microservices | Skywalking | Record the time spent on each request. |
Deploy | Docker | Containerized deployment of services with Docker. |
Deploy | Kubernetes | Manage containers and monitor the resource consumption of each service, such as memory and CPU. |
Deploy | Nginx, Apisix | API gateway. |
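To make the FAISS row concrete, here is a minimal Go sketch of the exact retrieval that an ANN index approximates: score every item embedding against a query embedding by inner product and keep the top K. FAISS avoids this exhaustive scan with an index, so this illustrates the idea only and is not how the project calls FAISS. The `Item`, `Scored`, and `TopK` names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a hypothetical candidate item with its embedding vector.
type Item struct {
	ID     string
	Vector []float32
}

// Scored pairs an item ID with its similarity to the query.
type Scored struct {
	ID    string
	Score float32
}

// dot returns the inner product of two equal-length vectors.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// TopK scans every item (brute force) and returns the k highest-scoring ones.
// An ANN index returns approximately the same set without the full scan.
func TopK(query []float32, items []Item, k int) []Scored {
	scored := make([]Scored, 0, len(items))
	for _, it := range items {
		if len(it.Vector) != len(query) {
			continue // skip vectors with the wrong dimension
		}
		scored = append(scored, Scored{ID: it.ID, Score: dot(query, it.Vector)})
	}
	// Sort by descending score; stable so ties keep input order.
	sort.SliceStable(scored, func(i, j int) bool { return scored[i].Score > scored[j].Score })
	if k < 0 {
		k = 0
	}
	if k > len(scored) {
		k = len(scored)
	}
	return scored[:k]
}

func main() {
	items := []Item{
		{ID: "a", Vector: []float32{1, 0}},
		{ID: "b", Vector: []float32{0, 1}},
		{ID: "c", Vector: []float32{1, 1}},
	}
	for _, s := range TopK([]float32{1, 2}, items, 2) {
		fmt.Println(s.ID, s.Score)
	}
}
```

The scan costs one inner product per item, which is why a catalog of millions of items calls for an approximate index.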
The core components of model inference microservices are as follows:
Type | Component | Description |
---|---|---|
Feature | feature engineering | User offline, user real-time, and user sequence features, plus item features. |
Sample | recall/rank samples | Create samples in TFRecord format. |
Recall | cf recall | User CF, item CF, and Swing. |
Recall | dssm recall | Recall from the DSSM model and the Faiss index. |
Recall | simple recall | Rule-based recall, such as hot-item recall. |
Recall | cold start | Cold start for new users and new items. |
Rank | pre_ranking | Pre-ranking of thousands of items after recall. |
Rank | ranking | Ranking of hundreds of items after pre_ranking. |
Rank | re_rank | Re-ranking of hundreds of items after ranking. |
Services | config loader | Parse the service's startup config from Nacos, such as gRPC info, Redis info, and index info. |
Services | dubbo service | Dubbo protocol service. |
Services | gRPC service | gRPC protocol service. |
Services | rest service | RESTful service. |
APIs | dubbo api | Provides the Dubbo protocol API. |
APIs | gRPC api | Provides the gRPC protocol API. |
APIs | rest api | Provides the HTTP protocol API. |
Web | web | Service management and service monitoring page. |
Deploy | faiss | Faiss index service deployment. |
Deploy | tfserving | TensorFlow model deployment. |
Deploy | infer | Recommender system inference deployment. |
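The Recall and Rank rows describe a funnel: several recall channels are merged, pre-ranking trims the merged set, ranking trims it again, and re-ranking adjusts the final order. A minimal Go sketch of that flow follows. All type and function names (`Candidate`, `Recaller`, `Scorer`, `Recommend`) are hypothetical, and the scorers are stand-ins for the model calls the real services would make.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a hypothetical item flowing through the funnel.
type Candidate struct {
	ItemID string
	Score  float64
}

// Recaller is one recall channel (for example cf, dssm, or hot items).
type Recaller func(userID string) []Candidate

// Scorer stands in for a pre-ranking or ranking model call.
type Scorer func(userID, itemID string) float64

// mergeRecall unions all channels and drops duplicate items (first one wins).
func mergeRecall(userID string, recallers []Recaller) []Candidate {
	seen := make(map[string]bool)
	var out []Candidate
	for _, r := range recallers {
		for _, c := range r(userID) {
			if !seen[c.ItemID] {
				seen[c.ItemID] = true
				out = append(out, c)
			}
		}
	}
	return out
}

// rankStage rescores every candidate and keeps the n best.
func rankStage(userID string, in []Candidate, score Scorer, n int) []Candidate {
	out := make([]Candidate, len(in))
	for i, c := range in {
		out[i] = Candidate{ItemID: c.ItemID, Score: score(userID, c.ItemID)}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if n >= 0 && n < len(out) {
		out = out[:n]
	}
	return out
}

// Recommend runs recall, pre-ranking, ranking, and re-ranking in order.
func Recommend(userID string, recallers []Recaller, pre, rank Scorer,
	preN, rankN int, rerank func([]Candidate) []Candidate) []Candidate {
	cands := mergeRecall(userID, recallers)
	cands = rankStage(userID, cands, pre, preN)
	cands = rankStage(userID, cands, rank, rankN)
	return rerank(cands)
}

func main() {
	hot := func(string) []Candidate { return []Candidate{{ItemID: "i1"}, {ItemID: "i2"}} }
	cf := func(string) []Candidate { return []Candidate{{ItemID: "i2"}, {ItemID: "i3"}} }
	preScores := map[string]float64{"i1": 0.1, "i2": 0.9, "i3": 0.5}
	rankScores := map[string]float64{"i1": 0.7, "i2": 0.3, "i3": 0.8}
	pre := func(_, item string) float64 { return preScores[item] }
	rank := func(_, item string) float64 { return rankScores[item] }
	identity := func(c []Candidate) []Candidate { return c } // no-op re-rank

	for _, c := range Recommend("u1", []Recaller{hot, cf}, pre, rank, 2, 2, identity) {
		fmt.Println(c.ItemID, c.Score)
	}
}
```

Each stage is a plain function value, so a channel or model can be swapped without touching the others. A production service would more likely run the recall channels concurrently and batch the model calls.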
Docker
Kubernetes
Nginx
Apisix
ELK
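1. Recommender systems
Wang Shusen's open course on recommender systems, which explains real-world industrial recommender systems using Xiaohongshu scenarios.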
● Recommender_System
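2. YouTube recommender system ranking model
Based on the "DNN_for_YouTube_Recommendations" model and the movie rating dataset (ml-1m), this shows in detail how to implement a recommender system ranking model with TensorFlow 2.
● YouTube deep ranking model (multi-valued embeddings, multi-objective learning)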
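3. Machine learning: introductory Sklearn tutorial
● Introductory machine learning tutorial with Sklearn
4. Deep learning: introductory TensorFlow tutorial
● Introductory deep learning tutorial with TensorFlow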