Awesome Real-Time Machine Learning

Here's a curated list of awesome real-time machine learning blogs, videos, tools and platforms, conferences, research papers, etc.

What even is "real-time" Machine Learning?

Real-time Machine Learning (ML) delivers predictions and adapts models with very low latency, using fresh, continuously streaming data. It employs online or continual learning to update models incrementally as new information arrives, so predictions reflect the latest data when an immediate action is needed. This approach contrasts with batch processing and is crucial for applications that must respond quickly to changing patterns.

  • Real-Time Predictions: Model outputs generated on demand, with very low latency, as data arrives.
  • Real-Time Features: Input attributes derived from real-time, rapidly changing data, processed quickly.
  • Real-Time Learning: Continuous model updating (online or continual learning) using new data for adaptation and improvement of model performance over time.
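
To make "Real-Time Learning" concrete, here is a minimal sketch of online learning in plain Python. It is an illustration only, not the method of any tool listed here: the class `OnlineLogisticRegression`, the helper `run_stream`, and the toy two-feature stream are all hypothetical names and data chosen for the example. The model predicts on each event first and only then learns from its label, so the reported accuracy is always measured on data the model has not yet seen.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression updated one example at a time with SGD (illustrative)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        # Clamp the logit so math.exp cannot overflow on extreme inputs.
        z = max(-35.0, min(35.0, z))
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # For log loss, the gradient with respect to the logit is (p - y).
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


def run_stream(model, stream):
    """Predict first, then learn from the label; return the running accuracy."""
    correct = 0
    total = 0
    for x, y in stream:
        predicted = model.predict_proba(x) >= 0.5
        correct += int(predicted == bool(y))
        total += 1
        model.learn_one(x, y)  # the model adapts right after each event
    return correct / total if total else 0.0


# Toy stream (an assumption for the example): label is 1 when the first feature fires.
stream = [([1.0, 0.0], 1), ([0.0, 1.0], 0)] * 100
model = OnlineLogisticRegression(n_features=2)
accuracy = run_stream(model, stream)
print(f"test-then-train accuracy: {accuracy:.3f}")
```

In a production setting the `stream` would typically come from an event streaming platform, and a library built for online learning would replace the hand-written model.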

Traditional ML vs Real-Time ML

| Aspect | Traditional ML | Real-Time ML |
| --- | --- | --- |
| Data Processing | Processes static, historical datasets in batches. | Continuously ingests and processes streaming data in real-time. |
| Model Training | Models are trained offline using complete datasets. | Models are updated incrementally as new data arrives, often using online learning algorithms. |
| Latency | Can tolerate higher latency in processing and predictions. | Requires low-latency processing and near-instantaneous predictions. |
| Scalability | Typically scales vertically with more powerful hardware. Horizontal scaling is possible with distributed frameworks. | Often requires horizontal scalability to handle high-volume data streams. |
| Infrastructure | Can run on standard computing resources. | Often requires specialized streaming infrastructure like Apache Kafka or Apache Flink. |
| Adaptability | Models are less adaptive to changing patterns without manual retraining. | Models can adapt to concept drift and evolving patterns in real-time. |
| Feature Engineering | Features are often engineered manually and in advance. | Features may be generated on-the-fly or use automated feature extraction techniques. |
| Model Deployment | Models are deployed as static versions, updated periodically. | Models are continuously updated and deployed in a streaming fashion. |
| Use Cases | Effective for predictive analytics, segmentation, and batch or streaming data predictions. | Ideal for fraud detection, real-time bidding, and personalized recommendations. |
| Data Volume | Can work effectively with smaller datasets. | Typically requires larger volumes of data for accurate real-time predictions. |
| Computational Resources | Generally less computationally intensive. | Optimizes computational resource usage by processing data incrementally, reducing the need for reprocessing entire datasets, but may require consistent resource availability for real-time updates. |
| Monitoring | Periodic model performance checks are usually sufficient unless operating in dynamic environments. | Requires continuous monitoring of model performance and data quality. |
| Feedback Loop | Feedback is incorporated in batch updates. | Immediate feedback integration for rapid model adjustments. |
| Complexity | Generally simpler to implement and maintain. | More complex, requiring specialized knowledge in streaming architectures and online learning algorithms. |
| Time-to-Insight | Longer time from data collection to actionable insights. | Near-immediate insights from incoming data streams. |
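
As an illustration of features "generated on-the-fly", the sketch below maintains a time-based sliding window over incoming events and recomputes simple aggregates on every update. It is a simplified, single-process stand-in for what a streaming engine or feature store would do at scale. The class name `SlidingWindowFeatures`, the `(timestamp, amount)` event shape, and the 60-second window are assumptions made for the example, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowFeatures:
    """Tracks events from the last `window_seconds` and derives aggregate features."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, amount) pairs in arrival order
        self.total = 0.0

    def update(self, timestamp, amount):
        """Add one event, drop expired ones, and return the fresh features."""
        self.events.append((timestamp, amount))
        self.total += amount
        self._evict(timestamp)
        return self.features()

    def _evict(self, now):
        # An event expires once it is at least `window_seconds` old.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_amount = self.events.popleft()
            self.total -= old_amount

    def features(self):
        count = len(self.events)
        mean = self.total / count if count else 0.0
        return {"count": count, "sum": self.total, "mean": mean}


# Hypothetical usage: transaction amounts for one card, timestamps in seconds.
window = SlidingWindowFeatures(window_seconds=60)
for ts, amount in [(0, 10.0), (30, 20.0), (61, 30.0)]:
    latest = window.update(ts, amount)
print(latest)
```

Keeping one such window per entity (for example, per user or per card) in a dictionary gives per-key features. Each update costs amortized constant time because every event is appended and evicted at most once.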

Tools & Workflow Stages

  1. Event streaming platforms

  2. Streaming Engines

  3. Feature Engineering and Feature Stores

  4. Model Development and Training

  5. Workflow Orchestration

  6. Experiment and Metadata Management

  7. Model Deployment and Serving

  8. Monitoring and Feedback Loop
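
The last stage, monitoring and feedback, can be sketched with a rolling error-rate check over the most recent predictions. This is a deliberately simple heuristic, not a full drift-detection algorithm and not the approach of any specific tool in this list. The class `RollingErrorMonitor`, the window size, and the alert threshold are hypothetical choices for the example.

```python
from collections import deque


class RollingErrorMonitor:
    """Flags a possible performance drop when the recent error rate is too high."""

    def __init__(self, window_size=100, threshold=0.3):
        # Each entry is 1 for a wrong prediction and 0 for a correct one.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, prediction, label):
        """Store one prediction outcome and return whether an alert is active."""
        self.outcomes.append(int(prediction != label))
        return self.alert()

    def error_rate(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def alert(self):
        # Wait for a full window so a few early mistakes do not trigger an alert.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.error_rate() > self.threshold


# Hypothetical usage: ten correct predictions, then the model starts failing.
monitor = RollingErrorMonitor(window_size=10, threshold=0.3)
for _ in range(10):
    monitor.record(prediction=1, label=1)
alerts = [monitor.record(prediction=1, label=0) for _ in range(4)]
print(alerts)
```

An alert from a monitor like this would typically feed back into the earlier stages, for example by triggering retraining or a rollback.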

Real-Time ML Internal Platform Resources

  1. Picnic

    Industry: E-commerce and Grocery Retail

  2. Netflix

    Industry: Media and Entertainment, Streaming Services

  3. Uber

    Industry: Transportation and Technology

  4. TikTok

    Industry: Social Media, Entertainment, and Technology

  5. Meta

    Industry: Technology, Social Media, and Artificial Intelligence

  6. Google

    Industry: Technology, Internet Services, and Artificial Intelligence

    • Real-time AI with Google Cloud Vertex AI
      This blog post introduces Streaming Ingestion for Vertex AI Matching Engine and Feature Store, enabling real-time updates and low-latency retrieval of data for ML models.

    • Streaming analytics solutions | Google Cloud
      This page describes Google Cloud's streaming analytics solutions for ingesting, processing, and analyzing event streams in real-time.

    • Introduction to Vertex AI Feature Store
      This documentation explains how Vertex AI Feature Store provides a centralized repository for organizing, storing, and serving ML features in real-time.

    • Real-time Data Infrastructure at Google
      This paper describes Google's real-time data infrastructure, which processes petabytes of data daily to support various use cases including customer incentives, fraud detection, and machine learning model predictions.

  7. Spotify

    Industry: Music Streaming, Technology, and Entertainment

  8. Instacart

    Industry: E-commerce, Grocery Delivery, Technology

  9. DoorDash

    Industry: Food Delivery, Technology, and Logistics

  10. Booking.com

    Industry: Travel and Technology

  11. Grab

    Industry: Technology, Ride-Hailing, Food Delivery, and Digital Payments

  12. Didact AI

    Industry: Finance, Machine Learning, Stock Trading

  13. Glassdoor

    Industry: Technology, Job Search and Company Reviews

  14. Dailymotion

    Industry: Video Sharing and Streaming, AdTech

  15. Coupang

    Industry: E-commerce, Technology, and Logistics

  16. Slack

    Industry: Technology, Communication, and Collaboration Software

  17. Swiggy

    Industry: Food Delivery, Technology, and E-commerce

  18. Nubank

    Industry: Financial Technology (Fintech), Digital Banking

  19. Replit

    Industry: Developer Tools, AI-Powered Coding Platforms

  20. Noon

    Industry: E-commerce

  21. Lyft

    Industry: Ridesharing, Mobility-as-a-Service

    • Powering Millions of Real-Time Decisions with LyftLearn Serving
      This blog explains how Lyft’s platform, LyftLearn Serving, powers hundreds of millions of real-time decisions daily for use cases like price optimization, ETA prediction, and fraud detection, with a focus on managing both data and control planes.

    • Building Real-Time Machine Learning Foundations at Lyft
      This article discusses Lyft's initiative to integrate streaming data into its ML workflows, enabling real-time anomaly detection, event-driven decisions, and enhanced traffic infrastructure using geohash aggregation.

    • ML Feature Serving Infrastructure at Lyft
      This post details the architecture of Lyft’s Feature Service, which supports real-time feature availability for online inference with single-digit millisecond latency, serving millions of requests per minute.

    • Real-Time ML with Beam at Lyft
      Lyft uses Apache Beam to power real-time ML pipelines for critical functions like dynamic pricing, ETA prediction, and traffic-aware routing. The infrastructure processes millions of events per minute with sub-second latency.

    • How Lyft Uses AI to Get You Where You Want to Go Faster
      This blog highlights how Lyft uses machine learning to provide real-time ETA predictions, dynamic routing based on live traffic data, and personalized destination suggestions based on user behavior.

    • How Lyft Stores the Data Powering Their ML Models
      This article explains how Lyft ensures low-latency access to feature data for both training and real-time inference by hosting thousands of features in its Feature Serving service.

    • ETA (Estimated Time of Arrival) Reliability at Lyft
      Lyft leverages real-time ML to enhance ETA reliability by dynamically analyzing driver availability, traffic, and marketplace conditions, with continuous model updates to adapt to changing environments.

    • The Recommendation System at Lyft
      Lyft’s recommendation system uses real-time ML to dynamically rank ride modes, adapt to marketplace conditions, and personalize user experiences while exploring reinforcement learning for continuous improvement.

    • Pricing at Lyft
      Lyft’s pricing system leverages real-time ML with online reinforcement learning to dynamically optimize prices, balancing supply and demand while adapting to market conditions.

    • How Lyft Predicts a Rider’s Destination for Better In-App Experience
      Lyft’s destination prediction system leverages real-time ML and attention mechanisms to dynamically suggest personalized destinations based on historical rides and session context.

    • How Lyft Creates Hyper-Accurate Maps from Open-Source Maps and Real-Time Data
      Lyft uses real-time ML with GPS data and map-matching algorithms to detect and correct map errors, creating hyper-accurate maps for efficient routing and localization.

    • Building Lyft’s Marketing Automation Platform
      Lyft’s Symphony platform uses real-time ML and reinforcement learning to optimize marketing decisions, dynamically allocate budgets, and improve campaign performance at scale.

    • Fingerprinting Fraudulent Behavior
      Lyft uses real-time ML with deep learning architectures to detect fraudulent behavior by analyzing sequential user activity logs and dynamically identifying anomalies.

    • From Shallow to Deep Learning in Fraud
      Lyft employs real-time ML with deep learning models to detect fraud dynamically, leveraging sequential user behavior and advanced infrastructure for scalable prototype-to-production workflows.

  22. Wayfair

    Industry: E-commerce, Furniture & Home Goods

  23. Airbnb

    Industry: Travel Services, Hospitality

Videos

  1. "Bring the power of machine learning to the world of streaming data"
    This video from Google Cloud Next demonstrates how to deploy and manage complete ML pipelines for real-time inference and predictions using Dataflow ML.
    Watch here

  2. "Jukebox: Spotify's Feature Infrastructure"
    Explains how Spotify manages features for machine learning, including their approach to feature stores and real-time feature serving.
    Watch here

  3. "Scaling Up Machine Learning in Instacart Search for the 2020 Surge"
    Discusses how Instacart scaled up its machine learning capabilities to handle the surge in demand during 2020, likely including real-time aspects of their search system.
    Watch here

  4. "How Booking.com Used Data Streaming to Put Travel Decisions into Customers' Hands"
    Explains how Booking.com leveraged data streaming to provide a comprehensive booking experience, including the use of Confluent's data streaming platform.
    Watch here

  5. "Inside Coupang's AI-Powered Fulfillment Center"
    Showcases Coupang's newest fulfillment center, highlighting its AI-directed nerve center and army of robots for efficient operations.
    Watch here

  6. "Real-time Machine Learning: Architecture and Challenges"
    Explores architectures and challenges in implementing real-time machine learning systems, emphasizing the importance of fresh data and low-latency predictions.
    Watch here

  7. "Batch-scoring vs Real-time ML systems"
    Compares batch scoring and real-time machine learning systems, discussing their advantages, disadvantages, and implementation differences.
    Watch here

  8. "Journey to Real-Time ML: A Look at Feature Platforms & Modern RT ML Architectures Using Tecton"
    Demonstrates how to build a robust MLOps platform using MLflow and Tecton on Databricks for managing real-time ML models and features, with insights from FanDuel's implementation.
    Watch here

  9. "How to Build a Real Time ML Pipeline for Fraud Prediction"
    Demonstrates how to build a machine learning pipeline with real-time feature engineering for fraud detection, using Iguazio's data science platform to streamline the process from data ingestion to model deployment and monitoring.
    Watch here

  10. "Real-Time ML: Features and Inference // Sasha Ovsankin and Rupesh Gupta // MLOps Podcast #135"
    Explores challenges and solutions in implementing real-time machine learning features and inference at LinkedIn.
    Watch here

  11. "Need for Speed: Machine Learning in the Era of Real-Time"
    Explores the evolution of real-time machine learning, discussing challenges in latency, data freshness, and resource efficiency, while providing insights on implementing RTML solutions.
    Watch here

  12. "Real-Time Event Processing for AI/ML with Numaflow // Sri Harsha Yayi // DE4AI"
    Discusses Intuit's development of Numaflow, an open-source platform designed to simplify event processing and inference on streaming data for machine learning applications.
    Watch here

  13. "ML Batch vs streaming vs real-time data processing"
    Compares batch, streaming, and real-time data processing for machine learning, discussing misconceptions, costs, and decision-making criteria.
    Watch here

  14. "Machine Learning is Going Real-Time"
    Chip Huyen explores the state, use cases, solutions, and challenges of real-time machine learning in production across US and Chinese companies.
    Watch here

  15. "Realtime Stock Market Anomaly Detection using ML Models | An End to End Data Engineering Project"
    Demonstrates building a real-time anomaly detection system for stock market data using Quix Streams, Redpanda, and Docker.
    Watch here

  16. "Real Time ML: Challenges and Solutions - Chip Huyen"
    Explores challenges in implementing real-time machine learning systems, including latency, train-predict inconsistency, and managing streaming infrastructure, while discussing potential solutions and architectures.
    Watch here

  17. "Real-time ML Model Monitoring with Data Sketches and Apache Pinot"
    Demonstrates how Uber leverages Apache Pinot as a data sketch store for ML model monitoring, using data profiling and sketch-based solutions to enable efficient and scalable monitoring across different data sources.
    Watch here

  18. "MLOps vs ML Orchestration // Ketan Umare // MLOps Podcast #183"
    Explores real-time machine learning challenges in traffic prediction and fraud detection, highlighting the importance of buffering and damping reactions in ML systems.
    Watch here

  19. "Lessons Learned: The Journey to Real-Time Machine Learning at Instacart"
    Guanghua Shu discusses Instacart's transition from batch-oriented to real-time ML systems, covering infrastructure changes, use cases, and key lessons learned in implementing real-time ML for their e-commerce platform.
    Watch here

  20. "Leveraging GraphQL for Continual Learning in Real-Time ML Systems"
    Discusses how to set up real-time infrastructure and continual learning using GraphQL for machine learning systems, addressing limitations of batch-training paradigms and enabling adaptive models.
    Watch here

  21. "Real-Time ML Insights with Richmond Alake"
    Explores real-time machine learning tools, techniques, and career insights with Richmond Alake, a Machine Learning Architect at Slalom Build, covering his work experiences and AI startups.
    Watch here

  22. "Why real time event streaming pattern is indispensable for an AI native future"
    Discusses the importance of distributed event streaming for real-time analytics and AI-powered experiences, exploring its applications in data collection, enrichment, and measuring drift and explainability.
    Watch here

  23. "Realtime Prediction with Machine Learning and Data Transform with Redpanda"
    Explores building efficient AI applications using stateless pipelines and WebAssembly-powered streaming data transforms, demonstrating how to simplify data architecture for real-time analytics and machine learning.
    Watch here

  24. "Build Real-time Machine Learning Apps on Generative AI with Kafka Streams"
    Stepan Hinger discusses integrating large language models with data streaming using Kafka, demonstrating real-time AI applications for unstructured data analysis, customer support automation, and business intelligence through a framework called Area.
    Watch here

  25. "Feeding ML models with the data from the databases in real-time - DevConf.CZ 2024"
    Vojtech Juranek demonstrates how to use Debezium to ingest real-time data from databases into machine learning models, showcasing a live demo with TensorFlow and discussing challenges and solutions in implementing such systems.
    Watch here

  26. "Architecting Data and Machine Learning Platforms"
    Marco Tranquillin and Firat Tekiner preview their upcoming book, covering the entire data lifecycle in cloud environments, from ingestion to activation, with a cloud-agnostic approach to data and ML platform architecture.
    Watch here

  27. "Real-Time ML Workflows at Capital One with Disha Singla"
    Disha Singla, Senior Director of Machine Learning Engineering at Capital One, discusses democratizing AI through reusable libraries and workflows for citizen data scientists, focusing on time series analysis, anomaly detection, and fraud prevention.
    Watch here

  28. "ML Auto-Retraining: Update Your Model in Real Time"
    Discusses real-time ML techniques for cybersecurity, including aggregate engines for adapting to attacker shifts, Redis-based key-value stores for tracking indicators of compromise, and an auto-retraining framework for regularly updating models on different cadences.
    Watch here

  29. "Real-Time Data Processing for ML Feature Engineering | Weiran Liu and Ping Chen"
    Discusses Meta's evolution of real-time data processing infrastructure for machine learning, covering applications in recommendation systems, content understanding, and fraud detection, with a focus on their latest platform "Extreme" and its use in real-time feature engineering.
    Watch here

  30. "Building Real-Time ML Pipelines: Challenges and Solutions"
    Yaron Haviv discusses challenges in productizing AI/ML, introduces MLOps and feature stores, and demonstrates building real-time ML pipelines using the open-source MLRun framework, with examples of churn prediction and fraud detection use cases.
    Watch here

  31. "Apache Spark and Apache Kafka for Real-Time Machine Learning"
    This webinar explores the integration of Apache Kafka and Apache Spark for building scalable real-time machine learning pipelines, covering fundamentals of real-time ML, challenges faced by data teams, and optimal usage of these technologies for data processing and analysis.
    Watch here

A list of vendors that provide solutions for machine learning.

🚀 Full-Stack ML Vendors

Vendors that offer end-to-end solutions covering feature engineering, model training, serving, and monitoring.

  • TurboML – A machine learning platform reinvented for real-time workloads. Every step in the ML lifecycle, from data ingestion and feature engineering to model training, deployment, and post-deployment monitoring, is designed to handle real-time data.

    🔹 Real-time Predictions – Get fresh model outputs on-demand with low-latency online inference.
    🔹 Real-time Features – Transform recent data streams as live context for your models.
    🔹 Continual Learning – Update models dynamically with new data.
    🔹 Streaming Integrations – Natively supports real-time data sources.

    TurboML enables ML teams to iterate quickly by testing hypotheses on live production data. Whether refining ETA predictions based on ride completions or improving fraud detection using chargeback events, TurboML ensures models stay relevant and effective.

  • Databricks – Unified data and AI platform with Delta Live Tables for real-time ML workflows.

  • AWS SageMaker – Fully managed ML service with real-time feature ingestion, training, and model deployment.

  • Google Cloud Vertex AI – A fully managed ML platform that unifies data prep, training, and deployment with AutoML and custom model support for real-time inference.

  • Qwak – An end-to-end AI platform that enables organizations to build, deploy, manage, and monitor machine learning workflows.

  • DataRobot – An AI platform that automates the end-to-end process of building, deploying, and maintaining machine learning models.

  • Abacus.ai – An end-to-end platform for building, deploying, and managing AI models with real-time data and automated monitoring.

  • Dataiku – An end-to-end AI platform that streamlines data preparation, model development, deployment, and governance for enterprises.

  • H2O.ai – An AI and machine learning platform providing tools for building, deploying, and managing models at scale with a focus on automation and high performance.

  • Iguazio – An AI platform that streamlines the deployment and management of machine learning applications, offering tools for pipeline orchestration, model monitoring, and GPU provisioning.

  • Xenonstack – An advanced analytics platform offering end-to-end MLOps, data pipeline management, and AI-driven insights.

  • Azure Machine Learning – A comprehensive cloud-based platform for building, deploying, and managing ML models at scale.

  • Modzy – A platform that enables organizations to deploy, connect, and run machine learning models across various environments, including enterprise systems and edge devices, offering fully managed infrastructure, tools, and workflows.

  • ZenML – A framework designed to standardize and streamline machine learning workflows, enabling reproducibility, collaboration, and seamless deployment across diverse environments.

  • Valohai – A machine learning platform for managing and automating ML workflows and deployments.

  • Datatron – An MLOps platform for model management, deployment, and monitoring.

  • ClearML – An MLOps platform designed to streamline the entire machine learning lifecycle.


📊 Feature Engineering & Feature Stores

Focus on feature storage, transformation, and real-time serving.

  • Tecton – Real-time feature platform for ML, integrating feature storage, transformation, and serving.
  • Hopsworks – Feature store + ML pipeline orchestration for real-time ML.
  • Fennel – A real-time feature engineering platform with an efficient CDC-aware engine for fresh, incremental ML computations.
  • Chalk AI – A real-time platform for machine learning that enables data teams to declare features and their dependencies using idiomatic Python in online, streaming, and batch environments.
  • Featurebyte – A self-service platform that automates feature engineering and deployment for ML models.

🧪 ML Experiment Tracking & Model Management

Focus on tracking experiments, managing model versions, and monitoring performance.

  • Weights & Biases – An AI developer platform that streamlines machine learning workflows, offering tools for experiment tracking, model management, and evaluation of generative AI applications.
  • Comet – A full-stack MLOps platform for tracking experiments, managing models, and deploying them to production.
  • Galileo – A platform that gives AI teams a way to evaluate, iterate, monitor and protect AI applications at enterprise scale.

⚡ Model Deployment & Inference

Focus on running, deploying, and serving machine learning models in production.

  • Seldon – An MLOps platform that enables organizations to deploy, manage, monitor, and explain machine learning models at scale.
  • Fal.ai – A platform for high-performance AI model inference and training, specializing in generative media with production-ready APIs and serverless deployment.
  • Modelbit – A platform for running machine learning models in production.

🏢 AI Integration & Operationalization

Focus on integrating AI into business applications and making insights accessible.

  • AI Squared – A data and AI integration platform that helps make intelligent insights accessible to all.
  • MindsDB – A platform that integrates various artificial intelligence (AI) models with traditional databases or other data management systems.

Conferences

| Conference | Date | Location | Format |
| --- | --- | --- | --- |
| HumanX | March 9-13, 2025 | Las Vegas | Onsite |
| Current Bengaluru | March 19, 2025 | Bengaluru | Onsite |
| MLConf | March 27, 2025 | New York | Onsite |
| Data Science Salon SEA | April 16, 2025 | Seattle | Hybrid |
| Data Council | April 22-24, 2025 | San Francisco | Onsite |
| Machine Learning Prague 2025 | April 28, 2025 | Prague | Hybrid |
| Smart Data and AI Summit | May 5-6, 2025 | Riyadh | Onsite |
| Data Science Next Conference Europe | May 7-9, 2025 | Amsterdam | Onsite |
| Conference on Machine Learning and Systems (MLSys) 2025 | May 12-15, 2025 | Santa Clara | Onsite |
| Real-Time Analytics Summit 2025 | May 14, 2025 | Virtual | Virtual |
| Current London | May 20-21, 2025 | London | Onsite |
| AI & Big Data Expo North America 2025 | June 4-5, 2025 | Santa Clara | Onsite |
| Data + AI Summit 2025 | June 9-12, 2025 | San Francisco | Hybrid |
| London Tech Week | June 9-13, 2025 | London | Onsite |
| SuperAI | June 18-19, 2025 | Singapore | Onsite |
| RAISE Summit | July 8-9, 2025 | Paris | Onsite |
| The MachineCon by AIM | July 25, 2025 | New York | Onsite |
| Ai4 | August 11-13, 2025 | Las Vegas | Onsite |

Contributing

Your contributions are always welcome! Please read the contribution guidelines first.

License

This awesome list is under the MIT License.