Note: For this tutorial, you will need an API key and tokens to evaluate closed-source models. Ensure you have:
- An API key from the respective service provider (e.g., OpenAI).
- Sufficient tokens or credits to make API calls and evaluate the models.
You can obtain an API key by signing up on the provider's platform and following their instructions for API access. Make sure to review the pricing and usage limits to manage your costs effectively.
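
As a rough illustration of the setup described above, the sketch below reads the key from an environment variable instead of hard-coding it, and estimates the cost of a call from token counts. `OPENAI_API_KEY` is the variable name the official OpenAI Python client looks for by default. The helper names (`load_api_key`, `estimate_cost_usd`) and any prices you pass in are hypothetical placeholders, so check your provider's pricing page for real rates.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Raises RuntimeError if the variable is unset or empty, so a missing key
    is caught before any API call is attempted.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "export your API key before running the tutorial."
        )
    return key


def estimate_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    usd_per_1k_prompt: float,
    usd_per_1k_completion: float,
) -> float:
    """Estimate the cost of one call from token counts.

    The per-1,000-token prices are inputs supplied by the caller; they are
    assumptions for budgeting, not real provider rates.
    """
    prompt_cost = prompt_tokens / 1000 * usd_per_1k_prompt
    completion_cost = completion_tokens / 1000 * usd_per_1k_completion
    return prompt_cost + completion_cost
```

For example, with assumed prices of $0.01 per 1,000 prompt tokens and $0.03 per 1,000 completion tokens, `estimate_cost_usd(1000, 500, 0.01, 0.03)` comes to about $0.025. Multiplying that by the number of evaluation samples gives a quick budget check before a run.
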
Although this tutorial also covers many open-source examples, practising with both open-source and closed-source models gives you the most complete picture.
- Visit the OpenAI website.
- Sign up for an account using your email address.
- Guide to NumPy - Travis E. Oliphant, PhD
- From Python to Numpy - Nicolas P. Rougier
- Elegant SciPy - Juan Nunez-Iglesias
- Numerical Python - Robert Johansson
- Inside NumPy - Berkeley Institute for Data Science (BIDS)
- NumPy Cheat Sheet: Data Analysis in Python
- Faster Python calculations with Numba: 2 lines of code, 13× speed-up
- https://www.youtube.com/watch?v=cKPlPJyQrt4
- https://github.com/brandon-rhodes/pycon-pandas-tutorial
- https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html
- https://assets.datacamp.com/blog_assets/PandasPythonForDataScience.pdf
- https://www.webpages.uidaho.edu/~stevel/504/Pandas%20DataFrame%20Notes.pdf
- https://github.com/pandas-dev/pandas/blob/main/doc/cheatsheet/Pandas_Cheat_Sheet.pdf
- https://www.olcf.ornl.gov/wp-content/uploads/2013/02/Intro_to_CUDA_C-TS.pdf
- https://web.archive.org/web/20160729221949/https://docs.nvidia.com/cuda/samples/6_Advanced/reduction/doc/reduction.pdf
- https://www.ce.jhu.edu/dalrymple/classes/602/Class13.pdf
- Pattern Recognition and Machine Learning by Christopher Bishop: https://www.springer.com/gp/book/9780387310732
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville: https://www.deeplearningbook.org/
- Machine Learning Yearning by Andrew Ng: https://www.deeplearning.ai/machine-learning-yearning/
- Machine Learning by Andrew Ng on Coursera: https://www.coursera.org/learn/machine-learning
- Deep Learning Specialization by Andrew Ng on Coursera: https://www.coursera.org/specializations/deep-learning
- Introduction to Machine Learning with Python by DataCamp: https://www.datacamp.com/courses/intro-to-machine-learning-with-python
- Machine Learning Crash Course by Google: https://developers.google.com/machine-learning/crash-course
- The Machine Learning Guide on GitHub: https://github.com/mikeroyal/Machine-Learning-Guide
- ML Notes in Markdown by Ashish Patel: https://github.com/ashishpatel26/ML-Notes-in-Markdown
- TensorFlow Learning Resources: https://www.tensorflow.org/resources/learn-ml
- Bookdown: Authoring Books and Technical Documents with R Markdown: https://bookdown.org/
- Deep Learning Neural Networks Explained in Plain English. https://www.freecodecamp.org/news/deep-learning-neural-networks-explained-in-plain-english/.
- Deep learning - Wikipedia. https://en.wikipedia.org/wiki/Deep_learning.
- What Is Deep Learning? - IBM. https://www.ibm.com/topics/deep-learning.
- Difference between a Neural Network and a Deep Learning System. https://www.geeksforgeeks.org/difference-between-a-neural-network-and-a-deep-learning-system/.
- Research with Generative AI - Generative AI @ Harvard - Harvard University. https://www.harvard.edu/ai/research-resources/.
- Generative AI Resources - Wharton AI & Analytics Initiative. https://ai-analytics.wharton.upenn.edu/generative-ai-lab/resources/.
- 11 Best Generative AI Tools and Platforms in 2024 - Turing. https://www.turing.com/resources/generative-ai-tools.
- List of Generative AI Resources - Analytics Vidhya. https://www.analyticsvidhya.com/blog/2024/01/ultimate-list-of-generative-ai-resources/.
- Generative AI Resources for Faculty – University Center for Teaching .... https://teaching.pitt.edu/generative-ai-resources-for-faculty/.
- Free Resources to Master LLMs - Roadmap. https://roadmap.sh/guides/free-resources-to-learn-llms.
- A Comprehensive List of Resources to Master Large Language Models. https://www.kdnuggets.com/a-comprehensive-list-of-resources-to-master-large-language-models.
- 8 Top Open-Source LLMs for 2024 and Their Uses - DataCamp. https://www.datacamp.com/blog/top-open-source-llms.
- Evaluating Large Language Models: A Comprehensive Survey. https://arxiv.org/abs/2310.19736.
- LLM Evaluations: Techniques, Challenges, and Best Practices. https://labelstud.io/blog/llm-evaluations-techniques-challenges-and-best-practices/.
- Best Resources to Learn & Understand Evaluating LLMs. https://towardsai.net/p/data-science/best-resources-to-learn-understand-evaluating-llms.
- LLM Evaluation Metrics : A Complete Guide to Evaluating LLMs. https://aisera.com/blog/llm-evaluation/.
- 7 Ways to Evaluate and Monitor LLMs - WhyLabs. https://whylabs.ai/blog/posts/7-ways-to-evaluate-and-monitor-llms.
- Evaluating Large Language Models: Methods, Best Practices & Tools - Lakera. https://www.lakera.ai/blog/large-language-model-evaluation.
- Prompt Engineering for Generative AI | Machine Learning - Google Developers. https://developers.google.com/machine-learning/resources/prompt-eng.
- dair-ai/Prompt-Engineering-Guide - GitHub. https://github.com/dair-ai/Prompt-Engineering-Guide.
- 3 New Prompt Engineering Resources to Check Out - KDnuggets. https://www.kdnuggets.com/3-new-prompt-engineering-resources.
- Prompt Engineering Guide | Prompt Engineering Guide. https://www.promptingguide.ai/.
- Awesome Prompt Engineering - GitHub. https://github.com/promptslab/Awesome-Prompt-Engineering.
- The 5 Best Vector Databases | A List With Examples - DataCamp. https://www.datacamp.com/blog/the-top-5-vector-databases.
- What is a Vector Database? - Vector Databases Explained - AWS. https://aws.amazon.com/what-is/vector-databases/.
- What is a vector database? - Cloudflare. https://www.cloudflare.com/learning/ai/what-is-vector-database/.
- Vector database | Microsoft Learn. https://learn.microsoft.com/en-us/azure/cosmos-db/vector-database.
- Vector Databases Explained: The Backbone of Modern Semantic ... - Airbyte. https://airbyte.com/data-engineering-resources/vector-databases.
- What is RAG? - Retrieval-Augmented Generation AI Explained - AWS. https://aws.amazon.com/what-is/retrieval-augmented-generation/.
- What is Retrieval-Augmented Generation (RAG) - GeeksforGeeks. https://www.geeksforgeeks.org/what-is-retrieval-augmented-generation-rag/.
- What Is Retrieval-Augmented Generation, aka RAG? - NVIDIA Blog. https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/.
- Retrieval-augmented Generation (RAG): A Comprehensive Guide - DataStax. https://www.datastax.com/guides/what-is-retrieval-augmented-generation.
- Retrieval-augmented generation - Wikipedia. https://en.wikipedia.org/wiki/Retrieval-augmented_generation.
Tool | Application |
---|---|
Giskard | Automated testing for bias, fairness, robustness; integrates with CI/CD pipelines. |
DeepXplore | White-box testing of deep learning models; finds incorrect behaviors under different conditions. |
SHAP | Explaining model predictions by attributing feature importance. |
CleverHans | Adversarial example testing for model robustness. |
ART (Adversarial Robustness Toolbox) | Securing AI models against adversarial attacks and enhancing model robustness. |
Foolbox | Benchmarking adversarial robustness and model evaluation under attack scenarios. |
LangChain | Integrating LLMs and creating pipelines for generative model testing. |
MLflow | Tracking, comparing, and managing machine learning experiments and models. |
Weights & Biases | Monitoring and visualizing model performance, logging experiments, and tracking metrics. |
Facets | Visualizing and understanding dataset distributions and model performance. |
TensorFlow Model Analysis | Provides tools for evaluating models, including fairness and performance analysis. |
DataRobot MLOps | Operationalizing machine learning models, monitoring, and automated testing in production. |
Deepchecks | Open-source tool for data validation and model testing, including drift detection and fairness checks. |
Great Expectations | Data validation and pipeline quality checks for ensuring consistent input data during model training. |
TruEra | Automated testing, model debugging, and explainability; ensures fairness and improves model quality. |
LakeFS | Version control for data lakes, allowing for reproducible data pipelines in machine learning projects. |
Feast | Open-source feature store that helps manage, store, and serve features for ML models in production. |
Alibi Detect | Open-source library for drift detection, outlier detection, and model monitoring. |
Seldon Core | Machine learning deployment platform that supports monitoring and testing of production models. |
Kubeflow | Manages ML workflows, model training, and serves models; useful for testing AI/ML pipelines. |
PolyCoder (Carnegie Mellon University) | Open-source code generation model; useful as a baseline when testing the quality of ML-based code generation. |
H2O.ai Driverless AI | Automated machine learning platform focusing on model explainability, bias detection, and fairness. |
Neptune.ai | Experiment tracking tool for collaborative development and monitoring AI models in production. |
ExplainX.ai | Focuses on model explainability and fairness, helping ensure transparent AI/ML decisions. |
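
Several tools in the table (for example Deepchecks and Alibi Detect) list drift detection among their features. To make that idea concrete, here is a minimal sketch of one common drift signal, the two-sample Kolmogorov-Smirnov statistic, written with the standard library only. It does not use or reproduce any of these tools' APIs; the function names and the 0.3 threshold are illustrative assumptions.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the two
    samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    if not reference or not current:
        raise ValueError("Both samples must be non-empty.")
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # occurs at one of the observed values.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def has_drifted(
    reference: Sequence[float],
    current: Sequence[float],
    threshold: float = 0.3,  # hypothetical cut-off; tune per feature
) -> bool:
    """Flag drift when the KS statistic exceeds the chosen threshold."""
    return ks_statistic(reference, current) > threshold
```

A typical use is to compare a feature's values in the training set (`reference`) against a recent production batch (`current`). Dedicated libraries add proper significance tests, multivariate detectors, and reporting on top of this basic comparison.
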