diff --git a/tamingllms/_build/.doctrees/environment.pickle b/tamingllms/_build/.doctrees/environment.pickle
index ac149b2..548e3b2 100644
Binary files a/tamingllms/_build/.doctrees/environment.pickle and b/tamingllms/_build/.doctrees/environment.pickle differ
diff --git a/tamingllms/_build/.doctrees/markdown/preface.doctree b/tamingllms/_build/.doctrees/markdown/preface.doctree
new file mode 100644
index 0000000..b9d4c1b
Binary files /dev/null and b/tamingllms/_build/.doctrees/markdown/preface.doctree differ
diff --git a/tamingllms/_build/.doctrees/markdown/toc.doctree b/tamingllms/_build/.doctrees/markdown/toc.doctree
index 47492f3..de117bd 100644
Binary files a/tamingllms/_build/.doctrees/markdown/toc.doctree and b/tamingllms/_build/.doctrees/markdown/toc.doctree differ
diff --git a/tamingllms/_build/.doctrees/notebooks/alignment.doctree b/tamingllms/_build/.doctrees/notebooks/alignment.doctree
index 8cd2c14..423fb7a 100644
Binary files a/tamingllms/_build/.doctrees/notebooks/alignment.doctree and b/tamingllms/_build/.doctrees/notebooks/alignment.doctree differ
diff --git a/tamingllms/_build/.doctrees/notebooks/evals.doctree b/tamingllms/_build/.doctrees/notebooks/evals.doctree
index f957c46..7dd38bb 100644
Binary files a/tamingllms/_build/.doctrees/notebooks/evals.doctree and b/tamingllms/_build/.doctrees/notebooks/evals.doctree differ
diff --git a/tamingllms/_build/.doctrees/notebooks/output_size_limit.doctree b/tamingllms/_build/.doctrees/notebooks/output_size_limit.doctree
index 16a3106..618cb60 100644
Binary files a/tamingllms/_build/.doctrees/notebooks/output_size_limit.doctree and b/tamingllms/_build/.doctrees/notebooks/output_size_limit.doctree differ
diff --git a/tamingllms/_build/.doctrees/notebooks/safety.doctree b/tamingllms/_build/.doctrees/notebooks/safety.doctree
index c188f26..d343b78 100644
Binary files a/tamingllms/_build/.doctrees/notebooks/safety.doctree and b/tamingllms/_build/.doctrees/notebooks/safety.doctree differ
diff --git a/tamingllms/_build/.doctrees/notebooks/structured_output.doctree b/tamingllms/_build/.doctrees/notebooks/structured_output.doctree
index 67348dd..951501e 100644
Binary files a/tamingllms/_build/.doctrees/notebooks/structured_output.doctree and b/tamingllms/_build/.doctrees/notebooks/structured_output.doctree differ
diff --git a/tamingllms/_build/html/_images/salad.png b/tamingllms/_build/html/_images/salad.png
new file mode 100644
index 0000000..637e182
Binary files /dev/null and b/tamingllms/_build/html/_images/salad.png differ
diff --git a/tamingllms/_build/html/_sources/markdown/preface.md b/tamingllms/_build/html/_sources/markdown/preface.md
new file mode 100644
index 0000000..56a5338
--- /dev/null
+++ b/tamingllms/_build/html/_sources/markdown/preface.md
@@ -0,0 +1,26 @@
+# Preface
+
+```{epigraph}
+Models tell you merely what something is like, not what something is.
+
+-- Emanuel Derman
+```
+
+
+An alternative title for this book could have been "Language Models Behaving Badly". If you come from a background in financial modeling, you may have noticed the parallel with Emanuel Derman's seminal work "Models.Behaving.Badly" {cite}`derman2011models`. This parallel is not coincidental. Just as Derman cautioned against treating financial models as perfect representations of reality, this book aims to highlight the limitations and pitfalls of Large Language Models (LLMs) in practical applications (barring, of course, the fact that Derman is an actual physicist and legendary author, professor, and quant; I am not).
+
+The book "Models.Behaving.Badly" by Emanuel Derman, a former physicist and Goldman Sachs quant, explores how financial and scientific models can fail when we mistake them for reality rather than treating them as approximations full of assumptions.
+The core premise of his work is that while models can be useful tools for understanding aspects of the world, they inherently involve simplification and assumptions. Derman argues that many financial crises, including the 2008 crash, occurred partly because people put too much faith in mathematical models without recognizing their limitations.
+
+Like financial models that failed to capture the complexity of human behavior and market dynamics, LLMs have inherent constraints. They can hallucinate facts, struggle with logical reasoning, and fail to maintain consistency across long outputs. Their responses, while often convincing, are probabilistic approximations based on training data rather than expressions of true understanding, even though humans insist on treating them as "machines that can reason".
+
+Today, there is a growing, pervasive belief that these models can solve any problem, understand any context, or generate any content the user wishes. Moreover, language models that were initially designed as next-token prediction machines and chatbots are now being twisted and wrapped into "reasoning" machines and integrated into technology products and workflows that control, affect, or decide actions in our daily lives. This technological optimism, coupled with a lack of understanding of the models' limitations, may pose risks we are still trying to figure out.
+
+This book serves as an introductory, practical guide for practitioners and technology product builders - software engineers, data scientists, and product managers - who want to create the next generation of GenAI-based products with LLMs while remaining clear-eyed about their limitations and, therefore, their implications for end users. Through detailed technical analysis and reproducible Python code examples, we explore the gap between LLM capabilities and reliable software product development.
+
+The goal is not to diminish the transformative potential of LLMs, but rather to promote a more nuanced understanding of their behavior. By acknowledging and working within their constraints, developers can create more reliable and trustworthy applications. After all, as Derman taught us, the first step to using a model effectively is understanding where it breaks down.
+
+## References
+```{bibliography}
+:filter: docname in docnames
+```
\ No newline at end of file
diff --git a/tamingllms/_build/html/_sources/notebooks/safety.ipynb b/tamingllms/_build/html/_sources/notebooks/safety.ipynb
index 2759eb3..7074e47 100644
--- a/tamingllms/_build/html/_sources/notebooks/safety.ipynb
+++ b/tamingllms/_build/html/_sources/notebooks/safety.ipynb
@@ -413,7 +413,253 @@
"source": [
"## Technical Implementation Components\n",
"\n",
- "### Datasets\n",
+ "### Benchmarks & Datasets\n",
+ "\n",
+ "\n",
+ "#### SALAD-Bench\n",
+ "\n",
+ "SALAD-Bench {cite}`li2024saladbenchhierarchicalcomprehensivesafety` is a recently published benchmark designed for evaluating the safety of Large Language Models (LLMs). It aims to address limitations of prior safety benchmarks which focused on a narrow perspective of safety threats, lacked challenging questions, relied on time-consuming and costly human evaluation, and were limited in scope. SALAD-Bench offers several key features to aid in LLM safety:\n",
+ "\n",
+ "* **Compact Taxonomy with Hierarchical Levels:** It uses a structured, three-level hierarchy consisting of 6 domains, 16 tasks, and 66 categories for in-depth safety evaluation across specific dimensions. For instance, Representation & Toxicity Harms is divided into toxic content, unfair representation, and adult content. Each category is represented by at least 200 questions, ensuring a comprehensive evaluation across all areas. \n",
+ "* **Enhanced Difficulty and Complexity:** It includes attack-enhanced questions generated using methods like human-designed prompts, red-teaming LLMs, and gradient-based methods, presenting a more stringent test of LLMs’ safety responses. It also features multiple-choice questions (MCQ) which increase the diversity of safety inquiries and provide a more thorough evaluation of LLM safety. \n",
+ "* **Reliable and Seamless Evaluator:** SALAD-Bench features two evaluators: MD-Judge for question-answer pairs and MCQ-Judge for multiple-choice questions. MD-Judge is an LLM-based evaluator fine-tuned on standard and attack-enhanced questions labeled according to the SALAD-Bench taxonomy. It integrates taxonomy details into its input and classifies responses based on customized instruction tasks. MCQ-Judge uses in-context learning and regex parsing to assess performance on multiple-choice questions. \n",
+ "* **Joint-Purpose Utility:** In addition to evaluating LLM safety, SALAD-Bench can be used to assess both LLM attack and defense methods. It contains subsets for testing attack techniques and examining defense capabilities, allowing researchers to improve LLM resilience against attacks. \n",
+ "\n",
+ "{numref}`salad-bench` illustrates SALAD-Bench's question enhancement and evaluation methodology. Base questions are expanded into multiple variants including multiple-choice, attack-enhanced, and defense-enhanced subsets. This multi-faceted approach enables comprehensive safety evaluation across different dimensions. The attack-enhanced questions help assess defense capabilities, while defense-enhanced questions evaluate attack methods. The visualization, highlighted by purple circles, reveals the nuanced safety performance differences across domains, tasks, and categories.\n",
+ "\n",
+ "\n",
+ "```{figure} ../_static/safety/salad.png\n",
+ "---\n",
+ "name: salad-bench\n",
+ "alt: SALAD-Bench's compact taxonomy with hierarchical levels.\n",
+ "width: 70%\n",
+ "align: center\n",
+ "---\n",
+ "SALAD-Bench's compact taxonomy with hierarchical levels {cite}`li2024saladbenchhierarchicalcomprehensivesafety`.\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The SALAD-Bench benchmark is accompanied by a Leaderboard {cite}`opensafetylab2024saladbenchleaderboard` and a dataset available on Hugging Face {cite}`opensafetylab2024saladdata`.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/home/tobias/src/tamingLLMs/tamingllms/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
+ " from .autonotebook import tqdm as notebook_tqdm\n",
+ "Generating train split: 100%|██████████| 21318/21318 [00:00<00:00, 66534.59 examples/s]\n"
+ ]
+ }
+ ],
+ "source": [
+ "SALAD_BENCH_DATASET = \"OpenSafetyLab/Salad-Data\"\n",
+ "\n",
+ "from datasets import load_dataset\n",
+ "\n",
+ "dataset = load_dataset(SALAD_BENCH_DATASET, name='base_set', split='train')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Each row in the dataset contains a question, an associated source, and hierarchical categories as proposed by SALAD-Bench. The question is a potentially harmful prompt to be evaluated, which has been aggregated by a source. An example of a source is \"GPTFuzzer\" {cite}`gptfuzzer2024` which explores red teaming of large language models (LLMs) using auto-generated jailbreak prompts. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "