Stars
Monte-Carlo simulation of a capital accumulation plan (see the sketch after this list)
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
Quantitative Interview Preparation Guide, updated version here ==>
An up-to-date list of hackathons happening across Europe
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Python training for business analysts and traders
[ICLR 2024] The official implementation of our ICLR 2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models".
Run PyTorch LLMs locally on servers, desktop and mobile
A benchmark for prompt injection detection systems.
An Open-Source Package for Textual Adversarial Attack.
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Personal Expense Report on Google Sheet with Telegram Bot integration
Curated list of project-based tutorials
Tomorrow Brings Greater Knowledge: Large Language Models Join Dynamic Temporal Knowledge Graphs
Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Started as a GBA emulator, now it's an ARM7TDMI emulator :)
Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming"
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
An easy-to-use Python framework to generate adversarial jailbreak prompts.
[ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
C++ web crawler using a breadth-first search for efficient web indexing (see the BFS sketch after this list).
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
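
The capital-accumulation item above mentions Monte-Carlo simulation; here is a minimal Python sketch of that idea, assuming monthly contributions and lognormal monthly returns. The function name simulate_plan and all parameter values are illustrative assumptions, not code from the listed repository.

# Minimal Monte-Carlo sketch of a recurring-investment (capital accumulation) plan.
# The lognormal return model and every default parameter are assumptions for illustration.
import numpy as np

def simulate_plan(monthly_contribution=500.0,
                  years=20,
                  annual_return=0.06,
                  annual_volatility=0.15,
                  n_paths=10_000,
                  seed=0):
    """Simulate the distribution of final values of a monthly savings plan."""
    rng = np.random.default_rng(seed)
    months = years * 12
    # Convert annual drift/volatility to monthly lognormal parameters.
    mu = (annual_return - 0.5 * annual_volatility**2) / 12
    sigma = annual_volatility / np.sqrt(12)

    balances = np.zeros(n_paths)
    for _ in range(months):
        # Contribute at the start of the month, then apply a random monthly return.
        monthly_growth = np.exp(rng.normal(mu, sigma, size=n_paths))
        balances = (balances + monthly_contribution) * monthly_growth
    return balances

if __name__ == "__main__":
    final = simulate_plan()
    print(f"median final value:  {np.median(final):,.0f}")
    print(f"5th/95th percentile: {np.percentile(final, 5):,.0f} / "
          f"{np.percentile(final, 95):,.0f}")

Reading off percentiles of the simulated distribution (rather than a single expected value) is the usual payoff of the Monte-Carlo approach here.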
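
The C++ crawler item above describes breadth-first crawling; below is a minimal sketch of that strategy (a FIFO frontier plus a visited set), written in Python for brevity. The helper bfs_crawl, the regex-based link extraction, and the use of the requests library are assumptions for illustration, not the repository's implementation.

# Minimal breadth-first crawl sketch: a FIFO queue gives level-by-level traversal.
from collections import deque
from urllib.parse import urljoin, urlparse
import re

import requests  # third-party HTTP client, assumed available

LINK_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)

def bfs_crawl(seed_url, max_pages=50):
    """Visit pages in breadth-first order and return the set of URLs seen."""
    queue = deque([seed_url])   # FIFO frontier -> breadth-first order
    visited = {seed_url}
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=5).text
        except requests.RequestException:
            continue            # skip unreachable pages
        for href in LINK_RE.findall(html):
            absolute = urljoin(url, href)
            # Keep only http(s) links and avoid revisiting pages.
            if urlparse(absolute).scheme in ("http", "https") and absolute not in visited:
                visited.add(absolute)
                queue.append(absolute)
    return visited

if __name__ == "__main__":
    for page in sorted(bfs_crawl("https://example.com", max_pages=10)):
        print(page)

The FIFO queue is what makes this breadth-first: all links found at depth N are fetched before any link at depth N+1, which keeps coverage broad before the crawl goes deep.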