
Extracting RAI Metrics from Langkit #321

Open
ishachinniah-hds opened this issue Feb 11, 2025 · 0 comments
I want to clarify how to use Langkit to extract the following metrics from my genAI application: `prompt.injection`, `prompt.jailbreak_similarity`, `prompt.toxicity`, `response.hallucination`, `response.refusal_similarity`, and `response.toxicity`.

Code:

    ## LANGKIT - Injections, jailbreak/refusal similarity, hallucination and toxicity
    import os

    import whylogs as why
    from langkit import response_hallucination

    # langkit_azure_llm() and text_schema are defined elsewhere in my application
    response_hallucination.init(llm=langkit_azure_llm(), num_samples=3)
    profile = why.log({"prompt": query, "response": response}, schema=text_schema).profile().view().to_pandas()
    # View the data
    print(profile)
    filepath = os.path.join(os.getcwd(), "src", "evaluation", "RAI.csv")
    profile.to_csv(filepath, index=True)  # index=True keeps the metric-name index column

RAI.csv output:

[screenshot of the RAI.csv output table]

In the table above, are the repeating scores under distribution/max, distribution/mean, distribution/median, etc. the actual metric scores? Or is this not the correct way to get these scores?
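For reference, here is a sketch of how I am currently reading a single score out of the `to_pandas()` result. The DataFrame below uses made-up values as a stand-in for the real profile; the `distribution/*` column names and the `prompt.toxicity` / `response.toxicity` index entries are assumed from the screenshot:

```python
import pandas as pd

# Hypothetical stand-in for profile.view().to_pandas(): one row per logged
# column, with the distribution statistics spread across columns.
profile = pd.DataFrame(
    {
        "distribution/mean": [0.12, 0.03],
        "distribution/max": [0.12, 0.03],
        "distribution/median": [0.12, 0.03],
    },
    index=["prompt.toxicity", "response.toxicity"],
)

# With a single prompt/response pair, each metric column summarizes one
# value, so max, mean, and median all collapse to that same value --
# reading any one of them (e.g. distribution/mean) yields the score.
scores = profile["distribution/mean"]
print(scores.loc["prompt.toxicity"])  # -> 0.12
```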

Thank you for clarifying.
