Update job chat prompt and add prompt evaluation for issue 97 #118
Short Description
This PR primarily adjusts the system prompt in the job_chat service to be less strict about external information. It also adds a notebook to evaluate both the current online prompt and the new candidate prompt.
Fixes #97; partially addresses #108.
Implementation Details
To address issue #97, the prompt was edited to allow the assistant to provide information about external platforms and services.
To address issue #108, this PR also adds a notebook that generates a prompt test dataset targeting the issue in question. The notebook provides an initial case study of how we can track and evaluate the effects of changes to the LLM pipeline more thoroughly, rather than relying only on qualitative evaluation and spot-checking.
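The dataset-generation step could be sketched roughly as below. This is a hypothetical illustration, not the notebook's actual code: the function names, the prompt texts, and the `answer_fn` stub are all assumptions made for the example.

```python
def build_eval_rows(questions, prompts, answer_fn):
    """Pair each test question with the answer produced under each prompt version.

    answer_fn(prompt, question) -> answer; a stub standing in for the real LLM call.
    """
    rows = []
    for question in questions:
        for version, prompt in prompts.items():
            rows.append({
                "question": question,
                "prompt_version": version,
                "answer": answer_fn(prompt, question),
            })
    return rows


# Example usage with a stubbed "LLM" so the sketch runs offline.
rows = build_eval_rows(
    questions=["Can the job connect to Google Sheets?"],
    prompts={"v1": "strict system prompt", "v2": "relaxed system prompt"},
    answer_fn=lambda prompt, q: f"[{prompt}] answer to: {q}",
)
```

In the real notebook the stub would be replaced by a call to the deployed LLM pipeline, and the resulting rows would be graded to produce the result column described below.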
The small, fully generated evaluation dataset is also added in this PR; it can be used as part of a routine test suite and expanded as we target different issue areas. The dataset records the LLM outputs for the same set of questions under the online v1 prompt and the candidate v2 prompt. The generated result column indicates whether each answer successfully addressed the question, and can be used to compute a success score. On the external-information issue, the dataset shows the new prompt improving the success rate from 20% to 60%.
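Computing the success score from that dataset amounts to a per-prompt-version mean over the graded results. A minimal sketch, assuming a hypothetical layout with one row per (question, prompt version) and a boolean `success` column derived from the result grading:

```python
import pandas as pd

# Toy stand-in for the evaluation dataset added in this PR; the column
# names and values here are assumptions for illustration.
df = pd.DataFrame({
    "question":       ["q1", "q1", "q2", "q2"],
    "prompt_version": ["v1", "v2", "v1", "v2"],
    "success":        [False, True, False, True],
})

# Success rate per prompt version (mean of a boolean column is the
# fraction of successes).
success_rate = df.groupby("prompt_version")["success"].mean()
print(success_rate)
```

Tracking this one number per prompt version is what lets a prompt change be reported as, e.g., "20% to 60%" rather than a purely qualitative judgment.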
AI Usage
Please disclose how you've used AI in this work (it's cool, we just want to know!):
You can read more details in our Responsible AI Policy