Thoughts
This is at once the most mundane, plain Jane bullshit and the coolest thing I've ever done in my life. The sweet spot.
Notes
I made some real progress today. https://github.com/MeDott29/llm_science/blob/main/improved_retrieval.ipynb
I decided it's mandatory to find all the reference chunks for the data in 'train.csv': I want to emulate the exact process I follow when I look through Wikipedia for the paragraph that was used to generate a test question. I was able to do that today.
Summary
Ask the model for the top three search terms:
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Patch the OpenAI client so chat completions accept `response_model`.
client = instructor.patch(OpenAI())

class SearchTerms(BaseModel):
    first: str
    second: str
    third: str

# test_item is one MCQ row from train.csv, loaded earlier in the notebook.
resp = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    response_model=SearchTerms,
    messages=[
        {
            "role": "user",
            "content": f"Read the MCQ and give the top three terms you would use to search for an answer: {test_item}",
        },
    ],
)
print(resp)
```
Use the first term to search the Wikipedia API. Use the second and third terms to find a paragraph that contains both.
Reference paragraph found.
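Here's a minimal sketch of that lookup step, using the MediaWiki search and TextExtracts endpoints; the helper name `find_reference_paragraph` and the five-candidate cap are my own illustrative choices, not something from the notebook:

```python
import requests

API_URL = "https://en.wikipedia.org/w/api.php"

def find_reference_paragraph(terms: SearchTerms) -> str | None:
    # Step 1: search Wikipedia with the first term to get candidate articles.
    search = requests.get(API_URL, params={
        "action": "query", "list": "search",
        "srsearch": terms.first, "format": "json",
    }).json()
    for hit in search["query"]["search"][:5]:
        # Step 2: pull the plain-text body of each candidate article.
        page = requests.get(API_URL, params={
            "action": "query", "prop": "extracts", "explaintext": 1,
            "titles": hit["title"], "format": "json",
        }).json()
        text = next(iter(page["query"]["pages"].values())).get("extract", "")
        # Step 3: return the first paragraph that mentions both remaining terms.
        for para in text.split("\n"):
            low = para.lower()
            if terms.second.lower() in low and terms.third.lower() in low:
                return para
    return None
```

The second and third terms act as a cheap precision filter once the first term has narrowed the candidate articles.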
2023/11/20
Have you seen it? https://www.kaggle.com/competitions/kaggle-llm-science-exam
The competition is over, but I'm working on a system that will hopefully improve on the high score.
The highest-scoring notebook as of right now uses 2.5 TB of reference data and takes 9 hours to complete. Each of the top 50 teams has a score between 91% and 93%, and they all used top-k retrieval and RAG.
I just read https://jxnl.github.io/instructor/blog/2023/11/18/validate-citations/ and figured I'd reach out in hopes that our interests might align on this project.
I'm still grasping at straws, but I think the play right now is to create a dataset for training a model that excels at answering difficult science questions. This repo gets you started on looking at the data and brainstorming schemas: https://github.com/MeDott29/llm_science
Can we use instructor to create a dataset that teaches a model to find the exact chunk of text an MCQ was generated from?
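To make that concrete, here's a rough sketch of what one record in such a dataset might look like; the class name and fields are my guesses at a schema, not anything settled:

```python
from pydantic import BaseModel

class RetrievalExample(BaseModel):
    # Hypothetical record layout for the proposed dataset.
    prompt: str            # the MCQ question text
    options: list[str]     # the five answer choices (A-E)
    answer: str            # the correct choice, e.g. "C"
    source_title: str      # Wikipedia article the question was drawn from
    source_chunk: str      # exact paragraph the MCQ was generated from
```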
I see this work as one of the most important areas in the field right now. I believe it promises to unlock an unprecedented degree of dependability and confidence for both parties in the LLM + person partnership.