Thoughts
This is at once the most mundane, plain Jane bullshit and the coolest thing I've ever done in my life. The sweet spot.
Notes
I made some real progress today. https://github.com/MeDott29/llm_science/blob/main/improved_retrieval.ipynb
I decided it's mandatory to find all the reference chunks for the data in 'train.csv': I want to emulate the exact process I follow when I look through Wikipedia for the paragraph that was used to generate a test question. I was able to do that today.
Summary
Ask the model for the top three search terms:
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Patch the OpenAI client so chat completions accept `response_model`.
client = instructor.patch(OpenAI())

class SearchTerms(BaseModel):
    first: str
    second: str
    third: str

# test_item is one MCQ row from train.csv, loaded earlier in the notebook.
resp = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    response_model=SearchTerms,
    messages=[
        {
            "role": "user",
            "content": f"Read the MCQ and give the top three terms you would use to search for an answer: {test_item}",
        },
    ],
)
print(resp)
```
Use the first term to search the Wikipedia API. Use the second and third terms to find a paragraph that contains both.
Reference paragraph found.
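Here's a minimal sketch of that lookup step, using the MediaWiki search and TextExtracts endpoints; the helper name `find_reference_paragraph` and the five-candidate cap are my own illustrative choices, not something from the notebook:

```python
import requests

API_URL = "https://en.wikipedia.org/w/api.php"

def find_reference_paragraph(terms: SearchTerms) -> str | None:
    # Step 1: search Wikipedia with the first term to get candidate articles.
    search = requests.get(API_URL, params={
        "action": "query", "list": "search",
        "srsearch": terms.first, "format": "json",
    }).json()
    for hit in search["query"]["search"][:5]:
        # Step 2: pull the plain-text body of each candidate article.
        page = requests.get(API_URL, params={
            "action": "query", "prop": "extracts", "explaintext": 1,
            "titles": hit["title"], "format": "json",
        }).json()
        text = next(iter(page["query"]["pages"].values())).get("extract", "")
        # Step 3: return the first paragraph that mentions both remaining terms.
        for para in text.split("\n"):
            low = para.lower()
            if terms.second.lower() in low and terms.third.lower() in low:
                return para
    return None
```

The second and third terms act as a cheap precision filter once the first term has narrowed the candidate articles.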
2023/11/20
Have you seen it? https://www.kaggle.com/competitions/kaggle-llm-science-exam
The competition is over, but I'm working on a system that will hopefully improve on the high score.
The highest-scoring notebook as of right now uses 2.5 TB of reference data and takes 9 hours to complete. Each of the top 50 teams has a score between 91% and 93%, and they all used top-k retrieval and RAG.
I just read https://jxnl.github.io/instructor/blog/2023/11/18/validate-citations/ and figured I'd reach out in hopes that our interests might align on this project.
I'm still grasping at straws, but I think the play right now is to create a dataset for training a model that excels at answering difficult science questions. This repo gets you started on looking at the data and brainstorming schemas: https://github.com/MeDott29/llm_science
Can we use instructor to create a dataset that teaches a model to find the exact chunk of text an MCQ was generated from?
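To make that concrete, here's a rough sketch of what one record in such a dataset might look like; the class name and fields are my guesses at a schema, not anything settled:

```python
from pydantic import BaseModel

class RetrievalExample(BaseModel):
    # Hypothetical record layout for the proposed dataset.
    prompt: str            # the MCQ question text
    options: list[str]     # the five answer choices (A-E)
    answer: str            # the correct choice, e.g. "C"
    source_title: str      # Wikipedia article the question was drawn from
    source_chunk: str      # exact paragraph the MCQ was generated from
```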
I see this work as one of the most important areas in the field right now. I believe it promises to unlock an unprecedented degree of dependability and confidence for both parties in the LLM + person partnership.