This project aims to develop a framework that uses Large Language Models (LLMs) to stage breast cancer from radiology and pathology reports. Here is the general framework:
The retrieval pipeline is based on indexing. Each merged note is split into multiple chunks, and each chunk is then embedded. The resulting embeddings are stored in a vector database. A conversational retrieval step then uses similarity search to incorporate only the relevant chunks. The figure below illustrates this pipeline.
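The indexing and retrieval steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the description does not name the embedding model, the vector database, or the chunking parameters, so `split_into_chunks`, `embed`, `InMemoryVectorStore`, and the default sizes are all hypothetical stand-ins. In particular, `embed` is a toy hashed bag-of-words vector used only so the example runs without external dependencies; a real pipeline would call an actual embedding model.

```python
import hashlib
import math
import re


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words (assumed sizes)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text, dim=256):
    """Toy embedding (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self.entries = []  # list of (embedding, chunk) pairs

    def add(self, chunks):
        for chunk in chunks:
            self.entries.append((embed(chunk), chunk))

    def similarity_search(self, query, k=2):
        """Return the k chunks with the highest cosine similarity to the query."""
        q = embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, vec)), chunk)
            for vec, chunk in self.entries
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]


# Usage with an invented placeholder note (not real patient data).
merged_note = (
    "Radiology report: a mass is seen in the left breast. "
    "Pathology report: invasive ductal carcinoma, no lymph node involvement."
)
store = InMemoryVectorStore()
store.add(split_into_chunks(merged_note, chunk_size=12, overlap=3))
relevant_chunks = store.similarity_search("lymph node involvement", k=1)
```

Only the chunks returned by `similarity_search` would be passed on to the LLM, which keeps the prompt focused on the parts of the note that matter for the staging question.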