Architecture Decision: choice of local Database (Postgres)

Problem

So far, this project has persisted data by writing directly to files: data is managed as JSON objects, and those objects are flushed to files. The project has no database yet. This issue researches the choice of database, based on the project's existing and upcoming feature requirements.

Requirements:

Lightweight (simple metric: startup time when you spawn the Docker DB container)
Atomicity (for the applied and skipped jobs use cases)
ORM support (required as modules get more complex)
Local RAG requirements, vector support
High query flexibility over horizontal scaling (it is a local database)
Data modeling flexibility (research whether we need this; the current data model fits an RDBMS, as it is structured data)
- Do we have use cases, or are we going to get use cases, where nested JSON is required? (This is where MongoDB excels.)
Support for analytical queries
Storage constraints (NoSQL data must be denormalized) (low priority, as storage cost is always low)
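
The atomicity requirement for applied and skipped jobs can be illustrated with a minimal sketch. It uses Python's standard-library sqlite3 purely as a stand-in for any transactional RDBMS, so it runs without a server. The `jobs` table, its columns, and the `mark_jobs` and `statuses` helpers are hypothetical and not taken from this project.

```python
import sqlite3

# In-memory database as a stand-in for the real store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " status TEXT NOT NULL"
    " CHECK (status IN ('pending', 'applied', 'skipped'))"
    ")"
)
conn.executemany(
    "INSERT INTO jobs (id, title, status) VALUES (?, ?, 'pending')",
    [(1, "Backend Engineer"), (2, "Data Engineer")],
)
conn.commit()


def mark_jobs(conn, updates):
    """Apply every (job_id, status) update in one transaction, or none."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for job_id, status in updates:
                conn.execute(
                    "UPDATE jobs SET status = ? WHERE id = ?",
                    (status, job_id),
                )
        return True
    except sqlite3.IntegrityError:
        return False


def statuses(conn):
    """Return {job_id: status} for all jobs."""
    return dict(conn.execute("SELECT id, status FROM jobs ORDER BY id"))


# The second update violates the CHECK constraint, so the first one
# is rolled back as well and both jobs stay 'pending'.
failed = mark_jobs(conn, [(1, "applied"), (2, "not-a-status")])
after_failure = statuses(conn)

# A fully valid batch is committed as a unit.
succeeded = mark_jobs(conn, [(1, "applied"), (2, "skipped")])
after_success = statuses(conn)
```

With plain JSON files, a crash between two writes can leave a job recorded in an inconsistent state; a transactional database avoids that by committing the whole batch or nothing.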

Choice 1: MongoDB

For now, the most popular NoSQL database is MongoDB, and it has good Python support (https://github.com/mongodb/mongo-python-driver).

Choice 2: Postgres

Choice 3: SQLite

TBD

Alternatives Considered

TBD

Reference

TBD

From @Surapuramakhil:
MongoDB would be a good choice, at least if we had nested JSON use cases (we don't even have those). It lacks ORM support, so it is not a good choice for the data requirements of complex software modules.

Decision:

Going with Postgres (to cover all RDBMS use cases).
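
As one illustration of the relational use cases mentioned here, the sketch below runs the kind of analytical query that a SQL database makes straightforward. It again uses the standard-library sqlite3 as a stand-in so that it runs without a server; the query uses only standard SQL that is expected to work on Postgres as well. The `applications` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory stand-in database with made-up sample data (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applications (company TEXT NOT NULL, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO applications (company, status) VALUES (?, ?)",
    [
        ("Acme", "applied"),
        ("Acme", "skipped"),
        ("Acme", "applied"),
        ("Globex", "skipped"),
    ],
)
conn.commit()

# Count applied and skipped jobs per company in a single query.
rows = conn.execute(
    """
    SELECT company,
           SUM(CASE WHEN status = 'applied' THEN 1 ELSE 0 END) AS applied,
           SUM(CASE WHEN status = 'skipped' THEN 1 ELSE 0 END) AS skipped
    FROM applications
    GROUP BY company
    ORDER BY company
    """
).fetchall()

for company, applied, skipped in rows:
    print(f"{company}: applied={applied}, skipped={skipped}")
```

Doing the same over JSON files would mean loading every record into memory and aggregating by hand in application code.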

To be added later if needed:

MongoDB, to satisfy these use cases: we currently use JSON files and want to store the data in NoSQL and vector databases for better LLM responses and a RAG model.

A NoSQL database would store the raw data we collect from the internet before it is processed and pushed into the vector database. Because we work with unstructured text data, the flexibility of a NoSQL database fits. It would integrate with the different job board APIs as a unified data warehouse. See the FTI design and the LLM Twin architecture.
