Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CrateDB vector: Improve SQLAlchemy model factory #13

Merged
merged 1 commit into from
Nov 21, 2023

Conversation

amotl
Copy link

@amotl amotl commented Nov 20, 2023

Hi there,

this patch intends to support fixing GH-11, and to make progress towards GH-12. From now on, all instances 1 of SQLAlchemy model types will be created at runtime through the ModelFactory utility, effectively making the storage model more dynamic.

By using __table_args__ = {"keep_existing": True} on the ORM entity definitions, this seems to work well, even with multiple invocations of CrateDBVectorSearch.from_texts() using different collection_name argument values.

The patch is very hard to read. Additional context/explanation: 1.

With kind regards,
Andreas.

Footnotes

  1. There have been some details of the SA model definitions which still happened at compile-time beforehand. Now, everything concerned about establishing SA model classes is happening at runtime. This may feel weird, but is the approved way to get more flexibility. A future patch will probably leverage that flexibility by also configuring the table name based on variants. 2

@amotl amotl requested review from seut, matriv and andnig November 20, 2023 20:37
@amotl amotl changed the title CrateDB vector: Improve model factory CrateDB vector: Improve SQLAlchemy model factory Nov 20, 2023
From now on, _all_ instances of SQLAlchemy model types will be created
at runtime through the `ModelFactory` utility.

By using `__table_args__ = {"keep_existing": True}` on the ORM entity
definitions, this seems to work well, even with multiple invocations
of `CrateDBVectorSearch.from_texts()` using different `collection_name`
argument values.

While being at it, this patch also fixes a few linter errors.
@amotl amotl requested review from hammerhead, matriv and mkleen and removed request for seut, matriv and hammerhead November 20, 2023 23:27
@amotl amotl merged commit a104222 into cratedb Nov 21, 2023
@amotl amotl deleted the full-dynamic-schema branch November 21, 2023 18:03
amotl pushed a commit that referenced this pull request Mar 28, 2024
…e `text-splitter` directory is copied before poetry installation (langchain-ai#19214)

## Description
This PR modifies the settings in `libs/langchain/dev.Dockerfile` to
ensure that the `text-splitters` directory is copied before the poetry
installation process begins.

Without this modification, the `docker build` command fails for
`dev.Dockerfile`, preventing the setup of some development environments,
including `.devcontainer`.

## Bug Details

### Repro
Run the following command:

```bash
docker build -f libs/langchain/dev.Dockerfile .
```

### Current Behavior
The docker build command fails, raising the following error:

```
...
 => [langchain-dev-dependencies 4/5] COPY libs/community/ ../community/                                                                                0.4s
 => ERROR [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs                                          1.1s
------                                                                                                                                                      
 > [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs:
#13 0.970 
#13 0.970 Directory ../text-splitters does not exist
------
executor failed running [/bin/sh -c poetry install --no-interaction --no-ansi --with dev,test,docs]: exit code: 1
```

### Expected Behavior
The `docker build` command successfully completes without the poetry
error.

### Analysis
The error occurs because the `text-splitters` directory is not copied
into the build environment, unlike the other packages under the `libs`
directory. I suspect that the `COPY` setting was overlooked since
`text-splitters` was separated in a recent PR.

## Fix
Add the following lines to the `libs/langchain/dev.Dockerfile`:

```dockerfile
# Copy the text-splitters library for installation
COPY libs/text-splitters/ ../text-splitters/
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant