enable importing of CSV data files #201
Comments
Hi @visakha. Thanks for the feedback! We definitely agree that structured data is super important in this space and we welcome suggestions on the best way to incorporate it. We do have a JSON reader (sycamore/sycamore/scans/file_scan.py, line 161 in 33e3245).
Since you mentioned that the data originates in an RDBMS, another question is whether it would be useful to have connectors directly to the database rather than doing an intermediate CSV export.
If we end up doing CSV, we should add TSV at the same time. It's trivial and a lot easier to work with.
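To illustrate why the two formats can share one code path, here is a minimal sketch using only Python's standard `csv` module. The names `DELIMITERS`, `delimiter_for`, and `read_rows` are hypothetical and not part of Sycamore; the only assumption is that the first row of each file is a header.

```python
import csv
import io
from typing import Dict, Iterator, TextIO

# Hypothetical mapping from file extension to field delimiter.
DELIMITERS: Dict[str, str] = {".csv": ",", ".tsv": "\t"}


def delimiter_for(filename: str) -> str:
    """Pick the delimiter from the file extension."""
    lowered = filename.lower()
    for extension, delimiter in DELIMITERS.items():
        if lowered.endswith(extension):
            return delimiter
    raise ValueError(f"unsupported delimited file: {filename!r}")


def read_rows(stream: TextIO, delimiter: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, keyed by the header row."""
    yield from csv.DictReader(stream, delimiter=delimiter)


if __name__ == "__main__":
    sample = io.StringIO("id\tname\n1\tAda\n")
    for row in read_rows(sample, delimiter_for("customers.tsv")):
        print(row)
```

In this sketch the only difference between CSV and TSV is the `delimiter` argument, so supporting both costs one extra dictionary entry.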
The reason I suggest CSV/TSV (rather than a direct DB connection) is the clean boundary: the integration concerns will have a clear starting point.
On Mon, Jan 8, 2024 at 3:44 PM, Alex Meyer wrote:
> If we end up doing CSV, we should add TSV at the same time. It's trivial and a lot easier to work with.
Ingesting structured data is also a big requirement. Companies already have a good understanding of their structured data; they want that data to work for them, and Aryn could be a channel to enable that.
Solution
PDF ingestion should work like a plugin. I have not read the code yet, but if it does, the same plugin architecture could be adopted for other file formats, in this case CSV.
If it does not, I would be more than happy to implement it under guidance.
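As a rough illustration of the plugin idea, the sketch below registers one reader per file extension and dispatches on the filename. It does not reflect how Sycamore is actually structured; `register_reader`, `read_document`, and the reader functions are hypothetical names, and only the Python standard library is used.

```python
import csv
import io
import json
import os
from typing import Callable, Dict, List

# Hypothetical reader signature: raw file text in, list of records out.
Reader = Callable[[str], List[dict]]

_READERS: Dict[str, Reader] = {}


def register_reader(extension: str) -> Callable[[Reader], Reader]:
    """Decorator that registers a reader for one file extension."""
    def decorator(fn: Reader) -> Reader:
        _READERS[extension.lower()] = fn
        return fn
    return decorator


@register_reader(".csv")
def read_csv(text: str) -> List[dict]:
    # Assumes the first row is a header.
    return list(csv.DictReader(io.StringIO(text)))


@register_reader(".json")
def read_json(text: str) -> List[dict]:
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def read_document(filename: str, text: str) -> List[dict]:
    """Dispatch to whichever reader is registered for the extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in _READERS:
        raise ValueError(f"no reader registered for {extension!r}")
    return _READERS[extension](text)
```

With this shape, adding a new format means writing one function and decorating it; the dispatch code stays unchanged.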
Alternatives
No alternatives
Additional context
Say a client has customer data in a CRM system backed by an RDBMS, and we want to add conversational intelligence to that space. How do we do that? We would first export the tables to CSV files 1:1, then use Sycamore to ingest them and build relationships between the CSV files so that existing knowledge can be put to use.
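The export-then-relate workflow described above could be sketched as follows. This uses an SQLite connection as a stand-in for the client's RDBMS and does not call Sycamore at all; `export_tables`, `load_rows`, and `link_rows` are hypothetical helpers, and the foreign-key column names are supplied by the caller rather than discovered.

```python
import csv
import io
import sqlite3
from typing import Dict, List, Optional, Tuple


def export_tables(conn: sqlite3.Connection) -> Dict[str, str]:
    """Return {table_name: csv_text}, one CSV per user table (1:1)."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )]
    exported = {}
    for name in names:
        cursor = conn.execute(f'SELECT * FROM "{name}"')
        buffer = io.StringIO()
        writer = csv.writer(buffer)
        writer.writerow([column[0] for column in cursor.description])  # header
        writer.writerows(cursor)
        exported[name] = buffer.getvalue()
    return exported


def load_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse exported CSV text back into dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def link_rows(
    children: List[Dict[str, str]],
    parents: List[Dict[str, str]],
    fk: str,
    pk: str,
) -> List[Tuple[Dict[str, str], Optional[Dict[str, str]]]]:
    """Pair each child row with the parent row its foreign key points to."""
    index = {parent[pk]: parent for parent in parents}
    return [(child, index.get(child[fk])) for child in children]
```

For example, with `customers` and `orders` tables, `link_rows(orders, customers, fk="customer_id", pk="id")` pairs each order with its customer row, or with `None` when no match exists. Because CSV carries no type information, keys are compared as strings in this sketch.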