✨ feat: Infer required fields. #278

Closed
wants to merge 1 commit into from

Conversation

Okanmercan99
Contributor

No description provided.

@@ -114,7 +115,6 @@ class SchemaDefinitionService(schemaRepository: ISchemaRepository, mappingReposi
val dataFrame = try {
SourceHandler
.readSource(inferTask.name, ToFhirConfig.sparkSession, inferTask.sourceBinding, inferTask.mappingJobSourceSettings.head._2, schema = None)
.limit(1) // It is enough to take the first row to infer the schema.
Collaborator


I understand that this line is removed so we can infer whether a column is required. But if we run inference on a large CSV, won't we have performance problems because it reads every row?

Contributor Author


How can we determine whether a column is mandatory without reviewing all rows? We need to verify that every row contains a value for the column before we can mark it as required.
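
For reference, a minimal sketch of the trade-off discussed in this thread, not the code from this PR. It assumes the DataFrame was read without a schema so that empty CSV cells arrive as nulls. `RequiredColumns`, `inferRequiredColumns`, and `maxRows` are hypothetical names introduced only for illustration. All null counts are computed in a single scan, and an optional row cap bounds the cost on large files at the price of an approximate answer.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, count, when}

// Hypothetical helper, not part of toFHIR or of this PR.
object RequiredColumns {

  /** Returns the columns that contain no null in the scanned rows.
    *
    * @param df      source data, e.g. a CSV read without an explicit schema
    * @param maxRows optional cap on the rows scanned; None scans everything
    */
  def inferRequiredColumns(df: DataFrame, maxRows: Option[Int] = None): Set[String] = {
    val columns = df.columns
    if (columns.isEmpty) {
      Set.empty[String]
    } else {
      // With a cap the result is only a guess based on the first rows.
      val scanned = maxRows.fold(df)(n => df.limit(n))

      // One aggregate per column: count the rows where the column is null.
      // Backticks stop names containing dots from being parsed as nested fields.
      val nullCounts = columns.map(c => count(when(col(s"`$c`").isNull, 1)).as(c))

      // A global aggregation always yields exactly one row, computed in one scan.
      val counts = scanned.select(nullCounts: _*).head()

      columns.zipWithIndex.collect {
        case (name, i) if counts.getLong(i) == 0L => name
      }.toSet
    }
  }
}
```

Calling `RequiredColumns.inferRequiredColumns(dataFrame)` performs the full scan described above, while passing `maxRows = Some(n)` trades accuracy for speed. Note that a DataFrame with zero rows reports every column as required, so callers may want to handle that case separately.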

Okanmercan99 deleted the infer-cardinality branch on January 23, 2025 at 10:48