[Bug]: SpannerChangeStreamsToBigQuery does not support null primary key column values. #831

Open
rob-pomelo opened this issue Jun 30, 2023 · 6 comments
Labels: bug, needs triage, p2

rob-pomelo commented Jun 30, 2023

Related Template(s)

SpannerChangeStreamsToBigQuery

What happened?

In Spanner it is valid to have nullable columns in a primary key.

However, judging by how the pipeline reads values out of the JSON, it seems to extract null when a primary key column is null and pass that value to BigQuery.

BigQuery does not seem to accept this and throws "This field: XXX is not a record."

This is blocking us from rolling out a new feature that relies on a null primary key column.

Full example:

CREATE TABLE TestTable (
    Col1 STRING(40),
    Col2 STRING(40)
) PRIMARY KEY (Col1, Col2)

Say we delete a row with the following values:

Col1="123",Col2=NULL

keysJson will be:

{\"Col1\":\"123\",\"Col2\":null}

error is:

"location":"col2","message":"This field: col2 is not a record.","reason":"invalid"}],"index":0}}

Beam Version

Newer than 2.46.0

Relevant log output

{"message":{"keysJson":"{\"XXX\":null,\"XXX\":\"XXXX\",\"XXX\":\"XXX\"}","newValuesJson":"{}","commitTimestampSeconds":XXX,"commitTimestampNanos":XXX,"serverTransactionId":"XXX","isLastRecordInTransactionInPartition":false,"recordSequence":"XXX","tableName":"XXX","modType":"DELETE","valueCaptureType":"OLD_AND_NEW_VALUES","numberOfRecordsInTransaction":2,"numberOfPartitionsInTransaction":1,"_metadata_error":{"errors":[{"debugInfo":"","location":"XXX","message":"This field: XXX is not a record.","reason":"invalid"}],"index":0},"_metadata_retry_count":1},"error_message":{"errors":[{"debugInfo":"","location":"XXX","message":"This field: XXX is not a record.","reason":"invalid"}],"index":0}}
rob-pomelo added the bug, needs triage, and p2 labels on Jun 30, 2023
rob-pomelo changed the title from "[Bug]: SpannerChangeStreamsToBigQuery does not support null primary key values." to "[Bug]: SpannerChangeStreamsToBigQuery does not support null primary key column values." on Jun 30, 2023

rob-pomelo commented Jun 30, 2023

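As a follow-up, I dug deeper and I think the issue is that a JSON null in keysJson comes back from JSONObject.get as JSONObject.NULL, not Java null:

// Requires: import org.json.JSONObject
val jsonObject =
      JSONObject("{\"Col1\":null,\"Col2\":\"123\"}")
val v = jsonObject.get("Col1")
val isNotNull = v != null  // true, because v is JSONObject.NULL and not Java null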

Details on null: https://stleary.github.io/JSON-java/org/json/JSONObject.html#NULL

This differs from non-PK fields, where Java null is used and works.

I think an explicit isNull check and skip similar to the following would fix it
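
A minimal sketch of such a check, assuming the org.json library linked above is on the classpath. `nonNullKeys` is a hypothetical helper written for illustration, not code from the template, and whether the pipeline should omit the column or write a real null for it is an open question this sketch does not settle.

```kotlin
import org.json.JSONObject

// Hypothetical helper (not the template's code): copies primary key columns
// out of keysJson, skipping any column whose JSON value is null.
fun nonNullKeys(keysJson: String): Map<String, Any> {
    val keys = JSONObject(keysJson)
    val result = LinkedHashMap<String, Any>()
    for (name in keys.keySet()) {
        // isNull() is true when the value is the JSONObject.NULL sentinel,
        // which a plain `!= null` comparison does not catch.
        if (keys.isNull(name)) continue
        result[name] = keys.get(name)
    }
    return result
}

fun main() {
    // Same keysJson as in the example above: Col2 is a null key column.
    val row = nonNullKeys("{\"Col1\":\"123\",\"Col2\":null}")
    println(row) // Col2 is left out; only Col1 remains
}
```

Using `isNull` keeps the check in one place, so the rest of the row-building code never sees the sentinel object.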

@rob-pomelo (Author)

Hi, any updates here?

This issue has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the issue at any time. Thank you for your contributions.

github-actions bot added the stale label on May 20, 2024

@rob-pomelo (Author)

This is still not fixed.

github-actions bot removed the stale label on Jul 26, 2024

This issue has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the issue at any time. Thank you for your contributions.

github-actions bot added the stale label on Jan 22, 2025

@rob-pomelo (Author)

Still not fixed.

github-actions bot removed the stale label on Jan 23, 2025