How to update values from PCollection. #1
Hi, thanks for using the library! I would start by adding error logging, just like in this example: https://github.com/medzin/beam-postgres/blob/main/examples/write_error.py#L31 Also, it looks like there is a typo in the if value is not None:
yield DocId(None) This |
Thank you for your prompt reply. Yes, that seems to have been the case. As you might guess, this is just an imitation of the actual code I am trying to run. Let me recheck whether I have made the same mistake in my main code. I tried to log the error, and here is the error I got.
It seems like I am passing the correct object and it still looks like it does not recognise the parameter.
|
The error message suggests that |
But the error message shows the dataclass value, right? The document id is clearly set to a valid value. |
What runner are you using to run the pipeline (e.g. DataflowRunner)? |
Right now I'm testing using the direct runner but eventually I'll be using the dataflow runner |
I experienced a few problems with data class serialization on Dataflow: the fields appeared to be set, but some class metadata was lost after deserialization, and the function call here returned an empty tuple. Data classes only work without problems when I put my models in a dedicated Python package outside the pipeline code. I use |
The issue is that I am not even using Dataflow yet, just the plain old direct runner. I see that the code example above actually updates the required documents. And in my main code, I am actually using the exact same environment |
Well, I'm using Python 3.9.16, which could be the difference causing the problem. Still, I suggest you put the data class in a dedicated Python package and check if it removes the problematic behavior. |
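One way to check whether the data class metadata survives is to inspect an element right before it reaches the write step. A minimal sketch, assuming the call that returned an empty tuple is `dataclasses.fields` (the thread does not show it), and using a hypothetical `DocId` model with a single made-up `document_id` field:

```python
import dataclasses
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocId:
    # Hypothetical field: the real model is not shown in the thread.
    document_id: Optional[int]


def check_dataclass_metadata(obj) -> tuple:
    """Return the field names of a dataclass instance, or raise if none are visible."""
    if not dataclasses.is_dataclass(obj):
        raise TypeError(f"{type(obj).__name__} is not a dataclass instance")
    names = tuple(f.name for f in dataclasses.fields(obj))
    if not names:
        # An empty tuple here would match the symptom described above.
        raise ValueError(f"{type(obj).__name__} has no visible dataclass fields")
    return names


print(check_dataclass_metadata(DocId(42)))
```

Calling the helper from a `beam.Map` step placed before the write would show whether the fields are still visible at that point in the pipeline.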
Okay, let me try that. I'll report the results. |
Hey, so what I did was use tuples instead of dataclasses. It seems to work now, although I am not entirely sure what was causing the line you mentioned to fail. |
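For anyone landing here later, the tuple-based workaround could look roughly like the sketch below. The original pipeline code is not shown in this thread, so the function name `to_rows` and the one-element row shape are assumptions made for illustration only:

```python
from typing import Iterable, Iterator, Optional, Tuple


def to_rows(values: Iterable[Optional[int]]) -> Iterator[Tuple[int]]:
    """Yield one-element tuples for the non-None values (hypothetical row shape)."""
    for value in values:
        if value is not None:
            # Yield the value itself, not None, inside the tuple.
            yield (value,)


print(list(to_rows([1, None, 2])))
```

Plain tuples carry no class metadata, so there is nothing to lose during serialization.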
The problem is that the |
Hey, I have been trying to use your library (which is very useful, by the way) to update a few records in my database. I am providing a snippet of code here which should give some insight into what I am trying to achieve.
Unfortunately, this does not seem to be working. The pipeline runs fine, but I do not see the changes reflected in my database. I am really new to the apache-beam space and I am having trouble troubleshooting this. Could you help?