5120 – Create Tags in the background #2005

vasconsaurus · 2024-08-27T13:24:44Z

Description

Context

While working on API/Zapier Error: The app did not respond in-time. It may or may not have completed successfully we thought tags creation could be causing the slow response.

We couldn’t pinpoint it as the cause, but it’s still a good performance improvement and should help improve the response time. So we are moving forward with it.

Implementation Notes

Sidekiq’s delayed extensions were deprecated in version 5 and removed in 7. So we are taking this opportunity to create a GenericJob, and eventually substitute the instances of .delay/.delay_for we have.

References: 5120, 4959

How has this been tested?

rails test test/models/tag_test.rb:288
rails test test/workers/generic_worker_test.rb

Things to pay attention to during code review

Please describe parts of the change that require extra attention during code review, for example:

File FFFF, line LL: This refactoring does this and this. Is it consistent with how it’s implemented elsewhere?
Etc.

Checklist

I have performed a self-review of my own code
I have added unit and feature tests, if the PR implements a new feature or otherwise would benefit from additional testing
I have added regression tests, if the PR fixes a bug
I have added logging, exception reporting, and custom tracing with any additional information required for debugging
I considered secure coding practices when writing this code. Any security concerns are noted above.
I have commented my code in hard-to-understand areas, if any
I have made needed changes to the README
My changes generate no new warnings
If I added a third party module, I included a rationale for doing so and followed our current guidelines

vasconsaurus · 2024-08-27T13:50:02Z

@caiosba I tried to summarize what I've been thinking and some of the issues I went through.
Could you take a look and see if this makes any sense at all?

On where I think this work could go

This is a work in progress. I thought it would be nice if we could maybe have a wrapper or something, so the usage of this worker would be similar to delay. So instead of GenericWorker.perform_in... we could call something like Tag.run_in...

Tests

I wrote two tests to try to guide this work, I think they make sense?

rails test test/models/tag_test.rb:288
rails test test/workers/generic_worker_test.rb

On Tag creation:

Contexts

We have different contexts: Creating tags when creating a project media, creating tags by themselves.
Since we are talking about this from the Zapier/Check Integration context, that means creating a tag when we create a project media.

How it works

If a project media has a set_tags array, it runs an after_create callback create_tags: I thought it made sense to have that callback happen in the background.

Annotations

This part of the codebase touches annotations.
I had an issue because originally create_tags sent the entire project_media object as annotator. Here:

check-api/app/models/concerns/project_media_creators.rb

Line 266 in f1f3b6c

self.set_tags.each { |tag| Tag.create!(annotated: self, tag: tag.strip, skip_check_ability: true) } if self.set_tags.is_a?(Array)
But I looked through our graphql queries wiki and saw that just sending the annotator_type and annotator_id should be enough to create a Tag. Here: https://github.com/meedan/check/wiki/GraphQL-query-examples#add-a-tag-to-an-item
I updated create_tags to send only those two values, and it seems to work.
So I didn't have to touch annotations, but I do wonder if we are getting rid of them, if this change here makes sense. Or if we want to go another way about this.

caiosba

Thanks @vasconsaurus , the direction is correct, I left a suggestion for better encapsulation and more clear code. Please let me know if it makes sense to you.

caiosba · 2024-08-27T22:11:31Z

app/models/concerns/project_media_creators.rb

@@ -263,6 +263,6 @@ def create_claim_description_and_fact_check
  end

  def create_tags
-    self.set_tags.each { |tag| Tag.create!(annotated: self, tag: tag.strip, skip_check_ability: true) } if self.set_tags.is_a?(Array)
+    self.set_tags.each { |tag| GenericWorker.perform_in(1.second, 'Tag', 'create!', annotated_type: 'ProjectMedia' , annotated_id: self.id, tag: tag.strip, skip_check_ability: true) } if self.set_tags.is_a?(Array)


Suggestion: I think it's going to be easier to encapsulate and to unit-test if we wrap the whole execution in a single worker run. Something like:

def create_tags if self.set_tags.is_a?(Array) GenericWorker.perform_in(1.second, 'ProjectMedia', 'create_tags_in_background', self.id, self.set_tags.to_json) end end def self.create_tags_in_background(project_media_id, tags_array_as_json) project_media = ProjectMedia.find(project_media_id) tags = JSON.parse(tags_array_as_json) tags.each { |tag| Tag.create! annotated: project_media, tag: tag.strip, skip_check_ability: true) } end

Rationale:

For clarity, I think it's important for the signature of GenericWorker to be explicitly class/method/arguments (I see it's implicit now) - more on that in the next comment

Each tag in a separate background job run means background job overhead for a small task, and it can also lead to inconsistencies

caiosba · 2024-08-27T22:13:27Z

app/workers/generic_worker.rb

+  def perform(klass_name, *args)
+    klass = klass_name.constantize
+    options = args.extract_options!


Suggestion for clarity: The signature could be explicitly klass_name, klass_method, method_args.

caiosba · 2024-08-27T22:15:31Z

@jayjay-w @melsawy , as we talked, please support/follow @vasconsaurus here too.

melsawy · 2024-08-28T12:20:15Z

@jayjay-w @melsawy , as we talked, please support/follow @vasconsaurus here too.

Sure Caio

vasconsaurus · 2024-09-02T19:31:13Z

@melsawy @jayjay-w could you please review? (it is still a work in progress)
Please take into consideration @caiosba requests, and I added some more context in this comment

app/models/project_media.rb

app/workers/generic_worker.rb

test/models/project_media_8_test.rb

melsawy

Left comments related to User.current and tests

vasconsaurus · 2024-09-13T19:16:47Z

@melsawy @caiosba

I have made some updates based on Sawy's review, and started to try to write a wrapper method.
Currently it works, and I added a couple of new tests, but it doesn't work as a drop-in replacement.

I think we could try to implement something like this

ProjectMedia.run_later_in(time).create_tags(project_media_id: self.id, tags_json: self.set_tags.to_json, user_id: self.user_id)

But, even then, we might not or want a drop in replacement? We could pass an object to delay_for, but that is, I think, one of the reasons they removed it. So we might not want to replicate that behavior, and update our code accordingly, what to you think?

Besides that, I have other questions:

What do you think about this so far?
Are there any things you think are missing or anything to change?
How should I move forward?
On errors: My tests pass and the tags get greated but I'm seeing at least a couple errors on Sidekiq:

Let me know if you have any questions 🙂

caiosba · 2024-09-14T00:46:44Z

app/models/project_media.rb

@@ -528,6 +528,19 @@ def add_nested_objects(ms)
    ms.attributes[:requests] = requests
  end

+  def self.create_tags(**params)


In order to make the code more clear and more predictable, it's better to explicitly list all arguments that the method expects, instead of **params (**params is more common in meta-programming).

caiosba · 2024-09-14T00:51:51Z

test/models/project_media_8_test.rb

+
+class ProjectMedia8Test < ActiveSupport::TestCase
+  def setup
+    super


Tests will run faster if you don't run everything that super runs. I suggest you start with empty setup and teardown methods and then add only the steps you actually need:

def setup end def teardown end

test/models/project_media_8_test.rb

caiosba · 2024-09-14T00:55:00Z

test/workers/generic_worker_test.rb

+
+class GenericWorkerTest < ActiveSupport::TestCase
+  def setup
+    super


Tests will run faster if you don't run everything that super runs. I suggest you start with empty setup and teardown methods and then add only the steps you actually need:

def setup end def teardown end

test/workers/generic_worker_test.rb

caiosba · 2024-09-14T00:59:04Z

Great stuff @vasconsaurus ! The error you're seeing in Sidekiq usually means that code was changed but the application was not restarted after that. Just remember to do touch tmp/restart.txt every time you chance code and so you should never see this error in Sidekiq.

I agree we should not do a drop-in replacement. Let's have our own implementation with our own signature. It will be easier to track during the migration too.

I added other comments directly to the code - nothing that changes the direction, and I also don't see anything missing. Once comments are addressed, all tests are passing and all the code is covered, I think this is ready for final review and tests. Great work!

I tried to create a general structure of the worker and two tests: - one test just to make sure the GenericWorker works - one test to make sure tags are created in the background BUT: - The tests are wonky - The worker test passes but shouldn't - The tag tests doesn't work because create_tags is private: - I think that is the place where/when we create the tags, seem we use it in a after_create callback. But I'm not completelly sure, and if I'm right, not sure the best way to test it.

I think the tags are created by a after_create callback inside ProjectMedia. I was having a bit of trouble testing that, so I added the creating of tags to the sample_data method, ans tested that it worked. Now that I know that works, I'll try moving it to the background.

- I added a new test for GenericWorker and specified which models we are testing. I think that might be easier than a too generic test. - I think now the tests make sense (though I'm not sure the tags test should be in there or inside project media tests) - Both tests trying to create tags fail, because we need to pass a whole object to annotations, so that is an issue

I'm unsure if it's ok to change this, but looking in our GraphQL wiki we should be able to create a Tag by sending the annotated_type and id. If we update create_tags to send those two values instead of the entire object, we are able to do it in the background job.

When creating tags while creating an item, we should wrap the whole execution in a single worker run. Each tag in a separate background job run means background job overhead for a small task, and it can also lead to inconsistencies

What we are testing is that tags are being created when we create a project media. The method we use to do that is a method called by project_media, so I thought it made more sense to move the tests.

Each tag in a separate background job run means background job overhead for a small task, and it can also lead to inconsistencies

We should notify Sentry when we fail to create the tags. (CheckSentry.notify also logs the error) Added tests: - make sure create_tags_in_background creates tags associated with the project_media - make sure an error is not raised, but an error is sent to Sentry Added .with_indifferent_access to make sure we are able to get the data from the params hash.

We log the revisions for tag creation in case User.current exists otherwise no logs for this action

- call the notify inside else, before I was always sending a sentry error - it's better to send the project_media_id to Sentry, even if there isn't a project media object, an id might exist

I think the method that was called 'in_background' is actually foreground and vice-versa.

…d of **params while doing this I found a bug: GenericWorker worked for methods that took hashes as a parameter, but not for methods that took standalone parameters. (not sure if standalone parameters is the correct way to call this)

Tests will run faster if you don't run everything that super runs. request here: #2005 (comment)

this way we make sure the team was created, the background job did what it was supposed to.

caiosba

Thanks @vasconsaurus for addressing all the comments from my last review! It's looking great! Just one last ask here: please add a test for this case:

check-api/app/workers/generic_worker.rb

Line 15 in 2c9c1d0

klass.public_send(klass_method)

.

for check-api/app/workers/generic_worker.rb:15 klass.public_send(klass_method)

caiosba · 2024-09-20T21:55:08Z

app/workers/generic_worker.rb

+  def perform(klass_name, klass_method, *method_args)
+    klass = klass_name.constantize
+    options = method_args.extract_options!.with_indifferent_access
+    if options


@vasconsaurus the tests you added in your last commit are still not covering line 15 - I found out that the problem is here... options is always a hash, so if options is always true... you should replace this conditional by something like unless options.blank?, because when there are no parameters, options value is {}.

@caiosba I updated this, but have a question regarding tests:
The two tests I added were passing before, so they are not really working as expected. How could I fix it?

Also, is there any extra logging or error handling that I should add?

Thanks, Manu. No, I don't think any additional logging or error handling is needed here. But for the test, I suggest you actually assess that the method is executed. For now your test for the no-arguments case just verifies that no exception is raised.

true 🤦‍♀️ still, do you think I should verify, assert anything else?

true 🤦‍♀️ still, do you think I should verify, assert anything else?

In the test I think you should make sure that the method is executed... for example, Blank.create! is a class method that doesn't take any arguments and you can use assert_difference 'Blank.count' do to make sure it was executed.

I updated to use assert_difference both here and for the test in line 28.
But one last comment: even when the test wasn't hitting the else, because I was using if options, the test still succeeded. So blank was still created.

So I guess what I mean is, this test makes sure Blank is created, but it doesn't make sure of which branch we are hitting.

Correct, I think that the other block was still able to handle it just fine. That said, maybe the if/else is not even needed, but it's more readable/clear to have them separate.

Yup, so I was wondering about that. I think it might be nice to have the option with just the one straightforward step, but it isn't really needed. Should we keep it or remove it (the conditional)?

Yeah, that's what I said, I think it's more clear to have the conditional. Ruby Meta programming can be harder to read, otherwise :)

options is always a hash, so `if options` is always `true`. replaced this conditional by `unless options.blank?`, because when there are no parameters, options value is `{}`.

caiosba · 2024-09-23T12:38:06Z

@melsawy , since you requested changes before, please re-review this PR.

melsawy

LGTM, great work @vasconsaurus
You can merge once apply Caio's requests and check test coverage.

Pretty straight forward. Sets the maximum retry count to 3. Note We want to add custom retry logic, but that works differently between Sidekiq 5 and 6. So we think it is better to focus later on the Sidekiq upgrade, and until there, worker instances can handle their exceptions instead of relying on the generic worker for that. References: 5120, #2005 PR: 2052

vasconsaurus changed the title ~~WIP – 4959 – Create Tags in the background~~ WIP – 5120 – Create Tags in the background Aug 27, 2024

caiosba requested changes Aug 27, 2024

View reviewed changes

melsawy reviewed Sep 3, 2024

View reviewed changes

app/models/project_media.rb Outdated Show resolved Hide resolved

melsawy reviewed Sep 3, 2024

View reviewed changes

app/workers/generic_worker.rb Show resolved Hide resolved

melsawy reviewed Sep 3, 2024

View reviewed changes

test/models/project_media_8_test.rb Show resolved Hide resolved

melsawy reviewed Sep 3, 2024

View reviewed changes

vasconsaurus force-pushed the 4959-create-tags-in-background branch from a1bde47 to 3536121 Compare September 5, 2024 20:08

vasconsaurus force-pushed the 4959-create-tags-in-background branch from 3536121 to 49f71c0 Compare September 12, 2024 20:19

caiosba requested changes Sep 14, 2024

View reviewed changes

vasconsaurus added 15 commits September 16, 2024 09:32

wip: try to move create_tags to the background

7bc2a5f

update GenericWorker method signature

370d300

move tests from TagTest to ProjectMedia8Test

61f48d0

What we are testing is that tags are being created when we create a project media. The method we use to do that is a method called by project_media, so I thought it made more sense to move the tests.

Create all tags in one background job

e0e5873

Each tag in a separate background job run means background job overhead for a small task, and it can also lead to inconsistencies

avoid raising an error if project media is nil

9a2986d

Pass User.current to the GenericWorker

ef52d2f

We log the revisions for tag creation in case User.current exists otherwise no logs for this action

update tests

6781a9c

update how we notify Sentry

ffd4a14

- call the notify inside else, before I was always sending a sentry error - it's better to send the project_media_id to Sentry, even if there isn't a project media object, an id might exist

update how we set User.current

3a67d18

vasconsaurus added 6 commits September 16, 2024 09:32

switch the methods' names

c8a0dd2

I think the method that was called 'in_background' is actually foreground and vice-versa.

create a wrapper method so this is easier to read

8a6efa5

keep setup and teardown to the bare minimum

e35c643

Tests will run faster if you don't run everything that super runs. request here: #2005 (comment)

add test to ensure job is queued and clean up tests

324ad39

change assert_nothing_raised to assert_difference

b510a7b

this way we make sure the team was created, the background job did what it was supposed to.

vasconsaurus force-pushed the 4959-create-tags-in-background branch from dffb66d to b510a7b Compare September 16, 2024 18:52

vasconsaurus added 2 commits September 20, 2024 09:13

move run_later methods to a module

935f44f

add GenericWorkerHelpersTest

2c9c1d0

vasconsaurus requested a review from caiosba September 20, 2024 14:02

vasconsaurus marked this pull request as ready for review September 20, 2024 14:02

vasconsaurus requested review from DGaffney and jayjay-w as code owners September 20, 2024 14:02

vasconsaurus changed the title ~~WIP – 5120 – Create Tags in the background~~ 5120 – Create Tags in the background Sep 20, 2024

caiosba requested changes Sep 20, 2024

View reviewed changes

add test

ff01619

for check-api/app/workers/generic_worker.rb:15 klass.public_send(klass_method)

vasconsaurus requested a review from caiosba September 20, 2024 19:56

caiosba requested changes Sep 20, 2024

View reviewed changes

caiosba requested a review from melsawy September 20, 2024 21:55

change conditional

5c1c941

options is always a hash, so `if options` is always `true`. replaced this conditional by `unless options.blank?`, because when there are no parameters, options value is `{}`.

melsawy reviewed Sep 23, 2024

View reviewed changes

update tests to assert_difference

9f1141e

vasconsaurus requested a review from caiosba September 23, 2024 13:43

caiosba approved these changes Sep 23, 2024

View reviewed changes

melsawy approved these changes Sep 23, 2024

View reviewed changes

vasconsaurus merged commit a04be2d into develop Sep 23, 2024
10 checks passed

vasconsaurus deleted the 4959-create-tags-in-background branch September 23, 2024 17:06

vasconsaurus mentioned this pull request Sep 25, 2024

Set sidekiq_options retry to 3 for the GenericWorker #2052

Merged

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

5120 – Create Tags in the background #2005

5120 – Create Tags in the background #2005

vasconsaurus commented Aug 27, 2024 •

edited

Loading

vasconsaurus commented Aug 27, 2024

caiosba left a comment

caiosba Aug 27, 2024

caiosba Aug 27, 2024

caiosba commented Aug 27, 2024

melsawy commented Aug 28, 2024

vasconsaurus commented Sep 2, 2024

melsawy left a comment

vasconsaurus commented Sep 13, 2024

caiosba Sep 14, 2024

caiosba Sep 14, 2024

caiosba Sep 14, 2024

caiosba commented Sep 14, 2024

caiosba left a comment

caiosba Sep 20, 2024 •

edited

Loading

vasconsaurus Sep 23, 2024

caiosba Sep 23, 2024

vasconsaurus Sep 23, 2024

caiosba Sep 23, 2024

vasconsaurus Sep 23, 2024

caiosba Sep 23, 2024

vasconsaurus Sep 23, 2024

caiosba Sep 23, 2024

caiosba commented Sep 23, 2024

melsawy left a comment

5120 – Create Tags in the background #2005

5120 – Create Tags in the background #2005

Conversation

vasconsaurus commented Aug 27, 2024 • edited Loading

Description

Context

Implementation Notes

How has this been tested?

Things to pay attention to during code review

Checklist

vasconsaurus commented Aug 27, 2024

On where I think this work could go

Tests

On Tag creation:

Contexts

How it works

Annotations

caiosba left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

caiosba commented Aug 27, 2024

melsawy commented Aug 28, 2024

vasconsaurus commented Sep 2, 2024

melsawy left a comment

Choose a reason for hiding this comment

vasconsaurus commented Sep 13, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

caiosba commented Sep 14, 2024

caiosba left a comment

Choose a reason for hiding this comment

caiosba Sep 20, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

caiosba commented Sep 23, 2024

melsawy left a comment

Choose a reason for hiding this comment

vasconsaurus commented Aug 27, 2024 •

edited

Loading

caiosba Sep 20, 2024 •

edited

Loading