Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebuild image on changes to static class variables #1959

Merged
merged 10 commits into from
Jul 5, 2024

Conversation

TheQuantumFractal
Copy link
Contributor

@TheQuantumFractal TheQuantumFractal commented Jul 2, 2024

Describe your changes

Addresses MOD-2989

This code change allows for static class variables that are accessed by the build function to be tracked for rebuilds. In particular, it determines the static class variables of the parent class if one is defined. It then traverses the code of the function to see which class attributes are accessed. The set of accessed static class variables is then tracked alongside the globals in the build_function definition.

Also added tests (tests need Mock Client Servicer to only build an image on new requests for ImageGetOrCreate)

Backward/forward compatibility checks

Check these boxes or delete any item (or this section) if not relevant for this PR.

  • [-] Client+Server: this change is compatible with old servers
    NOTE: This change will force image rebuilds for all classes with static class variables
  • Client forward compatibility: this change ensures client can accept data intended for later versions of itself

Note on protobuf: protobuf message changes in one place may have impact to
multiple entities (client, server, worker, database). See points above.


Changelog

  • Image layers defined with a @modal.build method will now rebuild when there are changes to the values of any class variables referenced by the build method.

@TheQuantumFractal TheQuantumFractal changed the title Rebuild image on changes to instance variables Rebuild image on changes to static class variables Jul 3, 2024
@luiscape luiscape requested a review from mwaskom July 3, 2024 03:19
@mwaskom
Copy link
Contributor

mwaskom commented Jul 3, 2024

Not a comment on the implementation, but IMO, this might produce a net decrease in the coherence of the client behavior. I think it's especially a little weird to rebuild the image on changes to static class variables — but not on changes to class initialization parameters — especially since init params are one of the main reasons to use the class Function pattern.

I can't see the Slack discussion that's linked from the Linear ticket though; @aksh-at do you remember the original context here?

@@ -323,6 +323,9 @@ def _extract_code_globals(co):

return out_names

def _extract_code_attr(co):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These modifications essentially fork the vendored code. Can we move these out?

Copy link
Contributor

@mwaskom mwaskom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits and suggestions, but I do think we should figure out a different home for your additional extraction utilities to reduce the chance for future confusion if we need to update the vendored cloudpickle.

modal/_utils/function_utils.py Outdated Show resolved Hide resolved
modal/_utils/function_utils.py Outdated Show resolved Hide resolved
modal/_vendor/cloudpickle.py Outdated Show resolved Hide resolved
test/image_test.py Outdated Show resolved Hide resolved
test/image_test.py Outdated Show resolved Hide resolved
Copy link
Contributor

@mwaskom mwaskom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment on lines +204 to +206
import dis

import opcode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these heavy imports? We should already have loaded them anyway on client startup via modal._serialization -> modal._vendor.cloudpickle. Maybe we should care more about optimizing client startup time (although I think we need serialization utilities to do most things).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should be pretty light

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, not reason not to import them at the top of the module then IMO

rebuild_app.cls(image=Image.debian_slim())(FooInstance)
with rebuild_app.run(client=client):
image_id2 = list(servicer.images)[-1]
FooInstance.used_by_build_method = "normal"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intend to check that this doesn't rebuild again after you revert the change? That seems like a good idea if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sure good point

import opcode

LOAD_ATTR = opcode.opmap["LOAD_ATTR"]
STORE_ATTR = opcode.opmap["STORE_ATTR"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it weird to rebuild if users only store a new value for a class attribute, but don't read it, during the build phase? (Actually, separate discussion, but since we're doing this anyay we should maybe warn or error if users try to set class/instance attributes in @build, since that won't propagate to runtime containers, and they get confused about it occasionally.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this is fair


func = self.raw_f
code = func.__code__
f_attr_ops = {instr.argval for instr in dis.get_instructions(code) if instr.opcode in ATTR_OPS}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double-checking for my own understanding that instr.argval here will be the attribute name not attribute value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. It's going to be the name of the attribute accessed in the opcode

elif instr.opcode == STORE_ATTR:
f_attr_ops.add(instr.argval)
logger.warning(
"\n\nWarning: %s set in @build. You must set class variables in "
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwaskom does this warning seem ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd probably want to issue it via the warning module — but since it's not directly related let's defer this change and think about it a tiny bit to make sure it's a good idea. It makes sense, but could potentially be annoying if we don't do it right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll remove it for now. Are we otherwise good to merge? This change will force rebuilds for all users that have static variables in their class so I do want to make sure things are ok on that front

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change will force rebuilds for all users that have static variables in their class so I do want to make sure things are ok on that front.

Ah that's a great callout — we should be a little conservative about that.

On the one hand, layer defined by the class @build function should be the final layer(s) so won't trigger cascading reinstalls of packages, which is often where people get into trouble (if they don't pin their dependency versions). On the other hand, they're often used for doing things like downloading models, which will take a while (and in some rare cases, break, if people point at a model revision that gets removed).

I guess it should specifically be limited to apps with build functions that read or write to class variables, is that right? In that case, I suspect it's a narrow-enough population who will directly benefit going forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct. It's only apps with build functions that read / write to class variables. I don't think that's a large population, but good to flag in any case.

@TheQuantumFractal TheQuantumFractal merged commit 1af3876 into main Jul 5, 2024
22 checks passed
@TheQuantumFractal TheQuantumFractal deleted the ksalahi/static-vars branch July 5, 2024 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants