Is it possible to lazy load the repr handlers? #16

Open
laike9m opened this issue Nov 13, 2020 · 8 comments

@laike9m

laike9m commented Nov 13, 2020

When profiling my program, I find that importing cheap_repr is very expensive. Here's one profiling result using py-spy:

https://laike9m.github.io/images/af9b0d4.svg


Importing cheap_repr took ~8% of the total time. Looking at __init__.py, it seems like it should be possible to make the @try_register_repr calls lazy. Registering the Pandas handlers in particular took most of that time. This is worse for users who have Pandas installed but never call cheap_repr on Pandas objects.
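
For anyone who wants to check the import cost on their own machine without a full profiler, here is a minimal sketch. The helper name `time_first_import` is made up for illustration, and `colorsys` is only a standard-library stand-in so that the snippet runs anywhere; substitute `"cheap_repr"` to measure the case described here.

```python
import importlib
import sys
import time


def time_first_import(module_name):
    """Return the seconds spent importing module_name.

    Returns None if the module is already in sys.modules, because a
    cached import would make the measurement meaningless.
    """
    if module_name in sys.modules:
        return None
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# "colorsys" is a stand-in; use "cheap_repr" to measure the real thing.
elapsed = time_first_import("colorsys")
if elapsed is None:
    print("already imported, nothing to measure")
else:
    print("import took %.1f ms" % (elapsed * 1000))
```

CPython 3.7+ can also print a per-module breakdown with `python -X importtime -c "import cheap_repr"`, which shows how much of the total comes from pandas.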

@alexmojaki
Owner

I am interested in this (#3) but I don't know the best way to do it or how easy it would be. Happy to accept a PR. Since it seems like it requires an import hook it could be a Python 3 only enhancement.

@laike9m
Author

laike9m commented Nov 13, 2020

I think it would be helpful to write down what you have in mind about import hooks, and we can see if somebody will have time to work on this.

@alexmojaki
Owner

I'm imagining an import hook that is triggered whenever a module name that was registered with try_register_repr is imported for the first time, and only then is the repr registered against the actual class. So instead of importing pandas, it waits until someone else imports pandas, and at that moment registers the reprs for DataFrame etc.

> it seems that it should be possible to make @try_register_repr calls lazy

What did you have in mind?
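
The import hook idea described above could look roughly like the following sketch (Python 3 only). This is not code from cheap_repr: the names `PostImportFinder`, `when_imported` and `_NotifyingLoader` are hypothetical, and `colorsys` stands in for pandas so the example runs without third-party packages. It assumes the wrapped loader implements `create_module`/`exec_module`, which the standard importlib loaders do.

```python
import importlib.util
import sys


class _NotifyingLoader:
    """Wraps a real loader and runs callbacks after the module has executed."""

    def __init__(self, loader, callbacks):
        self._loader = loader
        self._callbacks = callbacks

    def create_module(self, spec):
        return self._loader.create_module(spec)

    def exec_module(self, module):
        self._loader.exec_module(module)
        for callback in self._callbacks:
            callback(module)

    def __getattr__(self, name):
        # Delegate everything else (get_source, is_package, ...) to the real loader.
        return getattr(self._loader, name)


class PostImportFinder:
    """Hypothetical meta path finder: fires callbacks on a module's first import."""

    def __init__(self):
        self._callbacks = {}  # module name -> list of callables
        self._busy = set()    # names currently being resolved by us

    def when_imported(self, module_name, callback):
        module = sys.modules.get(module_name)
        if module is not None:
            callback(module)  # already imported: run right away
        else:
            self._callbacks.setdefault(module_name, []).append(callback)

    def find_spec(self, fullname, path, target=None):
        if fullname not in self._callbacks or fullname in self._busy:
            return None
        self._busy.add(fullname)  # stops find_spec below from re-entering us
        try:
            spec = importlib.util.find_spec(fullname)
        finally:
            self._busy.discard(fullname)
        if spec is None or spec.loader is None:
            return None
        spec.loader = _NotifyingLoader(spec.loader, self._callbacks.pop(fullname))
        return spec


finder = PostImportFinder()
sys.meta_path.insert(0, finder)

registered = []
# In cheap_repr the callback would register reprs for DataFrame etc.
finder.when_imported("colorsys", lambda module: registered.append(module.__name__))

import colorsys  # the callback runs once this import completes
print(registered)
```

With this shape, nothing imports the watched module on its own; the callback only runs once some other code imports it, or immediately if it was imported earlier.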

@laike9m
Author

laike9m commented Nov 13, 2020

I was thinking more of doing the registering when it is actually used.

@alexmojaki
Owner

How?

@laike9m
Author

laike9m commented Nov 13, 2020

I don't know if that's possible, I'll have to take a closer look at the code. It's just a thought for now.

@char101

char101 commented Nov 7, 2024

Hi,

The way this module works is that it imports a lot of modules that are not necessarily used. I have pandas installed, and although my application does not use pandas, it is still imported, which adds around 500 ms to the initial launch time (not much by itself, but since my application is a GUI application that I have to restart over and over, I would prefer to shave off those 500 ms if possible).

Do you think that, instead of using cls as the key for the registry, you could use the __module__ and type(x).__name__ values?

For example

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1,2,3]})
>>> df.__module__
'pandas.core.frame'
>>> type(df).__name__
'DataFrame'

>>> from collections import ChainMap
>>> cm = ChainMap()
>>> type(cm).__name__
'ChainMap'
>>> cm.__module__
'collections'

Then these values can be used for @try_register_repr

@try_register_repr('pandas.core.frame', 'DataFrame')

Then these values can be matched at runtime (as strings) instead of imported.
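
As a rough, self-contained sketch of that string-keyed lookup (not cheap_repr's actual code; `register_repr_by_name` and `registry_key` are names invented here, and the `(x, helper)` signature is only assumed for illustration):

```python
import inspect
from collections import ChainMap

repr_registry = {}  # maps "module:ClassName" strings to repr functions


def registry_key(cls):
    """Build the string key for a class from its __module__ and __name__."""
    return cls.__module__ + ":" + cls.__name__


def register_repr_by_name(module_name, class_name):
    """Register a repr function without importing module_name."""
    def decorator(func):
        repr_registry[module_name + ":" + class_name] = func
        return func
    return decorator


def find_repr_function(cls):
    """Walk the MRO so that subclasses pick up a registered parent's function."""
    for base in inspect.getmro(cls):
        func = repr_registry.get(registry_key(base))
        if func:
            return func
    return None


@register_repr_by_name("collections", "ChainMap")
def repr_chain_map(x, helper=None):
    return "ChainMap with %d maps" % len(x.maps)


class MyMap(ChainMap):
    pass


print(find_repr_function(MyMap)(MyMap({}, {})))
```

The string passed at registration has to equal the class's real `__module__` (for DataFrame that is `'pandas.core.frame'`, as shown above, not `'pandas'`), otherwise the lookup never matches.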

@char101

char101 commented Nov 7, 2024

I tested it with these minimal changes and it seems to work for pandas.

--- __init__.py.orig    Thu Nov 07 19:44:05 2024
+++ __init__.py Thu Nov 07 20:01:33 2024
@@ -85,13 +85,16 @@
     If the class cannot be imported, nothing happens.
     """
     try:
-        cls = getattr(import_module(module_name), class_name)
+        cls = module_name + ':' + class_name
         return register_repr(cls)
     except Exception:
         return lambda x: x


+def class_name(cls):
+    return cls.__module__.split('.', 1)[0] + ':' + cls.__name__
+
+
 def register_repr(cls):
     """
     Register a repr function for cls. The function must accept two arguments:
@@ -100,11 +103,11 @@
     and can be retrieved by find_repr_function(cls).
     """

-    assert inspect.isclass(cls), 'register_repr must be called with a class. ' \
-                                 'The type of %s is %s' % (cheap_repr(cls), type_name(cls))
+    # assert inspect.isclass(cls), 'register_repr must be called with a class. ' \
+    #                              'The type of %s is %s' % (cheap_repr(cls), type_name(cls))

     def decorator(func):
-        repr_registry[cls] = func
+        repr_registry[cls if isinstance(cls, str) else class_name(cls)] = func
         func.__dict__.setdefault('maxparts', 6)
         return func

@@ -164,7 +167,7 @@

 def find_repr_function(cls):
     for cls in inspect.getmro(cls):
-        func = repr_registry.get(cls)
+        func = repr_registry.get(class_name(cls))
         if func:
             return func

@@ -190,7 +193,7 @@
     for cls in inspect.getmro(x_cls):
         if cls in suppressed_classes:
             return _basic_but('repr suppressed', x)
-        func = repr_registry.get(cls)
+        func = repr_registry.get(class_name(cls))
         if func:
             helper = ReprHelper(level, func, target_length)
             return _try_repr(func, x, helper)

>>> import pandas as pd
>>> df = pd.DataFrame({'a': range(10)})
>>> from cheap_repr import cheap_repr
>>> print(cheap_repr(df))
    a
0   0
1   1
2   2
3   3
.. ..
6   6
7   7
8   8
9   9
