-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add standardized and overridable logging #234
Conversation
I realized that setting the log level does not require its own method. You can simply do:
Perhaps we should update the documentations regarding this? Or do we want anything fancier than this. Perhaps we want to support an API like this instead:
This is similar to what Or should we go even simpler:
Maybe call the method something else, like Suggestions, @zainhoda? |
I added that
By default the logging level is "INFO". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this up! Several suggestions, but this is a good start, thanks for going through and doing the needed boilerplate conversions, figuring out which prints should be .info, .warn, .error, etc.
Returns: | ||
logging: Logger | ||
""" | ||
return logging.getLogger(__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If inside vanna/mistral/mistral.py, if we call .get_logger()
, I would expect to get the logger for vanna.mistral.mistral
. But I think this always return the logger for "vanna.logger"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just remove this function entirely. Callers inside this library should just call import logging; logger = logging.getLogger(__name__)
directly, it is the exact same amount of code.
Application developers should just call logging.getLogger("vanna")
directly. We should add this to the docs somewhere too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not sure I follow. I thought the goal of the PR is to have one standardized logger that is fixed for the entire framework?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you read the python logging cookbook? in the library code, we log to the vanna.mistral.mistral
logger. Then, a user can either configure that logger directly using logging.getLogger("vanna.mistral.mistral")
, or they configure ALL of vanna's loggers at the same time with logging.getLogger("vanna")
return get_logger() | ||
|
||
|
||
def setVerbosity(level: str = "INFO"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop this, we don't want multiple ways of doing things. Users should call logging.getLogger("vanna").setLevel(level)
instead. Should document this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I disagree. Having a convenience method for fetching the logging used throughout the framework is useful. Also for end users. TensorFlow does exactly this, AFAIK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I could have suggested this in a nicer way! I think this is more of a matter of taste, I'm curious if anyone else has thoughts. At the least, I would like this to be def set_log_level(level: str)
so that there isn't the verbosity/level mismatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My opinions;
- use snake case, like the rest of codebase
setVerbosity
currently does only 1 thing (1 line), so this as an interface isn't providing a helpful abstraction for anything. If in the future, this function should take care of other things, like pointing to the correct logger, maybe then it might be justified... but that would be building things for a potential use case, not a right-now use case.- perhaps later, if a real need arises to have more complicated logging (across multiple libraries etc), then maybe a convenience func could be a nice addition. but best wait for that situation to arise.
I've not been following this PR for long, so please take that into consideration with the above, these opinions may be uninformed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not much point doing anything here until I hear from @zainhoda, IMO.
We can always sketch some solution that works, but it needs to suit what the Vanna team wants. But I agree with @NickCrews proposals, I am just waiting for instructions.
@@ -184,13 +185,15 @@ | |||
) | |||
from .utils import sanitize_model_name, validate_config_path | |||
|
|||
_logger = initialize_logger() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should not configure the output of the logger at the global level. Then as soon as someone inports this module, the logger will start printing to stdout. Instead, configure it as one of the first things in the actual main() function.
Also, if this is the only place where initialize_logger()
is used, it doesn't need to be in that distant module, but just define it directly here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then as soon as someone inports this module, the logger will start printing to stdout
I thought the goal was to standardize all prints to the new logging solution? If you import vanna using my fix, all prints will follow this new standard. You are also free to change it yourself from outside the framework, if you'd like, using vn.setVerbosity("DEBUG")
. That sounded like behaviour that we would like, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I see what you're saying. I think it just comes down to what we want as the default:
- default to not configuring the logger, and then users opt-in to logging with
logging.basicConfig(...)
or the more finetippedlogging.getLogger("vanna").do_stuff(...)
. - default to configuring the logger, and have users opt-out.
Currently the behavior is default-on. But arguably the reason for that is just because we can't turn print() off. Some libraries go for default on (eg splink), others go for default off, flask is more nuanced. I prefer quietness, so I prefer to not do anything in library code, and let the end app developer configure logging. But I see the pros/cons of all 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, this is good reading: https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quote from there:
Note It is strongly advised that you do not add any handlers other than NullHandler to your library’s loggers. This is because the configuration of handlers is the prerogative of the application developer who uses your library. The application developer knows their target audience and what handlers are most appropriate for their application: if you add handlers ‘under the hood’, you might well interfere with their ability to carry out unit tests and deliver logs which suit their requirements.
So at least I think we should lean on the conservative side of adding handlers, maybe doing something similar to flask?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all CLI/main entrypoints of this lib, we should configure logging of course, so behavior will stay similar to today. But if I'm using vanna as a lib, this is when behavior would change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I see what you mean. It was a little unclear to me what the you wanted to solution to look like. But still, we do not want print()
inside the framework, as we cannot mute these. A good example is that when training Vanna, the documents and SQLs will be printed directly in the console. This is unfortunate for production use.
I'm fine with removing the _logger
solution given the new insight, but I guess I could revert to just doing logging.info()
(and similar) and then it is up to the end user to change the logging behaviour.
I can make an attempt at this later today. A bit busy atm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really appreciate the work here. I am rereading my review and it sounds a bit harsh, I could have been more graceful, so I'm sorry if I came off as a jerk.
Maybe wait for @zainhoda to chime in before you go through the rewrite? They might have a different idea totally for what they want the result to look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really appreciate the work here. I am rereading my review and it sounds a bit harsh, I could have been more graceful, so I'm sorry if I came off as a jerk.
@NickCrews No worries! I figured it was late in the day ;) Have done some open-source and learned to take some criticism on my bad ideas and code suggestions. So its all good!
Maybe wait for @zainhoda to chime in before you go through the rewrite? They might have a different idea totally for what they want the result to look like.
Yeah, that sounds like a good idea. I am in no rush. Just ping me if I fail to see an update :]
But it is a little unfortunate that when training Vanna, the training data is printed directly in the console. Not ideal for our production use case...
@NickCrews @andreped my apologies for the delay here. I haven’t yet had a chance to keep up with this PR and the thread. I’ll review this weekend and come back with thoughts. Thank you so much! |
@NickCrews @andreped thank you so much for your contributions here. If you don't mind, I would like to put a temporary "hold" on this feature while we define it a little more. I'm going to take the suggestions you've come up with and use that as a basis for how we implement this. I'll submit an independent PR for this but I'll make sure you're in the loop. To give you a sense of what I'm thinking:
I might also need to add some functions to retrieve the logs although I'll give that some more thought. Basically I want to be able to optionally support viewing the logs associated with a particular question within the UI |
@zainhoda Sounds good. Your argument is sound. Feel free to close this PR and tag it when you have opened a separate PR :] |
@zainhoda I believe there has been done some work on this already, which have already been merged. The original issue will remain open till the issue is entirely closed, but I am closing this PR, as this proposal will not be used. |
This PR adds standardized, overridable logging as referred to in issue #176.
The solution is using native python logging. I only added the simple logging solution and replaced all print statements with the appropriate logging type.
I tested the solution using the
Getting_started.ipynb
Jupyter Notebook. Logging seems to work as intended.CC: @justbee007, @zainhoda