Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(python): Remove defeatist quote from timezones.md #17824

Closed
wants to merge 1 commit into from

Conversation

bionicles
Copy link

We are forced to deal with time zones, so this is not a helpful piece of docs, because it discourages contributors from improving polars' time zone support.

We are forced to deal with time zones, so this is not a helpful piece of docs, because it discourages contributors from improving polars' time zone support.
@bionicles bionicles changed the title Remove defeatist quote from timezones.md docs(python): Remove defeatist quote from timezones.md Jul 23, 2024
@github-actions github-actions bot added documentation Improvements or additions to documentation python Related to Python Polars and removed title needs formatting labels Jul 23, 2024
@MarcoGorelli
Copy link
Collaborator

Thanks for your contribution

It's just intended as a little bit of humour, which a few people have told me made them smile laugh 😄

Contributions to improve time zone support are welcome!

@bionicles
Copy link
Author

bionicles commented Aug 3, 2024

FYI @MarcoGorelli this quote could be misconstrued as official advice to use time zone naive datetime data which has cost me many hours of extra work figuring out how to deal with other people's data that "avoided using time zones if they could" and I think this joke creates a hazard.

Aka, encouragement of avoidance of time zones in official polars docs can promote the sort of philosophical mistake that causes huge projects to fail for stupid reasons ("we thought it was yesterday because you didn't tell us it was localized")

no offense but i hope this "funny" joke is worth potential "mars orbiter" level screwups!

TLDR I got some date data written as midnight date times and converted instead of replaced which made them all into the previous day and given polars focus on correctness, this is a disaster level near miss, and as you know, near misses almost always signal huge risk, I think we ought to improve the clarity of this page of docs to NOT jokingly encourage time zone naive data and to explicitly highlight this potential error in some kind of "potential pitfall" box here and we could explicitly mention which does which (replace sets the time zone variable without offsetting (adding or subtracting hours) while convert adds or subtracts hours to offset one time zone to the other)

Right now if you already know about and understand this dichotomy then you can figure it out by careful reading of the definitions and cross referencing them with the code example but if someone is a newbie they could see a joke implying not to worry about time zones and not understand the difference between replace and convert and why that could be a critical failure point if they get it wrong and they might not have the guidance they need to make a serious informed decision, then decide to make a quick easy decision to avoid some work and wind up owning themselves or their user.

luckily for my use case it didn't matter except making me feel dumb because some smart closed alpha test user noticed the days were yesterdays (cost quite some time to track down and pinpoint the source of a problem I could have avoided entirely from the start, though)

a few hours across a bunch of data can absolutely be deadly or permanently disabling for some stakeholder(S) of a future polars project or could cause some multi million or billion dollar project to fail for no reason, and neither of those are worth a cheap JOKE and I am convinced they WON'T be laughing about it

How could we to ensure we make it as clear as possible that no, the best practice is definitely NOT to "avoid using time zones?"

Anyway how could we add

  • access the time zone with the .dt accessor I.E. can we add df[datetime_col_name].dt.tz to avoid needing to go into the schema

  • given dates from noobs who "don't want to deal with time zones", set at the zeroth instant at 00:00:00 without knowing if they are UTC or localized, interpret them as localized America/New_York time instead of converting them from UTC and making them the previous date by subtracting hours from UTC

this is replace_time_zone which neither adds nor subtracts hours and we could definitely better clarify this foot gun for people about using that instead of convert

  • given the same aforementioned time zone naive datetime timestamps for dates, convert them to localized

this is convert_time_zone and docs could be clear this adds or subtracts hours potentially changing dates from midnight

just seems like one obvious way to improve the clarity of this document is to not lead with a joke implying this is ok to ignore

P.S. and I agree with the sentiment, time zones are annoying. Too bad we are stuck with them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants