Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_date() seems to look for lowercase timezone information #1209

Open
alejandro-colomar opened this issue Feb 17, 2025 · 3 comments
Open

get_date() seems to look for lowercase timezone information #1209

alejandro-colomar opened this issue Feb 17, 2025 · 3 comments

Comments

@alejandro-colomar
Copy link
Collaborator

alejandro-colomar commented Feb 17, 2025

/* The timezone table. */

for (tp = TimezoneTable; tp->name; tp++)

shadow/lib/getdate.y

Lines 542 to 570 in 77eb67d

/* Military timezone table. */
static TABLE const MilitaryTable[] = {
{ "a", tZONE, HOUR ( 1) },
{ "b", tZONE, HOUR ( 2) },
{ "c", tZONE, HOUR ( 3) },
{ "d", tZONE, HOUR ( 4) },
{ "e", tZONE, HOUR ( 5) },
{ "f", tZONE, HOUR ( 6) },
{ "g", tZONE, HOUR ( 7) },
{ "h", tZONE, HOUR ( 8) },
{ "i", tZONE, HOUR ( 9) },
{ "k", tZONE, HOUR ( 10) },
{ "l", tZONE, HOUR ( 11) },
{ "m", tZONE, HOUR ( 12) },
{ "n", tZONE, HOUR (- 1) },
{ "o", tZONE, HOUR (- 2) },
{ "p", tZONE, HOUR (- 3) },
{ "q", tZONE, HOUR (- 4) },
{ "r", tZONE, HOUR (- 5) },
{ "s", tZONE, HOUR (- 6) },
{ "t", tZONE, HOUR (- 7) },
{ "u", tZONE, HOUR (- 8) },
{ "v", tZONE, HOUR (- 9) },
{ "w", tZONE, HOUR (-10) },
{ "x", tZONE, HOUR (-11) },
{ "y", tZONE, HOUR (-12) },
{ "z", tZONE, HOUR ( 0) },
{ NULL, 0, 0 }
};

shadow/lib/getdate.y

Lines 711 to 720 in 77eb67d

/* Military timezones. */
if (buff[1] == '\0' && isalpha (*buff))
{
for (tp = MilitaryTable; tp->name; tp++)
if (streq(buff, tp->name))
{
yylval.Number = tp->value;
return tp->type;
}
}

@timparenti , do you know if time zones should be uppercase or lowercase? I was testing passing a date with a trailing Z and it wasn't being interpreted correctly, and now I realize that it wants lowercase letters, but that seems weird. What do standards say about this?

Cc: @zeha

@alejandro-colomar
Copy link
Collaborator Author

alejandro-colomar commented Feb 17, 2025

Ahh, no, we're actually making all text lowercase before comparison. Then I'm not sure why my tests seemed wrong.

/* Make it lowercase. */

@timparenti
Copy link

do you know if time zones should be uppercase or lowercase?

My main reference here is ISO/WD 8601-1 2016-02-16 which, while not technically the current standard, is reasonably close and recent. §§4.1–4.3 are most relevant to the broader issue at-hand, i.e., fixing #1202 and the like.

A note in §3.4.1 states that "[i]n date and time representations lower case characters may be used when upper case characters are not available", but I'll admit that seems like a rare condition these days and I can't say that I've seen it in actual practice. So I would adopt the standard robustness principle here: Accept z, but only output Z.

What do standards say about this?

Trying to parse timestamps with single-character zone designations from the "military timezone table" included above is not really compatible with the modern standard. Only the character Z is defined for the purpose of specifying a timezone as "UTC designator" (see §3.4.3 and §4.2.4). Several other single letters have other meanings defined in §3.4.3, most notably T, which is used as "time designator", separating the date from the time.

More broadly, in fact, we discourage trying to infer any UTC offsets from any abbreviations in general. From https://data.iana.org/time-zones/data/theory.html#abbreviations:

Application writers should note that these abbreviations are ambiguous in practice: e.g., 'CST' means one thing in China and something else in North America, and 'IST' can refer to time in India, Ireland or Israel. To avoid ambiguity, use numeric UT offsets like '-0600' instead of time zone abbreviations like 'CST'.

In short, "the mapping doesn't go that way", nor is it really intended to. That hasn't stopped folks from trying, though, as evidenced by the "timezone table" listed here:

shadow/lib/getdate.y

Lines 487 to 540 in 77eb67d

/* The timezone table. */
static TABLE const TimezoneTable[] = {
{ "gmt", tZONE, HOUR ( 0) }, /* Greenwich Mean */
{ "ut", tZONE, HOUR ( 0) }, /* Universal (Coordinated) */
{ "utc", tZONE, HOUR ( 0) },
{ "wet", tZONE, HOUR ( 0) }, /* Western European */
{ "bst", tDAYZONE, HOUR ( 0) }, /* British Summer */
{ "wat", tZONE, HOUR ( 1) }, /* West Africa */
{ "at", tZONE, HOUR ( 2) }, /* Azores */
{ "ast", tZONE, HOUR ( 4) }, /* Atlantic Standard */
{ "adt", tDAYZONE, HOUR ( 4) }, /* Atlantic Daylight */
{ "est", tZONE, HOUR ( 5) }, /* Eastern Standard */
{ "edt", tDAYZONE, HOUR ( 5) }, /* Eastern Daylight */
{ "cst", tZONE, HOUR ( 6) }, /* Central Standard */
{ "cdt", tDAYZONE, HOUR ( 6) }, /* Central Daylight */
{ "mst", tZONE, HOUR ( 7) }, /* Mountain Standard */
{ "mdt", tDAYZONE, HOUR ( 7) }, /* Mountain Daylight */
{ "pst", tZONE, HOUR ( 8) }, /* Pacific Standard */
{ "pdt", tDAYZONE, HOUR ( 8) }, /* Pacific Daylight */
{ "yst", tZONE, HOUR ( 9) }, /* Yukon Standard */
{ "ydt", tDAYZONE, HOUR ( 9) }, /* Yukon Daylight */
{ "hst", tZONE, HOUR (10) }, /* Hawaii Standard */
{ "hdt", tDAYZONE, HOUR (10) }, /* Hawaii Daylight */
{ "cat", tZONE, HOUR (10) }, /* Central Alaska */
{ "ahst", tZONE, HOUR (10) }, /* Alaska-Hawaii Standard */
{ "nt", tZONE, HOUR (11) }, /* Nome */
{ "idlw", tZONE, HOUR (12) }, /* International Date Line West */
{ "cet", tZONE, -HOUR (1) }, /* Central European */
{ "met", tZONE, -HOUR (1) }, /* Middle European */
{ "mewt", tZONE, -HOUR (1) }, /* Middle European Winter */
{ "mest", tDAYZONE, -HOUR (1) }, /* Middle European Summer */
{ "mesz", tDAYZONE, -HOUR (1) }, /* Middle European Summer */
{ "swt", tZONE, -HOUR (1) }, /* Swedish Winter */
{ "sst", tDAYZONE, -HOUR (1) }, /* Swedish Summer */
{ "fwt", tZONE, -HOUR (1) }, /* French Winter */
{ "fst", tDAYZONE, -HOUR (1) }, /* French Summer */
{ "eet", tZONE, -HOUR (2) }, /* Eastern Europe, USSR Zone 1 */
{ "bt", tZONE, -HOUR (3) }, /* Baghdad, USSR Zone 2 */
{ "zp4", tZONE, -HOUR (4) }, /* USSR Zone 3 */
{ "zp5", tZONE, -HOUR (5) }, /* USSR Zone 4 */
{ "zp6", tZONE, -HOUR (6) }, /* USSR Zone 5 */
{ "wast", tZONE, -HOUR (7) }, /* West Australian Standard */
{ "wadt", tDAYZONE, -HOUR (7) }, /* West Australian Daylight */
{ "cct", tZONE, -HOUR (8) }, /* China Coast, USSR Zone 7 */
{ "jst", tZONE, -HOUR (9) }, /* Japan Standard, USSR Zone 8 */
{ "east", tZONE, -HOUR (10) }, /* Eastern Australian Standard */
{ "eadt", tDAYZONE, -HOUR (10) }, /* Eastern Australian Daylight */
{ "gst", tZONE, -HOUR (10) }, /* Guam Standard, USSR Zone 9 */
{ "nzt", tZONE, -HOUR (12) }, /* New Zealand */
{ "nzst", tZONE, -HOUR (12) }, /* New Zealand Standard */
{ "nzdt", tDAYZONE, -HOUR (12) }, /* New Zealand Daylight */
{ "idle", tZONE, -HOUR (12) }, /* International Date Line East */
{ NULL, 0, 0 }
};

This seems to attempt to define UTC offsets for several standard and non-standard abbreviations, but even beyond the normal question of ambiguity it has further serious problems. Constructions like f[ws]t, zp[456], cct, idl[we], and others don't come from the tz project — and while liberally accepting abbreviations is not itself the problem, it does raise questions on where they came from and their accuracy. Meanwhile, several listed offsets are just plain wrong: In Africa, WAT typically refers to UTC+1, not UTC−1 as this table implies. And while we use different abbreviations than those here, Western Australia uses UTC+8, not UTC+7.

I would suggest that, in general, these existing tables may do more harm than good. In the broader solution to these time issues, they could likely be replaced with a simpler check on inputs:

  • If a time is suffixed with Z, or some ±hh[:?mm] offset defined in §4.2.5, interpret the provided time according to the specified offset.
    • Maybe consider adding a case for explicit UT[C?] to this list, but not for other arbitrary abbreviations.
  • If a time is suffixed with some other arbitrary abbreviation, produce an error.
  • If a time is not suffixed, treat it as local time.

Combined with "assume 23:59:59 if no time is specified", that could go a long way.

@alejandro-colomar
Copy link
Collaborator Author

@timparenti Thanks for your thorough response!

Hmmmm, I checked, and there's exactly one caller of get_date():

$ find -type f | grep '\.[hc]$' | xargs grepc -ntu get_date
./lib/strtoday.c:36:long strtoday (const char *str)
{
	time_t t;
	const char *s = str;

	/*
	 * get_date() interprets an empty string as the current date,
	 * which is not what we expect, unless you're a BOFH :-).
	 * (useradd sets sp_expire = current date for new lusers)
	 */
	if ((NULL == str) || streq(str, "")) {
		return -1;
	}

	/* If a numerical value is provided, this is already a number of
	 * days since EPOCH.
	 */
	if ('-' == *s) {
		s++;
	}
	s = stpspn(s, " ");
	if (strisdigit(s)) {
		long retdate;

		if (str2sl(&retdate, str) == -1)
			return -2;
		return retdate;
	}

	t = get_date(str, NULL);
	if ((time_t) - 1 == t) {
		return -2;
	}
	return t / DAY;
}

This caller wants a date without a time, and thus we want UTC (because local dates without times are a mess, as we've seen).

I guess we can remove all the code calculating time. And then we can remove all code for timezones. So that get_time() is reformed to only accept dates, and always treat them as UTC.

alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 17, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 17, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 17, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 18, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 18, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 18, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 18, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 18, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 18, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
alejandro-colomar added a commit to alejandro-colomar/shadow that referenced this issue Feb 28, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones is quite bad, for today's standards.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
hallyn pushed a commit that referenced this issue Mar 2, 2025
There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones is quite bad, for today's standards.

Link: <#1202>
Link: <#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants