Support reading dates and times #44

rhdunn opened this issue Apr 11, 2013 · 0 comments


rhdunn commented Apr 11, 2013

This issue covers detecting years, months, days, hours, minutes and seconds, and associating the correct pronunciation with them.

This won't detect all uses of these, but the more that can be detected the better the reading experience will be.

Should also look at how different languages represent dates (both written and pronounced), e.g. Japanese and Chinese use <number>{year}<number>{moon}<number>{sun}, where {...} represents the associated Han character (年, 月 and 日 respectively).
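As a rough illustration of the Japanese/Chinese case, a date such as 2013年4月11日 could be picked up with a single pattern. This is only a sketch: the names `CJK_DATE` and `parse_cjk_date` are hypothetical, and dates written with Han numerals (e.g. 二〇一三年) are not handled.

```python
import re

# Hypothetical pattern: <number>年<number>月<number>日 (year, month, day).
CJK_DATE = re.compile(r"(\d{1,4})年(\d{1,2})月(\d{1,2})日")

def parse_cjk_date(text):
    """Return (year, month, day) for the first date found, or None."""
    match = CJK_DATE.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())
```

For example, `parse_cjk_date("2013年4月11日")` gives `(2013, 4, 11)`.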

Years

Years less than 2000 should be pronounced in digit pairs, e.g. 1876 should be pronounced as 18 76. This can be done in tts/word_stream by setting the number scale to 100 (i.e. 2 digits) and not pronouncing an "and" between the groups.
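A minimal sketch of the digit-pair idea, written in Python rather than against the real tts/word_stream code. The function names and the small word table are assumptions for illustration. Years such as 1900 or 1905 need extra rules that are not specified here, so the sketch leaves them out.

```python
# Illustrative word tables for 0-99; not taken from any existing dictionary file.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def two_digit_words(n):
    """Spell out a number in the range 0-99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"

def year_groups(year, scale=100):
    """Split a year into groups using the number scale (100 = digit pairs)."""
    groups = []
    while year > 0:
        year, low = divmod(year, scale)
        groups.append(low)
    return groups[::-1]

def speak_year(year):
    # The groups are joined with a space, never with "and".
    return " ".join(two_digit_words(group) for group in year_groups(year))
```

With this, `year_groups(1876)` is `[18, 76]` and `speak_year(1876)` is `"eighteen seventy six"`.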

Years in isolation can be detected in some contexts. For example, 1960s represents a year (technically, it is a range of years).
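One possible way to spot the decade form, sketched with a regular expression. The names `DECADE` and `find_decades` are hypothetical, and variants such as "1960's" or "'60s" are not covered.

```python
import re

# Four digits ending in 0, immediately followed by "s", e.g. "1960s".
DECADE = re.compile(r"\b(\d{3}0)s\b")

def find_decades(text):
    """Return the starting year of every decade mentioned in the text."""
    return [int(match.group(1)) for match in DECADE.finditer(text)]
```

For example, `find_decades("Music of the 1960s and 1970s")` returns `[1960, 1970]`.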

Months

Months can be abbreviated (e.g. Apr for April). The abbreviations should be defined in a per-locale months.dict file, which should also contain the long-form names so that the full set can be used to identify dates correctly.

NOTE: The months.dict data should only be used to detect date formats and not arbitrarily expand the month abbreviations (e.g. Jan can also be a person's name).
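The note above could be honoured by only expanding a month name when it sits in a date-like context. The sketch below uses an in-memory dictionary as a stand-in for an English months.dict (the real file format is not defined here) and treats "month name followed by a one or two digit day" as the only context. Both choices, and the name `expand_months`, are assumptions.

```python
import re

LONG_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

# Hypothetical stand-in for months.dict: long and short forms map to the long form.
MONTHS = {name.lower(): name for name in LONG_MONTHS}
MONTHS.update({name[:3].lower(): name for name in LONG_MONTHS})

# A word, an optional '.', whitespace, then a 1-2 digit day number.
DATE_CONTEXT = re.compile(r"\b([A-Za-z]+)\.?\s+(\d{1,2})\b")

def expand_months(text):
    """Expand month abbreviations only where a day number follows."""
    def replace(match):
        full_name = MONTHS.get(match.group(1).lower())
        if full_name is None:
            return match.group(0)  # not a month, leave the text untouched
        return f"{full_name} {match.group(2)}"
    return DATE_CONTEXT.sub(replace, text)
```

Here `expand_months("Jan arrived on Jan 12")` returns `"Jan arrived on January 12"`: the first "Jan" is left alone because no day number follows it.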

Days

Days are written as cardinal numbers, but spoken as ordinal numbers.
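A small sketch of the cardinal-to-ordinal step for English day numbers. The table and the name `day_ordinal` are illustrative assumptions; other locales would need their own rules.

```python
# English ordinals for 1-19; index 0 is an unused placeholder.
ORDINALS = ["", "first", "second", "third", "fourth", "fifth", "sixth",
            "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth",
            "thirteenth", "fourteenth", "fifteenth", "sixteenth",
            "seventeenth", "eighteenth", "nineteenth"]

def day_ordinal(day):
    """Return the spoken (ordinal) form of a day of the month, 1-31."""
    if not 1 <= day <= 31:
        raise ValueError(f"not a day of the month: {day}")
    if day < 20:
        return ORDINALS[day]
    tens, ones = divmod(day, 10)
    if ones == 0:
        return "twentieth" if tens == 2 else "thirtieth"
    return ("twenty " if tens == 2 else "thirty ") + ORDINALS[ones]
```

So the "21" in "December 21" would be spoken as `day_ordinal(21)`, i.e. `"twenty first"`.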

Hours, Minutes, Seconds

  1. <hours>:<minutes>:<seconds> is pronounced as <hours> <minutes> and <seconds> seconds.
  2. <hours>:00:<seconds> is pronounced as <hours> hours and <seconds> seconds.
  3. <hours>:<minutes>pm is pronounced as <hours> <minutes> p m.
  4. <hours>:00pm is pronounced as <hours> p m.

A time may be followed by AM/PM and by a timezone (BST, PST, etc.). All of these are pronounced as abbreviations, i.e. letter by letter.
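The four rules and the AM/PM and timezone handling could be sketched roughly as follows. The output keeps the numbers as digits and leaves number-to-word conversion to a later stage. The names `TIME_RE`, `spell_out` and `speak_time` are hypothetical, and times that none of the rules cover (no seconds and no am/pm) are simply rejected, which is an assumption of this sketch.

```python
import re

# hours ':' minutes (':' seconds)? (am|pm)? (timezone)?
TIME_RE = re.compile(
    r"(\d{1,2}):(\d{2})(?::(\d{2}))?\s*([ap]m)?(?:\s+([a-z]{2,5}))?",
    re.IGNORECASE,
)

def spell_out(abbrev):
    """Pronounce an abbreviation letter by letter, e.g. 'pm' -> 'p m'."""
    return " ".join(abbrev)

def speak_time(text):
    match = TIME_RE.fullmatch(text.strip())
    if match is None:
        return None
    hours, minutes = int(match.group(1)), int(match.group(2))
    seconds, ampm, timezone = match.group(3), match.group(4), match.group(5)
    if seconds is None and ampm is None:
        return None  # not covered by the four rules
    parts = [str(hours)]
    if seconds is not None:
        # Rules 1 and 2: "hours" replaces the minutes when they are 00.
        parts.append("hours" if minutes == 0 else str(minutes))
        parts += ["and", str(int(seconds)), "seconds"]
    elif minutes != 0:
        # Rule 3; rule 4 drops the minutes entirely when they are 00.
        parts.append(str(minutes))
    if ampm:
        parts.append(spell_out(ampm.lower()))
    if timezone:
        parts.append(spell_out(timezone.upper()))
    return " ".join(parts)
```

For example, `speak_time("10:00:30")` gives `"10 hours and 30 seconds"` and `speak_time("1:50am BST")` gives `"1 50 a m B S T"`.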

Date/Time Formats

2013-01-28
1970s

The formats:

Wed Apr 10, 2013 7:26 pm
Friday, December 21, 2018 4:05 PM EST
Apr 12, 1:50am BST
SEPTEMBER 16
Wednesday, November 19, 2003
November 19, 2003
Mid-June to mid-September, 2004
September 2002
Monday, Nov. 18

all share a common format:

MONTH_NAME := (SHORT_MONTH_NAME '.'?) | LONG_MONTH_NAME
DATE := (DAY_OF_WEEK ','?)? MONTH_NAME (DAY (',' YEAR)? | YEAR)?
TIME := HOURS ':' MINUTES ('am' | 'pm') ('est' | 'bst' | ...)?
DATETIME := DATE (','? TIME)?
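A rough sketch of how the grammar might be turned into a regular expression. The timezone is treated as optional and a comma is allowed between the date and the time so that the example strings match. Those two choices, the English-only name lists, and the names `DATETIME_RE` and `match_datetime` are assumptions of this sketch. Ranges such as "Mid-June to mid-September, 2004" fall outside the grammar and are not matched.

```python
import re

LONG_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
            "Saturday", "Sunday"]

def alt(words):
    """Join words into a regex alternation."""
    return "|".join(words)

LONG_MONTH = alt(LONG_MONTHS)
SHORT_MONTH = alt(name[:3] for name in LONG_MONTHS)
DAY_OF_WEEK = alt(WEEKDAYS + [name[:3] for name in WEEKDAYS])

# MONTH_NAME := (SHORT_MONTH_NAME '.'?) | LONG_MONTH_NAME
MONTH_NAME = rf"(?:{LONG_MONTH}|(?:{SHORT_MONTH})\.?)"
# DATE := (DAY_OF_WEEK ','?)? MONTH_NAME (DAY (',' YEAR)? | YEAR)?
DATE = (rf"(?:(?P<weekday>{DAY_OF_WEEK}),?\s+)?(?P<month>{MONTH_NAME})"
        rf"(?:\s+(?:(?P<day>\d{{1,2}})(?:,\s*(?P<year>\d{{4}}))?"
        rf"|(?P<year_only>\d{{4}})))?")
# TIME := HOURS ':' MINUTES ('am' | 'pm') timezone (optional in this sketch)
TIME = (r"(?P<hours>\d{1,2}):(?P<minutes>\d{2})\s*(?P<ampm>am|pm)"
        r"(?:\s+(?P<tz>[a-z]{2,5}))?")
DATETIME_RE = re.compile(rf"{DATE}(?:,?\s+{TIME})?", re.IGNORECASE)

def match_datetime(text):
    """Return the matched fields as a dict, or None if the text is not a date."""
    match = DATETIME_RE.fullmatch(text.strip())
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items()
            if value is not None}
```

For example, `match_datetime("Apr 12, 1:50am BST")` yields the month, day, hours, minutes, am/pm and timezone fields, while `match_datetime("Mid-June to mid-September, 2004")` returns `None`.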

