Skip to content

Some Notes about Dates and Datetimes

code-repository edited this page Oct 26, 2012 · 5 revisions

Datetimes

The State of Play

PostgreSQL on its own

PostgreSQL standards recommend that datetimes be stored in the database without timezone information.

When PostgreSQL has a datetime value without a timezone, it does not assume it is UTC. It assumes it is local time. What local time is can be implied or explicitly set by a number of mechanisms. Either by the operating system, environment, or with explicit commands.

Casting a specific timezone on a datetime object using PostgreSQL datetime functions will correctly translate it from the "local time" to the target timezone.

PostgreSQL and OpenERP

OpenERP follows this standard by storing all datetime objects without timezone, and having pre-converted them from local to UTC values.

When the datetime value is handled by the OpenERP ORM, it explicitly casts these non-timezoned values to UTC and deals with them as such, only converting them to local values when they are presented to or retrieved from user ineteraction. The timezone that it converts to or from is the 'tz' value in the context dictionary.

While this has its own issues (where a context is not available, like workflow transitions), this works generally well and its issues are not the scope of discussion here.

Pentaho and OpenERP

Pentaho itself has many ways of storing and carrying datetime values, particularly parameters and formula results. It seems to make no assumption about timezone if timezone is not provided.

The relatively safe assumption that we make is that a value keyed by a user, or provided as a parameter by a user through a wizard, or from the OpenERP database which is a date only value (not timestamped) is probably a "local" value.

The Problems and Some Suggestions

What is the "Local" Timezone?

When the report is run from OpenERP, the best source of timezone is the context. "Pentaho-reports-for-OpenERP" sets up and passes a reserved value called context_tz. It needs to be declared as a report parameter, as a string. It does not matter if it is declared as hidden or not as the automatic wizard generated by OpenERP will not include any of the reserved parameters. However, not selecting hidden certainly allows for easier testing during the designer phase.

PostgreSQL Retrieved Datetime Values

As explained above, a datetime retrieved by Pentaho from SQL table using a PostgreSQL query, it is assumed to be local time, not UTC (as OpenERP implicitly knows). This is fine when comparing to other datetimes from the database, but can start to be a problem when comparing to straight date values or to datetimes that come from other sources, such as keyed parameters.

We have been using a PostgreSQL construct along the lines of the following convert a datetime database value:

SELECT
	((lead.create_date::TIMESTAMP::VARCHAR || ' UTC')::TIMESTAMPTZ AT TIME ZONE COALESCE(${context_tz}, 'AEST')) create_date

FROM
	crm_lead lead

What we are doing:

  • Explicitly typecasting the value as a timestamp to remove any explicit timezone. (The timestamp data type does not store a timezone). This is redundant for OpenERP datetime values, as we have already pointed out they are stored without timezone. But, better to have a redundant step than get bitten down the track.

  • Explicitly typecasting our non-timezoned data type as a varchar value and appending a specific timezone of 'UTC'

  • This string is then cast as a timestamptz value. We now have a value that has overridden PostgreSQL's natural behaviour of treating the value as local, and have treated it as 'UTC'.

This would be enough simply for internal data comparisons. However, operator thinking is usually going to be in local time. When we print the data, they will usually want to see it locally specified. When values are keyed as parameters, the operator is more likely to be thinking locally.

  • We convert the value to our local timezone that came from the context. This process does NOT store the resulting timezone in the value, but returns a timestamp without timezone value "as at" the other timezone. Using 'coalesce' allows the conversion to work during design phase, or if we happen to run in an OpenERP environment where the users have not set their timezones explicitly. Our default just happens to be "Australian Eastern Standard Time" ("AEST").

Parameter Datetime Values

As stated abouve, parameter values are more likely to have been keyed by an operator thinking in local terms. However, we have already noted that OpenERP will pass a keyed datetime value preconverted to UTC. So the same construct could be used.

SELECT
	((${keyed_datetime}::TIMESTAMP::VARCHAR || ' UTC')::TIMESTAMPTZ AT TIME ZONE COALESCE(${context_tz}, 'AEST')) cutoff

Parameter Date Only Values

Date only values keyed go through no conversion to UTC. So, we can assume they are already "local".

The following assumes a cutoff datetime at midnight of the date keyed, and would include all the data in the date keyed.

SELECT
	((${keyed_date}::DATE || ' 00:00:00')::TIMESTAMP + '1 day'::INTERVAL) cutoff
  • By explicitly typecasting the value as a date we remove the potential error of a timestamp or timestamptz being passed (although it may assume the wrong date in this situation as it did not convert to local time first).

  • We then force it to midnight of that morning, turn it in to a timestamp, and then increment to the following midnight.

An Example

Here is one example. If you have some suggested improvements, do let us know.

WITH
	leadconv AS (SELECT
		((${ReportDate}::DATE || ' 00:00:00')::TIMESTAMP + '1 day'::INTERVAL) cutoff,
		(EXTRACT(YEAR FROM ${ReportDate}::DATE) - (CASE WHEN (EXTRACT(MONTH FROM ${ReportDate}::DATE) < 7) THEN 1 ELSE 0 END) || '-07-01 00:00:00')::TIMESTAMP year_start,
		id,
		((lead.create_date::TIMESTAMP::VARCHAR || ' UTC')::TIMESTAMPTZ AT TIME ZONE COALESCE(${context_tz}, 'AEST')) create_date
	FROM
		crm_lead lead
		)

SELECT
	lead.name lead_name,
	lead.planned_revenue revenue,
	lead.probability lead_prob,
	leadconv.create_date create_date
FROM
	crm_lead lead
	LEFT JOIN leadconv ON (leadconv.id = lead.id)
WHERE
	lead.type = 'opportunity'
	AND lead.probability > 0
	AND leadconv.create_date < leadconv.cutoff
	AND leadconv.create_date >= leadconv.year_start