I wanted to connect Superset to Trino (v403) using the Hive (v3.1.3) connector and a shared S3 bucket.
Creating a schema in Trino worked, and the schema could be listed as well:
CREATE SCHEMA IF NOT EXISTS hive.myschema WITH (location = 's3a://bucket/');
Creating a table in that schema and querying it (via the Trino CLI or a Superset SQL query) worked:
CREATE TABLE IF NOT EXISTS hive.myschema.mytable (
myInt BIGINT
)
WITH (
external_location = 's3a://bucket/foo/',
format = 'PARQUET'
);
However, listing that table via:
SHOW TABLES IN hive.myschema;
fails with an EXTERNAL_HIVE_EXCEPTION caused by a socket timeout (in Trino). The Hive metastore pod was reachable from the Trino coordinator pod (checked with curl).
In Superset, when creating a dataset from a Trino database, the tables cannot be listed either, so no dataset can be created. This can be worked around, but it is not great in terms of user experience.
Switching the Hive metastore database from Derby to Postgres fixed the problem.
There is a related comment about Derby in #154 (comment), asking us to document that Derby should not be used for HA or in production at all.
In demos and integration tests with Hive, afaik we always use Postgres.
This needs some investigation and a decision on whether and how to support Derby in the future.
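For reference, a minimal sketch of what pointing a Hive metastore at Postgres looks like in a plain hive-site.xml. The four javax.jdo.option.* property names are standard Hive metastore settings; the host name, database name, and credentials below are placeholders and not taken from this setup. If the metastore is deployed through an operator or Helm chart, the same settings are presumably exposed through that tool's own configuration instead.

```xml
<!-- hive-site.xml fragment: back the metastore with Postgres instead of embedded Derby. -->
<!-- Host, port, database name, user, and password are hypothetical placeholders. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://postgres.example.svc:5432/hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>change-me</value>
  </property>
</configuration>
```

With a fresh Postgres database, the metastore schema usually has to be initialized once (for example with `schematool -dbType postgres -initSchema`) before the metastore starts, and the Postgres JDBC driver has to be on the metastore's classpath.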
I did not test Hive 2.3.9 or other Trino versions.