0.14.0 (2020-10-28)
###Highlights
- Add support for IN clause to pull queries (#6409) (d5fc365)
- Add support for ALTER STREAM|TABLE (#6400) (a58e041)
- support variable substitution in SQL statements (#6504) (e185c1f)
- enable support for
JSON
key format (#6411) (fe97cde) - support for
DELIMITED
key format (#6344) (04af65c)
NONE
format for key-less streams (#6349) (25bb352)- add aggregated rocksdb metrics (#6354) (ecc6625)
- Add an endpoint for returning the query limit configuration (#6353) (84d202d)
- add commandRunnerCheck to healthcheck detail (#6346) (5f64d05)
- Add metrics for pull query request/response size in bytes (#6148) (946d2d3)
- avoid spurious tombstones in table output (#6405) (4c7c9b5)
- new CLI parameter to execute a command and quit (without CLI interaction) (#6267) (0d60246)
- support Comparisons on complex types (#6149) (0695213)
- new command to restore ksqlDB command topic backups (#6361) (036df20)
- new CLI parameter (-f,--file) to execute commands from a file (6440) (0e03a38 )
- new syntax to interact with session variables (define/undefine/show variables) (6474) (df98ef4)
- #6319 default port for CLI: set default port if not mentionned (#6410) (d46b147)
- add back configs for setting TLS protocols and cipher suites (#6558) (cf7de69)
- avoid RUN SCRIPT to override CLI session variables/properties (#6551) (8603ec8)
- backup files are re-created on every restart (#6348) (28b8486)
- check for nested UnspportedVersionException during auth op check (#6467) (369c3f1)
- Internal Server Error for /healthcheck endpoint in RBAC-enabled (#6482) (ebee5ec)
- JSON format to set correct scale of decimals (#6295) (57b7b2e)
- Properly clean up state when executing a command fails (#6437) (242a32e)
- recovery hangs when using TERMINATE ALL (#6397) (7a57b3c)
- support joins on key formats with different default serde features (#6550) (61e4073)
- support unwrapped struct value inference (#6446) (ed3ca5e)
- NPE in PARTITON BY on null value (#6508) (dbc7867)
- clean up leaked admin client threads when issuing join query to query-stream endpoint (#6532) (9cc7698)
- CASE expressions can now handle 12+ conditions in docker + cloud (#6535) (#6541) (f6c39dc)
- This change fixes a bug where unnecessary tombstones where being emitted when a
HAVING
clause filtered out a row from the source that is not in the output table For example, given:
-- source stream:
CREATE STREAM FOO (ID INT KEY, VAL INT) WITH (...);
-- aggregate into a table:
CREATE TABLE BAR AS
SELECT ID, SUM(VAL) AS SUM
FROM FOO
GROUP BY ID
HAVING SUM(VAL) > 0;
-- insert some values into the stream:
INSERT INTO FOO VALUES(1, -5);
INSERT INTO FOO VALUES(1, 6);
INSERT INTO FOO VALUES(1, -2);
INSERT INTO FOO VALUES(1, -1);
Where previously the contents of the sink topic BAR
would have contained records:
Key | Value | Notes |
---|---|---|
1. | null. | Spurious tombstone: the table does not contain a row with key 1 , so no tombstone is required. |
1. | {sum=1} | Row added as HAVING criteria now met |
1. | null. | Row deleted as HAVING criteria now not met |
1. | null. | Spurious tombstone: the table does not contain a row with key 1 , so no tombstone is required. |
Note: the first record in the tom | ||
The topic will now contain: | ||
Key | Value | |
----- | ------- | |
1. | {sum=1} | |
1. | null. | |
Co-authored-by: Andy Coates [email protected] |
0.13.0 (2020-09-29)
- add KSQL processing log message on uncaught streams exceptions (#6253) (ac8875f)
- Adds support for 0x, X'...', x'...' type hex strings in udf:encode (#6118) (d492556)
- clarify key or value in (de)serialization processing log messages (#6109) (7a16b91)
- CommandRunner enters degraded state when corruption detected in metastore (#6164) (2b29ee0)
- latest and earliest ByOffset UDFs to capture N values (#6014) (96bb12a)
- Support for IF NOT EXISTS on CREATE CONNECTOR (#6036) (8466197)
- Support IF NOT EXISTS on CREATE TYPE (#6173) (a0f381b)
- support PARTITION BY NULL for creating keyless stream (#6096) (81e3142)
- surface error to user when command topic deleted while server running (#6240) (c5d6b56)
- allow expressions in flat map (#6163) (52a897b)
- create /var/lib/kafka-streams directory on RPM installation (#6126) (31ed3d3)
- delete zombie consumer groups 🧟 (#6160) (2d1697a)
- delimited format should write decimals in a format it can read (#6238) (626965e)
- don't use queryId of last terminate command after restore (#6278) (2753ccd)
- earliest/latest_by_offset should accept nulls (#5729) (6eb5a41)
- fail on non-string MAP keys (#6182) (9d4cc6d)
- improve error handling of invalid Avro identifier (#6239) (8dd3942)
- missing topic classifier now uses MissingSourceTopicException (#6172) (cf8e15d)
- register correct unwrapped schema (#6188) (cb25f9c)
- scale of ROUND() return value (#6236) (42ab721)
0.12.0 (2020-09-01)
- support CREATE OR REPLACE w/ config guard but w/o restrictions (#5766) (e7ff81a)
- add a serverStatus to ServerInfo and display the status in the CLI (#6040) (1921d0e)
- Add consumer offsets to DESCRIBE EXTENDED (#5476) (9ce3c97)
- add Ksql warning to KsqlResource response when CommandRunner degraded (#6039) (6d547da)
- add serialization exceptions to processing logger (#6084) (8ab98a5)
- add service to restart failed persistent queries (#5807) (0ffe3e2)
- Allow udfs to be provided via a new flag to KsqlTestingTool (#5964) (a85ef14)
- CommandRunner enters degraded states when it processes command with higher version than it supports (#6032) (a841443)
- DistributingExecutor fails DDL statement if CommandRunner DEGRADED (#6031) (62b6d9a)
- Enable datagen to set the message timestamp (#5849) (3cba869)
- hard delete schemas for push queries (#6061) (a036b8b)
- introduce the sql-based testing tool (YATT) (#6051) (33e71c3)
- move command topic deserialization to CommandRunner and introduce DEGRADED CommandRunnerStatus (#6012) (ab8cec2)
- New ksql.properties.overrides.denylist to deny clients configs overrides (#5877) (7d1ad25)
- Support [IF EXISTS] on DROP TYPE command (#5962) (431c2ff)
- Support IF EXISTS keyword on DROP CONNECTOR (#6067) (2c99df9)
- Support subscript and nested functions in grouping queries (#5998) (8d383db)
- client: support describe source in Java client (#5944) (a154373)
- support UDAFs with and without init Args with same param type (#5982) (0bf3296)
- allow implicit cast of numbers literals to decimals on insert/select (#6005) (2bc15dd)
- allow joins in windowed aggregations (9452ab6), closes #5898 #5931
- avoid losing cause of processing errors (#5937) (014c049)
- NPE when udf metrics enabled (#5960) (e32183f)
- protobuf format does not support unwrapping (#6033) (6c08e5d)
- remove unnecessary parser token (#6048) (a6c3864)
- set restarted query as healthy during a time threshold (#6018) (ae5c215)
- Use a SandboxedPersistentQueryMetadata to not interact with KafkStreams (#6066) (e000b41)
- Uses pull query metrics for all paths, not just /query (#5983) (143849c)
0.11.0 (2020-08-03)
- client: support streaming inserts in Java client (#5641) (1e109bf)
- client: support admin operations in Java client (#5671) (7d0079a)
- client: support list queries in Java client (#5682) (4d860f8)
- client: support DDL/DML statements in Java client (#5775) (53ca76f)
- support WINDOWEND in WHERE of pull queries (#5680) (40f2f13)
- Adds SSL mutual auth support to intra-cluster requests (#5482) (82b137f)
- new array_remove UDF (#5843) (4ebeff2)
- Replay command topic to local file to backup KSQL Metastore (#5831) (8523051)
- organize UDFs by category (#5813) (7f5b843)
- expose query ID in CommandStatusEntity (MINOR) (#5814) (bb29185)
- circumvent KAFKA-10179 by forcing changelog topics for tables (#5781) (ef8fa4f)
- always use the changelog subject in table state stores (#5823) (e69acb4)
- remove schema compat check if schema exists (#5872) (6338270)
- ensure null values cast to varchar/string remain null (#5769) (530eb7f)
- ksqlDB should not truncate decimals (#5763) (ba833f7)
- Make sure UDTF describe shows actual function description (#5744) (afe85d9)
- Reuse KsqlClient instance for inter node requests (#5742) (cd7f540)
- SEC-1034: log4j migration to confluent repackaged version (#5783) (4563d02)
- show overridden props in CLI (#5750) (f6fd2ee)
- simplify pull query error message (#5672) (9bc4755)
- Upgrade to Vert.x 3.9.1 which depends on version of Netty which allows backported ALPN in JDK 1.8.0_252 to be used, and provide warning if openSSL is not installed (#5818) (36e44a6)
- windowed tables now have cleanup policy compact+delete (#5743) (2038770)
- configure topic retention based on retention clause for windowed tables (#5835) (b509c99)
- set Schema Registry port in tutorials docker compose (f46d358)
- create the metastore backups directory if it does not exist (#5879) (d77d8b7)
- close query on invalid use of HTTP/2 with /query endpoint (#5883) (bcab116)
- adds a handler to gracefully shutdown (#5895) (5fbf171)
- ksqlDB now creates windowed tables with cleanup policy "compact,delete", rather than "compact". Also, topics that back streams are always created with cleanup policy "delete", rather than the broker default (by default, "delete").
0.10.2 (2020-10-05)
- adds a handler to gracefully shutdown (#5895) (5fbf171)
- allow expressions in flat map (#6165) (b6ad9bc)
- allow joins in windowed aggregations (9452ab6), closes #5898 #5931
- always use the changelog subject in table state stores (#5823) (#5837) (c87aa69)
- avoid losing cause of processing errors (#5937) (014c049)
- close query on invalid use of HTTP/2 with /query endpoint (#5885) (0b75411)
- create /var/lib/kafka-streams directory on RPM installation (#6126) (31ed3d3)
- ksqlDB should not truncate decimals (#5763) (ba833f7)
- remove schema compat check if schema exists (#5871) (9bcb5b0)
- Reuse KsqlClient instance for inter node requests (#5742) (#5844) (7645abd)
- windowed table topic retention fixes (#5842) (b3db23c)
- ksqlDB now creates windowed tables with cleanup policy "compact,delete", rather than "compact". Also, topics that back streams are always created with cleanup policy "delete", rather than the broker default (by default, "delete").
0.10.1 (2020-07-09)
- allow empty structs in schema inference (#5656) (3c38c8c)
- do not overwrite schemas from a CREATE STREAM/TABLE (#5756) (5aa0b72)
- make sure old query stream doesn't block on close (#5730) (663a67b)
- query w/ scoped all columns, join and where clause (#5684) (304eb0c)
- support CASE statements returning NULL (#5703) (5062942)
- UDTF don't return null on keyless stream (#5761) (190f9e2)
0.10.0 (2020-06-25)
- Any key name (#5093) (1f0ca3e)
- Explicit keys (#5533) (d0db0cf)
- add extra log messages for pull queries (#4909) (d622ecc)
- Adds the ability have internal endpoints listen on ksql.internal.listener (#5212) (46acb73)
- create MetricsContext for ksql metrics reporters (#5528) (50561a5)
- drop WITH(KEY) syntax (#5363) (bb43d23)
- expose JMX metric that classifies an error state (#5374) (52271bf)
- Expose Vert.x metrics (#5340) (e82f762)
- introduce RegexClassifier for classifying errors via cfg (#5412) (b25dd98)
- Pull Queries: QPS check utilizes internal API flag to determine if forwarded (#5392) (08b428f)
- reload TLS certificate without restarting server (#5516) (a5920b0)
- support TIMESTAMP being a key column (#5542) (286ce08)
- turn on snappy compression for produced data (#5495) (27d8ad5)
- client: Java client with push + pull query support (#5200) (280ef0c)
- client: support (non-streaming) insert into in Java client (#5448) (9e8234a)
- client: support push query termination in Java client (#5371) (62dacca)
- New UDF/UDAF
- Adds UDF regexp_extract_all (#5507) (e19233c)
- Adds UDF regexp_replace (#5504) (30309bf)
- Adds udf regexp_split_to_array (#5501) (3766129)
- new UDFs for array max/min/sort (#5505) (415d930)
- new UDFs for set-like operations on Arrays (#5548) (50428c7)
- new UDFs for working with Maps (#5536) (bc9ad2e)
- new UUID UDF (#5535) (cfa65da)
- new split_to_map udf (#5563) (a68b9ad)
- new string UDFs LPad, RPad (#5546) (00f5083)
- implement earliest_by_offset() UDAF (#5273) (2a356ac)
- INSTR function #881 (#5385) (ca86bbf)
- add CHR UDF (#5559) (5a746e8)
- implements ARRAY_JOIN as requested in (#5028) (#5474) (#5638) (6c67866)
- Add encode udf (#5523) (b02f1ce)
- /inserts-stream endpoint now accepts complex types (#5469) (0840160)
- allow dynamic construction of an ARRAY of STRUCTS with duplicate values #5436 (#5506) (0b1162c)
- allow setting auto.offset.reset=latest on /query-stream endpoint (#5455) (d91c016)
- allow structs in schema provider return types (#5287) (2e604f0)
- Allow value delimiter to be specified for datagen (#5332) (865e834)
- avoid unnecessary warnings from PRINT (#5459) (080fba9)
- Block writer thread if response output buffer is full (#5386) (0edda40)
- deals with issue #5521 by adding more descriptive error message (#5529) (86f8b67)
- disallow requests to /inserts-stream if insert values disabled (#5592) (e277f25)
- exclude window bounds from persistent query value & schema match (#5425) (3596ceb)
- fail AVRO/Protobuf/JSON Schema statements if SR is missing (#5597) (85a0320)
- fail on WINDOW clause without matching GROUP BY (#5431) (68354d4)
- improve print topic (#5552) (e193576)
- Improve pull query error logging (#5477) (f23412e)
- KSQL does not accept more queries when running QueryLimit - 1 queries (#5461) (d64f1bc)
- make stream and column names case-insensitive in /inserts-stream (#5591) (e9e3042)
- NPE in latest_by_offset if first element being processed has null… (#4975) (a9668d2)
- Prevent memory leaks caused by pull query logging (#5532) (723b6cb)
- remove leading zeros when casting decimal to string (#5270) (e1cc8ad)
- Remove stacktrace from error message (#5478) (b63d7e8)
- Retry on connection closed (#5515) (8eb1f88)
- set retention.ms to -1 instead of Long.MAX_VALUE (#5560) (22da8a0)
- update Concat UDF to new framework and make variadic (#5513) (cab6a86)
- zero decimal bug (#5531) (1b9094b)
- /query-stream endpoint should serialize Struct (MINOR) (#5205) (12b092b)
- Move Cors handler in front of /chc handlers (#5239) (004ced2)
- use schema in annotation as schema provider if present (1a90eeb)
- use sr's jackson-jsonschema version (#5213) (0b3899a)
- /inserts-stream endpoint now supports nested types (#5621) (866ae34)
- don't fail if broker does not support AuthorizedOperations (#5617) (0feb081)
- ensure only deserializable cmds are written to command topic (#5645) (4ad2bde)
- support GROUP BY with no source columns used (#5644) (a8e6630)
An associated [blog post](An associated blog post also covers many of these breaking changes: https://www.confluent.io/blog/ksqldb-0-10-updates-key-columns/) also covers many of these breaking changes.
Statements containing PARTITION BY, GROUP BY, or JOIN clauses now produce different output schemas.
For PARTITION BY and GROUP BY statements, the name of the key column in the result is determined by the PARTITION BY or GROUP BY clause:
- Where the partitioning or grouping is a single column reference, then the key column has the same name as this column. For example:
-- OUTPUT will have a key column called X;
CREATE STREAM OUTPUT AS
SELECT *
FROM INPUT
GROUP BY X;
- Where the partitioning or grouping is a single struct field, then the key column has the same name as the field. For example:
-- OUTPUT will have a key column called FIELD1;
CREATE STREAM OUTPUT AS
SELECT *
FROM INPUT
GROUP BY X->field1;
- Otherwise, the key column name is system-generated and has the form
KSQL_COL_n
, wheren
is a positive integer.
In all cases, except where grouping by more than one column, you can set the new key column's name by defining an alias in the projection. For example:
-- OUTPUT will have a key column named ID.
CREATE TABLE OUTPUT AS
SELECT
USERID AS ID,
COUNT(*)
FROM USERS
GROUP BY ID;
For groupings of multiple expressions, you can't provide a name for the system-generated key column. However, a work around is to combine the grouping columns yourself, which does enable you to provide an alias:
-- products_by_sub_cat will have a key column named COMPOSITEKEY:
CREATE TABLE products_by_sub_cat AS
SELECT
categoryId + ‘§’ + subCategoryId AS compositeKey
SUM(quantity) as totalQty
FROM purchases
GROUP BY CAST(categoryId AS STRING) + ‘§’ + CAST(subCategoryId AS STRING);
For JOIN statements, the name of the key column in the result is determined by the join criteria.
- For INNER and LEFT OUTER joins where the join criteria contain at least one column reference, the key column is named based on the left-most source whose join criteria is a column reference. For example:
-- OUTPUT will have a key column named I2_ID.
CREATE TABLE OUTPUT AS
SELECT *
FROM I1
JOIN I2 ON abs(I1.ID) = I2.ID JOIN I3 ON I2.ID = I3.ID;
The key column can be given a new name, if required, by defining an alias in the projection. For example:
-- OUTPUT will have a key column named ID.
CREATE TABLE OUTPUT AS
SELECT
I2.ID AS ID,
I1.V0,
I2.V0,
I3.V0
FROM I1
JOIN I2 ON abs(I1.ID) = I2.ID
JOIN I3 ON I2.ID = I3.ID;
- For FULL OUTER joins and other joins where the join criteria are not on column references, the key column in the output is not equivalent to any column from any source. The key column has a system-generated name in the form
KSQL_COL_n
, wheren
is a positive integer. For example:
-- OUTPUT will have a key column named KSQL_COL_0, or similar.
CREATE TABLE OUTPUT AS
SELECT *
FROM I1
FULL OUTER JOIN I2 ON I1.ID = I2.ID;
The key column can be given a new name, if required, by defining an alias in the projection. A new UDF has been introduced to help define the alias called JOINKEY
. It takes the join criteria as its parameters. For example:
-- OUTPUT will have a key column named ID.
CREATE TABLE OUTPUT AS
SELECT
JOINKEY(I1.ID, I2.ID) AS ID,
I1.V0,
I2.V0
FROM I1
FULL OUTER JOIN I2 ON I1.ID = I2.ID;
JOINKEY
will be deprecated in a future release of ksqlDB once multiple key columns are supported.
CREATE TABLE
statements will now fail if the PRIMARY KEY
column is not provided.
For example, a statement such as:
CREATE TABLE FOO (
name STRING
) WITH (
kafka_topic='foo',
value_format='json'
);
Will need to be updated to include the definition of the PRIMARY KEY, for example:
CREATE TABLE FOO (
ID STRING PRIMARY KEY,
name STRING
) WITH (
kafka_topic='foo',
value_format='json'
);
If using schema inference, i.e. loading the value columns of the topic from the Schema Registry, the primary key can be provided as a partial schema, for example:
-- FOO will have value columns loaded from the Schema Registry
CREATE TABLE FOO (
ID INT PRIMARY KEY
) WITH (
kafka_topic='foo',
value_format='avro'
);
CREATE STREAM
statements that do not define a KEY
column no longer have an implicit ROWKEY
key column.
For example:
CREATE STREAM BAR (
NAME STRING
) WITH (...);
Previously, the above statement would have resulted in a stream with two columns: ROWKEY STRING KEY
and NAME STRING
.
With this change, the above statement results in a stream with only the NAME STRING
column.
Streams with no KEY column are serialized to Kafka topics with a null
key.
A statement that creates a materialized view must include the key columns in the projection. For example:
CREATE TABLE OUTPUT AS
SELECT
productId, // <-- key column in projection
SUM(quantity) as unitsSold
FROM sales
GROUP BY productId;
The key column productId
is required in the projection. In previous versions of ksqlDB, the presence
of productId
in the projection would have placed a copy of the data into the value of the underlying
Kafka topic's record. But starting in version 0.10.0, the projection must include the key columns, and ksqlDB stores these columns
in the key of the underlying Kafka record. Optionally, you may provide an alias for
the key column(s).
CREATE TABLE OUTPUT AS
SELECT
productId as id, // <-- aliased key column
SUM(quantity) as unitsSold
FROM sales
GROUP BY productId;
If you need a copy of the key column in the Kafka record's value, use the
AS_VALUE function to indicate this to ksqlDB. For example, the following statement produces an output inline with the previous version of ksqlDB
for the above example materialized view:
CREATE TABLE OUTPUT AS
SELECT
productId as ROWKEY, // <-- key column named ROWKEY
AS_VALUE(productId) as productId, // <-- productId copied into value
SUM(quantity) as unitsSold
FROM sales
GROUP BY productId;
In previous versions, all key columns were called ROWKEY
. To enable using a more
user-friendly name for the key column in queries, it was possible
to supply an alias for the key column in the WITH clause, for example:
CREATE TABLE INPUT (
ROWKEY INT PRIMARY KEY,
ID INT,
V0 STRING
) WITH (
key='ID',
...
);
With the previous query, the ID
column can be used as an alias for ROWKEY
.
This approach required the Kafka message value to contain an exact copy of the key.
KLIP-24
removed the restriction that key columns must be named ROWKEY
, negating the need for the WITH(KEY)
syntax, which has been removed. Also, this change removed the requirement for
the Kafka message value to contain an exact copy of the key.
Update your queries by removing the KEY
from the WITH
clause and naming
your KEY
and PRIMARY KEY
columns appropriately. For example, the previous
CREATE TABLE statement can now be rewritten as:
CREATE TABLE INPUT (
ID INT PRIMARY KEY,
V0 STRING
) WITH (...);
Unless the value format is DELIMITED
, which means the value columns are
order dependent, so dropping the ID
value column would result in a
deserialization error or the wrong values being loaded. If you're using
DELIMITED
, consider rewriting as:
CREATE TABLE INPUT (
ID INT PRIMARY KEY,
ignoreMe INT,
V0 STRING
) WITH (...);
0.9.0 (2020-05-11)
- add multi-join expression support (#5081) (002cd5a)
- support more advanced suite of LIKE expressions (#5013) (67cd9d9)
- add COALESCE function (#4829) (251c237)
- add GROUP BY support for any key names (#4899) (e7cbdfc), closes #4898
- partition-by primitive key support (#4098) (7addf88), closes #4098
- add KsqlQueryStatus to decouple from KafkaStreams.State (#5029) (e8cbcde)
- Adds rate limiting to pull queries (#4951) (6284111)
- Do not allow access to new streaming endpoints using HTTP1.x (#5193) (8b90035)
- fail startup if command contains incompatible version (#5104) (a1751b1)
- klip-14 - rowtime as pseduocolumn (#5150) (d541420)
- 'SHOW QUERIES [EXTENDED]' statement returns results for all nodes in the cluster (#4875) (7385a31)
- transient queries added to show queries output (#5105) (e8a2a63)
- 'drop (stream|table) if exists' should not fail if source does not exist (#4872) (b0669a0)
- add deserializer for SqlType (#4830) (eed9912)
- allow trailing slash in listeners config (MINOR) (#5012) (13b0455)
- Allows unclosed quote characters to exist in comments (#4993) (fd65021)
- avoid duplicate column name errors from auto-generated aliases (#4827) (258d0b0)
- avoid long possible format lists on PRINT TOPIC for nulls (#4867) (a66e489)
- better error code on shutdown (#4754) (17d758d)
- Catch server startup exceptions (#4974) (898f3a1)
- CommandRunner metric has correct metric displayed when thread dies (#4653) (1db542b)
- do not allow GROUP BY and PARTITION BY on boolean expressions (#4940) (d84d2d1)
- do not allow grouping sets (#4942) (51bb9f6)
- do not allow implicit casting in UDAF function lookup (#5145) (6709010)
- do not filter out rows where PARTITION BY resolves to null (#4823) (e75a792)
- Filter out hosts with no lag info by default (#4859) (e10bbce)
- fix repartition semantics (#4816) (609e9e2)
- generated aliases for struct field access no longer require quoting (#4977) (2002458)
- handle leap days correctly during timestamp extraction (#4878) (c54db81), closes #4864
- If startup hangs (esp on preconditions), shutdown server correctly (#4889) (20c4b59)
- Improve error message for where/having type errors (#5023) (23eb80d)
- improve handling of NULLs (#5019) (c53dd68), closes #4912
- include lower-case identifiers among those that need quotes (#3723) (#5139) (3bcbcf4)
- make endpoints available while waiting for precondition (#5069) (1136162)
- Make sure internal client is configured for TLS (#5059) (37c7713)
- NPE in latest_by_offset if first two elements processed are both null (#4975) (4c72f93)
- only create processing log stream if it doesn't exist (#4805) (8dead0f)
- output valid multiline queries when running SHOW QUERIES (#4956) (ec74a33)
- query id for TERMINATE should be case insensitive (#5005) (588c1e9)
- reject requests to new API server if server is not ready (#5048) (d988722)
- Remove unnecessary error logging for heartbeat (#4809) (fa84576)
- replace 'null' in explain plan with correct op type (#5075) (f9bc0e6)
- Returns empty lag info for a dead host rather than last received lags (#4837) (3d98527)
- Stop PARTITION BY and UDTFs that fail from terminating the query (#4822) (522fe84)
- support quotes in explain statements (MINOR) (#5142) (bee2fe0)
- throw error when column does not exist during INSERT INTO (#4926) (89e261d)
- remove immutable properties from headless mode (#4936) (5550880)
- rename ksqldb in normal logging path (MINOR) (#4915) (16172ba)
- support trailing slashes in listener URLs (#5076) (e9e0431)
- logic when closing ksqlEngine fixed (#4917) (a217eb9)
- use describeTopics() for Kafka healtcheck probe (#4814) (578d0d5)
- Do not allow access to new streaming endpoints using HTTP1.x (#5193) (8b90035)
- speed up restarts by not building topologies for terminated queries (#5002) (2472382)
-
Select star, i.e.
select *
, no longer expands to includeROWTIME
column(s). Instead,ROWTIME
is only included in the results of queries if explicitly included in the projection, e.g.select rowtime, *
. This only affects new statements. Any view previously created via aCREATE STREAM AS SELECT
orCREATE TABLE AS SELECT
statement is unaffected. -
This release changes the system generated column name for any columns in projections that are struct field dereferences. Previously, the full path was used when generating the name, now only the final field name is used. For example,
SELECT someStruct->someField, ...
previously generated a column name ofSOMESTRUCT__SOMEFIELD
and now generates a name ofSOMEFIELD
. Generated column names may have a numeral appended to the end to ensure uniqueness, for exampleSOMEFIELD_2
.Note: it is recommended that you do not rely on system generated column names for production systems, because naming logic may change between releases. Providing an explicit alias ensures consistent naming across releases, for example,
SELECT someStruct->someField AS someField
. Backward compatibility: existing running queries will not be affected by this change, and they will continue to run with the same column names. Any statements executed after the upgrade will use the new names where no explicit alias is provided. Add explicit aliases to your statements if you require the old names, for example:SELECT someStruct->someField AS SOMESTRUCT__SOMEFIELD, ...
-
Existing queries that reference a single GROUP BY column in the projection would fail if they were resubmitted, due to a duplicate column. The same existing queries will continue to run if already running, i.e. this is only a change for newly submitted queries. Existing queries will use the old query semantics.
-
Push queries, which rely on auto-generated column names, may see changes in column names. Pull queries and any existing persistent queries are unaffected, e.g. those created with
CREATE STREAM AS SELECT
,CREATE TABLE AS SELECT
orINSERT INTO
. -
The ksqlDB server no longer ships with Jetty. This means that when you start the server, you must supply Jetty-specific dependencies, like certain login modules used for basic authentication, by using the KSQL_CLASSPATH environment variable for them to be found.
If you're upgrading from a ksqlDB version before 0.7, follow these upgrade instructions.
0.8.1 (2020-03-30)
- Don't wait for streams thread to be in running state (#4908) (2f83119)
- Infer TLS based on scheme of server string (#4893) (a519ed3)
0.8.0 (2020-03-18)
- support Protobuf in ksqlDB (#4469) (a77cebe)
- introduce JSON_SR format (#4596) (daa04d2)
- support for tunable retention, grace period for windowed tables (#4733) (30d49b3), closes #4157
- add REGEXP_EXTRACT UDF (#4728) (a25f0fb)
- add ARRAY_LENGTH UDF (#4725) (31a9d9d)
- Implement latest_by_offset() UDAF (#4782) (0c13bb0)
- add confluent-hub to ksqlDB docker image (#4729) (b74867a)
- add the topic name to deserialization errors (#4573) (0f7edf6)
- Add metrics for pull queries endpoint (#4608) (23e3868)
- log groupby errors to processing logger (#4575) (b503d25)
- Provide upper limit on number of push queries (#4581) (2cd66c7)
- display errors in CLI in red text (#4509) (56f9c9b)
- enhance
PRINT TOPIC
's format detection (#4551) (8b19bc6)
- change default exception handling for timestamp extractors (#4632) (1576af0)
- create schemas at topic creation (#4717) (514025d)
- don't display decimals in scientific notation in CLI (#4723) (3626f42)
- stop logging about command topic creation on startup if exists (MINOR) (#4709) (f4cec0a)
- added special handling for forwarded pull query request (#4597) (ba4fe74)
- backport fixes from query close (#4662) (8168002)
- change configOverrides back to streamsProperties (#4675) (ce74cf8)
- csas/ctas with timestamp column is used for output rowtime (#4489) (ddddf92)
- patch KafkaStreamsInternalTopicsAccessor as KS internals changed (#4621) (eb07370)
- use HTTPS instead of HTTP to resolve dependencies in Maven archetype (#4511) (f21823f)
- add deserializer for SqlType (#4830) (eed9912)
0.7.1 (2020-02-28)
-
support custom column widths in cli (#4616) (cb66e05)
A new ksqlDB CLI configuration allows you to specify the width of each column in tabular output.
ksql> SET CLI COLUMN-WIDTH 10
Given a customized value, subsequent renderings of output use the setting:
ksql> SELECT * FROM riderLocations > WHERE GEO_DISTANCE(latitude, longitude, 37.4133, -122.1162) <= 5 EMIT CHANGES; +----------+----------+----------+----------+----------+ |ROWTIME |ROWKEY |PROFILEID |LATITUDE |LONGITUDE | +----------+----------+----------+----------+----------+
The default behavior, which determines column width based on terminal width and the number of columns, can be re-enabled using:
ksql> SET CLI COLUMN-WIDTH 0
- add functional-test dependencies to Docker module (#4586) (04fcf8d)
- don't cleanup topics on engine close (#4658) (ad66a81)
- idempotent terminate that can handle hung streams (#4643) (d96db14)
0.7.0 (2020-02-11)
Note that ksqlDB 0.7.0 has a number of breaking changes when compared with ksqlDB 0.6.0 (see the 'Breaking changes' section below for details). Please make sure to read and follow these upgrade instructions if you are upgrading from a previous ksqlDB version.
-
feat: primitive key support (#4478) (ddf09d)
ksqlDB now supports the following primitive key types:
INT
,BIGINT
,DOUBLE
as well as the existingSTRING
type.The key type can be defined in the CREATE TABLE or CREATE STREAM statement by including a column definition for
ROWKEY
in the formROWKEY <primitive-key-type> KEY,
, for example:CREATE TABLE USERS (ROWKEY BIGINT KEY, NAME STRING, RATING DOUBLE) WITH (kafka_topic='users', VALUE_FORMAT='json');
ksqlDB currently requires the name of the key column to be
ROWKEY
. Support for arbitrary key names is tracked by #3536.ksqlDB currently requires keys to use the
KAFKA
format. Support for additional formats is tracked by https://github.com/confluentinc/ksql/projects/3.Schema inference currently only works with
STRING
keys, Support for additional key types is tracked by #4462. (Schema inference is where ksqlDB infers the schema of a CREATE TABLE and CREATE STREAM statements from the schema registered in the Schema Registry, as opposed to the user supplying the set of columns in the statement).Apache Kafka Connect can be configured to output keys in the
KAFKA
format by using a Converter, e.g."key.converter": "org.apache.kafka.connect.converters.IntegerConverter"
. Details of which converter to use for which key type can be found here: https://docs.confluent.io/current/ksql/docs/developer-guide/serialization.html#kafka in theConnect Converter
column.@rmoff has written an introductory blog about primitive keys: https://rmoff.net/2020/02/07/primitive-keys-in-ksqldb/
-
add a new default SchemaRegistryClient and remove default for SR url (#4325) (e045f7c)
-
Adds lag reporting and API for use in lag aware routing as described in KLIP 12 (#4392) (cb9ae29)
-
better error message when transaction to command topic fails to initialize by timeout (#4486) (a5fed3b)
-
hide internal/system topics from SHOW TOPICS (#4322) (075fed3)
-
Implement pull query routing to standbys if active is down (#4398) (ace23b1)
-
Implementation of heartbeat mechanism as part of KLIP-12 (#4173) (37c1eaa)
-
add COUNT_DISTINCT and allow generics in UDAFs (#4150) (2d5e680)
-
remove WindowStart() and WindowEnd() UDAFs (#4459) (eda2e34)
-
make (certain types of) server error messages configurable (#4121) (cedf47e)
-
allow environment variables to configure embedded connect (#4260) (e032ea9)
-
enable Kafla ACL authorization checks for Pull Queries (#4187) (5ee1e9e)
-
show properties now includes embedded connect properties and scope (#4099) (ebac104)
-
add support to terminate all running queries (#3944) (abbce84)
-
expose execution plans from the ksql engine API (#3482) (067139c)
- Avoids logging INFO for rest-util requests, since it hurts pull query performance (#4302) (50b4c1c)
- Improves pull query performance by making the default schema service a singleton (#4216) (f991752)
- add ksql-test-runner deps to ksql package lib (#4272) (6e28cc4)
- ConcurrentModificationException in ClusterStatusResource (#4510) (c79cba9)
- deadlock when closing transient push query (#4297) (ac8fb63)
- delimiters reset across non-delimited types (reverts #4366) (#4371) (5788729)
- do not throw error if VALUE_DELIMITER is set on non-DELIMITED topic (#4366) (2b59b8b)
- exception on shutdown of ksqlDB server (#4483) (126e2cf)
- fix compilation error due to
Format
refactoring (#4465) (07a4dcd) - fix NPE in CLI if not username supplied (#4312) (0b6da0b)
- Fixes the single host lag reporting case (#4494) (6b8bc2a)
- floating point comparison was inexact (#4372) (2a4ca47)
- Include functional tests jar in docker images (#4274) (2559b2f)
- include valid alternative UDF signatures in error message (MINOR) (#4403) (f397ad8)
- Make null key serialization/deserialization symmetrical (#4351) (2a61acb)
- partial push & persistent query support for window bounds columns (#4401) (48aa6ec)
- print root cause in error message (#4505) (6299410)
- pull queries should work across nodes (#4169) (#4271) (2369213)
- remove deprecated Acl API (#4373) (a2b69f7)
- remove duplicate comment about Schema Regitry URL from sample server properties (#4346) (0d542c5)
- rename stale to standby in KsqlConfig (#4467) (f8bb986)
- report window type and query status better from API (#4313) (ca9368a)
- reserve
WINDOWSTART
andWINDOWEND
as system column names (#4388) (ea0a0ac) - Sets timezone of RestQueryTranslationTest test to make it work in non UTC zones (#4407) (50b25d5)
- show queries now returns the correct Kafka Topic if the query string contains with clause (#4430) (1b713cd)
- support conversion of STRING to BIGINT for window bounds (#4500) (9c3cbf8)
- support WindowStart() and WindowEnd() in pull queries (#4435) (8da2b63)
- add logging during restore (#4270) (4e32da6)
- log4j properties files (#4293) (5911faf)
- report clearer error message when AVG used with DELIMITED (#4295) (307bf4d)
- better error message on self-join (#4248) (1281ab2)
- change query id generation to work with planned commands (#4149) (91c421a)
- CLI commands may be terminated with semicolon+whitespace (MINOR) (#4234) (096b78f)
- decimals in structs should display as numeric (#4165) (75b539e)
- don't load current qtt test case from legacy loader (#4245) (9479fd6)
- immutability in some more classes (MINOR) (#4179) (cbd3bab)
- include path of field that causes JSON deserialization error (#4249) (5cc718b)
- reintroduce FetchFieldFromStruct as a public UDF (#4185) (a50a665)
- show topics doesn't display topics with different casing (#4159) (0ac8747)
- untracked file after cloning on Windows (#4122) (04de30e)
- array access is now 1-indexed instead of 0-indexed (#4057) (f09f797)
- Explicitly disallow table functions with table sources, fixes #4033 (#4085) (60e20ef)
- fix issues with multi-statement requests failing to validate (#3952) (3e7169b), closes #3363
- NPE when starting StandaloneExecutor (#4119) (c6c00b1)
- properly set key when partition by ROWKEY and join on non-ROWKEY (#4090) (6c80941)
- remove mapValues that excluded ROWTIME and ROWKEY columns (#4066) (a6982bd), closes #4052
- robin's requested message changes (#4021) (422a2e3)
- schema column order returned by websocket pull query (#4012) (85fef09)
- some terminals dont work with JLine 3.11 (#3931) (ad183ec)
- the Abs, Ceil and Floor methods now return proper types (#3948) (3d6e119)
- UncaughtExceptionHandler not being set for Persistent Queries (#4087) (e193a2a)
- unify behavior for PARTITION BY and GROUP BY (#3982) (67d3f8c)
- wrong source type in pull query error message (#3885) (65523c7), closes #3523
- existing queries that perform a PARTITION BY or GROUP BY on a single column of one of the above supported primitive key types will now set the key to the appropriate type, not a
STRING
as previously. - The
WindowStart()
andWindowEnd()
UDAFs have been removed from KSQL. Use theWindowStart
andWindowEnd
system columns to access the window bounds within the SELECT expression instead. - the order of columns for internal topics has changed. The
DELIMITED
format can not handle this in a backwards compatible way. Hence this is a breaking change for any existing queries the use theDELIMITED
format and have internal topics. This change has been made now for two reasons:- its a breaking change, making it much harder to do later.
- The recent confluentinc#4404 change introduced this same issue for pull queries. This current change corrects pull queries too.
- Any query of a windowed source that uses
ROWKEY
in the SELECT projection will see the contents ofROWKEY
change from a formattedSTRING
containing the underlying key and the window bounds, to just the underlying key. Queries can access the window bounds usingWINDOWSTART
andWINDOWEND
. - Joins on windowed sources now include
WINDOWSTART
andWINDOWEND
columns from both sides on aSELECT *
. WINDOWSTART
andWINDOWEND
are now reserved system column names. Any query that previously used those names will need to be changed: for example, alias the columns to a different name. These column names are being reserved for use as system columns when dealing with streams and tables that have a windowed key.- standalone literals that used to be doubles may now be interpreted as BigDecimal. In most scenarios, this won't affect any queries as the DECIMAL can auto-cast to DOUBLE; in the case were the literal stands alone, the output schema will be a DECIMAL instead of a DOUBLE. To specify a DOUBLE literal, use scientific notation (e.g. 1.234E-5).
- The response from the RESTful API has changed for some commands with this commit: the
SourceDescription
type no longer has aformat
field. Instead it haskeyFormat
andvalueFormat
fields. Response now includes astate
property for each query that indicates the state of the query. e.g.The CLI output was:{ "queryString" : "create table OUTPUT as select * from INPUT;", "sinks" : [ "OUTPUT" ], "id" : "CSAS_OUTPUT_0", "state" : "Running" }
and is now:ksql> show queries; Query ID | Kafka Topic | Query String CSAS_OUTPUT_0 | OUTPUT | CREATE STREAM OUTPUT WITH (KAFKA_TOPIC='OUTPUT', PARTITIONS=1, REPLICAS=1) AS SELECT * FROM INPUT INPUT EMIT CHANGES; CTAS_CLICK_USER_SESSIONS_5 | CLICK_USER_SESSIONS | CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT FROM CLICKSTREAM CLICKSTREAM WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID EMIT CHANGES; For detailed information on a Query run: EXPLAIN <Query ID>;
Note the addition of theQuery ID | Status | Kafka Topic | Query String CSAS_OUTPUT_0 | RUNNING | OUTPUT | CREATE STREAM OUTPUT WITH (KAFKA_TOPIC='OUTPUT', PARTITIONS=1, REPLICAS=1) AS SELECT *FROM INPUT INPUTEMIT CHANGES; For detailed information on a Query run: EXPLAIN <Query ID>;
Status
column and the fact thatQuery String
is now longer being written across multiple lines. old CLI output:New CLI output:ksql> describe CLICK_USER_SESSIONS; Name : CLICK_USER_SESSIONS Field | Type ROWTIME | BIGINT (system) ROWKEY | INTEGER (system) USERID | INTEGER COUNT | BIGINT For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
Note the addition of theksql> describe CLICK_USER_SESSIONS; Name : CLICK_USER_SESSIONS Field | Type ROWTIME | BIGINT (system) ROWKEY | INTEGER (system) (Window type: SESSION) USERID | INTEGER COUNT | BIGINT For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
Window Type
information. The extended version of the command has also changed. Old output:ksql> describe extended CLICK_USER_SESSIONS; Name : CLICK_USER_SESSIONS Type : TABLE Key field : USERID Key format : STRING Timestamp field : Not set - using <ROWTIME> Value Format : JSON Kafka topic : CLICK_USER_SESSIONS (partitions: 1, replication: 1) Statement : CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT FROM CLICKSTREAM CLICKSTREAM WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID EMIT CHANGES; Field | Type ROWTIME | BIGINT (system) ROWKEY | INTEGER (system) USERID | INTEGER COUNT | BIGINT
- Any
KEY
column identified in theWITH
clause must be of the same Sql type asROWKEY
. Users can provide the name of a value column that matches the key column, e.g.Before primitive keys was introduced all keys were treated asCREATE STREAM S (ID INT, NAME STRING) WITH (KEY='ID', ...);
STRING
. With primitive keysROWKEY
can be types other thanSTRING
, e.g.BIGINT
. It therefore follows that anyKEY
column identified in theWITH
clause must have the same SQL type as the actual key, i.e.ROWKEY
. With this change the above example statement will fail with the error:As the error message says, the error can be resolved by changing the statement to:The KEY field (ID) identified in the WITH clause is of a different type to the actual key column. Either change the type of the KEY field to match ROWKEY, or explicitly set ROWKEY to the type of the KEY field by adding 'ROWKEY INTEGER KEY' in the schema. KEY field type: INTEGER ROWKEY type: STRING
CREATE STREAM S (ROWKEY INT KEY, ID INT, NAME STRING) WITH (KEY='ID', ...);
- Some existing joins may now fail and the type of
ROWKEY
in the result schema of joins may have changed. WhenROWKEY
was always aSTRING
it was possible to join anINTEGER
column with aBIGINT
column. This is no longer the case. AJOIN
requires the join columns to be of the same type. (See confluentinc#4130 which tracks adding support for being able toCAST
join criteria). Where joining on twoINT
columns would previously have resulted in a schema containingROWKEY STRING KEY
, it would not result inROWKEY INT KEY
. - A
GROUP BY
on single expressions now changes the SQL type ofROWKEY
in the output schema of the query to match the SQL type of the expression. For example, consider:Previously, the above would have resulted in an output schema ofCREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...); CREATE TABLE OUTPUT AS SELECT COUNT(*) AS COUNT FROM INPUT GROUP BY ID;
ROWKEY STRING KEY, COUNT BIGINT
, whereROWKEY
would have stored the string representation of the integer from theID
column. With this commit the output schema will beROWKEY INT KEY COUNT BIGINT
. - Any
GROUP BY
expression that resolves toNULL
, including because a UDF throws an exception, now results in the row being excluded from the result. Previously, as the key was aSTRING
a value of"null"
could be used. With other primitive types this is not possible. As key columns must be non-null any exception is logged and the row is excluded. - commands that were persisted with RUN SCRIPT will no longer be executable
- the ARRAYCONTAINS function now needs to be referenced as either JSON_ARRAY_CONTAINS or ARRAY_CONTAINS depending on the intended param types
- A
PARTITION BY
now changes the SQL type ofROWKEY
in the output schema of a query. For example, consider:Previously, the above would have resulted in an output schema ofCREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...); CREATE STREAM OUTPUT AS SELECT ROWKEY AS NAME FROM INPUT PARTITION BY ID;
ROWKEY STRING KEY, NAME STRING
, whereROWKEY
would have stored the string representation of the integer from theID
column. With this commit the output schema will beROWKEY INT KEY, NAME STRING
. - any queries that were using array index mechanism should change to use 1-base indexing instead of 0-base.
- The maxInterval parameter for ksql-datagen is now deprecated. Use msgRate instead.
- this change makes it so that PARTITION BY statements use the source schema, not the value/projection schema, when selecting the value to partition by. This is consistent with GROUP BY, and standard SQL for GROUP by. Any statement that previously used PARTITION BY may need to be reworked. 1/2
- when querying with EMIT CHANGES and PARTITION BY, the PARTITION BY clause should now come before EMIT CHANGES. 2/2
- KSQL will now, by default, not create duplicate changelog for table sources.
fixes: confluentinc#3621
Now that Kafka Steams has a
KTable.transformValues
we no longer need to create a table by first creating a stream, then doing a select/groupby/aggregate on it. Instead, we can just useStreamBuilder.table
. This change makes the switch, removing theStreamToTable
types and calls and replaces them with eitherTableSource
orWindowedTableSource
, copying the existing pattern forStreamSource
andWindowedStreamSource
. It also reinstates a change inKsqlConfig
that ensures topology optimisations are on by default. This was the case for 5.4.x, but was inadvertently turned off. With the optimisation config turned on, and the new builder step used, KSQL no longer creates a changelog topic to back the tables state store. This is not needed as the source topic is itself the changelog. The change includes new tests intable.json
to confirm the change log topic is not created by default and is created if the user turns off optimisations. This change also removes the line in theTestExecutor
that explicitly sets topology optimisations toall
. The test should not of being doing tis. This may been why the bug turning off optimisations was not detected. - this change removes the old method of generating query IDs based on their sequence of successful execution. Instead all queries will use their offset in the command topic. Similarly, all DROP queries issued before 5.0 will no longer cascade query terminiation.
ALL
is now a reserved word and can not be used for identifiers without being quoted.- abs, ceil and floor will now return types aligned with other databases systems (i.e. the same type as the input). Previously these udfs would always return Double.
- Statements in the command topic will be retried until they succeed. For example, if the source topic has been deleted for a create stream/table statement, the server may fail to start since command runner will be stuck processing the statement. This ensures that the same set of streams/tables are created when restarting the server. You can check to see if the command runner is stuck by:
- Looking in the server logs to see if a statement is being retried.
- The JMX metric
_confluent-ksql-<service-id>ksql-rest-app-command-runner
will be in anERROR
state
v0.6.0 (2019-11-19)
- add config to disable pull queries when validation is required (#3879) (ccc636d), closes #3863
- add configurable metrics reporters (#3490) (378b8af)
- add flag to disable pull queries (MINOR) (#3778) (04e206f)
- add health check endpoint (#3501) (2308686)
- add KsqlUncaughtExceptionHandler and new KsqlRestConfig for enabling it (#3425) (d83c787)
- add request logging (#3518) (c401ec0)
- Add UDF invoker benchmark (#3592) (83dfc24)
- Added UDFs ENTRIES and GENERATE_SERIES (#3724) (0a4558b)
- build ks app from an execution plan visitor (#3418) (b57d194)
- build materializations from the physical plan (#3494) (f45d649)
- change /metadata REST path to /v1/metadata (#3467) (ed94895)
- Config file for the no-response bot which closes issues which haven't been responded to (#3765) (1dfdb68)
- drop legacy key field functionality (#3764) (5369dc2)
- expose query status through EXPLAIN (#3570) (8ef82eb)
- expression support for insert values (#3612) (37f9763)
- Implement complex expressions for table functions (#3683) (200022b)
- Implement describe and list functions for UDTFs (#3716) (b0bbea4)
- Implement EXPLODE(ARRAY) for single table function in SELECT (#3589) (8b52aa8)
- Implement schemaProvider for UDTFs (#3690) (4e66825)
- Implement user defined table functions (#3687) (e62bd46)
- Makes timeout for owner lookup in StaticQueryExecutor and rebalancing in KsStateStore configurable (#3856) (39245c6)
- pass auth header to connect client for RBAC integration (#3492) (cef0ea3)
- serialize expressions (#3721) (e1cd477)
- Support multiple table functions in queries (#3685) (44be5a2)
- support numeric json serde for decimals (#3588) (8621594)
- support quoted identifiers in column names (#3477) (be2bdcc)
- Transactional Produces to Command Topic (#3660) (cba2877)
- static: allow logical schema to have fields in any order (#3422) (d935af3)
- static: allow windowed queries without bounds on rowtime (#3438) (6593ee3)
- static: fail on ROWTIME in projection (#3430) (2f27b68)
- static: support for partial datetimes on
WindowStart
bounds (#3435) (99f6e24) - static: support ROWKEY in the projection of static queries (#3439) (9218766)
- static: switch partial datetime parser to use UTC by default (#3473) (81557e3)
- static: unordered table elements and meta columns serialization (#3428) (3b23fd6)
- add KsqlRocksDBConfigSetter to bound memory and set num threads (#3167) (cdcaa2d)
- add logs/ to .gitignore (MINOR) (#3353) (81272cf)
- add offest to QueuedCommand and flag to Command (#3343) (fd112a4)
- add option for datagen to produce indefinitely (MINOR) (#3307) (6281738)
- add REST /metadata path to display KSQL server information (replaces /info) (#3313) (8be29b9)
- add SHOW TYPES to list all custom types (#3280) (13fde33)
- Add support for average aggregate function (#3302) (6757d9f)
- back out Connect auto-import feature (#3386) (d4c748c)
- build execution plan from structured package (#3285) (0d0b1c3)
- change the public API of schema provider method (#3287) (1324285)
- custom comparators for array, map and struct (#3385) (fe63d21)
- Implement ROUND() UDF (#3404) (f9783a9)
- Implement user defined delimiter for value format (#3393) (b84d0aa)
- move aggregation to plan builder (#3391) (3aaeb73)
- move filters to plan builder (#3346) (d4d52f3)
- move groupBy into plan builders (#3359) (730c913)
- move joins to plan builder (#3361) (e243c74)
- move selectKey impl to plan builder (#3362) (f312fcc)
- move setup of the sink to plan builders (#3360) (bfbdc20)
- remove equalsIgnoreCase from all Name classes (MINOR) (#3411) (b78619c)
- update query id generation to use command topic record offset (#3354) (295314e)
- cli: add the feature to turn of WRAP for CLI output (#3341) (3814c71)
- static: add custom jackson JSON serde for handling LogicalSchema (#3322) (c571508)
- static: add forEach() to KsqlStruct (MINOR) (#3320) (587b545)
- static: initial syntax for static queries (#3300) (8917e48)
- static: static select support (#3369) (e4b3275)
- move toTable kstreams calls to plan builder (#3334) (06aa252)
- use coherent naming scheme for generated java code (#3417) (06a2a57)
- static: initial drop of static query functionality (#3340) (54c5139)
- add ability to DROP custom types (#3281) (32005ed)
- Add schema resolver method to UDF specification (#3215) (08855ad)
- add support to register custom types with KSQL (
CREATE TYPE
) (#3266) (08ffebf) - include error message in DESCRIBE CONNECTOR (MINOR) (#3289) (458f1d8)
- perform topic permission checks for KSQL service principal (#3261) (ba1f613)
- wire in the KS config needed for point queries (MINOR) (#3251) (5152d06)
- add a new module ksql-execution for the execution plan interfaces (#3125) (3251d25)
- add an initial set of execution steps (#3214) (c860793)
- add config for custom metrics tags (5.3.x) (#2996) (76f5590)
- add config for enabling topic access validator (#3079) (440e247)
- add connect templates and simplify JDBC source (MINOR) (#3231) (ba0fb99)
- add DESCRIBE functionality for connectors (#3206) (a79adb4)
- add DROP CONNECTOR functionality (#3245) (103c958)
- add extension for custom metrics (5.3.x) (#2997) (94a8ae7)
- add Logarithm, Exponential and Sqrt functions (#3091) (a4ca934)
- add SHOW CONNECTORS functionality (#3210) (0bf31eb)
- Add SHOW TOPICS EXTENDED (#3183) (dd3eb5f), closes #1268
- add SIGN, REPLACE and INITCAP (#3189) (ab67684)
- Add window size for time window tables (#3102) (6ff07d5)
- allow for decimals to be used as input types for UDFs (#3217) (4a2e4b9)
- enhance datagen for use as a load generator (#3230) (ddb970b)
- Extend UdfLoader to allow loading specific classes with UDFs (#3234) (99c79b3)
- format error messages better (MINOR) (#3233) (c727d50)
- improved error message and updated error code for PrintTopics command (#3246) (4b94f22)
- some robustness improvements for Connect integration (#3227) (bc1a2f8)
- spin up a connect worker embedded inside the KSQL JVM (#3241) (4d7ef2a)
- validate createTopic permissions on SandboxedKafkaTopicClient (#3250) (0ea157b)
- data-gen: support KAFKA format in DataGen (#3120) (cb7abcc)
- ksql-connect: poll connect-configs and auto register sources (#3178) (6dd21fd)
- wrap timestamps in ROWTIME expressions with STRINGTOTIMESTAMP (#3160) (42acd78)
- ksql-connect: introduce ConnectClient for REST requests (#3137) (15548ce)
- ksql-connect: introduce syntax for CREATE CONNECTOR (syntax only) (#3139) (e823659)
- ksql-connect: wiring for creating connectors (#3149) (cd20d57)
- udfs: generic support for UDFs (#3054) (a381c48)
- cli: improve CLI transient queries with headers and spacing (partial fix for #892) (#3047) (050b72a)
- serde: kafka format (#3065) (2b5c3d1)
- add basic support for key syntax (#3034) (ca6478a)
- Add REST and Websocket authorization hooks and interface (#3000) (39af991)
- Arrays should be indexable from the end too (#3004) (a166075), closes #2974
- decimal math with other numbers (#3001) (14d2bb7)
- New UNIX_TIMESTAMP and UNIX_DATE datetime functions (#2459) (39ce7f4)
- do not spam the logs with config defs (#3044) (94904a3)
- Only look up index of new key field once, not per row processed (#3020) (fda1c7f)
- Remove parsing of integer literals (#3019) (6195b76)
/query
rest endpoint should return valid JSON (#3819) (b278e83)- address upstream change in KafkaAvroDeserializer (#3372) (b32e6a9)
- address upstream change in KafkaAvroDeserializer (revert previous fix) (#3437) (bed164b)
- allow streams topic prefixed configs (#3691) (939c45a), closes #817
- apply filter before flat mapping in logical planner (#3730) (f4bd083)
- band-aid around RestQueryTranslationTest (#3326) (677e03c)
- be more lax on validating config (#3599) (3c80cf1), closes #2279
- better error message if tz invalid in datetime string (#3449) (e93c445)
- better error message when users enter old style syntax for query (#3397) (f948ec0)
- changes required for compatibility with KIP-479 (#3466) (567f056)
- Created new test for running topology generation manually from the IDE, and removed main() from TopologyFileGenerator (#3609) (381e563)
- do not allow inserts into tables with null key (#3605) (7e326b7), closes #3021
- do not allow WITH clause to be created with invalid WINDOW_SIZE (#3432) (96bfc11)
- Don't throw NPE on null columns (#3647) (6969768), closes #3617
- drop
TopicDescriptionFactory
class (#3528) (5281c74) - ensure default server config works with IP6 (fixes #3309) (#3310) (92b03ec)
- error message when UDAF has STRUCT type with no schema (#3407) (49f456e)
- fix Avro schema validation (#3499) (a59954d)
- fix broken map coersion test case (#3694) (b5ea24c)
- fix NPE when printing records with empty value (MINOR) (#3470) (47313ff)
- fix parsing issue with LIST property overrides (#3601) (6459fa1)
- fix test for KIP-307 final PR (#3402) (d77db50)
- improve error message on query or print statement on /ksql (#3337) (dae28eb)
- improve print topic error message when topic does not exist (#3464) (0fa4d24)
- include lower-case identifiers among those that need quotes (#3723) (62c47bf)
- make sure use of non threadsafe UdfIndex is synchronized (#3486) (618aae8)
- pull queries available on
/query
rest & ws endpoint (#3820) (9a47eaf), closes #3672 #3495 - quoted identifiers for source names (#3695) (7d3cf92)
- race condition in KsStateStore (#3474) (7336389)
- Remove dependencies on test-jars from ksql-functional-tests jar (#3421) (e09d6ad)
- Remove duplicate ksql-common dependency in ksql-streams pom (#3703) (0620906)
- Remove unnecessary arg coercion for UDFs (#3595) (4c42530)
- Rename Delimiter:parse(char c) to Delimiter.of(char c) (#3433) (8716c41)
- renamed method to avoid checkstyle error (#3652) (a8a3588)
- revert ipv6 address and update docs (#3314) (0ff4a51)
- Revert named stores in expected topologies, disable naming stores from StreamJoined, re-enable join tests. (#3550) (0b8ccc1), closes #3364
- should be able to parse empty STRUCT schema (MINOR) (#3318) (a6549e1)
- Some renaming around KsqlFunction etc (#3747) (b30d965)
- standardize KSQL up-casting (#3516) (7fe8772)
- support NULL return values from CASE statements (#3531) (eb9e41b)
- support UDAFs with different intermediate schema (#3412) (70e10e9)
- switch AdminClient to be sandbox proxy (#3351) (6747d5c)
- typo in static query WHERE clause example (#3423) (7ad3248)
- Update repartition processor names for KAFKA-9098 (#3802) (2b86cd8)
- Use the correct Immutable interface (#3488) (a1096bf)
- 3356: struct rewritter missed EXPLAIN (#3398) (daf974b)
- 3441: stabilize the StaticQueryFunctionalTest (#3442) (44ae3a0), closes #3441
- 3524: improve pull query error message (#3540) (2be8385), closes #3524
- 3525: sET should only affect statements after it (#3529) (5315f1e), closes #3525
- address deprecation of getAdminClient (#3276) (6a50fca)
- error message with DROP DELETE TOPIC is invalid (#3279) (4284b8c)
- find bugs issue in KafkaTopicClientImpl (#3268) (70e880f)
- fixed how wrapped KsqlTopicAuthorizationException error messages are displayed (#3258) (63672ae)
- improve escaping of identifiers (#3295) (04435d7)
- respect reserved words in more clauses for SqlFormatter (MINOR) (#3284) (6974a80)
- schema converters should handle List and Map impl (#3290) (af779dc)
COLLECT_LIST
can now be applied to tables (#3104) (c239785)- add ksql-functional-tests to the ksql package (#3111) (9548135)
- authorization filter is logging incorrect username (#3138) (b15c6d0)
- broken build due to bad import statements (#3204) (8ec4c2b)
- check for other sources using a topic before deleting (#3070) (b3fa315)
- default timestamp extractor override is not working (#3176) (d1db07b)
- drop succeeds even if missing topic or schema (#3131) (ba03d6f)
- dummy placeholder class in ksql-execution (#3142) (c9f1cbb)
- expose user/password command line options (#3129) (1fd70fa)
- filter null entries before creating KafkaConfigStore (#3147) (2852af1)
- fix auth error message with insert values command (#3257) (abe410a)
- Implement new KIP-455 AdminClient AlterPartitionReassignments and tPartitionReassignments APIs (#3218) (d951026)
- incorrect SR authorization message is displayed (#3186) (b3b6c82)
- logicalSchema toString() to include key fields (MINOR) (#3123) (0984529)
- Remove delete.topic.enable check (#3089) (71ec1c0)
- remove non-standard JavaFx usage of Pair (#3145) (3508847)
- replace nashorn @Immutable with errorprone for JDK12 (MINOR) (#3239) (0c47a34)
- request / on makeRootRequest instead of /info (#3197) (7935488)
- the QTTs now run through SqlFormatter & other formatting fixes (#3222) (79da68c)
- use errorprone Immutable annotation instead of nashorn version (#3150) (e7f5e17)
DESCRIBE
now works for sources with decimal types (#3083) (0eaa101)- don't log out to stderr on parser errors (#3052) (29dea47)
- drop describe topic functionality (MINOR) (#3072) (1290b82)
- ensure topology generator test runs in build (#3067) (3168150)
- misplaced commas when formatting CTAS statements (#3058) (c05615d)
- remove any rowtime or rowkey columns from query schema (MINOR) (Fixes 3039) (#3043) (0346933)
- remove last of registered topics stuff from api / cli (MINOR) (#3068) (24d874c)
- sqlformatter to correctly handle describe (#3074) (8de57bd)
- Introduced
EMIT CHANGES
syntax to differentiate push queries from new pull queries. Persistent push queries do not yet require anEMIT CHANGES
clause, but transient push queries do. - the response from the RESTful API for push queries has changed: it is now a valid JSON document containing a JSON array, where each element is JSON object containing either a row of data, an error message, or a final message. The
terminal
field has been removed. - the response from the RESTful API for push queries has changed: it now returns a line with the schema and query id in a
header
field and null fields are not included in the payload. The CLI is backwards compatible with older versions of the server, though it won't output column headings from older versions. - If users are relying on the previous behaviour of uppercasing topic names, this change breaks that
- If no value is passed for the KSQL datagen option
iterations
, datagen will now produce indefinitely, rather than terminating after a default of 1,000,000 rows. - Previously CLI startup permissions were based on whether the user has access to /info, but now it's based on whether the user has access to /. This change implies that if a user has permissions to / but not /info, they now have access to the CLI whereas previously they did not.
- "SHOW TOPICS" no longer includes the "Consumers" and
"ConsumerGroups" columns. You can use "SHOW TOPICS EXTENDED" to get the
output previous emitted from "SHOW TOPICS". See below for examples.
This change splits "SHOW TOPICS" into two commands:
- "SHOW TOPICS EXTENDED", which shows what was previously shown by
"SHOW TOPICS". Sample output:
ksql> show topics extended; Kafka Topic | Partitions | Partition Replicas | Consumers | ConsumerGroups -------------------------------------------------------------------------------------------------------------------------------------------------------------- _confluent-command | 1 | 1 | 1 | 1 _confluent-controlcenter-5-3-0-1-actual-group-consumption-rekey | 1 | 1 | 1 | 1
- "SHOW TOPICS", which now no longer queries consumer groups and their
active consumers. Sample output:
ksql> show topics; Kafka Topic | Partitions | Partition Replicas --------------------------------------------------------------------------------------------------------------------------------- _confluent-command | 1 | 1 _confluent-controlcenter-5-3-0-1-actual-group-consumption-rekey | 1 | 1
- "SHOW TOPICS EXTENDED", which shows what was previously shown by
"SHOW TOPICS". Sample output:
Release notes for KSQL releases prior to ksqlDB v0.6.0 can be found at docs/changelog.rst.