[Unique Fields] : Unique Field calculation problem in customer instance #31262

Open
jcastro-dotcms opened this issue Jan 29, 2025 · 2 comments · Fixed by #31281

Comments

@jcastro-dotcms (Contributor)

Parent Issue

Problem Statement

When the DOT_FEATURE_FLAG_DB_UNIQUE_FIELD_VALIDATION feature flag is enabled in a specific customer environment, the initialization process takes a long time and then fails with the following error in the log:

20:26:35.772  ERROR lang.Class - ERROR: invalid input syntax for type bytea{
  "SQL": ["INSERT INTO unique_fields (unique_key_val, supporting_values) SELECT encode(sha256(CONCAT(content_type_id, field_var_name, language_id, field_value,                   CASE WHEN uniquePerSite = 'true' THEN host_id ELSE '' END)::bytea), 'hex') as unique_key_val,        json_build_object('contentTypeId', content_type_id, 'fieldVariableName', field_var_name, 'languageId', language_id, 'fieldValue', field_value, 'siteId', host_id, 'variant', variant_id, 'uniquePerSite', uniquePerSite, 'live', live, 'contentletIds', contentlet_identifier) AS supporting_values FROM (        SELECT structure.inode                                       AS content_type_id,               field.velocity_var_name                               AS field_var_name,               contentlet.language_id                                AS language_id,               identifier.host_inode                                 AS host_id,               jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name)::jsonb ->>'value' AS field_value,               ARRAY_AGG(DISTINCT contentlet.identifier)                      AS contentlet_identifier,               (CASE WHEN COUNT(DISTINCT contentlet_version_info.variant_id) > 1 THEN 'DEFAULT' ELSE MAX(contentlet_version_info.variant_id) END) AS variant_id,                ((CASE WHEN COUNT(*) > 1 AND COUNT(DISTINCT contentlet_version_info.live_inode = contentlet.inode) > 1 THEN 0                    ELSE MAX((CASE WHEN contentlet_version_info.live_inode = contentlet.inode THEN 1 ELSE 0 END)::int)                    END) = 1) AS live,               (MAX(CASE WHEN field_variable.variable_value = 'true' THEN 1 ELSE 0 END)) = 1 AS uniquePerSite        FROM contentlet                 INNER JOIN structure ON structure.inode = contentlet.structure_inode                 INNER JOIN field ON structure.inode = field.structure_inode                 INNER JOIN identifier ON contentlet.identifier = identifier.id                 INNER JOIN contentlet_version_info ON contentlet_version_info.live_inode = contentlet.inode OR                                                       contentlet_version_info.working_inode = contentlet.inode                 LEFT JOIN field_variable ON field_variable.field_id = field.inode AND field_variable.variable_key = 'uniquePerSite'        WHERE jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name) IS NOT NULL          AND field.unique_ = true        GROUP BY structure.inode,                 field.velocity_var_name,                 contentlet.language_id,                 identifier.host_inode,                 jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name)::jsonb ->>'value') as data_to_populate"],
  "maxRows": [-1],
  "offest": [0]
}
com.dotmarketing.exception.DotDataException: ERROR: invalid input syntax for type bytea{
  "SQL": ["INSERT INTO unique_fields (unique_key_val, supporting_values) SELECT encode(sha256(CONCAT(content_type_id, field_var_name, language_id, field_value,                   CASE WHEN uniquePerSite = 'true' THEN host_id ELSE '' END)::bytea), 'hex') as unique_key_val,        json_build_object('contentTypeId', content_type_id, 'fieldVariableName', field_var_name, 'languageId', language_id, 'fieldValue', field_value, 'siteId', host_id, 'variant', variant_id, 'uniquePerSite', uniquePerSite, 'live', live, 'contentletIds', contentlet_identifier) AS supporting_values FROM (        SELECT structure.inode                                       AS content_type_id,               field.velocity_var_name                               AS field_var_name,               contentlet.language_id                                AS language_id,               identifier.host_inode                                 AS host_id,               jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name)::jsonb ->>'value' AS field_value,               ARRAY_AGG(DISTINCT contentlet.identifier)                      AS contentlet_identifier,               (CASE WHEN COUNT(DISTINCT contentlet_version_info.variant_id) > 1 THEN 'DEFAULT' ELSE MAX(contentlet_version_info.variant_id) END) AS variant_id,                ((CASE WHEN COUNT(*) > 1 AND COUNT(DISTINCT contentlet_version_info.live_inode = contentlet.inode) > 1 THEN 0                    ELSE MAX((CASE WHEN contentlet_version_info.live_inode = contentlet.inode THEN 1 ELSE 0 END)::int)                    END) = 1) AS live,               (MAX(CASE WHEN field_variable.variable_value = 'true' THEN 1 ELSE 0 END)) = 1 AS uniquePerSite        FROM contentlet                 INNER JOIN structure ON structure.inode = contentlet.structure_inode                 INNER JOIN field ON structure.inode = field.structure_inode                 INNER JOIN identifier ON contentlet.identifier = identifier.id                 INNER JOIN contentlet_version_info ON contentlet_version_info.live_inode = contentlet.inode OR                                                       contentlet_version_info.working_inode = contentlet.inode                 LEFT JOIN field_variable ON field_variable.field_id = field.inode AND field_variable.variable_key = 'uniquePerSite'        WHERE jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name) IS NOT NULL          AND field.unique_ = true        GROUP BY structure.inode,                 field.velocity_var_name,                 contentlet.language_id,                 identifier.host_inode,                 jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name)::jsonb ->>'value') as data_to_populate"],
  "maxRows": [-1],
  "offest": [0]
}
	at com.dotmarketing.common.db.DotConnect.loadResult(DotConnect.java:310) ~[?:?]
	at com.dotmarketing.common.db.DotConnect.loadObjectResults(DotConnect.java:997) ~[?:?]
	at com.dotcms.contenttype.business.uniquefields.extratable.UniqueFieldDataBaseUtil.populateUniqueFieldsTable(UniqueFieldDataBaseUtil.java:478) ~[?:?]
	at com.dotcms.contenttype.business.uniquefields.extratable.UniqueFieldDataBaseUtil.createTableAnsPopulate(UniqueFieldDataBaseUtil.java:442) ~[?:?]
	at com.dotcms.contenttype.business.uniquefields.extratable.UniqueFieldDataBaseUtil$Proxy$_$$_WeldClientProxy.createTableAnsPopulate(Unknown Source) ~[?:?]
	at com.dotcms.contenttype.business.uniquefields.UniqueFieldsValidationInitializer.init(UniqueFieldsValidationInitializer.java:45) ~[?:?]
	at com.dotcms.config.DotInitializationService.initService(DotInitializationService.java:74) ~[?:?]
	at java.base/java.lang.Iterable.forEach(Iterable.java:75) ~[?:?]
	at com.dotcms.config.DotInitializationService.initialize(DotInitializationService.java:114) ~[?:?]
	at com.liferay.portal.servlet.MainServlet.init(MainServlet.java:212) ~[?:?]
	at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1106) ~[?:?]
	at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1063) ~[?:?]
	at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:960) ~[?:?]
	at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4641) ~[?:?]
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:4948) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:171) ~[?:?]
	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:683) ~[?:?]
	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:658) ~[?:?]
	at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:661) ~[?:?]
	at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1186) ~[?:?]
	at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1941) ~[?:?]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) ~[?:?]
	at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:123) ~[?:?]
	at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:1096) ~[?:?]
	at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:478) ~[?:?]
	at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1619) ~[?:?]
	at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:318) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:114) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.setStateInternal(LifecycleBase.java:402) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.setState(LifecycleBase.java:345) ~[?:?]
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:893) ~[?:?]
	at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:794) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:171) ~[?:?]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1332) ~[?:?]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1322) ~[?:?]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) ~[?:?]
	at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:145) ~[?:?]
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:866) ~[?:?]
	at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:248) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:171) ~[?:?]
	at org.apache.catalina.core.StandardService.startInternal(StandardService.java:433) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:171) ~[?:?]
	at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:921) ~[?:?]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:171) ~[?:?]
	at org.apache.catalina.startup.Catalina.start(Catalina.java:772) ~[?:?]
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[?:?]
	at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:347) ~[bootstrap.jar:9.0.85]
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:478) ~[bootstrap.jar:9.0.85]
Caused by: org.postgresql.util.PSQLException: ERROR: invalid input syntax for type bytea
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725) ~[?:?]
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412) ~[?:?]
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:371) ~[?:?]
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:502) ~[?:?]
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:419) ~[?:?]
	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:341) ~[?:?]
	at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:326) ~[?:?]
	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:302) ~[?:?]
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:297) ~[?:?]
	at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:95) ~[?:?]
	at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java) ~[?:?]
	at com.dotmarketing.common.db.DotConnect.executeQuery(DotConnect.java:698) ~[?:?]
	at com.dotmarketing.common.db.DotConnect.executeQuery(DotConnect.java:600) ~[?:?]
	at com.dotmarketing.common.db.DotConnect.loadResult(DotConnect.java:308) ~[?:?]
	... 51 more
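
For context on the error itself: in PostgreSQL, casting text to bytea parses the string using bytea input syntax, so a field value containing a stray backslash (or a "\x" prefix that is not valid hex) makes the CONCAT(...)::bytea expression in the query above fail with exactly this "invalid input syntax for type bytea" message. The sketch below is only an illustration with a made-up literal; the convert_to(..., 'UTF8') form, which encodes the text instead of parsing it, is the one that appears in the revised query quoted in the comment further down.

```sql
-- Minimal sketch of the suspected failure mode (the literal value is hypothetical).
-- Casting text to bytea parses the string as bytea escape/hex syntax,
-- so a value containing a lone backslash is rejected:
SELECT encode(sha256(CONCAT('value with a backslash \x_oops')::bytea), 'hex');
-- ERROR:  invalid input syntax for type bytea

-- convert_to() encodes the text as UTF-8 bytes without parsing it, matching
-- the form used in the revised query quoted in the comment below:
SELECT encode(sha256(convert_to(CONCAT('value with a backslash \x_oops'), 'UTF8')), 'hex');
```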

Considering that the same initialization process works with smaller datasets, we believe this error might be related to data inconsistencies or corruption. There are a few changes we have in mind:

  • Migrate the SQL query to a stored procedure so that we have more control over the data being processed and can report any potential errors with more detailed information.
  • Provide information on the number of contentlets that will be processed or affected during the execution of this task (see the count sketch after this list).
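
As a rough illustration of the second point, a simple count over the core joins used by the INSERT above could be logged before the unique_fields table is populated, so it is clear up front how much data the task is about to scan. This is only a sketch; the table and column names are taken from the query in the error log, and nothing beyond that is assumed.

```sql
-- Sketch only: report how many contentlet/unique-field value pairs the
-- populate step would scan, reusing the core joins and filters from the
-- INSERT statement in the error log above.
SELECT COUNT(*) AS unique_field_values_to_scan
FROM contentlet
         INNER JOIN structure ON structure.inode = contentlet.structure_inode
         INNER JOIN field ON structure.inode = field.structure_inode
WHERE field.unique_ = true
  AND jsonb_extract_path_text(contentlet.contentlet_as_json -> 'fields', field.velocity_var_name) IS NOT NULL;
```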

Steps to Reproduce

The issue was reported by the Enterprise Support Team after enabling the database-based unique field validation and restarting the customer's instance.

Acceptance Criteria

We need to improve the error handling and reporting of the process that loads unique field data in a dotCMS instance.

dotCMS Version

25.01.17, and latest main

Proposed Objective

Core Features

Proposed Priority

Priority 1 - Show Stopper

External Links... Slack Conversations, Support Tickets, Figma Designs, etc.

Slack threads:

Assumptions & Initiation Needs

No response

Quality Assurance Notes & Workarounds

No response

Sub-Tasks & Estimates

No response


@yolabingo (Contributor)

On dotcms:25.02.05_d00f61f we encountered this error:

com.dotmarketing.exception.DotDataException: ERROR: duplicate key value violates unique constraint "unique_fields_pkey"
 Detail: Key (unique_key_val)=(d6664580dff260e9d6595645df6cbaf9315d05d38ba3543b5227040df9b366e0) already exists.{
 "SQL": ["INSERT INTO unique_fields (unique_key_val, supporting_values) SELECT encode(  sha256(    convert_to(     CONCAT(      content_type_id::text,      field_var_name::text,      language_id::text,      field_value::text,      CASE WHEN uniquePerSite = 'true' THEN COALESCE(host_id::text, '') ELSE '' END     ),     'UTF8'    )  ),  'hex'  ) AS unique_key_val,  json_build_object('contentTypeId', content_type_id, 'fieldVariableName', field_var_name, 'languageId', language_id, 'fieldValue', field_value, 'siteId', host_id, 'variant', variant_id, 'uniquePerSite', uniquePerSite, 'live', live, 'contentletIds', contentlet_identifier) AS supporting_values FROM (  SELECT structure.inode       AS content_type_id,   field.velocity_var_name      AS field_var_name,   contentlet.language_id      AS language_id,   identifier.host_inode      AS host_id,   jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name)::jsonb ->>'value' AS field_value,   ARRAY_AGG(DISTINCT contentlet.identifier)    AS contentlet_identifier,   (CASE WHEN COUNT(DISTINCT contentlet_version_info.variant_id) > 1 THEN 'DEFAULT' ELSE MAX(contentlet_version_info.variant_id) END) AS variant_id,   ((CASE WHEN COUNT(*) > 1 AND COUNT(DISTINCT contentlet_version_info.live_inode = contentlet.inode) > 1 THEN 0    ELSE MAX((CASE WHEN contentlet_version_info.live_inode = contentlet.inode THEN 1 ELSE 0 END)::int)    END) = 1) AS live,   (MAX(CASE WHEN field_variable.variable_value = 'true' THEN 1 ELSE 0 END)) = 1 AS uniquePerSite  FROM contentlet    INNER JOIN structure ON structure.inode = contentlet.structure_inode    INNER JOIN field ON structure.inode = field.structure_inode    INNER JOIN identifier ON contentlet.identifier = identifier.id    INNER JOIN contentlet_version_info ON contentlet_version_info.live_inode = contentlet.inode OR          contentlet_version_info.working_inode = contentlet.inode    LEFT JOIN field_variable ON field_variable.field_id = field.inode AND field_variable.variable_key = 'uniquePerSite'  WHERE jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name) IS NOT NULL  AND field.unique_ = true  GROUP BY structure.inode,    field.velocity_var_name,    contentlet.language_id,    identifier.host_inode,    jsonb_extract_path_text(contentlet_as_json -> 'fields', field.velocity_var_name)::jsonb ->>'value') as data_to_populate"],
 "maxRows": [-1],
 "offest": [0]
 }
 	at com.dotmarketing.common.db.DotConnect.loadResult(DotConnect.java:310) ~[?:?]
 	at com.dotmarketing.common.db.DotConnect.loadObjectResults(DotConnect.java:997) ~[?:?]
 	at com.dotcms.contenttype.business.uniquefields.extratable.UniqueFieldDataBaseUtil.populateUniqueFieldsTable(UniqueFieldDataBaseUtil.java:482) ~[?:?]
 	at com.dotcms.contenttype.business.uniquefields.extratable.UniqueFieldDataBaseUtil.createTableAnsPopulate(UniqueFieldDataBaseUtil.java:446) ~[?:?]
 	at com.dotcms.contenttype.business.uniquefields.extratable.UniqueFieldDataBaseUtil$Proxy$_$$_WeldClientProxy.createTableAnsPopulate(Unknown Source) ~[?:?]
 	at com.dotcms.contenttype.business.uniquefields.UniqueFieldsValidationInitializer.init(UniqueFieldsValidationInitializer.java:45) ~[?:?]
 	at com.dotcms.config.DotInitializationService.initService(DotInitializationService.java:74) ~[?:?]
 	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
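
One possible explanation for the duplicate key, offered as an assumption rather than a confirmed diagnosis: the inner SELECT groups by identifier.host_inode, but when uniquePerSite is false the site is excluded from the hashed key, so the same field value appearing on two different sites produces two rows with an identical unique_key_val. The diagnostic below is only a sketch along those lines, reusing names from the query in the error above.

```sql
-- Hypothetical diagnostic: find unique-field values that exist on more than
-- one site. When the field is not marked uniquePerSite, such values would
-- hash to the same unique_key_val and trip the primary key on unique_fields.
SELECT structure.inode          AS content_type_id,
       field.velocity_var_name  AS field_var_name,
       contentlet.language_id   AS language_id,
       jsonb_extract_path_text(contentlet.contentlet_as_json -> 'fields', field.velocity_var_name)::jsonb ->> 'value' AS field_value,
       COUNT(DISTINCT identifier.host_inode) AS sites_with_this_value
FROM contentlet
         INNER JOIN structure ON structure.inode = contentlet.structure_inode
         INNER JOIN field ON structure.inode = field.structure_inode
         INNER JOIN identifier ON contentlet.identifier = identifier.id
WHERE field.unique_ = true
  AND jsonb_extract_path_text(contentlet.contentlet_as_json -> 'fields', field.velocity_var_name) IS NOT NULL
GROUP BY 1, 2, 3, 4
HAVING COUNT(DISTINCT identifier.host_inode) > 1;
```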
