diff --git a/modules/ROOT/pages/purge-tool-deletion-volume-testing.adoc b/modules/ROOT/pages/purge-tool-deletion-volume-testing.adoc deleted file mode 100644 index efa471d8da..0000000000 --- a/modules/ROOT/pages/purge-tool-deletion-volume-testing.adoc +++ /dev/null @@ -1,436 +0,0 @@ -= Database deletion volume testing -:description: The following tests have been run in the following databases. - -The following tests have been run in the following databases. - -== Testing environment - -Tests were run on Intel Core i7 10th Gen, 8 CPU cores, 16 Gb of RAM. + -Database running inside a docker container. + -Purge tool run from the same machine. - -== PostgreSQL 11.x - -Remove 1 160 800 archived cases (corresponds to 6 964 800 lines in the database) + -Total Time: 71m30s - -=== Execution - -[source,log] ----- -╰─$ bin/bonita-purge-tool 6547377706517145159 1537600000000 - ____ _ _ _ _ - | _ \ (_) | | | | | - | |_) | ___ _ __ _| |_ __ _ _ __ _ _ _ __ __ _ ___ | |_ ___ ___ | | - | _ < / _ \| '_ \| | __/ _` | | '_ \| | | | '__/ _` |/ _ \ | __/ _ \ / _ \| | - | |_) | (_) | | | | | || (_| | | |_) | |_| | | | (_| | __/ | || (_) | (_) | | - |____/ \___/|_| |_|_|\__\__,_| | .__/ \__,_|_| \__, |\___| \__\___/ \___/|_| - | | __/ | - |_| |___/ -2020-02-21 18:19:40.072 INFO 8946 -[main] o.bonitasoft.engine.purge.ApplicationKt : Starting ApplicationKt on manu-DellXPS with PID 8946 (/home/manu/workspace/bonita-purge-tool/build/bonita-purge-tool/lib/bonita-purge-tool.jar started by manu in /home/manu/workspace/bonita-purge-tool/build/bonita-purge-tool) -2020-02-21 18:19:40.076 INFO 8946 -[main] o.bonitasoft.engine.purge.ApplicationKt : No active profile set, falling back to default profiles: default -2020-02-21 18:19:41.551 INFO 8946 -[main] org.bonitasoft.engine.purge.Application : Using datasource with HikariDataSource (null) -2020-02-21 18:19:41.711 INFO 8946 -[main] o.s.s.c.ThreadPoolTaskScheduler : Initializing ExecutorService 'taskScheduler' -2020-02-21 18:19:41.798 INFO 8946 
-[main] o.bonitasoft.engine.purge.ApplicationKt : Started ApplicationKt in 2.611 seconds (JVM running for 3.325) -2020-02-21 18:19:41.828 INFO 8946 -[main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting... -2020-02-21 18:19:42.149 INFO 8946 -[main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed. -2020-02-21 18:19:42.283 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Database URL is jdbc:postgresql://localhost:5432/bonita -2020-02-21 18:19:42.284 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Tenant id used is 1 -2020-02-21 18:19:42.284 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : All settings can be changed in application.properties file -2020-02-21 18:19:43.218 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Will purge all archived process instances and their elements for process 'RFS_n_ACT_MOCKED' in version '1.1' that are finished since at least 2018-09-22T09:06:40 -Start the purge using the above parameters? [y/N] -y -2020-02-21 18:19:52.609 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Starting archive process instance purge.... -2020-02-21 18:22:00.446 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 6964800 rows from table ARCH_PROCESS_INSTANCE... 
-2020-02-21 18:23:43.453 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 3299016 rows from table ARCH_CONTRACT_DATA in 103000 ms -2020-02-21 19:00:04.318 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 24189472 rows from table ARCH_DATA_INSTANCE in 2180864 ms -2020-02-21 19:00:04.336 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 0 rows from table ARCH_DOCUMENT_MAPPING in 17 ms -2020-02-21 19:09:05.510 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 23216120 rows from table ARCH_FLOWNODE_INSTANCE in 541173 ms -2020-02-21 19:09:27.984 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 2321612 rows from table ARCH_PROCESS_COMMENT in 22474 ms -2020-02-21 19:09:27.989 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 0 rows from table ARCH_REF_BIZ_DATA_INST in 5 ms -2020-02-21 19:09:30.396 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 0 rows from table ARCH_CONNECTOR_INSTANCE in 2407 ms -2020-02-21 19:09:30.404 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 0 rows from table ARCH_CONTRACT_DATA in 7 ms -2020-02-21 19:29:02.660 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 4643224 rows from table ARCH_DATA_INSTANCE in 1172256 ms -2020-02-21 19:31:17.330 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Deleted 8125642 rows from table ARCH_CONNECTOR_INSTANCE in 134669 ms -2020-02-21 19:31:17.352 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Archive process instance purge completed. -2020-02-21 19:31:17.353 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : Some of the deleted elements may still appear in Bonita applications for a short while. -2020-02-21 19:31:17.353 INFO 8946 -[main] o.b.e.purge.DeleteOldProcessInstances : If you try to access them you will get a not found error. This is the expected behaviour. 
-2020-02-21 19:31:17.353 INFO 8946 -[main] org.bonitasoft.engine.purge.Application : Execution completed in 4295553 ms -2020-02-21 19:31:17.357 INFO 8946 -[main] o.s.s.c.ThreadPoolTaskScheduler : Shutting down ExecutorService 'taskScheduler' -2020-02-21 19:31:17.358 INFO 8946 -[main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated... -2020-02-21 19:31:17.366 INFO 8946 -[main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed. ----- - -== MYSQL (5.5.X) - -Remove 137 603 archived cases (corresponds to 825 618 lines in the database) + -Total Time: 25 minutes - -=== Configuration ( mysql.cnf ) - -[source,properties] ----- -max_connections = 200 -key_buffer_size = 512M -innodb_buffer_pool_size = 5500M -innodb_buffer_pool_instances=8 -125 Gb in the database ----- - -chunk size : 500 k - -=== Initial data volume - -125 Gb in the database - -[source,sql] ----- -SELECT - table_name, - table_rows, - round(((data_length + index_length) / 1024 / 1024), 2) 'size MB' -FROM - INFORMATION_SCHEMA.TABLES -WHERE - TABLE_SCHEMA = 'bonita' -order by - table_name ASC ; ----- - -[source,text] ----- -segment_name Size in MB NUM_ROWS -------------------------------------------------------- -ARCH_FLOWNODE_INSTANCE 9730254.09 9730254 -ARCH_PROCESS_INSTANCE 2938758 2938758 -ARCH_DATA_INSTANCE 28696 16179052 -QUERIABLE_LOG 3044085 3044085 -ARCH_CONTRACT_DATA 0 0 -ARCH_CONTRACT_DATA_BACKUP 6757.02 1075697 -ARCH_CONNECTOR_INSTANCE 3010846 3010846 -DOCUMENT 0 0 -ARCH_PROCESS_COMMENT 976635 976635 -ARCH_REF_BIZ_DATA_INST 0 0 -ARCH_DOCUMENT_MAPPING 0 0 -JOB_PARAM 0 0 -ARCH_MULTI_BIZ_DATA 0 0 -PROCESS_COMMENT 0 0 -MESSAGE_INSTANCE 0 0 -DOCUMENT_MAPPING 0 0 -DATA_INSTANCE 0 0 ----- - -=== Execution - -[source,log] ----- -╰─$./bin/bonita-purge-tool 7431518296865410294 1537689211864 - - ____ _ _ _ _ - | _ \ (_) | | | | | - | |_) | ___ _ __ _| |_ __ _ _ __ _ _ _ __ __ _ ___ | |_ ___ ___ | | - | _ < / _ \| '_ \| | __/ _` | | '_ \| | | | '__/ _` |/ _ \ | __/ _ \ 
/ _ \| | - | |_) | (_) | | | | | || (_| | | |_) | |_| | | | (_| | __/ | || (_) | (_) | | - |____/ \___/|_| |_|_|\__\__,_| | .__/ \__,_|_| \__, |\___| \__\___/ \___/|_| - | | __/ | - |_| |___/ -2020-03-23 18:49:42,436 INFO Starting ApplicationKt on pascal-XPS-15-9570 with PID 31673 (/home/pascal/development/bonita-purge-tool/build/distributions/bonita-purge-tool/lib/bonita-purge-tool.jar started by pascal in /home/pascal/development/bonita-purge-tool/build/distributions/bonita-purge-tool) -2020-03-23 18:49:42,438 DEBUG Running with Spring Boot v2.1.6.RELEASE, Spring v5.1.8.RELEASE -2020-03-23 18:49:42,439 INFO No active profile set, falling back to default profiles: default -2020-03-23 18:49:43,449 INFO Using datasource with HikariDataSource -2020-03-23 18:49:43,553 INFO Started ApplicationKt in 1.344 seconds (JVM running for 1.757) -2020-03-23 18:49:43,759 INFO Database URL is jdbc:mysql://localhost:3307/bonita?allowMultiQueries=true -2020-03-23 18:49:43,759 INFO Tenant id used is 1 -2020-03-23 18:49:43,759 INFO All settings can be changed in application.properties file -2020-03-23 18:49:45,376 INFO Will purge all archived process instances and their elements for process 'RFS_n_ACT_MOCKED' in version '1.3' that are finished since at least 2018-09-23T09:53:31.864 -2020-03-23 18:49:45,376 INFO Starting archive process instance purge... -2020-03-23 18:49:45,380 DEBUG Executing SQL: CREATE INDEX idx_rootprocid_archprocinst_tmp ON arch_process_instance(rootprocessinstanceid) -2020-03-23 18:49:52,285 DEBUG SQL command executed in 6905 ms -2020-03-23 18:49:52,286 DEBUG Executing SQL: DELETE A FROM arch_process_instance A INNER JOIN arch_process_instance B ON A.rootprocessinstanceid = B.rootprocessinstanceid WHERE B.rootprocessinstanceid = B.sourceobjectid AND B.processdefinitionid = ? AND (B.stateid = 6 OR B.stateid = 3 OR B.stateid = 4) AND B.enddate <= ? AND B.tenantId = ? 
-2020-03-23 18:50:55,427 INFO Deleted 825618 rows from table ARCH_PROCESS_INSTANCE in 63141 ms -2020-03-23 18:50:55,427 DEBUG Executing SQL: DELETE FROM arch_contract_data WHERE kind = 'PROCESS' AND tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_process_instance b WHERE arch_contract_data.scopeid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 18:50:55,430 INFO Deleted 0 rows from table arch_contract_data in 3 ms -2020-03-23 18:50:56,430 DEBUG Executing SQL: DELETE FROM arch_data_instance WHERE arch_data_instance.containertype = 'PROCESS_INSTANCE' AND arch_data_instance.tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE arch_data_instance.containerid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 18:52:02,567 INFO Deleted 500000 rows from table arch_data_instance in 66137 ms -2020-03-23 18:53:13,722 INFO Deleted 500000 rows from table arch_data_instance in 70154 ms -2020-03-23 18:54:34,795 INFO Deleted 500000 rows from table arch_data_instance in 80071 ms -2020-03-23 18:56:00,910 INFO Deleted 500000 rows from table arch_data_instance in 85115 ms -2020-03-23 18:57:25,547 INFO Deleted 500000 rows from table arch_data_instance in 83636 ms -2020-03-23 18:58:47,634 INFO Deleted 500000 rows from table arch_data_instance in 81086 ms -2020-03-23 19:00:08,156 INFO Deleted 500000 rows from table arch_data_instance in 79522 ms -2020-03-23 19:01:29,054 INFO Deleted 500000 rows from table arch_data_instance in 79897 ms -2020-03-23 19:03:22,682 INFO Deleted 265693 rows from table arch_data_instance in 112627 ms -2020-03-23 19:03:23,685 DEBUG Executing SQL: DELETE FROM arch_document_mapping WHERE tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_process_instance b WHERE arch_document_mapping.processinstanceid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? 
-2020-03-23 19:03:23,691 INFO Deleted 0 rows from table arch_document_mapping in 5 ms -2020-03-23 19:03:24,691 DEBUG Executing SQL: DELETE FROM arch_flownode_instance WHERE tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE arch_flownode_instance.rootcontainerid = b.rootprocessinstanceid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:04:22,641 INFO Deleted 500000 rows from table arch_flownode_instance in 57950 ms -2020-03-23 19:05:45,568 INFO Deleted 500000 rows from table arch_flownode_instance in 81927 ms -2020-03-23 19:07:10,313 INFO Deleted 500000 rows from table arch_flownode_instance in 83745 ms -2020-03-23 19:08:35,304 INFO Deleted 500000 rows from table arch_flownode_instance in 83990 ms -2020-03-23 19:09:55,926 INFO Deleted 500000 rows from table arch_flownode_instance in 79622 ms -2020-03-23 19:11:15,852 INFO Deleted 252060 rows from table arch_flownode_instance in 78925 ms -2020-03-23 19:11:16,853 DEBUG Executing SQL: DELETE FROM arch_process_comment WHERE tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_process_instance b WHERE arch_process_comment.processinstanceid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:11:37,423 INFO Deleted 275206 rows from table arch_process_comment in 20569 ms -2020-03-23 19:11:38,424 DEBUG Executing SQL: DELETE FROM arch_ref_biz_data_inst WHERE tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE arch_ref_biz_data_inst.orig_proc_inst_id = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:11:38,429 INFO Deleted 0 rows from table arch_ref_biz_data_inst in 4 ms -2020-03-23 19:11:39,430 DEBUG Executing SQL: DELETE FROM arch_connector_instance WHERE containertype = 'process' AND tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE arch_connector_instance.containerid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? 
-2020-03-23 19:11:42,190 INFO Deleted 0 rows from table arch_connector_instance in 2760 ms -2020-03-23 19:11:43,191 DEBUG Executing SQL: DELETE FROM arch_contract_data WHERE KIND = 'TASK' AND tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_flownode_instance b WHERE arch_contract_data.scopeid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:11:43,193 INFO Deleted 0 rows from table arch_contract_data in 2 ms -2020-03-23 19:11:44,194 DEBUG Executing SQL: DELETE FROM arch_data_instance WHERE containertype = 'ACTIVITY_INSTANCE' AND tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_flownode_instance b WHERE arch_data_instance.containerid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:12:09,429 INFO Deleted 500000 rows from table arch_data_instance in 25235 ms -2020-03-23 19:13:21,716 INFO Deleted 50412 rows from table arch_data_instance in 71287 ms -2020-03-23 19:13:22,718 DEBUG Executing SQL: DELETE FROM arch_connector_instance WHERE containertype = 'flowNode' AND tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_flownode_instance b where arch_connector_instance.containerid = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:13:36,432 INFO Deleted 500000 rows from table arch_connector_instance in 13714 ms -2020-03-23 19:14:04,755 INFO Deleted 463221 rows from table arch_connector_instance in 27322 ms -2020-03-23 19:14:05,822 INFO Detected presence of table ARCH_CONTRACT_DATA_BACKUP. Purging it as well. -2020-03-23 19:14:05,823 DEBUG Executing SQL: DELETE FROM arch_contract_data_backup WHERE KIND = 'PROCESS' AND tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_process_instance b WHERE arch_contract_data_backup.scopeId = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:14:58,476 INFO Deleted 275206 rows from table arch_contract_data_backup in 52653 ms -2020-03-23 19:14:59,478 DEBUG Executing SQL: DELETE FROM arch_contract_data_backup WHERE KIND = 'TASK' AND tenantId = ? 
AND NOT EXISTS ( SELECT ID FROM arch_flownode_instance b WHERE arch_contract_data_backup.scopeId = b.sourceobjectid AND b.tenantId = ?) LIMIT ? -2020-03-23 19:15:09,906 INFO Deleted 0 rows from table arch_contract_data_backup in 10428 ms -2020-03-23 19:15:10,909 DEBUG Executing SQL: DROP INDEX idx_rootprocid_archprocinst_tmp ON arch_process_instance -2020-03-23 19:15:11,035 DEBUG SQL command executed in 126 ms -2020-03-23 19:15:11,036 INFO Archive process instance purge completed. -2020-03-23 19:15:11,036 INFO Some of the deleted elements may still appear in Bonita applications for a short while. -2020-03-23 19:15:11,036 INFO If you try to access them you will get a not found error. This is the expected behaviour. -2020-03-23 19:15:11,036 INFO Execution completed in 1527482 ms -2020-03-23 19:15:11,036 INFO According to the database type you use, you may need to execute certain maintenance commands -2020-03-23 19:15:11,036 INFO to reclaim space or optimize the newly purged tables. -2020-03-23 19:15:11,036 INFO Eg. 
VACUUM REINDEX on PostgreSQL ----- - -== ORACLE - -Remove 59 025 archived cases (corresponds to 354 150 lines in the database) + -Total Time: 66 minutes - -[source,sql] ----- -SELECT - table_name, - table_rows, - round(((data_length + index_length) / 1024 / 1024), 2) 'size MB' -FROM - INFORMATION_SCHEMA.TABLES -WHERE - TABLE_SCHEMA = 'bonita' -order by - table_name ASC ; ----- - -[source,text] ----- -segment_name Size in MB NUM_ROWS ------------------------------------------------- -ARCH_FLOWNODE_INSTANCE 408 2104224 -ARCH_PROCESS_INSTANCE 72 578238 -ARCH_DATA_INSTANCE 72 470169 -QUERIABLE_LOG 128 389695 -ARCH_CONTRACT_DATA 96 182639 -ARCH_CONNECTOR_INSTANCE 18 168844 -DOCUMENT 56 164716 -ARCH_PROCESS_COMMENT 22 163844 -ARCH_REF_BIZ_DATA_INST 15 148158 -ARCH_DOCUMENT_MAPPING 9 133459 -JOB_PARAM 13 87620 -ARCH_MULTI_BIZ_DATA 2 74079 -PROCESS_COMMENT 8 43764 -MESSAGE_INSTANCE 4 28956 -DOCUMENT_MAPPING 2 28542 -DATA_INSTANCE 5 28408 ----- - -=== Initial data volume - -[source,sql] ----- -select segment_name, bytes/1024/1024 AS MB, NUM_ROWS -from dba_segments -JOIN all_tables ON dba_segments.segment_name = all_tables.table_name -where segment_type='TABLE' and dba_segments.OWNER ='BONITA' -ORDER BY NUM_ROWS DESC; ----- - -[source,text] ----- -segment_name Size in MB NUM_ROWS ------------------------------------------------- -ARCH_FLOWNODE_INSTANCE 408 2104224 -ARCH_PROCESS_INSTANCE 72 578238 -ARCH_DATA_INSTANCE 72 470169 -QUERIABLE_LOG 128 389695 -ARCH_CONTRACT_DATA 96 182639 -ARCH_CONNECTOR_INSTANCE 18 168844 -DOCUMENT 56 164716 -ARCH_PROCESS_COMMENT 22 163844 -ARCH_REF_BIZ_DATA_INST 15 148158 -ARCH_DOCUMENT_MAPPING 9 133459 -JOB_PARAM 13 87620 -ARCH_MULTI_BIZ_DATA 2 74079 -PROCESS_COMMENT 8 43764 -MESSAGE_INSTANCE 4 28956 -DOCUMENT_MAPPING 2 28542 -DATA_INSTANCE 5 28408 ----- - -=== Execution - -[source,log] ----- -╰─$ bin/bonita-purge-tool 5488089572307653177 1584631356000 2 ↵ - ____ _ _ _ _ - | _ \ (_) | | | | | - | |_) | ___ _ __ _| |_ __ _ _ __ _ _ _ __ __ _ ___ | 
|_ ___ ___ | | - | _ < / _ \| '_ \| | __/ _` | | '_ \| | | | '__/ _` |/ _ \ | __/ _ \ / _ \| | - | |_) | (_) | | | | | || (_| | | |_) | |_| | | | (_| | __/ | || (_) | (_) | | - |____/ \___/|_| |_|_|\__\__,_| | .__/ \__,_|_| \__, |\___| \__\___/ \___/|_| - | | __/ | - |_| |___/ -2020-03-19 16:23:01,725 INFO Starting ApplicationKt on manu-DellXPS with PID 13151 (/home/manu/workspace/bonita-purge-tool/build/bonita-purge-tool/lib/bonita-purge-tool.jar started by manu in /home/manu/workspace/bonita-purge-tool/build/bonita-purge-tool) -2020-03-19 16:23:01,727 DEBUG Running with Spring Boot v2.1.6.RELEASE, Spring v5.1.8.RELEASE -2020-03-19 16:23:01,728 INFO No active profile set, falling back to default profiles: default -2020-03-19 16:23:02,809 INFO Using datasource with HikariDataSource -2020-03-19 16:23:02,933 INFO Started ApplicationKt in 1.46 seconds (JVM running for 1.919) -2020-03-19 16:23:03,425 INFO Database URL is jdbc:oracle:thin:@//localhost:1521/ORCLPDB1.localdomain -2020-03-19 16:23:03,425 INFO Tenant id used is 1 -2020-03-19 16:23:03,425 INFO All settings can be changed in application.properties file -2020-03-19 16:23:03,481 INFO Will purge all archived process instances and their elements for process 'All Kind Of Elements Auto' in version '1.1' that are finished since at least 2020-03-19T16:22:36 -Start the purge using the above parameters? [y/N] -y -2020-03-19 16:23:09,354 INFO Starting archive process instance purge... -2020-03-19 16:23:09,366 DEBUG Executing SQL: CREATE INDEX idx_rootprocid_archprocinst_tmp ON arch_process_instance(rootprocessinstanceid) -2020-03-19 16:23:11,223 DEBUG SQL command executed in 1856 ms -2020-03-19 16:23:11,224 DEBUG Executing SQL: DELETE FROM ARCH_PROCESS_INSTANCE A WHERE exists ( SELECT rootprocessinstanceid FROM ARCH_PROCESS_INSTANCE B WHERE B.ROOTPROCESSINSTANCEID = B.SOURCEOBJECTID AND A.ROOTPROCESSINSTANCEID = B.ROOTPROCESSINSTANCEID AND PROCESSDEFINITIONID = ? 
and (STATEID = 6 OR STATEID = 3 OR STATEID = 4) AND ENDDATE <= ?) AND tenantId = ? -2020-03-19 16:29:38,125 INFO Deleted 354150 rows from table ARCH_PROCESS_INSTANCE in 386901 ms -2020-03-19 16:29:38,126 DEBUG Executing SQL: DELETE FROM ARCH_CONTRACT_DATA a WHERE a.KIND = 'PROCESS' AND a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM ARCH_PROCESS_INSTANCE b WHERE a.SCOPEID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 16:31:03,463 INFO Deleted 39350 rows from table ARCH_CONTRACT_DATA in 85337 ms -2020-03-19 16:31:03,463 DEBUG Executing SQL: DELETE FROM ARCH_DATA_INSTANCE a WHERE a.CONTAINERTYPE = 'PROCESS_INSTANCE' AND a.tenantId = ? AND NOT EXISTS ( SELECT id FROM ARCH_PROCESS_INSTANCE b WHERE a.CONTAINERID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 16:35:25,170 INFO Deleted 154056 rows from table ARCH_DATA_INSTANCE in 261706 ms -2020-03-19 16:35:25,171 DEBUG Executing SQL: DELETE FROM ARCH_DOCUMENT_MAPPING a WHERE a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM ARCH_PROCESS_INSTANCE b WHERE a.PROCESSINSTANCEID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 16:36:54,060 INFO Deleted 78700 rows from table ARCH_DOCUMENT_MAPPING in 88889 ms -2020-03-19 16:36:54,060 DEBUG Executing SQL: DELETE FROM ARCH_FLOWNODE_INSTANCE a WHERE a.tenantId = ? AND NOT EXISTS ( SELECT id FROM ARCH_PROCESS_INSTANCE b WHERE a.ROOTCONTAINERID = b.ROOTPROCESSINSTANCEID AND b.tenantId = ?) -2020-03-19 17:25:07,584 INFO Deleted 1167526 rows from table ARCH_FLOWNODE_INSTANCE in 2893524 ms -2020-03-19 17:25:07,585 DEBUG Executing SQL: DELETE FROM ARCH_PROCESS_COMMENT a WHERE a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM ARCH_PROCESS_INSTANCE b WHERE a.PROCESSINSTANCEID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 17:25:07,822 INFO Deleted 0 rows from table ARCH_PROCESS_COMMENT in 237 ms -2020-03-19 17:25:07,823 DEBUG Executing SQL: DELETE FROM ARCH_REF_BIZ_DATA_INST a WHERE a.tenantId = ? 
AND NOT EXISTS ( SELECT ID FROM ARCH_PROCESS_INSTANCE b WHERE a.ORIG_PROC_INST_ID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 17:27:39,818 INFO Deleted 78700 rows from table ARCH_REF_BIZ_DATA_INST in 151995 ms -2020-03-19 17:27:39,819 DEBUG Executing SQL: DELETE FROM ARCH_CONNECTOR_INSTANCE a WHERE a.CONTAINERTYPE = 'process' AND a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM ARCH_PROCESS_INSTANCE b WHERE a.CONTAINERID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 17:27:39,885 INFO Deleted 0 rows from table ARCH_CONNECTOR_INSTANCE in 66 ms -2020-03-19 17:27:39,885 DEBUG Executing SQL: DELETE FROM ARCH_CONTRACT_DATA a WHERE a.KIND = 'TASK' AND a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM ARCH_FLOWNODE_INSTANCE b WHERE a.SCOPEID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 17:27:40,388 INFO Deleted 0 rows from table ARCH_CONTRACT_DATA in 503 ms -2020-03-19 17:27:40,388 DEBUG Executing SQL: DELETE FROM ARCH_DATA_INSTANCE a WHERE a.CONTAINERTYPE = 'ACTIVITY_INSTANCE' AND a.tenantId = ? AND NOT EXISTS ( SELECT id FROM ARCH_FLOWNODE_INSTANCE b WHERE a.CONTAINERID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 17:27:41,044 INFO Deleted 0 rows from table ARCH_DATA_INSTANCE in 656 ms -2020-03-19 17:27:41,045 DEBUG Executing SQL: DELETE FROM ARCH_CONNECTOR_INSTANCE a WHERE a.CONTAINERTYPE = 'flowNode' AND a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM ARCH_FLOWNODE_INSTANCE b WHERE a.CONTAINERID = b.SOURCEOBJECTID AND b.tenantId = ?) -2020-03-19 17:29:12,637 INFO Deleted 75564 rows from table ARCH_CONNECTOR_INSTANCE in 91592 ms -2020-03-19 17:29:12,821 DEBUG Executing SQL: DROP INDEX idx_rootprocid_archprocinst_tmp -2020-03-19 17:29:13,054 DEBUG SQL command executed in 233 ms -2020-03-19 17:29:13,054 INFO Archive process instance purge completed. -2020-03-19 17:29:13,055 INFO Some of the deleted elements may still appear in Bonita applications for a short while. -2020-03-19 17:29:13,055 INFO If you try to access them you will get a not found error. 
This is the expected behaviour. -2020-03-19 17:29:13,055 INFO Execution completed in 3970120 ms ----- - -== MS SQL Server - -Remove 112 396 archived process instances (corresponds to 674 379 lines in the database) + -Total Time: 5m30s - -=== Initial data volume - -[source,sql] ----- -SELECT t.Name AS TableName, p.rows AS RowCounts, -CAST(ROUND((SUM(a.used_pages) / 128.00), 2) AS NUMERIC(36, 2)) AS Used_MB -FROM sys.tables t -INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id -INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id -INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id -INNER JOIN sys.schemas s ON t.schema_id = s.schema_id -Where used_pages <> 0 -GROUP BY t.Name, s.Name, p.Rows -ORDER BY Used_MB DESC ----- - -[source,text] ----- -TableName -|RowCounts|Used_MB| -------------------------|---------|-------| -arch_flownode_instance |2 948 801|2227.38| -document | 227454| 988.68| -arch_data_instance | 682875| 385.91| -arch_process_instance | 679 506| 321.69| -queriable_log | 569821| 293.39| -arch_process_comment | 449944| 188.71| -arch_contract_data | 341431| 185.84| -arch_ref_biz_data_inst | 224972| 71.98| -arch_connector_instance | 227612| 70.97| -flownode_instance | 1498| 46.52| -arch_document_mapping | 226026| 42.99| -process_comment | 5286| 13.87| -data_instance | 1334| 8.64| -process_instance | 1428| 8.45| -ref_biz_data_inst | 2666| 7.89| -page | 24| 7.88| -arch_multi_biz_data | 112486| 6.70| ----- - -=== Execution - -[source,text] ----- -╰─$ bin/bonita-purge-tool 6186406801545861394 1584703344000 130 ↵ - ____ _ _ _ _ - | _ \ (_) | | | | | - | |_) | ___ _ __ _| |_ __ _ _ __ _ _ _ __ __ _ ___ | |_ ___ ___ | | - | _ < / _ \| '_ \| | __/ _` | | '_ \| | | | '__/ _` |/ _ \ | __/ _ \ / _ \| | - | |_) | (_) | | | | | || (_| | | |_) | |_| | | | (_| | __/ | || (_) | (_) | | - |____/ \___/|_| |_|_|\__\__,_| | .__/ \__,_|_| \__, |\___| \__\___/ \___/|_| - | | __/ | - |_| |___/ -2020-03-20 12:23:40,671 INFO 
Starting ApplicationKt on manu-DellXPS with PID 2760 (/home/manu/workspace/bonita-purge-tool/build/bonita-purge-tool/lib/bonita-purge-tool.jar started by manu in /home/manu/workspace/bonita-purge-tool/build/bonita-purge-tool) -2020-03-20 12:23:40,673 DEBUG Running with Spring Boot v2.1.6.RELEASE, Spring v5.1.8.RELEASE -2020-03-20 12:23:40,673 INFO No active profile set, falling back to default profiles: default -2020-03-20 12:23:41,705 INFO Using datasource with HikariDataSource -2020-03-20 12:23:41,830 INFO Started ApplicationKt in 1.407 seconds (JVM running for 1.872) -2020-03-20 12:23:42,078 INFO Database URL is jdbc:sqlserver://localhost:1433;database=bonita -2020-03-20 12:23:42,078 INFO Tenant id used is 1 -2020-03-20 12:23:42,078 INFO All settings can be changed in application.properties file -2020-03-20 12:23:42,186 INFO Will purge all archived process instances and their elements for process 'All Kind Of Elements' in version '1.2' that are finished since at least 2020-03-20T12:22:24 -Start the purge using the above parameters? [y/N] -y -2020-03-20 12:24:24,225 INFO Starting archive process instance purge... -2020-03-20 12:24:24,245 DEBUG Executing SQL: CREATE INDEX idx_rootprocid_archprocinst_tmp ON arch_process_instance(rootprocessinstanceid) -2020-03-20 12:24:26,432 DEBUG SQL command executed in 2187 ms -2020-03-20 12:24:26,433 DEBUG Executing SQL: DELETE A FROM arch_process_instance A INNER JOIN arch_process_instance B ON A.rootprocessinstanceid = B.rootprocessinstanceid WHERE B.rootprocessinstanceid = B.sourceobjectid AND B.processdefinitionid = ? AND (B.stateid = 6 OR B.stateid = 3 OR B.stateid = 4) AND B.enddate <= ? AND B.tenantId = ? -2020-03-20 12:24:48,774 INFO Deleted 674379 rows from table ARCH_PROCESS_INSTANCE in 22341 ms -2020-03-20 12:24:48,775 DEBUG Executing SQL: DELETE a FROM arch_contract_data as a WHERE a.kind = 'PROCESS' AND a.tenantId = ? 
AND NOT EXISTS ( SELECT ID FROM arch_process_instance b WHERE a.scopeid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:24:52,769 INFO Deleted 112486 rows from table arch_contract_data in 3994 ms -2020-03-20 12:24:52,769 DEBUG Executing SQL: DELETE a FROM arch_data_instance as a WHERE a.containertype = 'PROCESS_INSTANCE' AND a.tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE a.containerid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:25:14,468 INFO Deleted 449942 rows from table arch_data_instance in 21699 ms -2020-03-20 12:25:14,468 DEBUG Executing SQL: DELETE a FROM arch_document_mapping as a WHERE a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_process_instance b WHERE a.processinstanceid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:25:17,323 INFO Deleted 224793 rows from table arch_document_mapping in 2854 ms -2020-03-20 12:25:17,324 DEBUG Executing SQL: DELETE a FROM arch_flownode_instance as a WHERE a.tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE a.rootcontainerid = b.rootprocessinstanceid AND b.tenantId = ?) -2020-03-20 12:28:14,781 INFO Deleted 2914573 rows from table arch_flownode_instance in 177457 ms -2020-03-20 12:28:14,781 DEBUG Executing SQL: DELETE a FROM arch_process_comment as a WHERE a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_process_instance b WHERE a.processinstanceid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:28:23,173 INFO Deleted 449944 rows from table arch_process_comment in 8392 ms -2020-03-20 12:28:23,174 DEBUG Executing SQL: DELETE a FROM arch_ref_biz_data_inst as a WHERE a.tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE a.orig_proc_inst_id = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:28:34,153 INFO Deleted 224972 rows from table arch_ref_biz_data_inst in 10979 ms -2020-03-20 12:28:34,153 DEBUG Executing SQL: DELETE a FROM arch_connector_instance as a WHERE a.containertype = 'process' AND a.tenantId = ? 
AND NOT EXISTS ( SELECT id FROM arch_process_instance b WHERE a.containerid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:28:34,701 INFO Deleted 0 rows from table arch_connector_instance in 548 ms -2020-03-20 12:28:34,701 DEBUG Executing SQL: DELETE a FROM arch_contract_data as a WHERE a.KIND = 'TASK' AND a.tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_flownode_instance b WHERE a.scopeid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:28:45,360 INFO Deleted 224972 rows from table arch_contract_data in 10659 ms -2020-03-20 12:28:45,360 DEBUG Executing SQL: DELETE a FROM arch_data_instance as a WHERE a.containertype = 'ACTIVITY_INSTANCE' AND a.tenantId = ? AND NOT EXISTS ( SELECT id FROM arch_flownode_instance b WHERE a.containerid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:29:06,026 INFO Deleted 224972 rows from table arch_data_instance in 20666 ms -2020-03-20 12:29:06,027 DEBUG Executing SQL: DELETE a FROM arch_connector_instance a WHERE a.containertype = 'flowNode' AND a.tenantId = ? AND NOT EXISTS ( SELECT ID FROM arch_flownode_instance b where a.containerid = b.sourceobjectid AND b.tenantId = ?) -2020-03-20 12:29:12,747 INFO Deleted 224972 rows from table arch_connector_instance in 6719 ms -2020-03-20 12:29:12,817 DEBUG Executing SQL: DROP INDEX IF EXISTS idx_rootprocid_archprocinst_tmp ON arch_process_instance -2020-03-20 12:29:12,822 DEBUG SQL command executed in 5 ms -2020-03-20 12:29:12,822 INFO Archive process instance purge completed. -2020-03-20 12:29:12,823 INFO Some of the deleted elements may still appear in Bonita applications for a short while. -2020-03-20 12:29:12,823 INFO If you try to access them you will get a not found error. This is the expected behaviour. 
-2020-03-20 12:29:12,823 INFO Execution completed in 330991 ms -2020-03-20 12:29:12,823 INFO According to the database type you use, you may need to execute certain maintenance commands -2020-03-20 12:29:12,823 INFO to reclaim space or optimize the newly purged tables. -2020-03-20 12:29:12,823 INFO Eg. VACUUM REINDEX on PostgreSQL ----- diff --git a/modules/api/pages/handling-documents.adoc b/modules/api/pages/handling-documents.adoc index f7035feb6e..1ba00432b4 100644 --- a/modules/api/pages/handling-documents.adoc +++ b/modules/api/pages/handling-documents.adoc @@ -37,7 +37,7 @@ def DocumentValue createNewDocument(FileInputValue fileFromContract) { // From an url def DocumentValue createNewDocument(String url) { - new DocumentValue(url); + new DocumentValue(url) } // From an existing file on the fileSystem @@ -124,7 +124,7 @@ def createCaseWithDocument(String processDefinitionName, } // ----- start process instance ----- - processAPI.startProcess(processDefinitionId, operations, listExpressionsContext); + processAPI.startProcess(processDefinitionId, operations, listExpressionsContext) } ---- @@ -162,7 +162,7 @@ The use case is to delete the documents of archived cases older than a certain d `searchArchivedDocumentsOlderThanArchivedDate` look for archived documents `deleteArchivedDocumentsOlderThan` delete the content of the document -WARNING: Although the document binary will be deleted there will still be records in the database. No methods are provided to completely get rid of the document from the database +WARNING: Although the document binary will be deleted there will still be records in the database. The xref:runtime:purge-tool.adoc#delete-orphan-document-content[Purge Tool] should be used to completely get rid of the document from the database to free disk space. 
[source,groovy] ---- @@ -177,12 +177,12 @@ def SearchResult searchArchivedDocumentsOlderThanArchivedDate(ProcessAPI process //Delete archived documents older than archivedDate def deleteArchivedDocumentsOlderThan(ProcessAPI processAPI, long archivedDate) { - int startIndex = 0; + int startIndex = 0 int maxResults = 100 def searchResult = searchArchivedDocumentsOlderThanArchivedDate(processAPI, archivedDate, startIndex, maxResults) while(searchResult.count > 0){ searchResult.result.each { archivedDocument -> - processAPI.deleteContentOfArchivedDocument(archivedDocument.getId()); + processAPI.deleteContentOfArchivedDocument(archivedDocument.getId()) } startIndex += maxResults searchResult = searchArchivedDocumentsOlderThanArchivedDate(processAPI, archivedDate, startIndex, maxResults) @@ -190,5 +190,5 @@ def deleteArchivedDocumentsOlderThan(ProcessAPI processAPI, long archivedDate) { } //Then just call the method with desired archivedDate -deleteArchivedDocumentsOlderThan(processAPI, archivedDate); +deleteArchivedDocumentsOlderThan(processAPI, archivedDate) ---- diff --git a/modules/data/pages/documents.adoc b/modules/data/pages/documents.adoc index 852fe7abda..f8c20602a6 100644 --- a/modules/data/pages/documents.adoc +++ b/modules/data/pages/documents.adoc @@ -152,7 +152,7 @@ In a process instance, there is no specific versioning. When a document is updat === Document archives -When a process element is archived the associated documents are also archived. It is possible to delete the archived documents using the Engine API or REST API when they are no longer needed, to save space. You can delete an archived document from a live process instance or from an archived process instance. When you delete an archived document, only the content is deleted. The metadata, such as the name, last updated date, and uploader, is kept so that it can be retrieved if needed for audit. +When a process element is archived the associated documents are also archived. 
It is possible to delete the archived documents using the Engine API, the REST API or the xref:runtime:purge-tool.adoc[Purge Tool] when they are no longer needed, to free disk space. You can delete an archived document from a live process instance or from an archived process instance. When you delete an archived document, only the content is deleted. The metadata, such as the name, last updated date, and uploader, is kept so that it can be retrieved if needed for audit. == Define a document in a process definition diff --git a/modules/runtime/pages/purge-tool-changelog.adoc b/modules/runtime/pages/purge-tool-changelog.adoc index a3789ea6a8..16486fda85 100644 --- a/modules/runtime/pages/purge-tool-changelog.adoc +++ b/modules/runtime/pages/purge-tool-changelog.adoc @@ -1,10 +1,39 @@ -= Purge tool change log += Purge Tool change log :page-aliases: ROOT:purge-tool-changelog.adoc -:description: This is the changelog of the purge tool. +:description: This is the changelog of the Purge Tool. -This is the changelog of the purge tool. +{description} + +The Purge Tool is used to remove data from Bonita archive tables. It is useful for big production environments. + +== 2.1.0 - December 4, 2024 + +Breaking changes: + +- Rename `-t, --timeout-interval` option to `-i, --batch-interval` without changing its behavior: + +Time interval in milliseconds to wait between each batch query execution. Default value is 0 ms. + +New features: + +- Support purge of document content. +- Add a new command `delete-orphan-document-content`. +- Add a new option `--preserve-document-content` for the `delete` command: + +Skip deletion of document content. + +When set to true, document content will be preserved in the database. Default is false. + +This is a fallback option to preserve the behaviour before 2.1.0 version. +- Add a new option `--delete-interval` for the `delete` command: + +Time interval in milliseconds to wait between each table deletion operation. 
+ +A waiting interval can be set to prevent requests from being locked by the database engine (e.g. when computing indexes after their creation). + +Default value is 5000 ms. +- Purge the new `arch_bpm_failure` table introduced in Bonita 2025.1. +- Optimize requests for all vendors. + +Others: + +- deps: update Spring Boot version to `3.4.0` +- deps: update Oracle driver to `21.16.0.0` +- deps: update Exposed JDBC to `0.56.0` -The purge tool is used to remove data from bonita archive tables. It is useful for big production environments. == 2.0.0 - October 24, 2024 diff --git a/modules/runtime/pages/purge-tool.adoc b/modules/runtime/pages/purge-tool.adoc index 8f229c8d2f..bc30c1179d 100644 --- a/modules/runtime/pages/purge-tool.adoc +++ b/modules/runtime/pages/purge-tool.adoc @@ -1,6 +1,7 @@ = Purge Tool -:page-aliases: ROOT:purge-tool.adoc +:page-aliases: ROOT:purge-tool.adoc, ROOT:purge-tool-deletion-volume-testing.adoc :description: Bonita Purge Tool provides the capability to purge finished (archived) process instances from Bonita Runtime environment. +:tabs-sync-option: [NOTE] ==== @@ -16,13 +17,15 @@ By default, all archives are preserved forever in Bonita runtime, but if your fu [WARNING] ==== -The purge tool doesn't delete documents (stored in the DOCUMENT table) from the platform. It will only remove the mapping between the archived cases and the document itself. If you need to reduce the size of the Document table in the engine database, please refer to the documentation: xref:ROOT:handling-documents.adoc#delete_document_archived_case[Delete documents of archived cases based on archive date] +*Prior to version 2.1.0*, the Purge Tool did not delete documents (stored in the DOCUMENT table) from the platform. It only removed the mapping between the archived cases and the document itself.
If you need to reduce the size of the Document table in the engine database, consider using the `delete-orphan-document-content` command to remove all documents that are no longer linked to any process instance. Moreover, by default, subsequent runs of Purge Tool 2.1+ will automatically purge documents along with archived process instances, unless the `--preserve-document-content` option is explicitly set. ==== == Pre-requisites This tool requires a Java 17+ runtime environment to run. + -This tool can be run on a Bonita runtime environment in a version greater than or equal to 7.7.0. +This tool can be run on a Bonita runtime environment in a version greater than or equal to 7.7.0. + +We recommend running the tool from a different machine than the one hosting the Bonita database. +The machine requires network access to the database. [CAUTION] ==== @@ -81,9 +84,10 @@ Windows:: ---- ==== +[[list]] === `list` command -This command lists all existing process definitions that have root process instances archived for given date filter. +This command lists all existing process definitions that have root process instances archived for a given date filter. .List process definitions with archived root process instances older than 6 months [tabs] @@ -103,9 +107,22 @@ Windows:: ---- ==== +[[delete]] === `delete` command -This command deletes archived process instances and their related archived elements (flownodes, data, comments, etc.) for given date filter. +This command deletes archived process instances and their related archived elements (flownodes, data, comments, etc.) for a given date filter.
+ +You can use either the `--older-than` or `--before-date` option to define the date filter: + + * `--older-than` to delete all archived elements older than a specified period of time + * `--before-date` to delete all archived elements before a specified date + +The command also accepts optional filtering options, such as: + + * `--process-definition-id` to delete only the archived elements of a specific process definition + * `--tenant-id` to delete only the archived elements of a specific tenant, if your platform uses a multi-tenant architecture (as a reminder, xref:version-update:mtmr-tool.adoc[this feature has been removed from Bonita 2023.1]) + +Examples: .Delete all archived process instances older than 6 months [tabs] @@ -164,29 +181,42 @@ NOTE: The `--before-date` parameter must be in https://www.epochconverter.com/[m ==== Delete modes -===== `batch-delete` +The `delete` command supports two modes: `batch-delete` and `copy-truncate`. -By default, the tool uses the `batch-delete` mode to delete rows in database tables. + -This mode will be slower than the copy-truncate mode, but it doesn't require the Runtime to be shutdown and can be stopped and resumed at any time. +===== `batch-delete` -In this mode, each deleted batch is committed in database. + -The batch size and the timeout interval between each batch can be configured using the `--batch-size` and `--timeout-interval` options on the `delete` command. +The default mode is `batch-delete`. It can be used while the Bonita runtime is still running, and can be stopped and resumed at any time. + +We recommend using this mode to delete small to medium-sized data volumes regularly, for example, by using cron jobs during off-peak hours. -Fine tune the batch size and timeout interval depending on your database configuration and the volume of data to delete. By default, the batch size is `5000` and the timeout interval is `0` ms.
+In this mode, the rows of each table to purge are deleted in batches, and each batch is committed in the database. ===== `copy-truncate` -In this mode, the tool will copy the rows to keep in a temporary table, then truncate and drop the original table and rename the temporary table and recreate all required constraints and indices. +The `copy-truncate` mode is more efficient for large data volumes, especially when the number of rows to delete is higher than the number of rows to keep in the tables, but it requires the Bonita runtime to be stopped during the operation. Also, we advise against interrupting an execution in progress. + +In this mode, the rows to keep are copied in batches to a temporary table, then the original table is truncated and dropped, and the temporary table is renamed to the original table name. All required constraints and indexes are re-created. -Due to the efficiency of the TRUNCATE command, this method should be faster in most cases, especially when the number of rows to delete is higher than the number of rows to keep in the table, but it requires the Runtime to be stopped during the operation. +==== Fine-tuning the deletion -Use the `--delete-mode copy-truncate` option to use this mode with the `delete` command. +As mentioned before, both modes use batch processing to either delete or copy rows. + +The batch processing can be fine-tuned using the `--batch-size` and `--batch-interval` options: +* `--batch-size` to define the number of rows to delete or copy in each batch. +** With the `batch-delete` mode: although it depends on your database configuration and the volume of data to delete, we recommend starting with the default batch size of `5000` rows. You can increase this value based on the performance of your database. +** With the `copy-truncate` mode: since the Bonita runtime is stopped when using this mode, you can use a much larger batch size to reduce the number of batches, even down to a single batch, which could improve performance.
As a reminder, the main criterion with this mode is the number of rows to keep in the tables. + +* `--batch-interval` to define the time in milliseconds to wait between each batch. + +This option exists because batch requests executed in rapid succession may slow down or get blocked. Adjust this interval if you face this issue, but take into consideration that it will significantly increase the execution time if the number of batches is large. + +When using the `copy-truncate` mode, some requests may be locked by the database engine when computing indexes after their re-creation. +To prevent this, use the `--delete-interval` option to set a waiting interval between each table deletion operation. + +[[delete-file-input]] === `delete-file-input` command Delete all archived contract file input values. + In other words, delete all rows in table `arch_contract_data` corresponding to contract data of type `File` (in Studio) or `org.bonitasoft.engine.bpm.contract.FileInputValue` in Bonita Engine. + -These data are not used by Bonita and can take a large amout of space in your database, so deleting them is advised. +These data are not used by Bonita and can take a large amount of space in your database, so deleting them is advised. [NOTE] ==== @@ -214,6 +244,13 @@ Windows:: WARNING: `delete-fileinput-content` command is not supported for SQLServer database. +[[delete-orphan-document-content]] +=== `delete-orphan-document-content` command + +As mentioned above, the Purge Tool did not delete documents from the platform before version 2.1.0. + +If you ran a Purge Tool version earlier than 2.1.0, you may have a large amount of orphan document content in your database. + +To delete those orphan documents, use the `delete-orphan-document-content` command.
+ == Deletion strategy You need to have in mind 2 precepts to understand how this tool works: @@ -221,28 +258,24 @@ You need to have in mind 2 precepts to understand how this tool works: 1) This tool will first delete all archived process instances (`arch_process_instance` rows) that are concerned by this purge. Then the tables containing associated elements will be scanned to remove all existing orphans. -2) All archived and running process instances (cases) will have at least one row in arch_process_instance table. +2) All archived and running process instances (cases) will have at least one row in `arch_process_instance` table. This is due to the first initializing state (stateId = 0) of the process instance that is archived as soon as it is created. -Thanks to these facts, to identify the orphans we only need to query the arch_process_instance, which is more performant than querying +Thanks to these facts, to identify the orphans we only need to query the `arch_process_instance`, which is more performant than querying both `process_instance` and `arch_process_instance` tables while we avoid removing data from running cases. For example, once all `arch_process_instance` rows matching the conditions (processDefinitionId and timestamp) have been deleted and when the tool deletes the `arch_data_instance` rows, the tool only needs to query the `arch_process_instance` table. 
[source,sql] ---- -DELETE FROM ARCH_DATA_INSTANCE a WHERE -a.CONTAINERTYPE = 'PROCESS_INSTANCE' -AND a.tenantId = 1 -AND NOT EXISTS ( - SELECT id FROM ARCH_PROCESS_INSTANCE b - WHERE a.CONTAINERID = b.SOURCEOBJECTID - AND b.tenantId = 1); +DELETE FROM arch_data_instance a +WHERE a.containertype = 'PROCESS_INSTANCE' + AND a.tenantid = 1 + AND NOT EXISTS ( + SELECT 1 FROM arch_process_instance b + WHERE a.containerid = b.sourceobjectid AND b.tenantid = 1 + ); ---- This strategy allows this tool to be more robust, it can be stopped at any given time, relaunching it will continue the deletion from where it stopped. However, this means that the time required to execute a purge will be the same when deleting a few elements or a lot of elements. - -== Database deletion volume testing reference - -This xref:ROOT:purge-tool-deletion-volume-testing.adoc[reference page] provides the tests run on all supported databases.