Releases: opensearch-project/data-prepper
2.0.1
2022-10-24 Version 2.0.1
Bug Fixes
- Sending events to multiple pipelines would result in unexpected behavior because the event was mutated in all pipelines. Duplicate all events sent through the pipeline sink. #1886
- Support reading S3 objects when keys have spaces. #1923
- Delete s3:TestEvent objects in the S3 source to avoid those events getting stuck. #1924
- Correctly parse duration values of 0s or 0ms. #1910
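The first fix above (duplicating events sent to multiple pipelines) can be illustrated with a minimal Python sketch. This is not Data Prepper's implementation; `fan_out`, `add_tag`, and `redact` are hypothetical names, and events are modeled as plain dictionaries purely for illustration.

```python
import copy


def fan_out(event, pipelines):
    """Give each downstream pipeline its own deep copy of the event."""
    for process in pipelines:
        yield process(copy.deepcopy(event))


def add_tag(event):
    # Hypothetical pipeline that mutates the event it receives.
    event["tags"].append("a")
    return event


def redact(event):
    # Hypothetical pipeline that overwrites a field in place.
    event["message"] = "***"
    return event


original = {"message": "hello", "tags": []}
results = list(fan_out(original, [add_tag, redact]))

# Each pipeline saw an independent copy, so neither mutation leaks
# into the other pipeline's result or into the original event.
print(results)
print(original)
```

Without the deep copy, both pipelines would share one object, and the second pipeline would observe the tag added by the first.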
Security
- Updates protobuf-java-util to 3.21.7 due to CVE-2022-3171. #1945
- Updates Jackson to 2.13.4.2 due to CVE-2022-42003. #1933
Maintenance
2.0.0
2022-10-10 Version 2.0.0
Breaking Changes
- Replaced the data-prepper-tar-install.sh script with bin/data-prepper
- Replaced the single Jar file with a directory structure and the bin/data-prepper script.
- Data Prepper pipelines no longer support prepper:. Use processor: instead.
- Data Prepper now requires Java 11 or higher to run.
- Removed properties from the opensearch sink: trace_analytics_raw and trace_analytics_service_map. Use the index_type instead.
- Renamed two grok metric names. Renamed grokProcessingMatchSuccess to grokProcessingMatch and grokProcessingMatchFailure to grokProcessingMismatch.
- Removed record_type from otel_trace_source.
- Removed otel_trace_raw_prepper - use otel_trace_raw instead.
- Removed otel_trace_group_prepper - use otel_trace_group instead.
- Removed peer-forwarder processor plugin.
- Incorrect HTTP methods were removed from Data Prepper core HTTP APIs.
- All APIs have been renamed to use the org.opensearch.dataprepper package.
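As a rough aid for the prepper-to-processor renames listed above, the following Python sketch rewrites an already-parsed pipeline definition. It is an illustration under assumptions, not an official migration tool: `migrate_pipeline` and `RENAMED_PLUGINS` are hypothetical names, and the pipeline is assumed to be a dictionary in which the prepper or processor section is a list of single-key mappings from plugin name to settings.

```python
# Plugin renames taken from the breaking changes listed above.
RENAMED_PLUGINS = {
    "otel_trace_raw_prepper": "otel_trace_raw",
    "otel_trace_group_prepper": "otel_trace_group",
}


def migrate_pipeline(pipeline):
    """Return a copy of a pipeline definition using 2.0 names."""
    if "prepper" in pipeline and "processor" in pipeline:
        raise ValueError("pipeline defines both 'prepper' and 'processor'")
    migrated = {}
    for key, value in pipeline.items():
        if key in ("prepper", "processor"):
            key = "processor"  # 'prepper:' is no longer supported in 2.0
            value = [
                {RENAMED_PLUGINS.get(plugin, plugin): settings
                 for plugin, settings in entry.items()}
                for entry in value
            ]
        migrated[key] = value
    return migrated


# Assumed shape of a parsed 1.x pipeline definition.
old = {
    "source": {"otel_trace_source": {}},
    "prepper": [{"otel_trace_raw_prepper": {}}],
    "sink": [{"stdout": {}}],
}
new = migrate_pipeline(old)
print(new)
```

The input dictionary is left unchanged, so the old and new definitions can be compared before anything is written back to disk. The sketch does not cover the other breaking changes, such as the removed opensearch sink properties or the removed record_type setting.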
Features
- Support conditional routing in sinks (#1337)
- Support Core Peer Forwarder which replaces the peer forwarder processor plugin (#700)
- Support parsing CSV and TSV values in event fields using CSV processor (#1081)
- Support decoding CSV or TSV S3 objects using CSV codec in S3 source (#1644)
- Support processing JSON values in event fields using JSON processor (#831)
- Support multi node aggregation using Core Peer Forwarder (#978)
- Use peer forwarding in service_map_stateful processor using Core Peer Forwarder (#1765)
- Use peer forwarding in otel_trace_raw processor using Core Peer Forwarder (#1766)
Enhancements
- Updates default configurations for bounded_buffer - buffer_size is 12,800, batch_size is 200.
- Support concurrency in Data Prepper expression (#1189)
- Support bulk create option in OpenSearch sink (#1561)
- Added new metric to track utilization rate of buffer (#1817)
- Support publishing metrics for Data Prepper core (#1789)
- Support configurable timeouts for processors and sinks to flush the data downstream before shutdown (#1742)
- Moved the fields from S3 JSON objects up to the root level of the object to promote consistency with other S3 codecs (#1687)
- Support Duration in Data Prepper server configuration (#1623)
- Fixed health check bug when Auth is enabled for HTTP and OTel trace source (#1600)
- Enabled HTTP health check for OTel trace source and OTel metrics source (#1546)
- Made Java 11 the baseline version for Data Prepper (#1422)
- Updated HTTP source request timeout to use configured timeout (#975)
- Distribute Data Prepper Docker image with JDK 17 (#694)
- Support ACM and S3 for TLS/SSL in HTTP source (#365)
- Updated HTTP methods supported on core API endpoints (#313)
Bug Fixes
- Fix a bug where file sink fails to write to output file with multiple pipeline threads (#1843)
- Fixed a bug in S3 source poll delay (#1550)
- Fixed a bug where Data Prepper stops if there's an error reading S3 object (#1544)
Infrastructure
- Improved Data Prepper assemble task to create a runnable distribution (#1762)
- Updated end-to-end tests with core peer forwarder for trace tests (#1866)
- Support directory structure for Data Prepper (#305)
Maintenance
- Removed support for OTLP protocol as internal data transfer in trace pipeline (#1272)
- Update Jackson to 2.13.4 (#1871)
- Updated Armeria to 1.19.0 (#1806)
- Removed Peer Forwarder processor plugin (#1874)
- Removed deprecated type property in DataPrepperPlugin annotation (#1657)
- Removed deprecated trace_analytics_raw and trace_analytics_service_map values, index_type replaces them (#1648)
- Removed deprecated PluginFactory (#1584)
- Removed AWS SDK v1 entirely from S3 source (#1562)
- Updated Antlr to 4.10.1 (#1513)
- Updated all existing prepper plugins to use only processor. (#647)
- Removed deprecated prepper plugin type from pipeline configuration. (#619)
- Updated package naming to org.opensearch from com.amazon (#344)
1.5.1
2022-07-08 Version 1.5.1
Bug Fixes
- Fix a bug where the S3 Source failed to load all records for concatenated gzip files. This was found in some Application Load Balancer log files. #1572
- The S3 Source continues to run even if errors occur reading from S3. Previously, the polling thread would stop. #1566
- S3 Source poll delay will no longer sleep if some messages are received from the SQS queue. #1567
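The concatenated gzip issue in the first fix above can be reproduced in a few lines of Python. This sketch only illustrates the general file-format behavior using the standard library; it says nothing about how the S3 Source itself was implemented or fixed.

```python
import gzip
import zlib

# Two independently compressed gzip members joined into one object,
# the layout this release note describes for some load balancer logs.
part_a = gzip.compress(b"line 1\n")
part_b = gzip.compress(b"line 2\n")
concatenated = part_a + part_b

# A single-member decoder stops at the end of the first gzip member.
decoder = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
first_only = decoder.decompress(concatenated)
print(first_only)                # only the first member's data
print(len(decoder.unused_data))  # the second member is left unread

# gzip.decompress reads every member in the stream.
all_records = gzip.decompress(concatenated)
print(all_records)
```

A reader that stops after the first member silently drops the remaining records, which matches the symptom described in the fix.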
1.5.0
2022-06-23 Version 1.5.0
Features
- Support S3 and SQS as a source of events. (#251)
- Data Prepper can now report its own metrics with custom tags applied. (#1415)
- Support the Embedded Metrics Format (EMF) for reporting Data Prepper's own metrics. (#1404)
Enhancements
- The OpenSearch sink now supports disabling any index management from Data Prepper via the management-disabled index-type. (#1051)
- Add a health check to the HTTP source. (#1466)
- Display the port number when starting the HTTP source or OTel trace source. (#1469)
- Allow for HTTP decorators in gRPC authentication plugins to give access to the HTTP request. (#1529)
Bug Fixes
- Fix a bug where a null plugin setting throws an exception when attempting to validate that setting. (#1525)
Infrastructure
- Data Prepper supports Docker labels for major version only now. This gets the latest within an entire major version series. (#1475)
Documentation
- Clarified the latest tested version of OpenDistro in the documentation. (#1494)
Maintenance
1.4.0
2022-05-17 Version 1.4.0
Features
Enhancements
- Migrate Trace Analytics plugins to Event Model (#1216, #1223, #1224, #1220, #1237, #1241, #1239)
- Support for OpenSearch 2.0 (previously supported versions of OpenSearch and OpenDistro are still supported)
Infrastructure
- Added needs-documentation label (#1373)
- Upload and publish JUnit test reports for some tests (#1336)
Documentation
Maintenance
- Update to use opensearch-java client instead of Rest High Level Client for bulk requests. (#1381)
- OpenSearch build files clean-up (#1315)
- Set 30 minute timeout to release process GitHub actions (#1392)
- Updated Gradle to 7.4.2 (#1377)
- Fix file uploads to S3 with Gradle 7 (#1383)
- Updated README links (#1376)
- Fix link to NOTICE file (#1268)
- Run OpenSearch sink integration tests against more versions of OpenDistro (#1348)
- Updated Mockito in OpenSearch plugin (#1339)
- Removed OpenSearch build-tools gradle plugin from OpenSearch plugin (#1327)
- Use complete url for processor READMEs (#1324)
Refactoring
1.3.0
2022-03-22 Version 1.3.0
Important Deprecation
- We have updated the pipeline definition to support processor: as a replacement for prepper:, which has been deprecated and will be fully removed in 2.0. (#655, #667)
Features
- AggregateProcessor for generic stateful aggregation (README) (#839, #850, #931, #969, #1022, #1046)
- DateProcessor to extract dates from fields in events (README) (#971, #1014)
- Processors to support mutate, alter, and delete fields from Events (README) (#1002)
- KeyValueProcessor to support parsing messages with key-value strings such as queries and properties (README) (#872)
- DropProcessor to filter out (remove/drop) entire events based on a conditional expression (README) (#801, #1174)
Enhancements
- Add dependency Injection support for Data Prepper Core and Plugins. (#815, #846, #1140)
- Add Data Prepper expression evaluator (#1024, #1027, #1090, #1065, #1153, #1155, #1157, #1169, #1177, #1178)
- Support for nested syntax in LogstashConfigConverter (#1088)
- Support default values for attributes in LogstashConfigConverter mapping files (#1095)
- Support converting index with a date-time pattern in LogstashConfigConverter (#1045, #1095)
- Support creation from raw text string in JacksonEvent builder (#770, #1074)
- Support loading plugins from multiple packages (#948)
- Support of date and time patterns in opensearch sink index names (#788, #833)
- Support for passing a PipelineDescription in @DataPrepperPluginConstructor (#825)
- Validate Plugin Configurations using JSR-303 (#826)
- Support Logstash configuration conversion for OpenSearch Logstash output (#756)
- Support negation of boolean attribute values in Logstash configuration converter while mapping plugins (#756)
Bug Fixes
- Allow stdout and file sink to output generic object type (#1192)
- Fixed issue where Spring was unable to find the PrometheusMeterRegistry Bean (#1019)
- Use BlockingTaskExecutor in OtelTraceSource (#745)
Infrastructure
- Upload Maven artifacts as part of Release build (#1181)
- Updated the release build to push the Docker image and upload archives (#1151)
- Update Gradle project to produce only tar.gz archives (#1132)
- Add simple integration tests for AggregateProcessor (#1046)
- Assemble the data-prepper-core uber-jar using Zip64 (#820)
Documentation
- Example on log ingestion from Kubernetes containers (#729)
- Update copyright headers in release scripts (#933)
- Update copyright headers in data-prepper subprojects (#776, #928)
- Example on Data-Prepper ECS Firelens integration (#704)
Maintenance
- Delete outdated Kibana trace analytics example (#1135)
- Update Open Distro usages to OpenSearch in scripts (#1086)
- Upgrade docker-compose.yml files from ODFE to OpenSearch (#847)
- Update the ADOT example to OpenSearch (#703)
- Migrate demo sample application to use opensearch and opensearch dashboards instead of ODFE and kibana (#666)
- Update log ingestion example to use latest data-prepper docker image (#752)
Refactoring
1.2.1
1.2.0
2021-12-15 Version 1.2.0
Features
- Grok Prepper for processing unstructured data with grok pattern matching. (#302), (#324), (#377), (#449), (#510), (#548), (#549), (#548), & (#586)
- HTTP Source plugin for receiving log data (#309), (#325), (#359), (#380), & (#415)
- Logstash config support. Users can now run Data Prepper with a logstash.conf file. (#581), (#568), (#579), (#580), (#582), (#587), (#636), (#473), (#535), (#552), (#577), (#588), (#597), (#616), (#447), (#506), (#559), (#575), (#584), (#591), & (#617)
Enhancements
- PluginSettings now supports generic List and Map data types (#302)
- A disabled SSL warning was added to HTTP and Otel Trace Source plugins and Data Prepper core APIs. A warning will appear in Data Prepper logs when SSL is disabled. (#537) & (#603)
- HTTP & Otel Trace Source support configurable basic authentication via plugins (#570), (#545)
- File Source file type is now configurable and supports parsing JSON files. The default remains plain text. (#601)
- Buffer Interface now supports batch writing via a writeAll method. BlockingBuffer now supports the writeAll method. (#320)
- Plugin framework now supports a DataPrepperPluginConstructor annotation for indicating a plugin constructor. (#481)
- Data Prepper core APIs now support basic HTTP Authentication, Docker image's core API's are now secure by default (#558), (#561)
- OpenSearch sink now supports forwarding requests through an HTTP Proxy (#479)
- OpenSearch sink now supports an optional index-type parameter. (#480) & (#433)
- OpenSearch sink now emits a new metric for bulkRequestSizeBytes. (#572)
Bug Fixes
- Fixed Github Actions for ODFE integration tests (#393)
Infrastructure
- Using Armeria client builder to help mitigate flaky end-to-end tests (#375)
- Syncing OpenSearch version to help mitigate flaky end-to-end tests (#403)
- Refactoring existing end-to-end tests out of data prepper core into a new e2e-test module (#512)
- Added basic grok end-to-end tests and created necessary CI workflow (#536)
- Code coverage comment bot was added to the GitHub workflow (#549)
- Added DCO check for GitHub workflow (#360)
- Code checkstyle integration (#378)
- Improved Issue template (#397)
- Supporting Maven publication of the Data Prepper API (#596), (#634) & (#635)
- Added support for generating THIRD PARTY licenses (#621) & (#631)
Documentation
- New getting started, developer, getting started trace analytics and pipeline setup guide (#346)
- Added new log ingestion guide showcasing new HTTP and Grok Prepper Plugins (#573)
- New guide for migrating to OpenSearch Data Prepper from Open Distro Data Prepper (#470)
- Added Project Resource to documentation (#482)
- Added Coding Guidance to the Developer Guide (#560)
- Added instructions to build and run the Docker image locally (#564)
- Updated copyright headers for root project, api and core (#569)
- Update documentation to use OpenSearch Dashboards (#658)
- Improving OpenSearch sink documentation (#553), (#562), & (#563)
Maintenance
- Updated version to 1.2 (#416)
- The OpenSearch REST client and the build plugins used by the OpenSearch plugin are now at 1.1.0 (#384)
- Use OpenSearch instead of Elasticsearch in builds (#438)
- Use Netty 4.1.68 which fixes CVE-2021-37136 and CVE-2021-37137 (#661)
- Uses Log4j 2.16.0 which fixes CVE-2021-44228 and CVE-2021-45046 (#742)
Refactoring
- Created a new internal data model, Events, to capture data as it flows through the pipeline. This was introduced to eliminate the excessive de/serialization of the current implementation. Currently, integrated with only log ingestion and sample plugins (#), (#435), (#463), (#468), (#477), & (#539)
- The StdOutSink supports Objects instead of Strings as part of the migration to support the new event model (#599)
- The FileSource uses Objects to support the new event model (#)
- Small refactoring of PeerForwarder to improve readability of the code (#626)
- Plugin class redesign leveraging new plugin framework (#363), (#451), (#478)
- Consistent usage of OpenSearch, OpenSearch Dashboards and Amazon OpenSearch. (#637)
- Ref...
1.1.1
Data Prepper 1.1.1
Release Notes
1.1.0
Data Prepper v1.1.0
Release Notes