Correct Parquet support for the S3 sink and a new multipart buffer type (opensearch-project#3186)

Corrects the ParquetOutputCodec and moves it into the S3 sink project for now. The codec has a few corrections, including support for compression and avoiding multiple S3 copies. This PR also adds a new buffer type to the S3 sink: multipart uploads. It also includes a number of refactorings to the project and the integration tests.

Signed-off-by: David Venable <[email protected]>
dlvenable authored Aug 17, 2023
1 parent b50f095 commit 1f4b48e
Showing 60 changed files with 1,050 additions and 1,225 deletions.
@@ -58,6 +58,22 @@ public interface OutputCodec {
     */
    String getExtension();

    /**
     * Returns true if this codec has an internal compression. That is, the entire
     * {@link OutputStream} should not be compressed.
     * <p>
     * When this value is true, sinks should not attempt to compress the final {@link OutputStream}
     * at all.
     * <p>
     * For example, Parquet compression happens within the file. Each column chunk
     * is compressed independently.
     *
     * @return True if the compression is internal to the codec; false if whole-file compression is ok.
     */
    default boolean isCompressionInternal() {
        return false;
    }

    default Event addTagsToEvent(Event event, String tagsTargetKey) throws JsonProcessingException {
        String eventJsonString = event.jsonBuilder().includeTags(tagsTargetKey).toJsonString();
        Map<String, Object> eventData = objectMapper.readValue(eventJsonString, new TypeReference<>() {
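
To illustrate the contract described in the Javadoc for isCompressionInternal(), here is a minimal sketch of how a sink might consult the flag before wrapping its destination stream. It is not taken from this commit: the Codec interface, InternallyCompressedCodec, and prepareStream are hypothetical stand-ins invented for the example, and gzip is an assumed choice of whole-file compression.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressionInternalSketch {
    /** Hypothetical stand-in for the relevant part of OutputCodec. */
    interface Codec {
        // Same default as in the diff: whole-file compression is allowed.
        default boolean isCompressionInternal() {
            return false;
        }
    }

    /** Hypothetical codec that compresses inside the file, as Parquet does. */
    static class InternallyCompressedCodec implements Codec {
        @Override
        public boolean isCompressionInternal() {
            return true;
        }
    }

    /**
     * Returns the stream the codec should write to. The destination is
     * wrapped in gzip only when the codec does not compress internally.
     */
    static OutputStream prepareStream(Codec codec, OutputStream destination) throws IOException {
        if (codec.isCompressionInternal()) {
            // The codec handles compression itself, so pass the stream through untouched.
            return destination;
        }
        return new GZIPOutputStream(destination);
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "{\"message\":\"hello\"}".getBytes(StandardCharsets.UTF_8);

        // A codec that relies on the default gets whole-file gzip compression.
        ByteArrayOutputStream gzipped = new ByteArrayOutputStream();
        try (OutputStream out = prepareStream(new Codec() {}, gzipped)) {
            out.write(payload);
        }

        // A codec with internal compression writes straight to the destination.
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (OutputStream out = prepareStream(new InternallyCompressedCodec(), raw)) {
            out.write(payload);
        }

        System.out.println("gzip-wrapped bytes: " + gzipped.size());
        System.out.println("pass-through bytes: " + raw.size());
    }
}
```

Keeping the decision in one helper means a codec only has to override the default method to opt out of whole-file compression; the sink does not need to know which codec it was given.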
@@ -1,8 +1,8 @@
package org.opensearch.dataprepper.model.codec;

import com.fasterxml.jackson.core.JsonProcessingException;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.mockito.invocation.InvocationOnMock;
import org.opensearch.dataprepper.model.event.DefaultEventMetadata;
import org.opensearch.dataprepper.model.event.Event;
import org.opensearch.dataprepper.model.event.EventMetadata;
@@ -19,12 +19,17 @@
import java.util.Set;
import java.util.UUID;

import static org.hamcrest.CoreMatchers.equalTo;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.junit.Assert.assertNotEquals;
import static org.mockito.Mockito.mock;

public class OutputCodecTest {
    @Test
    void isCompressionInternal_returns_false() {
        OutputCodec objectUnderTest = mock(OutputCodec.class, InvocationOnMock::callRealMethod);

        assertThat(objectUnderTest.isCompressionInternal(), equalTo(false));
    }

    @Test

This file was deleted.