Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Unhelpful error message initializing OpenSearch Ingestion & sink #4717

Open
elliott-king opened this issue Jul 9, 2024 · 5 comments
Open
Labels
bug Something isn't working

Comments

@elliott-king
Copy link

Describe the bug
My pipeline sink/ingestion is failing to initialize, with the following cloudwatch error logs:

2024-07-09T22:19:44.802 [dynamodb-pipeline-sink-worker-2-thread-1] WARN  org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink - Failed to initialize OpenSearch sink with a retryable exception. 
java.lang.IllegalStateException: org.apache.http.nio.reactor.IOReactorException: Failure opening selector
	at org.apache.http.impl.nio.client.IOReactorUtils.create(IOReactorUtils.java:45) ~[httpasyncclient-4.1.5.jar:4.1.5]
	at org.apache.http.impl.nio.client.HttpAsyncClientBuilder.build(HttpAsyncClientBuilder.java:686) ~[httpasyncclient-4.1.5.jar:4.1.5]
	at java.base/java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at org.opensearch.client.RestClientBuilder.createHttpClient(RestClientBuilder.java:318) ~[opensearch-rest-client-2.7.0.jar:2.7.0]
	at java.base/java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at org.opensearch.client.RestClientBuilder.build(RestClientBuilder.java:261) ~[opensearch-rest-client-2.7.0.jar:2.7.0]
	at org.opensearch.client.RestHighLevelClient.<init>(RestHighLevelClient.java:284) ~[opensearch-rest-high-level-client-1.3.14.jar:2.7.0]
	at org.opensearch.client.RestHighLevelClient.<init>(RestHighLevelClient.java:276) ~[opensearch-rest-high-level-client-1.3.14.jar:2.7.0]
	at org.opensearch.dataprepper.plugins.sink.opensearch.ConnectionConfiguration.createClient(ConnectionConfiguration.java:328) ~[opensearch-2.x.138.jar:?]
	at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.doInitializeInternal(OpenSearchSink.java:222) ~[opensearch-2.x.138.jar:?]
	at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.doInitialize(OpenSearchSink.java:201) ~[opensearch-2.x.138.jar:?]
	at org.opensearch.dataprepper.model.sink.AbstractSink.initialize(AbstractSink.java:52) ~[data-prepper-api-2.x.138.jar:?]
	at org.opensearch.dataprepper.pipeline.Pipeline.isReady(Pipeline.java:200) ~[data-prepper-core-2.x.138.jar:?]
	at org.opensearch.dataprepper.pipeline.Pipeline.lambda$execute$2(Pipeline.java:252) ~[data-prepper-core-2.x.138.jar:?]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.base/java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.http.nio.reactor.IOReactorException: Failure opening selector
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.<init>(AbstractMultiworkerIOReactor.java:144) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.<init>(DefaultConnectingIOReactor.java:82) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.client.IOReactorUtils.create(IOReactorUtils.java:43) ~[httpasyncclient-4.1.5.jar:4.1.5]
	... 18 more
Caused by: java.io.IOException: Too many open files
	at java.base/sun.nio.ch.IOUtil.makePipe(Native Method) ~[?:?]
	at java.base/sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:83) ~[?:?]
	at java.base/sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36) ~[?:?]
	at java.base/java.nio.channels.Selector.open(Selector.java:295) ~[?:?]
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.<init>(AbstractMultiworkerIOReactor.java:142) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.<init>(DefaultConnectingIOReactor.java:82) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.client.IOReactorUtils.create(IOReactorUtils.java:43) ~[httpasyncclient-4.1.5.jar:4.1.5]
	... 18 more

There is a similar issue (#4195), but that is an auth issue. Seems like "Failure opening selector" can be a file io issue, but idk how to dive into that.

To Reproduce
I'm basically following the AWS Amplify Opensearch guide, but have an "Events" model instead of "Todos":

// backend.ts
// Define OpenSearch index mappings
// https://docs.amplify.aws/react/build-a-backend/data/custom-business-logic/search-and-aggregate-queries/#step-3b-opensearch-service-pipeline
const indexName = "event";

const indexMapping = {
  settings: {
    number_of_shards: 1,
    number_of_replicas: 0,
  },
  mappings: {
    properties: {
      id: {
        type: "keyword",
      },
      name: {
        type: "text",
      },
      ...
      times: { type: "date", format: "time " },
    },
  },
};

// OpenSearch template definition
const openSearchTemplate = `
version: "2"
dynamodb-pipeline:
  source:
    dynamodb:
      acknowledgments: true
      tables:
        - table_arn: "${tableArn}"
          stream:
            start_position: "LATEST"
          export:
            s3_bucket: "${s3BucketName}"
            s3_region: "${region}"
            s3_prefix: "${tableName}/"
      aws:
        sts_role_arn: "${openSearchIntegrationPipelineRole.roleArn}"
        region: "${region}"
  sink:
    - opensearch:
        hosts:
          - "https://${openSearchDomain.domainEndpoint}"
        index: "${indexName}"
        index_type: "custom"
        template_content: |
          ${JSON.stringify(indexMapping)}
        document_id: '\${getMetadata("primary_key")}'
        action: '\${getMetadata("opensearch_action")}'
        document_version: '\${getMetadata("document_version")}'
        document_version_type: "external"
        bulk_size: 4
        aws:
          sts_role_arn: "${openSearchIntegrationPipelineRole.roleArn}"
          region: "${region}"
`;

// Create a CloudWatch log group
const logGroupName = "/aws/vendedlogs/OpenSearchService/pipelines/dev";
[...omitted for brevity]

// Create an OpenSearch Integration Service pipeline
const cfnPipeline = new osis.CfnPipeline(
  dataStack,
  "OpenSearchIntegrationPipeline",
  {
    maxUnits: 4,
    minUnits: 1,
    pipelineConfigurationBody: openSearchTemplate,
    pipelineName: "opensearch-ddb-integration",
    logPublishingOptions: {
      isLoggingEnabled: true,
      cloudWatchLogDestination: {
        logGroup: logGroupName,
      },
    },
  }
);

// Add OpenSearch data source
// https://docs.amplify.aws/react/build-a-backend/data/custom-business-logic/search-and-aggregate-queries/#step-4-expose-new-queries-on-opensearch
const osDataSource = backend.data.addOpenSearchDataSource(
  "osDataSource",
  openSearchDomain
);

Expected behavior
I'd expect the pipeline to work, or more involved logging.

@elliott-king elliott-king added bug Something isn't working untriaged labels Jul 9, 2024
@elliott-king
Copy link
Author

FOLLOW-UP: looks like there was something wrong with my index mapping. I abbreviated it above, but here's the original:

const indexMapping = {
  settings: {
    number_of_shards: 1,
    number_of_replicas: 0,
  },
  mappings: {
    properties: {
      id: {
        type: "keyword",
      },
      name: {
        type: "text",
      },
      host: {
        type: "text",
      },
      description: {
        type: "text",
      },
      location_description: {
        type: "text",
      },
      rsvp: {
        type: "boolean",
      },
      source: { type: "text" },
      website: { type: "text" },
      types: { type: "string" },
      location: { type: "geo_point" },
      dates: { type: "date", format: "year_month_day" },
      times: { type: "date", format: "time " },
    },
  },
};

The problem here was:

      types: { type: "string" },

But that's pretty much impossible to determine from the logs.

@dlvenable
Copy link
Member

@elliott-king , Did your pipeline work after you corrected your index mapping? Or does this still occur?

@jmanuel1
Copy link

jmanuel1 commented Jul 24, 2024

One of my coworkers encountered an issue like this last week. Our logs have a whole lot of the same "Failure opening selector" exception, but these exceptions were directly preceded by authorization exceptions that he was trying to address.

2024-07-18T19:17:47.338 [dynamodb-pipeline-sink-worker-2-thread-1] WARN  org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink - Failed to initialize OpenSearch sink, retrying: org.opensearch.client.opensearch._types.OpenSearchException: Request failed: [security_exception] OpenSearch exception [type=authorization_exception, reason=User does not have permissions for the requested resource]

Here's the IO exception for reference:

2024-07-18T19:17:48.343 [dynamodb-pipeline-sink-worker-2-thread-1] WARN  org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink - Failed to initialize OpenSearch sink with a retryable exception. 
java.lang.IllegalStateException: org.apache.http.nio.reactor.IOReactorException: Failure opening selector
	at org.apache.http.impl.nio.client.IOReactorUtils.create(IOReactorUtils.java:45) ~[httpasyncclient-4.1.5.jar:4.1.5]
	at org.apache.http.impl.nio.client.HttpAsyncClientBuilder.build(HttpAsyncClientBuilder.java:686) ~[httpasyncclient-4.1.5.jar:4.1.5]
	at java.base/java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at org.opensearch.client.RestClientBuilder.createHttpClient(RestClientBuilder.java:318) ~[opensearch-rest-client-2.7.0.jar:2.7.0]
	at java.base/java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at org.opensearch.client.RestClientBuilder.build(RestClientBuilder.java:261) ~[opensearch-rest-client-2.7.0.jar:2.7.0]
	at org.opensearch.client.RestHighLevelClient.<init>(RestHighLevelClient.java:284) ~[opensearch-rest-high-level-client-1.3.14.jar:2.7.0]
	at org.opensearch.client.RestHighLevelClient.<init>(RestHighLevelClient.java:276) ~[opensearch-rest-high-level-client-1.3.14.jar:2.7.0]
	at org.opensearch.dataprepper.plugins.sink.opensearch.ConnectionConfiguration.createClient(ConnectionConfiguration.java:328) ~[opensearch-2.x.143.jar:?]
	at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.lambda$doInitializeInternal$0(OpenSearchSink.java:226) ~[opensearch-2.x.143.jar:?]
	at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchClientRefresher.<init>(OpenSearchClientRefresher.java:33) ~[opensearch-2.x.143.jar:?]
	at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.doInitializeInternal(OpenSearchSink.java:232) ~[opensearch-2.x.143.jar:?]
	at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.doInitialize(OpenSearchSink.java:201) ~[opensearch-2.x.143.jar:?]
	at org.opensearch.dataprepper.model.sink.AbstractSink.initialize(AbstractSink.java:52) ~[data-prepper-api-2.x.143.jar:?]
	at org.opensearch.dataprepper.pipeline.Pipeline.isReady(Pipeline.java:200) ~[data-prepper-core-2.x.143.jar:?]
	at org.opensearch.dataprepper.pipeline.Pipeline.lambda$execute$2(Pipeline.java:252) ~[data-prepper-core-2.x.143.jar:?]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.base/java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.http.nio.reactor.IOReactorException: Failure opening selector
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.<init>(AbstractMultiworkerIOReactor.java:144) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.<init>(DefaultConnectingIOReactor.java:82) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.client.IOReactorUtils.create(IOReactorUtils.java:43) ~[httpasyncclient-4.1.5.jar:4.1.5]
	... 20 more
Caused by: java.io.IOException: Too many open files
	at java.base/sun.nio.ch.IOUtil.makePipe(Native Method) ~[?:?]
	at java.base/sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:83) ~[?:?]
	at java.base/sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36) ~[?:?]
	at java.base/java.nio.channels.Selector.open(Selector.java:295) ~[?:?]
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.<init>(AbstractMultiworkerIOReactor.java:142) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.<init>(DefaultConnectingIOReactor.java:82) ~[httpcore-nio-4.4.15.jar:4.4.15]
	at org.apache.http.impl.nio.client.IOReactorUtils.create(IOReactorUtils.java:43) ~[httpasyncclient-4.1.5.jar:4.1.5]
	... 20 more

He deployed a change to attempt to solve the authorization issue, but I don't know if it had any effect. Anyway, we're still seeing the same IOException.

BTW these two particular logs that I posted are adjacent.

@elliott-king
Copy link
Author

elliott-king commented Jul 24, 2024

@elliott-king , Did your pipeline work after you corrected your index mapping? Or does this still occur?

It did work. The source was user error (me), but the logs were still indecipherable.

Since my issue was related to the logging output, I left this open.

@oeyh
Copy link
Collaborator

oeyh commented Aug 13, 2024

This should be fixed by #4758

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants