Error while transferring data from Elasticsearch to Milvus: Unable to create a source for identifier 'Elasticsearch'. #47

Open
gifi-siby opened this issue Jan 27, 2025 · 4 comments

gifi-siby commented Jan 27, 2025

I am trying to migrate data from Elasticsearch to Milvus and below is my configuration:

env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  Elasticsearch {
    tls_verify_hostname = false
    tls_verify_certificate = false
    hosts = ["http://127.0.0.1:9200/"]
    api_key = "APIkey"
    index = "books"
  }
}
transform {
  FieldMapper {
    field_mapper = {
      _id = _id
      vector = vector
    }
  }
}
sink {
  Milvus {
    url = "http://localhost:19530/"
    token = ""
    database = "default"
    batch_size = 10
  }
}

When I try to run ./bin/seatunnel.sh --config ./es_to_milvus.conf -m local, I get the following error:
[Factory initialize failed] - Unable to create a source for identifier 'Elasticsearch'.

nianliuu (Collaborator) commented Feb 5, 2025

@gifi-siby Hi, is there an error log that could help us investigate? Thank you!

gifi-siby (Author) commented

Sure, below are the logs I got in the console:

root@41e1a3fa66b7:/opt/seatunnel# mkdir -p ./logs
./bin/seatunnel-cluster.sh -d
start master_and_worker node
root@41e1a3fa66b7:/opt/seatunnel# ./bin/seatunnel.sh --config ./es_to_milvus.conf 
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-02-05 14:50:55,270 INFO  [c.h.i.c.AbstractConfigLocator ] [main] - Loading configuration '/opt/seatunnel/config/seatunnel.yaml' from System property 'seatunnel.config'
2025-02-05 14:50:55,280 INFO  [c.h.i.c.AbstractConfigLocator ] [main] - Using configuration file at /opt/seatunnel/config/seatunnel.yaml
2025-02-05 14:50:55,293 INFO  [o.a.s.e.c.c.SeaTunnelConfig   ] [main] - seatunnel.home is /opt/seatunnel
2025-02-05 14:50:55,471 INFO  [c.h.i.c.AbstractConfigLocator ] [main] - Loading configuration '/opt/seatunnel/config/hazelcast.yaml' from System property 'hazelcast.config'
2025-02-05 14:50:55,471 INFO  [c.h.i.c.AbstractConfigLocator ] [main] - Using configuration file at /opt/seatunnel/config/hazelcast.yaml
2025-02-05 14:50:55,989 INFO  [c.h.i.c.AbstractConfigLocator ] [main] - Loading configuration '/opt/seatunnel/config/hazelcast-client.yaml' from System property 'hazelcast.client.config'
2025-02-05 14:50:55,990 INFO  [c.h.i.c.AbstractConfigLocator ] [main] - Using configuration file at /opt/seatunnel/config/hazelcast-client.yaml
2025-02-05 14:50:56,398 INFO  [.c.i.s.ClientInvocationService] [main] - hz.client_1 [seatunnel] [5.1] Running with 2 response threads, dynamic=true
2025-02-05 14:50:56,491 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is STARTING
2025-02-05 14:50:56,492 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is STARTED
2025-02-05 14:50:56,537 INFO  [.c.i.c.ClientConnectionManager] [main] - hz.client_1 [seatunnel] [5.1] Trying to connect to cluster: seatunnel
2025-02-05 14:50:56,542 INFO  [.c.i.c.ClientConnectionManager] [main] - hz.client_1 [seatunnel] [5.1] Trying to connect to [localhost]:5801
2025-02-05 14:50:56,612 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is CLIENT_CONNECTED
2025-02-05 14:50:56,613 INFO  [.c.i.c.ClientConnectionManager] [main] - hz.client_1 [seatunnel] [5.1] Authenticated with server [localhost]:5801:66e9f961-9a44-497d-a43b-45d330644b00, server version: 5.1, local address: /127.0.0.1:48977
2025-02-05 14:50:56,617 INFO  [c.h.i.d.Diagnostics           ] [main] - hz.client_1 [seatunnel] [5.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
2025-02-05 14:50:56,645 INFO  [c.h.c.i.s.ClientClusterService] [hz.client_1.event-5] - hz.client_1 [seatunnel] [5.1] 

Members [1] {
        Member [localhost]:5801 - 66e9f961-9a44-497d-a43b-45d330644b00
}

2025-02-05 14:50:56,706 INFO  [.c.i.s.ClientStatisticsService] [main] - Client statistics is enabled with period 5 seconds.
2025-02-05 14:50:57,001 INFO  [o.a.s.c.s.u.ConfigBuilder     ] [main] - Loading config file from path: ./es_to_milvus.conf
2025-02-05 14:50:57,177 INFO  [o.a.s.c.s.u.ConfigShadeUtils  ] [main] - Load config shade spi: [base64]
2025-02-05 14:50:57,229 INFO  [o.a.s.c.s.u.ConfigBuilder     ] [main] - Parsed config file: 
{
    "transform" : [
        {
            "field_mapper" : {
                "vector" : "vector",
                "_id" : "_id"
            },
            "plugin_name" : "FieldMapper"
        }
    ],
    "sink" : [
        {
            "database" : "default",
            "batch_size" : 10,
            "plugin_name" : "Milvus",
            "url" : "http://localhost:19530/",
            "token" : ""
        }
    ],
    "source" : [
        {
            "tls_verify_hostname" : false,
            "api_key" : "api-key",
            "hosts" : [
                "http://127.0.0.1:9200/"
            ],
            "index" : "books",
            "plugin_name" : "Elasticsearch",
            "tls_verify_certificate" : false
        }
    ],
    "env" : {
        "job.mode" : "BATCH",
        "parallelism" : 1
    }
}

2025-02-05 14:50:57,241 INFO  [p.MultipleTableJobConfigParser] [main] - add common jar in plugins :[]
2025-02-05 14:50:57,270 INFO  [.s.p.d.AbstractPluginDiscovery] [main] - Load SeaTunnelSink Plugin from /opt/seatunnel/connectors
2025-02-05 14:50:57,281 INFO  [.s.p.d.AbstractPluginDiscovery] [main] - Discovery plugin jar for: PluginIdentifier{engineType='seatunnel', pluginType='source', pluginName='Elasticsearch'} at: file:/opt/seatunnel/connectors/connector-elasticsearch-2.3.8-SNAPSHOT.jar
2025-02-05 14:50:57,287 INFO  [.s.p.d.AbstractPluginDiscovery] [main] - Discovery plugin jar for: PluginIdentifier{engineType='seatunnel', pluginType='transform', pluginName='FieldMapper'} at: file:/opt/seatunnel/connectors/seatunnel-transforms-v2-2.3.8-SNAPSHOT.jar
2025-02-05 14:50:57,288 INFO  [.s.p.d.AbstractPluginDiscovery] [main] - Discovery plugin jar for: PluginIdentifier{engineType='seatunnel', pluginType='sink', pluginName='Milvus'} at: file:/opt/seatunnel/connectors/connector-milvus-2.3.8-SNAPSHOT.jar
2025-02-05 14:50:57,298 INFO  [p.MultipleTableJobConfigParser] [main] - start generating all sources.
2025-02-05 14:50:57,520 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTTING_DOWN
2025-02-05 14:50:57,525 INFO  [.c.i.c.ClientConnectionManager] [main] - hz.client_1 [seatunnel] [5.1] Removed connection to endpoint: [localhost]:5801:66e9f961-9a44-497d-a43b-45d330644b00, connection: ClientConnection{alive=false, connectionId=1, channel=NioChannel{/127.0.0.1:48977->localhost/127.0.0.1:5801}, remoteAddress=[localhost]:5801, lastReadTime=2025-02-05 14:50:56.985, lastWriteTime=2025-02-05 14:50:56.980, closedTime=2025-02-05 14:50:57.522, connected server version=5.1}
2025-02-05 14:50:57,526 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is CLIENT_DISCONNECTED
2025-02-05 14:50:57,531 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTDOWN
2025-02-05 14:50:57,532 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed SeaTunnel client......
2025-02-05 14:50:57,532 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - 

===============================================================================


2025-02-05 14:50:57,532 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Fatal Error, 

2025-02-05 14:50:57,533 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues

2025-02-05 14:50:57,533 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Reason:SeaTunnel job executed failed 

2025-02-05 14:50:57,539 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:213)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.api.table.factory.FactoryException: ErrorCode:[API-06], ErrorDescription:[Factory initialize failed] - Unable to create a source for identifier 'Elasticsearch'.
        at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:101)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSource(MultipleTableJobConfigParser.java:375)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:209)
        at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:114)
        at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:182)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:160)
        ... 2 more
Caused by: java.lang.IllegalArgumentException: Invalid HTTP host: 127.0.0.1:9200/
        at org.apache.http.HttpHost.create(HttpHost.java:122)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.buildHttpHosts(EsRestClient.java:239)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.getRestClientBuilder(EsRestClient.java:180)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.createInstance(EsRestClient.java:143)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.createInstance(EsRestClient.java:116)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.getFieldTypeMapping(ElasticsearchSource.java:287)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.parseOneIndexQueryConfig(ElasticsearchSource.java:117)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.<init>(ElasticsearchSource.java:83)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSourceFactory.lambda$createSource$0(ElasticsearchSourceFactory.java:79)
        at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:113)
        at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:74)
        ... 7 more
 
2025-02-05 14:50:57,540 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - 
===============================================================================



Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:213)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.api.table.factory.FactoryException: ErrorCode:[API-06], ErrorDescription:[Factory initialize failed] - Unable to create a source for identifier 'Elasticsearch'.
        at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:101)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSource(MultipleTableJobConfigParser.java:375)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:209)
        at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:114)
        at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:182)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:160)
        ... 2 more
Caused by: java.lang.IllegalArgumentException: Invalid HTTP host: 127.0.0.1:9200/
        at org.apache.http.HttpHost.create(HttpHost.java:122)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.buildHttpHosts(EsRestClient.java:239)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.getRestClientBuilder(EsRestClient.java:180)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.createInstance(EsRestClient.java:143)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.createInstance(EsRestClient.java:116)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.getFieldTypeMapping(ElasticsearchSource.java:287)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.parseOneIndexQueryConfig(ElasticsearchSource.java:117)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.<init>(ElasticsearchSource.java:83)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSourceFactory.lambda$createSource$0(ElasticsearchSourceFactory.java:79)
        at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:113)
        at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:74)
        ... 7 more
root@41e1a3fa66b7:/opt/seatunnel# 

gifi-siby (Author) commented

Additional Logs:
seatunnel-engine-server.log
seatunnel-server.out.log

nianliuu (Collaborator) commented Feb 6, 2025

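It looks like the problem is the Elasticsearch endpoint. The root cause in the stack trace is "java.lang.IllegalArgumentException: Invalid HTTP host: 127.0.0.1:9200/", so the hosts entry should be http://127.0.0.1:9200, with no trailing "/". Also, make sure your local Elasticsearch cluster is reachable from the SeaTunnel pod.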
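
For anyone hitting the same error, here is a rough shell sketch of the suggested fix. It is only an illustration: the helper name strip_trailing_slash is hypothetical (not part of SeaTunnel), and the commented-out curl check assumes curl is installed in the SeaTunnel container and that Elasticsearch answers without extra authentication headers, which may not hold for a cluster secured with an API key.

```shell
# Hypothetical helper (not part of SeaTunnel): drop any trailing slashes
# from a host URL so the HTTP client can parse the port.
strip_trailing_slash() {
  printf '%s\n' "$1" | sed 's:/*$::'
}

# Host value taken from the config in this issue.
ES_HOST="$(strip_trailing_slash "http://127.0.0.1:9200/")"

# Print the corrected line for the Elasticsearch source block.
echo "hosts = [\"$ES_HOST\"]"
# prints: hosts = ["http://127.0.0.1:9200"]

# Optional reachability check, to be run from inside the SeaTunnel container or pod.
# Assumes curl is available there; uncomment to use.
# curl -sS "$ES_HOST"
```

The printed line can replace the hosts entry in es_to_milvus.conf before re-running ./bin/seatunnel.sh --config ./es_to_milvus.conf.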
