LDAP ingestor list index out of range error for some user attributes #8523

Closed
alplatonov opened this issue Jul 28, 2023 · 3 comments
Labels
bug Bug report stale

Comments

@alplatonov
Contributor

alplatonov commented Jul 28, 2023

Describe the bug
The LDAP ingestion source fails with a "list index out of range" error when a mapped user attribute (for example departmentNumber) is not filled in in the LDAP catalog.
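
The failing expression from the last frame of the traceback in this issue can be reproduced in isolation. This is only a sketch: it assumes the attribute lookup yields an empty list for a user without a department, and the `attrs` and `user_attrs_map` values are made up for illustration.

```python
# Hypothetical data: an LDAP result whose departmentNumber attribute has no values.
user_attrs_map = {"departmentId": "departmentNumber"}
attrs = {"departmentNumber": []}


def read_department_id(attrs, user_attrs_map):
    # Same shape as the expression reported on the last traceback frame:
    # index the first value unconditionally, then decode and convert it.
    return int(attrs[user_attrs_map["departmentId"]][0].decode())


try:
    read_department_id(attrs, user_attrs_map)
    error = None
except IndexError as exc:
    # Indexing [0] on an empty list is what raises here.
    error = str(exc)

print(error)  # list index out of range
```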

To Reproduce
Steps to reproduce the behavior:

  1. Set up an LDAP catalog in which Person entries have no departmentNumber value.
  2. Set up an LDAP ingestion source; a basic configuration is enough.
  3. Run the ingestion and see the error:
[2023-07-28 12:59:44,122] ERROR    {datahub.entrypoints:199} - Command failed: list index out of range
Traceback (most recent call last):
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/entrypoints.py", line 186, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/venv/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/venv/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/venv/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/telemetry/telemetry.py", line 448, in wrapper
    raise e
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/telemetry/telemetry.py", line 397, in wrapper
    res = func(*args, **kwargs)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
    return func(ctx, *args, **kwargs)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/cli/ingest_cli.py", line 195, in run
    ret = loop.run_until_complete(run_ingestion_and_check_upgrade())
  File "/opt/homebrew/Cellar/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/cli/ingest_cli.py", line 179, in run_ingestion_and_check_upgrade
    ret = await ingestion_future
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/cli/ingest_cli.py", line 137, in run_pipeline_to_completion
    raise e
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/cli/ingest_cli.py", line 129, in run_pipeline_to_completion
    pipeline.run()
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/run/pipeline.py", line 367, in run
    for wu in itertools.islice(
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py", line 119, in auto_stale_entity_removal
    for wu in stream:
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py", line 143, in auto_workunit_reporter
    for wu in stream:
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py", line 208, in auto_browse_path_v2
    for urn, batch in _batch_workunits_by_urn(stream):
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py", line 346, in _batch_workunits_by_urn
    for wu in stream:
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py", line 156, in auto_materialize_referenced_tags
    for wu in stream:
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py", line 70, in auto_status_aspect
    for wu in stream:
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/source/ldap.py", line 270, in get_workunits_internal
    yield from self.handle_user(dn, attrs)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/source/ldap.py", line 319, in handle_user
    mce = self.build_corp_user_mce(dn, attrs, manager_ldap)
  File "/Users/alplatonov/git/datahub-idea/metadata-ingestion/src/datahub/ingestion/source/ldap.py", line 365, in build_corp_user_mce
    int(attrs[self.config.user_attrs_map["departmentId"]][0].decode())
IndexError: list index out of range

Expected behavior
DataHub ingests the LDAP source without errors, even when a user attribute such as departmentNumber is not filled in.
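
One way to get this behavior is to treat a missing or empty attribute as "no value" instead of indexing into it directly. The following is a minimal sketch and not necessarily the change made in the linked PR; `first_value` and `parse_department_id` are hypothetical helper names, and `attrs` is assumed to map attribute names to lists of bytes.

```python
from typing import Dict, List, Optional


def first_value(attrs: Dict[str, List[bytes]], key: str) -> Optional[str]:
    """Return the first decoded value for key, or None if absent or empty."""
    values = attrs.get(key) or []
    return values[0].decode() if values else None


def parse_department_id(attrs: Dict[str, List[bytes]], key: str) -> Optional[int]:
    """Return the department id as an int, or None if it is unusable."""
    raw = first_value(attrs, key)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        # A non-numeric value is treated the same as a missing one.
        return None
```

With this shape, a user without a department yields `None` rather than aborting the whole ingestion run; whether a missing value should be skipped silently or reported as a warning is a separate design choice.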

PR #8525

@alplatonov
Contributor Author

Waiting for a release.

@github-actions

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions bot added the stale label Sep 11, 2023
@hsheth2
Collaborator

hsheth2 commented Oct 4, 2023

#8525 has been merged and released, so I'm going to go ahead and close this.

hsheth2 closed this as completed Oct 4, 2023