
consumer: KeyError: 'NextShardIterator' after shard scale-up #29

Closed
zarnovican opened this issue Apr 13, 2021 · 4 comments

Comments

@zarnovican
Contributor

Hi,

I scaled up my Kinesis stream from 2 to 4 shards. Since then, I'm not able to read anything because of this error:

Apr 13 14:06:39 monitoring python[1418761]: KeyError: 'NextShardIterator'
Apr 13 14:06:42 monitoring python[1418761]: 2021-04-13 14:06:42,031 ERROR kinesis.consumer 'NextShardIterator'
Apr 13 14:06:42 monitoring python[1418761]: Traceback (most recent call last):
Apr 13 14:06:42 monitoring python[1418761]:   File "/home/ubuntu/.virtualenvs/kinesis_to_influx/lib/python3.8/site-packages/kinesis/consumer.py", line 121, in _fetch
Apr 13 14:06:42 monitoring python[1418761]:     await self.fetch()
Apr 13 14:06:42 monitoring python[1418761]:   File "/home/ubuntu/.virtualenvs/kinesis_to_influx/lib/python3.8/site-packages/kinesis/consumer.py", line 266, in fetch
Apr 13 14:06:42 monitoring python[1418761]:     if not result["NextShardIterator"]:
Apr 13 14:06:42 monitoring python[1418761]: KeyError: 'NextShardIterator'

When I added a bit of logging, the result variable contained this:

{'MillisBehindLatest': 4358000,
 'Records': [],
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '439',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Tue, 13 Apr 2021 14:09:44 GMT',
                                      'x-amz-id-2': 'nE1OTCciholISmxfJITWsyexTmFbtp5qmCWLbaie1QFWs4IvB7hjXV9o4wLeyznyInZ2yzvP54OYbOv+tCR1yLfvO89qk0F3EHqTm5OXcQ8=',
                                      'x-amzn-requestid': 'f9e5ea88-296f-9be6-a10a-42e0f39804d1'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'f9e5ea88-296f-9be6-a10a-42e0f39804d1',
                      'RetryAttempts': 0}}

Restarting the consumer does not help.

I haven't tried it yet, but I guess destroying and recreating the stream would solve the problem.

Info:

  • async-kinesis version 1.1.2
  • single consumer
  • consumer is created as (nothing fancy):
async with kinesis.Consumer(stream_name=config.KINESIS_STREAM, iterator_type='LATEST', processor=kinesis.StringProcessor()) as kinesis_stream:
@hampsterx
Owner

yep known issue, WIP see #26

@jmcgrath207
Contributor

jmcgrath207 commented Apr 13, 2021

@zarnovican so this issue will correct itself after 24 hours, or after your stream's retention period, counting from when the new shards were created.

Here is why.

When you bumped your shards from 2 to 4, there are really 6 shards in total now: 2 old parent shards, 2 child shards that point back to those parents, and 2 completely new shards with no parent. The 2 old parent shards will stay in the available shard list until their retention period expires.
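(Editor's note: the shard accounting above can be illustrated with a small helper. This is a sketch, not part of async-kinesis; `classify_shards` and the sample shard list are hypothetical, but the dict shape, `ShardId` plus an optional `ParentShardId`, matches what boto3's `list_shards` returns.)

```python
def classify_shards(shards):
    """Split a ListShards-style result into parent, child, and parentless shards.

    Each shard dict is assumed to carry a 'ShardId' and, for child shards,
    a 'ParentShardId' -- the shape boto3's list_shards returns.
    """
    ids = {s["ShardId"] for s in shards}
    parent_ids = {s.get("ParentShardId") for s in shards} - {None}
    out = {"parents": [], "children": [], "new": []}
    for s in shards:
        if s["ShardId"] in parent_ids:
            out["parents"].append(s["ShardId"])    # old shard, closed by the reshard
        elif s.get("ParentShardId") in ids:
            out["children"].append(s["ShardId"])   # points back at a listed parent
        else:
            out["new"].append(s["ShardId"])        # no (still-listed) parent
    return out

# Illustrative data for the 2 -> 4 scale-up described above: 6 shards total.
shards = [
    {"ShardId": "shardId-0000"},                                   # old parent
    {"ShardId": "shardId-0001"},                                   # old parent
    {"ShardId": "shardId-0002", "ParentShardId": "shardId-0000"},  # child
    {"ShardId": "shardId-0003", "ParentShardId": "shardId-0001"},  # child
    {"ShardId": "shardId-0004"},                                   # brand new
    {"ShardId": "shardId-0005"},                                   # brand new
]
print(classify_shards(shards))
```

Once the parents' retention period expires they drop out of the listing, and their former children are reclassified as parentless, which is why the problem resolves itself over time.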

As hampsterx mentioned, #26 combats this problem by adding a shard lifecycle manager to help detect this issue. It's been drawn out due to lack of time on my side and issues with the kinesislite library for testing, so I have to use a real stream to fix it.

That said, if you want to get involved in fixing it, I'm more than happy to take pull requests on #26.

Also worth mentioning, this only affects consumers. The producers are unaffected by this.
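(Editor's note: the immediate crash is also avoidable on the consumer side. GetRecords omits NextShardIterator once a shard is closed, so the response should be read with `.get()` rather than indexed. A minimal sketch; the helper name is hypothetical, not async-kinesis code.)

```python
def next_shard_iterator(result):
    """Return the next iterator from a GetRecords response, or None when
    the key is absent -- i.e. the shard has been closed (e.g. by
    resharding) -- so the caller can retire the shard instead of crashing.
    """
    return result.get("NextShardIterator")

# Closed shard: no 'NextShardIterator' key, as in the response pasted above.
closed = {"MillisBehindLatest": 4358000, "Records": []}
still_open = {"Records": [], "NextShardIterator": "AAAAexampletoken"}

assert next_shard_iterator(closed) is None
assert next_shard_iterator(still_open) == "AAAAexampletoken"
```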

@zarnovican
Contributor Author

Thank you both for the timely response, and John for the great explanation.

It's been drawn out due to lack of time

No pressure. I have destroyed and re-created the affected Kinesis streams, which solved the problem for me. Fortunately, I'm using Kinesis only for CloudFront logging, so the gap in the graphs is no problem for me, and I won't be needing resharding for a while anyway. I just thought I'd let you know about the problem. But since you are aware of it, I'm happy to close this as a duplicate.

@zarnovican
Contributor Author

duplicate of #26

@zarnovican zarnovican closed this as not planned (duplicate) on Aug 16, 2022