fdb_flush.lua executes so long leading to REDIS BUSY #1397

Open
wants to merge 1 commit into
base: master

inspurSDN

Why I did it
During MAC learning, TC1 was configured to send traffic with changing source MACs (217600 of them). The log shows "No More Resources" errors and orchagent hangs.

Nov 29 17:50:47.706137 NV2 ERR syncd#SDK: [FDB_UC.ERR] Polling enabled on error
Nov 29 17:50:47.806214 NV2 ERR syncd#SDK: [FDB_UC.ERR] Failed adding entries to RM (No More Resources)
Nov 29 17:50:47.806214 NV2 ERR syncd#SDK: [FDB_UC.ERR] Process polled data failed on SWID - 0, status - No More Resources
Nov 29 17:50:47.806214 NV2 ERR syncd#SDK: [FDB_UC.ERR] Polling enabled on error
Nov 29 17:50:47.906356 NV2 ERR syncd#SDK: [FDB_UC.ERR] Failed adding entries to RM (No More Resources)

What I did
Instead of using a Lua script to flush, delete the entries one by one in a loop in the code.

@inspurSDN
Author

Hi @yxieca , @vaibhavhd , could you please kindly review these? #1397 #1399 #1400

@yxieca yxieca requested a review from qiluo-msft July 5, 2024 20:54
@yxieca
Contributor

yxieca commented Jul 5, 2024

@qiluo-msft to help take an initial assessment.

@qiluo-msft qiluo-msft requested a review from kcudnik July 8, 2024 23:19
}
}
}
else if (vals.size() == 1)
Contributor


else

What if vals.size() is neither 1 nor 2? Currently the code just skips that case silently.
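A minimal sketch of what an explicit final branch could look like, so that an unexpected size fails loudly instead of being skipped. The helper `scopeFromVals` and the `FlushScope` enum are hypothetical names used only for illustration and are not part of this PR. It also assumes that `vals` holds the `flush_static` flags (0 for dynamic, 1 for static), which is inferred from the `for (int flush_static: vals)` line in the diff rather than confirmed.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical enum (not in the PR): which FDB entries a flush should touch.
enum class FlushScope { DynamicOnly, StaticOnly, All };

// Hypothetical helper (not in the PR). Assumes vals contains flush_static
// flags: 0 = dynamic entries, 1 = static entries.
FlushScope scopeFromVals(const std::vector<int>& vals)
{
    if (vals.size() == 2)
    {
        // Both flags present: flush dynamic and static entries.
        return FlushScope::All;
    }
    else if (vals.size() == 1)
    {
        return vals[0] != 0 ? FlushScope::StaticOnly : FlushScope::DynamicOnly;
    }
    else
    {
        // Any other size is a programming error; report it instead of
        // silently doing nothing.
        throw std::logic_error(
            "unexpected number of flush types: " + std::to_string(vals.size()));
    }
}
```

In the real code the final branch would more likely use the existing logging macro (for example `SWSS_LOG_THROW`, which already appears in this function) than a raw exception.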

portStr.c_str(),
std::to_string(flush_static).c_str());

swss::RedisReply r(m_dbAsic.get(), command);
Contributor


I am trying to understand the root cause of "REDIS BUSY". If m_dbAsic is using a Redis pipeline, you could explicitly call its flush function to improve Redis responsiveness.

*bridgePortIdFromDb == portStr)))
{
m_dbAsic->del(it);
countFlushed++;
Collaborator


The script was added in Lua since it's faster to flush the FDB that way than doing it one by one. Are you sure it's the flush that leads to Redis busy? How many FDB entries do you have in the database?

@kcudnik
Collaborator

kcudnik commented Jul 24, 2024

How come Lua is slower than deleting entries one by one using hiredis? Do you have performance stats for this?

@@ -893,17 +893,46 @@ void RedisClient::processFlushEvent(
SWSS_LOG_THROW("unknown fdb flush entry type: %d", type);
}

for (int flush_static: vals)
// With a large number of MACs (e.g. 217600), the Lua script causes REDIS BUSY.
// Delete the entries in a loop instead; this is no longer an atomic operation.
Collaborator


Atomicity would be desirable here, since a non-atomic flush could lead to some weird issues if a new MAC is learned during deletion.
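To make the concern concrete, here is a small self-contained simulation of a non-atomic flush. It does not use Redis or the real ASIC DB: `FdbTable`, `flushPortNonAtomic` and the `betweenSteps` callback are all hypothetical stand-ins, with an in-memory map playing the role of the database and the callback playing the role of MAC learning that runs between the key scan and the deletions.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Illustrative stand-in for the FDB table: MAC address -> bridge port.
using FdbTable = std::map<std::string, std::string>;

// Non-atomic flush: snapshot the matching keys first, then delete them one
// at a time. betweenSteps simulates another writer (such as MAC learning)
// that modifies the table after the snapshot but before the deletions.
std::size_t flushPortNonAtomic(FdbTable& table,
                               const std::string& port,
                               const std::function<void(FdbTable&)>& betweenSteps)
{
    std::vector<std::string> snapshot;
    for (const auto& kv : table)
    {
        if (kv.second == port)
        {
            snapshot.push_back(kv.first);
        }
    }

    betweenSteps(table); // concurrent update lands here

    std::size_t deleted = 0;
    for (const auto& mac : snapshot)
    {
        // The delete acts on the stale snapshot, not on the current state.
        deleted += table.erase(mac);
    }
    return deleted;
}
```

If the callback learns a new MAC on the flushed port and moves an existing MAC to a different port, the new entry survives the flush while the moved entry is deleted even though it no longer matches. A script that scans and deletes in one atomic step avoids both outcomes, at the cost of blocking Redis for the duration of the script.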

4 participants