[x2cpg] Unification of Lambda Naming #3831
* Added `io.joern.x2cpg.AstCreatorBase.nextClosureName` to generate names for closures/lambdas/anonymous functions.
* Chose the naming scheme following Kotlin/Python using `<lambda>0`, `<lambda>1`, etc., due to its low likelihood of collisions with real source code method naming schemes, and because it doesn't include special regex characters.
* Replaced naming conventions with this unified one in:
  - c2cpg
  - kotlin2cpg
  - javasrc2cpg
  - jssrc2cpg
  - php2cpg
  - pysrc2cpg

The result is that all lambdas in the CPG now share the same naming scheme.

Resolves #3792
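For readers skimming the thread, here is a minimal sketch of what a counter-based name generator along these lines could look like. `ClosureNamer` is a hypothetical stand-in used only for illustration; it is not the actual `AstCreatorBase` code, and the real implementation may differ in how it stores the counter.

```scala
// Hypothetical stand-in for an AST creator; NOT Joern's actual class.
final class ClosureNamer {
  // One counter per instance, so each processed file starts from 0.
  private var nextIndex = 0

  /** Returns "<lambda>0", "<lambda>1", ... in call order. */
  def nextClosureName(): String = {
    val name = s"<lambda>$nextIndex"
    nextIndex += 1
    name
  }
}

object ClosureNamerDemo {
  def main(args: Array[String]): Unit = {
    val fileA = new ClosureNamer // assumed: one namer per source file
    val fileB = new ClosureNamer
    println(fileA.nextClosureName())
    println(fileA.nextClosureName())
    println(fileB.nextClosureName())
  }
}
```

Because the counter lives in the instance rather than in shared state, the names produced for one file do not depend on the order in which other files are processed.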
Wait.. I guess with this approach the names are not stable across multiple runs/files in parallel.
Did you check this? Maybe I am wrong here.
That's a good point. It looks like each AstCreator instance uses a separate keypool which should deal with cross-file consistency (this is similar to what javasrc2cpg was doing before without issues that I've noticed), but it's worth confirming that it is stable.
Apologies for the close. I fumbled and accidentally hit "Close with comment" instead of just comment.
Yeah, it's bound to
@maltek FYI: when this goes live (i.e., jssrc2cpg-internal is also updated) the sptests expectations for js2cpg and jssrc2cpg will differ a bit more from each other.
Not just sptests, that's fine. (Nobody cares about the method names in those...) This will have some customer impact for us since stable method full names are a requirement for tracking findings across different scans. @bbrehm you're the fingerprint guy - will this only affect findings where the source or sink is a lambda, or also any findings where it's anywhere on the path? (I think most of those frontends aren't considered GA yet, so we might still be able to get away with such a change. But it requires a discussion.) I don't particularly like that this is per-file now everywhere. At least jssrc had a per-method counter for this, which means that a code change of a lambda only affected stability of the fullnames within that method - the fullnames in the rest of that file staying as they were before. Ideally, we'd move all frontends in that direction instead of the opposite one.
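To make the stability concern concrete, here is a rough sketch comparing a per-file counter with a per-method counter. Everything in it is an assumption for illustration: the `owner.<lambda>N` full-name shape, the `CounterScopes` object, and the modelling of a file as a list of enclosing method names are hypothetical and do not mirror any frontend's actual code.

```scala
import scala.collection.mutable

object CounterScopes {
  // Input: the enclosing method name of each lambda, in source order.
  // The "owner.<lambda>N" shape is illustrative only.

  /** One counter shared by the whole file. */
  def perFile(lambdaOwners: Seq[String]): Seq[String] =
    lambdaOwners.zipWithIndex.map { case (owner, i) => s"$owner.<lambda>$i" }

  /** One counter per enclosing method. */
  def perMethod(lambdaOwners: Seq[String]): Seq[String] = {
    val counters = mutable.Map.empty[String, Int].withDefaultValue(0)
    lambdaOwners.map { owner =>
      val i = counters(owner)
      counters(owner) = i + 1
      s"$owner.<lambda>$i"
    }
  }
}

object CounterScopesDemo {
  def main(args: Array[String]): Unit = {
    val before = Seq("foo", "bar")        // one lambda in foo, one in bar
    val after  = Seq("foo", "foo", "bar") // a second lambda added to foo
    println(CounterScopes.perFile(before).last)   // name of bar's lambda before
    println(CounterScopes.perFile(after).last)    // name of bar's lambda after
    println(CounterScopes.perMethod(before).last)
    println(CounterScopes.perMethod(after).last)
  }
}
```

With the per-file counter, adding a lambda to `foo` renames the untouched lambda in `bar`; with the per-method counter, the name in `bar` stays the same.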
Alternatively, we could add a modifier like Either solution works for me, the modifier direction may be less intrusive.
This is not a pressing issue, it's more of a larger project towards cleaning up. Lambda and module-defining methods seem to follow different patterns in each frontend. For the CPG to be more of a uniform abstraction, I'd say that these structures should either share the same naming conventions or have a defining property.
for the use-case of finding all lambdas, a modifier feels cleaner to me anyway than a regex search 👍 (Though I would bikeshed the naming a bit - if it's about unnamed functions, I would go for something like
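A small sketch of the two lookup styles being compared, under loud assumptions: `MethodInfo`, `LambdaLookup`, and the tag string are hypothetical placeholders, the tag is not the modifier name proposed in this thread, and `closure_0` is an invented example of a name that does not follow the unified scheme. None of this is the CPG query API.

```scala
// Hypothetical, simplified method record; not a CPG node type.
final case class MethodInfo(name: String, modifiers: Set[String])

object LambdaLookup {
  // Placeholder tag for illustration; not the modifier name from the thread.
  val PlaceholderTag = "HYPOTHETICAL_LAMBDA_TAG"

  /** Name-based lookup: relies on every frontend using "<lambda>N". */
  def byName(methods: Seq[MethodInfo]): Seq[MethodInfo] =
    methods.filter(_.name.matches("<lambda>\\d+"))

  /** Modifier-based lookup: independent of how the method is named. */
  def byModifier(methods: Seq[MethodInfo]): Seq[MethodInfo] =
    methods.filter(_.modifiers.contains(PlaceholderTag))
}

object LambdaLookupDemo {
  def main(args: Array[String]): Unit = {
    val methods = Seq(
      MethodInfo("<lambda>0", Set(LambdaLookup.PlaceholderTag)),
      MethodInfo("closure_0", Set(LambdaLookup.PlaceholderTag)), // invented non-unified name
      MethodInfo("main", Set.empty)
    )
    println(LambdaLookup.byName(methods).map(_.name))
    println(LambdaLookup.byModifier(methods).map(_.name))
  }
}
```

The name-based filter misses any lambda whose name deviates from the scheme, while the tag-based filter keeps working even if full names have to stay as they are for stability reasons.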
It still would be nice if we could unify these names, just to have things cleaner... with fullnames that's just breaking a deep assumption on the qwiet side :(
@maltek Let me keep this PR open for a while and prioritize the modifier then. That way, we can migrate to checking the modifier to retrieve these first.
Related PR #3842
I'd like to merge this at some point tomorrow. That way, this week can conclude the uniform representation of lambdas in the CPG.