Is your feature request related to a problem? Please describe.
Currently, when diffing minified, bundled JavaScript code, there's a significant amount of 'noise' because the bundler often changes the minified variable names between builds. This obscures the real changes and makes the diff output less useful for understanding what actually changed.
Describe the solution you'd like
I propose adding a feature to diffsitter that ignores changes in variable/function names within minified JavaScript code. This would drastically reduce the noise in diffs of minified builds, keeping the focus on the actual code changes rather than on the churn of variable names.
Describe alternatives you've considered
As workarounds, I've experimented with git diff algorithms such as patience, histogram, and minimal to reduce the diff size somewhat. Changing the diff algorithm can significantly alter the number of lines in the diff output.
Nonetheless, these approaches still capture variable name changes, which can introduce a substantial amount of 'noise', especially in larger files.
Other potential solutions include pre-processing the files to normalize variable/function names or post-processing the diff output to filter out sections where the only changes involve variable/function names.
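The pre-processing idea can be sketched roughly as follows. This is a minimal, regex-based illustration in Python, not anything diffsitter provides: `normalize_identifiers`, `JS_KEYWORDS`, and the `_vN` naming scheme are all hypothetical. It renames each identifier to a placeholder based on order of first appearance, so two builds that differ only in minified names normalize to the same text. It assumes simple input: it leaves property names after a dot alone, but it does not understand regex literals, `${...}` expressions inside template literals, or object-literal keys, so a real implementation would want an actual parser.

```python
import re

# Partial list of JavaScript reserved words and literals that must not be renamed.
JS_KEYWORDS = frozenset("""
break case catch class const continue debugger default delete do else export
extends false finally for function if import in instanceof let new null return
super switch this throw true try typeof undefined var void while with yield
async await of get set static
""".split())

# Alternation order matters: strings, comments, member accesses and numbers are
# matched first so that only free-standing identifiers reach the "ident" group.
TOKEN = re.compile(r"""
    (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|`(?:\\.|[^`\\])*`)
  | (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<member>\.\s*[A-Za-z_$][\w$]*)
  | (?P<number>\d[\w.]*)
  | (?P<ident>[A-Za-z_$][\w$]*)
""", re.VERBOSE | re.DOTALL)


def normalize_identifiers(source: str) -> str:
    """Rename identifiers to _v0, _v1, ... in order of first appearance."""
    names: dict[str, str] = {}

    def rename(match: re.Match) -> str:
        text = match.group(0)
        # Leave everything except non-keyword identifiers untouched.
        if match.lastgroup != "ident" or text in JS_KEYWORDS:
            return text
        if text not in names:
            names[text] = f"_v{len(names)}"
        return names[text]

    return TOKEN.sub(rename, source)


# Two hypothetical builds that differ only in minified names.
build_a = 'function f(a,b){return a+b.length+"a"}'
build_b = 'function g(x,y){return x+y.length+"a"}'
assert normalize_identifiers(build_a) == normalize_identifiers(build_b)
```

Running both builds through a normalizer like this before handing them to any diff tool would hide pure renames, at the cost of the diff showing placeholder names instead of the original ones.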
Additional context
The ideal solution would provide diff output in text format, but the actual diffing would occur at the AST level, ignoring variable/function name changes.
I suspect this might already be possible (at least to some degree) with the following, though I haven't yet found any good examples or docs explaining how to use it:
You can filter the nodes that are considered in the diff by setting include_nodes or exclude_nodes in the config file. exclude_nodes always takes precedence over include_nodes, and the type of a node is the kind of a tree-sitter node.
This feature currently only applies to leaf nodes, but we could exclude nodes recursively if there's demand for it.
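To make the described semantics concrete, here is a small Python model of filtering leaf nodes by kind, with exclusion taking precedence over inclusion. It is only a sketch of the behavior quoted above, not diffsitter's implementation: `Leaf` and `filter_leaves` are hypothetical names, and the node kinds `identifier` and `number` are assumed to match what the tree-sitter JavaScript grammar emits.

```python
from typing import Iterable, List, NamedTuple, Optional


class Leaf(NamedTuple):
    kind: str  # tree-sitter node kind, e.g. "identifier" (assumed)
    text: str  # source text covered by the leaf


def filter_leaves(
    leaves: Iterable[Leaf],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[Leaf]:
    """Keep leaves whose kind is included and not excluded.

    A kind listed in `exclude` is always dropped, even if it is also in `include`.
    When `include` is None, every kind that is not excluded is kept.
    """
    include_set = set(include) if include is not None else None
    exclude_set = set(exclude or ())
    kept = []
    for leaf in leaves:
        if leaf.kind in exclude_set:
            continue  # exclusion wins over inclusion
        if include_set is not None and leaf.kind not in include_set:
            continue
        kept.append(leaf)
    return kept


# Leaves for `a+1` and `b+1`: they differ only in the identifier.
old = [Leaf("identifier", "a"), Leaf("+", "+"), Leaf("number", "1")]
new = [Leaf("identifier", "b"), Leaf("+", "+"), Leaf("number", "1")]
assert filter_leaves(old, exclude=["identifier"]) == filter_leaves(new, exclude=["identifier"])
```

Under this model, excluding identifier leaves makes a pure rename invisible to the diff, which is the effect the feature request is after. Whether the real option behaves this way on minified bundles still needs to be tried.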
I'm hoping to play around with it a bit more now, but wanted to capture this while it was fresh in my mind.
This fits well with an idea I had before: allowing users to supply tree-sitter queries to filter which nodes are diffed. That is general enough that you could filter for or against certain node types and, for example, ignore variable names.