Paragraph mode (aka paragrep, multiline) #171
Comments
(The Slack discussion recorded in #171 and #172 is from 2018-11-29: https://beyondgrep.slack.com/archives/C4J886HT2/p1543503390000600 )
🔮 I am curious how such an issue will evolve further.
This would require significant internal changes to Ack's underpinnings, so I'm not terribly hopeful, but at least Andy put the (feature) flag on instead of closing it as out-of-scope NLP (possibly convinced by the nice whitespace praxis of wrapping code across multiple lines). (There are optimization techniques to avoid doing an ( I'm wishing I'd implemented this using the never-used, removed input-filters feature of Ack2, where it would have been more natural.) FWIW I commented on #333 that I'm now using
Note from renewed Slack monologue.
(1) If Ack had a Paragraph (pgrep) mode, it would be useful for on-label-use code search (not just off-label-use text repo search), as it would naturally look at a blank-line separated stanza of the code file - blank line e.g., find any
(2) Note I am not suggesting encoding override of brackets by filetype as DWIM.
(3) Top-level bracket matching is of course fallible with quoted brackets in strings but (And alas the obvious workaround of accepting only brackets with only optional whitespace on one side or the other would be mostly better but would miss open or close brackets with inline comment
(4) In Paragraph mode, the difference between ^$ and \A\z and mutation of . under
Today in Slack, Andy daydreamed
to which I replied (after some meandering): anything in the generalized paragrep category is probably good enough as an end-user-driven heuristic for within/near -
it's up to the end-user to know their house style of indenting and commenting. Pattern may be
As noted in #99, `ag` (The Silver Searcher) and Perl both support (respectively) `--multiline` or `-000` paragraph mode, as of course does the eponymous one-trick pony `paragrep`.

While this is most obviously useful for data and NLP use cases (officially unsupported), `-000` is directly applicable to code if coders are using the recommended vertical whitespace for paragraphing their code. If the paragraph-end sequence is specifiable as an optional arg on the option, coders can specify `/^[}]/` or `/^\w*$/` or whatever is the end of a block or sub in their corpus of code (possibly `/^\t[}]/` if they like that extra tab at the end of sub bodies), which would approximate the suggested `--same-subroutine` feature.

(An arg specifying the end-of-para pattern would also support filtering multiline structured data, like EDI-INT tagged-data multiline hierarchical record streams/files.)
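The idea of a user-specified end-of-paragraph pattern can be sketched roughly as follows. This is an illustration in Python, not Ack's Perl implementation; `split_paragraphs`, `grep_paragraphs`, and `SAMPLE` are hypothetical names invented for this example, and including the terminator line in the paragraph it closes is an assumption rather than anything the proposal specifies.

```python
import re

# Hypothetical sample corpus: two Perl subs, each closed by "}" in column 0.
SAMPLE = """sub alpha {
    my $x = 1;

    return $x;
}
sub beta {
    my $y = 2;
    return $y;
}
"""


def split_paragraphs(text, end_pattern=r"^\s*$"):
    """Split text into paragraphs; a line matching end_pattern closes one.

    The default pattern treats a blank line as the terminator. The
    terminator line is kept as the last line of the paragraph it closes
    (an assumption of this sketch).
    """
    end_re = re.compile(end_pattern)
    paragraphs, current = [], []
    for line in text.splitlines():
        current.append(line)
        if end_re.search(line):
            paragraphs.append("\n".join(current))
            current = []
    if current:  # trailing lines with no terminator still form a paragraph
        paragraphs.append("\n".join(current))
    # Drop paragraphs made only of whitespace (e.g. runs of blank lines).
    return [p for p in paragraphs if p.strip()]


def grep_paragraphs(text, pattern, end_pattern=r"^\s*$"):
    """Return every paragraph in which pattern matches somewhere."""
    hit = re.compile(pattern)
    return [p for p in split_paragraphs(text, end_pattern) if hit.search(p)]


if __name__ == "__main__":
    # Using "}" in column 0 as the paragraph end approximates "same subroutine".
    for paragraph in grep_paragraphs(SAMPLE, r"\$y", end_pattern=r"^[}]"):
        print(paragraph)
        print("--")
```

With the default blank-line terminator the same sample splits inside `sub alpha`, which is why the choice of end pattern is left to the user, who knows the house style of the code being searched.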
(As a possible enhancement, we might later allow specifying an end-paragraph pattern for each `--type` in `.ackrc`?)

Paragraph mode would interact with `--and` by providing an additional, medial mode of sameness beyond the obvious smallest and largest extents, same-line and same-file. It would not require an additional option flag, as `--paragraph-mode pattern --and pattern` would naturally switch same-line to same-paragraph by switching the primary object from a line buffer to a paragraph buffer.

Paragraph mode would require that match patterns have the `(?sm)` or `//sm` flags activated, so that `.` matches internal newlines, `^` and `$` match next to internal newlines too, and `\A` and `\z` match the beginning and end of the whole paragraph/chunk. (Just as PBP says to always do.)

(Which means `foo$.*^bar` is not meaningless in `--paragraph` mode, same as it becomes meaningful in Perl with `qr{}sm` and likely `-000`.)
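The effect of the `sm` flags on a paragraph buffer, and the same-paragraph reading of `--and`, can be illustrated with a small sketch. It uses Python's `re` module as a stand-in for Perl's regex engine: `re.DOTALL | re.MULTILINE` plays the role of `/sm`, and Python spells Perl's `\z` as `\Z`. The names `FLAGS`, `PARAGRAPH`, and `paragraph_matches_all` are hypothetical and exist only for this example.

```python
import re

# Python analogue of Perl's /sm: "." matches newlines, "^"/"$" match at
# internal line boundaries.
FLAGS = re.DOTALL | re.MULTILINE

# A hypothetical three-line paragraph held in one buffer.
PARAGRAPH = "call foo\nthen wait\nbar follows"


def paragraph_matches_all(buffer, patterns):
    """True when every pattern matches somewhere in the same buffer.

    Passing a whole paragraph gives same-paragraph semantics for an
    AND of patterns; passing a single line gives same-line semantics.
    """
    return all(re.search(p, buffer, FLAGS) for p in patterns)


if __name__ == "__main__":
    # foo at end of one line, bar at start of a later line: only
    # meaningful when the buffer holds several lines and /sm is on.
    print(bool(re.search(r"foo$.*^bar", PARAGRAPH, FLAGS)))
    print(bool(re.search(r"foo$.*^bar", PARAGRAPH)))

    # "^" matches after an internal newline, "\A" only at buffer start.
    print(bool(re.search(r"^bar", PARAGRAPH, FLAGS)))
    print(bool(re.search(r"\Abar", PARAGRAPH, FLAGS)))

    # AND of two patterns: same-paragraph versus same-line.
    print(paragraph_matches_all(PARAGRAPH, ["foo", "bar"]))
    print(any(paragraph_matches_all(line, ["foo", "bar"])
              for line in PARAGRAPH.split("\n")))
```

The only thing that changes between the same-line and same-paragraph cases is the buffer handed to the matcher, which matches the observation that switching the primary object from a line to a paragraph needs no extra option flag.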