Quoted heredoc delimiter gives parse error #104

Closed
Sjord opened this issue Aug 27, 2021 · 7 comments · Fixed by #121

@Sjord
Contributor

Sjord commented Aug 27, 2021

<?php

$header = <<<"EOD"
this is it
EOD;

var_dump($header);

Expected:

(program [0, 0] - [7, 0]
  (php_tag [0, 0] - [0, 5])
  (expression_statement [2, 0] - [4, 4]
    (assignment_expression [2, 0] - [4, 3]
      left: (variable_name [2, 0] - [2, 7]
        (name [2, 1] - [2, 7]))
      right: (heredoc [2, 9] - [4, 3])))
  (expression_statement [6, 0] - [6, 18]
    (function_call_expression [6, 0] - [6, 17]
      function: (name [6, 0] - [6, 8])
      arguments: (arguments [6, 8] - [6, 17]
        (argument [6, 9] - [6, 16]
          (variable_name [6, 9] - [6, 16]
            (name [6, 10] - [6, 16])))))))

Actual:

(program [0, 0] - [7, 0]
  (php_tag [0, 0] - [0, 5])
  (expression_statement [2, 0] - [4, 4]
    (binary_expression [2, 0] - [4, 3]
      left: (variable_name [2, 0] - [2, 7]
        (name [2, 1] - [2, 7]))
      (ERROR [2, 8] - [2, 11])
      (ERROR [2, 13] - [3, 10]
        (encapsed_string [2, 13] - [2, 18]
          (string [2, 14] - [2, 17]))
        (name [3, 0] - [3, 4])
        (name [3, 5] - [3, 7])
        (name [3, 8] - [3, 10]))
      right: (name [4, 0] - [4, 3])))
  (expression_statement [6, 0] - [6, 18]
    (function_call_expression [6, 0] - [6, 17]
      function: (name [6, 0] - [6, 8])
      arguments: (arguments [6, 8] - [6, 17]
        (argument [6, 9] - [6, 16]
          (variable_name [6, 9] - [6, 16]
            (name [6, 10] - [6, 16])))))))
test2.php	0 ms	(ERROR [2, 8] - [2, 11])
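
For readers reproducing this, a minimal sketch of the three opener forms PHP accepts (bare, double-quoted, and single-quoted for nowdoc) may help. It is written in Python purely for illustration and is not the grammar's scanner. The names `OPENER` and `classify_opener` are hypothetical, and the identifier pattern is simplified to ASCII.

```python
import re

# Opening line of a PHP heredoc/nowdoc: <<<ID, <<<"ID" or <<<'ID'.
# Assumption: identifiers are ASCII only (PHP also allows bytes 0x80-0xff).
OPENER = re.compile(
    r'<<<[ \t]*(?P<quote>["\']?)(?P<id>[A-Za-z_][A-Za-z0-9_]*)(?P=quote)$'
)

def classify_opener(line):
    """Return (kind, identifier) for a heredoc/nowdoc opener, or None."""
    match = OPENER.search(line)
    if match is None:
        return None
    # Only single quotes turn the construct into a nowdoc; a double-quoted
    # delimiter is still an ordinary heredoc.
    kind = "nowdoc" if match.group("quote") == "'" else "heredoc"
    return kind, match.group("id")
```

With this sketch, the quoted opener from the report classifies the same way as the unquoted one, which is the behaviour the expected tree above assumes.
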
cfroystad added the bug label Aug 27, 2021
@cfroystad
Collaborator

I'm planning a full overhaul of heredoc to support string interpolation there as well.
Will add this to the list - unless someone else beats me to it 😄

@claytonrcarter

@cfroystad when you start overhauling heredocs (and nowdocs?) is there any chance that you'll also be adding more child nodes to them? As I understand it, the (heredoc) node currently covers the full heredoc expression, with no easy way to query or select just the start/end identifiers, or just the actual string content. Being able to do so would make highlighting and injection easier for my pet project over at atom/language-php#438.

For example, Atom currently allows the injection and highlighting of SQL/HTML/JS/JSON/XML in heredocs if the heredoc identifier matches one of those. This would be nice to have. Thank you!

@cfroystad
Collaborator

I've yet to look into it in detail (quite busy at the moment), but my current thinking is that since heredoc is basically a slightly more advanced double-quoted string in PHP, it should have the same node structure as the encapsed_string.

The current heredoc implementation should (after some fixes) be a viable nowdoc implementation (disclaimer: unverified, off the top of my head).

I'm not sufficiently familiar with the tree-sitter family to fully understand the ramifications of your last paragraph. Could you point me to some relevant documentation or an example to help me understand this better? That way I'll try to account for it when I get started on this task.

@Sjord
Contributor Author

Sjord commented Sep 2, 2021

We currently have this:

heredoc [2, 9] - [4, 3]

That single node covers the entire thing. As I understand it, @claytonrcarter wants something more like this:

(heredoc [2, 9] - [4, 3]
  start: (name [2, 9] - [2, 13])
  string: (string [3, 0] - [3, 30])
  end: (name [4, 0] - [4, 3]))

Then they can check the content of the start and end delimiters, and highlight the body in the language those delimiters name. E.g.

$a = <<< SQL
SELECT * FROM table
SQL; 

Here the contents of the heredoc will be highlighted as SQL, which is indicated by the delimiters. To implement this, they need an easy way to extract the delimiters from the parse tree.
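
To make "extract the delimiters from the parse tree" concrete, here is a rough sketch of turning tree-sitter style `[row, column]` ranges into text. It is plain Python, not the tree-sitter API. The helper `slice_range` and the ranges used with it are hypothetical and hand-computed for the SQL example, and columns are treated as character offsets, which only matches byte offsets for ASCII source.

```python
def slice_range(source, start, end):
    """Return the text between two (row, col) points; the end point is exclusive."""
    lines = source.split("\n")
    (start_row, start_col), (end_row, end_col) = start, end
    if start_row == end_row:
        return lines[start_row][start_col:end_col]
    # Multi-line range: tail of the first row, whole middle rows, head of the last row.
    parts = [lines[start_row][start_col:]]
    parts.extend(lines[start_row + 1:end_row])
    parts.append(lines[end_row][:end_col])
    return "\n".join(parts)

SOURCE = "$a = <<<SQL\nSELECT * FROM table\nSQL;\n"

# Hypothetical child ranges, shaped like the start/string/end nodes proposed above.
delimiter = slice_range(SOURCE, (0, 8), (0, 11))
body = slice_range(SOURCE, (1, 0), (1, 19))
```

With separate child nodes, a consumer only needs the `delimiter` slice to decide on a language and the `body` slice to hand to that language's highlighter.
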

@claytonrcarter

Oh, a few hours later and that last paragraph of mine isn't clear at all, but yes @Sjord is correct on what I'm asking for. Thank you both!

cfroystad linked a pull request Jan 31, 2022 that will close this issue
@cfroystad
Collaborator

It took a while before I found the time, but I think #121 provides what you asked for. Any additional testing and review would be great!

@claytonrcarter

Thank you @cfroystad for this! I have not had a chance to actually play w/ it, but based on what I see in the test files, yes, this should work nicely for my needs. If I'm reading this correctly, I could look for any heredoc node, then use the value of heredoc_start to determine what, if any, language should be injected for highlighting/parsing of the heredoc_body. For example, if the author uses a heredoc identifier of HTML, I could instruct the editor to parse/highlight the contents of the heredoc as HTML.
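
The lookup described above could be sketched roughly as follows. This is not the tree-sitter API: `Node` is a hypothetical stand-in for a parsed tree, and the sketch assumes that `heredoc_start` and `heredoc_body` are direct children of `heredoc` and that `heredoc_start` holds only the identifier, neither of which has been checked against #121. The language table mirrors the SQL/HTML/JS/JSON/XML list mentioned earlier in the thread.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical stand-in for a syntax node: a type, its text, and children."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)

# Heredoc identifiers that trigger an injection (assumed exact, case-sensitive match).
INJECTABLE = {"SQL": "sql", "HTML": "html", "JS": "javascript",
              "JSON": "json", "XML": "xml"}

def find_injections(node):
    """Yield (language, body_text) for each heredoc whose identifier is recognized."""
    if node.type == "heredoc":
        by_type = {child.type: child for child in node.children}
        start = by_type.get("heredoc_start")
        body = by_type.get("heredoc_body")
        if start is not None and body is not None:
            language = INJECTABLE.get(start.text)
            if language is not None:
                yield language, body.text
    # Recurse so heredocs nested anywhere in the tree are found.
    for child in node.children:
        yield from find_injections(child)
```

A heredoc whose identifier is not in the table, such as `EOD`, simply yields nothing and would be highlighted as a plain string.
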

If I get a chance to play with it before merge, I'll be sure to let you know what I find. Thanks again for working on this!

aryx closed this as completed in #121 Jun 22, 2022