Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHPDoc types sniff #123

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

PHPDoc types sniff #123

wants to merge 8 commits into from

Conversation

james-cnz
Copy link

@james-cnz james-cnz commented Mar 17, 2024

This is a replacement for the old Moodle PHPdoc check (moodlecheck) checks relating to PHPDoc types. It's a port of moodlehq/moodle-local_moodlecheck#120 .

It's a significant improvement over the old checks (at least last time I looked at them). It supports DNF types, and checks types for validity, recognises namespaces, use aliases, and inheritence and implementation of classes and interfaces in the current file.

I've pulled across the PHPDoc type specific fixtures from the old checker, and my checker reports the contained issues, although not with the same messages. I'm not sure if there's some standard to be followed with error messages and codes?

Is someone else OK with removing the corresponding checks from the old checker, or should I do this?

Things checked

Classes

  • Templates names are present and valid.
  • Template types, if present, are well formed.

Functions

  • Named functions have a PHPDoc comment. (Perhaps this doesn't ultimately belong in the PHPDoc types sniff, but I don't think it's currently checked elsewhere.)
  • Template names are present and valid.
  • Template types, if present, are well formed.
  • Number of parameters matches.
  • Parameter types are present and well formed.
  • Parameter names are present and valid.
  • Parameter types match.
  • Parameter splat usages matches.
  • Parameter names match.
  • There aren't multiple @return tags.
  • Return type is present and well formed.
  • Return type matches.

Variables

  • Class variables and constants have a PHPDoc comment. (Perhaps this doesn't ultimately belong in the PHPDoc types sniff, but I don't think it's currently checked elsewhere.)
  • There is at least one @var tag.
  • And not more than one.
  • The type is well formed.
  • The type matches.

Misc

  • The PHPDoc types sniff successfully parsed the file. (This may be annoying if multiple sniffs report failing to parse the file, but I think it's better than silently failing, and giving the impression that the sniffs passed.)

Things not checked

  • Function parameter pass by reference '&' usage matches. (The old checker disallows this in PHPDoc comments entirely, so requiring it to match would make it impossible to satisfy both checkers.)
  • Function return tag is given other than for constructors. (The old checker doesn't check this.)
  • Tags (@template, @param, @return, @var) aren't used in inappropriate places. (The old checker doesn't check this.)
  • A @const tag isn't used. (I guess this is part of a check for invalid tags generally, and doesn't belong in the PHPDoc type sniff? But in any case, an error will be given because of a missing @var tag, so this is probably enough.)

Copy link

codecov bot commented Mar 18, 2024

Codecov Report

Attention: Patch coverage is 92.71772% with 97 lines in your changes are missing coverage. Please review.

Project coverage is 96.10%. Comparing base (02a279e) to head (43f8ce0).

Files Patch % Lines
moodle/Sniffs/Commenting/PHPDocTypesSniff.php 89.97% 73 Missing ⚠️
moodle/Util/PHPDocTypeParser.php 96.02% 24 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #123      +/-   ##
============================================
- Coverage     97.90%   96.10%   -1.80%     
- Complexity      850     1420     +570     
============================================
  Files            37       39       +2     
  Lines          2524     3856    +1332     
============================================
+ Hits           2471     3706    +1235     
- Misses           53      150      +97     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@andrewnicols andrewnicols left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi James,

Thanks for your work on this. I recognise that this is not an insignificant amount of work and it is appreciated.

That said, in it's current form, I'm not sure that this is the right approach to take for this change.

phpcs is quite a different (and more mature) beast to moodlecheck and runs in a fairly different way. It's generally not good to store state in the sniff, and it is intended to be paired with fixing rules where possible.

We also make use of a couple of libraries to reduce the Sniff boilerplate, and I think that some of those helpers may perform some of the jobs you have here. In particular we make use of PHPCSUtils.

I'd highly recommend taking a look at:

I've added some initial comments below and I'll continue to work my way through this change in the mean time.

class PHPDocTypesSniff implements Sniff
{
/** @var ?File the current file */
protected ?File $file = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to track the file?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case there's multiple PHP blocks with HTML content between them, so the checker doesn't run multiple times.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would it run multiple times anyway?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PHP used to be a template language. A .php file would largely be an HTML file, but with bits of PHP embedded in it, with opening and closing tags around it. There could be lots of PHP opening and closing tags in a file. Now days, if it's used at all, it's usually the other way around, I think. The file will largely be PHP, but with PHP closing and reopening tags around blocks of HTML. Most Moodle files don't have this, but some do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It used to be a template language, 20 years ago. It's fairly rare to use it that way nowadays. You say you've only found 2 cases in Moodle (and one in a third-party lib) out of thousands of PHP files. I think it better to focus on the code we want, not the code that is in the minority.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, those were just some examples. It looks like there's 35 files (as of Moodle 4.1), with between 2 and 67 PHP open tags.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, there's an easy way to fix it. Just look back in the file to see if there's any earlier PHP open tags. Obviously a very quick check in most cases. :-) Then I won't need to compare the file pointer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently the process() function can return where it got up to in the file, then PHPCS won't call it again before that point. So I've just returned the end of the file now.


/** @var array{'code': ?array-key, 'content': string, 'scope_opener'?: int, 'scope_closer'?: int}[]
* file tokens */
protected array $tokens = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sniff should not store state at all.

Please pass vars around between member methods.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, hold off on checking more. I'll have a go at rewriting the sniff.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started rewriting to pass the scopes between methods in a meaningful way (and I think it'll be nicer this way). The rest are really global variables, though, I think. If it's a technical issue, I could wrap them in an object, and pass the whole lot everywhere. If it's a style issue, though, then this wouldn't really solve it, because it'd still be globals in spirit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rewritten to pass the scopes, and one of the comment variables. It also no longer stores state between calls to process().

public function process(File $phpcsfile, $stackptr): void {

// Check we haven't already done this file.
if ($phpcsfile == $this->file) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? It will prevent the ability to run a Fix.

phpcs will call the process method once per registered tag.

You've only registered the T_OPEN_TAG tag, and there's only one of those per file. This should not be processed a second time anyway to need this.

Note: phpcbf (code beautifier) does call it a second time. The first run is to find faults, and the second to tell it was to fix and that it should fix them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it doesn't look like Moodle (4.1 at least) has any PHP files with multiple opening tags. This is theoretically possible, though, and I would like to handle it, just in case. Is there a way to tell whether the function's being called to do the checks, or apply the fixes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, no, wait, I did the check wrong. There are some, e.g.:
/admin/mnet/access_control.php
/h5p/h5plib/v124/joubel/core/embed.php
/mnet/service/enrol/course.php

return;
}

try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the try/catch?
Why so large? Try/catch should ideally only cover the smallest item.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just give up on the sniff if there's a parse error. It's the easiest thing to do, continuing might produce some spurious error messages, and the user's probably going to want to fix the parse error before doing anything else anyway. Also, I'm worried that if a later version of PHP breaks the parser, silently failing will mean we don't know there's a problem. Do you think I should continue to try to check the file?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It continues after an error now.

* @phpstan-impure
*/
protected function processPass(): void {
$this->scopes = [(object)['type' => 'root', 'namespace' => '', 'uses' => [], 'templates' => [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We adopt PSR-12 in this repo. Multiline var declarations and standard 4-char indent, not the legacy Moodle style we've defuncted.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know anything about style. The only PHP I've worked on is Moodle stuff, and all I've ever done regarding style is make the changes suggested by the checkers. Are you saying multiline arrays should have one element per line?

// Skip irrelevant tokens.
while (
!in_array(
$this->token['code'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out the tokens in \PHP_CodeSniffer\Util\Tokens and https://phpcsutils.com/phpdoc/classes/PHPCSUtils-Tokens-Collections.html to see if there is a similar list. I'm not sure what this is intended to do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're talking about the scope_opener / scope_closer stuff, it's to deal with alternative syntax, where a colon can be a scope opener. I had thought this could apply to classes and functions, but looking it up again, perhaps it's just control structures.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, perhaps it's good to look at the scope_openers/_closers anyway. It looks like phpcs doesn't mark spurious close curly brackets at the end of a file as scope_closers. If there are any other differences, then looking at the scope_openers/closers instead of token types may make the sniff better match up with the way phpcs parses a broken PHP file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and I'll look at those lists.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really don't want to write a parser within phpcs, which is itself a parser. We shoudl really aim to have one toolset here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I was more referring to the list of inline T_ constants.

See the Tokens COllection and the Tokens in phpcs. These are largely already defined and maintained there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to know what's nested in what to figure out which templates apply. I guess phpcs parses the file and figures this out, but as far as I can tell, there isn't a good way to access this information as an abstract syntax tree or something.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I also need to know the nesting to figure out if a USE token is for importing other PHP files, or class trait usage, or whatever it is that functions do with it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made use of lists from Tokens.

@james-cnz
Copy link
Author

@andrewnicols

Continuing from #121

Regarding comment tags, I'm being pedantic, and checking that the tags start on a new line. Technically, I think param, return, and var tags within a line don't count, so I'm not sure the list would help me. But the issue is with multi-line types like this:

Sorry, what is this in relation to? I'm not sure of the context of your reply.

It was in relation to:

The phpcs parser already does much of this work for us and if you find the T_DOC_COMMENT_OPEN_TAG tag, you'll find that phpcs includes in its structure an array of tags named comment_tags whose value is the ptr for each of the @ tags in the docblock.

I don't think this would help me much, because I'm not sure it would tell me which tags start on a new line.

The general approach to this would be:
identify the start and end of the type
store these, along with the concatenated type
replace the first type part with the new type
replace the rest of them with an empty string

Oh, I guess I could do that. It would mean the fix would create one very long line, losing any formatting they had, and causing a complaint about the line length though. (If I understand the suggestion right.) But if that's acceptable I'll give it a try.

That said, we want to strongly discourage complex objects unless they are a class in their own right. They may be syntactically valid, but they're not a good pattern.

It was a separate class originally, but I converted it to an object because the style checker told me I should only have one class per file. That said, part of it's relevant to the type parser, and part of it isn't, so it wasn't really perfect as a class either.

In fact, the above is not recognised by phpDocumentor in any case. See the attached screenshot.

I've aimed to put in everything that's supported by both PHPStan and Psalm. If there are specific tools that are relevant to Moodle, though, I could remove anything those tools don't support.

@andrewnicols andrewnicols marked this pull request as draft March 19, 2024 03:25
@james-cnz
Copy link
Author

Hi @andrewnicols,

Thanks for your feedback. I've made several changes.

Done:

  • Refactored, so some stuff could be passed between functions, rather than stored in variables.
  • Changed the check for multiple calls on the same file, so it won't break autofixing.
  • Warn about types that don't conform to Moodle style (short type keywords, and lower case), and support autofixing.
  • Made use of token lists.
  • Check class property, property-read, and property-write tags.
  • Allow PHP attributes in between the PHPDoc comment and the thing it's annotating.
  • Check against Moodle master, and required fixes for that.
  • Removed checks to be performed in other sniffs.

Still to do:

  • Remove support for types that aren't supported by phpDocumentor? (I'll have to have a play with it, I haven't used it before.)
  • Increase test coverage.

@james-cnz
Copy link
Author

I've had a bit of a look at phpDocumentor, and it looks like it will parse almost everything accepted by the checks, although it doesn't handle everything correctly. So the checks as they are are already doing something useful--they go a long way towards ensuring that phpDocumentor will run. But some things would have to be taken out to ensure phpDocumentor could handle everything correctly.

@james-cnz
Copy link
Author

It looks like phpDocumentor has just added support for object shapes:
phpDocumentor/TypeResolver@c92bb85

@andrewnicols
Copy link
Contributor

As it happens, I've been starting to look at bringing phpdocumentor/reflector in as part of the existing type hint side of things but I have to put this on the backburner for a week.

So far the issue I've encountered is that the type is normalised during the parsing by reflector, which makes it useless for our purposes.

@james-cnz
Copy link
Author

Sorry, not sure I understand. Is this related to #126, or something different?

@andrewnicols
Copy link
Contributor

andrewnicols commented Mar 26, 2024 via email

@james-cnz
Copy link
Author

Sorry, I meant in the same vein. Is it about simplifying types? If so, is this still needed? phpDocumentor seems to be adding more complex types for its next release. Are there other tools that Moodle developers depend on that don't support the complex types?

@andrewnicols
Copy link
Contributor

andrewnicols commented Mar 26, 2024 via email

@james-cnz
Copy link
Author

Is simplifying the types still needed, now that phpDocumentor is adding support for complex types?

@andrewnicols
Copy link
Contributor

At the moment, yes. That may change in future, but right now we are focused on trying to migrate remaining functionality from moodle-local_moodlecheck here.

Once we have the sniffs migrated, we can start to add more support, and I am looking at using Reflection for that, but it's not something I have time to focus on right now.

@andrewnicols
Copy link
Contributor

See also PHPCSStandards/PHP_CodeSniffer#387

@james-cnz
Copy link
Author

Changes:

  • Added error recovery
  • Less parsing, relying more on phpcs
  • Added check for misplaced property-*, param, return, template tags (I didn't do this originally, because the old checker didn't, but I guess there's no reason to stick to what the old checker did?)
  • More test coverage
  • Checked against phpDocumentor--it seems to have high quality PHPDocs, so good for checking there aren't too many false positives, and fixed some issues found there
  • Removed the check for multiple var tags. phpDocumentor has multiple var tags for variables separated by commas, even though there's only one type, so I guess this is OK.
  • Added BSD licence, since phpcs uses this, I think it's probably a good thing to do.
  • And some tidying

I had a look back over comments, to see if I'd missed anything, and found this:

Aside from everything else... since when are those generics compliant with PHPDoc? [with a link to PHP-FIG's PHPDoc standard.]
#121 (comment)

So is compliance with PHP-FIG's standard a requirement? Is there any chance of re-evaluating this, now that phpDocumentor supports more complex types? If not, would it be OK to still check more complex types, but give a warning if they're used? Or leave the code in, but set a constant that disables it? There's quite a bit in the Sniff that's not in the PHP-FIG standard, and I think it would be good to keep it in case it's supported in the future.

@james-cnz
Copy link
Author

I'll go ahead and add a warning. That'll provide the most informative messages.

@james-cnz james-cnz marked this pull request as ready for review April 2, 2024 04:25
@james-cnz
Copy link
Author

I think I've addressed all functionality issues, and I've made an attempt at style issues.

The sniff still does some parsing, to determine how classes and functions are nested, so that relative class types (self, parent, static, $this) can be checked correctly, and so it knows which templates are in scope. It passes over the file, calling functions to deal with classes and functions as they are encountered, storing relevant data locally, and exiting, thereby disposing of that data, when it encounters the ends of the classes and functions. I think this approach is relatively simple, and sufficient for the purpose.

I've added a check that (most) relevant tags don't appear in inappropriate places. Perhaps this doesn't ultimately belong in this sniff, but it's easy to do here, since a consequence of having to know when tags are expected, is that we also know when they're not, and I'm not sure this check is currently performed elsewhere?

The var tag isn't checked to see if it appears in inappropriate places, though. Apparently, besides variables defined in the usual way, and in "define" statements, var tags can also annotate variables defined as the iterator in foreach statements, and those defined by being returned as a pass by reference parameter that wasn't defined before a function call. Checking this properly would add complexity for diminishing returns, I think, so for var tags that don't precede class variables, I've just checked the type is well formed, and left it at that.

@andrewnicols
Copy link
Contributor

Hi @james-cnz ,

I'm not going to have time to review this change in the next few weeks as we're prepping for Moodle 4.4's release.

I will again reiterate that our current focus on this project is to migrate all existing sniffs in their current form from the moodlecheck project to here, fixing any existing bugs along the way. In some cases we may add some small improvements as we do so, but such improvements are not the focus.

I would strongly suggest waiting until we have done this as some of these changes simplify the sniffs, and we are considering whether to make the PHP Documentor Reflector a dependency of this project, which would likely alleviate the need for some of your changes.

@james-cnz
Copy link
Author

Hi @andrewnicols,

OK. I guess there's nothing really Moodle specific about this sniff. I wonder if phpcs would be interested in it. If part of the issue is time constraints, would you be more likely to consider it if it was available an an off-the-shelf sniff?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants