Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EEP 70: Implement non-skipping generators (as an experimental feature) #8625

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

dszoboszlay
Copy link
Contributor

@dszoboszlay dszoboszlay commented Jun 28, 2024

First of all, this is a big PR and it's not 100% completed either. However, I'd like to get some feedback from you on how to proceed. I'll try to explain everything in this description.

What's in this PR?

  • The first two commits are minor fixes that are not strictly related to the rest of the PR. These could go on to the maint branch too, and if you're ok with them, I can create a separate PR for them. However, I also need these commits here to avoid merge conflicts with adjacent changes.
  • The bulk of the work is in the [WIP] labelled commit: I'm implementing a new language feature, non-skipping generators, for the comprehensions that corresponds to EEP 70, outline in Add proposal for non-skipping generators eep#62. I wanted to do this as an experimental feature, so it could be dropped if nobody else likes my idea (a quick informal survey across my colleagues showed that they would like to have it, but they are not a representative sample of the Erlang community), but I have questions about how experimental features are intended to work. More on that below.
  • Finally, the last commit is the result of running ./otp_build update_primary (because I changed files in stdlib and compiler too). Here my question is whether I'm supposed to check in these compiled modules myself? I thought you may rather want to run this step in a trusted environment (like in CI) instead of on my laptop, but I'm not familiar with the process.

What are non-skipping generators?

Currently existing generators are "skipping": they ignore terms in the right-hand side expression that do not match the left-hand side pattern. Non-skipping generators on the other hand fail with exception badmatch.

The motivation for non-skipping generators is that skipping generators can hide the presence of unexpected elements in the input data of a comprehension. For example consider the below snippet:

[{User, Email} || #{user := User, email := Email} <- all_users()]

This list comprehension would skip users that don't have an email address. This may be an issue if we suspect potentially incorrect input data, like in case all_users/0 would read the users from a JSON file. Therefore cautious code that would prefer crashing instead of silently skipping incorrect input would have to use a more verbose map function:

lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
          all_users())

Unlike the generator, the anonymous function would crash on a user without an email address. Non-skipping generators would allow similar semantics in comprehensions too:

[{User, Email} || #{user := User, email := Email} <-:- all_users()]

This generator would crash (with a badmatch error) if the pattern wouldn't match an element of the list.

Syntactically non-skipping generators use <-:- (for lists and maps) and <=:= (for binaries) instead of <- and <=. This syntax was chosen because <-:- and <=:= resemble the =:= operator that tests whether two terms match. Since <-:- and <=:= were invalid syntax in previous versions of Erlang, they avoid backward compatibility issues.

Nevertheless, I aim to add non-skipping generators as an experimental feature, that has to be explicitly enabled:

 -feature(non_skipping_generators,enable).

What is left to do?

While I've already declared the non_skipping_generators feature, in practice it doesn't work (the feature is never disabled), because I'm not sure how to best implement the feature check?

The only feature Erlang had so far introduced a new keyword, maybe, and made it possible to enable/disable the feature based on this keyword. Namely epp:parse_file/2 gained a new reserved_word_fun option that allows customizing what unquoted atoms should be treated as keywords. erl_features also assumes features may introduce new keywords and has API-s for dealing with them.

However, the non-skipping generator features doesn't introduce any new features, but new operators: <-:- and <=:=. Neither erl_features nor epp:parse_file/2 are prepared to deal with new operators unfortunately.

I see a number of possible ways forward, but I'd need some guidance on which one would you prefer (or if you have a better idea):

  • Make new operators a first-class citizen in erl_features. On one hand this would make sense, since I believe features introducing new operators are not less likely to come than features introducing new keywords. However, erl_scan currently implements a hand-rolled scanner (I guess for performance reasons) that would be quite hard to change to accept arbitrary new operators returned by the erl_features module.
  • An other option would be to pass down to erl_scan the list of currently enabled features too. Then the logic about what operators to support per feature could be baked in to the hand-rolled scanner code.
  • A third option would be to allow the scanner to always accept optional operators, and raise an error in a later stage of the compilation. But I don't know which phase would be suitable?

The issue of features and operators aside I'd need some feedback on whether I found all the places of the code base to update? I based my work on #5411 and the various PR-s related to the introduction of map comprehensions, but this is my first PR changing the language itself, so I may very well have missed something. And there are places, like the erlang.el Emacs code where I'm basically just shooting in the dark (I have no experience with writing Emacs Lisp at all).

Finally, I verified that the tests I changed do pass, but haven't run the entire OTP test suite myself. I'll keep an eye on CI results of course!

Copy link
Contributor

github-actions bot commented Jun 28, 2024

CT Test Results

     8 files     575 suites   1h 29m 40s ⏱️
 4 399 tests  4 278 ✅ 120 💤 1 ❌
10 361 runs  10 214 ✅ 146 💤 1 ❌

For more details on these failures, see this check.

Results for commit b18813d.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

@dszoboszlay
Copy link
Contributor Author

I see only one test failure here:

/buildroot/otp/Erlang ∅⊤℞/lib/stdlib-6.0/src/peer.erl:51:2: can't find moduledoc file "../doc/src/peer.md"
%   51| -moduledoc({file, "../doc/src/peer.md"}).
%     |  ^

Doesn't seem to be related to my changes?

@jhogberg jhogberg added the team:LG Assigned to OTP language group label Jul 1, 2024
@dszoboszlay dszoboszlay changed the title Implement non-skipping generators (as an experimental feature) EEP 70: Implement non-skipping generators (as an experimental feature) Jul 2, 2024
@dszoboszlay
Copy link
Contributor Author

Created an EEP about the proposed feature: erlang/eep#62

@jhogberg jhogberg added the stalled waiting for input by the Erlang/OTP team label Jul 15, 2024
Currently existing generators are "skipping": they ignore terms in
the right-hand side expression that do not match the left-hand side
pattern. Non-skipping generators on the other hand fail with
exception badmatch.

The motivation for non-skipping generators is that skipping generators
can hide the presence of unexpected elements in the input data of a
comprehension. For example consider the below snippet:

[{User, Email} || #{user := User, email := Email} <- all_users()]

This list comprehension would skip users that don't have an email
address. This may be an issue if we suspect potentionally incorrect
input data, like in case all_users/0 would read the users from a
JSON file. Therefore caucious code that would prefer crashing
instead of silently skipping incorrect input would have to use
a more verbose map function:

lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
          all_users())

Unlike the generator, the anonymous function would crash on a user
without an email address. Non-skipping generators would allow
similar semantics in comprehensions too:

[{User, Email} || #{user := User, email := Email} <:- all_users()]

This generator would crash (with a badmatch error) if the pattern
wouldn't match an element of the list.

Syntactically non-skipping generators use <:- (for lists and maps)
and <:= (for binaries) instead of <- and <=. This syntax was chosen
because `<:-` and `<:=` somewhat resemble the `=:=` operator that
tests whether two terms match, and at the same time keep the operators
short and easy to type. Having the two types of operators differ by
a single character, `:`, also makes the operators easy to remember as
"`:` means non-skipping."

Nevertheless, non-skipping generators are added as an experimental
feature, that has to be explicitly enabled:

-feature(non_skipping_generators,enable).

change operators
@dszoboszlay
Copy link
Contributor Author

Updated with syntax proposed by @kikofernandez : <:- and <:=

@kikofernandez
Copy link
Contributor

I will look at this once the EEP is approved and we have some feedback from the community
I will wait a week or so to see if we can gather feedback from ErlangForums and other places.

Thanks for your contribution

@kikofernandez kikofernandez self-assigned this Sep 16, 2024
@kikofernandez
Copy link
Contributor

Sorry for the delay.
We will schedule an internal discussion in two weeks time.
Expect news after two weeks.

@bjorng bjorng self-assigned this Oct 3, 2024
@bjorng
Copy link
Contributor

bjorng commented Oct 3, 2024

OTB has approved this PR (see this Erlang Forum post).

Please remove the code defining this as a feature. Also remove the update of the primary bootstrap; we will take care of that.

@dszoboszlay
Copy link
Contributor Author

Thank you for the approval! I'll probably have time next week to make the requested changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stalled waiting for input by the Erlang/OTP team team:LG Assigned to OTP language group
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants