[RFC 0181] List index syntax #181

Open
wants to merge 5 commits into
base: master

Conversation

rhendric
Member


It is a syntax error if either new production is used in the left-hand side of a binding.

An implementation of this design is available as patches for Nix at <https://gitlab.com/rhendric/nix-list-index-syntax/>; see instructions there for use.
Member

I don't expect much regression, but have you tested current nixpkgs trunk?

Member Author

I tested one build on the nixos-unstable channel as of yesterday, but not literally trunk.

rfcs/0181-list-index-syntax.md (outdated)
sg-qwt commented Jul 15, 2024

When writing Nix code, it is relatively uncommon to want to index into a list, and builtins.elemAt suffices.

My two cents: I'd still prefer the simplicity of builtins.elemAt over introducing new syntax. The usability of nix repl is poor compared to the REPLs of Lisp languages. However, maybe that is something that can be worked on at the tooling level, so that writing and interacting with Nix code in the REPL is the same experience as writing Nix code anywhere else.

Thus, if exploring data in a REPL is the only motivation for introducing this new syntax, I feel it doesn't make a very strong argument in this case.

# Motivation
[motivation]: #motivation
Member

@infinisil infinisil Jul 15, 2024

@sg-qwt Let's try to keep the discussions in threads, so they can be easily marked as resolved later

Another primary use case is builtins.{match,split}, which return matched groups in indexed lists. A GitHub search finds many examples of this. For this use case, though, it might be better to introduce matching group labels, so that (builtins.match "(?<foo>.*)" "bar").foo == "bar" would work.
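
As a rough illustration of that use case, here is how matched groups are reached with builtins.elemAt today. The regex and the input string are made up for this sketch, and the labelled-group form suggested above is hypothetical and not used here.

```nix
let
  # Hypothetical input. builtins.match returns the capture groups as a list,
  # or null when the whole string does not match the regex.
  groups = builtins.match "([a-z]+)-([0-9.]+)" "hello-2.12";
in
# Evaluation aborts if an assertion does not hold.
assert groups == [ "hello" "2.12" ];
assert builtins.elemAt groups 1 == "2.12";
{
  name = builtins.head groups;
  version = builtins.elemAt groups 1;
}
```

Each group beyond the first needs its own elemAt call, which is the pattern an index syntax would shorten.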

rhendric and others added 2 commits July 15, 2024 11:29
Co-authored-by: Yingchi Long
I had previously thought there'd be a conflict between `expr!4` and the prefix `!` operator, but it turns out to be quite manageable without having to deprecate anything.

Finally, there is an opportunity cost to claiming new syntax.
One could imagine speculative features that might want to use this syntax, such as a list or string slicing syntax, or a ‘list swizzle’ operator that desugars `expr.[ 2 0 1 ]` to `[ (elemAt expr 2) (elemAt expr 0) (elemAt expr 1) ]`.
It is, in my opinion, unlikely that list and string manipulation (assuming that any feature in competition for this syntax would involve lists or strings somehow) would be so common in Nix to make this a compelling objection.
Member

To me, this is a major blocker for the proposed syntax. While I agree that list slices are fairly uninteresting, this syntax could be used with tremendous benefits for set slicing, as drafted out here: https://md.darmstadt.ccc.de/nix2?view=#Set-slicing-confidence-mid (ignore the comma-separated lists, which are a separate language improvement proposal)

Therefore, I propose going with the list.NUMERAL approach, at the cost of having to remove the unnecessary .3 floating literals first.
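
For context on why float literals stand in the way, here is a small sketch of how a leading-dot literal behaves. It assumes a Nix version that still accepts such literals, and the function f is made up for illustration.

```nix
# A leading-dot literal such as `.3` is lexed as a float.
assert builtins.typeOf .3 == "float";
assert .3 == 0.3;
let
  # Hypothetical function, only here to show the parse.
  f = x: x * 2;
in
# Because `.3` is a float token, this is the application of f to 0.3,
# not a selection of element 3 from f.
f .3
```

This is why list.NUMERAL cannot be added until the leading-dot float form is removed.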


I'm in agreement. Unifying all the composite type elements under one operator is probably going to be nicer long-term.

The .${numeral} syntax reminds me of JS a bit and is pretty intuitive. If Nix is just high level JSON and we already have .foo, we could have .42 too. (In other words, I get why [numeral] was chosen because JS works that way for both objects and arrays, but Nix doesn't need to have two syntaxes for composite type indexing, I don't think.)


Agreed, [ "hello" "world" ].1 => "world" is the way forward IMO.

builtins.head would be equivalent to list: list.0.

builtins.elemAt would be equivalent to list: idx: list.${idx}.

Member

builtins.elemAt would be equivalent to list: idx: list.${idx}.

I'm not so sure about that part. Basically list.${idx} is syntactically identical to the existing set.${attr}, which means that an implementation would need to make a case distinction at runtime on the type of the selector.
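
To make the runtime case distinction concrete, here is a sketch in today's Nix. The helper name select is hypothetical and only models what a unified .${...} would have to do; it is not how any evaluator implements selection.

```nix
let
  # Hypothetical helper: the lookup is chosen by inspecting the selector's
  # type at evaluation time, because it cannot be known at parse time.
  select = container: key:
    if builtins.isInt key then builtins.elemAt container key
    else container.${key};
in
assert select [ "hello" "world" ] 1 == "world";
assert select { foo = 42; } "foo" == 42;
"ok"
```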


I think it only follows naturally. In { }.foo (and { }."bar") vs [ ].42 we're already differentiating based on the type of the literal, why not also differentiate based on the type of the evaluated ${123} expression?

Member

One of them can be done at parse time, the other one has to be delayed until the expression is forced. Btw I don't know the actual performance impact this might have, but I'd be very wary of it

Member Author

As I describe in the Alternatives section, the main concern with that is less about the evaluator's performance and more about external static analysis tools.

# Alternatives
Member

Another alternative which might be worth mentioning would be foo |> getElem 2 using the pipe operator (RFC #148). I think this may work well enough in a REPL; however, in Nix code this may interfere with existing |> pipelines and thus require parentheses.

Member Author

Even in a REPL, the motivating case of (foo.bar.qux |> getElem 0).moreStuff requires parentheses.
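
The same parenthesization is already needed with builtins.elemAt today, as this sketch shows. The nested data is made up to stand in for a value explored in a REPL.

```nix
let
  # Made-up data standing in for the nested value being explored.
  foo = { bar = { qux = [ { moreStuff = "found"; } ]; }; };
in
# The index must be applied before selection can continue, so the whole
# elemAt call has to be wrapped in parentheses.
assert (builtins.elemAt foo.bar.qux 0).moreStuff == "found";
(builtins.elemAt foo.bar.qux 0).moreStuff
```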

Member

Another alternative which might be worth mentioning would be foo |> getElem 2 using the pipe operator

I agree that the pipe operator is good, but it is not a real alternative in this case.

@@ -0,0 +1,290 @@
---
feature: list-index-syntax
Member

Since we are here, what about extending this to sets?

{ x=1; y=2; }.["x"]

Member Author

Why? You can already write { x = 1; y = 2; }."x".
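
For reference, a short sketch of the attribute selection forms that exist today. The binding names set and key are chosen only for this example.

```nix
let
  set = { x = 1; y = 2; };
  key = "y";
in
assert set.x == 1;       # bare identifier
assert set."x" == 1;     # quoted string
assert set.${key} == 2;  # computed name
"ok"
```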

Member

I think this is a good argument why the .0 syntax is more consistent and should be preferred.

Member

@AndersonTorres AndersonTorres Jul 16, 2024

Why? You can already write { x = 1; y = 2; }."x".

I do not have a fancy technical term for this.

While lists are indexed by non-negative integers, sets are indexed by keys.

If we use one syntax construction for one and a different construction for the other, we add an otherwise unnatural distinction.

The idea from piegamesde looks better since it saves two bytes.


On the other hand, in guise of speculation, with the bracketed syntax we can have nice things like

.[ "x" "y" ]

plus

let
keysNeeded = [ "x" "y" ];
set = { x=1; y=2; z=4; };
in
set.keysNeeded # [ 1 2 ]
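
For comparison, the result sketched in that speculative snippet can be computed in today's Nix roughly as follows. The name slice is introduced only for this example.

```nix
let
  keysNeeded = [ "x" "y" ];
  set = { x = 1; y = 2; z = 4; };
  # Look up each wanted key in turn, preserving the order of keysNeeded.
  slice = map (key: set.${key}) keysNeeded;
in
assert slice == [ 1 2 ];
slice
```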

@MattSturgeon MattSturgeon Aug 5, 2024

On the other hand, in guise of speculation, with the bracketed syntax we can have nice things like

.[ "x" "y" ]

I think this should be discussed in its own RFC.

IMO .int (e.g. .2) is consistent with the attr indexing syntax ."name".

Since unquoted attr names can't start with a number, this fits in nicely with the current language.

.${expression} indexing will cover more complex scenarios too.

This would leave .[/*other syntax*/] available for any future use, to be tackled by another RFC.

@infinisil added the "status: open for nominations" label Aug 5, 2024
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-08-05/50170/1

However, it requires an additional character to type and its technical qualities are identical to those of the proposed syntax without the `$` character.
There is at least some prior art for `.[]` in OCaml and F#; there is none that I know of for `.$[]`.

#### `expr.3`
Member Author

Everyone arguing for this alternative: you don't have to convince me of its technical superiority. I am already convinced, as I describe below. This is the syntax I'm using in my personal Nix build.

Rather, you have to convince me that the Nix maintainers can be persuaded to make a breaking change to float literal syntax on a timeline that is less than, say, five years, when they have expressed reluctance to make ‘breaking changes’ to the language even with respect to obvious buggy behavior.

Gating this proposal behind something that will never happen is, in practice, equivalent to rejecting the proposal. I would rather see some syntax implemented in a couple of years than wait indefinitely for the ideal syntax.

And if sane language evolution manages to actually become policy at some point, in that case we'll have the tools we need to deprecate the less-appealing syntax and migrate to the better one, should we choose to do so.


With expr.3, what would a non-literal index look like?

Obviously, expr.n already means expr.${"n"}.

If you introduce expr.(n) or expr.[n] for non-literal indices, should they also work with literals, i.e. should expr.(3) or expr.[3] be valid too?

If yes, why add the duplicate and confusing expr.3?

Member Author

This proposal is only for when the desired index is known; replacing elemAt for variable indices is left as future work.


I believe that adopting expr.3 without addressing the implications for non-literal indices could lead to significant challenges down the road. If we postpone even the discussion on non-literal indices, we risk creating a legacy issue that will be difficult and cumbersome to address in the future.

While we might agree on expr.3 today, this decision could result in a syntax that feels outdated and problematic later on, making it harder to evolve the language cleanly.

Contributor

I think I agree here. If we agree that elemAt is the future then that is fine. But I think if we agree that we want to improve the dynamic syntax in the future it is best to have that discussion now to ensure that we don't shoot ourselves in the foot.

Member Author

It doesn't really open that gate; 3a and 1e+3 are already ill-advised things to write in Nix.

@kvtb kvtb Aug 12, 2024

  • Make .${n} work with both numbers on lists as well as strings on sets

This was already criticized above for its runtime overhead and for causing issues with static analyzers.

  • Introduce .$[n] as a parallel construction to .${str}
  • If this proposal is adopted as-is, simply extend .[n] to also allow expressions

These are not truly parallel: [] is a bit different in Nix from {} and (). While ${a b} and (a b) are function applications, [a b] is not.

How should .$[a b] or .[a b] be read, then?
Function application? Slice? Index in a multi-dimensional array?
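
The difference between the bracket forms can be checked in today's Nix. The function f in this sketch is made up for illustration.

```nix
let
  f = x: x + 1;
in
# Inside parentheses and inside ${...}, juxtaposition is function application.
assert (f 1) == 2;
assert "${toString (f 1)}" == "2";
# Inside square brackets, juxtaposition separates list elements.
assert builtins.length [ f 1 ] == 2;
assert builtins.length [ (f 1) ] == 1;
"ok"
```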

@kvtb kvtb Aug 12, 2024

3a and 1e+3 are already ill-advised things to write in Nix.

I mean that, currently, such code likely hits an "attempt to call something which is not a function but an integer" error,
but in a Nix that allows expr.3, one could mistakenly write expr.0x29A and get no error even at runtime, because expr.0 is not a number; it may be a callable.

They would just get a wrong result at the end.

The code looks perfectly valid, as if it did expr.[666], but it does something else.

Member

Rather, you have to convince me that the Nix maintainers can be persuaded to make a breaking change to float literal syntax on a timeline that is less than, say, five years

At least for Lix, I am currently setting up infrastructure to remove URL literals, and the .3 floating literals are guaranteed to be next.

However, I can't guarantee that adding the new syntax will be allowed without proper language versioning tools (which are on the roadmap but it will take years still).

elemAt is good enough

I'll join that camp. Especially since the main motivation for this RFC is the REPL, where the user knows which index they want to use.

Member Author

@rhendric rhendric Aug 12, 2024

(What other projects do is their business; this is a Nix RFC.)

For everyone using original Nix, a reminder that you can have either the proposed syntax or the expr.3 alternative in your REPL right now, on (almost) any stable or unstable version of Nix, without waiting for anyone else to make any decisions about what is allowed.

kvtb commented Aug 12, 2024

I suggest adding the expr.(3) or expr.(n) variant to the proposal as it's the least confusing option:

  1. No issues with floats.
  2. No issues with symbols or letters attached to the right.
  3. No issues with sets; .(a) and .("a") are already prohibited, ensuring no conflict with existing Nix code.
  4. No issues with spacing; expr.(a) has the same meaning regardless of spacing: expr.(a) is expr .(a) is expr. (a) is expr . ( a )
  5. Unlike [], priorities inside () are straightforward: expr.(a b) behaves like expr.${a b}.
  6. Simplifying expr.(3) to expr.3 could be allowed in unambiguous cases, making it as intuitive as any other ()-removal, unlike [] adding/removals, which feel counterintuitive.

i.e. moving towards non-literal indexes will look like:

😺 expr.3 <-> expr.(3) -> expr.(a) -> expr.(a+1) -> expr.(f a)
vs
😿 expr.3 <-> expr.[3] -> expr.[a] -> expr.[(a+1)] -> expr.[(f a)]
or
😭 expr.3 <-> expr.$[3] -> expr.$[a] -> expr.$[(a+1)] -> expr.$[(f a)]
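
As a baseline for comparing these progressions, the same four steps can be written in today's Nix with builtins.elemAt. The list contents and the names a and f are made up for this sketch.

```nix
let
  expr = [ 10 20 30 40 50 ];
  a = 1;
  f = n: n * 2;
in
assert builtins.elemAt expr 3 == 40;        # literal index
assert builtins.elemAt expr a == 20;        # variable index
assert builtins.elemAt expr (a + 1) == 30;  # arithmetic needs parentheses
assert builtins.elemAt expr (f a) == 30;    # so does a function call
"ok"
```

With elemAt, compound index expressions already need parentheses, which is the same grouping that the expr.(…) variant would reuse.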

@piegamesde
Member

I personally find it fairly confusing to use parentheses in syntax that is not a grouping operator. We already have a precedent for it in the language, inherit (from) and I really dislike it as well.

@AndersonTorres
Member

We already have a precedent for it in the language, inherit (from) and I really dislike it as well.

Off-topic: I think we lost a great chance for suggesting from lib inherit cmakeBool mesonBool;...

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-09-02/51514/1

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-09-16/52224/1

@AndersonTorres
Member

I am inclined to shepherd this for the sake of advancing the RFC; however, I am not sure I can deal with three RFCs at the same time.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-09-30/53690/1

@kevincox
Contributor

RFCSC: @AndersonTorres if you are able to spare the time it would be much appreciated as that would be enough to complete the shepherd team and move the RFC forward.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-10-28/55095/1

@AndersonTorres
Member

OK then, I recruit myself as shepherd.

Labels
status: open for nominations Open for shepherding team nominations