Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fusion/pointers: add pointer arithmetic operators #21

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

timotheecour
Copy link
Member

@timotheecour timotheecour commented Oct 9, 2020

  • safer than using cast and inlining the code contained in the operators from this PR
  • future compiler work can enable the following:
import fusion/pointers
proc fn =
  var a = @[10, 11, 12]
  let pa = a[0].addr # or from another other source, eg FFI returning directly a ptr
  doAssert pa[2] == 12 # generate a warning: must use {.cast(safe).}
  {.cast(safe).}:
    doAssert pa[2] == 12 # ok: warning is not triggered

nothing will break, the only thing is that warnings will eventually get triggered once {.cast(safe).}: is implemented, and {.cast(safe).}: can then be used to avoid triggering that warning.

this avoids using non-standard operators like +! as was suggested in nim-lang/Nim#15490 (comment)
and it gives the maximum flexibility: allows user to add --warningAsError:UnsafeBlock, or --warning:UnsafeBlock:off or localize those in code, or use {.cast(safe).}: blocks

as mentioned in nim-lang/Nim#15490 (comment)

+ has one obvious meaning for ptr T just like it does in C, C++, D, swift etc (all support pointer arithmetic as in this PR); unlike for eg for concatenation where nim rightfully departs from python and uses & to avoid ambiguities. The leibniz argument applies.

often requested, eg: Added ptrops to stdlib by awr1 · Pull Request #12101 · nim-lang/Nim

type T = typeof(p[])
cast[ptr T](cast[ByteAddress](p) -% off * sizeof(T))

template `[]`*[T](p: ptr T, off: int): T =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single pointer is not an array.

Copy link
Member Author

@timotheecour timotheecour Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about this; slightly more verbose than [], []= but still short enough:

# `[]` =>
template `at`*[T](p: ptr T, off: int): T = ...
# `[]=` =>
template `at=`*[T](p: ptr T, off: int, val: T) = ...

it would still allow this:

echo p.at(1)
p.at(2) = 2
p.at(2) -= 3

(at naming precedent: http://www.cplusplus.com/reference/vector/vector/at/ or https://riptutorial.com/opencv/example/6394/access-individual-pixel-values-with-cv--mat--at-t---)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have the toUncheckedArray proc to offer array indexing.

Copy link
Member Author

@timotheecour timotheecour Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toUncheckedArray is useful but not as a replacement for that; this module is about convenience when dealing with low-level code so you use cast less often and make the intent clearer

p[1]
vs
p.toUncheckedArray[1]

That's why it's been implemented in so many places (with more or less good implementations eg many implementations have issues eg are not safe wrt multiple template argument evaluation bugs) and requested also in many places as shown here #21 (comment)

Among other things, it makes it in particular easy to adapt C/C++ code to nim with the simplest possible syntax.

pa[1] = 2
doAssert a[1] == 2
type T = typeof(p[]) # pending https://github.com/nim-lang/Nim/issues/13527
cast[ptr T](cast[ByteAddress](p) +% off * sizeof(T))
Copy link
Member

@Araq Araq Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my own low level code I never needed the * sizeof(T) part. It's not intuitive and probably error-prone. Why try to outsmart the programmer who chose to operate on a very low level for a reason?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

## Unsafe.
(p + off)[]

template `[]=`*[T](p: ptr T, off: int, val: T) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single pointer is not an array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timotheecour
Copy link
Member Author

timotheecour commented Oct 13, 2020

In my own low level code I never needed the * sizeof(T) part. It's not intuitive and probably error-prone. Why try to outsmart the programmer who chose to operate on a very low level for a reason?

I have to disagree here. Here are just a few examples:

# Nim/lib/nimhcr.nim:393:14:
      curr = cast[ptr cstring](cast[int64](curr) + sizeof(ptr cstring))
      =>
      curr += 1

# Nim/lib/pure/coro.nim
coro.stack.top = cast[pointer](cast[ByteAddress](coro) + sizeof(Coroutine))
=>
coro.stack.top = cast[pointer](coro + 1)

# Nim/lib/system/excpt.nim:109:
zeroMem(cast[pointer](cast[int](s)+%sizeof(GcFrameHeader)), s.len*sizeof(pointer))
=>
zeroMem(cast[pointer](s + 1), s.len*sizeof(pointer))

# Nim/lib/system/gc_regions.nim (with `type Chunk = ptr BaseChunk`)
r.bump = fresh +! sizeof(BaseChunk)
=>
r.bump = fresh + 1
# many other instances; system/gc* code could likely be made more maintainable with
# less `pointer` and more `ptr[T]` + standard ptr[T] arithmetics, but that's out of scope for this discussion

probably error-prone

using sizeof when computing offsets from a ptr[T] is arguably less error prone than relying on programmer to cast[ByteOffset](x) first then compute the offset themselves from a sizeof(T), especially when you have aliases and non obvious types eg type Chunk = ptr BaseChunk

There's plenty of evidence for this:

  • even C has pointer arithmetics for T* p; p+=3;, interpreted as p = (T*)( (ptrdiff_t)(p) + 3*sizeof(T) ), and for good reasons;
  • ditto with C++, D, swift and many languages that support pointers

Why try to outsmart the programmer who chose to operate on a very low level for a reason?

There's no need to obfuscate code just because it involves pointers/ptr[T]. Because of alignment reasons, the majority of cases you want standard C pointer arithmetic semantics on ptr[T]; in more complex cases (eg when you're combining several operations at once, eg field offset + pointer arithmetics etc), you can always use cast[ByteOffset] with arbitrary manipulation, but that's arguably more error prone.

finally, this is oft-requested and re-implemented, often poorly

plenty of forum posts asking about how to do it too:

I was wondering if Nim has pointer arithmetic as in C?

@Araq
Copy link
Member

Araq commented Oct 14, 2020

I don't see the benefit in

coro.stack.top = cast[pointer](coro + 1)

r.bump = fresh + 1

Frankly, the 1 here is pure obfuscation for me.

In fact, when we analyse the situation further:

r.bump = fresh +! sizeof(BaseChunk) # ok, we skip the header. There is 1 header, not 2 and not 3

vs


r.bump = fresh + 1 # only 1 here is valid, not 2 and not 3 because we don't have an array of headers...

Often clarity in programming is achieved by ignoring pretty much everything of what was done in C and Unix.

@timotheecour
Copy link
Member Author

I don't see the benefit in

There is plenty of evidence pointing to the opposite, see the re-implementations and form posts I've linked. It is also my experience that > 90% of the time when dealing with ptr[T] variables, what I need are procs in this PR, it leads to safer code than using cast directly. cast is then only used when necessary, not when doing regular sizeof(T)-aligned arithmetics, leading to more maintainable code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants