Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hl] use hl.NativeArray for Vector #11568

Merged
merged 6 commits into from
Aug 29, 2024
Merged

Conversation

Simn
Copy link
Member

@Simn Simn commented Feb 9, 2024

I really think that Vector should be implemented on top of fixed array types on all targets where this is possible. I'm attempting this here for HL, which seems to work well enough on HL/JIT, but fails on hlc:

Command: haxe [compile-hlc.hxml,-D,Windows]
Command exited with 1 in 6s: haxe [compile-hlc.hxml,-D,Windows]
test hl failed
�[30;41m ERROR �[0m (unknown position)

   | Error: Don't know how to compare array and array (hlc)

As @yuxiaomao points out, this has been reported before in #11468.

@ncannasse Any advice?

@Simn
Copy link
Member Author

Simn commented Feb 9, 2024

Another thing to address is that we're losing the sort function at the moment because the previous implementation just forwarded this to Array.sort. This means that we should finally look into #3388 first.

@skial skial mentioned this pull request Feb 9, 2024
1 task
@Simn
Copy link
Member Author

Simn commented Feb 10, 2024

Thanks to @Apprentice-Alchemist for telling me how to fix the array compare problem! HL is now green, and I'd like to merge this soon so that we can use Vector in the public std API across all targets.

I don't know what to do with the sort problem though. It seems like we would need implementations of https://github.com/HaxeFoundation/haxe/blob/development/std/haxe/ds/ArraySort.hx for all Vector "variants" on HL, but even then it's not obvious how to define the actual sort function itself.

@RblSb
Copy link
Member

RblSb commented Feb 10, 2024

I don’t know what exactly you’re talking about, but maybe it’s worth making the function inlineable so that inline vector.sort(...) can inline the callback (not sure how faster it can be on different targets)

@Simn
Copy link
Member Author

Simn commented Feb 10, 2024

Huh, I thought this didn't work on HL, but it seems like I'm wrong about that:

import haxe.ds.Vector;

function f<T>(x:Vector<T>) {
	trace(x);
}

function main() {
	var v = new Vector<Int>(1);
	f(v);
}

This makes me reluctant to merge this because apparently I don't understand how hl.NativeArray works.

@Apprentice-Alchemist
Copy link
Contributor

It works, as long as you don't try to access the contents of the native array.
The type being read is determined at the site of array access, so doing x[0] inside f would try to read a Dynamic instead of an Int, which can cause segfaults (https://try.haxe.org/#BE053756).

@Simn
Copy link
Member Author

Simn commented Feb 10, 2024

Right, so the situation is actually even worse than I thought. Surely this should be caught earlier...

Though to be fair, I also don't catch this on the JVM target and instead let it fail in the verifier with Exception in thread "main" java.lang.ClassCastException: class [I cannot be cast to class [Ljava.lang.Object;.

@Simn
Copy link
Member Author

Simn commented Feb 10, 2024

Your example gives me a proper error locally though:

Uncaught exception: Access violation
Called from _Main.$Main_Fields_.main(Main.hx:4)
Called from .init(?:1)

Do you also get that segfault locally with recent HL?

@Apprentice-Alchemist
Copy link
Contributor

Access violation is the Windows equivalent of a segfault, looks like HL turns those into real exceptions somehow.

@Apprentice-Alchemist
Copy link
Contributor

Testing locally on my Linux machine gives

SIGNAL 11
_Test2.$Test2_Fields_.main(Test2.hx:4)
.init(?:1)
fish: Job 1, 'hl out.hl' terminated by signal SIGSEGV (Address boundary error)

HL catches the segfault too but doesn't turn it into a real exception and just prints a stack trace before letting the default signal handler run.

@Simn
Copy link
Member Author

Simn commented Feb 10, 2024

Yeah okay, I can't merge this before this is improved somehow because we would be going from working code to a segfault, which isn't the best upgrading experience...

@yuxiaomao
Copy link
Contributor

I don't think re-implement Vector directly with hl.NativeArray is a good idea:

  • hl.NativeArray element size depends on type, and there is no typecheck: try reading a Dynamic value from not Dynamic compatible T 's Array (Int, Float, Bool etc) will cause access violation (try read value as pointer).
  • If we force hl.NativeArray<Dynamic>, there will be no access violation, but will add many unnecessary allocations for base type Array.
  • The current Array implementation already handle this properly: it use hl.Bytes for I32/UI16/F32/F64 and hl.NativeArray for all other types.
  • The only additional cost is one resize during new.

https://github.com/HaxeFoundation/hashlink/blob/f463d71217c08f91c8b2d90f75c4d02afc9ff4f5/src/std/types.c#L43-L67

@Simn
Copy link
Member Author

Simn commented Apr 10, 2024

I'll tell you how I got here:

  • There's an awful HL-specific hack in haxe.ds.EnumValueMap that was introduced with this commit.
  • The reason that was done is because Type.enumParameters is defined to return an Array: static function enumParameters(e:EnumValue):Array<Dynamic>. This causes allocation on HL (and other targets) which makes EnumValueMap slow.
  • I then wanted to look into adding a version of enumParameters that returns Vector instead. This would be quite convenient not only for HL, but also for the JVM target which has the exact same problem.
  • However, if I do this with the current HL implementation of Vector, we'll have the original problem again because it's based on Array.
  • And that's why I'd like to change it to work on top of NativeArray instead.

I'm aware of the problems you point out, but note that Vector is supposed to be completely transparent, and its type parameters are supposed to be invariant. The problem is that this cannot be expressed in our type system at the moment, so the only way to catch this is at the generator-level, where Vector already has special treatment on various targets anyway.

These variance violations have been failing natively since the old Flash times, and I'm fine with that, but it shouldn't fail with an incomprehensible segfault.

@Apprentice-Alchemist
Copy link
Contributor

I think catching this in the generator would require inserting extra type checking during the typed expr -> bytecode phase in every place where there is a potential type "conversion".

This can't be done in some kind of bytecode verification phase either because the native array type does not carry type information at compile time.

@Simn
Copy link
Member Author

Simn commented Apr 10, 2024

This can't be done in some kind of bytecode verification phase either because the native array type does not carry type information at compile time.

Meh, I expected arrays to know their element type. In that case I don't know how to properly manage this either.

@Apprentice-Alchemist
Copy link
Contributor

What about checking on array access/allocation instead, at that point the generator should be able to determine whether the element type is a type parameter.

haxe/src/generators/genhl.ml

Lines 2001 to 2031 in 7217530

| "$aalloc", [esize] ->
let et = (match follow e.etype with TAbstract ({ a_path = ["hl"],"NativeArray" },[t]) -> to_type ctx t | _ -> invalid()) in
let size = eval_to ctx esize HI32 in
let a = alloc_tmp ctx HArray in
let rt = alloc_tmp ctx HType in
op ctx (OType (rt,et));
op ctx (OCall2 (a,alloc_std ctx "alloc_array" [HType;HI32] HArray,rt,size));
a
| "$aget", [a; pos] ->
(*
read/write on arrays are unsafe : the type of NativeArray needs to be correcly set.
*)
let at = (match follow a.etype with TAbstract ({ a_path = ["hl"],"NativeArray" },[t]) -> to_type ctx t | _ -> invalid()) in
let arr = eval_to ctx a HArray in
hold ctx arr;
let pos = eval_to ctx pos HI32 in
free ctx arr;
let r = alloc_tmp ctx at in
op ctx (OGetArray (r, arr, pos));
cast_to ctx r (to_type ctx e.etype) e.epos
| "$aset", [a; pos; value] ->
let et = (match follow a.etype with TAbstract ({ a_path = ["hl"],"NativeArray" },[t]) -> to_type ctx t | _ -> invalid()) in
let arr = eval_to ctx a HArray in
hold ctx arr;
let pos = eval_to ctx pos HI32 in
hold ctx pos;
let r = eval_to ctx value et in
free ctx pos;
free ctx arr;
op ctx (OSetArray (arr, pos, r));
r

The example with trace(x) would still work, but actually accessing the array would give a compile time error instead of segfaulting.

Hmm. That wouldn't fix the var a:hl.NativeArray<Dynamic> = new hl.NativeArray<Int>(1); case though.

@ncannasse
Copy link
Member

The HL code to manage different types of Arrays is quite tricky.

  • There's "objects" Array (hl.types.ArrayObj) for all pointer types (including Null<Int>)
  • There's "bytes" Array (hl.types.ArrayBytes) for basic types Int/Float/Single/UI16
  • All of these inherit hl.types.ArrayAccess which also provides a dynamic API access
  • Then there's hl.types.ArrayDyn which allows to wrap an underlying Array AND eventually recast it to the good "actual" type but only once (for instance when you access a JSON parsed Array as Array<Int>)
  • And finally all these (including ArrayDyn) also implements an ArrayAccess API for extra features

So we shouldn't duplicate all of these for Vector, but it being an abstract with inline functions should work well ?

@Simn
Copy link
Member Author

Simn commented Apr 10, 2024

I agree it should work for normal use-cases. The problem is that currently assigning Vector<Int> to Vector<Dynamic> is just fine because HL's Vector is based on Array, and our type system doesn't care about this case. After this change it would instead segfault unceremoniously, which would be a terrible upgrading experience. That's why I'm looking into ways to make this fail nicer.

@Simn
Copy link
Member Author

Simn commented Aug 1, 2024

I'm quite willing to merge this now unless somebody objects. This is going to cause some friction but to be fair Vector should never have been based on Array in the first place.

@yuxiaomao
Copy link
Contributor

yuxiaomao commented Aug 1, 2024

I'm having 2 different errors in heaps when compiling shiro's project, could you please check before merge?

 ERROR  /heaps/h3d/impl/GlDriver.hx:599: characters 59-79

 599 |     gl.uniform4fv(s.globals, streamData(hl.Bytes.getArray(buf.globals.toData()), 0, s.shader.globalsSize * 16), 0, s.shader.globalsSize * 4);
     |                                                           ^^^^^^^^^^^^^^^^^^^^
     | haxe.ds._Vector.VectorData<hxd.impl.Float32> should be Array<Unknown<0>>
     | For function argument 'a'

 ERROR  /heaps/h3d/impl/GlDriver.hx:608: characters 58-77

 608 |     gl.uniform4fv(s.params, streamData(hl.Bytes.getArray(buf.params.toData()), 0, s.shader.paramsSize * 16), 0, s.shader.paramsSize * 4);
     |                                                          ^^^^^^^^^^^^^^^^^^^
     | haxe.ds._Vector.VectorData<hxd.impl.Float32> should be Array<Unknown<0>>
     | For function argument 'a'

 ERROR  /heaps/hxd/fmt/hmd/Library.hx:263: characters 43-63       

 263 |    buf.vertexes = haxe.ds.Vector.fromData(vertexes.getNative());
     |                                           ^^^^^^^^^^^^^^^^^^^^
     | hxd._FloatBuffer.InnerData should be haxe.ds._Vector.VectorData<Unknown<0>>
     | For function argument 'data'

Edit: maybe it's a heaps problem?

@Simn
Copy link
Member Author

Simn commented Aug 1, 2024

Vector.toData returns the underlying type of Vector, which used to be Array but is now NativeArray. This has to be fixed in heaps, and also means that Float32Array is currently Array-based, which surely isn't ideal.

@yuxiaomao
Copy link
Contributor

I don't really know how to deal with Vector.toData to hl.Bytes in heaps. Maybe we can add hl.Bytes.fromVectorData/fromNativeArray (same inner implementation as fromArray and $abytes)? but if we change api it will break heaps for haxe not in 5.0 nightly :/. If we use toArray, then it will recreate an Array from NativeArray et each uploadBuffer which does not seems to be a good idea either.

I also suggest put hl in the prealloc if of Vector.toArray

		#if (neko || hl)
		// prealloc good size
		if (len > 0)
			a[len - 1] = get(0);

@Simn
Copy link
Member Author

Simn commented Aug 2, 2024

Dealing with heaps is usually bad for my blood pressure but I can take a look if I have to...

And yes, good idea on the pre-alloc I suppose.

@Simn
Copy link
Member Author

Simn commented Aug 2, 2024

Well, I made some wrong assumptions here. I thought that Array would be based on NativeArray and NativeArray would be based on Bytes, but it's actually the case that Array is based on Bytes directly (for basic types). This makes NativeArray the odd one out because we can't easily go from it to Bytes, which is what's necessary for heaps.

This suggests to me that heaps actually wants to use Array instead of Vector, but I imagine that this is not an easy change to make. I really resent the fact that we have one target where Vector is just Array, so I'd still like to move forward with this PR, but I don't know how to make progress as long as heaps' Vector usage is in the way.

@yuxiaomao
Copy link
Contributor

I don't really like the idea of sending back array_bytes to construct hl.Bytes from NativeArray (instead of hl.Bytes.getArray for basic type Array), but they are both just some memory area allocated with hl_gc_alloc_noptr(size) and behave probably the same.

For heaps I made a PR and tested on Shiro's project, seems ok for now.

@Simn
Copy link
Member Author

Simn commented Aug 5, 2024

Thank you! For me this is good now. I'll let you handle the merging so that you can synchronize with the heaps change.

@yuxiaomao yuxiaomao merged commit 3b050f0 into development Aug 29, 2024
99 checks passed
@ncannasse
Copy link
Member

Looking again at it and discussing with @yuxiaomao , I don't think that this worth it.

First I don't really understand what's the problem of Vector implementation being an Array, as the API does not allow it to be resized we don't really care if it could be resized or not.

Second, using NativeArray for Vector implementation will open a lot of potential issues when doing Dynamic or scripting or casting, as NativeArray are not generics in the VM, allowing to cast directly between container types, with dire consequences such as hard crashes (access violation) or crashes when reading out of the allowed bounds which should be avoided entirely when doing pure user code in Haxe/HL.

Please revert :)

@Simn
Copy link
Member Author

Simn commented Aug 29, 2024

First I don't really understand what's the problem of Vector implementation being an Array, as the API does not allow it to be resized we don't really care if it could be resized or not.

All this originally comes from your commit here: 7dc9272

This introduces a hack in order to avoid Array allocation in an API function, replacing it with hl.NativeArray. I want to standardize that because I have the same issue on the JVM target, but if I introduce a new enumParameters function that returns Vector we end up having the same problem again. As it stands, such functions have the potential to be a performance trap (see #10699), so there's an incentive here to have a cross-target Vector type that works on HL too.

@ncannasse
Copy link
Member

I understand that Type.getEnumParameters might want some platform specific optimizations. But I think in that case that the correct answer would be for it to return an abstract with array access, so the actual underlying type can be platform specific based on how the platform represent enums. We could actually get entirely rid of any allocation on some platforms where the underlying type would be the enum itself.

We could keep either to toArray() or even a to Array for backward compatibility.

@ncannasse
Copy link
Member

@Simn up :)

@Simn
Copy link
Member Author

Simn commented Sep 11, 2024

I figured Yuxiao may want to handle this because it might involve heaps changes too.

It still annoys me that we have this container type that is fast on all targets except HL, but at this point in my Haxe life I don't see myself arguing this point too much, so feel free to revert all this.

@yuxiaomao
Copy link
Contributor

Ok I'll handle it

yuxiaomao added a commit that referenced this pull request Sep 11, 2024
Simn pushed a commit that referenced this pull request Sep 11, 2024
* Revert "[hl] use hl.NativeArray for Vector (#11568)"

This reverts commit 3b050f0.

* [tests] test change can remains
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants