Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test: simplify gc memmove #109014

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft

Test: simplify gc memmove #109014

wants to merge 8 commits into from

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Oct 18, 2024

No description provided.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Oct 18, 2024
@EgorBo
Copy link
Member Author

EgorBo commented Oct 18, 2024

@EgorBot -intel -arm64 -profiler -nonativepgo

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

[GcServer(true)]
public class Bench
{
    static object[] Src0 = Enumerable.Range(0, 2).Select(i => (object)i).ToArray();
    static object[] Src1 = Enumerable.Range(0, 10).Select(i => (object)i).ToArray();
    static object[] Src2 = Enumerable.Range(0, 128).Select(i => (object)i).ToArray();
    static object[] Dst = new object[128];

    [Benchmark]public void CopyObjects2() => Src0.AsSpan().CopyTo(Dst);
    [Benchmark]public void CopyObjects10() => Src1.AsSpan().CopyTo(Dst);
    [Benchmark]public void CopyObjects128() => Src2.AsSpan().CopyTo(Dst);
}

@huoyaoyuan
Copy link
Member

Is auto-vectorization by C++ compiler expected to kick in?

@EgorBo
Copy link
Member Author

EgorBo commented Oct 18, 2024

Is auto-vectorization by C++ compiler expected to kick in?

I am just curious about the impact of the logic we have here.

  1. It seems that the current logic it's not 100% safe as it uses SIMD on x64. SIMD on x64 demands 16 byte alignment for both src and dest to be atomic, but we only align dest.
  2. I've observed a lot of "Load with an additional latency due to misalignment" PMU counters pointing to this function on arm64 with perf

UPD: Results suggest the plain loop is faster? EgorBot/runtime-utils#123 (not sure about windows/msvc)

@huoyaoyuan
Copy link
Member

How about the numbers at
#101699 (comment) ? I was basically doing the same thing for NoSSE.

@EgorBo
Copy link
Member Author

EgorBo commented Oct 18, 2024

@EgorBot -arm64 -profiler -nonativepgo -perf_events ld_align_lat

using BenchmarkDotNet.Attributes;

public class Bench
{
    static object[] Src = Enumerable.Range(0, 10).Select(i => (object)i).ToArray();
    static object[] Dst = new object[128];

    [Benchmark]public void CopyObjects10() => Src.AsSpan().CopyTo(Dst);
}

@EgorBo
Copy link
Member Author

EgorBo commented Oct 18, 2024

How about the numbers at #101699 (comment) ? I was basically doing the same thing for NoSSE.

do you mean AVX numbers? I guess the atomicity concern still stands

@EgorBo
Copy link
Member Author

EgorBo commented Oct 18, 2024

@EgorBot -intel -arm64 -profiler -nonativepgo

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

[GcServer(true)]
public class Bench
{
    static object[] Src0 = Enumerable.Range(0, 2).Select(i => (object)i).ToArray();
    static object[] Src1 = Enumerable.Range(0, 10).Select(i => (object)i).ToArray();
    static object[] Src2 = Enumerable.Range(0, 128).Select(i => (object)i).ToArray();
    static object[] Dst = new object[128];

    [Benchmark]public void CopyObjects2() => Src0.AsSpan().CopyTo(Dst);
    [Benchmark]public void CopyObjects10() => Src1.AsSpan().CopyTo(Dst);
    [Benchmark]public void CopyObjects128() => Src2.AsSpan().CopyTo(Dst);
}

@EgorBo
Copy link
Member Author

EgorBo commented Oct 18, 2024

With volatile it's obviosly a regression on large inputs, but the PMU counters for unaligned loads look happy:

Main:

 563591328      ld_align_lat
         0      st_align_lat
2801225037      unaligned_ld_spec
2780268499      unaligned_st_spec

PR:

     19048      ld_align_lat
         9      st_align_lat
         0      unaligned_ld_spec
         0      unaligned_st_spec

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants