Fix indexing for arrays with non-zero base #1824

Merged (3 commits) on Dec 4, 2024


@BCSharp (Member) commented Dec 2, 2024

Currently IronPython ignores the lower bound of a CLI array and always treats the array as 0-based. Assume arr is a 1-based 2x2 matrix. SetValue and GetValue require 1-based indices, but with subscripts all indices must be shifted down by 1. This is not .NET-compliant.

from System import *

# 1-based 2x2 matrix
arr = Array.CreateInstance(str, (2,2), (1,1))

# SetValue uses correct indices
arr.SetValue("a_1,1", Array[Int32]((1,1)))
arr.SetValue("a_1,2", Array[Int32]((1,2)))
arr.SetValue("a_2,1", Array[Int32]((2,1)))
arr.SetValue("a_2,2", Array[Int32]((2,2)))

# BUT subscripts are off by 1
arr[0, 0] = "a_1,1"
arr[0, 1] = "a_1,2"
arr[1, 0] = "a_2,1"
arr[1, 1] = "a_2,2"

After this PR:

# As above:
arr.SetValue("a_1,1", Array[Int32]((1,1)))
arr.SetValue("a_1,2", Array[Int32]((1,2)))
arr.SetValue("a_2,1", Array[Int32]((2,1)))
arr.SetValue("a_2,2", Array[Int32]((2,2)))

# AND:
arr[1, 1] = "a_1,1"
arr[1, 2] = "a_1,2"
arr[2, 1] = "a_2,1"
arr[2, 2] = "a_2,2"

This puts IronPython on par with C#:

// Creates and initializes a 2-dimensional, 1-based Array of type string (a 2x2 matrix).
Array myArray = Array.CreateInstance(typeof(string), new[] { 2, 2 }, new[] { 1, 1 } );
for (int i = myArray.GetLowerBound(0); i <= myArray.GetUpperBound(0); i++)
    for (int j = myArray.GetLowerBound(1); j <= myArray.GetUpperBound(1); j++)
        myArray.SetValue($"a_{i},{j}", new int[2] { i, j } );

// Equivalently:
var arr = (string[,])myArray;
arr[1, 1] = "a_1,1";
arr[1, 2] = "a_1,2";
arr[2, 1] = "a_2,1";
arr[2, 2] = "a_2,2";
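
As a rough illustration of the GetLowerBound/GetUpperBound loop above, here is a plain-Python sketch that enumerates the valid index tuples of an array from its lengths and lower bounds. The helper name index_tuples and the dict standing in for the matrix are hypothetical; nothing here touches the CLI Array type.

```python
from itertools import product

def index_tuples(lengths, lower_bounds):
    """Return an iterator over all valid index tuples, in row-major order."""
    # One range per dimension, from the lower bound to the upper bound inclusive.
    ranges = [range(lo, lo + n) for n, lo in zip(lengths, lower_bounds)]
    return product(*ranges)

# A dict stands in for the 1-based 2x2 matrix (illustration only).
matrix = {}
for i, j in index_tuples((2, 2), (1, 1)):
    matrix[i, j] = "a_{},{}".format(i, j)
```

With lengths (2, 2) and lower bounds (1, 1) this visits (1, 1), (1, 2), (2, 1), (2, 2), the same subscripts used in the examples above.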

Notes on indexing in C#

The only way to create non-zero-based arrays in C# is by using Array.CreateInstance, which returns an object of abstract type Array. This type does not implement the index operator; to use it, the array has to be downcast to the concrete (native) type. Hence the cast to (string[,]) above is necessary.

Actually, the concrete type of a non-zero-based array is not string[,] but string[*,*]. But a cast (string[*,*]) in C# is syntactically invalid. The compiler accepts the cast (string[,]) but interprets it as (string[*,*]). That is, var arr is string[*,*] arr, as it were.

Surprisingly, this is only supported for higher-dimensional arrays; for 1-dimensional arrays, casting a 1-based string array to string[] fails at run time with an invalid cast error from string[*] to string[]. In practice this means that 1-based 1-dimensional arrays cannot be indexed in C# using the index operator. I can only speculate why it is so.

IronPython, being a dynamic language, does not require casting and an array of any type can be indexed using the index operator.

Summary of indexing of CLI arrays in IronPython:

| Base | Index >= 0 | Index < 0 |
|------|------------|-----------|
| > 0 | absolute | relative from end |
| 0 | absolute == relative from beginning | relative from end |
| < 0 | absolute | absolute |

Comparison to indexing in C# and CPython:

  • Index >= 0, any base is C# compliant.
  • Base 0, any index is CPython compliant.
  • Base 0, index < 0 is not supported by C# array indexing, but can be achieved with System.Index on 1-dimensional arrays only; in that case IronPython indexing is, in principle, C# compliant as well as CPython compliant (although support for System.Index is not implemented).
  • Base > 0, index < 0 is not supported by C#; IronPython follows the CPython convention as more practical.
  • Base < 0, index < 0 is C# compliant.
  • Base != 0 is not supported by CPython for any builtin structures.
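
The rules in the summary table can be modeled for a single dimension with a small plain-Python function. This is only a sketch of the table, not the code in this PR: the name resolve_index is hypothetical, and raising IndexError for out-of-range subscripts is an assumption.

```python
def resolve_index(index, lower_bound, length):
    """Map a subscript to an absolute array index for one dimension."""
    if index < 0 and lower_bound >= 0:
        # Base >= 0, negative index: relative from the end (-1 is the last element).
        absolute = lower_bound + length + index
    else:
        # Non-negative index, or negative base: the index is taken as absolute.
        absolute = index
    # Assumption: out-of-range subscripts are reported as IndexError.
    if not lower_bound <= absolute < lower_bound + length:
        raise IndexError("index out of range")
    return absolute
```

For a 1-based dimension of length 2, resolve_index(-1, 1, 2) gives 2 (the last element), while index 0 is out of range. For base 0 the function reduces to ordinary CPython list indexing.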

For more detailed explanation of indexing see #1828 (comment)

@BCSharp changed the title from "Fix indixing for arrays with non-zero base" to "Fix indexing for arrays with non-zero base" on Dec 2, 2024
@slozier (Contributor) left a comment

Can't spot any obvious issues. Do you know of any Python packages that would have types that have a similar non-zero base? Just wondering if there's anything to compare to. Still not sure why anyone would want to use these. 😄

@BCSharp (Member, Author) commented Dec 4, 2024

No, I don't know of any Python packages using non-zero based indexing. This is more a .NET thing. Even then, I don't know what it was intended for. For a while, my guess was Visual Basic, because you could write Option Base 1 and get all arrays 1-based. Or Dim arr(1 To 5) As Integer to get a specific variable 1-based, but afaik this was in VB/VBA before VB for .NET. In VB for .NET the lower bound is always 0. Maybe it was meant for other, future languages. Like IronPython... 😉

There is a reason why this code was sitting in my stash for 6 years. 😄

@BCSharp BCSharp merged commit 476ebae into IronLanguages:main Dec 4, 2024
8 checks passed
@BCSharp BCSharp deleted the array_baseN branch December 4, 2024 04:15
@BCSharp (Member, Author) commented Dec 6, 2024

From the non-Python world: MATLAB and GNU Octave use 1-based arrays/matrices. All indices are treated as absolute, i.e. no negative indices allowed. Slices can have negative steps though. Just like in NumPy/SymPy, multidimensional arrays can be sliced freely, something that IronPython does not support (yet).

Also, Wolfram Mathematica uses vectors, matrices, tensors etc. which are all 1-based. It seems to be a convention used by mathematicians. It also supports negative indices, and they behave like in IronPython for 1-based arrays: i.e. while index 1 addresses the first element, index -1 is for the last element. Mathematica also has slices, which are inclusive (unlike in Python). Basically, everything in Mathematica is 1-based by default with the end being inclusive, unless overridden. E.g. even Range[5] is 1, 2, 3, 4, 5.
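
To make the difference between the 1-based inclusive convention and Python's 0-based half-open slices concrete, here is a small plain-Python sketch. The helper span_to_slice is hypothetical and only models the convention described above (no steps, no bounds validation); it is not Mathematica's full slicing semantics.

```python
def span_to_slice(first, last, length):
    """Convert a 1-based, inclusive span to a Python slice.

    Negative positions count from the end, so -1 is the last element.
    """
    def to_zero_based(position):
        if position == 0:
            raise IndexError("position 0 is not an element of a 1-based sequence")
        return position - 1 if position > 0 else length + position

    # Python slices exclude the end, so the inclusive end moves one past.
    return slice(to_zero_based(first), to_zero_based(last) + 1)

items = ["a", "b", "c", "d", "e"]
middle = items[span_to_slice(2, 4, len(items))]      # elements 2 through 4
last_two = items[span_to_slice(-2, -1, len(items))]  # the last two elements
one_to_five = list(range(1, 5 + 1))                  # Python spelling of Range[5]
```

Here middle is ['b', 'c', 'd'] and last_two is ['d', 'e'].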
