Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GroupBy variable converter in Jupyter #663

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

koperagen
Copy link
Collaborator

@koperagen koperagen commented Apr 17, 2024

This was a long standing minor inconvenience. Having this variable in a cell1 you couldn't access k in keys in the cell2
cell1:

val groupBy = dataFrameOf("a")("1", "11", "2", "22")
    .groupBy { expr { "a"<String>().length } named "k" }

cell2:

groupBy.keys.k

Other use case is adding a new column to GroupBy

val groupBy = dataFrameOf("a")("1", "11", "2", "22")
    .groupBy { expr { "a"<String>().length } named "k" }
    .add("newCol") { 42 }
groupBy.aggregate { newCol into "newCol" }

@koperagen koperagen added the enhancement New feature or request label Apr 17, 2024
@koperagen koperagen added this to the 0.14.0 milestone Apr 17, 2024
@koperagen koperagen self-assigned this Apr 17, 2024
Copy link
Collaborator

@Jolanrensen Jolanrensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good that it solves the issue :), but I now have more questions about how it actually works than before

public data class CodeWithConverter(val declarations: Code, val converter: (VariableName) -> Code = EmptyConverter) {
public data class CodeWithConverter<T : CodeConverter>(
val declarations: Code,
val converter: T
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's an idea to rename this variable or rework it a bit, because looking at this with new eyes, even with the documentation, I have no idea what's going on

Copy link
Collaborator Author

@koperagen koperagen Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idk, hard to tell more than expressed in the code itself. Basically CodeWithConverter is all the information that dataframe integration needs to generate except name of the variable being updated. Variable name is provided by the caller due to separation of concerns. So there's hardly any special "domain" meaning behind it

@@ -7,14 +7,19 @@ import org.jetbrains.kotlinx.jupyter.api.VariableName
* Class representing generated code declarations for a [Marker].
*
* @param declarations The generated code.
* @param converter Needs to provide additional info (name) from org.jetbrains.dataframe.impl.codeGen.CodeGenerator to its callers
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two @param converters now

@@ -27,3 +32,19 @@ public data class CodeWithConverter(val declarations: Code, val converter: (Vari
else -> declarations + "\n" + converter(name)
}
}

public sealed interface CodeConverter : (VariableName) -> Code
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs docs to explain what it does/should do. As it's a foreign concept and essentially just a String->String, which doesn't tell much. Maybe it could steal some docs from CodeWithConverter

fun `groupBy`() {
"""
val groupBy = dataFrameOf("a")("1", "11", "2", "22").groupBy { expr { "a"<String>().length } named "k" }
groupBy.keys.k
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, this works in the same cell?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No :) The test executes each line individually. It's like 2 separate cells here

* @param converter Optional converter for the [Marker], such as a [org.jetbrains.kotlinx.dataframe.api.cast], often used for Jupyter.
*/
public data class CodeWithConverter(val declarations: Code, val converter: (VariableName) -> Code = EmptyConverter) {
public data class CodeWithConverter<T : CodeConverter>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see most functions return a CodeWithWithConverter<*> and since both CodeConverterImpl and ProvidedCodeConverter are essentially just a (VariableName) -> Code, do we need the generic type T?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GroupBy i need to combine names of two generated interfaces together for cast, and so CodeWithConverter provides this additional info. This parameter doesn't make sense for other usages, so there we use more generic CodeWithWithConverter<*>

@Jolanrensen
Copy link
Collaborator

I'll also check whether it influences #662 or not

@koperagen koperagen modified the milestones: 0.14.0, 0.15.0 Sep 17, 2024
@Jolanrensen
Copy link
Collaborator

what needs to be done about this pr?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants