Why does Spark generate Java code and not Scala code? #18
Comments
Thank you @igreenfield for such an amazing question! I was looking for the reasons in the documentation and old PRs but didn't find any information about that. I've just posted a question on the Spark users mailing list. You can follow the conversation at https://mail-archives.apache.org/mod_mbox/spark-user/201911.mbox/browser or, if not, I'll keep you up to date on this issue. Cheers,
@bartosz25 I was looking into the code generation phase and I think that if the generated code were Scala, it would be easier to reduce the number of lines of code, so many of the compilation failures caused by a method growing beyond the 64KB limit would disappear.
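For reference, the Java code Spark generates for a query can be inspected directly; a minimal sketch, assuming a local SparkSession (the DataFrame and column names here are arbitrary):

```scala
import org.apache.spark.sql.SparkSession
// Brings the debugCodegen() extension method into scope.
import org.apache.spark.sql.execution.debug._

object CodegenInspection extends App {
  val spark = SparkSession.builder()
    .master("local[*]")
    .appName("codegen-inspection")
    .getOrCreate()
  import spark.implicits._

  // A chain of projections; wide plans tend to produce long generated methods.
  val df = (1 to 100).toDF("c0")
    .withColumn("c1", $"c0" + 1)
    .withColumn("c2", $"c1" * 2)

  // Prints the whole-stage generated Java source for each codegen subtree.
  df.debugCodegen()

  spark.stop()
}
```

Running it dumps one generated class per whole-stage codegen subtree, which makes the method-size concern concrete.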
Hi @igreenfield , I've got some answers from the mailing list:
Long story short, it's all about the compilation performance :) Regarding your point about the 64KB limitation, AFAIK Spark has some protection against too-long methods. For instance, it's able to split a too-long function into multiple methods. Did you already have some issues where a "too long" generated method made your pipeline fail? I've never experienced that, so I'm really curious to learn new things and maybe help you overcome the issue by reworking the code.
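To make those protections concrete, here's a hedged sketch of the relevant codegen settings (the keys exist in Spark 2.3+; the values shown are illustrative, not recommendations):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("codegen-limits")
  // Above this generated-method bytecode size, Spark abandons whole-stage
  // codegen for the subtree; 65535 is the JVM's hard per-method cap.
  .config("spark.sql.codegen.hugeMethodLimit", "65535")
  // Fall back to interpreted execution when generated code fails to compile.
  .config("spark.sql.codegen.fallback", "true")
  // Whole-stage codegen can also be disabled entirely as a last resort:
  // .config("spark.sql.codegen.wholeStage", "false")
  .getOrCreate()
```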
Hi @bartosz25
we can schedule a call and I can explain in more detail. Another thing, regarding one of the answers:
I think the ability to return more than one object from a function could make the difference in splitting the huge methods into smaller ones.
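To illustrate that point (a toy sketch, not Spark's actual codegen): a Scala method can hand back several values as a tuple, which makes it easy to carve a huge method into helpers, whereas the equivalent generated Java needs a dedicated holder class or mutable state:

```scala
// A split-out helper returning two intermediate values at once.
def computeBoth(x: Int): (Int, Int) = {
  val doubled = x * 2
  val squared = x * x
  (doubled, squared)
}

// The caller destructures the tuple directly.
val (doubled, squared) = computeBoth(21)
```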
Re @igreenfield At the moment I don't have much time, so I won't be able to help you. Sorry for that; late January should be better. Meantime, maybe you can take a look at my series about Apache Spark customization. I cover how to alter logical and physical plans, how to add a new parser, and so forth. Maybe with that you could write your own code generation which would be much shorter than the code you've just shown me. The articles were published here: https://www.waitingforcode.com/tags/spark-sql-customization Anyway, I doubt the Spark community would agree to switch code generation to Scala because of a single request. But you can always give it a try and ask directly on the mailing list: https://spark.apache.org/community.html Cheers,
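As a taste of what those articles cover, a minimal sketch of the plan-customization entry point (Spark 2.2+); `MyNoopRule` is a hypothetical placeholder that a real customization would replace with an actual plan rewrite:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical rule: returns the plan unchanged; a real rule would
// pattern-match on plan nodes and rewrite them.
case class MyNoopRule() extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

val spark = SparkSession.builder()
  .master("local[*]")
  .withExtensions { extensions =>
    // Registers the rule in Catalyst's optimizer pipeline.
    extensions.injectOptimizerRule(_ => MyNoopRule())
  }
  .getOrCreate()
```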
Hi @bartosz25, Thanks! I will be in touch with you in late January.