Integrate Block Operators more neatly with the DAG #214

Open
tomerk opened this issue Jan 26, 2016 · 2 comments

tomerk commented Jan 26, 2016

Currently, to stay easily chainable with the rest of the code, block operators (such as block solves and block transformers) take a single complete RDD and manually split it into multiple blocks in a way that is hidden from the DAG.

If we add some DAG rewriting rules to detect this and integrate block operators better with the DAG, we should be able to take advantage of optimizations like auto-caching more effectively, and we can allow the block operators to operate on blocks lazily.

etrain commented Jan 26, 2016

One thing that makes the block solves tricky is that the blocks are not independent. That is, we pass a Seq[RDD[T]] because the solution to the second block depends on the solution to the first block. It is not clear to me how to capture this in the DAG.
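
To make the dependency concrete, here is a minimal sketch of one pass of block coordinate descent for least squares. It is an illustration only, not the project's solver: plain `Seq[Double]` columns stand in for `RDD[T]` blocks, and the names `BlockDependencySketch`, `Block`, `dot`, and `solveBlocks` are hypothetical.

```scala
// Illustration only: plain Scala collections stand in for RDDs,
// and every name here is hypothetical rather than project API.
object BlockDependencySketch {
  // Assumption: each block is a single feature column.
  type Block = Seq[Double]

  def dot(a: Seq[Double], b: Seq[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // One pass of block coordinate descent for least squares.
  // The residual left by block i is the input to block i + 1,
  // so the blocks have to be visited in order.
  def solveBlocks(blocks: Seq[Block], labels: Seq[Double]): (Seq[Double], Seq[Double]) =
    blocks.foldLeft((Vector.empty[Double], labels)) {
      case ((weights, residual), block) =>
        // Fit this block against whatever earlier blocks left unexplained.
        val w = dot(block, residual) / dot(block, block)
        val nextResidual = residual.zip(block).map { case (r, x) => r - w * x }
        (weights :+ w, nextResidual)
    }
}
```

The `foldLeft` threads the residual from one block to the next, which is the edge a DAG representation would need to expose for each block.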

tomerk commented Jan 26, 2016

I think it should be able to work the same way the GatherTransformer works: a TransformerNode that takes multiple RDDs together as input.
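
A rough sketch of that idea, under heavy assumptions: the `Node`, `Source`, and `MultiInput` types below are hypothetical and are not the existing TransformerNode or GatherTransformer code, and `Seq[A]` again stands in for `RDD[A]`. The point is only that a node can list several upstream nodes as dependencies and receive all of their outputs together.

```scala
// Illustration only: a toy DAG, not the project's node classes.
object MultiInputNodeSketch {
  // Hypothetical node type; Seq[A] stands in for RDD[A].
  sealed trait Node[A] {
    def eval(): Seq[A]
  }

  // A leaf that produces one block when asked.
  final case class Source[A](data: () => Seq[A]) extends Node[A] {
    def eval(): Seq[A] = data()
  }

  // A node whose dependencies are several upstream nodes at once.
  // The dependencies are ordinary fields, so a graph rewriter could
  // inspect them; nothing is computed until eval() is called.
  final case class MultiInput[A, B](deps: Seq[Node[A]], f: Seq[Seq[A]] => Seq[B])
      extends Node[B] {
    def eval(): Seq[B] = f(deps.map(_.eval()))
  }
}
```

For example, two block sources can feed one node that sums them element-wise:

```scala
import MultiInputNodeSketch._

val blockA = Source(() => Seq(1.0, 2.0))
val blockB = Source(() => Seq(10.0, 20.0))
val summed = MultiInput[Double, Double](Seq(blockA, blockB), blocks => blocks.transpose.map(_.sum))
summed.eval() // Seq(11.0, 22.0)
```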
