Can you give better examples of automagical spout/bolt wiring? #7
Breeze does the wiring for you. Just configure the requirements per step and it should just work.
With the following configuration both bolts read their input from the spout:
<spout ... outputFields="a"/>
<bolt ... signature="f(a)"/>
<bolt ... signature="g(a)"/>
With the following configuration b3 reads the output from b1 and b2:
<bolt id="b1" ... outputFields="x"/>
<bolt id="b2" ... outputFields="y"/>
<bolt id="b3" ... signature="f(x, y)"/>
Behind the scenes, the processing steps are currently sequential, since we focus mostly on throughput, not latency. In case of ambiguity the compiler keeps the order as listed in the topology XML:
[starter-demo] INFO eu.icolumbo.breeze.build.TopologyCompilation - Compiled as: {[spout 'feed']=[[bolt 'greet'], [bolt 'mark'], [bolt 'register']]}
For parallel execution, one option would be to split the stream automatically where possible. With something like "depends-on" the user could enforce a processing order when needed.
The programmatic configuration example confused people. A kickstarter should demonstrate the ease of use. However, we could indeed demonstrate more complicated flows.
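To make the field-name matching above concrete, here is a minimal Java sketch of how a step's inputs could be traced back to the steps that emit them. It is not Breeze's actual compiler: the class `FieldWiringSketch`, the `Step` type and the `producers` method are hypothetical names invented for illustration, and the sketch only models the logical field dependencies, not the sequential execution order described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of Breeze. */
final class FieldWiringSketch {

    /** A processing step: an id, the fields it reads and the fields it emits. */
    static final class Step {
        final String id;
        final List<String> inputs;
        final List<String> outputs;

        Step(String id, List<String> inputs, List<String> outputs) {
            this.id = id;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /**
     * Maps each step id to the ids of the steps that emit its input fields.
     * Steps are expected in the order they are listed in the topology; a
     * field counts as available only once an earlier step has emitted it.
     */
    static Map<String, List<String>> producers(List<Step> steps) {
        Map<String, String> emittedBy = new LinkedHashMap<>();
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Step step : steps) {
            List<String> sources = new ArrayList<>();
            for (String field : step.inputs) {
                String source = emittedBy.get(field);
                if (source == null) {
                    throw new IllegalStateException(
                        "No earlier step emits '" + field + "' for " + step.id);
                }
                if (!sources.contains(source)) {
                    sources.add(source);
                }
            }
            result.put(step.id, sources);
            // Register this step's output fields for the steps listed after it.
            for (String field : step.outputs) {
                emittedBy.put(field, step.id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Step> steps = Arrays.asList(
            new Step("s", Collections.<String>emptyList(), Arrays.asList("a")),
            new Step("b1", Arrays.asList("a"), Arrays.asList("x")),
            new Step("b2", Arrays.asList("a"), Arrays.asList("y")),
            new Step("b3", Arrays.asList("x", "y"), Collections.<String>emptyList()));
        // prints {s=[], b1=[s], b2=[s], b3=[b1, b2]}
        System.out.println(producers(steps));
    }
}
```

The example combines the two configurations above into one hypothetical topology: b1 and b2 both resolve to the spout because they only need field a, while b3 resolves to b1 and b2 because it needs x and y.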
The kickstarter example doesn't match your example. Here is your example:
<spout ... outputFields="a"/>
<bolt ... signature="f(a)"/>
<bolt ... signature="g(a)"/>
In this case both bolts would have the spout as input. Here is the Spring config in the kickstarter:
<breeze:spout id="feed" beanType="java.util.Random" signature="nextLong()" outputFields="number">
  <breeze:transaction ack="setSeed(number)"/>
</breeze:spout>
<breeze:bolt id="greet" beanType="com.example.Greeter" signature="greet(number)" outputFields="heading"/>
<breeze:bolt id="mark" beanType="com.example.Marker" signature="mark(number)" outputFields="judge isOdd">
Based on your example above, both the 'greet' and the 'mark' bolts would have the 'feed' spout as their only input. The kickstarter programmatic config:
SpringBolt greet = new SpringBolt(Greeter.class, "greet(number)", "heading");
greet.setPassThroughFields("number");
builder.setBolt("greet", greet).noneGrouping("feed");
SpringBolt mark = new SpringBolt(Marker.class, "mark(number)", "source", "isEven");
mark.setPassThroughFields("heading");
builder.setBolt("mark", mark).noneGrouping("greet");
This shows the 'mark' bolt with only the 'greet' bolt as input. That's why I say it's ambiguous and confusing. I think the Spring config should have an option for explicit grouping, for example a child element for bolts:
<breeze:bolt id="greet" beanType="com.example.Greeter" signature="greet(number)" outputFields="heading">
  <breeze:grouping source="feed" type="shuffle"/>
  <breeze:grouping source="someOtherInput" type="none"/>
</breeze:bolt>
Effectively, bolt mark and bolt greet both use spout feed as their input. However, with true parallel execution (channel split) the latency could be improved and the traffic between steps (tuple fields) reduced. The split functionality should be easy to implement. Aggregation is a bit more tricky with high volumes. I want to prevent the reference bloat from Storm. For example, Breeze can detect that mark and greet may run in parallel. It is also clear that the results need to be aggregated for register. On the other hand, not everybody wants the aggregation overhead either. How about the following?
<spout ... outputFields="x">
<split>
<pipe>
<bolt ... signature="f(x)" outputFields="a"/>
<bolt ... signature="f(a)" outputFields="z"/>
</pipe>
<pipe>
<bolt ... signature="f(x)" outputFields="z"/>
</pipe>
</split>
<bolt ... signature="f(z)"/>
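As a rough illustration of how steps that may run in parallel could be detected from field names alone, the following Java sketch groups steps into stages. It is only a sketch under assumptions, not Breeze code: `ParallelStagesSketch`, `Step` and `stages` are hypothetical names, and the input fields given to register (heading and isOdd) are assumed for the example, since the thread does not show its signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of Breeze. */
final class ParallelStagesSketch {

    /** A processing step: an id, the fields it reads and the fields it emits. */
    static final class Step {
        final String id;
        final List<String> inputs;
        final List<String> outputs;

        Step(String id, List<String> inputs, List<String> outputs) {
            this.id = id;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /**
     * Groups step ids into stages. A step is placed one stage after the
     * latest stage that emits any of its input fields, so steps that share
     * a stage do not depend on each other and could run in parallel.
     */
    static List<List<String>> stages(List<Step> steps) {
        Map<String, Integer> fieldStage = new HashMap<>();
        List<List<String>> stages = new ArrayList<>();
        for (Step step : steps) {
            int stage = 0;
            for (String field : step.inputs) {
                Integer emittedAt = fieldStage.get(field);
                if (emittedAt == null) {
                    throw new IllegalStateException(
                        "No earlier step emits '" + field + "' for " + step.id);
                }
                stage = Math.max(stage, emittedAt + 1);
            }
            while (stages.size() <= stage) {
                stages.add(new ArrayList<String>());
            }
            stages.get(stage).add(step.id);
            // Record the stage at which each output field becomes available.
            for (String field : step.outputs) {
                fieldStage.put(field, stage);
            }
        }
        return stages;
    }

    public static void main(String[] args) {
        List<Step> steps = Arrays.asList(
            new Step("feed", Collections.<String>emptyList(), Arrays.asList("number")),
            new Step("greet", Arrays.asList("number"), Arrays.asList("heading")),
            new Step("mark", Arrays.asList("number"), Arrays.asList("isOdd")),
            // Assumed inputs: the thread does not show register's signature.
            new Step("register", Arrays.asList("heading", "isOdd"),
                Collections.<String>emptyList()));
        // prints [[feed], [greet, mark], [register]]
        System.out.println(stages(steps));
    }
}
```

Here greet and mark share a stage because both only need number, while register lands in a later stage because it needs the output of both. That later stage is where the aggregation cost mentioned above would arise.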
So the programmatic example in the kickstarter didn't match the intended functionality of the Spring config version from the kickstarter?
The programmatic example did match the intended functionality of the spring config. Three notes:
How do you wire multiple bolts to a single spout? Or wire a bolt to the output of multiple bolts? It's pretty easy programmatically, but I'm not sure how to do it with a Spring context file. The example in the kickstarter project is too simple, and it's ambiguous.
The README says: "The SpringSpout and SpringBolt classes are configured with a Spring bean and a method signature. The compiler automagically orders the processing steps based on the field names."
But in the kickstarter example the Greeter and Marker bolts both take 'number' as their only argument. So it's not clear whether they are both going to be wired to the spout for input or whether Marker will be wired to the output of Greeter. It was clear in the programmatic example.
Also, why was the programmatic config removed from the kickstarter example?