Pipe And PipeArgs
Chris Lu edited this page Oct 10, 2016 · 2 revisions
Pipe works exactly like a Unix pipe: the text lines output by one dataset become the input to the next dataset.
Pipe usually works on text line by line, so its input and output are usually strings.
When a tuple is passed to Pipe(), it is converted to tab-separated fields.
When Pipe() needs to output a tuple, the tuple should likewise be written as tab-separated fields.
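The tab-separated convention can be illustrated with a small standalone sketch. The helpers `encodeRow` and `decodeRow` are hypothetical names made up for this example and are not part of the library; they only show what a tuple looks like as a single text line and how a piped command's output line could be read back as fields.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeRow (hypothetical helper) joins the fields of a tuple
// into one tab-separated text line.
func encodeRow(fields ...string) string {
	return strings.Join(fields, "\t")
}

// decodeRow (hypothetical helper) splits a tab-separated line
// back into its fields.
func decodeRow(line string) []string {
	return strings.Split(line, "\t")
}

func main() {
	line := encodeRow("apple", "3")
	fmt.Printf("%q\n", line)      // the tuple as one text line
	fmt.Println(decodeRow(line)) // the line read back as fields
}
```

An external command in the pipe would see each tuple as one such line, so tools like `cut` or `sort -k` can address individual fields by position.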
A typical use case for PipeAsArgs is processing a list of file names, for example when the files' contents should be the input to the next dataset.
fileNames := []string{"1.txt", "2.txt", "3.txt"}
flow.New().Lines(fileNames).PipeAsArgs("cat $1").Map(...).Reduce(...)...
gzippedFileNames := []string{"1.txt.gz", "2.txt.gz", "3.txt.gz"}
flow.New().Lines(gzippedFileNames).PipeAsArgs("zcat $1").Map(...).Reduce(...)...
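To make the `$1` placeholder concrete, here is a minimal sketch of how such a command template could be expanded for each input line. It assumes that `$1`, `$2`, and so on refer to the tab-separated fields of the line, by analogy with shell positional parameters; the page does not spell this out, and `expandArgs` is a hypothetical helper, not the library's implementation. It also does no shell quoting, which real code would need for file names containing spaces.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// expandArgs (hypothetical helper) replaces $1, $2, ... in a command
// template with the tab-separated fields of one input line.
// Assumption: $N means the N-th field, counting from 1.
func expandArgs(template, line string) string {
	fields := strings.Split(line, "\t")
	return os.Expand(template, func(name string) string {
		n, err := strconv.Atoi(name)
		if err != nil {
			// Not a positional placeholder: leave it as $name.
			return "$" + name
		}
		if n < 1 || n > len(fields) {
			// No such field on this line: expand to nothing.
			return ""
		}
		return fields[n-1]
	})
}

func main() {
	for _, line := range []string{"1.txt", "2.txt", "3.txt"} {
		// Each input line yields one command to run.
		fmt.Println(expandArgs("cat $1", line))
	}
}
```

Under this assumption, every input line produces its own command invocation, and the output lines of those commands form the next dataset.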