Interface Function
-
- All Superinterfaces:
EachOperation
,Operation
,Serializable
- All Known Implementing Classes:
BaseFunction
,CombinerAggregatorInitImpl
,ConsumerExecutor
,FilterExecutor
,FlatMapFunctionExecutor
,MapFunctionExecutor
,PrintFunction
,Split
,StringLength
,TridentReach.ExpandList
,TridentWordCount.Split
,TuplifyArgs
public interface Function extends EachOperation
A function takes in a set of input fields and emits zero or more tuples as output. The fields of the output tuple are appended to the original input tuple in the stream. If a function emits no tuples, the original input tuple is filtered out. Otherwise, the input tuple is duplicated for each output tuple.For example, if you have the following function:
```java public class MyFunction extends BaseFunction { public void execute(TridentTuple tuple, TridentCollector collector) { for(int i=0; i < tuple.getInteger(0); i++) { collector.emit(new Values(i)); } } } ```
Now suppose you have a stream in the variable `mystream` with the fields `["a", "b", "c"]` with the following tuples:
``` [1, 2, 3] [4, 1, 6] [3, 0, 8] ``` If you had the following code in your topology definition:
```java mystream.each(new Fields("b"), new MyFunction(), new Fields("d"))) ```
The resulting tuples would have the fields `["a", "b", "c", "d"]` and look like this:
``` [1, 2, 3, 0] [1, 2, 3, 1] [4, 1, 6, 0] ```
In this case, the parameter `new Fields("b")` tells Trident that you would like to select the field "b" as input to the function, and that will be the only field in the Tuple passed to the `execute()` method. The value of "b" in the first tuple (2) causes the for loop to execute twice, so 2 tuples are emitted. similarly the second tuple causes one tuple to be emitted. For the third tuple, the value of 0 causes the `for` loop to be skipped, so nothing is emitted and the incoming tuple is filtered out of the stream.
### Configuration If your `Function` implementation has configuration requirements, you will typically want to extend
BaseFunction
and override theOperation.prepare(Map, TridentOperationContext)
method to perform your custom initialization.### Performance Considerations Because Trident Functions perform logic on individual tuples -- as opposed to batches -- it is advisable to avoid expensive operations such as database operations in a Function, if possible. For data store interactions it is better to use a
State
orQueryFunction
implementation since Trident states operate on batch partitions and can perform bulk updates to a database.
-
-
Method Summary
All Methods Instance Methods Abstract Methods Modifier and Type Method Description void
execute(TridentTuple tuple, TridentCollector collector)
Performs the function logic on an individual tuple and emits 0 or more tuples.
-
-
-
Method Detail
-
execute
void execute(TridentTuple tuple, TridentCollector collector)
Performs the function logic on an individual tuple and emits 0 or more tuples.- Parameters:
tuple
- The incoming tuplecollector
- A collector instance that can be used to emit tuples
-
-