Class PairStream<K,​V>

  • Type Parameters:
    K - the key type
    V - the value type

    @Unstable
    public class PairStream<K,​V>
    extends Stream<Pair<K,​V>>
    Represents a stream of key-value pairs.
    • Method Detail

      • mapValues

        public <R> PairStream<K,​R> mapValues​(Function<? super V,​? extends R> function)
        Returns a new stream by applying a Function to the value of each key-value pairs in this stream.
        Parameters:
        function - the mapping function
        Returns:
        the new stream
      • flatMapValues

        public <R> PairStream<K,​R> flatMapValues​(FlatMapFunction<? super V,​? extends R> function)
        Return a new stream by applying a FlatMapFunction function to the value of each key-value pairs in this stream.
        Parameters:
        function - the flatmap function
        Returns:
        the new stream
      • aggregateByKey

        public <R> PairStream<K,​R> aggregateByKey​(R initialValue,
                                                        BiFunction<? super R,​? super V,​? extends R> accumulator,
                                                        BiFunction<? super R,​? super R,​? extends R> combiner)
        Aggregates the values for each key of this stream using the given initial value, accumulator and combiner.
        Parameters:
        initialValue - the initial value of the result
        accumulator - the accumulator
        combiner - the combiner
        Returns:
        the new stream
      • aggregateByKey

        public <A,​R> PairStream<K,​R> aggregateByKey​(CombinerAggregator<? super V,​A,​? extends R> aggregator)
        Aggregates the values for each key of this stream using the given CombinerAggregator.
        Parameters:
        aggregator - the combiner aggregator
        Returns:
        the new stream
      • countByKey

        public PairStream<K,​Long> countByKey()
        Counts the values for each key of this stream.
        Returns:
        the new stream
      • reduceByKey

        public PairStream<K,​V> reduceByKey​(Reducer<V> reducer)
        Performs a reduction on the values for each key of this stream by repeatedly applying the reducer.
        Parameters:
        reducer - the reducer
        Returns:
        the new stream
      • groupByKey

        public PairStream<K,​Iterable<V>> groupByKey()
        Returns a new stream where the values are grouped by the keys.
        Returns:
        the new stream
      • groupByKeyAndWindow

        public PairStream<K,​Iterable<V>> groupByKeyAndWindow​(Window<?,​?> window)
        Returns a new stream where the values are grouped by keys and the given window. The values that arrive within a window having the same key will be merged together and returned as an Iterable of values mapped to the key.
        Parameters:
        window - the window configuration
        Returns:
        the new stream
      • reduceByKeyAndWindow

        public PairStream<K,​V> reduceByKeyAndWindow​(Reducer<V> reducer,
                                                          Window<?,​?> window)
        Returns a new stream where the values that arrive within a window having the same key will be reduced by repeatedly applying the reducer.
        Parameters:
        reducer - the reducer
        window - the window configuration
        Returns:
        the new stream
      • peek

        public PairStream<K,​V> peek​(Consumer<? super Pair<K,​V>> action)
        Returns a stream consisting of the elements of this stream, additionally performing the provided action on each element as they are consumed from the resulting stream.
        Overrides:
        peek in class Stream<Pair<K,​V>>
        Parameters:
        action - the action to perform on the element as they are consumed from the stream
        Returns:
        the new stream
      • filter

        public PairStream<K,​V> filter​(Predicate<? super Pair<K,​V>> predicate)
        Returns a stream consisting of the elements of this stream that matches the given filter.
        Overrides:
        filter in class Stream<Pair<K,​V>>
        Parameters:
        predicate - the predicate to apply to each element to determine if it should be included
        Returns:
        the new stream
      • join

        public <V1> PairStream<K,​Pair<V,​V1>> join​(PairStream<K,​V1> otherStream)
        Join the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        Returns:
        the new stream
      • join

        public <R,​V1> PairStream<K,​R> join​(PairStream<K,​V1> otherStream,
                                                       ValueJoiner<? super V,​? super V1,​? extends R> valueJoiner)
        Join the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        valueJoiner - the ValueJoiner
        Returns:
        the new stream
      • leftOuterJoin

        public <V1> PairStream<K,​Pair<V,​V1>> leftOuterJoin​(PairStream<K,​V1> otherStream)
        Does a left outer join of the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        Returns:
        the new stream
      • leftOuterJoin

        public <R,​V1> PairStream<K,​R> leftOuterJoin​(PairStream<K,​V1> otherStream,
                                                                ValueJoiner<? super V,​? super V1,​? extends R> valueJoiner)
        Does a left outer join of the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        valueJoiner - the ValueJoiner
        Returns:
        the new stream
      • rightOuterJoin

        public <V1> PairStream<K,​Pair<V,​V1>> rightOuterJoin​(PairStream<K,​V1> otherStream)
        Does a right outer join of the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        Returns:
        the new stream
      • rightOuterJoin

        public <R,​V1> PairStream<K,​R> rightOuterJoin​(PairStream<K,​V1> otherStream,
                                                                 ValueJoiner<? super V,​? super V1,​? extends R> valueJoiner)
        Does a right outer join of the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        valueJoiner - the ValueJoiner
        Returns:
        the new stream
      • fullOuterJoin

        public <V1> PairStream<K,​Pair<V,​V1>> fullOuterJoin​(PairStream<K,​V1> otherStream)
        Does a full outer join of the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        Returns:
        the new stream
      • fullOuterJoin

        public <R,​V1> PairStream<K,​R> fullOuterJoin​(PairStream<K,​V1> otherStream,
                                                                ValueJoiner<? super V,​? super V1,​? extends R> valueJoiner)
        Does a full outer join of the values of this stream with the values having the same key from the other stream.

        Note: The parallelism of this stream is carried forward to the joined stream.

        Parameters:
        otherStream - the other stream
        valueJoiner - the ValueJoiner
        Returns:
        the new stream
      • window

        public PairStream<K,​V> window​(Window<?,​?> window)
        Returns a new stream consisting of the elements that fall within the window as specified by the window parameter. The Window specification could be used to specify sliding or tumbling windows based on time duration or event count. For example,
         // time duration based sliding window
         stream.window(SlidingWindows.of(Duration.minutes(10), Duration.minutes(1));
        
         // count based sliding window
         stream.window(SlidingWindows.of(Count.(10), Count.of(2)));
        
         // time duration based tumbling window
         stream.window(TumblingWindows.of(Duration.seconds(10));
         
        Overrides:
        window in class Stream<Pair<K,​V>>
        Parameters:
        window - the window configuration
        Returns:
        the new stream
        See Also:
        SlidingWindows, TumblingWindows
      • repartition

        public PairStream<K,​V> repartition​(int parallelism)
        Returns a new stream with the given value of parallelism. Further operations on this stream would execute at this level of parallelism.
        Overrides:
        repartition in class Stream<Pair<K,​V>>
        Parameters:
        parallelism - the parallelism value
        Returns:
        the new stream
      • branch

        public PairStream<K,​V>[] branch​(Predicate<? super Pair<K,​V>>... predicates)
        Returns an array of streams by splitting the given stream into multiple branches based on the given predicates. The predicates are applied in the given order to the values of this stream and the result is forwarded to the corresponding (index based) result stream based on the (index of) predicate that matches.

        Note: If none of the predicates match a value, that value is dropped.

        Overrides:
        branch in class Stream<Pair<K,​V>>
        Parameters:
        predicates - the predicates
        Returns:
        an array of result streams (branches) corresponding to the given predicates
      • updateStateByKey

        public <R> StreamState<K,​R> updateStateByKey​(R initialValue,
                                                           BiFunction<? super R,​? super V,​? extends R> stateUpdateFn)
        Update the state by applying the given state update function to the previous state of the key and the new value for the key. This internally uses IStatefulBolt to save the state. Use Config.TOPOLOGY_STATE_PROVIDER to choose the state implementation.
        Parameters:
        stateUpdateFn - the state update function
        Returns:
        the StreamState which can be used to query the state
      • updateStateByKey

        public <R> StreamState<K,​R> updateStateByKey​(StateUpdater<? super V,​? extends R> stateUpdater)
        Update the state by applying the given state update function to the previous state of the key and the new value for the key. This internally uses IStatefulBolt to save the state. Use Config.TOPOLOGY_STATE_PROVIDER to choose the state implementation.
        Parameters:
        stateUpdater - the state updater
        Returns:
        the StreamState which can be used to query the state
      • coGroupByKey

        public <V1> PairStream<K,​Pair<Iterable<V>,​Iterable<V1>>> coGroupByKey​(PairStream<K,​V1> otherStream)
        Groups the values of this stream with the values having the same key from the other stream.

        If stream1 has values - (k1, v1), (k2, v2), (k2, v3)
        and stream2 has values - (k1, x1), (k1, x2), (k3, x3)
        The the co-grouped stream would contain - (k1, ([v1], [x1, x2]), (k2, ([v2, v3], [])), (k3, ([], [x3]))

        Note: The parallelism of this stream is carried forward to the co-grouped stream.

        Parameters:
        otherStream - the other stream
        Returns:
        the new stream