Package org.apache.storm.starter
Class ReachTopology
- java.lang.Object
-
- org.apache.storm.starter.ReachTopology
-
public class ReachTopology extends Object
This is a good example of doing complex Distributed RPC on top of Storm. This program creates a topology that can compute the reach for any URL on Twitter in realtime by parallelizing the whole computation.Reach is the number of unique people exposed to a URL on Twitter. To compute reach, you have to get all the people who tweeted the URL, get all the followers of all those people, unique that set of followers, and then count the unique set. It's an intense computation that can involve thousands of database calls and tens of millions of follower records.
This Storm topology does every piece of that computation in parallel, turning what would be a computation that takes minutes on a single machine into one that takes just a couple seconds.
For the purposes of demonstration, this topology replaces the use of actual DBs with in-memory hashmaps.
- See Also:
- Distributed RPC
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
ReachTopology.CountAggregator
static class
ReachTopology.GetFollowers
static class
ReachTopology.GetTweeters
static class
ReachTopology.PartialUniquer
-
Field Summary
Fields Modifier and Type Field Description static Map<String,List<String>>
FOLLOWERS_DB
static Map<String,List<String>>
TWEETERS_DB
-
Constructor Summary
Constructors Constructor Description ReachTopology()
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description static LinearDRPCTopologyBuilder
construct()
static void
main(String[] args)
-
-
-
Method Detail
-
construct
public static LinearDRPCTopologyBuilder construct()
-
-