Storm 1.2.0 Released

Posted on Feb 15, 2018 by P. Taylor Goetz

The Apache Storm community is pleased to announce that version 1.2.0 has been released and is available from the downloads page.

Apache Kafka Integration Improvements

This release includes many improvements to Storm's Kafka integration that improve stability, ease configuration, and expose new features. More details can be found in the Kafka client documentation

New Metrics Reporting API

This release introduces a new metrics system for reporting internal statistics (e.g. acked, failed, emitted, transferred, disruptor queue metrics, etc.) as well as a new API for user defined metrics based on the popular Dropwizard Metrics library. Storm includes reporters for gathering metrics data with Ganglia, Graphite, JMX, CSV and the console. Additional metrics systems can be integrated by extending the org.apache.storm.metrics2.reporters.ScheduledStormReporter class. Additional details can be found in the metrics documentation.

Thanks

Special thanks are due to all those who have contributed to Apache Storm -- whether through direct code contributions, documentation, bug reports, or helping other users on the mailing lists. Your efforts are much appreciated.

Changes in this Release

New Feature

  • [STORM-2383] - [storm-hbase] Support HBase as state backend
  • [STORM-2484] - Flux: support bolt+spout memory configuration
  • [STORM-2648] - Kafka spout can't show acks/fails and complete latency when auto commit is enabled
  • [STORM-2694] - Create a listener to handle tuple state changes of the KafkaSpout

Improvement

  • [STORM-2153] - New Metrics Reporting API
  • [STORM-2160] - Expose interface to MetricRegistry via TopologyContext
  • [STORM-2164] - Create simple generic plugin system to register codahale reporters
  • [STORM-2369] - [storm-redis] Use binary type for State management
  • [STORM-2379] - [storm-elasticsearch] switch ES client to Java REST API
  • [STORM-2421] - Support lists of childopts beyond just worker
  • [STORM-2448] - Support running workers using older JVMs/storm versions
  • [STORM-2481] - Upgrade Aether version to resolve Aether bug BUG-451566
  • [STORM-2482] - Refactor the Storm auto credential plugins to be more usable
  • [STORM-2491] - Missing various configuration parameters to configure the Cassandra client used by the Cassandra Bolts
  • [STORM-2501] - Implement auto credential plugin for Hive
  • [STORM-2512] - Change KafkaSpoutConfig in storm-kafka-client to make it work with flux
  • [STORM-2519] - AbstractAutoCreds should look for configKeys in both nimbus and topology configs
  • [STORM-2524] - Set Kafka client.id with storm-kafka
  • [STORM-2527] - Initialize java.sql.DriverManager earlier to avoid deadlock between DriverManager static initializer and driver static initializer
  • [STORM-2528] - Bump log4j version to 2.8.2
  • [STORM-2548] - Simplify KafkaSpoutConfig
  • [STORM-2551] - Thrift client socket timeout
  • [STORM-2553] - JedisCluster does not support password
  • [STORM-2598] - Add proxy server option for dependency resolver
  • [STORM-2601] - the method of getting the nimbus cilent doenot accept timeout parameter
  • [STORM-2616] - Document the built in metrics (just in time to replace them???)
  • [STORM-2618] - Add TridentKafkaStateUpdater for storm-kafka-client
  • [STORM-2650] - Add test for non-string property substitution in Flux tests
  • [STORM-2657] - Update SECURITY.MD
  • [STORM-2663] - Backport STORM-2558 and deprecate storm.cmd on 1.x-branch
  • [STORM-2712] - accept arbitrary number of rows per tuple in storm-cassandra
  • [STORM-2775] - Improve KafkaPartition Metric Names
  • [STORM-2781] - Refactor storm-kafka-client KafkaSpout Processing Guarantees
  • [STORM-2791] - Add support for multiple output fields to FixedTupleSpout
  • [STORM-2796] - Flux: Provide means for invoking static factory methods and improve non-primitive number handling
  • [STORM-2807] - Integration test should shut down topologies immediately after the test
  • [STORM-2854] - Expose IEventLogger to make event logging pluggable
  • [STORM-2860] - Add Kerberos support to Solr bolt
  • [STORM-2862] - More flexible logging in multilang (Python, Ruby, JS)
  • [STORM-2864] - Minor optimisation about trident kafka state
  • [STORM-2867] - Add Consumer lag metrics to Kafka Spout
  • [STORM-2877] - Introduce an option to configure pagination in Storm UI
  • [STORM-2901] - Reuse ZK connection for getKeySequenceNumber
  • [STORM-2914] - Remove enable.auto.commit support from storm-kafka-client
  • [STORM-2917] - Check the config(nimbus.host) before using it to connect

Bug

  • [STORM-1114] - Racing condition in trident zookeeper zk-node create/delete
  • [STORM-2194] - ReportErrorAndDie doesn't always die
  • [STORM-2231] - NULL in DisruptorQueue while multi-threaded ack
  • [STORM-2315] - New kafka spout can't commit offset when ack is disabled.
  • [STORM-2343] - New Kafka spout can stop emitting tuples if more than maxUncommittedOffsets tuples fail at once
  • [STORM-2357] - add At-Most-Once guarantee in KafkaSpout
  • [STORM-2413] - New Kafka spout is ignoring retry limit
  • [STORM-2426] - First tuples fail after worker is respawn
  • [STORM-2429] - non-string values in supervisor.scheduler.meta cause crash
  • [STORM-2431] - the default blobstore.dir is storm.local.dir/blobs which is different from distcache-blobstore.md
  • [STORM-2432] - Storm-Kafka-Client Trident Spout Seeks Incorrect Offset With UNCOMMITTED_LATEST Strategy
  • [STORM-2435] - Logging in storm.js inconsistent to console.log and does not support log levels
  • [STORM-2440] - Kafka outage can lead to lockup of topology
  • [STORM-2449] - Iterator of Redis State may return same key multiple time, with different values
  • [STORM-2450] - supervisor v2 broke ShellBolt/Spout in local mode from storm jar
  • [STORM-2467] - Encoding issues in Kafka consumer
  • [STORM-2478] - BlobStoreTest.testDeleteAfterFailedCreate fails on Windows
  • [STORM-2486] - bin/storm launcher script can be broken if CDPATH is exported from environment
  • [STORM-2488] - The UI user Must be HTTP
  • [STORM-2489] - Overlap and data loss on WindowedBolt based on Duration
  • [STORM-2494] - KafkaSpout does not handle CommitFailedException
  • [STORM-2496] - Dependency artifacts should be uploaded to blobstore with READ permission for all
  • [STORM-2498] - Download Full File link broken in 1.x branch
  • [STORM-2500] - waitUntilReady in PacemakerClient cannot be invoked
  • [STORM-2503] - lgtm.com alerts: bugs in equality and comparison operations
  • [STORM-2505] - Kafka Spout doesn't support voids in the topic (topic compaction not supported)
  • [STORM-2511] - Submitting a topology with name containing unicode getting failed.
  • [STORM-2516] - WindowedBoltExecutorTest.testExecuteWithLateTupleStream is flaky
  • [STORM-2517] - storm-hdfs writers can't be subclassed
  • [STORM-2518] - NPE during uploading dependency artifacts with secured cluster
  • [STORM-2520] - AutoHDFS should prefer cluster-wise hdfs kerberos principal to global hdfs kerberos principal
  • [STORM-2521] - "storm sql" fails since '--jars' can't handle wildcard
  • [STORM-2525] - Fix flaky integration tests
  • [STORM-2535] - test-reset-timeout is flaky. Replace with a more reliable test.
  • [STORM-2536] - storm-autocreds adds jersey 1.x to worker classpath
  • [STORM-2541] - Manual partition assignment doesn't work
  • [STORM-2544] - Bugs in the Kafka Spout retry logic when using manual commit
  • [STORM-2546] - Kafka spout can stall / get stuck due to edge case with failing tuples
  • [STORM-2549] - The fix for STORM-2343 is incomplete, and the spout can still get stuck on failed tuples
  • [STORM-2552] - KafkaSpoutMessageId should be serializable
  • [STORM-2555] - storm-autocreds for HBase doesn't handle impersonation
  • [STORM-2557] - A bug in DisruptorQueue causing severe underestimation of queue arrival rates
  • [STORM-2562] - Use stronger key size for blow fish key generator and get rid of stack trace
  • [STORM-2563] - Remove the workaround to handle missing UGI.loginUserFromSubject
  • [STORM-2564] - We should provide a template for storm-cluster-auth.yaml
  • [STORM-2568] - 'api/vi/topology/:id/lag' returns empty json {}
  • [STORM-2597] - EXT_CLASSPATH strips out directories
  • [STORM-2599] - BasicContainer.getWildcardDir tries to resolve the wildcard character with Paths.get, which prevents workers from booting on Windows
  • [STORM-2602] - "storm.zookeeper.topology.auth.payload" doesn't work even you set it
  • [STORM-2607] - [kafka-client] Consumer group every time with lag 1
  • [STORM-2608] - Out Of Range Offsets Should Be Removed From Pending Queue
  • [STORM-2621] - STORM-2557 broke sojourn time estimation
  • [STORM-2627] - The annotation of "storm.zookeeper.topology.auth.scheme" in Config.java is wrong
  • [STORM-2639] - Kafka Spout incorrectly computes numCommittedOffsets due to voids in the topic (topic compaction)
  • [STORM-2642] - Storm-kafka-client spout cannot be serialized when using manual partition assignment
  • [STORM-2645] - update storm.py to be python3 compatible
  • [STORM-2652] - Exception thrown in JmsSpout open method
  • [STORM-2660] - The Nimbus storm-local directory is relative to the working directory of the shell executing "storm nimbus"
  • [STORM-2666] - Storm-kafka-client spout can sometimes emit messages that were already committed.
  • [STORM-2674] - NoNodeException when ZooKeeper tries to delete nodes
  • [STORM-2675] - KafkaTridentSpoutOpaque not committing offsets to Kafka
  • [STORM-2677] - consider all sampled tuples which took greater than 0 ms processing time
  • [STORM-2682] - Supervisor crashes with NullPointerException
  • [STORM-2690] - resurrect invocation of ISupervisor.assigned() & make Supervisor.launchDaemon() accessible
  • [STORM-2692] - Load only configs specific to the topology in populateCredentials
  • [STORM-2695] - BlobStore uncompress argument should be Boolean
  • [STORM-2705] - DRPCSpout sleeps twice when idle
  • [STORM-2706] - Nimbus stuck in exception and does not fail fast
  • [STORM-2722] - JMSSpout test fails way too often
  • [STORM-2724] - ExecutorService in WaterMarkEventGenerator never shutdown
  • [STORM-2736] - o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with key
  • [STORM-2751] - Remove AsyncLoggingContext from Supervisor
  • [STORM-2756] - STORM-2548 on 1.x-branch broke setting key/value deserializers with the now deprecated setKey/setValue methods
  • [STORM-2764] - HDFSBlobStore leaks file system objects
  • [STORM-2769] - Fast-fail if output stream Id is null
  • [STORM-2771] - Some tests are being run twice
  • [STORM-2779] - NPE on shutting down WindowedBoltExecutor
  • [STORM-2784] - storm-kafka-client KafkaTupleListener method onPartitionsReassigned() should be called after initialization is complete
  • [STORM-2786] - Ackers leak tracking info on failure and lots of other cases.
  • [STORM-2787] - storm-kafka-client KafkaSpout should set 'initialized' flag independently of processing guarantees
  • [STORM-2810] - Storm-hdfs tests are leaking resources
  • [STORM-2811] - Nimbus may throw NPE if the same topology is killed multiple times, and the integration test kills the same topology multiple times
  • [STORM-2814] - Logviewer HTTP server should return 403 instead of 200 if the user is unauthorized
  • [STORM-2815] - UI HTTP server should return 403 if the user is unauthorized
  • [STORM-2825] - storm-kafka-client configuration fails with a ClassCastException if "enable.auto.commit" is present in the consumer config map, and the value is a string
  • [STORM-2826] - KafkaSpoutConfig.builder doesn't set key/value deserializer properties in storm-kafka-client
  • [STORM-2833] - Cached Netty Connections can have different keys for the same thing.
  • [STORM-2835] - storm-kafka-client KafkaSpout can fail to remove all tuples from waitingToEmit
  • [STORM-2843] - Flux: properties file not found when loading resources from classpath
  • [STORM-2844] - KafkaSpout Throws IllegalStateException After Committing to Kafka When First Poll Strategy Set to EARLIEST
  • [STORM-2847] - Exception thrown after rebalance IllegalArgumentException
  • [STORM-2850] - ManualPartitionSubscription assigns new partitions before calling onPartitionsRevoked
  • [STORM-2851] - org.apache.storm.kafka.spout.KafkaSpout.doSeekRetriableTopicPartitions sometimes throws ConcurrentModificationException
  • [STORM-2853] - Deactivated topologies cause high cpu utilization
  • [STORM-2855] - Travis build doesn't work after update of Ubuntu image
  • [STORM-2856] - Make Storm build work on post 2017Q4 Travis Trusty image
  • [STORM-2868] - Address handling activate/deactivate in multilang module files
  • [STORM-2869] - KafkaSpout discards all pending records when adjusting the consumer position after a commit
  • [STORM-2870] - FileBasedEventLogger leaks non-daemon ExecutorService which prevents process to be finished
  • [STORM-2873] - Backpressure implentation delete ephemeral too frequently
  • [STORM-2876] - Some storm-hdfs tests fail with out of memory periodically
  • [STORM-2879] - Supervisor collapse continuously when there is a expired assignment for overdue storm
  • [STORM-2881] - Storm-druid topologies fail with NoSuchMethodError
  • [STORM-2892] - Flux test fails to parse valid PATH environment variable
  • [STORM-2894] - fix some random typos in tests
  • [STORM-2900] - Subject is not populated and NPE is thrown while populating credentials in nimbus.
  • [STORM-2903] - Fix possible NullPointerException in AbstractAutoCreds
  • [STORM-2906] - HDFS and HBase bolt on the same worker fails with GSS no valid credentials exception
  • [STORM-2907] - In a secure cluster with storm-autocreds enabled storm-druid can fail with NoSuchMethodError
  • [STORM-2912] - Tick tuple is being shared without resetting start time and incur side-effect to break metrics
  • [STORM-2913] - STORM-2844 made autocommit and at-most-once storm-kafka-client spouts log warnings on every emit, because those modes don't commit the right metadata to Kafka
  • [STORM-2918] - Upgrade Netty version
  • [STORM-2942] - Remove javadoc and source jars from toollib directory in binary distribution

Documentation

  • [STORM-2620] - Update the docs to better indicate the versions of java tested

Task

  • [STORM-2191] - shorten classpaths in worker and LogWriter commands
  • [STORM-2506] - Make Kafka Spout log its assigned partition
  • [STORM-2874] - Minor style improvements to backpressure code
  • [STORM-2904] - Document Metrics V2

Sub-task

  • [STORM-2161] - Stop shading the codahale metrics library so that it is available to users
  • [STORM-2640] - Deprecate KafkaConsumer.subscribe APIs on 1.x, and make KafkaConsumer.assign the default
  • [STORM-2858] - Fix worker-launcher build