Apache Storm 2.2.0 Released

Posted on Jun 30, 2020 by Govind Menon

The Apache Storm community is pleased to announce that version 2.2.0 has been released and is available from the downloads page.

This release includes a number of code improvements and important bug fixes that improve Apache Storm's performance, stability and fault tolerance. We encourage users of previous versions to upgrade to this latest release.

Thanks

Special thanks are due to all those who have contributed to Apache Storm -- whether through direct code contributions, documentation, bug reports, or helping other users on the mailing lists. Your efforts are much appreciated.

Changes in this Release

New Feature

  • [STORM-1293] - port backtype.storm.messaging.netty-integration-test to java
  • [STORM-1304] - port backtype.storm.submitter-test to java
  • [STORM-3259] - NUMA support for Storm
  • [STORM-3479] - HB timeout configurable on a topology level
  • [STORM-3480] - Implement One Executor Per Worker RAS Option
  • [STORM-3482] - Implement One Worker Per Component Option
  • [STORM-3492] - Adding configuration for blacklisting scheduler behavior
  • [STORM-3585] - Change ConstraintSolverStrategy to allow max co-Location Count for spreading components
  • [STORM-3627] - Allow use of shortNames for Metrics for worker in Metrics-V2
  • [STORM-3636] - Enable SSL credentials auto reload for storm UI, LogViewer and DRPC server

Improvement

  • [STORM-2749] - Remove state spout since it's never supported by storm
  • [STORM-3066] - Storm Flux variable substitution
  • [STORM-3071] - change checkstyle plugin setting logViolationsToConsole to true
  • [STORM-3257] - 'storm kill' command line should be able to continue on error
  • [STORM-3434] - server: fix all checkstyle warnings
  • [STORM-3484] - Add Blacklisted Supervisors Info To UI
  • [STORM-3490] - Add checkstyle rule RedundantModifier
  • [STORM-3493] - Allow overriding python interpreter by environment variable
  • [STORM-3494] - Use UserGroupInformation to login to HDFS only once per process
  • [STORM-3507] - Need feedback from blacklisting to scheduling
  • [STORM-3509] - Improved RAS scheduling
  • [STORM-3529] - Catch and log RetriableException in KafkaOffsetMetric
  • [STORM-3530] - Improve Scheduling Failure Message
  • [STORM-3534] - Add generic resources to UI
  • [STORM-3536] - Add Generic-resources.md
  • [STORM-3538] - Add Meter for sendSupervisorAssignments exception
  • [STORM-3539] - Add metric for worker start time out
  • [STORM-3541] - allow reporting of v2 metrics api using metrics tick
  • [STORM-3543] - Avoid iterators for task hook info objects
  • [STORM-3545] - blob update spews errors until cleanup occurs after topology killed
  • [STORM-3548] - Remove iterator from Task.sendUnanchored
  • [STORM-3555] - Add meter for tracking errors killing workers
  • [STORM-3557] - allow health checks to pass on timeout
  • [STORM-3570] - add config name when validation fails with ClassNotFoundException
  • [STORM-3571] - Add topology info to slot warning messages
  • [STORM-3575] - Fix Scheduler Status on failure after multiple attempts
  • [STORM-3581] - Change log level to info to show the config classes being used for validation
  • [STORM-3584] - Support getting version info from a wildcard classpath entry
  • [STORM-3587] - Allow Scheduler futureTask to gracefully exit and register message on timeout
  • [STORM-3588] - RAS scheduler should not pre-empt and evict topologies due to generic resource
  • [STORM-3589] - Iterator in BaseResourceStrategy is potentially buggy
  • [STORM-3591] - Improve GRAS Strategy Log
  • [STORM-3594] - Add checkstyle rule WhitespaceAfter
  • [STORM-3596] - Feed send assignment status into blacklist scheduler
  • [STORM-3600] - ResourceAwareScheduler taking too long to schedule
  • [STORM-3604] - HealthChecker should print out error message when it fails
  • [STORM-3605] - add meter to track scheduling timeouts
  • [STORM-3614] - update SystemBolt metrics to use v2 API
  • [STORM-3616] - If running upload credentials and no autocreds are found, we should have an option to fail
  • [STORM-3618] - add meter for tracking internal scheduling errors
  • [STORM-3619] - Add null check for the topology name
  • [STORM-3625] - Storm CLI should validate topology name on client side
  • [STORM-3632] - Reduce SimpleSaslServerCallbackHandler supervisor logging
  • [STORM-3633] - Add message that supervisor is killing detached workers
  • [STORM-3634] - validate numa ports are contained in supervisor.slots.ports
  • [STORM-3640] - timed out health check processes should be killed

Bug

  • [STORM-2483] - wrong parameters order
  • [STORM-3498] - Fix missing cases of invoking bash directly without /bin/env
  • [STORM-3504] - AsyncLocalizerTest is stubbing file system operations
  • [STORM-3510] - WorkerState.transferLocalBatch backpressure resend logic fix
  • [STORM-3511] - Nimbus logs flooded with TTransportException error messages (because of thrift 0.12.0)
  • [STORM-3512] - Nimbus failing on startup with `GLIBC_2.12' not found
  • [STORM-3519] - Change ConstraintSolverStrategy::backtrackSearch to avoid StackOverflowException
  • [STORM-3523] - supervisor restarts when releasing slot with missing file
  • [STORM-3527] - Container.getWorkerUser() should check if the user name is empty
  • [STORM-3540] - Pacemaker race condition can cause continual reconnection
  • [STORM-3549] - use of topology specific jaas conf doesn't work with kafka
  • [STORM-3551] - Fix LocalAssignment Equivalency in Slot for Generic Resource Aware Scheduler
  • [STORM-3552] - Storm CLI set_log_level no longer updates the log level
  • [STORM-3567] - Topology UI page is showing total resources for each component if not scheduled
  • [STORM-3568] - Topology UI page "Change Log Level" should not allow empty logger name
  • [STORM-3572] - Topology visualization can fail if executor is not up
  • [STORM-3577] - upload-credentials Breaks Topology in secure cluster
  • [STORM-3580] - Config overrides supplied using -c in storm.py not passed to all commands
  • [STORM-3583] - Handle exceptions when AsyncLocalizer tries to get local resources
  • [STORM-3598] - Storm UI visualization throws NullPointerException
  • [STORM-3602] - loadaware shuffle can overload local worker
  • [STORM-3606] - AutoTGT shouldn't invoke TGT renewal thread (from UserGroupInformation.loginUserFromSubject)
  • [STORM-3609] - ClassCastException when credentials are updated for ICredentialsListener spout/bolt instances
  • [STORM-3613] - storm.py should include lib-worker instead of lib directory in the classpath while submitting a topology
  • [STORM-3620] - Data corruption can happen when components are multi-threaded because of non thread-safe serializer
  • [STORM-3622] - Race Condition in CachedThreadStatesGaugeSet registered at SystemBolt
  • [STORM-3623] - v2 metrics tick reports all worker metrics within each executor
  • [STORM-3626] - storm-kafka-migration should pull in storm-client as "provided" dependency
  • [STORM-3629] - Logviewer should always allow admins to access logs
  • [STORM-3631] - Wrong format of logs.users/groups in topology conf can cause supervisor/logviewer to terminate

Comment

  • [STORM-3231] - TopologyBySubmissionTimeComparator does not consider priority

Dependency upgrade

  • [STORM-3608] - Upgrade snakeyaml from 1.11 to 1.26 (latest)

Documentation

  • [STORM-3508] - The download links on the setting up environment page are broken
  • [STORM-3615] - Add documentation for Storm NUMA support

Task

  • [STORM-3211] - WindowedBoltExecutor NPE if wrapped bolt returns null from getComponentConfiguration
  • [STORM-3306] - Some tests in storm-core/test/jvm/org/apache/storm/integration/TopologyIntegrationTest.java are using Thrift to build topologies. They should use TopologyBuilder instead.

Test

  • [STORM-3475] - Add ConstraintSolverStrategy Unit Test
  • [STORM-3495] - TestConstraintSolverStrategy is not stable on travis
  • [STORM-3503] - Create unit tests for blacklistOnBadSlot option
  • [STORM-3525] - Large Constraint Solver test fails on some VMs
  • [STORM-3651] - Give producerTasks in ExecutorTransferMultiThreadingTest.testExecutorTransfer more time to finish

Sub-task

  • [STORM-2687] - Group Topology executors by network proximity needs and schedule them on "network wise" close slots
  • [STORM-3486] - Upgrade to Jersey 2.29
  • [STORM-3578] - ClientAuthUtils.insertWorkerTokens removes existing and new WorkerToken altogether if they are equal
  • [STORM-3579] - Fix Kerberos connection from Worker to Nimbus/Supervisor
  • [STORM-3599] - Bump the rocksdbjni to 5.18.4
  • [STORM-3607] - Document the exceptions topologies will see from TGT renewal thread