DeathStar: Easy, Dynamic, Multi-tenant HBase via YARN
Ishan Chhabra, Nitin Aggarwal
Rocketfuel Inc.
In a not-so-distant past…
- 1000-node cluster
- Rogue applications
- Cannot customize per application
- Hard to capacity plan or support new applications
The Common Solution: Separate Clusters
- Non-uniform network usage
- Different DFSs, leading to a lot of copying of data
- Low cluster utilization
- High lead time for new applications
Solution: DeathStar
- Run HBase on YARN
- Built on top of Slider
Provisioning Model (diagram): a shared "Hangar" cluster plus dedicated App Clusters 1-3; cluster definitions committed via git ((grid/deathstar): $ git commit)
Provisioning steps:
1. Capacity planning and configuration discussion
2. Create a simple JSON config
3. As applications mature, they move from the hangar to their own cluster:
   - Static cluster: good to go
   - Dynamic cluster: make an API call to start, stop, and scale the cluster
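In Slider, the "simple JSON config" is typically a resources.json that declares each component of the app and its instance count and container size. A minimal sketch for an HBase app, following Slider's resource spec (the component names, counts, and memory values here are illustrative, not Rocketfuel's actual settings):

```json
{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {},
  "global": {},
  "components": {
    "HBASE_MASTER": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "1",
      "yarn.memory": "1024"
    },
    "HBASE_REGIONSERVER": {
      "yarn.role.priority": "2",
      "yarn.component.instances": "10",
      "yarn.memory": "4096",
      "yarn.vcores": "1"
    }
  }
}
```

Checking a file like this into git (as in the provisioning model above) gives each tenant a reviewable, versioned cluster definition.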
- Strict isolation
- Common HDFS layer
  - Bulkload
  - MapReduce over snapshots
- Dynamic config and cluster size changes
- Fits into the organization's capacity planning model
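"MapReduce over snapshots" means batch jobs read a table snapshot's HFiles directly from the shared HDFS layer instead of scanning through regionservers, so heavy reads don't load the serving cluster. A sketch using HBase's TableSnapshotInputFormat via TableMapReduceUtil (snapshot name and scratch path are illustrative; assumes HBase client/MapReduce jars on the classpath):

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {

  // Trivial mapper: emits row keys; real per-row logic would go here.
  static class RowKeyMapper
      extends TableMapper<ImmutableBytesWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable key, Result row, Context ctx)
        throws IOException, InterruptedException {
      ctx.write(key, NullWritable.get());
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(HBaseConfiguration.create(), "scan-over-snapshot");
    job.setJarByClass(SnapshotScanJob.class);
    // Read the HFiles of snapshot "events-snap" directly from HDFS,
    // so the scan never touches the serving cluster's regionservers.
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "events-snap",                   // snapshot name (illustrative)
        new Scan(),                      // full scan; restrict as needed
        RowKeyMapper.class,
        ImmutableBytesWritable.class,
        NullWritable.class,
        job,
        true,                            // ship HBase jars with the job
        new Path("/tmp/snap-restore"));  // scratch dir for restored links
    job.setOutputFormatClass(NullOutputFormat.class);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```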
"Dynamic" enables interesting use cases
- Clusters out of thin air
- Hot-swap a new cluster (human error / corruption)
- Easier HBase version upgrades and testing
- Temporary scale-up for backfill
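These dynamic operations correspond to Slider's CLI. A sketch of what they could look like (application name is illustrative, and exact command verbs varied across Slider releases; earlier versions used freeze/thaw for stop/start):

```shell
# Spin up a cluster out of thin air, or stop it again
slider start app1-hbase
slider stop  app1-hbase

# Temporary scale-up for a backfill: flex the regionserver
# component up, then back down when the backfill finishes
slider flex app1-hbase --component HBASE_REGIONSERVER 40
slider flex app1-hbase --component HBASE_REGIONSERVER 10
```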
Key Challenges and Solutions
Another failure mode
- Taken care of by auto restarts
- RM HA in the works
Early Days: Bugs
- Slider did not acknowledge container allocations correctly; fix scheduled for 0.8: SLIDER-828
- Zombie regionservers: not easily reproducible, still debugging
Long-running apps are a secondary use case for YARN
- Logging, an unsolved problem: store logs on local disks, considering ELK
- YARN/Slider lack certain scheduling constraints, e.g. at most x instances per node for spread and availability: custom patch in-house
- Rolling restarts for config changes: unsolved (SLIDER-226)
Data Locality (diagram): locality vs. anti-locality
Metrics Reporting
- Custom Hadoop metrics OpenTSDB reporter
- App name passed via config
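A custom reporter like this would normally plug into Hadoop's metrics2 sink mechanism. A sketch of the wiring; the sink class name, collector address, and app-name property are hypothetical stand-ins for Rocketfuel's in-house reporter:

```properties
# hadoop-metrics2-hbase.properties (illustrative)
# Route HBase metrics to a custom OpenTSDB sink; the class and
# property names below are hypothetical.
hbase.sink.opentsdb.class=com.example.metrics.OpenTsdbSink
hbase.sink.opentsdb.collector=opentsdb.example.com:4242
# App name passed via config, so per-tenant clusters are
# distinguishable as OpenTSDB tags.
hbase.sink.opentsdb.appname=app1-hbase
hbase.sink.opentsdb.period=30
```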
Conclusion:
Is it for me?
Conclusion:
Is it worth it?
Thank you!
Questions?
Reach us at:
ishan@rocketfuel.com
naggarwal@rocketfuel.com
Are you sure you want
to scroll down?
Really?
Dungeons and Dragons await…
Key Insight: HBase Multi-Tenancy and Access Patterns
(diagrams) Three recurring access patterns:
- Online operational store: HBase backing an online service
- Mutable materialized view: HBase maintained by data pipelines over streams
- Transient cache: HBase as a prep stage between pipeline stages

HBaseCon 2015: DeathStar - Easy, Dynamic, Multi-tenant HBase via YARN