hbasecon.com
HBASEATBLOOMBERG//
THE EVOLUTION
OF BLOOMBERG
DATA SYSTEMS
MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY
MAY // 07 // 2015
BLOOMBERG
Leading Data and Analytics provider to the financial industry
DATA IS OUR BUSINESS
September 28: Full Workshop at Bloomberg
September 30: Showcase at Strata Hadoop
Call for papers at:
bloomberglabs.com/data-science
DATA SCIENCE
FOR SOCIAL GOOD:
GOVERNMENT INNOVATION,
PUBLIC HEALTH, ENVIRONMENT,
EDUCATION
• We have a “medium data” problem…
• Speed and availability are paramount
• Hundreds of thousands of users with expensive requests
We’ve built many systems
to address
DATA MANAGEMENT TODAY
DATA MANAGEMENT CHALLENGES
• Single security analytics on Big Iron
• Replication of Systems and Data
• Complexity kills
Top 500 Supercomputer list, 2013
>96% Linux. 100% of top 40.
DATA MANAGEMENT TOMORROW
• Simplicity and performance
• Benefit from external developments
• Retain our independence
• Details matter
THE PREMISE
• Can apply big data techniques to our medium data problem, by addressing gaps in existing open systems
• HBase is a good bet
• Part of a broader whole
• The biggest community wins
CHALLENGES
Our requirements from HBase are:
• Read performance – fast with low variability
• High availability
• Operational simplicity
• Efficient use of good hardware
• Expressive power
Bloomberg has been investing in all these
aspects of HBase
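As an illustration of what "fast with low variability" can mean in practice, the sketch below compares a median read latency with a tail percentile. It is a minimal example, not anything from Bloomberg's systems: the `percentile` helper and the latency samples are hypothetical, and nearest-rank is only one of several percentile definitions.

```python
import math
from typing import List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical read latencies in milliseconds; one slow outlier.
latencies_ms = [2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 40.0, 2.0, 1.8, 2.3]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

# A large gap between the median and the tail signals high variability,
# even when the typical request is fast.
print(f"p50={p50} ms, p99={p99} ms, ratio={p99 / p50:.1f}x")
```

In this made-up sample the median looks healthy, but the tail is dominated by a single slow read, which is the kind of variability the requirement above is concerned with.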
WE’VE MADE THAT BET
WE’RE NOT THE ONLY ONES
Google Cloud Bigtable
AIMING HIGHER
We can make things better
by working together
Let’s be the gold standard
CALL TO ACTION
FURTHER BOLSTER RELIABILITY
Great strides such as HBASE-10070 but more to do
• Improved reconciliation of state between Master, META and ZK
• More determinism in Admin/Master operations
BENEFIT FROM MODERN HARDWARE
• 32 cores - 256GB RAM - SSD - untapped potential
• CPU load max 20%, inadequate throughput
• Multi-RS administratively painful
• Much better story with memory
IMPROVE MULTI-TENANCY
• Mixed workloads challenging
– interactive vs batch
– read vs write
– different read access patterns
• Many solutions in progress
• Administrative simplicity is key
SPARK INTEGRATION
• Analytical frameworks need a distributed database
• Columnar file format != column database
• Integrate with HBase to move towards the universal database
ANALYTICS: EFFICIENCY
• Choice of row and columnar storage engines
• Expose primitives for efficiency:
– Column pruning
– Predicate pushdowns
– Data locality
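To make two of these primitives concrete, here is a minimal sketch of column pruning and predicate pushdown against a toy in-memory table. Everything in it is an assumption for illustration: `ToyTable`, its `scan` method and the sample rows are hypothetical and do not represent the HBase or Spark APIs. The point is only that filtering and projection happen inside the storage layer, so fewer cells reach the caller.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, str]


class ToyTable:
    """Hypothetical in-memory table keyed by row key (not the HBase API)."""

    def __init__(self, rows: Dict[str, Row]) -> None:
        # Keep rows sorted by key, as a sorted key-value store would.
        self._rows = dict(sorted(rows.items()))
        self.cells_returned = 0  # cells handed back to the caller so far

    def scan(
        self,
        columns: Optional[List[str]] = None,
        predicate: Optional[Callable[[Row], bool]] = None,
    ) -> Iterator[Tuple[str, Row]]:
        """Yield (key, row) pairs, filtering and projecting at the source."""
        for key, row in self._rows.items():
            # Predicate pushdown: drop non-matching rows before returning them.
            if predicate is not None and not predicate(row):
                continue
            # Column pruning: return only the requested columns.
            kept = {c: v for c, v in row.items() if columns is None or c in columns}
            self.cells_returned += len(kept)
            yield key, kept


# Hypothetical sample data.
table = ToyTable({
    "IBM": {"price": "170", "exchange": "NYSE", "name": "Intl Business Machines"},
    "AAPL": {"price": "125", "exchange": "NASDAQ", "name": "Apple"},
    "GE": {"price": "27", "exchange": "NYSE", "name": "General Electric"},
})

result = list(table.scan(columns=["price"],
                         predicate=lambda r: r["exchange"] == "NYSE"))
print(result, table.cells_returned)
```

With both primitives applied, the caller receives two cells instead of the nine a full scan of this toy table would return; an analytical engine that filtered and projected after the scan would have paid for all nine.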
THE FUTURE IS BRIGHT
• The state of the “Hadoop Database” union is strong
– Increasing adoption
– Strong foundation
– Great community
• Prominent role in the data & analytics platform of the future
• Let’s go create the future
THANK YOU

HBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg

Editor's Notes

  • #2 Welcome everyone! Today is going to be fantastic and we have quite the agenda for you. In the interest of time, let's dive right in.