Cloudera

Software Development

Santa Clara, California 294,182 followers

About us

Cloudera is the only data and AI platform company that brings AI to data anywhere: in clouds, data centers, and at the edge. Cloudera delivers 100% of data in all forms, whether it is in Cloudera or anywhere in the entire data estate. The world’s largest organizations rely on Cloudera to fuel insights that boost bottom lines, safeguard against threats, and save lives. Learn more at Cloudera.com.

Recruitment Fraud Alert

It has come to our attention that job seekers have been contacted about fake job opportunities with Cloudera from individuals fraudulently posing as Cloudera employees. These recruiting fraud schemes often include requests for personal information and payments. Be aware that Cloudera will never request a payment as part of its recruitment process. Additionally, Cloudera will never make a job offer without conducting an interview process.

Any information submitted to Cloudera in relation to a job application should only be through our official career portal (https://www.cloudera.com/careers.html). Email communications from Cloudera will come from an email address ending in @cloudera.com. If you are the target of a recruiting scam, consider filing a report with law enforcement authorities. Cloudera is not responsible for fraudulent job offers and/or any claims, damages, expenses, or other inconvenience connected to recruiting scams.

Website
https://www.cloudera.com
Industry
Software Development
Company size
1,001-5,000 employees
Headquarters
Santa Clara, California
Type
Privately Held
Specialties
Big Data, Cloud Computing, machine learning, cloud, Analytics, Artificial Intelligence, Databases, Open Source, Data Science, data warehouse, Data Engineering, IoT, Data, Operational Database, Streaming, Edge to AI, AI, ML, Enterprise Data Cloud, Apache, cdp, hybrid cloud, generative ai, and kubernetes

Locations

  • Primary

    5470 Great America Pkwy

    Santa Clara, California 95054, US

  • 1180 Avenue of the Americas 8th floor

    New York, New York 10018, US

  • 220 Horizon Drive, Suite 117

    Raleigh, North Carolina 27615, US

  • 1420 Spring Hill Road, Suite 600

    McLean, Virginia 22102, US

  • 53 State St

    Boston, Massachusetts 02109, US

  • 314-316 Dean St

    Ground Floor, Front and Rear

    Brooklyn, NY 11217, US

  • 300 S Wacker Dr

    Chicago, Illinois, US

  • Av Santa Fe 94

    Torre A Piso 8, Suite 839

    Mexico City, Distrito Federal 01210, MX

  • 151 W 26th St

    New York, 10001, US

  • 434 Fayetteville St

    Raleigh, North Carolina 27601, US

  • 8281 Greensboro Dr

    Suite 450

    Tysons, VA 22102, US

  • Av Dr Chucri Zaidan 920

    Conjunto 91-Bairro Vila Gertrudes

    Sao Paulo, SP 04794-000, BR

  • Széchenyi István tér 7-8

    level 7

    Budapest, BP 1051, HU

  • Business Central Towers

    Office 807A, Floor 08

    Dubai, Dubai Internet City N/A, AE

  • Arnulfstraße 122

    Munich, Bavaria, DE

  • 88 avenue Charles de Gaulle

    CS 20081

    Neuilly-sur-Seine, Ile De France 92200 , FR

  • Tamarai Tech Park, S.P. Plot, Thiru Vi Ka Industrial Estate, Inner Ring Road

    Level 5, North Block, No: 16-19 & 20A

    Chennai, Chennai 600032, IN

  • No.19 Dongfang dong Road

    Rm 1732, Tower D1, DRC Diplomatic Office Building

    Beijing, Chaoyang District 100016, CN

  • 1 Queens Road

    Suite 1443

    Melbourne, VIC 3004, AU

  • 152 Teheran-ro

    41/F Gangnam Finance Center

    Seoul, Gangnam-gu 06236, KR

  • No. 2299 West Yan'an Road,

    Shanghai Mart Office Tower Floor 26, Suite 2612 No. 2299

    Shanghai, Shanghai 200336, CN

  • One Raffles Place, Tower Two

    Singapore Pte, Ltd Unit No. #28-61

    Singapore, Singapore 048616, SG

  • 1 Pacific Hwy

    Sydney, New South Wales 2060, AU

  • Burex Kyobashi Chuo-ku

    Kyobashi 2-7-14

    Tokyo, Tokyo 104-0031, JP

  • Penrose Two, Penrose Dock Alfred Street, Victorian Quarter

    Cork, IE

  • VAISHNAVI SUMMIT No: 6/B, 7th Main, 80 Feet Road, 3rd Block, Koramangala Industrial Layout, Corporation Ward No. 68

    Bangalore, Koramangala, IN

  • 515 Congress Ave

    1300

    Austin, Texas 78701, US


Updates

  • Cloudera reposted this

    Abhas Ricky

    Cloudera, 16K followers

    Many customers have told me that the model is no longer the bottleneck for enterprise AI inference; a lot of those practitioners, however, say "The document understanding layer is!"

    Context: Every inference call starts with context. In enterprise, that context is overwhelmingly unstructured: PDFs, filings, claims schedules, contracts. If the parse is wrong, the embedding is wrong, the retrieval is wrong, and the model generates a confidently wrong answer from a confidently wrong input.

    "Cloudera Enterprise AI Ecosystem" partner Pulse open-sourced PulseBench-Tab, a frontier benchmark that grades whether a model actually understood a table's structure (rowspans, colspans, nested merges), not just whether the extracted text reads okay:

      • 1,820 human-annotated tables across 9 languages, 4 scripts
      • Real 10-Ks, government reports, corporate disclosures
      • Structures up to 1,000+ cells with deep merged-cell nesting
      • T-LAG: unified scoring for structural accuracy + OCR quality
      • 9 providers evaluated independently, in the open

    Why care if you are deploying on private environments? The rest of the enterprise inference stack runs in tenant. Document parsing has been the forced exception: SaaS-gated, making regulated customers choose between accuracy and sovereignty. Open-source, VPC-deployable parsing at frontier quality closes that gap. Parse → embed → retrieve → generate now lives entirely inside the customer's perimeter. No egress of regulated data.

    So what for some of the customers and their use cases?

      1. #FinancialServices: 10-Ks, fund admin, bordereaux, claims schedules, actuarial reports. A misread merged cell quietly becomes a reserving error the underwriting agent then reasons from.
      2. #Healthcare: clinical trial tables, lab panels, remittance advice, EOBs. Manual abstraction is one of the largest line items in health data ops; VPC-deployable parsing means PHI never leaves the tenant.
      3. #Telecom: vendor interconnect billing, SLA tables, contract exhibits. Industry revenue leakage runs 1–3% of revenue, much of it buried in rate-table complexity.

    As #inference commoditizes, durable moats move one layer down, into the data-to-inference pipeline. End-to-end inference, from raw document to generated answer, can now run entirely inside the customer's VPC at frontier quality. No egress. No SaaS dependency. No structural errors silently corrupting the context window. That's the sovereign AI stack regulated enterprises have been asking for.

    Strong work by the Pulse team Sid Ritvik! Rigorous, open, independently evaluated. Credit to Dushyanth/Moody at S&P Global for the methodology contributions!

    #EnterpriseAI #Cloudera #RAG #AgenticAI #SovereignAI #DocumentIntelligence

    Sid Manchkanti

    Founder, CEO @ Pulse | Transform Documents into Trusted Data

    Introducing PulseBench-Tab: an open-source, frontier benchmark for table parsing.

    Table extraction benchmarks today mostly evaluate cell-level text matching or sequence alignment, which means they miss the structural relationships (rowspan, colspan, adjacency) that determine whether a table was actually understood. We built PulseBench-Tab to close that gap.

    The dataset contains 1,820 human-annotated tables across 9 languages and 4 scripts, drawn from real-world financial filings, government reports, and corporate disclosures. Tables range from simple grids to complex structures with over 1,000 cells and deep merged-cell nesting. Alongside the dataset, the Pulse research team developed T-LAG, a new scoring metric that evaluates structural accuracy and OCR quality in a single unified score.

    Incredibly proud of our research and engineering team for building the industry's most accurate document extraction model, and for doing the work to prove it rigorously in the open.

    Pulse runs in some of the most demanding environments in the world, including Fortune 50 technology companies, top-ten global private equity firms, the largest global insurance firms, and many of the fastest-growing AI startups. We've built Pulse for the highest quality document understanding, not traditional OCR. A lot of the novel research work has been around incorporating vision-language models for production document parsing, and the platform has evolved into a horizontal document intelligence layer used across finance, insurance, healthcare, legal, energy, and technology.

    Thank you to Dushyanth Sekhar and Moody Hadi of S&P Global's Enterprise Data Organization for their academic contributions to the benchmark methodology.

    We evaluated 9 providers on the full dataset, independently. Dataset, evaluation code, research paper, and granular per-language and per-complexity results are all in the comments below.


Funding

Cloudera 13 total rounds

Last Round

Post IPO secondary