OpenStreetMap US

Thematic extracts of OpenStreetMap data in cloud-native file formats

OpenStreetMap’s native file format is OSM PBF, but the 80 GB ‘planet file’ is unwieldy, and the format is not supported by all GIS software. Layercake is OSM data extracted into thematic layers (buildings, transportation, etc.) and converted to cloud-native file formats that are easy to use with software ranging from DuckDB to QGIS.

Layercake data is available from data.openstreetmap.us. Generally, you’ll put the URL for the layer you’d like to use into DuckDB or other software that supports GeoParquet files.

Schema

All Layercake layers are available as GeoParquet files. Every layer has the following columns:

  • type (string): the OSM element type (node, way, or relation)
  • id (int64): the OSM element ID
  • bbox (struct): the xmin, ymin, xmax, and ymax of the element’s geometry
  • geometry (binary): a WKB-encoded Geometry
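
To make the geometry column more concrete, here is a minimal Python sketch of what a WKB-encoded value looks like for the simplest case, a 2D point. It assumes the standard WKB layout (a byte-order flag, a uint32 type code, then float64 coordinates). The function name decode_wkb_point and the sample bytes are hypothetical, built here for illustration and not read from a Layercake file. In practice a library or DuckDB's spatial extension would normally do this decoding for you.

```python
import struct


def decode_wkb_point(wkb: bytes) -> tuple[float, float]:
    """Decode a 2D WKB Point into (x, y), i.e. (longitude, latitude)."""
    # Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
    endian = "<" if wkb[0] == 1 else ">"
    # Bytes 1-4 hold a uint32 geometry type code; 1 means a 2D Point.
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    if geom_type != 1:
        raise ValueError(f"not a 2D Point (WKB type {geom_type})")
    # Bytes 5-20 hold two float64 coordinates: x, then y.
    x, y = struct.unpack_from(endian + "dd", wkb, 5)
    return x, y


# Hypothetical sample value, constructed here rather than taken from Layercake.
sample = struct.pack("<BIdd", 1, 1, -104.99, 39.74)
print(decode_wkb_point(sample))  # prints (-104.99, 39.74)
```

Ways and relations generally carry other geometry types (lines, polygons, multipolygons), which this sketch deliberately rejects.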

Each layer also has additional columns corresponding to OSM tags. These vary by layer and are documented below. Most are strings, but some columns have been parsed into richer types (lists, maps, or integers).

The following layers are currently available:

buildings

URL: https://data.openstreetmap.us/layercake/buildings.parquet
Columns: building, building:levels, building:flats, building:material, building:colour, building:part, building:use, name, addr:housenumber, addr:street, addr:city, addr:postcode, website, wikipedia, wikidata, height, roof:shape, roof:levels, roof:colour, roof:material, roof:orientation, roof:height, start_date, access, wheelchair

highways

URL: https://data.openstreetmap.us/layercake/highways.parquet
Columns: highway, service, crossing, cycleway, cycleway:left, cycleway:right, footway, construction, name, ref, bridge, covered, lanes, layer, lit, sidewalk, smoothness, surface, tracktype, tunnel, wheelchair, width, access, bicycle, bus, foot, hgv, maxspeed, motor_vehicle, motorcycle, oneway, toll

boundaries

URL: https://data.openstreetmap.us/layercake/boundaries.parquet
Columns: boundary, admin_level, name (list), names (map), official_name (list), official_names (map), int_name (list), alt_name (list), alt_names (map), place, border_type, ISO3166-2, ISO3166-1:alpha2, ISO3166-1:alpha3, wikidata, wikipedia, disputed_by (list), claimed_by (list), controlled_by (list), recognized_by (list)

settlements

URL: https://data.openstreetmap.us/layercake/settlements.parquet
Columns: place, name, names (map), alt_name, alt_names (map), official_name, official_names (map), wikidata, wikipedia, population (int64)

parks

URL: https://data.openstreetmap.us/layercake/parks.parquet
Columns: boundary, protected_area, leisure, name (list), names (map), short_name (list), short_names (map), official_name (list), official_names (map), protect_class, protection_title, protected, iucn_level, access, operator, operator:type, owner, ownership, start_date, related_law, website, wikidata, wikipedia
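
The five URLs listed above share one pattern, which a script can use to pick a layer by name. The Python sketch below assumes that pattern holds only for the layers documented on this page; the helper name layer_url is hypothetical and not part of any Layercake tooling.

```python
# Layer names taken from the list above.
LAYERS = ("buildings", "highways", "boundaries", "settlements", "parks")


def layer_url(layer: str) -> str:
    """Return the GeoParquet URL for one of the layers documented above."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer {layer!r}; expected one of {LAYERS}")
    # Assumption: URL pattern inferred from the five URLs listed on this page.
    return f"https://data.openstreetmap.us/layercake/{layer}.parquet"


print(layer_url("parks"))
```

The resulting string can be pasted into a DuckDB query or handed to any other software that reads GeoParquet over HTTP.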

Examples

One use for Layercake is to download only the subset of data that matters for your use case. For example, you could download buildings with more than 5 floors that fall inside Colorado's bounding box, and write the results to a GeoJSON file for further processing.

D install spatial;
D load spatial;  -- provides the GDAL writer used by copy below
D copy (
    from 'https://data.openstreetmap.us/layercake/buildings.parquet'
    select type as osm_type, id as osm_id,
           building, "building:levels", name, height, geometry
    where try_cast("building:levels" as int) > 5
      and bbox.xmin > -109.05
      and bbox.ymin > 36.99
      and bbox.xmax < -102.04
      and bbox.ymax < 41.00
  ) to 'colorado_tall_buildings.geojson' with (format GDAL, driver 'GeoJSON');
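
The four bbox comparisons in this query form a containment test: a building is kept only when its whole bounding box lies inside the query window, so a building straddling the window's edge is dropped. The Python sketch below spells out that logic next to the looser "intersects" alternative. The BBox type, both function names, and the two sample boxes are hypothetical and exist only for illustration; the window coordinates are the ones used in the query.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def within(inner: BBox, outer: BBox) -> bool:
    """True when inner lies strictly inside outer (the test used in the query)."""
    return (inner.xmin > outer.xmin and inner.ymin > outer.ymin
            and inner.xmax < outer.xmax and inner.ymax < outer.ymax)


def intersects(a: BBox, b: BBox) -> bool:
    """True when the two boxes overlap or touch at all (a looser alternative)."""
    return (a.xmin <= b.xmax and a.xmax >= b.xmin
            and a.ymin <= b.ymax and a.ymax >= b.ymin)


# Query window copied from the example above.
COLORADO = BBox(-109.05, 36.99, -102.04, 41.00)

# Hypothetical element boxes, made up for illustration.
inside = BBox(-104.99, 39.73, -104.98, 39.74)
straddling = BBox(-102.05, 40.00, -102.03, 40.01)

print(within(inside, COLORADO))          # True
print(within(straddling, COLORADO))      # False: xmax is east of the window
print(intersects(straddling, COLORADO))  # True
```

Which test is appropriate depends on whether elements crossing the edge of the window should be included.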

You can also join two Layercake layers together. The query below uses a spatial join (st_within comes from DuckDB's spatial extension, which must be loaded) to find the ten most populous settlements in California by joining the settlements layer against the boundaries layer.

D select settlements.type, settlements.id, settlements.name, settlements.population
    from 'https://data.openstreetmap.us/layercake/settlements.parquet'
    join 'https://data.openstreetmap.us/layercake/boundaries.parquet'
    on st_within(settlements.geometry, boundaries.geometry)
    where boundaries.name[1] = 'California'
    order by settlements.population desc
    limit 10;
┌─────────┬────────────┬───────────────┬────────────┐
│  type   │     id     │     name      │ population │
│ varchar │   int64    │    varchar    │   uint64   │
├─────────┼────────────┼───────────────┼────────────┤
│ node    │ 1738808199 │ Los Angeles   │    3898747 │
│ node    │ 1824135555 │ San Diego     │    1386932 │
│ node    │ 1690212988 │ San Jose      │    1013240 │
│ node    │   26819236 │ San Francisco │     873965 │
│ node    │ 1956099531 │ Fresno        │     520052 │
│ node    │  150959789 │ Sacramento    │     490712 │
│ node    │ 6474240715 │ Long Beach    │     469450 │
│ node    │  150980683 │ Oakland       │     433031 │
│ node    │ 1979182884 │ Bakersfield   │     373640 │
│ node    │ 1837296118 │ Anaheim       │     350742 │
└─────────┴────────────┴───────────────┴────────────┘

Analytics queries work too. The example below finds all of the values of surface used on highways in OSM, sorted by how common they are.

$ duckdb

D from 'https://data.openstreetmap.us/layercake/highways.parquet'
  select surface, count(*) as count
  where type = 'way'
  group by surface
  order by count desc;
┌──────────────────────┬───────────┐
│       surface        │   count   │
│       varchar        │   int64   │
├──────────────────────┼───────────┤
│ NULL                 │ 179401445 │
│ asphalt              │  29916184 │
│ unpaved              │  12156065 │
│ paved                │   4095349 │
│ concrete             │   3923954 │
│ paving_stones        │   3771049 │
│ ground               │   3387599 │
│ gravel               │   2139921 │
│ dirt                 │   1688384 │
│ compacted            │   1192208 │
│ grass                │    851898 │
│ sett                 │    485161 │
│ fine_gravel          │    444866 │
│ sand                 │    295180 │
│ wood                 │    217383 │
│ concrete:plates      │    193862 │
│ earth                │    146823 │
│ cobblestone          │    139089 │
│ pebblestone          │    130414 │
│ metal                │     45100 │
│  ·                   │         · │
│  ·                   │         · │
│  ·                   │         · │
│ metl                 │         1 │
│ curved               │         1 │
│ Via de Joaquim Gomis │         1 │
│ 0                    │         1 │
│ earth_grass          │         1 │
│ unkno                │         1 │
│ driving_plates       │         1 │
│ 砕石舗装w            │         1 │
│ trawaw               │         1 │
│ azaq                 │         1 │
│ surface=asphalt      │         1 │
│ dirt/sand;paved      │         1 │
│ murrum               │         1 │
│ rubber car tires     │         1 │
│ آهنگ_۳               │         1 │
│ ground,_gravel,_sand │         1 │
│ ail                  │         1 │
│ bewachsener_boden    │         1 │
│ pu                   │         1 │
│ dirt4                │         1 │
├──────────────────────┴───────────┤
│ 5410 rows (40 shown)   2 columns │
└──────────────────────────────────┘

News and talks

The OpenStreetMap US 2025 Yearbook

Dec 19, 2025 · OpenStreetMap US Staff

The OpenStreetMap US team got up to a lot this year! The following superlatives celebrate some notable moments of 2025… Most Dependable: Layercake Layercake is one of the most recent tools to...

Mapping Features with Layercake

Apr 22, 2025 · Quincy Morgan

OpenStreetMap US Tech Lead Quincy Morgan introduces Layercake, a new cloud-native way to work with OSM data, at FedGeoDay 2025 in Washington, D.C.