Stories by Vincent Sarago on Medium

COG Talk 4ter — Distributed processes

Vincent Sarago — Mon, 17 Feb 2020 21:48:49 GMT

Last week in COG Talk 4 we discussed large scale processing using Cloud Optimized GeoTIFF and mosaicJSON. Here is another example of how we can use mosaicJSON and a dynamic tiler to create a high resolution, large scale mosaic.

Divide, Select and Conquer

With COG and dynamic tiling, you create tiles from the raw data at request time. Usually this lets you apply rescaling or color correction so the result looks better on a web map. With mosaicJSON and rio-tiler-mosaic, introduced in COG Talk 2, we extended the idea of dynamic tiling by adding a pixel selection operation. When you have multiple overlapping datasets, you can tell rio-tiler-mosaic which pixel you want to keep or what operation you want to perform on the stack of pixels.

Pixel selection methods applied on Landsat-8 NDVI values for all 2018 observations over Montreal area.

In this post we want to show how we can combine COGs, a dynamic tiler and mosaicJSON to create a large scale, good-looking, high resolution mosaic of Landsat 8 data.

1. Create mosaicJSON

For this demo, we are using our awspds-mosaic stack. The stack has a mosaic/create endpoint which accepts sat-api queries to create mosaicJSON of Landsat 8 data hosted on AWS PDS.

https://medium.com/media/cf94177aedff2f17567398a16a6ccfbe/href

In the above 👆 we:

define an area of interest (AOI), date range and cloud filter for the SAT API search endpoint (note: it should work the same with any other STAC API)
POST the query to the mosaic endpoint with optional parameters (season, tile format…)

The result of the request is a TileJSON-like object.

2. Create list of mercator tiles

We are going to distribute our processes using Web Mercator tiles at zoom 11 (512x512 px tiles at zoom 11 have the same resolution as 256x256 px tiles at zoom 12). For our area of interest, this represents 783 tiles.
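As a rough sketch, the tile list can be built in pure Python with the standard Web Mercator tile formulas. The bounding box below is an illustrative placeholder, not the AOI used in the notebook, and the helper names are made up for this example (the notebook may well rely on a library such as mercantile instead).

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) Web Mercator tile containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the east/south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west, south, east, north, zoom):
    """List every (x, y, zoom) tile intersecting a bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [
        (x, y, zoom)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


def resolution(zoom, tile_size):
    """Ground resolution in metres per pixel at the equator."""
    return 2 * math.pi * 6378137.0 / (tile_size * 2 ** zoom)


# Hypothetical bounding box (west, south, east, north), for illustration only.
bbox = (-5.2, 47.2, -1.0, 48.9)
tiles = tiles_for_bbox(*bbox, zoom=11)
print(len(tiles), "tiles to request")

# 512 px tiles at zoom 11 and 256 px tiles at zoom 12 share one resolution.
print(resolution(11, 512) == resolution(12, 256))
```

Each tile in the list is an independent unit of work, which is what makes the job easy to spread over many Lambda calls.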

https://medium.com/media/ea18a8e9c05551cea5613ef9d448abb8/href

3. Create tile URL

In this step we define the creation option for the tiles. The query_params will be added to the tile url obtained in 1.

https://medium.com/media/808bc7adb9803041d334426d501aa943/href

bands="4,3,2": We pass the RGB band combination as a comma-separated list. Here 4,3,2 corresponds to the Landsat 8 band combination for True Color.
color_ops="gamma RGB 3.5, saturation 1.7, sigmoidal RGB 15 0.35": The Landsat data is stored as UInt16, so to obtain a good-looking result we apply a rio-color formula.
pixel_selection="median": This is where the magic happens. The median pixel selection option means that for each tile, the dynamic tiler will return the median value for the whole stack of data.
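The options listed above end up as a query string appended to each tile URL. Here is a minimal sketch using only the standard library; the URL template and the tile coordinates are placeholders (in practice the template comes from the response obtained in step 1).

```python
from urllib.parse import urlencode

# Hypothetical tile URL template; use the one returned in step 1 instead.
tile_url_template = "https://example.com/tiles/{z}/{x}/{y}.png"

query_params = {
    "bands": "4,3,2",
    "color_ops": "gamma RGB 3.5, saturation 1.7, sigmoidal RGB 15 0.35",
    "pixel_selection": "median",
}


def build_tile_url(x, y, z):
    """Fill in the tile coordinates and append the URL-encoded options."""
    base = tile_url_template.format(x=x, y=y, z=z)
    return f"{base}?{urlencode(query_params)}"


print(build_tile_url(1010, 713, 11))
```

urlencode takes care of escaping the commas and spaces in the color formula, which is easy to get wrong when concatenating strings by hand.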

Note: In the _worker function you could likely add an inference step to apply ML on the output tile.

4. Distribute and collect

We now can call the tiler and get the resulting tiles (⚠ it can take up to 1 min). We also use some Rasterio code to merge the tiles into one raster file which we then translate to Cloud Optimized GeoTIFF.

https://medium.com/media/a230b3ccd2f6f5dd72090e2791cdba91/href

5. The result

High Resolution mosaic over Brittany using median pixel selection for all 2019’s spring and summer Landsat 8 scenes with less than 5% of cloud.

Doing this over Brittany is great, but can we scale it to a country size area ? 👇

High Resolution mosaic over France using median pixel selection for all summer Landsat 8 scenes (2013 → 2019) with less than 10% of cloud.

Check out the high resolution image here.

Doing this over an entire country takes a bit more time, however using AWS Lambda and our cogeo/mosaic-* tools it takes less than 10 minutes 😱. Here are some numbers to put this in context:

~7 min for the tile creation step (merging the tiles into one COG takes almost as long)
7802 zoom 11 tiles to fetch (== AWS Lambda calls)
522 different Landsat 8 scenes (x3 bands)
>76 GB of data fetched (522 x 3 x ~50 MB per file)
48128 x 42496 output raster size
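The numbers above are consistent with each other, which a few lines of arithmetic can confirm. The only assumption here is the 512 px tile size, taken from step 2.

```python
tile_size = 512                  # pixels per tile side at zoom 11 (step 2)
width, height = 48128, 42496     # output raster size from the list above

cols = width // tile_size        # 94 tile columns
rows = height // tile_size       # 83 tile rows
print(cols * rows)               # 7802 tiles, one Lambda call each

scenes, bands, mb_per_file = 522, 3, 50   # ~50 MB per band file
total_mb = scenes * bands * mb_per_file   # 78,300 MB
print(round(total_mb / 1024, 1))          # about 76.5 GB
```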

The whole notebook can be found here: https://github.com/developmentseed/awspds-mosaic/blob/master/notebooks/LargeScaleMosaic.ipynb

Got Data ?

We’re always looking for interesting problems to tackle using COGs, if you have a raster dataset and want to learn how COG, STAC or mosaicJSON could help, please feel free to ping me on Twitter or LinkedIn! And if you are interested in joining Development Seed to help us build technology that helps solve global challenges take a look at our open positions!

COG Talk 4ter — Distributed processes was originally published in Development Seed on Medium, where people are continuing the conversation by highlighting and responding to this story.

Facebook’s population dataset

Vincent Sarago — Fri, 14 Feb 2020 16:34:28 GMT

Earlier this week in COG Talk 4 we talked about doing large scale processing using Cloud Optimized GeoTIFF and mosaicJSON. Here is another example of how to create simple visualization tools when you store the data as Cloud Optimized GeoTIFF.

Data For Good: High Resolution Population Density Maps

Ref: https://registry.opendata.aws/dataforgood-fb-hrsl/
Format: .TIFF (Cloud Optimized GeoTIFF)

Dataset's coverage.

The dataset is formed of 294 Cloud Optimized GeoTIFFs, representing six different variables stored in separate files.

$ aws s3 ls dataforgood-fb-data/tif/month=2019-06/country=ZWE/ --recursive | grep ".tif$"

type=children_under_five/ZWE_children_under_five.tif
type=elderly_60_plus/ZWE_elderly_60_plus.tif
type=men/ZWE_men.tif
type=women/ZWE_women.tif
type=women_of_reproductive_age_15_49/ZWE_women_of_reproductive_age_15_49.tif
type=youth_15_24/ZWE_youth_15_24.tif

See it live: https://cogeo.xyz/projects/Facebook/index.html

We can visualize the mosaicJSON using the cogeo-mosaic-tiler stack.

COG → MosaicJSON

1. Download list of files

$ aws s3 cp s3://dataforgood-fb-data/index.txt .

2. Split the list by variable

# type=elderly_60_plus

$ cat index.txt| grep ".tif$" | grep "type=elderly_60_plus/" | awk '{print "s3://dataforgood-fb-data/"$NF}'  > list_elderly_60_plus.txt

# type=men

$ cat index.txt| grep ".tif$" | grep "type=men/" | awk '{print "s3://dataforgood-fb-data/"$NF}'  > list_men.txt

# type=women_of_reproductive_age_15_49

$ cat index.txt| grep ".tif$" | grep "type=women_of_reproductive_age_15_49/" | awk '{print "s3://dataforgood-fb-data/"$NF}'  > list_women_of_reproductive_age_15_49.txt

# type=women

$ cat index.txt| grep ".tif$" | grep "type=women/" | awk '{print "s3://dataforgood-fb-data/"$NF}'  > list_women.txt

# type=youth_15_24

$ cat index.txt| grep ".tif$" | grep "type=youth_15_24/" | awk '{print "s3://dataforgood-fb-data/"$NF}'  > list_youth_15_24.txt

# type=children_under_five

$ cat index.txt| grep ".tif$" | grep "type=children_under_five/" | awk '{print "s3://dataforgood-fb-data/"$NF}'  > list_children_under_five.txt
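The six commands above differ only by the variable name, so the same split can be sketched as one loop in Python. This is an illustrative equivalent, not part of the original workflow: the function name is made up, and the sample keys are modeled on the listing shown earlier rather than read from the real index.txt.

```python
VARIABLES = [
    "elderly_60_plus", "men", "women_of_reproductive_age_15_49",
    "women", "youth_15_24", "children_under_five",
]


def split_index(lines, bucket="dataforgood-fb-data"):
    """Group .tif keys by variable, keyed on the `type=<variable>/` path part."""
    lists = {name: [] for name in VARIABLES}
    for line in lines:
        # Like awk's $NF: keep the last whitespace-separated field.
        key = line.split()[-1] if line.strip() else ""
        if not key.endswith(".tif"):
            continue
        for name in VARIABLES:
            # The trailing slash keeps `women` from matching
            # `women_of_reproductive_age_15_49`.
            if f"type={name}/" in key:
                lists[name].append(f"s3://{bucket}/{key}")
    return lists


# Illustrative sample lines, not the real content of index.txt.
sample = [
    "tif/month=2019-06/country=ZWE/type=women/ZWE_women.tif",
    "tif/month=2019-06/country=ZWE/type=women_of_reproductive_age_15_49/"
    "ZWE_women_of_reproductive_age_15_49.tif",
    "tif/month=2019-06/country=ZWE/type=women/ZWE_women.tif.aux.xml",
]
for name, paths in split_index(sample).items():
    print(name, len(paths))
```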

3. Create MosaicJSON (using cogeo-mosaic).

$ cat list_elderly_60_plus.txt | cogeo-mosaic create - -o elderly_60_plus.json

$ cat list_men.txt | cogeo-mosaic create - -o men.json

$ cat list_women_of_reproductive_age_15_49.txt | cogeo-mosaic create - -o women_of_reproductive_age_15_49.json

$ cat list_women.txt | cogeo-mosaic create - -o women.json

$ cat list_youth_15_24.txt | cogeo-mosaic create - -o youth_15_24.json

$ cat list_children_under_five.txt | cogeo-mosaic create - -o children_under_five.json

4. Upload to cogeo.xyz

$ curl -X POST -d @elderly_60_plus.json https://mosaic.cogeo.xyz/add

$ curl -X POST -d @men.json https://mosaic.cogeo.xyz/add

$ curl -X POST -d @women_of_reproductive_age_15_49.json https://mosaic.cogeo.xyz/add

$ curl -X POST -d @women.json https://mosaic.cogeo.xyz/add

$ curl -X POST -d @youth_15_24.json https://mosaic.cogeo.xyz/add

$ curl -X POST -d @children_under_five.json https://mosaic.cogeo.xyz/add

COG Talk 4bis — Montreal LIDAR dataset

Vincent Sarago — Wed, 12 Feb 2020 16:58:41 GMT

Another example of large scale mosaicJSON

Earlier this week in COG Talk 4 we shared how to do large scale processing using Cloud Optimized GeoTIFF and mosaicJSON. Here is another example of how to create simple visualization tools when you store the data as Cloud Optimized GeoTIFF.

Montreal LIDAR dataset

Ref: http://donnees.ville.montreal.qc.ca/dataset/lidar-aerien-2015

Format: .LAZ (Point Cloud)

Coverage of the Montreal opendata LIDAR dataset.

The dataset is formed of 684 COGs created from .LAZ files using a modified version of our cogeo-watchbot-light stack. Each COG has a 25 cm pixel resolution and two bands (Min and Max, see PDAL docs).

See it live: https://cogeo.xyz/projects/MTLidar/index.html

We can visualize the mosaicJSON using cogeo-mosaic-tiler stack.

Using rio-tiler-mvt, introduced in COG Talk part 3, we can also create Vector Tiles from the COGs directly.

.LAZ → COG → MosaicJSON

Here are the steps we took to translate the .LAZ to COG and create the mosaicJSON:

1. Create list of files to translate

$ curl http://donnees.ville.montreal.qc.ca/dataset/9ae61fa2-c852-464b-af7f-82b169b970d7/resource/ec35760c-5cbe-44a0-8ad1-30c037174b0a/download/indexlidar2015.csv | tail -n +2 | cut -d"," -f3 > list_of_files.txt

2. Deploy pointcloud-to-cog (a serverless stack based on AWS Lambda to run rio-cogeo at scale).

$ git clone https://github.com/developmentseed/pointcloud-to-cog
$ cd pointcloud-to-cog
$ make build && sls deploy --stage production --bucket my-bucket --region us-east-1

3. Send jobs to queue.

$ pip install rio-cogeo rio-tiler cogeo-mosaic
$ cd scripts/
$ cat ~/list_of_files.txt | python -m create_jobs - \
    -p webp \
    --co blockxsize=256 \
    --co blockysize=256 \
    --op overview_level=6 \
    --op dtype=float32 \
    --op web_optimized=True \
    --prefix cogs/MTLLidar \
    --topic arn:aws:sns:us-east-1:{AWS_ACCOUNT_ID}:pdal-watchbot-production-WatchbotTopic

4. Create a MosaicJSON (using cogeo-mosaic).

$ aws s3 ls s3://my-bucket/cogs/MTLLidar/ | awk '{print "s3://my-bucket/cogs/MTLLidar/"$NF}' | cogeo-mosaic create - -o mosaic.json

5. Use cogeo.xyz to visualize the mosaic

$ curl -X POST -d @mosaic.json https://mosaic.cogeo.xyz/add | jq -r ".id"

> d4c05a130c8a336c..........2cbc5c34aed85feffdaafd01ef

open https://cogeo.xyz/mosaic.html?mosaicid=d4c05a130c8a336c..........2cbc5c34aed85feffdaafd01ef

COG Talk 4bis — Montreal LIDAR dataset was originally published in Development Seed on Medium, where people are continuing the conversation by highlighting and responding to this story.

COG Talk — Part 4: Enabling Spatio-temporal data processing at scale

Vincent Sarago — Mon, 10 Feb 2020 16:55:06 GMT

This blog is the fourth in a series called COG Talk, which looks at ways to use Cloud Optimized GeoTIFFs to efficiently render and analyze planetary data at massive scale.

After a refresh of what COGs are in Part 1, the introduction of mosaics in Part 2, and a fun experiment in Part 3, today we are going to see how COGs can be useful for large scale spatio-temporal datasets.

Introduction slide from a talk given at GéoMTL conference (slides).

Cloud Optimized GeoTIFFs (COGs)

First, the basics. As of today, the Cloud Optimized GeoTIFF specification can be summarized as a tiny list of requirements:

the data has to be tiled (internally split into chunks of regular size)
the file has a header with the location of each tile
the file can have internal overviews

Basically, you take a well known open format (created in the 80's), enforce good usage and internal architecture, and then have a binary file optimized for remote access. Because the header has a map to the internal tiles and the geographic information, libraries like GDAL can easily understand which tiles to fetch (using GET Range-Requests) for a given area of interest (AOI), minimizing data transfer and HTTP requests.
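To make the "map to the internal tiles" idea concrete, here is a toy sketch of what a reader does with the header: find which internal blocks cover a pixel window, then turn each block into a byte range. The offsets below are invented; a real reader gets them from the TIFF TileOffsets and TileByteCounts tags, and GDAL handles all of this for you.

```python
def blocks_for_window(col_off, row_off, width, height, block_size=256):
    """Return (block_col, block_row) indices covering a pixel window."""
    first_col = col_off // block_size
    first_row = row_off // block_size
    last_col = (col_off + width - 1) // block_size
    last_row = (row_off + height - 1) // block_size
    return [
        (bc, br)
        for br in range(first_row, last_row + 1)
        for bc in range(first_col, last_col + 1)
    ]


# Hypothetical header content for a 4x4 block image:
# (byte offset, byte length) of each internal tile.
tile_index = {
    (bc, br): (1000 + 500 * (br * 4 + bc), 500)
    for br in range(4)
    for bc in range(4)
}

for block in blocks_for_window(col_off=300, row_off=100, width=400, height=200):
    offset, length = tile_index[block]
    # Each entry maps to one HTTP header: Range: bytes=<start>-<end>
    print(block, f"bytes={offset}-{offset + length - 1}")
```

Only 4 of the 16 blocks are requested for this window, which is where the savings in data transfer come from.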

Like this cute raccoon, GDAL is able to take just what it needs from a COG and runs really fast.

Web map tiles are a common format for distributed processing (see chip-n-scale) or for simple raster dataset visualization. Because we can read parts of a COG in an optimized way and, if present, obtain a preview of the high resolution data from internal overviews, we can dynamically generate tiles from COGs at request time.

Web maps are often based on static raster tiles stored as jpeg or png. A full set of tiles is created for each zoom level and stored in a tree-based file structure that allows users to zoom and pan a map. This approach requires you to pre-generate a tile tree consisting of millions of files. With a COG you can use internal overviews to create multiple zooms and internal tiles to stand in for map tiles, and thus, only have one file to manage for a large area. We call this process "dynamic tiling" because we access the raw data (e.g. surface reflectance or elevation) and then apply algorithms on it before creating the tile to display in the browser.

How to create valid COGs

Common Geographic Information System (GIS) software like QGIS supports exporting raster data to COG natively, but if you want to do it programmatically you can use GDAL commands:

# First add internal overviews 
$ gdaladdo my-file.tif

# Then translate the geotiff to a COG (`TILED=YES`) and keep overviews 
$ gdal_translate -of GTiff -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=DEFLATE my-file.tif my-cog.tif

or use rio-cogeo (see COG Talk — Part 1)

$ rio cogeo my-file.tif my-cog.tif --cog-profile deflate

COGs everywhere

More organizations are storing their data as COGs (Landsat level 2, USGS DEM, MODIS on AWS), and while it’s not the most storage-efficient format (in comparison to JPEG2000), a COG is a more user-friendly format that enables fast and cheap access to the data (see this blog for a comparison).

Having access to more data creates another kind of problem (a good one): as more data becomes available, we need easier ways to access, process and share it at scale. Development Seed regularly advocates for open datasets, but what we love even more is to enable people to access and use the data. Take, for instance, the opening up of Landsat data. It was a big win for the open data community, but using the data through Earth Explorer posed several challenges. Libra was born out of the need for a better tool and process for using this data, and to this day it still has 3000 visitors per month 😱 !

One scene or a million!

The combination of distributed cloud services and an increase in datasets being stored as COGs means users are now pivoting from single scene workflows to large scale processing (e.g. state or country wide). To support this, we created the mosaic-json specification, an open standard for representing metadata about sets of Cloud-Optimized GeoTIFFs (see COG Talk - Part-2). With simplicity and performance in mind, the specification uses a simple quadkey based spatial index and allows overlapping scenes to create spatio-temporal mosaics.

In a dynamic tiling workflow, the mosaicJSON is a simple JSON file that acts as a proxy between Web Map tile requests (using Z-X-Y Slippy map tilenames) and the list of files intersecting with this tile.
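A minimal sketch of that proxy role: given a Z-X-Y tile request, walk up to the ancestor tile at the mosaic's minzoom, encode it as a quadkey, and read the file list stored under that key. The toy mosaic below uses the minzoom/maxzoom/tiles layout as I understand the specification; the file names and function names are made up.

```python
def quadkey(x, y, z):
    """Encode an XYZ tile as a quadkey string (one digit per zoom level)."""
    digits = []
    for i in range(z, 0, -1):
        mask = 1 << (i - 1)
        digit = (1 if x & mask else 0) + (2 if y & mask else 0)
        digits.append(str(digit))
    return "".join(digits)


def assets_for_tile(mosaic, x, y, z):
    """Return the files indexed under the tile's ancestor at `minzoom`."""
    minzoom = mosaic["minzoom"]
    if z < minzoom:
        raise ValueError("tile zoom is below the mosaic minzoom")
    shift = z - minzoom
    # Dropping the low bits of x and y walks up the tile pyramid.
    parent = quadkey(x >> shift, y >> shift, minzoom)
    return mosaic["tiles"].get(parent, [])


# Toy mosaic with made-up file names.
mosaic = {
    "minzoom": 4,
    "maxzoom": 12,
    "tiles": {"1222": ["s3://my-bucket/a.tif", "s3://my-bucket/b.tif"]},
}
print(assets_for_tile(mosaic, x=131, y=121, z=8))
```

Because the lookup is a single dictionary access, the cost of finding the files for a tile stays flat no matter how many scenes the mosaic holds.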

Spatio-temporal mosaic-json.

Real World example

ABoVE: Landsat-derived Annual Dominant Land Cover Across ABoVE Core Domain, 1984–2014

Ref: https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1691
Format: GeoTIFF + internal overview

ABoVE dataset footprint.

The ABoVE dataset is composed of 175 different GeoTIFFs at 30m resolution derived from Landsat (5 and 7) surface reflectance values. While it covers a pretty large area, the other interesting part of this dataset is the temporal aspect, because each file has 31 different bands, one for each year from 1984 to 2014.

The files are distributed as GeoTIFFs with internal overviews. Sadly, they are not aligned with the Cloud Optimized GeoTIFF specification because they are not internally tiled and the overviews are located at the end of the files.

https://medium.com/media/d87777b7b6fd5c3c1380b5f9a7d57404/href

GeoTIFF → COG → mosaicJSON

To create user interfaces (UI) and tools that are as responsive as possible, we need the files stored as proper Cloud Optimized GeoTIFFs.

Here are the steps we took to convert the files:

1. Download the whole dataset and upload the GeoTIFFs to S3.

# head over https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1691 and login
$ wget {url of the zip}
$ unzip Annual_Landcover_ABoVE_1691.zip
$ aws s3 sync . s3://my-bucket/raw/ABoVE/


2. Deploy cogeo-watchbot-light (a serverless stack based on AWS Lambda to run rio-cogeo at scale).

$ git clone https://github.com/developmentseed/cogeo-watchbot-light
$ cd cogeo-watchbot-light
$ sls deploy --stage production --bucket my-bucket --region us-east-1

3. Create COGs.

# create list of files to translate
$ aws s3 ls s3://my-bucket/raw/ABoVE/ | grep "Simplified" | awk '{print "s3://my-bucket/raw/ABoVE/"$NF}' > list_raw_files.txt

# Send jobs to the stack
$ pip install rio-cogeo rio-tiler
$ cd scripts/
$ cat list_raw_files.txt | python -m create_jobs - \
   -p webp \
   --co blockxsize=256 \
   --co blockysize=256 \
   --op overview_level=6 \
   --op overview_resampling=bilinear \
   --prefix cogs/ABoVE \
   --topic arn:aws:sns:us-east-1:{AWS_ACCOUNT_ID}:cogeo-watchbot-light-production-WatchbotTopic

4. Create a MosaicJSON (using cogeo-mosaic).

$ aws s3 ls s3://my-bucket/cogs/ABoVE/ | awk '{print "s3://my-bucket/cogs/ABoVE/"$NF}' | cogeo-mosaic create - -o mosaic.json

Explore

When the COGs and the mosaicJSON are formatted correctly we can use the cogeo-mosaic-tiler stack to create web map tiles dynamically and visualize the data in a web map.

See it live: https://cogeo.xyz/projects/ABoVE/index.html

The temporal side

With the introduction of the mosaicJSON specification (see COG Talk — Part 2), we released an open source python module rio-tiler-mosaic to handle the creation of tiles using multiple files. One important feature of this plugin is its ability to do pixel selection dynamically, meaning the user can choose how each pixel value is created (e.g. take the pixel value from the first image or from the last one). It also enables custom pixel selection methods:

"""Custom stddev pixel selection method."""

import numpy
from rio_tiler_mosaic.methods.base import MosaicMethodBase


class bidx_stddev(MosaicMethodBase):
    """Return bands stddev."""

    def __init__(self):
        """Overwrite base and init bands stddev method."""
        super(bidx_stddev, self).__init__()
        # Stop reading more assets once every pixel has a value.
        self.exit_when_filled = True

    def feed(self, tile):
        """Add data to tile."""
        # Collapse the band axis: one stddev value per pixel.
        tile = numpy.ma.std(tile, axis=0, keepdims=True)
        if self.tile is None:
            self.tile = tile

        # Fill only the pixels that are still masked in the current result.
        pidex = self.tile.mask & ~tile.mask
        mask = numpy.where(pidex, tile.mask, self.tile.mask)
        self.tile = numpy.ma.where(pidex, tile, self.tile)
        self.tile.mask = mask

The pixel selection method above was specifically designed for the ABoVE dataset. On each tile request the dynamic tiler using this method will return the standard deviation value for the stack of bands, enabling us to find where the land cover classification values have changed over the 31 year span.
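To see why the standard deviation works as a change detector, consider two made-up pixels, each with 31 yearly class codes. The sketch uses the standard library instead of numpy, and the class codes are invented for illustration.

```python
from statistics import pstdev

# Made-up land cover class codes for two pixels over the 31 yearly bands.
stable_pixel = [5] * 31                  # same class every year
changed_pixel = [5] * 20 + [11] * 11     # class switches after 20 years

for name, values in [("stable", stable_pixel), ("changed", changed_pixel)]:
    # Population standard deviation, like numpy's default (ddof=0).
    print(name, round(pstdev(values), 2))
```

A value of zero means the class never changed. Since class codes are categorical, a non-zero value is best read as "something changed here" rather than as a measure of how much.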

See it live: https://cogeo.xyz/projects/ABoVE/stddev.html

Standard deviation values for the full stack of 31 values (years).

❤️ STAC + COG

Development Seed is advancing the adoption of the new STAC metadata specification (check out our latest sat-api-pg project). Standardized and well-formatted metadata is an important step toward the democratization of remotely sensed data, and it also helps us create nice visualization tools.

Introducing landsatlive.live

By combining both STAC and mosaicJSONs we built a small demo called landsatlive.live (shoutout to the great landsat.live by our friends at Mapbox). This demo lets you visualize mosaics created dynamically using sat-api (our STAC search api) for the Landsat 8 dataset hosted on AWS.

A mosaic of Landsat 8 scenes. Web map tiles created dynamically using sat-api + mosaic-json + rio-tiler-mosaic.

COG for the best

Cloud Optimized GeoTIFF is one of the go-to data formats for organizations looking to store their data in the cloud. This approach lowers the barrier to accessing data. We’ve put together an open and user-friendly suite of tools that combines STAC, mosaicJSON, and the cogeo-* tools, allowing anyone to access and analyze planetary data at scale.

For more information on how COGs can solve many of your data-related problems feel free to ping me on Twitter or LinkedIn! If you are interested in joining Development Seed to help us build further and faster take a look at our open positions!

COG Talk — Part 4: Enabling Spatio-temporal data processing at scale was originally published in Development Seed on Medium, where people are continuing the conversation by highlighting and responding to this story.

COG Talk — Part 3: Translate COG to Mapbox Vector Tiles

Vincent Sarago — Mon, 24 Jun 2019 15:35:05 GMT

Today we’re releasing rio-tiler-mvt, a rio-tiler plugin to create Mapbox Vector Tiles from Cloud-Optimized GeoTIFFs (COGs). It enables better dynamic web map visualizations, especially for sparse datasets. This is the result of recent work where we needed to visualize LiDAR data in the browser. We experimented with creating our visualizations on the fly by generating vector data directly from the source COGs. While the initial approach felt clumsy at the time, we’ve since polished it up into a proper plugin with impressive performance.

This is the third post of our COG Talk series (check out the introduction in Part 1 and use of COG mosaics in Part 2).

Lidar dataset displayed as vector tiles (top) or raster (bottom). Data from Montreal Open Data.

Cloud Optimized GeoTIFF is an excellent format for storing remote sensing data because the file structure provides a convenient method for data access and visualization. When we want to access a smaller raster — either as an array for analysis or a PNG/JPEG for visualization — we can easily read just that portion of the data. Most tools stop at this point and return a raster value, which is exactly what we want in most cases. But with sparse datasets, this isn’t always the best approach.

The population dataset mentioned in our second post is a great example for showing how COG mosaics work on a technical level because we need to combine multiple large COGs into one seamless product. But when visualizing this as a raster, you’ll notice that the output isn’t ideal because the sparse data makes it difficult to see individual pixels.

High-resolution population density data from Facebook AI (link) displayed as raster tiles.

To solve this problem, we want more control over the rendering of the data. We would ideally have a simple way to query data from COGs but return data in another non-raster format. For point visualization, Mapbox Vector Tile (MVT) format seems to be a better fit and it also enables client-side rendering and user interaction (e.g. changing colors dynamically and clicking on the data).

🎉 rio-tiler-mvt 🎉

Today we’re releasing rio-tiler-mvt, a rio-tiler plugin to encode tile arrays as Mapbox Vector Tiles. We started this as an experiment to see how far we could push Vector Tiles encoding from COG tiles. With some help from Yohan Boniface, we updated the python-vtzero library which enables fast MVT encoding in python (wrapping Mapbox’s vtzero C++ library) and then created this small rio-tiler plugin to convert raster tile values to vector features on the fly.

https://medium.com/media/457e552d8abe0c3e291e970efde08a03/href

The resulting vector tiles make visualizing sparse data much easier and are surprisingly fast for dense datasets as well.

Lidar dataset stored as Cloud Optimized GeoTIFF and served as raster or vector tiles. Data from Montreal Open Data.

And it also works well with COG mosaics, like the Facebook population data from above (link).

COG mosaic from high-resolution population density data from Facebook AI (link) displayed as vector tiles + extrusion (3d rendering).

⚠️ Important notes:

This is an experiment and there is still some work to be done on python-vtzero (issues) to enable better data encoding.
A COG is still a COG. When creating tiles at lower zoom level than the raster’s native resolution, rio-tiler is fetching overviews (a downsampled version of the raw data) so the displayed vector value is not always equal to the raw value.
Does it work with LiDAR data? Yes and no. LiDAR datasets are usually very dense, meaning each tile (256x256 px) will create 65,536 point values, which might be too much for the web client to handle.
Looking forward to VT3. The next iteration of the Mapbox Vector Tiles specification (3) will add better 3D (X,Y, Z) data support (link).
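The density point above is easy to picture with a toy version of the raster-to-points step. This is not the rio-tiler-mvt API, just a plain Python sketch of the idea: every non-empty pixel becomes one feature, so a sparse tile yields a handful of points while a full 256x256 tile yields 65,536.

```python
def tile_to_points(tile, nodata=0):
    """Turn a 2D grid of values into point features, skipping nodata cells."""
    features = []
    for row, line in enumerate(tile):
        for col, value in enumerate(line):
            if value == nodata:
                continue
            features.append({"x": col, "y": row, "value": value})
    return features


sparse = [[0, 0, 7], [0, 0, 0], [3, 0, 0]]
dense = [[1] * 256 for _ in range(256)]

print(len(tile_to_points(sparse)))   # 2 features
print(len(tile_to_points(dense)))    # 65536 features
```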

Pushing to the limits

If you are not scared about mapbox-gl-js burning your laptop, you can try this demo in which we use Mapbox's awesome satellite base map and terrain data hosted on AWS PDS to create RGB + Elevation vector tiles and display it as extruded colored polygons (with x2.5 vertical exaggeration).

Mount Etna volcano, Italy.

Uluru Inselberg (also known as Ayers Rock), Australia.

Glen Canyon National Recreation Area, United States.

San Francisco area, United States.

Mount Taranaki, New Zealand.

Please feel free to ping me @_VincentS if you have questions or want to hear more about the work we are doing to make open data more accessible and easier to use.

COG Talk — Part 3: Translate COG to Mapbox Vector Tiles was originally published in Development Seed on Medium, where people are continuing the conversation by highlighting and responding to this story.

COG Talk — Part 2: Mosaics

Vincent Sarago — Thu, 23 May 2019 19:30:31 GMT

This blog is the second in a series called COG Talk, which looks at ways to use Cloud Optimized GeoTIFF, and why we use them.

The first post is a refresh on the COG format and announces the release of version 1.0.0 of rio-tiler and rio-cogeo. Here, we’ll see how we can use them to build mosaics for web maps.

Multiple high resolution Cloud Optimized GeoTIFF hosted on OpenAerialMap (link)

COG vs Map Tiles

Cloud Optimized GeoTIFF files, as the name implies, are specifically designed for easily accessing remote raster data. Because of the internal tiling and internal overviews, people often ask: can COGs replace map tiles? The usual response is: yes, but…

Cloud Optimized GeoTIFF can replace .mbtiles or statically generated map tiles by using a proxy to render tiles dynamically (e.g. lambda tiler). But when it comes to large datasets, the GeoTIFF files become too big and lose the advantage of fast remote reads.

There is no size limit for GeoTIFF and when working with a country or worldwide dataset, COGs can get quite large. Even using compression to create a reasonably sized file, it’s likely that the resulting GeoTIFF header — used to look up internal data tile locations — will be very large and will slow down the dynamic tiling processes.

If we instead decide to read a large collection of files and create a mosaic, we have a new set of issues: how can we decide which pixel to display given overlapping tiles? How can we update this pixel choice decision if we have daily updated data?

To help solve those problems, we are releasing rio-tiler-mosaic, a rio-tiler plugin allowing Mercator tile creation from multiple observations, and the associated mosaicJSON specification.

rio-tiler-mosaic

Creating a mosaic, in its simplest form, involves choosing pixels from multiple images to create a single image. To provide a dynamic map tile endpoint, we’ll need to repeat this process for each individual Web-Mercator tile. rio-tiler-mosaic provides two important methods to make this simple: pixel selection for merging the tile arrays together and smart multi-threading for quickly managing a large number of images.

Pixel selection

Creating map tiles using COGs means we are dynamically generating the image array in response to a tile request. When working with mosaics, it also means we need to choose which pixel to display when datasets overlap. For a given tile, we iterate over all intersecting input images to decide which pixel to choose. By default, rio-tiler-mosaic provides four different pixel selection rules:

First: take the pixel in the first matching image and return when the tile is full
Brightest: loop through all images and return the highest pixel value
Darkest: loop through all images and return the lowest pixel value
Last: take the pixel in the last matching image and return when the tile is full
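The four rules can be illustrated on a tiny stack of one-dimensional "tiles", where None marks a pixel with no data. This is a plain Python sketch of the idea, not the plugin's implementation (which works on masked numpy arrays), and it reads the Last rule as "the latest valid value in the list wins".

```python
def mosaic_pixels(stack, method="first"):
    """Merge a stack of equally sized 1D tiles; None marks a missing pixel."""
    merged = []
    for values in zip(*stack):
        valid = [v for v in values if v is not None]
        if not valid:
            merged.append(None)
        elif method == "first":
            merged.append(valid[0])
        elif method == "last":
            merged.append(valid[-1])
        elif method == "brightest":
            merged.append(max(valid))
        elif method == "darkest":
            merged.append(min(valid))
        else:
            raise ValueError(f"unknown method: {method}")
    return merged


# Three overlapping images, in the order they are listed in the mosaic.
stack = [
    [10, None, None, 40],
    [12, 25, None, 35],
    [None, 20, 30, 50],
]
for method in ("first", "last", "brightest", "darkest"):
    print(method, mosaic_pixels(stack, method))
```

Note that the First rule can stop as soon as no gap is left, while Brightest and Darkest have to look at every image.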

Pixel selection methods applied on Landsat-8 NDVI values for all 2018 observations over Montreal area.

Smart multi-threading

For each pixel selection method, we need to either process the whole stack of images or just the first few until the array is full. We are using multi-threading to fetch and read data in parallel to speed up the process. Because the First and Last pixel-selection methods should return as soon as the tile is totally filled, we implemented a partial multi-threading approach by processing chunks of assets in parallel instead of the full list (see code). This is particularly handy when the list of assets is long.
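The chunked approach can be sketched with the standard library's thread pool: read a few assets in parallel, merge them, and stop as soon as the tile is full. Everything here is a stand-in (the asset names, the pixel values and the read function are invented), so treat it as an outline of the idea rather than the plugin's code.

```python
from concurrent.futures import ThreadPoolExecutor


def read_tile(asset):
    """Stand-in for a remote COG read; returns pixel values or None gaps."""
    return FAKE_TILES[asset]


def first_mosaic(assets, chunk_size=2):
    """Fill a tile with the `first` rule, reading assets chunk by chunk."""
    merged, reads = None, 0
    with ThreadPoolExecutor(max_workers=chunk_size) as pool:
        for start in range(0, len(assets), chunk_size):
            chunk = assets[start:start + chunk_size]
            # map() runs the reads in parallel but yields them in input order.
            for tile in pool.map(read_tile, chunk):
                reads += 1
                if merged is None:
                    merged = list(tile)
                else:
                    merged = [m if m is not None else t
                              for m, t in zip(merged, tile)]
            if all(v is not None for v in merged):
                break  # tile is full: skip the remaining chunks
    return merged, reads


# Made-up asset names and pixel values.
FAKE_TILES = {
    "a.tif": [1, None, None],
    "b.tif": [2, 2, 2],
    "c.tif": [3, 3, 3],
    "d.tif": [4, 4, 4],
}
print(first_mosaic(list(FAKE_TILES)))
```

With four assets and a chunk size of two, only the first chunk is read: the second image fills the gaps left by the first, so the remaining two are never fetched.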

Usage

The mosaic tile handler is designed to roughly match rio-tiler’s tile handlers and returns tile and mask arrays. Here is an example of generating a mosaic tile.

https://medium.com/media/c70056e15c5718d4606b3721368d7164/href

We have published a first beta version of rio-tiler-mosaic on PyPI and the source code is available on GitHub.

mosaicJSON

In addition to rio-tiler-mosaic, today we are releasing a specification for representing Web Mercator mosaics constructed from multiple observations. The mosaicJSON specification can be seen as the equivalent of a GDAL VirtualRaster (VRT), but for indexing files by Web Mercator quadkeys. Quadkeys (a contraction of quadtree and key) are "one-dimensional strings" representing unique Z-X-Y (level-column-row) Web Mercator map tiles.

COG footprint and Quadkey index

The goal of the metadata specification is to provide a simple spatial index linking the COGs to the XYZ Web Mercator map tile to render. Note that in the rio-tiler-mosaic examples above, we assume that the list of input image assets for a given tile is already "known": the mosaicJSON file can provide that input.

The most important requirement is that each quadkey has a zoom level equal to the mosaicJSON minzoom. We use this to calculate the parent tile for a given input tile between the mosaic’s minzoom and maxzoom.

While this is still a work in progress, the main features are:

quadkey based file index
simple JSON format (enabling high ratio compression)

Specification

https://medium.com/media/6ea6f190acd4494393a867debbdb9a38/href

A complete example of a mosaic definition based on mosaicJSON specification can be found here.

Example of implementation

Here is a simple implementation of mosaic tiling using the specification. On each tiler call, our handler function (my_mosaic_handler) fetches the mosaic file and looks for the assets indexed by the Web Mercator parent tile (at minzoom level) for the input XYZ tile. We then use rio-tiler-mosaic to combine the different image assets into one tile.

https://medium.com/media/44d3be61c41cb35d40cf7bc7b32e8cfe/href

Creating a mosaicJSON definition

Now that we know how to use it, let’s create a new mosaicJSON definition. Let’s say we have 28 files (e.g. Facebook population density) covering almost the whole continent of Africa.

High-resolution population density maps COGs from Facebook AI (link)

Quadkey base zoom (or mosaic min-zoom):

To construct the spatial index we need to define a minimal set of quadkeys. By definition, all COGs intersect with the tile 0-0-0 (zoom 0) but we can do a little math to minimize the set further. Inspecting the input TIFs, we see that the native resolution is around 0.00027 degrees with 8 levels of overviews. The native resolution corresponds to Web Mercator zoom level 12 so we can guess that we should start tiling at minzoom = 4 (12 - 8 = 4). We'll use this as the base zoom for quadkey keys.
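The "little math" can be sketched as follows. This is our own back-of-the-envelope version: it assumes 256x256 px tiles, converts degrees to metres at the equator, and rounds down to the zoom whose pixels are at least as coarse as the native resolution (the post does not state which rounding rule was used).

```python
import math

EQUATOR_M_PER_DEGREE = 111_319.49  # ~ Earth's equatorial circumference / 360
ZOOM0_RESOLUTION_M = 156_543.03    # metres per pixel at zoom 0 (256 px tiles)


def zoom_for_resolution(resolution_deg: float) -> int:
    """Return the Web Mercator zoom matching a pixel size given in degrees."""
    resolution_m = resolution_deg * EQUATOR_M_PER_DEGREE
    # Resolution halves at each zoom level: res(z) = res(0) / 2**z
    return math.floor(math.log2(ZOOM0_RESOLUTION_M / resolution_m))


native_zoom = zoom_for_resolution(0.00027)
base_zoom = native_zoom - 8  # 8 overview levels, each halving the resolution
print(native_zoom, base_zoom)  # 12 4
```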

# Create Footprint

$ parallel -j4 rio bounds ::: $(aws s3 ls opendata.remotepixel.ca/facebook/ --recursive | grep ".tif" | awk '{print "s3://opendata.remotepixel.ca/"$NF}') | jq -c '.features[0]' | fio collect > facebook.geojson

# Find our zoom 4 quadkeys (there are 13)

$ cat facebook.geojson | supermercado burn 4 | mercantile quadkey | paste -s -d"," -

0330,0331,1220,1221,0333,1222,1223,3000,3001,3010,3002,3003,3012

Zoom 4 Mercator tiles intersecting with the facebook population dataset.

The mosaic definition is then built by finding the COGs intersecting with each of the 13 zoom 4 tiles:

https://medium.com/media/2d2d68d35182d3a12f81fa186727f5c8/href
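For readers who cannot open the embedded snippet, here is a simplified sketch of the same idea. It is not the code from the gist: it compares bounding boxes instead of the exact footprint geometries, and the file names and footprints are made up for the example.

```python
import math


def quadkey_to_tile(quadkey):
    """Decode a quadkey into (x, y, z); each digit adds one zoom level."""
    x = y = 0
    for char in quadkey:
        digit = int(char)
        x = (x << 1) | (digit & 1)   # low bit -> column
        y = (y << 1) | (digit >> 1)  # high bit -> row
    return x, y, len(quadkey)


def tile_bounds(x, y, z):
    """Return (west, south, east, north) of a Web Mercator tile, in degrees."""
    n = 2 ** z

    def lat(row):
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    return x / n * 360.0 - 180.0, lat(y + 1), (x + 1) / n * 360.0 - 180.0, lat(y)


def intersects(a, b):
    """True if two (west, south, east, north) boxes overlap."""
    return a[0] < b[2] and a[2] > b[0] and a[1] < b[3] and a[3] > b[1]


def build_index(quadkeys, footprints):
    """Map each quadkey to the files whose bounding box overlaps its tile."""
    index = {}
    for quadkey in quadkeys:
        bounds = tile_bounds(*quadkey_to_tile(quadkey))
        hits = [name for name, bbox in footprints.items() if intersects(bounds, bbox)]
        if hits:
            index[quadkey] = hits
    return index


# Hypothetical footprints (west, south, east, north), for illustration only.
footprints = {
    "egypt.tif": (25.0, 22.0, 35.0, 31.0),
    "senegal.tif": (-17.0, 14.0, -11.0, 17.0),
}
index = build_index(["0333", "1221", "3012"], footprints)
print(index)
```

A bounding-box test can return a few false positives for irregular footprints, so a real implementation would probably test against the footprint polygons.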

cogeo-mosaic: a CLI and a Serverless stack to create and use mosaicJSON

https://github.com/developmentseed/cogeo-mosaic

Wrapping up this new specification and the new rio-tiler-mosaic plugin we are also releasing cogeo-mosaic, a Serverless stack (based on AWS Lambda) to create and use the mosaicJSON specification. You can also install cogeo-mosaic locally and use the built-in CLI to create mosaicJSON from a set of files.

$ pip install git+https://github.com/developmentseed/cogeo-mosaic

$ cogeo-mosaic create my_list_of_files.txt -o mosaic.json

Is mosaicJSON appropriate for all mosaic map tile problems?

Obviously not, but we have been testing this solution on different projects and we find it simple and fast in most cases. That said, here are the pros and cons:

Pros

Fewer files to manage: In the case of the Facebook dataset, we ended up with 29 files stored in the cloud instead of ~600 000 if we had created all the Mercator tile image files.

# Get number of tiles from zoom 4 to zoom 12 using https://github.com/mapbox/supermercado/pull/26

$ cat facebook.geojson | supermercado burn 4..12 | sort | uniq | wc -l

  613 456

Flexibility: When we create the tile, we have the ability to decide which pixel should have priority and be rendered on top.
mosaicJSON files can be relatively small (especially if using compression) and can be cached to increase performance.

Cons

Because we need a dynamic tiler, creating a tile on the fly will always be slightly slower than having the tile already ready to be served to the client.
Tile creation can take several seconds when using darkest/brightest methods (because it has to read all the COGs).
Only supports the Web Mercator projection (as for rio-tiler).
Generating mosaicJSON files can be slow for large areas with many images.
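To illustrate why the darkest/brightest methods cost more than a simple "first valid pixel" approach, here is a toy sketch. It is not the rio-tiler-mosaic implementation: tiles are plain lists of pixel values, None stands for nodata, and the function names are ours.

```python
def mosaic_first(tiles):
    """Keep the first valid value per pixel; stop as soon as the tile is full."""
    result, reads = None, 0
    for tile in tiles:
        reads += 1
        if result is None:
            result = list(tile)
        else:
            result = [r if r is not None else t for r, t in zip(result, tile)]
        if all(value is not None for value in result):
            break  # nothing left to fill, the remaining assets are never read
    return result, reads


def mosaic_brightest(tiles):
    """Keep the highest value per pixel; every asset has to be read."""
    result, reads = None, 0
    for tile in tiles:
        reads += 1
        if result is None:
            result = list(tile)
        else:
            result = [
                t if r is None else r if t is None else max(r, t)
                for r, t in zip(result, tile)
            ]
    return result, reads


stack = [[1, None, 3], [5, 2, 1], [9, 9, 9]]
print(mosaic_first(stack))      # ([1, 2, 3], 2)
print(mosaic_brightest(stack))  # ([9, 9, 9], 3)
```

With the "first" strategy the third asset is never touched, while the "brightest" strategy cannot know the answer before it has seen the whole stack.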

Community

Both the mosaicJSON specification and rio-tiler-mosaic are published on GitHub and we welcome any feedback and/or contributors. Please feel free to ping me on Twitter @_VincentS_ if you have questions or want to hear more about the work we are doing to make open data more open and easier to use.

Demo

We built a simple demo page where you can explore some mosaicJSON examples and even share your own: https://bl.ocks.org/vincentsarago/raw/815884188c243b636ab8d927d8942a4d/

Hundreds of mercator tiles created dynamically based on mosaicJSON definition.

COG Talk — Part 2: Mosaics was originally published in Development Seed on Medium, where people are continuing the conversation by highlighting and responding to this story.

COG Talk — Part 1: What’s new?

Vincent Sarago — Fri, 03 May 2019 18:28:56 GMT

COG Talk — Part 1: What’s new?

This blog is the first in a series called COG Talk, which looks at ways to use Cloud Optimized GeoTIFF, and why we use them.

remotepixel-tiler uses rio-tiler to dynamically create Web Map tiles from Landsat-8 data hosted on AWS.

For more than a year, we’ve been working on building out a suite of tools to make Cloud Optimized GeoTIFFs (COGs) easy to work with. Today we are excited to announce we are releasing version 1 of rio-tiler and rio-cogeo 🎂!

Both modules are:

well tested
actively maintained
compatible with Python 2 and Python 3
easy to install (thanks to rasterio wheels)

COGs — The Basics

Let’s start with a quick refresher on the COG specification:

COGs are powerful because of how the data is structured internally. If done properly, the data can be accessed via HTTP range requests, meaning you can read only a small portion of a file instead of downloading the whole thing. This matters because the size of an individual block of data within the image can be small and easy to download with a simple GET request. To enforce this, COGs bigger than 1024 pixels by 1024 pixels have to be internally tiled.

The metadata header has a specific structure (by construction) and holds the Image File Directory (IFD) of each data block (internal tile). The IFD is critical to a COG, because it holds information (TileOffsets and TileByteCounts) about each internal tile. This means that by fetching only the first few bytes of the data we can then construct an internal map of the data.
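As a small illustration of how those two values are used: once a reader knows the offset and byte count of an internal tile, fetching it is a single HTTP GET with a Range header. The helper below is a sketch with made-up numbers, not part of any library.

```python
def tile_byte_range(tile_offset: int, tile_byte_count: int) -> dict:
    """Build the HTTP header needed to fetch one internal tile of a COG."""
    first_byte = tile_offset
    last_byte = tile_offset + tile_byte_count - 1  # HTTP ranges are inclusive
    return {"Range": f"bytes={first_byte}-{last_byte}"}


# Hypothetical TileOffsets / TileByteCounts entry for one internal tile.
print(tile_byte_range(1024, 512))  # {'Range': 'bytes=1024-1535'}
```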

The other (optional) feature is the overview. By adding internal overviews (reduced resolution versions of the raw data), we can now preview the data using fewer range requests.
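To give an idea of what overviews look like, the sketch below lists the size of each overview for a given image, assuming the common power-of-two decimation factors (2, 4, 8, …) and sizes rounded up. The image dimensions are arbitrary.

```python
def overview_shapes(width: int, height: int, levels: int):
    """Return the (width, height) of each overview for power-of-two decimation."""
    shapes = []
    for level in range(1, levels + 1):
        factor = 2 ** level
        # Ceiling division, so a partial last pixel is kept.
        shapes.append((-(-width // factor), -(-height // factor)))
    return shapes


print(overview_shapes(10000, 8000, 3))  # [(5000, 4000), (2500, 2000), (1250, 1000)]
```

Each level holds a quarter of the pixels of the previous one, which is why a zoomed-out preview needs far fewer range requests than reading the full-resolution data.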

Refs: https://github.com/cogeotiff/cog-spec/blob/master/spec.md

Rio-cogeo

$ pip install rio-cogeo~=1.0

While Cloud Optimized GeoTIFFs are beginning to see wider use, the creation of such files can still be a tricky process and when we started working on rio-cogeo there wasn’t an easy standalone solution. The goal was to build a simple yet powerful CLI to create and validate COGs.

COG creation

BEFORE

# Add overviews 
$ gdaladdo in.tif

# Enforce internal tiling, add compression and re-organize internal structures 
$ gdal_translate in.tif cog.tif -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=DEFLATE

NOW - with rio-cogeo

$ rio cogeo create in.tif cog.tif

rio-cogeo does the exact same thing as the GDAL commands (creating overviews, tiling and compressing) but it also provides seven different profiles to help the user choose the best configuration for their needs. Each profile can be extended using the --co options.

Web Optimized COG (WOG?!)

One important feature we found valuable to add is the --web-optimized option, which enables the creation of a web-tiling friendly COG. It aligns the internal tiles with the Web Mercator grid and makes the overview levels match the standard slippy map zoom levels. This is similar in concept to mbtiles, with the advantage of allowing fast remote partial reads.

Interpretation of the specification

While rio-cogeo respects the COG specifications, by default this plugin enforces features like:

Internal overviews (User can remove overviews with option --overview-level 0)
512x512 px internal tiles (can be overwritten with --co options)

Example

rio-cogeo has a nice CLI but it can also be used directly inside your own scripts. Check out sentinel-2-cog to see how you could convert the whole Sentinel-2 Catalog for $90K (link)

COG validation

The other feature we wanted to add was a validation option. Until now, people have had to rely on downloading the standalone script validate_cloud_optimized_geotiff.py or using Radiant Earth’s hosted version. Now you can easily validate with a single command:

$ rio cogeo validate cog.tif

Rio-tiler

$ pip install rio-tiler~=1.2

Before creating rio-cogeo we started working on rio-tiler, a library to improve the ability to visualize COGs available on AWS Public Datasets. rio-tiler is generally used as part of a web map server to dynamically generate a map tile from an underlying COG source file (rather than generating them beforehand). Initially the library was built for specific satellites (Landsat-8, then Sentinel-2 and CBERS-4), but it can now be used with any COGs.

Get Mercator tile from a cloud hosted file

https://medium.com/media/b90d29e4b8ca16e812ed45a5a8894e30/href

Features in rio-tiler~=1.0

Rasterio 1.0
Support for Landsat-8, CBERS-4 and Sentinel-2* AWS Public dataset
Better image encoding using GDAL (previously done using Pillow)
Colormap for output tile image (rio-tiler can apply pre-defined or custom colormap on the output tile image)
Expression support for band ratios (e.g. request expr=((b1-b2)/(b1+b2)))
Statistical functions (get min/max/histogram)

(see Changelog)

*Sentinel-2 data on AWS is stored as JPEG2000 in a requester-pays bucket. Users will need to cover the cost of each tile request (link).
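The band-ratio expression listed above, (b1-b2)/(b1+b2), is a normalized difference (NDVI is the best-known example). Here is a plain-Python sketch of what it computes per pixel. This is only an illustration of the formula, not how rio-tiler evaluates expressions, and returning NaN when both bands are zero is our own choice.

```python
import math


def normalized_difference(b1, b2):
    """Compute (b1 - b2) / (b1 + b2) for each pair of pixel values."""
    result = []
    for v1, v2 in zip(b1, b2):
        total = v1 + v2
        # Avoid a division by zero when both bands are empty.
        result.append((v1 - v2) / total if total != 0 else math.nan)
    return result


print(normalized_difference([0.5, 0.2, 0.0], [0.1, 0.2, 0.0]))
```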

rio-tiler plays a major role in most of our dynamic tiler related projects. With the release of 1.0, we have a solid baseline to move forward with other features, so stay tuned and subscribe to the rio-tiler repo to follow our progress.

Community

Thanks to our friends at Mapbox, we agreed to move rio-tiler and rio-cogeo to a new organization: cogeotiff. We’re partnering with Chris Holmes to build out an ecosystem of open source tools around COGs and we welcome anyone who’s interested in contributing to reach out on Twitter or comment on our repos.

Do you really want people using your data ?

Vincent Sarago — Mon, 19 Nov 2018 20:29:02 GMT

Do you really want people using your data ?

Note: This post was originally called: `The Ultimate data format`

In this post we will focus on Cloud Optimized GeoTIFF and other formats used by public datasets (AWS PDS, DigitalGlobe Open Data, …). This post is mostly a brain dump of some thoughts and knowledge I have wanted to share since remotepixel's huge AWS bill last August. I hope it will give some clues, or at least some ideas, to people who want to open or share raster datasets.

First, can you guess the difference between both images 👇

Both are the same file: on the left is the raw data from the DigitalGlobe Open Data Program, and on the right is the same file converted to a COG using rio-cogeo.

PS: I ❤️ DigitalGlobe, and the goal of this introduction is not to blame them for the format. We can’t blame them for giving us free data 😃, especially for disaster response (DigitalGlobe Open Data Program).

Well, the files are almost the same, except that the COG has internal overviews and internal tiling. The biggest difference is the storage size: 1.5 GB vs 69 MB 😱

So how do you produce a file that is 22x lighter? The answer is compression! I won't go too deep into compression itself, but you should check out this awesome article by Koko Alberti: https://kokoalberti.com/articles/geotiff-compression-optimization-guide/.

For the file above we used WEBP compression, which has just been added to GDAL's libtiff by Norman Barker and Even Rouault in #704. "WebP is a modern image format that provides superior lossless and lossy compression for images on the web" (source), developed by Google. This compression scheme claims to be better than both JPEG (lossy) and PNG (lossless):

WebP lossless images are 26% smaller in size compared to PNGs. WebP lossy images are 25–34% smaller than comparable JPEG images at equivalent SSIM quality index.

The WEBP format is supported by most browsers (except Safari) and image software… and now inside GeoTIFF 🎉 (supported in QGIS if built against GDAL 2.4.0 or HEAD).

Can you spot the difference 👇 ?

WebP vs Raw (it's a GIF)

👆 Look closely: this is a GIF that alternates between the raw and WEBP versions. It's really hard to spot, but WEBP compression (with default parameters) introduces artifacts, which should be acceptable at least for visualisation.

Alright, enough with WebP. (Note: JPEG compression would have saved a lot of space too.)

AWS Public Dataset: PDS

Let's look at the formats used by three major AWS Public Datasets: CBERS-4, Landsat-8 and Sentinel-2.

Note: Most of the following numbers come from https://github.com/vincentsarago/awspds-benchmark

CBERS-4, Landsat-8 and Sentinel-2 data formats.

Each of these three datasets has its own 👍/👎 (e.g. Landsat and CBERS are both GeoTIFF, but Landsat uses external overviews). The biggest difference is Sentinel-2, which uses JPEG2000 compression.

Format matters

For years I've repeatedly heard that users shouldn't need to download the data but should instead use or create services to access it in the cloud. While this is a good idea (who wants to store gigabytes of data on their own laptop?), the data format has a huge impact on processing and access costs, which can result in a bill of thousands of dollars.

RemotePixel use case

If you are reading this post, it might be because you also know my side project RemotePixel.ca, and maybe you remember my last post:

Remote Pixel on Twitter

Dear Friends, https://t.co/aqc4TNnJoF is finally back online. Here is what happened https://t.co/r4vzl5P3Gy

Let's see how my August AWS bill relates to the Sentinel-2 data format. The bill was mostly due to GET/LIST requests, which are billed to the requesting AWS user since the sentinel-2-l1c bucket is in `requester-pays` mode.

Remotepixel's AWS cost in August 2018.

The LIST requests (2 600 $) were due to the remotepixel simple search API, which now seems not so much simple as totally dumb.

The other part of the bill was due to GET requests (nearly 1 billion GET requests 😱).

I believe most of the Sentinel-2 data requests came from the Remotepixel viewer, which was at the time a really simple AWS PDS viewer (now only Landsat and CBERS-4 data are available on the viewer). So basically, users were able to visualize Sentinel-2 data using a tile server based on AWS Lambda. The idea behind the tile server is 1 tile = 1 Lambda call, but when checking the number of AWS Lambda calls there was something odd. There were only 1 million calls … responsible for 1 billion GET requests 🤔. How is this possible? Let's check how many GET requests GDAL makes when reading a file over the internet 👇

AWS PDS benchmark: https://github.com/vincentsarago/awspds-benchmark

We have our answer 🎉 😱 😢: getting a mercator tile from Sentinel-2 data needs more than 100 HTTP calls (GET) per band… (😱 again), while for Landsat it's around 5 calls.

A better data format ?

Well, again, there is no ultimate data format, but let's see how those three PDS would behave if we translated them to proper COGs (512x512 internal tiling, internal overviews, high-level Deflate compression) using rio-cogeo.

Fewer HTTP calls and less data transfer 🎉 (the Landsat and CBERS datasets are also lighter).

Size / computing time / access cost

At the end of the day, people mention size as the key factor when choosing a data format. This is (I think) why we have a Sentinel-2 archive in JPEG2000 format, and when I see my August AWS bill, this makes me sad. JPEG2000 is not a cloud-friendly format: even with the most advanced driver (KDU) you need to transfer roughly 1.6 times more data (1.3 MB vs 800 KB) and make almost 25 times more GET requests (74 vs 3) to do partial reads over the internet. But yes, the JPEG2000 file weighs only 95 MB while the proper COG version is around 180 MB.

What about processing time ?

COGs are made to be accessed partially over the internet, so you don't need to download the whole file (just get what you need). Basically, you download less data, so your process is faster.

On the other hand, JPEG2000 files are lighter, so you can download and process the whole file… Fortunately we now have OpenJPEG (a free and open source JPEG2000 driver shipped with GDAL by default), which is performant enough to extract the data locally, so the processing time should be acceptable, but again you'll need to download the whole file.

If you choose to read the JPEG2000 over the internet (as we saw earlier), this will result in a lot of GET calls and a lot of useless data transfer.

$ facts

AWS S3 pricing: https://aws.amazon.com/s3/pricing/

Based on ☝️let's write a scenario of a web viewer using AWS Lambda.

JPEG2000

Size: 25 TB
Storage: 25 TB * 1000 * 0.023 = 575 $ / month
1M tile requests / month
Data access: (1M * 110 (GET requests) / 1000) * 0.004 = 440 $ *
Processing time (1536 MB AWS Lambda): (3 seconds * 1M * 1536 / 1000) * 0.00001667 $ = 76.81 $ **

*Using the Kakadu driver you might reduce this by half (~60 GET requests), but you have to pay a couple thousand $ to get the license.

**AWS Lambda costs 0.00001667 $ per GB-second | assuming 3 sec per tile is quite optimistic

COST: 575 + 440 + 76.81 = 1091.81 $ (440 + 76.81 = 516.81 $ for processing)

COG (deflate)

Size: 50 TB
Storage: 50 TB * 1000 * 0.023 = 1150 $ / month
Data access: (1M * 5 (GET requests) / 1000) * 0.004 = 20 $
Processing time (1536 MB AWS Lambda): (1 second * 1M * 1536 / 1000) * 0.00001667 $ = 25.60 $ *

*Reading a tile from a COG is at least 3 times faster than for JPEG2000

COST: 1150 + 20 + 25.60 = 1195.60 $ (20 + 25.60 = 45.60 $ for processing)
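If you want to play with these assumptions, the two scenarios can be written as one small function. This is just the arithmetic above turned into code: the unit prices are the ones used in this post (not necessarily current AWS pricing), and the function and variable names are ours.

```python
# Unit prices as assumed in this post (USD).
S3_STORAGE_PER_GB_MONTH = 0.023
S3_GET_PER_1000_REQUESTS = 0.004
LAMBDA_PER_GB_SECOND = 0.00001667


def monthly_cost(size_tb, gets_per_tile, seconds_per_tile,
                 tile_requests=1_000_000, lambda_memory_mb=1536):
    """Return the monthly storage, access and compute cost of a tile service."""
    storage = size_tb * 1000 * S3_STORAGE_PER_GB_MONTH
    access = tile_requests * gets_per_tile / 1000 * S3_GET_PER_1000_REQUESTS
    compute = (seconds_per_tile * tile_requests * lambda_memory_mb / 1000
               * LAMBDA_PER_GB_SECOND)
    return {
        "storage": storage,
        "access": access,
        "compute": compute,
        "total": storage + access + compute,
    }


jpeg2000 = monthly_cost(size_tb=25, gets_per_tile=110, seconds_per_tile=3)
cog = monthly_cost(size_tb=50, gets_per_tile=5, seconds_per_tile=1)
print(jpeg2000)
print(cog)
```

The totals agree with the figures above to within a cent. Changing tile_requests shows how quickly the access and compute part dominates for JPEG2000 as traffic grows, while storage stays fixed.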

Those numbers are based on assumptions, but I believe they are close to what happens in the real world with JPEG2000 and COG. Basically, if you only care about storage cost, JPEG2000 is your best option, but in the end someone will have to pay $$$ to access and process the data. I believe that if you store the data and provide services around it, COG should be the better long-term solution.

The Ultimate data format ?

As we saw in the intro, image formats (compression) can have a huge impact on data accessibility and thus usage (it is easier to download a 70 MB file than a 1.5 GB one).

Short answer to the question: there is no such thing as an Ultimate data format; in the real world there are plenty of good data formats. At the end of the day, it depends on what you want the user to do.

Here are the questions you should answer before choosing a format.

Do you want users to visualise the data online ?
Do you want users to download the data to run processes ?
Do you want users to create services on the cloud ?
Do you care about compression artefacts ?
What is your data type (Byte|Float|Int) ?
Do you provide processing services ?

Unsolicited 2 cents of advice:

Use WEBP compression for RGB or RGBA datasets (there is a lossless option). This is the best option if you are looking to save space, but sadly it is only supported from GDAL 2.4.0. JPEG compression might be a safer choice.
Use Deflate compression with the PREDICTOR=2 and ZLEVEL=9 options for non-Byte or non-RGB datasets.
Always use internal overviews.
Use a 256 or 512 internal block size (256 for Deflate and 512 for WEBP/JPEG compressed datasets ?)
Prioritize an internal bitmask over a nodata value. And maybe give $ to someone to fix the small `bug` in GDAL which puts the bitmask at the end of COGs.

More reads:

2018 SatSummit workshop about COGs (link)
What’s wrong with open infrastructure for Remote Sensing geodata? by @vrielink (link)
http://www.cogeo.org
COG specification
GDAL GeoTIFF format options (link)
https://blog.hexagongeospatial.com/jpeg2000-quirks/
