library(rstac)
This tutorial will use the open-source package rstac
to
search data in Planetary Computer’s SpatioTemporal Asset Catalog (STAC)
service. STAC services can be accessed through STAC API endpoints, which
allow users to search datasets using various parameters such as space
and time. In addition to demonstrating the use of rstac
,
the tutorial will explain the Common Query Language (CQL2) filter
extension to narrow the search results and find datasets that meet
specific criteria in the STAC API.
This tutorial is based on reading STAC API data in Python.
To access Planetary Computer STAC API, we’ll create a
rstac
query.
planetary_computer <- stac("https://planetarycomputer.microsoft.com/api/stac/v1")
planetary_computer
#> ###RSTACQuery
#> - url: https://planetarycomputer.microsoft.com/api/stac/v1
#> - params:
#> - field(s): version, base_url, endpoint, params, verb, encode
CQL2 expressions can be constructed using properties that refer to
attributes of items. A list of all properties supported by a collection
can be obtained by accessing the
/collections/<collection_id>/queryables
endpoint.
Filter expressions can use properties listed in this endpoint.
In this example, we will search for Landsat
Collection 2 Level-2 imagery of the Microsoft main campus from
December 2020. The name of this collection in STAC service is
landsat-c2-l2
. Here we’ll prepare a query to retrieve its
queriables and make a GET
request to the service.
planetary_computer |>
collections("landsat-c2-l2") |>
queryables() |>
get_request()
#> ###Queryables
#> - properties
#> - id
#> - gsd
#> - created
#> - sci:doi
#> - datetime
#> - geometry
#> - platform
#> - proj:epsg
#> - instrument
#> - proj:shape
#> - field(s): $id, type, title, $schema, properties
Now we can use rstac
to make a search query with CQL2
filter extension to obtain the items.
time_range <- cql2_interval("2020-12-01", "2020-12-31")
bbox <- c(-122.2751, 47.5469, -121.9613, 47.7458)
area_of_interest = cql2_bbox_as_geojson(bbox)
stac_items <- planetary_computer |>
ext_filter(
collection == "landsat-c2-l2" &&
t_intersects(datetime, {{time_range}}) &&
s_intersects(geometry, {{area_of_interest}})
) |>
post_request()
In that example, our filter expression used a temporal
(t_intersects
) and a spatial (s_intersects
)
operators. t_intersects()
only accepts interval as it
second argument, which we created using function
cql2_interval()
. s_intersects()
spatial
operator only accepts GeoJSON objects as its arguments. This is why we
had to convert the bounding box vector (bbox
) into a
structure representing a GeoJSON object using the function
cql2_bbox_as_geojson()
. We embrace the arguments using
{{
to evaluate them before make the request.
items
is a STACItemCollection
object
containing 8 items that matched our search criteria.
stac_items
#> ###STACItemCollection
#> - features (8 item(s)):
#> - LC08_L2SP_046027_20201229_02_T2
#> - LE07_L2SP_047027_20201228_02_T1
#> - LE07_L2SP_046027_20201221_02_T2
#> - LC08_L2SP_047027_20201220_02_T2
#> - LC08_L2SP_046027_20201213_02_T2
#> - LE07_L2SP_047027_20201212_02_T1
#> - LE07_L2SP_046027_20201205_02_T1
#> - LC08_L2SP_047027_20201204_02_T1
#> - assets:
#> ang, atmos_opacity, atran, blue, cdist, cloud_qa, coastal, drad, emis, emsd, green, lwir, lwir11, mtl.json, mtl.txt, mtl.xml, nir08, qa, qa_aerosol, qa_pixel, qa_radsat, red, rendered_preview, swir16, swir22, tilejson, trad, urad
#> - item's fields:
#> assets, bbox, collection, geometry, id, links, properties, stac_extensions, stac_version, type
A STACItemCollection
is a regular GeoJSON object. It is
a collection of STACItem
entries that stores metadata on
assets. Users can convert a STACItemCollection
to a
sf
object containing the properties field as columns. Here
we depict the items footprint.
sf <- items_as_sf(stac_items)
# create a function to plot a map
plot_map <- function(x) {
library(tmap)
library(leaflet)
current.mode <- tmap_mode("view")
tm_basemap(providers[["Stamen.Watercolor"]]) +
tm_shape(x) +
tm_borders()
}
plot_map(sf)
#> tmap mode set to interactive viewing
Some collections use the eo
extension, which allows us
to sort items by attributes like cloud coverage. The next example
selects the item with lowest cloud_cover attribute:
cloud_cover <- stac_items |>
items_reap(field = c("properties", "eo:cloud_cover"))
selected_item <- stac_items$features[[which.min(cloud_cover)]]
We use function items_reap()
to extract cloud cover
values from all features.
Each STAC item have an assets
field which describes
files and provides link to access them.
items_assets(selected_item)
#> [1] "ang" "atran" "blue" "cdist"
#> [5] "coastal" "drad" "emis" "emsd"
#> [9] "green" "lwir11" "mtl.json" "mtl.txt"
#> [13] "mtl.xml" "nir08" "qa" "qa_aerosol"
#> [17] "qa_pixel" "qa_radsat" "red" "rendered_preview"
#> [21] "swir16" "swir22" "tilejson" "trad"
#> [25] "urad"
purrr::map_dfr(items_assets(selected_item), function(key) {
tibble::tibble(asset = key, description = selected_item$assets[[key]]$title)
})
#> # A tibble: 25 × 2
#> asset description
#> <chr> <chr>
#> 1 ang Angle Coefficients File
#> 2 atran Atmospheric Transmittance Band
#> 3 blue Blue Band
#> 4 cdist Cloud Distance Band
#> 5 coastal Coastal/Aerosol Band
#> 6 drad Downwelled Radiance Band
#> 7 emis Emissivity Band
#> 8 emsd Emissivity Standard Deviation Band
#> 9 green Green Band
#> 10 lwir11 Surface Temperature Band
#> # ℹ 15 more rows
Here, we’ll inspect the rendered_preview
asset. To plot
this asset, we can use the helper function preview_plot()
and provide a URL to be plotted. We use the function
assets_url()
to get the URL. This function extracts all
available URLs in items.
selected_item$assets[["rendered_preview"]]$href
#> [1] "https://planetarycomputer.microsoft.com/api/data/v1/item/preview.png?collection=landsat-c2-l2&item=LC08_L2SP_047027_20201204_02_T1&assets=red&assets=green&assets=blue&color_formula=gamma+RGB+2.7%2C+saturation+1.5%2C+sigmoidal+RGB+15+0.55&format=png"
selected_item |>
assets_url(asset_names = "rendered_preview") |>
preview_plot()