Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download LiDAR tiles from LidarBC #99

Open
ateucher opened this issue Nov 16, 2021 · 36 comments
Open

Download LiDAR tiles from LidarBC #99

ateucher opened this issue Nov 16, 2021 · 36 comments

Comments

@ateucher
Copy link
Collaborator

https://www2.gov.bc.ca/gov/content/data/geographic-data-services/lidarbc

We can probably parse this file: https://nrs.objectstore.gov.bc.ca/gdwuts/

@ateucher ateucher changed the title Download LiDAR tiles from LiDarBC Download LiDAR tiles from LidarBC Nov 16, 2021
@ateucher
Copy link
Collaborator Author

There are three types of products:

  1. Raw Point Cloud: .laz files
  2. Digital Elevation Model (DEM): 1m and 10m as .tif files
  3. Digital Surface Model (DSM): .laz files

@bevingtona
Copy link
Collaborator

bevingtona commented Nov 17, 2021

A first step to parse the files and plot things up

Outstanding issues:

  • The geotifs have a CRS but the LAZ files do not, LAZ are in UTM, so need to know the tile to be able to know the correct CRS (EDIT: UTM Zone is in the LAZ filename...)
  • Need to make some search functions using the tiles
  • Would be nice to be able to query the available tiles (like a tile index sf object)
library(tidyverse)
library(RCurl)
#> 
#> Attaching package: 'RCurl'
#> The following object is masked from 'package:tidyr':
#> 
#>     complete
library(httr)
library(XML)
library(lidR)
#> Loading required package: raster
#> Loading required package: sp
#> 
#> Attaching package: 'raster'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> The following object is masked from 'package:tidyr':
#> 
#>     extract
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1

# XML Location 

url <- "https://nrs.objectstore.gov.bc.ca/gdwuts/"

# Parse XML 

my_xml <- XML::xmlToList(XML::xmlParse(RCurl::getURL(url)))

# Format XML 

my_xml_list <- my_xml %>% 
  unlist(recursive = FALSE) %>% 
  enframe() %>% 
  filter(name == "Contents.Key") %>% 
  mutate(value = unlist(value)) 

# File list 
tif_list <- my_xml_list[str_detect(my_xml_list$value, ".tif"),]
laz_list <- my_xml_list[str_detect(my_xml_list$value, ".laz"),]

#### TIF ####

file <- tif_list[200,2]$value
fileurl <- paste0(url, file)
local_file <- paste0(gsub("/","_",file))
download.file(url = fileurl, destfile = local_file, mode="wb")
tif <- stars::read_stars(local_file) 
tif %>% plot()
#> downsample set to c(2,2)

#### LAZ ####

file <- laz_list[200,2]$value
fileurl <- paste0(url, file)
local_file <- paste0(gsub("/","_",file))
download.file(url = fileurl, destfile = local_file, mode="wb")

las = lidR::readLAS(local_file)
projection(las) <- sf::st_crs(tif)
dtm <- grid_terrain(las, res = 1, knnidw(k=10,p=2))
#> Warning: There were 13 degenerated ground points. Some X Y Z coordinates were
#> repeated. They were removed.
#> Warning: There were 5 degenerated ground points. Some X Y coordinates were
#> repeated but with different Z coordinates. min Z were retained.
plot(dtm)

@ateucher
Copy link
Collaborator Author

Amazing @bevingtona!

@bevingtona
Copy link
Collaborator

Hmm XML is only reading the first 1000 lines... somthing wrong in:

my_xml <- XML::xmlToList(XML::xmlParse(RCurl::getURL(url)))

image

@ateucher
Copy link
Collaborator Author

ateucher commented Nov 18, 2021

These are stored in an S3 bucket, so can use the {aws.s3} package:

library(aws.s3)
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.8.1, GDAL 3.2.1, PROJ 7.2.1
#> Registered S3 methods overwritten by 'stars':
#>   method             from
#>   st_bbox.SpatRaster sf  
#>   st_crs.SpatRaster  sf

bucket_092bdem <- get_bucket("gdwuts", 
                   prefix = "092/092b/2019/dem", 
                   base_url = "nrs.objectstore.gov.bc.ca", 
                   region = "")

bucket_df <- as.data.frame(bucket_092bdem)

file <- save_object(grep("092b053", bucket_df$Key, value = TRUE), 
                   bucket = "gdwuts", 
                   base_url = "nrs.objectstore.gov.bc.ca", 
                   region = "")

plot(read_stars(file))
#> downsample set to c(27)

Created on 2021-11-18 by the reprex package (v2.0.1)

@bevingtona
Copy link
Collaborator

Very nice @ateucher !
Do you know of a way to query the available files via aws.s3?

@ateucher
Copy link
Collaborator Author

As far as I can see there is get_bucket where you can list objects starting with a prefix... The S3 documentation is here: https://docs.aws.amazon.com/s3/index.html, and it looks like {aws.s3} wraps most of that functionality.

It looks like there are many tens of thousands of files in that bucket, so listing them all isn't really feasible - I think searching/prefixing is the right way to go

@boshek
Copy link
Collaborator

boshek commented Nov 19, 2021

Feels like we can approach this the same way as the CDED data (with the exception of local storage) by just giving a function some spatial data and then returning (using prefixes and maptiles) to return the lidar data. @bevingtona would you use a vrt to similarly stitch together a few lidar files?

@bevingtona
Copy link
Collaborator

@boshek
For the existing .tif files available, yes the .vrt approach would work well

But the thing about lidar is that there are a great many data products possible from the .las file

So I suppose we need to decide what bcmaps wants to "offer"

  1. Download and mosaic geotiffs for an AOI
  2. Download las/laz files
  3. Process las/laz files into DTM, DSM, CHM, hillshade, etc.?

@bevingtona
Copy link
Collaborator

Is this just leading to a new package? bclidar? where one can plot the index files and download the tif and or the las and then even have some vignettes for processing using the lidR and rlas packages?

@boshek
Copy link
Collaborator

boshek commented Nov 19, 2021

+1

@ateucher
Copy link
Collaborator Author

I think so too

@bevingtona
Copy link
Collaborator

@ateucher @boshek let me know when it's done ;)

@bevingtona
Copy link
Collaborator

jk, should we close the issue and start fresh in a new repo? I think cded can stay in bcmaps.. it plays well as a "wall to wall" dataset for BC..

@ateucher
Copy link
Collaborator Author

You can transfer an issue from one repo to another as well, that might be helpful

@stephhazlitt
Copy link
Member

+1 for {bclidar}

@bevingtona
Copy link
Collaborator

@boshek
Copy link
Collaborator

boshek commented Nov 19, 2021

Totally! This package can definitely has bcdata as a dependency making retrieval pretty robust to backend changes (🤞 )

@bevingtona
Copy link
Collaborator

Some more experimentation

library(esri2sf)
library(mapview)
library(bcmaps)
library(dplyr)
library(sf)

# AOI
aoi <- bcmaps::bc_cities() %>% filter(NAME == "Vancouver") %>% 
  st_buffer(5000) %>% 
  st_transform(4326) # NEEDS TO BE EPSG:4326 

mapview(aoi)

# LIST OF DATASETS
lidar_extent     <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/0"
lidar_dsm_02500k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/1"
lidar_dsm_20000k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/2"
lidar_pointcloud <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/3"
lidar_dem_02500k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/4"
lidar_dem_20000k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/5"

# QUERY DATASET
tiles <- esri2sf(lidar_dem_20000k, bbox = aoi %>% st_bbox())
#> Layer Type: Feature Layer
#> Geometry Type: esriGeometryPolygon
#> Service Coordinate Reference System: PROJCS["PCS_Albers",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Albers"],PARAMETER["False_Easting",1000000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-126.0],PARAMETER["Standard_Parallel_1",50.0],PARAMETER["Standard_Parallel_2",58.5],PARAMETER["Latitude_Of_Origin",45.0],UNIT["Meter",1.0]]
#> Output Coordinate Reference System: 4326

mapview(aoi) + mapview(tiles)

# DOWNLOAD
fol <- tempdir()

for(i in 1:nrow(tiles)){
  print(i)
  tile <- tiles[i,]
  file <- tile$filename
  url <- tile$s3Url
  httr::GET(url, 
            httr::write_disk(paste(fol,file,sep="/"), 
                             overwrite=TRUE))}
#> [1] 1
#> [1] 2

l <- list.files(fol, pattern = "bc_", full.names = T)

sf::gdal_utils(util = "buildvrt", 
               source = l, 
               destination = "my.vrt")

stars::read_stars("my.vrt", proxy = T) %>% 
  mapview(maxpixels = 10000) + mapview(tiles)
#> Number of pixels is above 10000.Only about 10000 pixels will be shown.
#> You can increase the value of `maxpixels` to 323986185 to avoid this.

Created on 2021-11-26 by the reprex package (v2.0.0)

@smnorris
Copy link

smnorris commented Dec 2, 2021

Searching for a list of available lidar tiles brought me to this thread before I could find the index.

+1 for {bclidar} as well. Would love to see a command line interface to it as well if possible.

@mcoghill
Copy link

mcoghill commented Nov 1, 2022

Just wondering what the status of this functionality is - is this going to be written into the bcmaps package, or is bclidar being actively developed? I tried searching for that package name on Google and GitHub but being outside of government I wasn't able to find anything. Thanks all!

@bevingtona
Copy link
Collaborator

No progress from me! Just the prototype above.. anyone else made any strides? @ateucher @stephhazlitt @boshek @HunterGleason @SashaNasonova

@ateucher
Copy link
Collaborator Author

ateucher commented Nov 1, 2022

I'm afraid not over here either...

@smnorris
Copy link

smnorris commented Nov 1, 2022

It is an open issue here too. smnorris/bcdata#107 - has not been a priority so far.

@bevingtona
Copy link
Collaborator

Not sure a 'bulk download' function is the best approach. Still ends up being quite a bit of processing locally. Wouldn't the better approach be to host the data as Cloud Optimized GeoTIFF and as Cloud Optimized Point Clouds?

@ateucher
Copy link
Collaborator Author

ateucher commented Nov 1, 2022

I'm sure you're right @bevingtona, unfortunately the hosting question is out of scope for bcmaps!

@smnorris
Copy link

smnorris commented Nov 1, 2022

I think the DEM TIFFs should be valid COGs already? I wasn't involved with the final publication/conversion but evaluated possible formats / compression and we pushed for COG, and I think with DEFLATE.

@smnorris
Copy link

smnorris commented Nov 1, 2022

$ rio cogeo validate bc_092b091_xl1m_utm10_2019.tif
The following warnings were found:
- The file is greater than 512xH or 512xW, it is recommended to include internal overviews

/Users/snorris/Downloads/bc_092b091_xl1m_utm10_2019.tif is a valid cloud optimized GeoTIFF

@bevingtona
Copy link
Collaborator

@ateucher i guess this might be out of scope for bcmaps, but I think if the storage is solved, then bcmaps or bcdata would have associated functions to access the data. So in my mind, not sure where else to bring this conversation.

@smnorris good point! So either a SpatioTemporal Asset Catalog (STAC) of the COG tiles is needed, or stitch the lidar tiles into mosaics per project and serve them as COGs, then the user can query the bounding box etc in the HTTP request... or simply add the COG path into QGIS for example.

Not sure the best path forward.. but cool that these are COGs! I didn't realize!

image

@mcoghill
Copy link

mcoghill commented Nov 9, 2022

Thanks for the discussion! I know this isn't being prioritized, but I'll throw it out there and say that there is interest in a simple function that can accomplish downloading available LiDAR of a defined area in a similar manner to the cded functions when downloading TRIM elevation. Even if it starts out with @bevingtona's prototype of bulk downloading tiles, as a user I would be okay with that for now, and then perhaps later on changing the function to accomplish what you hope for using STAC's (that is beyond my understanding but hopefully it's not too difficult for you to figure out!).

Thanks all for the wonderful work you all do :)

@bevingtona
Copy link
Collaborator

Here is an example of what this could look like in bcmaps.
I am not the maintainer of the package, and not sure if this should be it's own bclidar package.
And I don't have a great grasp of package architecture... so that is why I'm not making a pull request...

Here is what the final product could look like (functions are below this example):

aoi <- bcmaps::bc_cities() %>% 
  filter(NAME == "Vancouver") %>% 
  st_buffer(5000)

# Show how many tiles of each layer are in our AOI
bclidar_search_aoi(aoi = aoi)
#>       layer format n_tiles
#> 1    extent    shp       1
#> 2  dsm_2500    laz      58
#> 3 dsm_10000    laz       0
#> 4  laz_2500    laz      58
#> 5  dem_2500    tif       0
#> 6 dem_20000    tif       2

# Get an `sf` object of the tiles in your AOI for a specific layer
my_tiles <- bclidar_tif_get_tiles(aoi = aoi, layer = "dem_20000")
#> [1] "FOUND 2 TILES"

# Download and plot! 
r <- bclidar_tif_get_data(tiles = my_tiles, 
                     out_dir = tempdir(), 
                     overwrite = F, 
                     read_r = T, 
                     read_r_format = "stars")
#> [1] "Downloading 1 of 2"
#> [1] "Downloading 2 of 2"
#> [1] "Downloads complete"
#> [1] "Process complete"

plot(r)
#> downsample set to 30

Created on 2022-11-09 by the reprex package (v2.0.0)

Here are the functions that are used above:

library(esri2sf)
library(sf)
library(mapview)
library(dplyr)
library(httr)
library(stars)
library(terra)
bclidar_get_layers <- function(){
  data.frame(
    name = c("extent",
             "dsm_2500",
             "dsm_10000",
             "laz_2500",
             "dem_2500",
             "dem_20000"),
    format = c("shp",
               "laz",
               "laz",
               "laz",
               "tif",
               "tif"),
    url = c("https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/0",
            "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/1",
            "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/2",
            "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/3",
            "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/4",
            "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/5"))}
bclidar_search_aoi <- function(aoi){
  
  bclidar_layers <- bclidar_get_layers()
  
  do.call(bind_rows, lapply(1:nrow(bclidar_layers), function(i){
    name <- bclidar_layers[i,"name"]
    format <- bclidar_layers[i,"format"]
    url <- bclidar_layers[i,"url"]
    tiles <- esri2sf(url, bbox = st_bbox(aoi), )
    tilesize = nrow(tiles)
    data.frame(layer=name, 
               format=format, 
               n_tiles=tilesize)
    }))}
bclidar_tif_get_tiles <- function(aoi = aoi, 
                                layer = "dem_20000"){

  bclidar_layers <- bclidar_get_layers()
  
  url <- bclidar_layers[bclidar_layers$name == layer,"url"]
  
  tiles <- esri2sf(url, bbox = st_bbox(aoi))
    
  if(nrow(tiles)==0){
    print("NO TILES")}else{
    print(paste("FOUND",nrow(tiles),"TILES"))
      tiles}
}
bclidar_tif_get_data <- function(tiles = my_tiles, 
                                 out_dir = tempdir(),
                                 overwrite = F,
                                 read_r = T,
                                 read_r_format = "terra"){
  
  if(is.null(tiles[1,]$filename)){
    stop("No filename in layer")}
  
  if(length(grep("tif",tiles[1,]$filename))==0){
    stop("This tile does not contain a tif.")}
  
  lapply(1:nrow(tiles), function(i){
    
    print(paste("Downloading", i, "of", nrow(tiles)))
    tile <- tiles[i,]
    file <- tile$filename
    url <- tile$s3Url
    
    if(length(list.files(out_dir, pattern = file))==1&overwrite==F){
      print("File exists, overwrite set to FALSE, skipping to next")}else{
      httr::GET(url, 
                httr::write_disk(paste(out_dir,file,sep="/"), 
                                 overwrite=overwrite))
      if(i == nrow(tiles)){print("Downloads complete")}
        }
    
    })
  
  
  if(read_r == T){  
  vrt_name=paste0(out_dir,"/bclidar_download_",gsub(" ","_",gsub("-|:","",Sys.time())),".vrt")
    sf::gdal_utils(util = "buildvrt", 
                   source = paste0(out_dir,"/",tiles$file), 
                   destination = vrt_name)
    
    if(read_r_format == "stars"){
      out_img <- stars::read_stars(vrt_name, proxy = T)}
    
    if(read_r_format == "terra"){
      out_img <- terra::rast(vrt_name)}
    
   print("Process complete") 
   out_img
  }
  }
  

@stephhazlitt
Copy link
Member

@bevingtona That looks amazing. I would be happy to push the skeleton of an R package to a new bcgov repo (bclidar) to support you getting going with the above code, assuming this would be another bcgov R package. I think you or @ateucher would need to open the bare repo and assign me as a collaborator though. You could be the creator and maintainer and decide at a later date how the package matures (e.g. destined for CRAN etc).

@stephhazlitt
Copy link
Member

@bevingtona any plan re: advancing this in a stand-alone {bclidar} package? I would be happy to contribute to get things set up, and you can plunk your script in for a great start to a package.

@bevingtona
Copy link
Collaborator

@stephhazlitt Thanks for the reminder! I haven't made any progress on this.. The functions above work for batch downloading, but don't fully exploit the "cloud optimized geotiff" functionality.. We can certainly make the above functions into a package and see where it goes.

@stephhazlitt
Copy link
Member

stephhazlitt commented Apr 18, 2023

I think the fact that there is a Lidar Open Data portal (distinct from the BC Data Catalogue) is perhaps another rationale for a standalone R package: https://governmentofbc.maps.arcgis.com/apps/MapSeries/index.html?appid=d06b37979b0c4709b7fcf2a1ed458e03.

And see this news re: bcgov "investing more than $38 million in a new program over the next six years to collect light distance and ranging (LiDAR) elevation data" https://news.gov.bc.ca/releases/2023WLRS0010-000512

@smnorris
Copy link

That program is great news.

I still haven't worked with a STAC - but do wonder if existing libraries would be good enough if the data were to be indexed as a STAC. Poking GeoBC (presuming they continue to do the publishing) about that might be worth it before building something from scratch.

https://rdrr.io/cran/rstac/man/rstac.html
https://r-spatial.org/r/2021/04/23/cloud-based-cubes.html
https://pystac-client.readthedocs.io/en/stable/quickstart.html#cli

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants