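Preface

Earlier posts showed how to update a Layer during ETL so that the ETL run can complete on large data volumes, and the previous two posts introduced COG and how to read COG data in GeoTrellis. This post describes how to update a Layer when running ETL in COG mode.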

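1. Implementation

1.1 Principle

Updating a Layer in COG mode simply combines the two approaches above. The only difference is what gets merged: a regular ETL update merges tiles with the same key within a Layer, whereas a COG ETL update merges GeoTiff files with the same key within a Layer. Once this is clear, the implementation is straightforward.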
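To make "GeoTiff files with the same key" concrete, here is a minimal sketch in plain Scala with no GeoTrellis dependency. It assumes a power-of-two pyramid, where each COG file is keyed by the tile at the lowest zoom of its zoom range. The names `Key`, `cogKey` and `groupByCog` are hypothetical and exist only for this illustration.

```scala
object CogKeySketch {
  // Hypothetical stand-in for a spatial key (column, row).
  final case class Key(col: Int, row: Int)

  // In a power-of-two pyramid, a tile at maxZoom lies inside the tile at
  // minZoom obtained by dropping (maxZoom - minZoom) low-order bits.
  def cogKey(tileKey: Key, minZoom: Int, maxZoom: Int): Key = {
    val shift = maxZoom - minZoom
    Key(tileKey.col >> shift, tileKey.row >> shift)
  }

  // Group max-zoom tile keys by the COG file they would end up in.
  def groupByCog(keys: Seq[Key], minZoom: Int, maxZoom: Int): Map[Key, Seq[Key]] =
    keys.groupBy(cogKey(_, minZoom, maxZoom))
}
```

Two imports that touch neighbouring areas can map to the same COG key even when none of their individual tiles overlap, which is why the update has to merge whole files rather than single tiles.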

1.2 Approach

The previous post showed how to write COG data. The write itself is performed by the last line of code:

writer.writeCOGLayer(layerName, cogLayer, keyIndexes)

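Here writer is a FileCOGLayerWriter instance or another COGLayerWriter instance, layerName is the layer to write to, and cogLayer is the data to write.

So in theory the approach is to first check whether the Layer exists: if it does, update it; otherwise call writeCOGLayer as above.

In fact, writeCOGLayer already handles this step for us; we only need to pass in a merge method for GeoTiffs. The type of merge is (GeoTiff[V], GeoTiff[V]) => GeoTiff[V], where V is Tile or MultibandTile. It simply defines how to combine two GeoTiffs into one.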
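The "write, or merge if something is already there" behaviour can be modelled in a few lines of plain Scala. This is only a sketch of the idea, not the GeoTrellis writer: `Store`, `write` and `mergeFunc` are hypothetical names, and an in-memory map stands in for the file system.

```scala
import scala.collection.mutable

object WriteOrMergeSketch {
  // Hypothetical in-memory "file system": path -> file content.
  final class Store[V] {
    private val files = mutable.Map.empty[String, V]

    def get(path: String): Option[V] = files.get(path)

    // Without a merge function, or when nothing exists at `path`,
    // the incoming value is written as-is. Otherwise the incoming
    // value is first combined with the stored one.
    def write(path: String, incoming: V, mergeFunc: Option[(V, V) => V] = None): Unit = {
      val result = (mergeFunc, files.get(path)) match {
        case (Some(merge), Some(old)) => merge(incoming, old)
        case _                        => incoming
      }
      files.update(path, result)
    }
  }
}
```

The caller never checks for existence itself; supplying the merge function is what turns a plain write into an update.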

Writing such a merge method is simple:

def merge(v1: GeoTiff[V], v2: GeoTiff[V]) = {
    val tile: V = v2.tile merge v1.tile
    val extent = v2.extent combine v1.extent
    val crs = v2.crs
    GeoTiffBuilder[V].makeGeoTiff(
      tile, extent, crs, Tags(Map(), Nil), GeoTiffOptions.DEFAULT
    )
}
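The cell-level effect of `v2.tile merge v1.tile` can be pictured with a toy model. As far as I understand GeoTrellis tile merging, cells of the receiving tile that hold data are kept and only its NoData cells are filled from the other tile. The sketch below assumes exactly that; `Raster` and this `merge` are hypothetical stand-ins, with `None` playing the role of NoData.

```scala
object CellMergeSketch {
  // Hypothetical toy raster: None marks a NoData cell.
  type Raster = Vector[Option[Int]]

  // Keep every cell of `self` that has data; fill its gaps from `other`.
  def merge(self: Raster, other: Raster): Raster = {
    require(self.length == other.length, "rasters must have equal dimensions")
    self.zip(other).map { case (a, b) => a.orElse(b) }
  }
}
```

Under this assumption, the order of the operands decides which dataset wins where both have data: the tile on the left of `merge` takes precedence.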

Just pass this method in, and the following branch inside writeCOGLayer performs the update automatically:

case Some(merge) if uriExists(path) =>
    val old = GeoTiffReader[V].read(path, decompress = false, streaming = true)
    val merged = merge(cog, old)
    merged.write(path, true)
    // collect VRT metadata
    (0 until merged.bandCount)
      .map { b =>
        val idx = Index.encode(keyIndex.toIndex(key), maxWidth)
        (idx.toLong, vrt.simpleSource(s"$idx.$Extension", b + 1, merged.cols, merged.rows, merged.extent))
      }
      .foreach(samplesAccumulator.add)

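This matches our analysis: the branch merges the two tiffs into one and writes it out.

1.3 Result

Compile and run the COG import twice. The two datasets appear to be stitched together perfectly, but zooming in further reveals a problem: at some zoom levels the tiles along the seam have vanished. What causes this, and why are tiles lost only at some zoom levels?

A calm analysis shows that the problem must lie in the merge method we wrote ourselves: the merged Tiff file is not a valid COG. Because it is not a COG, it cannot be read at some zoom levels, so no data is returned there.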
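Why only some zoom levels are affected can be seen in a simplified model. It assumes that one COG file serves a range of zoom levels, with the base image answering the highest zoom and one overview for each lower zoom. `CogFile`, `readableZooms` and `canRead` are hypothetical names, not GeoTrellis API.

```scala
object OverviewSketch {
  // Hypothetical model of one COG file: the base image serves `maxZoom`,
  // and each overview serves one further zoom level below it.
  final case class CogFile(maxZoom: Int, overviewCount: Int) {
    // Zoom levels that this file can answer.
    def readableZooms: Range = (maxZoom - overviewCount) to maxZoom

    def canRead(zoom: Int): Boolean = readableZooms.contains(zoom)
  }
}
```

In this model a file rebuilt without overviews still answers requests at its highest zoom but nothing below it, which fits the symptom: the seam is intact at some zoom levels and missing at others.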

1.4 Optimization

Once this is clear, the fix is easy: look at how GeoTrellis generates COG Tiffs and produce the merged Tiff the same way.

private def generateGeoTiffRDD[
 K: SpatialComponent: Ordering: JsonFormat: ClassTag,
 V <: CellGrid: ClassTag: ? => TileMergeMethods[V]: ? => TilePrototypeMethods[V]: ? => TileCropMethods[V]: GeoTiffBuilder
](
 rdd: RDD[(K, V)],
 zoomRange: ZoomRange ,
 layoutScheme: ZoomedLayoutScheme,
 cellType: CellType,
 compression: Compression
): RDD[(K, GeoTiff[V])] = {
 val kwFomat = KryoWrapper(implicitly[JsonFormat[K]])
 val crs = layoutScheme.crs

 val minZoomLayout = layoutScheme.levelForZoom(zoomRange.minZoom).layout
 val maxZoomLayout = layoutScheme.levelForZoom(zoomRange.maxZoom).layout

 val options: GeoTiffOptions =
  GeoTiffOptions(
    storageMethod = Tiled(maxZoomLayout.tileCols, maxZoomLayout.tileRows),
    compression = compression
  )

 rdd.
  mapPartitions { partition =>
    partition.map { case (key, tile) =>
      val extent: Extent = key.getComponent[SpatialKey].extent(maxZoomLayout)
      val minZoomSpatialKey = minZoomLayout.mapTransform(extent.center)

      (key.setComponent(minZoomSpatialKey), (key, tile))
    }
  }.
  groupByKey(new HashPartitioner(rdd.partitions.length)).
  mapPartitions { partition =>
    val keyFormat = kwFomat.value
    partition.map { case (key, tiles) =>
      val cogExtent = key.getComponent[SpatialKey].extent(minZoomLayout)
      val centerToCenter: Extent = {
        val h = maxZoomLayout.cellheight / 2
        val w = maxZoomLayout.cellwidth / 2
        Extent(
          xmin = cogExtent.xmin + w,
          ymin = cogExtent.ymin + h,
          xmax = cogExtent.xmax - w,
          ymax = cogExtent.ymax - h)
      }
      val cogTileBounds: GridBounds = maxZoomLayout.mapTransform.extentToBounds(centerToCenter)
      val cogLayout: TileLayout = maxZoomLayout.layoutForBounds(cogTileBounds).tileLayout

      val segments = tiles.map { case (key, value) =>
        val SpatialKey(col, row) = key.getComponent[SpatialKey]
        (SpatialKey(col - cogTileBounds.colMin, row - cogTileBounds.rowMin), value)
      }

      val cogTile = GeoTiffBuilder[V].makeTile(
        segments.iterator,
        cogLayout,
        cellType,
        Tiled(cogLayout.tileCols, cogLayout.tileRows),
        compression)

      val cogTiff = GeoTiffBuilder[V].makeGeoTiff(
        cogTile, cogExtent, crs,
        Tags(Map("GT_KEY" -> keyFormat.write(key).prettyPrint), Nil),
        options
      ).withOverviews(NearestNeighbor)

      (key, cogTiff)
    }
  }
}

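This is the COG Tiff generation code in GeoTrellis. The key part is the GeoTiffBuilder[V].makeGeoTiff call at the bottom, which differs slightly from our version above. We only need to modify ours to match, as follows:

def merge(v1: GeoTiff[V], v2: GeoTiff[V]) = {
    val tile: V = v2.tile merge v1.tile
    val extent = v2.extent combine v1.extent
    val crs = v2.crs
    // Reuse the existing Tiff's tags (including GT_KEY) and options,
    // then rebuild the overviews so that the result is a valid COG.
    GeoTiffBuilder[V].makeGeoTiff(
      tile, extent, crs, v2.tags, v2.options
    ).withOverviews(NearestNeighbor)
}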

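There are three main changes: the Tiff tags are taken from the existing Tiff, which carries over the GT_KEY tag; the options of the existing Tiff are reused; and withOverviews is called. This satisfies the COG requirements and produces a Tiff file that conforms to the COG format as used by GeoTrellis.

2. Summary

This post described how to update a Layer during ETL in COG mode. Once the principle is understood, the code itself is not complicated. This is also my personal takeaway as a programmer: what matters is cultivating a programming mindset and problem-solving ability, not the specific code.

Links to the GeoTrellis series: http://www.cnblogs.com/shoufengwei/p/5619419.html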

