2

I'm using geotrellis to load geotiff rasters from Landsat 8 that are sitting on S3. However, they are stored on a per-band basis. I can use S3GeoTiff class to load individual bands, e.g.:

val options = S3GeoTiffRDD.Options(getS3Client = () => S3Client.ANONYMOUS)

val rddBlue = S3GeoTiffRDD.spatial("landsat-pds", "L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B2.TIF", options)
val rddGreen = S3GeoTiffRDD.spatial("landsat-pds", "L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B3.TIF", options)
val rddRed = S3GeoTiffRDD.spatial("landsat-pds", "L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B4.TIF", options)

But how do I go about combining them to generate an RGB raster, e.g.

val rddRGB = ??? // something like combineRDDs(rddRed, rddGreen, rddBlue)
www
  • 38,575
  • 12
  • 48
  • 84
dff
  • 23
  • 2

1 Answers1

2

From the RDDs of single files with no maxTileSize option set, you'll end up with RDDs of full images (one element). I'd recommend setting the maxTileSize option.

There's an overload that allows you to place extra information into the key. This is how I'd approach this problem generically.

Here is some code that does what you are looking for, which utilizes those options:

import geotrellis.raster._
import geotrellis.spark._
import geotrellis.spark.io.s3._
import geotrellis.vector._

import org.apache.spark.SparkContext
import org.apache.spark.rdd._

import java.net.URI
import scala.math.BigDecimal.RoundingMode

implicit val sc: SparkContext =
  geotrellis.spark.util.SparkUtils.createLocalSparkContext("local[*]", "landsat-example")

try {

  val options =
    S3GeoTiffRDD.Options(
      getS3Client = () => S3Client.ANONYMOUS,
      maxTileSize = Some(512),
      numPartitions = Some(100)
    )

  type LandsatKey = (ProjectedExtent, URI, Int)

  // For each RDD, we're going to include more information in the key, including:
  // - the ProjectedExtent
  // - the URI
  // - the future band value
  def uriToKey(bandIndex: Int): (URI, ProjectedExtent) => LandsatKey =
    { (uri, pe) =>
      (pe, uri, bandIndex)
    }

  // Read an RDD of source tiles for each of the bands.

  val redSourceTiles =
    S3GeoTiffRDD[ProjectedExtent, LandsatKey, Tile](
      "landsat-pds",
      "L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B2.TIF",
      uriToKey(0),
      options
    )

  val greenSourceTiles =
    S3GeoTiffRDD[ProjectedExtent, LandsatKey, Tile](
      "landsat-pds",
      "L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B3.TIF",
      uriToKey(1),
      options
    )

  val blueSourceTiles =
    S3GeoTiffRDD[ProjectedExtent, LandsatKey, Tile](
      "landsat-pds",
      "L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B4.TIF",
      uriToKey(2),
      options
    )

  // Union these together, rearrange the elements so that we'll be able to group by key,
  // group them by key, and the rearrange again to produce multiband tiles.
  val sourceTiles: RDD[(ProjectedExtent, MultibandTile)] = {
    sc.union(redSourceTiles, greenSourceTiles, blueSourceTiles)
      .map { case ((pe, uri, bandIndex), tile) =>
        // Get the center of the tile, which we will join on
        val (x, y) = (pe.extent.center.x, pe.extent.center.y)

        // Round the center coordinates in case there's any floating point errors
        val center =
          (
            BigDecimal(x).setScale(5, RoundingMode.HALF_UP).doubleValue(),
            BigDecimal(y).setScale(5, RoundingMode.HALF_UP).doubleValue()
          )

        // Get the scene ID from the path
        val sceneId = uri.getPath.split('/').reverse.drop(1).head

        val newKey = (sceneId, center)
        val newValue = (pe, bandIndex, tile)
        (newKey, newValue)
      }
      .groupByKey()
      .map { case (oldKey, groupedValues) =>
        val projectedExtent = groupedValues.head._1
        val bands = Array.ofDim[Tile](groupedValues.size)
        for((_, bandIndex, tile) <- groupedValues) {
          bands(bandIndex) = tile
        }

        (projectedExtent, MultibandTile(bands))
      }
  }

  // From here, you could ingest the multiband layer.
  // But for a simple test, we will rescale the bands and write them out to a single GeoTiff
  import geotrellis.spark.tiling.FloatingLayoutScheme
  import geotrellis.raster.io.geotiff.GeoTiff

  val (_, metadata) = sourceTiles.collectMetadata[SpatialKey](FloatingLayoutScheme(512))
  val tiles = sourceTiles.tileToLayout[SpatialKey](metadata)
  val raster =
    ContextRDD(tiles, metadata)
      .withContext { rdd =>
        rdd.mapValues { tile =>
          // Magic numbers! These were created by fiddling around with
          // numbers until some example landsat images looked good enough
          // to put on a map for some other project.
          val (min, max) = (4000, 15176)
          def clamp(z: Int) = {
            if(isData(z)) { if(z > max) { max } else if(z < min) { min } else { z } }
            else { z }
          }
          val red = tile.band(0).map(clamp _).delayedConversion(ByteCellType).normalize(min, max, 0, 255)
          val green = tile.band(1).map(clamp _).delayedConversion(ByteCellType).normalize(min, max, 0, 255)
          val blue = tile.band(2).map(clamp _).delayedConversion(ByteCellType).normalize(min, max, 0, 255)

          MultibandTile(red, green, blue)
        }
      }
      .stitch

  GeoTiff(raster, metadata.crs).write("/tmp/landsat-test.tif")
} finally {
  sc.stop()
}
Rob
  • 855
  • 7
  • 8
  • Thanks! I had to change `val newKey = (uri, center)` to `val newKey = center` (otherwise it wouldn't group tiles since uri is unique for each band), but after that, it worked. But do you mind elaborating a bit more on the magic numbers you used? Where do they come from? Thanks! – dff Jun 08 '17 at 05:55
  • @dff The magic numbers come from another project, which you can find here: (https://github.com/geotrellis/geotrellis-landsat-emr-demo/blob/master/server/src/main/scala/demo/Render.scala#L19). Also I made a few fixes and set this example into repo as seen in this PR: https://github.com/locationtech/geotrellis/pull/2233 Thank you for the question! – Rob Jun 09 '17 at 17:01