问题：

如何找到像素级标准偏差？

司马狐若

2023-03-14

我有20个具有相同分辨率和范围的光栅。这是一个时间序列，每个光栅代表一年。

我想计算所有光栅的像素均方差。到目前为止，我使用的是光栅包。

qq2<-list(maxras1,maxras2,maxras3,maxras4,maxras5,maxras6,maxras7,maxras8,maxras9,maxras10)
qq2stack<-stack(qq2)
qq2mean<-mean(qq2stack)
qq2sd<-sd(qq2stack)

中庸之道奏效了。但标准差给了我这个错误：

Error in as.double(x) : 
  cannot coerce type 'S4' to vector of type 'double'

共有2个答案

吕志诚

2023-03-14

为了解决计算大型光栅的标准偏差时光栅：：calc的性能较慢的问题，我写了一个要点（这里），它使用GDAL的GDAL\u calc.py来实现。

要计算具有400万个单元格（2000 x 2000）的5个光栅的sd，gdal\U calc大约需要2秒，而光栅：：calc大约需要60秒（请参见下面的示例）。

您需要gdal_calc.py系统路径，以便它可以被Sys.which找到，或者修改gdal_calc

请注意，从内存来看，该函数最多支持26个输入栅格，因为gdal_calc用字母表中的一个字母来引用每个输入栅格。也许可以绕过这个限制——我对gdalnumic语法不够熟悉，无法解决这个问题。

例如：

首先，让我们创建一个由五个随机的2000 x 2000光栅组成的堆栈。我们将在此堆栈上使用光栅：：calc，但我们也将它们写入临时文件，因为gdal\U calc需要文件路径：

s <- stack(replicate(5, raster(matrix(runif(4000000), 2000))))  
ff <- replicate(5, tempfile(fileext='.tif')) # a vector of temp file paths
writeRaster(s, ff, bylayer=TRUE)

这是光栅::calc版本：

system.time(sd_calc <- calc(s, sd))

##  user  system elapsed 
## 79.83    0.08   80.00

以及gdal\U sd（即使用gdal\U calc.py）版本：

devtools::source_gist('61c8062938e05a4c6b92') # source the gdal_sd gist
system.time(gdal_sd(infile=ff, outfile=f <- tempfile(fileext='.tif')))

## Calculating standard deviation and writing to file17b0146640ac.tif
##    user  system elapsed 
##    0.00    0.03    2.05

以及他们价值观的比较：

range(sd_calc[] - raster(f)[])
## [1] -1.110223e-16  1.110223e-16

请注意，光栅::calc对于小光栅来说可能更快，但显然gdal_sd对于大光栅来说是一个巨大的改进。

此处给出了gdal\U calc使用的标准偏差函数的文档。我已经指定了ddof=1（即delta自由度），以便结果与R的sd返回的结果一致。

对于后代来说，这里是gdal_sd的来源，就像发布时一样：

gdal_sd <- function(infile, outfile, quiet=TRUE) {
  require(rgdal)
  # infile:   The multiband raster file (or a vector of paths to multiple raster
  #           files) for which to calculate cell standard deviations.
  # outfile:  Path to raster output file.
  # quiet:    Logical. Should gdal_calc.py output be silenced?
  gdal_calc <- Sys.which('gdal_calc.py')
  if(gdal_calc=='') stop('gdal_calc.py not found on system.')
  if(file.exists(outfile)) stop('outfile already exists.')
  nbands <- sapply(infile, function(x) nrow(attr(GDALinfo(x), 'df')))
  if(length(infile) > 26 || nbands > 26) stop('Maximum number of inputs is 26.')
  if(length(nbands) > 1 & any(nbands > 1)) 
    warning('One or more rasters have multiple bands. First band used.')

  if(length(infile)==1) {
    inputs <- paste0('-', LETTERS[seq_len(nbands)], ' ', infile, ' --', 
           LETTERS[seq_len(nbands)], '_band ', seq_len(nbands), collapse=' ')
    n <- nbands
  } else {
    inputs <- paste0('-', LETTERS[seq_along(nbands)], ' ', infile, ' --', 
           LETTERS[seq_along(nbands)], '_band 1', collapse=' ')
    n <- length(infile)
  }

  message('Calculating standard deviation and writing to ', basename(outfile))
  cmd <- 'python %s %s --outfile=%s --calc="std([%s], 0, ddof=1)"'
  out <- system(
    sprintf(cmd, gdal_calc, inputs, outfile, 
            paste0(LETTERS[seq_len(n)], collapse=',')),
    show.output.on.console=!quiet, intern=TRUE
  )
  if(any(grepl('Error', out))) stop(out, call.=FALSE) else NULL
}

狄玉书

2023-03-14

不幸的是，正如您在我的上述评论后所注意到的，每像素分析可能会很慢。我认为下一步你要尝试的是并行化这个过程。假设您有一个多核处理器，您可以利用其内置的多进程优化：

cores <- 4
beginCluster(cores, type='SOCK')
calc(qq2stack, fun=sd)
endCluster()

如果您的操作/硬件环境支持，这将大大加快速度。显然，您也可以根据您的体系结构增加进程的数量。

如何找到像素级标准偏差？

共有2个答案

相关问答

相关文章

相关阅读

相关工具

相关文档