%% $Id: csdacm.Rnw,v 1.44 2008-04-06 20:47:04 roger Exp $ % -*- mode: noweb; noweb-default-code-mode: R-mode; -*- % merge all depth coefficients!! % Customising spatial data analysis classes and methods %\SweaveOpts{echo=TRUE} %*Intro and plan* %motivate spatial data handling \documentclass{article} % \VignetteIndexEntry{Customising spatial data classes and methods} \usepackage{graphicx} \usepackage[colorlinks=true,urlcolor=blue]{hyperref} \usepackage{color} \usepackage{Sweave} \SweaveOpts{keep.source=FALSE} \usepackage{natbib} \newcommand{\strong}[1]{{\normalfont\fontseries{b}\selectfont #1}} \let\pkg=\strong \newcommand\code{\bgroup\@codex} \def\@codex#1{\small {\normalfont\ttfamily\hyphenchar\font=45 #1}\egroup} \usepackage{xspace} \def\RR{\textsf{R}\xspace} \def\SP{\texttt{S-PLUS}\texttrademark\xspace} \def\SS{\texttt{S}\xspace} \def\SIII{\texttt{S3}\xspace} \def\SIV{\texttt{S4}\xspace} \title{{\bf Customising spatial data classes and methods}\footnote{This vignette formed pp. 127--148 of the first edition of Bivand, R. S., Pebesma, E. and G\'{o}mez-Rubio V. (2008) Applied Spatial Data Analysis with R, Springer-Verlag, New York. It was retired from the second edition (2013) to accommodate material on other topics, and is made available in this form with the understanding of the publishers. It has been updated to the 2013 state of the software, e.g. using \code{over}.} } \author{Edzer Pebesma\footnote{Institute for Geoinformatics, University of Muenster, Weseler Strasse 253, 48151 M\"{u}nster, Germany. {\tt edzer.pebesma@uni-muenster.de}}} \date{Feb 2008} \begin{document} \maketitle \tableofcontents <>= owidth <- getOption("width") options("width"=90) .PngNo <- 0 @ <>= .iwidth <- 5 .iheight <- 6 .ipointsize <- 12 @ <>= <> @ <>= .PngNo <- .PngNo + 1; file <- paste("Fig-bitmap-", .PngNo, ".pdf", sep="") pdf(file=file, width = .iwidth, height = .iheight, pointsize = .ipointsize) opar <- par(mar=c(3,3,1,1)+0.1) @ <>= dev.null <- dev.off() cat("\\includegraphics[width=0.95\\textwidth]{", file, "}\n\n", sep="") @ Although the classes defined in the \pkg{sp} package cover many needs, they do not go far beyond the most typical GIS data models. In applied research, it often happens that customised classes would suit the actual data coming from the instruments better. Since \SIV classes have mechanisms for inheritance, it may be attractive to build on the \pkg{sp} classes, so as to utilise their methods where appropriate. Here, we will demonstrate a range of different settings in which \pkg{sp} classes can be extended. Naturally, this is only useful for researchers with specific and clear needs, so our goal is to show how (relatively) easy it may be to prototype classes extending \pkg{sp} classes for specific purposes. \section{Programming with classes and methods} \label{sect:cls} This section will explain the elementary basics of programming with classes and methods in \RR. The \SS language (implemented in \RR and \SP) contains two mechanisms for creating classes and methods: the traditional \SIII system and the more recent \SIV system (see Section 2.2 in \cite{bivand}, in which classes were described for the use{\RR} --- here they are described for the develope{\RR}). This chapter is not a full introduction to \RR programming \citep[see][for more details]{braun+murdoch:07}, but it will try to give some feel of how the \code{Spatial} classes in package \pkg{sp} can be extended to be used for wider classes of problems. For full details, the interested reader is referred to e.g.\ \cite{R:Venables+Ripley:2000} and \cite{R:Chambers:1998}, the latter being a reference for new-style \SIV classes and methods. Example code is for example found in the source code for package \pkg{sp}, available from CRAN. Suppose we define myfun as \begin{footnotesize} << >>= myfun <- function(x) { x + 2 } @ \end{footnotesize} then, calling it with the numbers 1, 2 and 3 results in \begin{footnotesize} << >>= myfun(1:3) @ \end{footnotesize} or alternatively using a named argument: \begin{footnotesize} << >>= myfun(x=1:3) @ \end{footnotesize} The return value of the function is the last expression evaluated. Often, we want to wrap existing functions, such as a plot function: \begin{footnotesize} << >>= plotXplus2Yminus3 <- function(x, y, ...) { plot(x = x + 2, y = y - 3, ...) } @ \end{footnotesize} In this case, the \code{...} is used to pass information to the \code{plot} function without explicitly anticipating what it will be: named arguments \code{x} and \code{y}, or the first two arguments if they are unnamed are processed, remaining arguments are passed on. The plot function is a generic method, with an instance that depends on the class of its first (\SIII) or first $n$ arguments (\SIV). The available instances of \code{plot} are shown for \SIII-type methods by \begin{footnotesize} << >>= methods("plot") @ \end{footnotesize} and for \SIV-type methods by \begin{footnotesize} << >>= library(sp) showMethods("plot") @ \end{footnotesize} where we first loaded \pkg{sp} to make sure there are some \SIV plot methods to show. \subsection{\SIII-style classes and methods} Building \SIII-style classes is simple. Suppose we want to build an object of class \code{foo}: \begin{footnotesize} << >>= x <- rnorm(10) class(x) <- "foo" x @ \end{footnotesize} If we plot this object, e.g., by \code{plot(x)} we get the same plot as when we would not have set the class to \code{foo}. If we know, however, that objects of class \code{foo} need to be plotted without symbols but with connected lines, we can write a plot method for this class: \begin{footnotesize} << >>= plot.foo <- function(x, y, ...) { plot.default(x, type = 'l', ...) } @ \end{footnotesize} after which \code{plot(x)} will call this particular method, rather than a default plot method. Class inheritance is obtained in \SIII when an object is given multiple classes, as in \begin{footnotesize} << >>= class(x) <- c("foo", "bar") @ <>= plot(x) @ \end{footnotesize} For this plot, first function \code{plot.foo} will be looked for, and if not found the second option \code{plot.bar} will be looked for. If none of them is found, the default \code{plot.default} will be used. The \SIII class mechanism is simple and powerful. Much of \RR works with it, including key functions such as \code{lm}. \begin{footnotesize} << >>= data(meuse) class(meuse) class(lm(log(zinc)~sqrt(dist), meuse)) @ \end{footnotesize} There is, however, no checking that a class with a particular name does indeed contain the elements that a certain method for it expects. It also has design flaws, as method specification by dot separation is ambiguous in case of names such as \code{as.data.frame}, where one cannot tell whether it means that the method \code{as.data} acts on objects of class \code{frame}, or the method \code{as} acts on objects of class \code{data.frame}, or none of them (the answer is: none). For such reasons, \SIV-style classes and methods were designed. \subsection{\SIV-style classes and methods} \SIV-style classes are formally defined, using \code{setClass}. As an example, somewhat simplified versions of classes \code{CRS} and \code{Spatial} in \pkg{sp} are \begin{footnotesize} <>= options("width"=60) @ <>= setClass("CRS", representation(projargs = "character")) setClass("Spatial", representation(bbox = "matrix", proj4string = "CRS"), # NOT TOO WIDE validity <- function(object) { bb <- bbox(object) if (!is.matrix(bb)) return("bbox should be a matrix") n <- dimensions(object) if (n < 2) return("spatial.dimension should be 2 or more") if (any(is.na(bb))) return("bbox should never contain NA values") if (any(!is.finite(bb))) return("bbox should never contain infinite values") if (any(bb[,"max"] < bb[,"min"])) return("invalid bbox: max < min") TRUE } ) @ <>= options("width"=70) @ \end{footnotesize} The command \code{setClass} defines a class name as a formal class, gives the names of the class elements (called slots), and their type---type checking will happen upon construction of an instance of the class. Further checking, e.g., on valid dimensions and data ranges can be done in the \code{validity} function. Here, the validity function retrieves the bounding box using the generic \code{bbox} method. Generics, if not defined in the base R system, e.g., \begin{footnotesize} << >>= isGeneric("show") @ \end{footnotesize} can be defined with \code{setGeneric}. Defining a specific instance of a generic is done by \code{setMethod}: \begin{footnotesize} <>= setGeneric("bbox", function(obj) standardGeneric("bbox")) setMethod("bbox", signature = "Spatial", function(obj) obj@bbox) @ \end{footnotesize} where the signature tells the class of the first (or first $n$) arguments. Here, the \code{@} operator is used to access the \code{bbox} slot in an \SIV object, not to be confused with the \code{\$} operator to access list elements. We will now illustrate this mechanism by providing a few examples of classes, building on those available in package \pkg{sp}. \section{Animal track data in package \pkg{trip}} CRAN Package \pkg{trip}, written by Michael Sumner \citep{kirkwood06,page06}, provides a class for animal tracking data. Animal tracking data consist of sets of ($x,y,t$) stamps, grouped by an identifier pointing to an individual animal, sensor or perhaps isolated period of monitoring. A strategy for this (slightly simplified from that of \pkg{trip}) is to extend the {\tt SpatialPointsDataFrame} class by a length 2 character vector carrying the names of the time column and the trip identifier column in the {\tt SpatialPointsDataFrame} attribute table. Package \pkg{trip} does a lot of work to read and analyse tracking data from data formats typical for tracking data (Argos DAT), removing duplicate observations and validating the objects, e.g., checking that time stamps increase and movement speeds are realistic. We ignore this and stick to the bare bones. We now define a class called \code{trip} that extends \code{SpatialPointsDataFrame}: \begin{footnotesize} << >>= library(sp) setClass("trip", representation("SpatialPointsDataFrame", TOR.columns = "character"), validity <- function(object) { if (length(object@TOR.columns) != 2) stop("Time/id column names must be of length 2") if (!all(object@TOR.columns %in% names(object@data))) stop("Time/id columns must be present in attribute table") TRUE } ) showClass("trip") @ \end{footnotesize} that checks, upon creation of objects, that indeed two variable names are passed and that these names refer to variables present in the attribute table. \subsection{Generic and constructor functions} It would be nice to have a constructor function, just like \code{data.frame} or \code{SpatialPoints}, so we now create it and set it as the generic function to be called in case the first argument is of class \code{SpatialPointsDataFrame}. \begin{footnotesize} << >>= trip.default <- function(obj, TORnames) { if (!is(obj, "SpatialPointsDataFrame")) stop("trip only supports SpatialPointsDataFrame") if (is.numeric(TORnames)) TORnames <- names(obj)[TORnames] new("trip", obj, TOR.columns = TORnames) } if (!isGeneric("trip")) setGeneric("trip", function(obj, TORnames) standardGeneric("trip")) setMethod("trip", signature(obj = "SpatialPointsDataFrame", TORnames = "ANY"), trip.default) @ \end{footnotesize} We can now try it out, with turtle data: \begin{footnotesize} << >>= turtle <- read.csv(system.file("external/seamap105_mod.csv", package="sp")) @ << >>= timestamp <- as.POSIXlt(strptime(as.character(turtle$obs_date), "%m/%d/%Y %H:%M:%S"), "GMT") turtle <- data.frame(turtle, timestamp = timestamp) turtle$lon <- ifelse(turtle$lon < 0, turtle$lon+360, turtle$lon) turtle <- turtle[order(turtle$timestamp),] coordinates(turtle) <- c("lon", "lat") proj4string(turtle) <- CRS("+proj=longlat +ellps=WGS84") turtle$id <- c(rep(1, 200), rep(2, nrow(coordinates(turtle)) - 200)) turtle_trip <- trip(turtle, c("timestamp", "id")) summary(turtle_trip) @ \end{footnotesize} \subsection{Methods for trip objects} The summary method here is not defined for \code{trip}, but is the default summary inherited from class \code{Spatial}. As can be seen, nothing special about the trip features is mentioned, such as what the time points are and what the identifiers. We could alter this by writing a class-specific summary method \begin{footnotesize} << >>= summary.trip <- function(object, ...) { cat("Object of class \"trip\"\nTime column: ") print(object@TOR.columns[1]) cat("Identifier column: ") print(object@TOR.columns[2]) print(summary(as(object, "Spatial"))) print(summary(object@data)) } setMethod("summary", "trip", summary.trip) summary(turtle_trip) @ \end{footnotesize} As \code{trip} extends \code{SpatialPointsDataFrame}, subsetting using {\small \verb+"["+} and column selection or replacement using {\small \verb+"[["+} or {\small \verb+"$"+} all work, as these are inherited. Creating invalid trip objects can be prohibited by adding checks to the validity function in the class definition, e.g., %<>= %x <- turtle_trip[1] %@ will not work because the time and/or id column are not present any more. A custom plot method for trip could be written, for example using colour to denote a change in identifier: \begin{footnotesize} << >>= setGeneric("lines", function(x, ...) standardGeneric("lines")) setMethod("lines", signature(x = "trip"), function(x, ..., col = NULL) { # NOT TOO WIDE tor <- x@TOR.columns if (is.null(col)) { l <- length(unique(x[[tor[2]]])) col <- hsv(seq(0, 0.5, length = l)) } coords <- coordinates(x) lx <- split(1:nrow(coords), x[[tor[2]]]) for (i in 1:length(lx)) lines(coords[lx[[i]], ], col = col[i], ...) } ) @ \end{footnotesize} Here, the \code{col} argument is added to the function header so that a reasonable default can be overridden, e.g., for black/white plotting. \section{Multi-point data: \texttt{SpatialMultiPoints}} One of the feature types of the OpenGeospatial Consortium (OGC) simple feature specification that has not been implemented in \pkg{sp} is the \code{MultiPoint} object. In a \code{MultiPoint} object, each feature refers to a {\em set of} points. The \pkg{sp} classes \code{SpatialPointsDataFrame} only provide reference to a single point. Instead of building a new class up from scratch, we'll try to re-use code and build a class \code{SpatialMultiPoint} from the {\tt SpatialLines} class. After all, lines are just sets of ordered points. In fact, the \code{SpatialLines} class implements the \code{MultiLineString} simple feature, where each feature can refer to multiple lines. A special case is formed if each feature only has a single line: \begin{footnotesize} <>= options("width"=50) @ << >>= setClass("SpatialMultiPoints", representation("SpatialLines"), validity <- function(object) { if (any(unlist(lapply(object@lines, function(x) length(x@Lines))) != 1)) # NOT TOO WIDE stop("Only Lines objects with one Line element") TRUE } ) SpatialMultiPoints <- function(object) new("SpatialMultiPoints", object) @ <>= options("width"=70) @ \end{footnotesize} As an example, we can create an instance of this class for two MultiPoint features each having three locations: \begin{footnotesize} << >>= n <- 5 set.seed(1) x1 <- cbind(rnorm(n),rnorm(n, 0, 0.25)) x2 <- cbind(rnorm(n),rnorm(n, 0, 0.25)) x3 <- cbind(rnorm(n),rnorm(n, 0, 0.25)) L1 <- Lines(list(Line(x1)), ID="mp1") L2 <- Lines(list(Line(x2)), ID="mp2") L3 <- Lines(list(Line(x3)), ID="mp3") s <- SpatialLines(list(L1,L2,L3)) smp <- SpatialMultiPoints(s) @ \end{footnotesize} If we now plot object \code{smp}, we get the same plot as when we plot \code{s}, showing the two lines. The \code{plot} method for a \code{SpatialLines} object is not suitable, so we write a new one: \begin{footnotesize} << >>= plot.SpatialMultiPoints <- function(x, ..., pch = 1:length(x@lines), col = 1, cex = 1) { n <- length(x@lines) if (length(pch) < n) pch <- rep(pch, length.out = n) if (length(col) < n) col <- rep(col, length.out = n) if (length(cex) < n) cex <- rep(cex, length.out = n) plot(as(x, "Spatial"), ...) for (i in 1:n) points(x@lines[[i]]@Lines[[1]]@coords, pch = pch[i], col = col[i], cex = cex[i]) } setMethod("plot", signature(x = "SpatialMultiPoints", y = "missing"), function(x, y, ...) plot.SpatialMultiPoints(x, ...)) @ \end{footnotesize} Here we chose to pass any named \code{...} arguments to the plot method for a \code{Spatial} object. This function sets up the axes and controls the margins, aspect ratio, etc. All arguments that need to be passed to \code{points} (\code{pch} for symbol type, \code{cex} for symbol size and \code{col} for symbol colour) need explicit naming and sensible defaults, as they are passed explicitly to the consecutive calls to \code{points}. According to the documentation of \code{points}, in addition to \code{pch}, \code{cex} and \code{col}, the arguments \code{bg} and \code{lwd} (symbol fill colour and symbol line width) would need a similar treatment to make this plot method completely transparent with the base \code{plot} method---something an end user would hope for. Having \code{pch}, \code{cex} and \code{col} arrays the length of the number of \code{MultiPoints} {\em sets} rather than the number of points to be plotted is useful for two reasons. First, the whole point of {\tt MultiPoints} object is to distinguish {\em sets} of points. Second, when we extend this class to \code{SpatialMultiPointsDataFrame}, e.g., by \begin{footnotesize} << >>= cName <- "SpatialMultiPointsDataFrame" setClass(cName, representation("SpatialLinesDataFrame"), validity <- function(object) { lst <- lapply(object@lines, function(x) length(x@Lines)) if (any(unlist(lst) != 1)) stop("Only Lines objects with single Line") TRUE } ) SpatialMultiPointsDataFrame <- function(object) { new("SpatialMultiPointsDataFrame", object) } @ \end{footnotesize} then we can pass symbol characteristics by (functions of) columns in the attribute table: \begin{footnotesize} <>= df <- data.frame(x1 = 1:3, x2 = c(1,4,2), row.names = c("mp1", "mp2", "mp3")) smp_df <- SpatialMultiPointsDataFrame(SpatialLinesDataFrame(smp, df)) setMethod("plot", signature(x = "SpatialMultiPointsDataFrame", y = "missing"), function(x, y, ...) plot.SpatialMultiPoints(x, ...)) grys <- c("grey10", "grey40", "grey80") @ <>= plot(smp_df, col = grys[smp_df[["x1"]]], pch = smp_df[["x2"]], cex = 2, axes = TRUE) @ \end{footnotesize} for which the plot is shown in Figure \ref{fig:smpdf}. \begin{figure} %<>= <>= .iwidth <- 6 .iheight <- 2.5 .ipointsize <- 10 <> plot(smp_df, col = grys[smp_df[["x1"]]], pch = smp_df[["x2"]], cex = 2, axes = TRUE) <> <> @ \caption{Plot of the \code{SpatialMultiPointsDataFrame} object.} \label{fig:smpdf} \end{figure} Hexagonal grids are like square grids, where grid points are centres of matching hexagons, rather than squares. Package \pkg{sp} has no classes for hexagonal grids, but does have some useful functions for generating and plotting them. This could be used to build a class. Much of this code in \pkg{sp} is based on postings to the R-sig-geo mailing list by Tim Keitt, used with permission. The spatial sampling method \code{spsample} has a method for sampling points on a hexagonal grid: \begin{footnotesize} <<>>= data(meuse.grid) gridded(meuse.grid)=~x+y xx <- spsample(meuse.grid, type="hexagonal", cellsize=200) class(xx) @ \end{footnotesize} gives the points shown in the left side of Figure \ref{fig:hex}. Note that an alternative hexagonal representation is obtained by rotating this grid 90 degrees; we will not further consider that here. \begin{footnotesize} <>= HexPts <- spsample(meuse.grid, type="hexagonal", cellsize=200) @ <>= spplot(meuse.grid["dist"], sp.layout = list("sp.points", HexPts, col = 1)) @ <>= HexPols <- HexPoints2SpatialPolygons(HexPts) df <- over(HexPols, meuse.grid) HexPolsDf <- SpatialPolygonsDataFrame(HexPols, df, match.ID = FALSE) @ <>= spplot(HexPolsDf["dist"]) @ \end{footnotesize} for which the plots are shown in Figure \ref{fig:hex}. \begin{figure} <>= .iwidth <- 6 .iheight <- 4 <> library(lattice) # RSB quietening greys grys <- grey.colors(11, 0.95, 0.55, 2.2) print(spplot(meuse.grid["dist"], cuts=10, col.regions=grys, sp.layout = list("sp.points", HexPts, col = 1)), split = c(1, 1, 2, 1), more = TRUE) print(spplot(HexPolsDf["dist"], cuts=10, col.regions=grys), split = c(2, 1, 2, 1), more = FALSE) <> <> @ \caption{Hexagonal points (left) and polygons (right).} \label{fig:hex} \end{figure} We can now generate and plot hexagonal grids, but need to deal with two representations: as points and as polygons, and both representations do not tell by themselves that they represent a hexagonal grid. For designing a hexagonal grid class we will extend \code{SpatialPoints}, assuming that computation of the polygons can be done when needed without a prohibitive overhead. \begin{footnotesize} << >>= setClass("SpatialHexGrid", representation("SpatialPoints", dx = "numeric"), validity <- function(object) { if (object@dx <= 0) stop("dx should be positive") TRUE } ) @ <>= options("width"=40) @ << >>= setClass("SpatialHexGridDataFrame", representation("SpatialPointsDataFrame", dx = "numeric"), # NOT TOO WIDE validity <- function(object) { if (object@dx <= 0) stop("dx should be positive") TRUE } ) @ <>= options("width"=70) @ \end{footnotesize} Note that these class definitions do not check that instances actually do form valid hexagonal grids; a more robust implementation could provide a test that distances between points with equal $y$ coordinate are separated by a multiple of \code{dx}, that the $y$-separations are correct and so on. It might make sense to adapt the generic \code{spsample} method in package \pkg{sp} to return \code{SpatialHexGrid} objects; we can also add \code{plot} and \code{spsample} methods for them. Method \code{over} should work with a \code{SpatialHexGrid} as its first argument, by inheriting from \code{SpatialPoints}. Let us first see how to create the new classes. Without a constructor function we can use \begin{footnotesize} << >>= HexPts <- spsample(meuse.grid, type="hexagonal", cellsize=200) Hex <- new("SpatialHexGrid", HexPts, dx = 200) df <- over(Hex, meuse.grid) spdf <- SpatialPointsDataFrame(HexPts, df) HexDf <- new("SpatialHexGridDataFrame", spdf, dx = 200) @ \end{footnotesize} Because of the route taken to define both HexGrid classes, it is not obvious that the second extends the first. We can tell the \SIV system this by \code{setIs}: \begin{footnotesize} << >>= is(HexDf, "SpatialHexGrid") setIs("SpatialHexGridDataFrame", "SpatialHexGrid") is(HexDf, "SpatialHexGrid") @ \end{footnotesize} to make sure that methods for \code{SpatialHexGrid} objects work as well for objects of class \code{SpatialHexGridDataFrame}. When adding methods, several of them will need conversion to the polygon representation, so it makes sense to add the conversion function such that e.g. \code{as(x, "SpatialPolygons")} will work: \begin{footnotesize} <>= options("width"=50) @ << >>= # NOT TOO WIDE setAs("SpatialHexGrid", "SpatialPolygons", function(from) HexPoints2SpatialPolygons(from, from@dx) ) setAs("SpatialHexGridDataFrame", "SpatialPolygonsDataFrame", function(from) SpatialPolygonsDataFrame(as(obj, "SpatialPolygons"), obj@data, match.ID = FALSE) ) @ <>= options("width"=70) @ \end{footnotesize} We can now add \code{plot}, \code{spplot}, \code{spsample} and \code{over} methods for these classes: \begin{footnotesize} << >>= setMethod("plot", signature(x = "SpatialHexGrid", y = "missing"), function(x, y, ...) plot(as(x, "SpatialPolygons"), ...) ) setMethod("spplot", signature(obj = "SpatialHexGridDataFrame"), function(obj, ...) spplot(SpatialPolygonsDataFrame( as(obj, "SpatialPolygons"), obj@data, match.ID = FALSE), ...) ) setMethod("spsample", "SpatialHexGrid", function(x, n, type, ...) spsample(as(x, "SpatialPolygons"), n = n, type = type, ...) ) setMethod("over", c("SpatialHexGrid", "SpatialPoints"), function(x, y, ...) over(as(x, "SpatialPolygons"), y) ) @ \end{footnotesize} After this, the following will work: \begin{footnotesize} <>= spplot(meuse.grid["dist"], sp.layout = list("sp.points", Hex, col = 1)) spplot(HexDf["dist"]) @ \end{footnotesize} Coercion to a data frame is done by \begin{footnotesize} <>= as(HexDf, "data.frame") @ \end{footnotesize} Another detail not mentioned is that the bounding box of the hexgrid objects only match the grid centre points, not the hexgrid cells: \begin{footnotesize} << >>= bbox(Hex) bbox(as(Hex, "SpatialPolygons")) @ \end{footnotesize} One solution for this is to correct for this in a constructor function, and check for it in the validity test. Explicit coercion functions to the points representation would have to set the bounding box back to the points ranges. Another solution is to write a bbox method for the hexgrid classes, taking the risk that someone still looks at the incorrect bbox slot. \section{Spatio-temporal grids} Spatio-temporal data can be represented in different ways. One simple option is when observations (or model-results, or predictions) are given on a regular space-time grid. Objects of class or extending \code{SpatialPoints}, \code{SpatialPixels} and \code{SpatialGrid} do not have the constraint that they represent a two-dimensional space; they may have arbitrary dimension; an example for a three-dimensional grid is \begin{footnotesize} << >>= n <- 10 x <- data.frame(expand.grid(x1 = 1:n, x2 = 1:n, x3 = 1:n), z = rnorm(n^3)) coordinates(x) <- ~x1+x2+x3 gridded(x) <- TRUE fullgrid(x) <- TRUE summary(x) @ \end{footnotesize} We might assume here that the third dimension, \code{x3}, represents time. If we are happy with time somehow represented by a real number (in double precision), then we are done. A simple representation is that of decimal year, with e.g. 1980.5 meaning the 183rd day of 1980, or e.g. relative time in seconds after the start of some event. When we want to use the \code{POSIXct} or \code{POSIXlt} representations, we need to do some more work to see the readable version. We will now devise a simple three-dimensional space-time grid with the \code{POSIXct} representation. \begin{footnotesize} <>= options("width"=50) @ << >>= # NOT TOO WIDE setClass("SpatialTimeGrid", "SpatialGrid", validity <- function(object) { stopifnot(dimensions(object) == 3) TRUE } ) @ <>= options("width"=70) @ \end{footnotesize} Along the same line, we can extend the \code{SpatialGridDataFrame} for space-time: \begin{footnotesize} << >>= setClass("SpatialTimeGridDataFrame", "SpatialGridDataFrame", validity <- function(object) { stopifnot(dimensions(object) == 3) TRUE } ) setIs("SpatialTimeGridDataFrame", "SpatialTimeGrid") x <- new("SpatialTimeGridDataFrame", x) @ \end{footnotesize} A crude summary for this class could be written along these lines: \begin{footnotesize} << >>= summary.SpatialTimeGridDataFrame <- function(object, ...) { cat("Object of class SpatialTimeGridDataFrame\n") x <- gridparameters(object) t0 <- ISOdate(1970,1,1,0,0,0) t1 <- t0 + x[3,1] cat(paste("first time step:", t1, "\n")) t2 <- t0 + x[3,1] + (x[3,3] - 1) * x[3,2] cat(paste("last time step: ", t2, "\n")) cat(paste("time step: ", x[3,2], "\n")) summary(as(object, "SpatialGridDataFrame")) } @ <>= options("width"=50) @ << >>= # NOT TOO WIDE setMethod("summary", "SpatialTimeGridDataFrame", summary.SpatialTimeGridDataFrame) summary(x) @ <>= options("width"=70) @ \end{footnotesize} Next, suppose we need a subsetting method that selects on the time. When the first subset argument is allowed to be a time range, this is done by \begin{footnotesize} << >>= subs.SpatialTimeGridDataFrame <- function(x, i, j, ..., drop=FALSE) { t <- coordinates(x)[,3] + ISOdate(1970,1,1,0,0,0) if (missing(j)) j <- TRUE sel <- t %in% i if (! any(sel)) stop("selection results in empty set") fullgrid(x) <- FALSE if (length(i) > 1) { x <- x[i = sel, j = j,...] fullgrid(x) <- TRUE as(x, "SpatialTimeGridDataFrame") } else { gridded(x) <- FALSE x <- x[i = sel, j = j,...] cc <- coordinates(x)[,1:2] p4s <- CRS(proj4string(x)) # NOT TOO WIDE SpatialPixelsDataFrame(cc, x@data, proj4string = p4s) } } setMethod("[", c("SpatialTimeGridDataFrame", "POSIXct", "ANY"), subs.SpatialTimeGridDataFrame) t1 <- as.POSIXct("1970-01-01 0:00:03", tz = "GMT") t2 <- as.POSIXct("1970-01-01 0:00:05", tz = "GMT") summary(x[c(t1,t2)]) summary(x[t1]) @ \end{footnotesize} The reason to only convert back to \code{SpatialTimeGridDataFrame} when multiple time steps are present is that the time step (``cell size'' in time direction) cannot be found when there is only a single step. In that case, the current selection method returns an object of class \code{SpatialPixelsDataFrame} for that time slice. Plotting a set of slices could be done using levelplot, or writing another \code{spplot} method: \begin{footnotesize} << >>= spplot.stgdf <- function(obj, zcol = 1, ..., format = NULL) { # NOT TOO WIDE if (length(zcol) != 1) stop("can only plot a single attribute") if (is.null(format)) format <- "%Y-%m-%d %H:%M:%S" cc <- coordinates(obj) df <- unstack(data.frame(obj[[zcol]], cc[,3])) ns <- format(coordinatevalues(getGridTopology(obj))[[3]] + ISOdate(1970,1,1,0,0,0), format = format) cc2d <- cc[cc[,3] == min(cc[,3]), 1:2] obj <- SpatialPixelsDataFrame(cc2d, df) spplot(obj, names.attr = ns,...) } setMethod("spplot", "SpatialTimeGridDataFrame", spplot.stgdf) @ \end{footnotesize} \begin{figure} \begin{center} <>= .iwidth <- 6 .iheight <- 4 <> #print(spplot(x, format = "%H:%M:%S", as.table=TRUE)) print(spplot(x, as.table=TRUE)) <> <> @ \end{center} \caption{ \code{spplot} for an object of class \code{SpatialTimeGridDataFrame}, filled with random numbers.} \label{fig:stgdf} \end{figure} Now, the result of \begin{footnotesize} <>= library(lattice) trellis.par.set(canonical.theme(color = FALSE)) spplot(x, format = "%H:%M:%S", as.table=TRUE, cuts=6, col.regions=grey.colors(7, 0.55, 0.95, 2.2)) # RSB quietening greys @ \end{footnotesize} is shown in Figure \ref{fig:stgdf}. The format argument passed controls the way time is printed; one can refer to the help of \begin{footnotesize} <>= ?as.character.POSIXt @ \end{footnotesize} for more details about the \code{format} argument. \section{Analysing spatial Monte Carlo simulations} \label{sec:simquant} Quite often, spatial statistical analysis results in a large number of spatial realisations or a random field, using some Monte Carlo simulation approach. Regardless whether individual values refer to points, lines, polygons or grid cells, we would like to write some methods or functions that aggregate over these simulations, to get summary statistics such as the mean value, quantiles, or cumulative distributions values. Such aggregation can take place in two ways. Either we aggregate over the probability space, and compute summary statistics for each geographical feature over the set of realisations (i.e., the rows of the attribute table), or for each realisation we aggregate over the complete geographical layer or a subset of it (i.e., aggregate over the columns of the attribute table). Let us first generate, as an example, a set of 100 conditional Gaussian simulations for the zinc variable in the meuse data set: \begin{footnotesize} <>= data(meuse) coordinates(meuse) <- ~x+y if (require(gstat, quietly = TRUE)) { v <- vgm(.5, "Sph", 800, .05) sim <- krige(log(zinc)~1, meuse, meuse.grid, v, nsim=100, nmax=30) sim@data <- exp(sim@data) } @ \end{footnotesize} where the last statement back-transforms the simulations from the log scale to the observation scale. A quantile method for Spatial object attributes can be written as \begin{footnotesize} << >>= quantile.Spatial <- function(x, ..., byLayer = FALSE) { stopifnot("data" %in% slotNames(x)) apply(x@data, ifelse(byLayer, 2, 1), quantile, ...) } @ \end{footnotesize} after which we can find the sample lower and upper 95\% confidence limits by \begin{footnotesize} << >>= if (require(gstat, quietly = TRUE)) { sim$lower <- quantile.Spatial(sim[1:100], probs = 0.025) sim$upper <- quantile.Spatial(sim[1:100], probs = 0.975) } @ \end{footnotesize} To get the sample distribution of the areal median, we can aggregate over layers: \begin{footnotesize} << >>= if (require(gstat, quietly = TRUE)) { medians <- quantile.Spatial(sim[1:100], probs = 0.5, byLayer = TRUE) } @ <>= hist(medians) @ \end{footnotesize} It should be noted that in these particular cases, the quantities computed by simulations could have been obtained faster and exact by working analytically with ordinary (block) kriging and the normal distribution (Section 8.7.2 in \cite{bivand}). A statistic that cannot be obtained analytically is the sample distribution of the area fraction that exceeds a threshold. Suppose that 500 is a crucial threshold, and we want to summarise the sampling distribution of the area fraction where 500 is exceeded: \begin{footnotesize} <>= options("width"=50) @ << >>= fractionBelow <- function(x, q, byLayer = FALSE) { stopifnot(is(x, "Spatial") || !("data" %in% slotNames(x))) apply(x@data < q, ifelse(byLayer, 2, 1), function(r) sum(r)/length(r)) # NOT TOO WIDE } @ <>= options("width"=70) @ << >>= if (require(gstat, quietly = TRUE)) { over500 <- 1 - fractionBelow(sim[1:100], 200, byLayer = TRUE) summary(over500) quantile(over500, c(0.025, 0.975)) } @ \end{footnotesize} For space-time data, we could write methods that aggregate over space, over time or over space and time. This chapter has sketched developments beyond the base \pkg{sp} classes and methods used otherwise in this book. Although we think that the base classes cater for many standard kinds of spatial data analysis, it is clear that specific research problems will call for specific solutions, and that the \RR environment provides the high-level abstractions needed to help busy researchers get their work done. <>= options("width"=owidth) @ \begin{thebibliography}{} \bibitem[Bivand et al., 2008]{bivand} Roger S. Bivand, Edzer J. Pebesma and Virgilio Gomez-Rubio (2008). \newblock {\em Applied spatial data analysis with {\RR}} \newblock Springer, NY \bibitem[Braun and Murdoch, 2007]{braun+murdoch:07} Braun, W.~J. and Murdoch, D.~J. (2007). \newblock {\em A first course in statistical programming with {\RR}}. \newblock Cambridge University Press, Cambridge. \bibitem[Chambers, 1998]{R:Chambers:1998} Chambers, J.M. (1998). \newblock {\em Programming with Data}. \newblock Springer, New York. \bibitem[Kirkwood et al., 2006]{kirkwood06} Kirkwood, R., Lynch, M., Gales, N., Dann, P., and Sumner, M. (2006). \newblock At-sea movements and habitat use of adult male {A}ustralian fur seals ({A}rctocephalus pusillus doriferus). \newblock {\em Canadian Journal of Zoology}, 84:1781--1788. \bibitem[Page et al., 2006]{page06} Page, B., McKenzie, J., Sumner, M., Coyne, M., and Goldsworthy, S. (2006). \newblock Spatial separation of foraging habitats among {N}ew {Z}ealand fur seals. \newblock {\em Marine Ecology Progress Series}, 323:263--279. \bibitem[Venables and Ripley, 2000]{R:Venables+Ripley:2000} Venables, W. N. and Ripley, B. D. (2000). \newblock {\em {\SS} Programming}. \newblock Springer, New York. \end{thebibliography} \end{document}