1 Introduction

1.1 Spatial data and geospatial analyses in R

CRAN Spatial taskview: https://cran.r-project.org/web/views/Spatial.html

The bulk of the geospatial/GISci analysis tools are contained in the following packages:

  • maptools reading and writing spatial data, particularly shapefiles
  • sp manipulating data in one of the Spatial Data classes
  • rgdal R “bindings” to GDAL (Geospatial Data Abstraction Layer)
  • rgeos R interface to the GEOS “geometry engine” (overlays, etc.)

The reference book (ASDAR2) is: R.S. Bivand, E. Pebesma, and V. Gómez-Rubio (2013), Applied Spatial Data Analysis with R, 2nd Ed., Springer.

Here’s a .pdf: http://link.springer.com/content/pdf/10.1007%2F978-1-4614-7618-4.pdf (must be on UO Network)

There is also an R package, maps, that includes a world database and methods for plotting with base R graphics.

1.2 Spatial classes

There are several related “classes” of spatial data in R, each consisting of either the specific spatial coordinate or geometry data alone, or the coordinate or geometry data plus an associated data frame:

  • SpatialPoints and SpatialPointsDataFrame
  • SpatialLines and SpatialLinesDataFrame
  • SpatialPolygons and SpatialPolygonsDataFrame
  • SpatialPixels and SpatialPixelsDataFrame
  • SpatialGrid and SpatialGridDataFrame
  • SpatialMultiPoints and SpatialMultiPointsDataFrame

The names of the classes pretty much describe the kind of information they contain. One way to look at the landscape of geospatial data analysis in R is that maptools and rgdal cover reading and writing the spatial data classes, sp handles plotting, conversions, and manipulations (including projections with spTransform()), and rgeos handles geospatial analysis tasks.

Many of the functions in these packages will be exercised below, but these examples are by no means exhaustive.

2 Basic mapping

Here is a very simple example of reading and plotting a couple of shapefiles that illustrates the use of some of the maptools and sp functions:

2.1 Load libraries and read the shape files

Load the sp and rgdal packages:
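A minimal sketch of that step:

library(sp)
library(rgdal)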

When maptools or sp is loaded, they also check for the presence of the rgeos package.

Read two shapefiles, one of Oregon county outlines, the other of climate-station locations:
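A sketch of what the readOGR() call might look like (the path is taken from the output below; the object name orotl_shp is illustrative):

shp_path <- "/Users/bartlein/Projects/ESSD/data/shp_files/OR/"
orotl_shp <- readOGR(dsn = paste(shp_path, "orotl.shp", sep = ""), layer = "orotl")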

## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/bartlein/Projects/ESSD/data/shp_files/OR/orotl.shp", layer: "orotl"
## with 36 features
## It has 1 fields

Check what kind of spatial data class this shapefile is:
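For example (using the illustrative object name from above):

class(orotl_shp)
summary(orotl_shp)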

## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
## Object of class SpatialPolygonsDataFrame
## Coordinates:
##          min        max
## x -124.55840 -116.46944
## y   41.98779   46.23626
## Is projected: NA 
## proj4string : [NA]
## Data attributes:
##         NAME   
##  Baker    : 1  
##  Benton   : 1  
##  Clackamas: 1  
##  Clatsop  : 1  
##  Columbia : 1  
##  Coos     : 1  
##  (Other)  :30

Note that the shapefile is a SpatialPolygonsDataFrame that, in addition to the geometry of the counties (the outlines), also contains data (the county names). Plot the county outline shapefile:
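A minimal plotting sketch (colors are arbitrary):

plot(orotl_shp, col = "gray95", border = "gray40")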

Now get the climate-station data
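A sketch of reading and summarizing the station shapefile (the object name orstations_shp is illustrative):

orstations_shp <- readOGR(dsn = paste(shp_path, "orstations.shp", sep = ""), layer = "orstations")
class(orstations_shp)
summary(orstations_shp)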

## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/bartlein/Projects/ESSD/data/shp_files/OR/orstations.shp", layer: "orstations"
## with 92 features
## It has 12 fields
## Integer64 fields read as strings:  station elevation pjan pjul pann
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
## Object of class SpatialPointsDataFrame
## Coordinates:
##                min      max
## coords.x1 -124.567 -116.967
## coords.x2   42.050   46.150
## Is projected: FALSE 
## proj4string :
## [+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0]
## Number of points: 92
## Data attributes:
##     station      latitude       longitude        elevation       tjan             tjul      
##  350197 : 1   Min.   :42.05   Min.   :-124.6   142    : 2   Min.   :-6.000   Min.   :12.20  
##  350265 : 1   1st Qu.:43.60   1st Qu.:-123.2   18     : 2   1st Qu.:-1.225   1st Qu.:17.30  
##  350304 : 1   Median :44.46   Median :-121.7   2      : 2   Median : 0.750   Median :19.20  
##  350328 : 1   Mean   :44.32   Mean   :-121.3   24     : 2   Mean   : 1.417   Mean   :19.01  
##  350412 : 1   3rd Qu.:45.28   3rd Qu.:-119.6   3      : 2   3rd Qu.: 4.150   3rd Qu.:20.20  
##  350417 : 1   Max.   :46.15   Max.   :-117.0   6      : 2   Max.   : 8.400   Max.   :26.40  
##  (Other):86                                    (Other):80                                   
##       tann             pjan         pjul         pann         Sta    
##  Min.   : 3.200   36     : 4   9      :12   289    : 3   MLB    : 2  
##  1st Qu.: 8.700   41     : 4   8      :11   256    : 2   ANT    : 1  
##  Median :10.400   178    : 3   6      :10   267    : 2   ARL    : 1  
##  Mean   : 9.885   40     : 3   12     : 8   458    : 2   ASH    : 1  
##  3rd Qu.:11.200   58     : 3   5      : 8   1025   : 1   AST    : 1  
##  Max.   :12.600   151    : 2   7      : 8   1031   : 1   BAN    : 1  
##                   (Other):73   (Other):35   (Other):81   (Other):85  
##                               Name   
##  ANTELOPE 1 N USA-OR            : 1  
##  ARLINGTON USA-OR               : 1  
##  ASHLAND 1 N USA-OR             : 1  
##  ASTORIA WSO           //R USA-O: 1  
##  BAKER FAA AIRPORT USA-OR       : 1  
##  BAKER KBKR USA-OR              : 1  
##  (Other)                        :86

Note that this shape file also contains some climate data for each station, as well as the station names and locations.

2.2 Plot both shapefiles

Plot the station data:
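A sketch of overplotting the stations on the county outlines (symbol and color choices are arbitrary):

plot(orotl_shp, col = "gray95", border = "gray40")
points(orstations_shp, pch = 16, col = "blue")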

More examples of simple maps can be found in the materials for Lecture 5 in GEOG 4/595: http://geog.uoregon.edu/bartlein/courses/geog495/lectures/lec05.html


3 Extract data from a raster

This example demonstrates how to extract data from a raster for a set of target points contained in a .csv file. In this case, the raster is the Ramankutty and Foley potential natural vegetation data set (https://www.nelson.wisc.edu/sage/data-and-models/global-land-use/index.php), and the target points are the Global Charcoal Database (GCDv3) site locations (http://www.gpwg.paleofire.org).
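A sketch of the package loading that produces the messages below (rasterVis pulls in lattice, latticeExtra, and RColorBrewer):

library(raster)
library(rasterVis)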

## Loading required package: lattice
## Loading required package: latticeExtra
## Loading required package: RColorBrewer

3.1 Read the data sets – source and target

Use the raster package to read the vegetation data:
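A sketch of reading the netCDF file as a RasterLayer (the path and variable name come from the output below; the object name vegtype is illustrative):

vegtype_path <- "/Users/bartlein/Projects/ESSD/data/nc_files/"
vegtype <- raster(paste(vegtype_path, "sage_veg30.nc", sep = ""), varname = "vegtype")
vegtype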

## class       : RasterLayer 
## dimensions  : 360, 720, 259200  (nrow, ncol, ncell)
## resolution  : 0.5, 0.5  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## data source : /Users/bartlein/Projects/ESSD/data/nc_files/sage_veg30.nc 
## names       : vegetation.type 
## zvar        : vegtype

Plot the vegetation data using a rasterVis levelplot:
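For example (the palette is arbitrary):

levelplot(vegtype, margin = FALSE, col.regions = rev(terrain.colors(255)))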

Read the charcoal data locations:
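A sketch of the read (the path and filename here are placeholders, not the actual file used; the object name gcdv3 is illustrative):

csv_path <- "/Users/bartlein/Projects/ESSD/data/csv_files/"
gcdv3 <- read.csv(paste(csv_path, "gcdv3_sites.csv", sep = ""))
str(gcdv3)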

## 'data.frame':    703 obs. of  6 variables:
##  $ Site_ID     : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Lat         : num  44.7 44.9 45.7 45.9 46.3 ...
##  $ Lon         : num  -111 -110 -115 -114 -115 ...
##  $ Elev        : num  2530 1884 2250 2300 1770 ...
##  $ depo_context: Factor w/ 12 levels "BOSE","BUOR",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ Site_Name   : Factor w/ 703 levels "17940 core","7-M",..: 130 605 75 33 219 530 179 62 55 97 ...

In order to use the extract() function from raster, the target points must be turned into a SpatialPoints object.
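A minimal sketch, using the column names from the str() listing above:

targ_pts <- SpatialPoints(cbind(gcdv3$Lon, gcdv3$Lat))
class(targ_pts)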

## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"

Plot the target points on top of the source map:

3.2 Extract data at target points

Now extract the data for the target points:
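A sketch of the extraction (gcdv3_vegtype is an illustrative name):

gcdv3_vegtype <- extract(vegtype, targ_pts, method = "simple")
class(gcdv3_vegtype)
head(gcdv3_vegtype)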

## [1] "numeric"
## [1] 6 6 6 6 6 6

Make a dataframe of the extracted data that could be saved as a .csv file, and plot it:
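For example:

pts <- data.frame(Lon = gcdv3$Lon, Lat = gcdv3$Lat, vegtype = gcdv3_vegtype)
head(pts, 10)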

##          Lon     Lat vegtype
## 1  -110.6161 44.6628       6
## 2  -110.3467 44.9183       6
## 3  -114.9867 45.7044       6
## 4  -114.2619 45.8919       6
## 5  -114.6503 46.3206       6
## 6  -113.4403 45.8406       6
## 7  -114.3592 48.1658       6
## 8  -123.4600 42.0200       4
## 9  -122.5575 41.3467       6
## 10 -122.4954 41.2075       4

Plot the extracted data at the target points on top of the source points. If the extraction is successful, the target-point colors should disappear against the background.


3.3 A second example – explicit cell selection

Here’s a second example of extracting values from an array, by referencing the specific cell or array element that a target point falls in. In this example, a netCDF file of bioclimatic variables is read using ncdf4 and the values of mtco (mean temperature of the coldest month) are extracted.
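A sketch of opening the file and reading the longitudes, latitudes, and mtco values with ncdf4 (object names are illustrative):

library(ncdf4)
nc_path <- "/Users/bartlein/Projects/ESSD/data/nc_files/"
ncin <- nc_open(paste(nc_path, "cru10min30_bio.nc", sep = ""))
print(ncin)
lon <- ncvar_get(ncin, "lon"); head(lon)
lat <- ncvar_get(ncin, "lat"); head(lat)
mtco <- ncvar_get(ncin, "mtco")
print(dim(mtco))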

## File /Users/bartlein/Projects/ESSD/data/nc_files/cru10min30_bio.nc (NC_FORMAT_CLASSIC):
## 
##      19 variables (excluding dimension variables):
##         float climatology_bounds[nv,time]   
##         float gdd0[lon,lat]   
##             name: gdd0
##             long_name: Growing-degree days, 0C base
##             units: degdays
##             _FillValue: -99
##         float gdd5[lon,lat]   
##             name: gdd5
##             long_name: Growing-degree days, 5C base
##             units: degdays
##             _FillValue: -99
##         float chill[lon,lat]   
##             name: chill
##             long_name: Number of days below 5C
##             units: days
##             _FillValue: -99
##         float mtco[lon,lat]   
##             name: mtco
##             long_name: Mean temperature coldest month
##             units: C
##             _FillValue: -99
##         float mtwa[lon,lat]   
##             name: mtwa
##             long_name: Mean temperature warmest month
##             units: C
##             _FillValue: -99
##         float mipt[lon,lat]   
##             name: mipt
##             long_name: Priestley-Taylor (alpha) parameter (AE/PE)
##             units: ratio
##             _FillValue: -99
##         float aaetpt[lon,lat]   
##             name: aaetpt
##             long_name: Actual evapotranspiration (AE)
##             units: mm
##             _FillValue: -99
##         float apetpt[lon,lat]   
##             name: apetpt
##             long_name: Potential evapotranspiration (PE)
##             units: mm
##             _FillValue: -99
##         float miptever[lon,lat]   
##             name: miptever
##             long_name: Alpha -- evergreen assimilation period
##             units: ratio
##             _FillValue: -99
##         float aaetever[lon,lat]   
##             name: aaetever
##             long_name: AE -- evergreen assimilation period
##             units: mm
##             _FillValue: -99
##         float apetever[lon,lat]   
##             name: apetever
##             long_name: PE -- evergreen assimilation period
##             units: mm
##             _FillValue: -99
##         float miptdecd[lon,lat]   
##             name: miptdecd
##             long_name: Alpha -- deciduous ssimilation period
##             units: ratio
##             _FillValue: -99
##         float aaetdecd[lon,lat]   
##             name: aaetdecd
##             long_name: AE -- deciduous assimilation period
##             units: mm
##             _FillValue: -99
##         float apetdecd[lon,lat]   
##             name: apetdecd
##             long_name: PE -- deciduous assimilation period
##             units: mm
##             _FillValue: -99
##         float totsnpt[lon,lat]   
##             name: totsnpt
##             long_name: Total snow (from surface water balance)
##             units: mm
##             _FillValue: -99
##         float apr[lon,lat]   
##             name: apr
##             long_name: Annual precipitation (mm)
##             units: mm
##             _FillValue: -99
##         float pjanpann[lon,lat]   
##             name: pjanpann
##             long_name: January/Annual precipitation ratio
##             units: ratio
##             _FillValue: -99
##         float pjulpann[lon,lat]   
##             name: pjulpann
##             long_name: July/Annual precipitation ratio
##             units: ratio
##             _FillValue: -99
## 
##      4 dimensions:
##         lon  Size:720
##             standard_name: longitude
##             long_name: longitude
##             units: degrees_east
##             axis: X
##         lat  Size:360
##             standard_name: latitude
##             long_name: latitude
##             units: degrees_north
##             axis: Y
##         time  Size:12
##             standard_name: time
##             long_name: time
##             units: days since 1900-01-01 00:00:00.0 -0:00
##             axis: T
##             calendar: standard
##             climatology: climatology_bounds
##         nv  Size:2
## 
##     3 global attributes:
##         source: calculated using Sarah Shafer's version of bioclim.f
##         data: e:\Projects\cru\work\cl_2.0_regrid\regions\cru10min30\cru10min30_bio.dat
##         history: P.J. Bartlein, 6 Dec 2007 from 2 Dec 2006 data
## [1] -179.75 -179.25 -178.75 -178.25 -177.75 -177.25
## [1] -89.75 -89.25 -88.75 -88.25 -87.75 -87.25
## [1] 720 360
## [1] 720 360

Close the netCDF file using the nc_close() function.

Plot the control data.

Now extract the data for the target points:

Get the indices (j’s and k’s) of the grid cell that each target point falls in. For each target point, a function defined and executed within sapply() figures out which column (j) and row (k) of the control-data array the point falls in; this code is basically the same as that used in reshaping a “short” data frame into an array. Together, the j’s and k’s identify the control-data grid cell that each individual target point falls in.
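One way to do this, as a sketch (assuming the control longitudes and latitudes are in lon and lat, and the charcoal-site coordinates are in gcdv3, as above):

j <- sapply(gcdv3$Lon, function(x) which.min(abs(lon - x)))
k <- sapply(gcdv3$Lat, function(y) which.min(abs(lat - y)))
head(cbind(j, k)); tail(cbind(j, k))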

##        j   k
## [1,] 139 270
## [2,] 140 270
## [3,] 131 272
## [4,] 132 272
## [5,] 131 273
## [6,] 134 272
##          j   k
## [698,] 219 218
## [699,] 219 218
## [700,] 115 271
## [701,] 114 269
## [702,] 115 269
## [703,] 123 277

Get the data for each j, k combination. The way to do this is to convert the mtco array to a vector, and then calculate an index jk for each target value:
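A sketch of that calculation (the index arithmetic relies on the array being stored with longitude varying fastest, i.e. R’s column-major order for a [lon, lat] array):

nlon <- dim(mtco)[1]
mtco_vec <- as.vector(mtco)
jk <- (k - 1) * nlon + j
gcdv3_mtco <- mtco_vec[jk]
head(cbind(j, k, jk, gcdv3_mtco, lon[j], lat[k]))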

##        j   k     jk gcdv3_mtco              
## [1,] 139 270 193819      -11.4 -110.75 44.75
## [2,] 140 270 193820      -10.7 -110.25 44.75
## [3,] 131 272 195251       -6.3 -114.75 45.75
## [4,] 132 272 195252       -6.4 -114.25 45.75
## [5,] 131 273 195971       -5.8 -114.75 46.25
## [6,] 134 272 195254       -6.6 -113.25 45.75
##          Lon      Lat  mtco
## 1  -110.6161 44.66280 -11.4
## 2  -110.3467 44.91830 -10.7
## 3  -114.9867 45.70440  -6.3
## 4  -114.2619 45.89190  -6.4
## 5  -114.6503 46.32060  -5.8
## 6  -113.4403 45.84060  -6.6
## 7  -114.3592 48.16580  -6.3
## 8  -123.4600 42.02000   3.3
## 9  -122.5575 41.34670  -0.6
## 10 -122.4954 41.20750   0.5
## 11 -122.5778 41.38360  -0.6
## 12 -122.5092 41.19130  -0.6
## 13 -110.1700 44.28000 -11.2
## 14 -123.5792 45.82420   4.4
## 15 -123.9067 46.10060   4.9
## 16 -119.9767 33.95639 -99.0
## 17 -127.0500 65.21667 -27.9
## 18 -108.1500 37.41667  -4.6
## 19 -105.5000 37.55000  -6.9
## 20 -112.2167 36.71667  -2.7

Plot the extracted values of mtco. To do this, the colors for plotting the different levels of mtco are generated from an RColorBrewer palette, and augmented by gray ("#AAAAAA") to handle missing values (e.g. from marine charcoal records, or from sites “off” the cru10min30 grid).
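A sketch of the palette construction and plot (the breakpoints are illustrative):

library(RColorBrewer)
cutpts <- seq(-50, 30, by = 10)
plotclr <- c("#AAAAAA", rev(brewer.pal(8, "RdBu")))      # gray first, for missing values
mtco_class <- cut(gcdv3_mtco, breaks = cutpts, include.lowest = TRUE)
col_idx <- as.integer(mtco_class) + 1
col_idx[is.na(col_idx)] <- 1                             # -99 (missing) values plot as gray
plot(gcdv3$Lon, gcdv3$Lat, pch = 16, cex = 0.6, col = plotclr[col_idx])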


4 Clipping/Trimming/Point-in-polygon analyses

A common problem that arises in dealing with spatial data is the “clipping” or “trimming” problem, in which one wants to know whether one or more points lie within the outline of a particular polygon, or, in the case of multiple polygons, which polygon an individual point lies in. More formally, this is known as the “point-in-polygon” problem. Here the process of clipping the na10km_v2 climate grid to a tree-species shapefile, in particular that for Picea mariana (black spruce), is demonstrated. What we want is a list of the climate data set points that lie within the range of P. mariana.

4.1 Read the polygon and target-point files

## rgeos version: 0.4-3, (SVN revision 595)
##  GEOS runtime version: 3.6.1-CAPI-1.10.1 
##  Linking to sp version: 1.3-1 
##  Polygon checking: TRUE

Read the shapefile for Picea mariana
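A sketch of the read (the path comes from the output below; the object name picea_shp is illustrative):

shp_path <- "/Users/bartlein/Projects/ESSD/data/shp_files/USGS_pp1650/"
picea_shp <- readOGR(dsn = paste(shp_path, "picemari.shp", sep = ""), layer = "picemari")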

## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/bartlein/Projects/ESSD/data/shp_files/USGS_pp1650/picemari.shp", layer: "picemari"
## with 1585 features
## It has 5 fields
## Integer64 fields read as strings:  PICEMARI_ PICEMARI_I

Add a coordinate reference system to the shapefile, and plot it:
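A minimal sketch; the particular CRS string here is an assumption (unprojected longitude/latitude on WGS84):

proj4string(picea_shp) <- CRS("+proj=longlat +datum=WGS84")
class(picea_shp)
plot(picea_shp, col = "lightgreen", border = NA)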

## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

Read the na10km_v2 points as a .csv file (for illustration, in practice it would be more efficient to read it as a netCDF file).
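A sketch of the read (the path and filename are placeholders; the object name na10km_v2 is illustrative):

csv_path <- "/Users/bartlein/Projects/ESSD/data/csv_files/"
na10km_v2 <- read.csv(paste(csv_path, "na10km_v2.csv", sep = ""))
str(na10km_v2)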

## 'data.frame':    277910 obs. of  7 variables:
##  $ x     : num  2690000 2700000 2710000 2720000 2730000 2740000 2750000 2760000 2770000 2780000 ...
##  $ y     : num  -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 ...
##  $ lon   : num  -77.3 -77.2 -77.1 -77 -77 ...
##  $ lat   : num  5.12 5.1 5.08 5.05 5.03 ...
##  $ mask  : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ etopo1: int  67 61 67 26 43 56 30 57 134 652 ...
##  $ srtm30: int  78 39 52 22 79 49 32 50 110 318 ...

For convenience in display (because the na10km_v2 grid includes Europe and wraps around the dateline), reduce it to only the points west of 45W:
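For example:

na10km_v2 <- na10km_v2[na10km_v2$lon <= -45.0, ]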

Make a SpatialPoints object of the na10km_v2 data-point locations, and plot it:
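A minimal sketch (the CRS string is an assumption):

na10km_v2_pts <- SpatialPoints(cbind(na10km_v2$lon, na10km_v2$lat),
                               proj4string = CRS("+proj=longlat +datum=WGS84"))
class(na10km_v2_pts)
plot(na10km_v2_pts, pch = 16, cex = 0.2, col = "gray50")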

## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"

4.2 Overlay the points onto the polygons

Now overlay the points onto the polygons. Note that the input shapefile was a SpatialPolygonsDataFrame object, so convert it to a simpler SpatialPolygons object first. The over() function (from the sp package) takes a little while to run.
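A sketch of the conversion and overlay (object names are illustrative):

picea_poly <- as(picea_shp, "SpatialPolygons")
class(picea_poly)
na10km_v2_picea <- over(na10km_v2_pts, picea_poly)
class(na10km_v2_picea)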

## [1] "SpatialPolygons"
## attr(,"package")
## [1] "sp"
## [1] "integer"

Make a dataframe from the “trimmed” points and plot it on top of the na10km_v2 points:
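For example (points that fall outside all polygons get NA from over(), and are dropped):

picea_pts <- na10km_v2[!is.na(na10km_v2_picea), ]
plot(na10km_v2_pts, pch = 16, cex = 0.2, col = "gray80")
points(picea_pts$lon, picea_pts$lat, pch = 16, cex = 0.2, col = "darkgreen")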


5 Gridding or rasterizing point data

Another common task is to grid or rasterize a set of point data, creating a gridded data set of counts, averages, minima, maxima, etc. This can be illustrated using the FPA-FOD daily fire-start data set: Spatial wildfire occurrence data for the United States, 1992-2013/Fire Program Analysis Fire-Occurrence Database [FPA_FOD_20150323] (3rd Edition) (Short, K.C., 2014, Earth Syst. Sci. Data, 6, 1-27) – http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/. The idea here is to calculate the number and average area of the fires that occurred in the 0.5-degree grid cells that cover the conterminous U.S.

5.1 Read the data sets, and create empty rasters

Read the fire-start data:
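A sketch of the read (the path and filename are placeholders; the object name fpafod is illustrative):

csv_path <- "/Users/bartlein/Projects/ESSD/data/csv_files/"
fpafod <- read.csv(paste(csv_path, "fpafod_1992-2013.csv", sep = ""))
str(fpafod)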

## 'data.frame':    1727476 obs. of  14 variables:
##  $ datasource    : Factor w/ 1 level "fpafod": 1 1 1 1 1 1 1 1 1 1 ...
##  $ sourceid      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ latitude      : num  40 38.9 39 38.6 38.6 ...
##  $ longitude     : num  -121 -120 -121 -120 -120 ...
##  $ year          : int  2005 2004 2004 2004 2004 2004 2004 2005 2005 2004 ...
##  $ mon           : int  2 5 5 6 6 6 7 3 3 7 ...
##  $ day           : int  2 12 31 28 28 30 1 8 15 1 ...
##  $ daynum        : int  33 133 152 180 180 182 183 67 74 183 ...
##  $ area_ha       : num  0.0405 0.1012 0.0405 0.0405 0.0405 ...
##  $ cause_original: int  9 1 5 1 1 1 1 5 5 1 ...
##  $ cause1        : int  2 1 2 1 1 1 1 2 2 1 ...
##  $ cause2        : int  8 1 5 1 1 1 1 5 5 1 ...
##  $ stateprov     : Factor w/ 52 levels "AK","AL","AR",..: 5 5 5 5 5 5 5 5 5 5 ...
##  $ agency        : Factor w/ 11 levels "BIA","BLM","BOR",..: 6 6 6 6 6 6 6 6 6 6 ...

Convert the fire-start data to a SpatialPointsDataFrame
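A minimal sketch, using the coordinate column names from the str() listing above:

fpafod_pts <- fpafod
coordinates(fpafod_pts) <- c("longitude", "latitude")
proj4string(fpafod_pts) <- CRS("+proj=longlat +datum=WGS84")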

Create a couple of empty rasters to hold the rasterized data:
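A sketch of creating the empty rasters, using the extent and resolution shown in the output below:

cell_size <- 0.5
us_fire_counts <- raster(xmn = -128, xmx = -65, ymn = 25.5, ymx = 50.5,
                         resolution = cell_size, crs = "+proj=longlat +datum=WGS84")
us_fire_area <- us_fire_counts
us_fire_counts
us_fire_area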

## class       : RasterLayer 
## dimensions  : 50, 126, 6300  (nrow, ncol, ncell)
## resolution  : 0.5, 0.5  (x, y)
## extent      : -128, -65, 25.5, 50.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## class       : RasterLayer 
## dimensions  : 50, 126, 6300  (nrow, ncol, ncell)
## resolution  : 0.5, 0.5  (x, y)
## extent      : -128, -65, 25.5, 50.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0

5.2 Rasterize the data

Now rasterize the data, first as counts (number of fires in each grid cell), and then as the average size of the fires in each grid cell. Also plot the data (both on a log10 scale):
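A sketch of the general pattern (the exact calls that produced the RasterBrick in the output below may have differed, e.g. by rasterizing the whole SpatialPointsDataFrame):

us_fire_counts <- rasterize(coordinates(fpafod_pts), us_fire_counts, fun = "count")
us_fire_counts
plot(log10(us_fire_counts))

us_fire_area <- rasterize(fpafod_pts, us_fire_area, field = "area_ha", fun = mean)
us_fire_area
plot(log10(us_fire_area))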

## class       : RasterLayer 
## dimensions  : 50, 126, 6300  (nrow, ncol, ncell)
## resolution  : 0.5, 0.5  (x, y)
## extent      : -128, -65, 25.5, 50.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## data source : in memory
## names       : layer 
## values      : 1, 8755  (min, max)

## class       : RasterBrick 
## dimensions  : 50, 126, 6300, 2  (nrow, ncol, ncell, nlayers)
## resolution  : 0.5, 0.5  (x, y)
## extent      : -128, -65, 25.5, 50.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## data source : in memory
## names       :          ID,     area_ha 
## min values  : 1.29922e+05, 4.04686e-03 
## max values  :  1677519.00,    18210.87

5.3 Write the rasterized data out as a netCDF file

Write the two rasterized data sets out as variables in a netCDF data set. Create some variables, and replace the R NAs with netCDF fill values:
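A sketch of that step; note that the rows of a raster run from north to south, so the resulting [lon, lat] arrays may need their columns flipped so that latitudes increase from south to north:

fillvalue <- 1e32
counts_array <- t(as.matrix(us_fire_counts))
area_array <- t(as.matrix(us_fire_area))
dim(counts_array)
counts_array[is.na(counts_array)] <- fillvalue
area_array[is.na(area_array)] <- fillvalue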

## [1] 126  50
## [1] 126  50
## [1] 126  50

Write out a netCDF file:
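A sketch of the ncdf4 calls (the variable names, units, and long names follow the listing below; the output path is simplified):

library(ncdf4)
lons <- seq(-128 + cell_size / 2, -65 - cell_size / 2, by = cell_size)   # cell centers
lats <- seq(25.5 + cell_size / 2, 50.5 - cell_size / 2, by = cell_size)
londim <- ncdim_def("lon", "degrees_east", as.double(lons))
latdim <- ncdim_def("lat", "degrees_north", as.double(lats))
counts_def <- ncvar_def("fpafod_counts", "1", list(londim, latdim), fillvalue,
                        longname = "Number of fires, 1992-2013", prec = "float")
area_def <- ncvar_def("fpafod_mean_area", "ha", list(londim, latdim), fillvalue,
                      longname = "Average Fire Size, 1992-2013", prec = "float")
ncout <- nc_create("us_fires.nc", list(counts_def, area_def), force_v4 = TRUE)
ncvar_put(ncout, counts_def, counts_array)
ncvar_put(ncout, area_def, area_array)
ncatt_put(ncout, 0, "title", "FPA-FOD Fires")
ncatt_put(ncout, 0, "Conventions", "CF-1.6")
nc_close(ncout)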

## File /Users/bartlein/Projects/ESSD/data/nc_files/us_fires.nc (NC_FORMAT_NETCDF4):
## 
##      2 variables (excluding dimension variables):
##         float fpafod_counts[lon,lat]   (Contiguous storage)  
##             units: 1
##             _FillValue: 1.00000003318135e+32
##             long_name: Number of fires, 1992-2013
##         float fpafod_mean_area[lon,lat]   (Contiguous storage)  
##             units: ha
##             _FillValue: 1.00000003318135e+32
##             long_name: Average Fire Size, 1992-2013
## 
##      2 dimensions:
##         lon  Size:126
##             units: degrees_east
##             long_name: lon
##             axis: X
##         lat  Size:50
##             units: degrees_north
##             long_name: lat
##             axis: Y
## 
##     6 global attributes:
##         title: FPA-FOD Fires
##         institution: USFS
##         source: http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/
##         references: Short, K.C., 2014, Earth Syst. Sci. Data, 6, 1-27
##         history: P.J. Bartlein, Mon May 20 22:46:40 2019
##         Conventions: CF-1.6


6 Interpolating/regridding

Another common task involves transferring values from one gridded data set to another. When the grids are identical, this is trivial, but when the “target” grid differs from the source or “control” grid, interpolation is required (as it also is when the target points are irregularly distributed). The most widely used method for interpolation within a control grid is bilinear interpolation, which involves finding the control grid points that surround a particular target point, and then simultaneously (linearly) interpolating in the x- and y-directions. The method is implemented in the raster package, and a relatively fast version is implemented in the fields package.

The example here uses a lower-resolution version of the ETOPO1 global DEM as the source file, and the locations of the (land-only) points in the na10km_v2 grid as the targets. These are read in here from a .csv file, but they could also have come from a netCDF file.

6.1 Open the “control” netCDF and target .csv files

Load the appropriate packages.
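For example:

library(ncdf4)
library(fields)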

Read the etopo1 netCDF file. This particular file is one in which the original 30-sec data have been aggregated (by averaging) to six minutes, or one-tenth of a degree (to speed up the execution of the examples). In practice, one would work with the original higher-resolution data. Do some setup, open the file, and list its contents.
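A sketch of the setup (the path comes from the listing below; the object name ncin is illustrative):

nc_path <- "/Users/bartlein/Projects/ESSD/data/nc_files/"
ncin <- nc_open(paste(nc_path, "etopo1_ig_06min.nc", sep = ""))
print(ncin)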

## File /Users/bartlein/Projects/ESSD/data/nc_files/etopo1_ig_06min.nc (NC_FORMAT_CLASSIC):
## 
##      1 variables (excluding dimension variables):
##         short elev[lon,lat]   
##             long_name: elevation
##             units: m
##             valid_min: -10360
##             valid_max: 6365
##             scale_factor: 1
##             add_offset: 0
##             _FillValue: -32768
##             source: z in etopo1_ig.nc -- averaged
## 
##      2 dimensions:
##         lon  Size:3600
##             standard_name: longitude
##             long_name: longitude
##             units: degrees_east
##             axis: X
##         lat  Size:1800
##             standard_name: latitude
##             long_name: latitude
##             units: degrees_north
##             axis: Y
## 
##     7 global attributes:
##         title: 1-Minute Gridded Global Relief Data (ETOPO1)
##         data: grid-registered data -- ETOPO1_Ice_g.grd
##         institution: http://www.ngdc.noaa.gov/ngdc.html
##         source: http://www.ngdc.noaa.gov/mgg/global/global.html
##         history: This file created by P.J. Bartlein, 28 Nov 2008
##         comment: average over 7x7 1-min cells
##         Conventions: CF-1.0

Get the “control” longitudes and latitudes.
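For example:

lon <- ncvar_get(ncin, "lon"); head(lon)
lat <- ncvar_get(ncin, "lat"); head(lat)
print(c(length(lon), length(lat)))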

## [1] -179.95 -179.85 -179.75 -179.65 -179.55 -179.45
## [1] -89.95 -89.85 -89.75 -89.65 -89.55 -89.45
## [1] 3600 1800

Now read the elevations, and various attributes:
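A sketch (etopo1 is an illustrative object name):

etopo1 <- ncvar_get(ncin, "elev")
dim(etopo1)
elev_units <- ncatt_get(ncin, "elev", "units")
fillvalue <- ncatt_get(ncin, "elev", "_FillValue")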

## [1] 3600 1800

Close the netCDF file using the nc_close() function.

Produce a quick map to check that the data make sense. (Always do this!)

Open and read the .csv file containing the “target” points.
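A sketch of the read (the path and filename are placeholders; the object name targ is illustrative):

csv_path <- "/Users/bartlein/Projects/ESSD/data/csv_files/"
targ <- read.csv(paste(csv_path, "na10km_v2_pts.csv", sep = ""))
str(targ)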

## 'data.frame':    277910 obs. of  5 variables:
##  $ x   : num  2690000 2700000 2710000 2720000 2730000 2740000 2750000 2760000 2770000 2780000 ...
##  $ y   : num  -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 -4510000 ...
##  $ lon : num  -77.3 -77.2 -77.1 -77 -77 ...
##  $ lat : num  5.12 5.1 5.08 5.05 5.03 ...
##  $ mask: int  1 1 1 1 1 1 1 1 1 1 ...

Get the number of target points:

## [1] 277910

6.2 Interpolation

Set up to do the interpolation. Generate x’s and y’s for the target grid. In practice, these might be read in from a netCDF file. Here we need them to define an array that will hold the interpolated values.
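A sketch; here the target-grid x’s and y’s are generated from the extent of the target points and the 10-km grid increment, which is an assumption (in practice they would be read from a netCDF file):

grid_x <- seq(min(targ$x), max(targ$x), by = 10000)
grid_y <- seq(min(targ$y), max(targ$y), by = 10000)
interp_mat <- matrix(NA, nrow = length(grid_x), ncol = length(grid_y))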

Get the array subscripts for each point. This is similar to the work done in reshaping a “short” data frame to an array, as in Week 4.
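One way to get the subscripts, as a sketch:

j <- sapply(targ$x, function(x) which.min(abs(grid_x - x)))
k <- sapply(targ$y, function(y) which.min(abs(grid_y - y)))
head(cbind(j, k, targ$lon, targ$lat))
tail(cbind(j, k, targ$lon, targ$lat))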

##        j k                   
## [1,] 847 1 -77.29202 5.124395
## [2,] 848 1 -77.20858 5.099891
## [3,] 849 1 -77.12515 5.075297
## [4,] 850 1 -77.04173 5.050615
## [5,] 851 1 -76.95832 5.025843
## [6,] 852 1 -76.87492 5.000983
##              j   k                  
## [277905,] 1073 900 4.212445 47.09974
## [277906,] 1074 900 4.238897 47.00791
## [277907,] 1075 900 4.265386 46.91605
## [277908,] 1076 900 4.291913 46.82415
## [277909,] 1077 900 4.318477 46.73222
## [277910,] 1078 900 4.345078 46.64025

Now do bilinear interpolation using the fields package:
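A sketch using fields::interp.surface(), with the etopo1 longitudes, latitudes, and elevations as the control grid and the na10km_v2 longitudes and latitudes as the target locations:

control_grid <- list(x = lon, y = lat, z = etopo1)
targ_elev <- interp.surface(control_grid, cbind(targ$lon, targ$lat))
interp_mat[cbind(j, k)] <- targ_elev     # also fill the target array at the j,k cells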

Get a quick map:

At this point, interp_mat could be written out as a variable in a netCDF file (along with dimension and attribute data). Also make a data frame of the interpolated data, which could be written out as a .csv file:
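For example:

interp_df <- data.frame(x = targ$x, y = targ$y, lon = targ$lon, lat = targ$lat,
                        elev = targ_elev)
head(interp_df)
tail(interp_df)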

##         x        y       lon      lat     elev
## 1 2690000 -4510000 -77.29202 5.124395 64.91837
## 2 2700000 -4510000 -77.20858 5.099891 87.74857
## 3 2710000 -4510000 -77.12515 5.075297 69.43258
## 4 2720000 -4510000 -77.04173 5.050615 52.84844
## 5 2730000 -4510000 -76.95832 5.025843 54.49635
## 6 2740000 -4510000 -76.87492 5.000983 54.86496
##              x       y      lon      lat     elev
## 277905 4950000 4480000 4.212445 47.09974 471.2087
## 277906 4960000 4480000 4.238897 47.00791 383.3458
## 277907 4970000 4480000 4.265386 46.91605 381.3808
## 277908 4980000 4480000 4.291913 46.82415 404.7478
## 277909 4990000 4480000 4.318477 46.73222 345.8679
## 277910 5000000 4480000 4.345078 46.64025 313.1289
