-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathRmd_script_4_big_files.Rmd
106 lines (80 loc) · 5.27 KB
/
Rmd_script_4_big_files.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
---
title: "R Notebook"
---
## Bonus 2!
Ok so at the end of the last R notebook I said it was the end of the series. And I've been talking about 3 R notebooks throughout the text in the previous three notebooks.
But then Sarah asked me to make sure all of here data worked in R as well.
Luckily Sarah's data does open and can be processed using all the same techniques as you have already learnt in the previous R notebooks. There are just a couple of extra steps for a couple of the files. Let's take a look!
First, let's load one of the 2D NetCDF files. Luckily these can be be loaded in exactly the same way as the NetCDF files in the other R notebooks. The only extra thing to notice is that the variable holding the missing values is not "_FillValue" as it was in the NetCDF files used in R Notebook 2 and 3. The missing values variable in the NetCDF file here is called "missing_values". I found this out by reading the output of print(ncfile) in the usual way as taught in the previous R notebooks.
```{r}
# Load the ncdf4 library into R. (note, we installed this with install.packages() in Notebook 2)
library("ncdf4")
# Open a file connection to the NetCDF file
ncfile <- nc_open("E:\\Murray\\R_netcdf-master\\R_netcdf-master\\140625.p0055c.x1p5CO2outgas\\140625.p0055c.x1p5CO2outgas\\biogem\\fields_biogem_2d.nc")
# Print the NetCDFs header
print(ncfile)
# Extract variables from the NetCDF file
lat = ncvar_get(ncfile, 'lat')
lon = ncvar_get(ncfile,'lon')
sed_det = ncvar_get(ncfile, 'sed_det')
sed_CaCO3 = ncvar_get(ncfile, 'sed_CaCO3')
carb_conc_CO3 = ncvar_get(ncfile, 'carb_conc_CO3')
grid_topo = ncvar_get(ncfile, 'grid_topo')
# Close the NetCDF file connection
nc_close(ncfile)
```
Let's plot a quick map of the values of the 4 variables we have extracted. Remember our variables are currently a cube of data with dimensions latitude x longitude x time. We can get a single slice of them in the usual way, using the [latitude,longitude,time] square bracket notation. remember leaving one of the spaces empty selects all the data, so [,,1] selects all of the latitude, all of the longitude and just the first time value, to give us a time slice.
```{r}
image.plot(lon, lat, grid_topo, main=grid_topo_name$value)
image.plot(lon, lat, sed_det[,,1], main=sed_det_name$value)
image.plot(lon, lat, sed_CaCO3[,,1], main=sed_CaCO3_name$value)
image.plot(lon, lat, carb_conc_CO3[,,1], main=carb_conc_CO3_name$value)
```
Second of Sarah's files are a bunch of ASCII files. These files are delimited by a space, so can be read with the read.table function as in R Notebook 1. However the header in the file (telling us what each column corresponds to) has multiple words. This trips R over and we get an error message. This is because R looks at the first line to decide how many columns there are, and as R thinks every column has a space between it, then thinks there are too few columns. We can get around this using the "skip=1" argument, which tells R to skip the first row when reading in the data. We then use the "names()" function to manually change the names of the columns.
```{r}
# Read in the data, skipping the header.
sed_det = read.table("E:\\Murray\\R_netcdf-master\\R_netcdf-master\\140625.p0055c.x1p5CO2outgas\\140625.p0055c.x1p5CO2outgas\\biogem\\biogem_series_sed_det.res", skip = 1)
# Set the names manually using the "names()" function
names(sed_det) = c("time", "ave_det_conc")
# Make a super basin plot to look at the data
plot(sed_det)
# Make a nicer plot
plot(sed_det, type="l", lwd=2, xlab="Time (yr)", ylab="Average detritacl concentration (wt%)")
grid(col='black')
```
Finally, let's look at one of Sarah's climate model outputs. This file has dimensions of latitude x longitude x depth x time. So it has one extra dimension. Luckily you handle this extra dimension in exactly the same way - with the square bracket notation. Let's see how:
```{r}
# Open a file connection to the NetCDF file
ncfile <- nc_open("E:\\Murray\\R_netcdf-master\\R_netcdf-master\\140625.p0055c.x1p5CO2outgas\\140625.p0055c.x1p5CO2outgas\\biogem\\fields_biogem_3d.nc")
# Print the NetCDFs header
print(ncfile)
# Extract variables from the NetCDF file
lat=ncvar_get(ncfile, 'lat')
lon=ncvar_get(ncfile,'lon')
misc_pH = ncvar_get(ncfile, 'misc_pH')
# Close the NetCDF file connection
nc_close(ncfile)
```
Let's look at the dimensions of the misc_pH variable
```{r}
dim(misc_pH)
```
Then lets extract a time and depth slice. This is just a slice of all latitudes, all longitudes, at 1 depth, at one time.
```{r}
image.plot(lon, lat, misc_pH[,,1,1], legend.lab = "pH")
```
Finally, let's look at a method for changing the ticks around the plot. This is admitedly not very clear code, but it's an example of one of the few times that R is a bit of a pain!
```{r}
# First make extra space on the right hand side of the plot, for the colourbar
par(mar=c(5,4.5,4,7))
# Then plot the misc_pH matrix using image() not image.plot(). Use the axes=F argument to not plot any axes.
image.plot(lon, lat, misc_pH[,,1,1], axes=F, col=rainbow(256))
# Plot axis 1 (the bottom) with the defaults
axis(1)
# Plot axis 2 with ticks where we want them
axis(2, at=c(-60,-30,0,30,60))
# Use image.plot with the legend.only=T arguent to plot the colourbar only.
image.plot(lon, lat, misc_pH[,,1,1], legend.only=T, legend.lab = 'pH')
# Make the box around the figure.
box()
```