Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read ECCO-Darwin field from NASA Earthdata #328

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

seamanticscience
Copy link

@seamanticscience seamanticscience commented Jan 11, 2025

Adding functionality and interface to initialize and force BGC ClimaOcean simulations using ECCO-Darwin fields

@simone-silvestri
Copy link
Collaborator

Nice. I see there is a lot of code duplication, I think we can avoid it by, instead of writing a new metadata type, writing a new version type that contains the additional information we need to wrangle the data.

I would propose therefore to have

struct ECCO4DarwinMonthly{T, D}
    time_step_size :: T
    reference_date :: D
end

so that it is possible to define an alias

const ECCO4DarwinMetadata = ECCOMetadata{<:Any, <:ECCO4DarwinMonthly}

I think it would be nice to have only one Metadata type for all the data we wrangle with specialized version fields for all the different data (I was going that route in #286). It a bit ambitious, but it is still possible with this change

Comment on lines 416 to 424
data = ECCODarwinModelData(
path,
ECCODarwinModelMeta(
replace(path,"data"=>"meta"),
),
ECCODarwinMeshGrid(
metadata.version,
),
)
Copy link
Collaborator

@simone-silvestri simone-silvestri Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you can define a function read_ECCO_data(metadata, path) that dispatches on the version of the metadata and returns

    shortname = short_name(metadata)

    if variable_is_three_dimensional(metadata)
        data = ds[shortname][:, :, :, 1]
        data = reverse(data, dims=3)
    else
        data = ds[shortname][:, :, 1]
    end        

by default and

model_metadata = ECCODarwinModelMeta(replace(path, "data"=>"meta"))
mesh_grid = ECCODarwinMeshGrid(metadata.version)
data = ECCODarwinModelData(path, model_metadata, mesh_grid)

for ECCODarwin* versions. In this way you can reuse the current ECCO_field implementation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that sounds like a plan.

Comment on lines 135 to 137
Base.length(metadata::ECCODarwinMetadata) = length(metadata.dates)
Base.size(data::ECCODarwinMetadata{<:Any, <:Any, <:ECCO270DarwinMonthly}) = (1080, 540, 50, length(data.dates))
Base.size(data::ECCODarwinMetadata{<:Any, <:Any, <:ECCO4DarwinMonthly}) = (360, 180, 50, length(data.dates))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Base.length(metadata::ECCODarwinMetadata) = length(metadata.dates)
Base.size(data::ECCODarwinMetadata{<:Any, <:Any, <:ECCO270DarwinMonthly}) = (1080, 540, 50, length(data.dates))
Base.size(data::ECCODarwinMetadata{<:Any, <:Any, <:ECCO4DarwinMonthly}) = (360, 180, 50, length(data.dates))
Base.length(metadata::ECCOMetadata) = length(metadata.dates)
Base.size(data::ECCOMetadata{<:Any, <:Any, <:ECCO270DarwinMonthly}) = (1080, 540, 50, length(data.dates))
Base.size(data::ECCOMetadata{<:Any, <:Any, <:ECCO4DarwinMonthly}) = (360, 180, 50, length(data.dates))

Read ECCODarwin metadata file and return as a NamedTuple

"""
function ECCODarwinModelMeta(fileName)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this function actually read the data or provide a path to the data?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads a text ("meta") file that contains information about the "data" file - in particular the binary precision of the MITgcm output.

@seamanticscience
Copy link
Author

Thanks for the comments @simone-silvestri !

@@ -2,6 +2,7 @@ module ECCO

export ECCOMetadata, ECCO_field, ECCO_mask, ECCO_immersed_grid, adjusted_ECCO_tracers, initialize!
export ECCO2Monthly, ECCO4Monthly, ECCO2Daily
export ECCODarwinMetadata, ECCO4DarwinMonthly, ECCO270DarwinMonthly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, "ECCO4" refers to ECCO version 4, but certainly "ECCO 270" does not refer to ECCO version 270. Do we have ideas for clearer naming?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me, at least, if LLC270 will be ECCOv5, or if it's a stop-gap for a high res run, that's not CS510 (which is what you're calling ECCO2). Also, FYI there are 5 "releases" of ECCO version 4. ECCODarwin has results on V4r4 and V4r5, but I left that out for now, and just stuck to "ECCO4".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, well the key for the metadata is to define the grid and a way to set the variables. So if the different releases occupy the same pattern, then loading data from different releases can be described through properties of the types rather than the type names; for example ECCO4Monthly(release=5) or whatever.

We can call ECCO2 CS510 btw. As I understand it, "ECCO v2" refers to the run, whereas CS510 is the name of the grid "cubed sphere 510 points" or whatever. So the name "CS510" is less specific than "ECCO2"; there are many runs with CS510 that are not ECCO, or not on Earth at all. Similarly LLC270 is the name of the grid, right?

@glwagner
Copy link
Member

Please try to proofread the code carefully. There are some formatting inconsistencies, like spacing missing after commas. Reading the code carefully is useful because you may also catch typos.

The style is a bit inconsistent with the rest of the code. If a function is not a "constructor" (returning a type of the same name as the function), then it should use snake_case rather than TitleCase. Nowhere should we use snakeCase. Finally, for continuing functions and parentheses we generally line text up rather than indent, eg

some_function(a,
              b,
              c)

rather than

some_function(a,
    b,
    c
)

The main point is that the code will have to be proofread carefully anyways, and it's a good idea to write with good style and punctuation, same as writing a paper.

@seamanticscience seamanticscience changed the title Include interface for ClimaOceanBiogeochemistry.jl Read ECCO-Darwin field from NASA Earthdata Jan 17, 2025
@seamanticscience
Copy link
Author

Test failing on something external:

[ Info: Testing that we can download ECCO data...
Availability of ECCO data: Error During Test at /home/runner/work/ClimaOcean.jl/ClimaOcean.jl/test/test_downloading.jl:12
  Got exception outside of a @test
  RequestError: HTTP/1.1 401 Unauthorized while requesting https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/interp_monthly//SIarea/1993/SIarea_1993_01.nc

@simone-silvestri
Copy link
Collaborator

simone-silvestri commented Jan 17, 2025

I am not sure that you have an ECCO_USERNAME and ECCO_PASSWORD set up as environmental variables in your fork of the repository, which would lead to an error like this. Can you try to add them as secrets and see if the tests pass?

@seamanticscience
Copy link
Author

Ah, I see, done! I had no idea that you could do that!

@seamanticscience
Copy link
Author

ECCO-Darwin DIC concentration at the surface, 1993-01-01 00:00:00
ECCO-Darwin DIC

@@ -200,7 +200,7 @@ function ECCO_field(metadata::ECCOMetadata;
# ECCO4 data is on a -180, 180 longitude grid as opposed to ECCO2 data that
# is on a 0, 360 longitude grid. To make the data consistent, we shift ECCO4
# data by 180 degrees in longitude
if metadata.version isa ECCO4Monthly
if metadata.version isa ECCO4Monthly || metadata.version isa ECCO4DarwinMonthly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be slightly cleaner to use abstract type or type union to express this. This pattern does not generalize well; this makes it more work to change how we express version for example.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in if metadata.version isa Union{ECCO4Monthly, ECCO4DarwinMonthly}?

@seamanticscience
Copy link
Author

Quick note: spotted a few issues that'll need fixing (downloading hFacC by MeshArrays doesn't automatically trigger like I had thought; ECCOFieldTimeSeries for ECCO4DarwinMonthly has some weird behaviour that I will document.)

@seamanticscience
Copy link
Author

Also, I added my ECCO details as secrets, but the test still fails, so not sure if I did it right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants