Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing dynamic urban land cover in CESM #1445

Closed
5 tasks done
fang-bowen opened this issue Jul 28, 2021 · 30 comments
Closed
5 tasks done

Implementing dynamic urban land cover in CESM #1445

fang-bowen opened this issue Jul 28, 2021 · 30 comments
Labels
enhancement new capability or improved behavior of existing capability science Enhancement to or bug impacting science
Milestone

Comments

@fang-bowen
Copy link
Contributor

fang-bowen commented Jul 28, 2021

Scientific relevance

CESM has the urban representation that is sufficiently realistic to capture the urban climate dynamics. However, one limitation of the current model is that the urban land cover is static, which prevents CESM from capturing the urbanization effects on local urban climate.

In this project we propose to develop an urban surface dataset for future urbanization scenarios and to implement such dynamic urban land cover in CESM, which can serve as an option for future CLM versions. This enhancement to CESM will also enable researchers to supply their projection of land use change data and study the transient climate impact of LULCC.

Plan of implementation

  1. Preparation of urban land datasets
  • Task 1: Generate a CLM surface dataset from scenario-based urban land use projection

Based on a set of data-driven projections of global urban land expansion under the framework of the shared socioeconomic pathways (SSPs) by Gao and O'Neill (2020) , originally at 10-year intervals for the 21st century, we plan to develop the annual surface dataset for CESM.

This current dataset represents the urban expansion based on SSP5-8.5 scenario, and assumes vegetation and cropland are replaced by urban when urbanization happens. The dataset spans between 2015 and 2100 and is at 0.9 × 1.25° resolution. This dataset is used in our development, and we will create surface datasets based on other SSP scenarios or for historical periods (1850 - present) when we have the data available.

  • Task 2: Modify mksurfdata tool to generate surface datasets from dynamic urban grid datasets.

We plan to enhance the mksurfdata tool so that it can handle a series of urban land use grid datasets and generate the surface datasets for CLM automatically. This will be useful for the community to test any given urbanization or land use change scenario.

  1. Update to model code
  • Task 3: Modify the code for urban energy and water processes

We will modify code in CLM to implement physical processes to address water and energy balance when dynamic urban land use change is adopted, and deal with biophysical process changes in regions that are converted from vegetated to urban.

  • Task 4: Enable CESM to update surface dataset annually

Currently CESM runs the entire simulation based on the initial land cover data. We plan to modify the code so that the model can update land cover each year with a new surface dataset.

  • Task 5: Offline and online tests.

We plan to test the code modification with active land components only, and then implement the change to fully-coupled CESM simulation.

Collaborators

Keer Zhang(@keerzhang1). Advisor: Dr Lei Zhao. We are working closely with Dr Keith Oleson (@olyson).

@billsacks billsacks added tag: enh - new science enhancement new capability or improved behavior of existing capability labels Jul 28, 2021
@billsacks
Copy link
Member

billsacks commented Jul 28, 2021

Thanks a lot for opening this issue and describing your plans.

You may be happy to know that the framework is already in place for task 3 (as described in https://escomp.github.io/ctsm-docs/versions/master/html/tech_note/Transient_Landcover/CLM50_Tech_Note_Transient_Landcover.html#mass-and-energy-conservation), so this will probably only involve relatively minor changes, if any. The relevant module is

https://github.com/escomp/CTSM/blob/master/src/biogeophys/TotalWaterAndHeatMod.F90

You'll just need to confirm that all of the relevant urban water and energy states are captured there.

For task 2, you will want to add the pct urban fields to the existing transient landuse file, which already provides time-varying fields for other variables like pct crop. Then task 4 will involve writing a module similar to https://github.com/escomp/CTSM/blob/master/src/dyn_subgrid/dyncropFileMod.F90 or https://github.com/escomp/CTSM/blob/master/src/dyn_subgrid/dynlakeFileMod.F90 which reads the relevant urban fields off of the file each year. One minor note is that you should probably also call the CheckUrban routine on each year's input:

subroutine CheckUrban(begg, endg, pcturb, caller)

One other question you'll want to think about is how to initialize water and energy state variables when an urban column newly comes into existence. You basically have two options:

  • Anywhere where a given urban landunit might come into existence in this simulation (as determined by a time-constant mask on the landuse timeseries file), run that urban landunit for the entire simulation, even if its current area is 0. Then when its area grows to non-zero, its state will already be spun up. This is probably the easier solution. For performance reasons, it would be important to create a reasonable mask field, so we don't end up running the urban code in places where it will never be needed. You can see variables like PCT_CFT_MAX in mksurfdat.F90 for an example.
  • Only run urban code where the current area is non-zero. Then you would need to extend https://github.com/escomp/ctsm/blob/master/src/dyn_subgrid/dynInitColumnsMod.F90 to initialize any long-spinup-time biogeophysical variables to reasonable values based on their values in other landunits/columns that already exist in this gridcell. For example, belowground water and temperature states for urban road could possibly be taken from the natural vegetated column. For states with short spinup times, you could leave them at their cold start values (i.e., not handle them explicitly in that module). This is probably trickier than the first option, and I'd probably just consider this if performance is a concern with the first option.

I'd be happy to help consult on some of the specific code changes that need to be made for this project.

@olyson
Copy link
Contributor

olyson commented Jul 29, 2021

Thanks Bill, this was very useful information. We are meeting bi-weekly on this project and will contact you when we need help, thanks!

@fang-bowen
Copy link
Contributor Author

Thanks Bill, that's very helpful! We'll update this issue with any progress.

@olyson
Copy link
Contributor

olyson commented Aug 11, 2021

@billsacks , I'm a bit confused by the PCT_CROP_MAX that you mentioned and similar variables that are intended to provide a reasonable mask field. We were going to do something similar for urban. I see those on the transient landcover dataset (created by mksurfdat) but I don't see where they are used in the clm code (unless it is a segmented field that doesn't allow for a search using grep). For lakes, it looks like the variable HASLAKE is read off the dataset if transient lakes are active, but I don't see a variable comparable to crops like, e.g., PCT_LAKE_MAX.

@billsacks
Copy link
Member

@olyson sorry for the confusion. This is something that got half-way implemented and then dropped. It's still intended, but fell off the priority list (first mine, then @slevisconsulting 's): see #229.

@billsacks
Copy link
Member

@olyson and I discussed BGC conservation by email and then in person today.

First, here's a copy of our email conversation so it's all in one place:

From Keith:

I'm implementing dynamic urban and am getting a methane conservation error the first time PCT_URBAN changes. It looks like the dynamic adjustments for conc_ch4_sat_col and conc_ch4_unsat_col are putting some non-zero values in for roof and walls. Then when the total column ch4 is summed over the soil layers (or in this case, urban layers), it is trying to sum over nlevsoi, not nlevurb, and thus dz is 1e36 for layers > nlevurb, which creates some very large total column ch4, and hence creates an imbalance.
So I'm trying to find a way to restrict the dynamic adjustments to just nlevurb for roof and walls.

My reply:

I spent a little time looking at the relevant code tonight and I can see why there are problems here. It looks like I didn't add the special handling of urban that's needed in the dynamic landunit BGC conservation code. First, there is no accounting for the fact that there are different numbers of layers in some urban columns. But perhaps more fundamentally, it looks like I didn't account for the special urban column scaling that's needed in dyn_subgrid/dynColumnStateUpdaterMod.F90, or the fact that it doesn't make any sense to store carbon in urban walls.

The way things are done right now, when a special landunit grows, any carbon and nitrogen that was in the soil in vegetated landunits essentially gets "trapped" in the corresponding column-level layer structure of the special landunit that is taking its place. This only really matters if the special landunit shrinks in the future: by trapping the C & N under the special landunit, this C & N is eventually restored to the vegetated landunit if the special landunit shrinks in the future, assuring conservation of C & N as well as a reasonable starting state for the new vegetated landunit.

I think the right thing to do is a combination of two things:
(1) Making sure we use the correct scalings of urban columns in the update_column_state routine in dynColumnStateUpdaterMod: This routine looks at the areas gained and lost by columns within each grid cell, and I have a feeling it isn't working right for urban columns.
(2) Something like adding a new wrapper routine in this module that forces states to 0 for some of the urban columns. This shouldn't be too hard: there is already a (currently-unused) wrapper routine that facilitates forcing states to 0 for certain landunits, and we just need a similar routine that does something similar for certain urban columns, and then we'd need to change many / all of the current calls to use that wrapper instead of the "no_special_handling" version.
(3) Possibly also changing ch4_totcolch4 to only count down to nlevurb in urban columns. This may be unnecessary with the other changes I'm proposing, but might be a good idea for robustness.

I'm not sure exactly what this would need to look like, but would be happy to put my head together with yours to work it out together. That said, it's possible that we could take some shortcuts, especially if it's pretty safe to assume that we wouldn't have urban areas growing then later shrinking within a given grid cell in a single simulation.

This differs somewhat from your suggestion of restricting the adjustments to just go down to nlevurb. That might be an additional right thing to do, but I'm thinking that we might want to prevent carbon from being "buried" in these special urban columns at all, in which case I think the special level handling would be unimportant. One thing I'm not sure about, though, is: If it is important to conserve C & N across urban transitions, then I think we'd want to store C & N beneath both roads and roofs. Roads should be fine (because they use nlevsoi/nlevgrnd, right?), but I don't know how we can store all of the vegetation's C & N beneath roofs. Conceptually, I guess that would be C & N trapped in the soil under a building; storing that seems to require some structure that mirrors the nlevsoi structure on the roof column. So maybe we just need to live with non-conservation of C & N in urban transitions – and that seems scientifically justifiable if you assume that, under a building, the soil is excavated down to bedrock and so the C & N is lost from that column. (For the (rare?) case where urban areas shrink, we could actually easily specify some fixed non-zero C/N values that are restored in these roof-occupied columns if that's important scientifically.)

Finally, this makes me realize that someone (you?) should also give a critical eye to the biogeophysical dynamic landunit conservation code to make sure that's doing the right thing for urban columns. The relevant code is in TotalWaterAndHeatMod and dynConsBiogeophysMod (especially subroutines dyn_water_content and dyn_heat_content): It's worth making sure that the summation of columns and the averaging from column to gridcell level is being done right for urban columns. I'd be happy to talk more with you about that, too. That's actually probably more important to get right than the CN conservation.

And finally, a summary of our discussion today:

  • We spent a while looking into what would be needed for points (1) and (2) in my above email. Adding a new wrapper should be pretty straightforward. But it looks like, if we wanted to store C & N under roads but not other urban columns, then we'd want to properly account for the correct column weightings when figuring out the column area changes, since col%wtlunit and col%wtgcell do not give the true column areas for urban roads (urban roads always take up 1/3 of the non-roof portion of the landunit, regardless of their actual area). It may be that the only change needed for this to work right would be changing set_new_weights in dynColumnStateUpdaterMod, but we didn't look closely.
  • However, after some discussion, we decided to try a simpler approach: Rather than trying to keep the BGC variables physically meaningful in urban landunits, we will just pack these variables in a way that should conserve these variables, even if the values in each of the urban columns is somewhat nonsensical. Specifically: we'll take col%wtgcell at face value in urban columns in dynColumnStateUpdaterMod – i.e., for the sake of storing / conserving these BGC variables, we'll act as if that gives the true column weight on the grid cell. This way we'll end up storing all of the C & N from the vegetated column in the urban columns, and there shouldn't be any that is lost from the system. If that urban landunit later shrinks, the stored C & N should be restored symmetrically. It shouldn't really matter that it was stored in a non-physical way (e.g., with some C & N stored in urban walls), since the BGC variables are irrelevant over the urban areas and we just want to be able to restore the amount that was originally stored if an urban landunit grows and then later shrinks.
    • But for this to work right, we need to treat the relevant BGC variables as having the same dz over all urban columns as over the soil column. Note that there already seems to be an implicit assumption that dz is the same for all columns in the dynamic column state updates, in that dz doesn't enter into the conservation equations. In terms of what needs to change, we think that the only relevant code is the code that sums up total C / N / CH4 for the sake of balance checks: these balance checks need to be consistent with the assumptions made in the conservation code. The C and N summations already use dzsoi_decomp, which is the same for all columns, so this is already what we want. So it may be that the only thing that needs to change is the use of dz in ch4_totcolch4: that will need to use dzsoi_decomp over urban columns. (This begs the question of why this isn't already using dzsoi_decomp for consistency with the C & N code; we're not sure about this.)

fang-bowen added a commit to fang-bowen/CTSM that referenced this issue Aug 27, 2021
Resolve CH4 conservation error when PCT_URBAN changes. Current approach is to use dzsoi_decomp instead of dz in ch4_totcolch4 over urban columns. See ESCOMP#1445 (comment).

Co-Authored-By: Keith Oleson <[email protected]>
fang-bowen added a commit to fang-bowen/CTSM that referenced this issue Aug 29, 2021
Resolve CH4 conservation error when PCT_URBAN changes. Current approach is to use dzsoi_decomp instead of dz in ch4_totcolch4 over urban columns. See ESCOMP#1445 (comment).

Co-Authored-By: Keith Oleson <[email protected]>
@fang-bowen
Copy link
Contributor Author

Progress update:

During the past weeks we have made some progress on the following tasks:

Task 1 & 2: Earlier we have generated a surface dataset for code development and testing. The dataset is processed into landuse.timeseries file to be read by the new dynamic urban module.

We are looking at the mksurfdata_map tool and plan to automate this process for dataset generation.

Task 4: We have created a new module (dyn_subgrid/dynurbanFileMod.F90) and made other necessary modification so that the model is able to read PCT_URBAN in landuse.timeseries file and update the urban fractions annually. This is verified by comparing the output with the input data.

As for the initialization of new urban columns: we have created PCT_URBAN_MAX variable in our landuse.timeseries file, but since the functionality (of only initialize urban where it might come into existence) is not fully implemented, we are currently setting (user_nl_clm) run_zero_weight_urban to .true. and running urban everywhere. We plan to evaluate the performance of this setting.

Besides these, as Keith suggested I have made some flowcharts (calling tree/sequence) and found them helpful in understanding the code and the model structure. I will refine and share some relevant ones here soon.

Our priority for now:

Task 3: We are trying to understand how water and energy balance is handled for dynamic urban in the current code. Through some calculation we believe the scaling method (c2l_scale_type='urbanf') makes sense, and so far we have not identify any states / processes that seem incompatible with dynamic urban.

However Keith has a time-step level output (in /glade/scratch/oleson/archive/ctsm51_ctsm51d049_1deg_GSWP3V1_SSP585_urbanonly_bowen/lnd/hist, 2019-12 to 2020-01) and we found the difference between HEAT_CONTENT1 and HEAT_CONTENT2 after the urban fraction changes does not match the aggregated EFLX_DYNBAL throughout the later year.

So, @billsacks, do you have any insight what could be the reason here? I can make some plots to show this if it helps.

By the way, I see some routines to "set start-of-run baseline values" (e.g., dyn_hwcontent_set_baselines). Will this baseline subtracting method affect our dynbal fluxes here?

dynamic_urban_update_2021_09_27.pdf

@billsacks
Copy link
Member

Good questions. The logic controlling this is pretty complex.

Regarding the mismatch between (HEAT_CONTENT2 - HEAT_CONTENT1) and EFLX_DYNBAL: sorry, I forgot about this point when we talked before: The delta heat is adjusted with this code before computing EFLX_DYNBAL:

call AdjustDeltaHeatForDeltaLiq( &
bounds, &
delta_liq = delta_liq_bulk(begg:endg), &
liquid_water_temp1 = temperature_inst%liquid_water_temp1_grc(begg:endg), &
liquid_water_temp2 = temperature_inst%liquid_water_temp2_grc(begg:endg), &
delta_heat = delta_heat(begg:endg))

Briefly, the problem this addresses is that the runoff generated by the delta liquid term is already adding or subtracting some heat to/from the grid cell, so we need to correct the delta heat adjustment to account for that. I would be happy to try to explain this in more depth if you're interested – and also happy to hear any arguments that this correction shouldn't be done at all if anyone feels that way, because I'm still not 100% sure this is the right thing to do (but Inne Vanderkelen and I discussed this in depth a few years ago in the context of dynamic lakes, and it at least stood up to that inspection).

Regarding the baselines: No, I don't think this should have any impact on dynbal fluxes for urban, unless the urban area is growing into glacier or lake areas: These baseline values are used to account for "virtual" states in glacier and lake landunits, and I believe aren't used in other landunits (though I could be remembering wrong). This is another subtle thing that I'd be happy to discuss further if you'd like to understand it more, though I think you won't need to deal with this for urban.

@olyson
Copy link
Contributor

olyson commented Oct 6, 2021

Performance notes for 5 year I2000 cases:
"allurban": 152 yrs/day, 289 pe-hr/simulated_year
normal: 172 yrs/day, 254 pe-hr/simulated_year

So, about 12% slower when running 3 urban landunits in every gridcell.

@Face2sea
Copy link

@billsacks Hi Bill, I think we have verified that the heat, liquid water, and ice conservation are all working properly. Thanks for your help! We are now planning to move forward with creating the multi-scenario (SSP-RCP) dynamic urban landuse.timeseries data based on Gao and O'Neill (2020) dataset. Before we are doing so, do you have any suggestions that how we should prepare these datasets and codes to make them better comply with the overall data framework in CESM/CTSM.

@olyson
Copy link
Contributor

olyson commented Oct 25, 2021

I spoke to Bill about this. Basically, we should proceed with implementing the raw dynamic urban data into mksurfdata. As long as we use a recent version of ctsm master, any ongoing work on the mksurfdata process should not affect us much, if at all. We can discuss further at our next meeting.

@olyson
Copy link
Contributor

olyson commented Nov 3, 2021

@billsacks, we are wondering if it makes sense to issue a PR on our work thus far; task 4 (updating model code to read in and execute a landuse timeseries file with dynamic urban, and task 3 (ensuring water and energy conservation). The proposed PR would have transient urban off by default and thus would be non-answer-changing, but would allow us and others to begin running experiments with transient urban using some input datasets that have already been created.
The second PR would be issued once our ongoing work with implementing transient urban into the mksurfdata routines was complete.

@billsacks
Copy link
Member

@olyson yes that sounds like a great plan.

@olyson
Copy link
Contributor

olyson commented Nov 4, 2021

Regarding implementing transient urban into the mksurfdata routines...the current method replaces bare soil with urban. If there is not enough bare soil, then both PCT_NATVEG and PCT_CROP are reduced proportionally to accommodate any excess urban area.
Given the increasing importance of crops to food security in the future, we are contemplating changing this behavior so that urban replaces PCT_NATVEG first and then PCT_CROP only if necessary.
@billsacks @ekluzek @lawrencepj1 @wwieder, any comments on this approach?

@lawrencepj1
Copy link

lawrencepj1 commented Nov 4, 2021

Hi Keith @olyson. That makes total sense given that the PCT_NATVEG is currently the residual after all other land units are allocated. Is the intention to have the total PCT_URBAN (with urban classes) on the transient landuse.timeseries file? If so the decision about PCT_CROP could be made before run time.

@billsacks
Copy link
Member

At first I thought this made sense. But as I read through the code & comments in normalizencheck_landuse in mksurfdat.F90, I'm starting to think that it's most correct to stick with the current approach, at least if you're talking about the surface dataset – the answer may be different for the landuse.timeseries file, which I think is what @lawrencepj1 was thinking about with his comment.

I'm looking at this comment:

! First normalize vegetated (natural veg + crop) cover so that the total of
! (vegetated + (special excluding urban)) is 100%. We'll deal with urban later.
!
! Note that, in practice, the total area of natural veg + crop is typically 100%
! going into this routine. However, the following code does NOT rely on this, and
! will work properly regardless of the initial area of natural veg + crop (even if
! that initial area is 0%).

Because the different PFTs / CFTs are expressed on the raw dataset in a way that adds to 100%, I think it makes sense to reduce everything proportionally. For example, if there is a grid cell that starts out as 50% c3 grass, 50% soybean and 50% urban: I think the meaning of the 50% c3 grass and 50% soybean is that 50% of the vegetated area is c3 grass and 50% of the vegetated area is soybean; this is in contrast to special landunits, where their percentage is specified as % of the gridcell area. That makes me think that the right thing to do is to reduce all PFTs / CFTs proportionally.

Also, I think we should be consistent between urban & other special landunits in this respect.

That said, I haven't looked closely at this. This could warrant more discussion.

@olyson
Copy link
Contributor

olyson commented Nov 4, 2021

Thanks @lawrencepj1 , yes we will have PCT_URBAN on the transient landuse.timeseries file.
I see the following code in dynLandunitAreaMod.F90 that controls the priority of which landunits get decreased if they add up to more than 100%:

! This parameter specifies the order in which landunit areas are decreased when the
! specified areas add to greater than 100%. Landunits not listed here can never be
! decreased unless the forcings say they should be decreased. In particular, note
! that istice doesn't appear here, so that the istice area always will match
! the areas specified by GLC. In general, the code will NOT be robust if more than
! one landunit is excluded from this list. Meaning: since istice is excluded from
! this list, all other landunits should appear in this list!
integer, parameter :: decrease_order(7) = &
     (/istsoil, istcrop, isturb_md, isturb_hd, isturb_tbd, istwet, istdlak/)

So I guess this is essentially doing what we were proposing since first istsoil is reduced, then istcrop...

@keerzhang1
Copy link
Contributor

keerzhang1 commented Nov 8, 2021

Thanks for the helpful discussions on mksurfdat tool!

I am modifying the mksurfdat tool based on your opinions. Just to confirm our current decisions: (1) As @billsacks said, we should not change the current normalizedcheck_landuse because the PFTs/CFTs are expressed in the raw data in a way that adds to 100% (but the normalizedcheck_landuse is able to first reduce bare soil, which is the index 0 of the natural pft array. I am still puzzled why we cannot similarly reduce other natural pfts first, and then reduce the cfts?).

(2) Then, as @lawrencepj1 said, the dynLandunitAreaMode.F90 will reduce the istsoil first, and then istcrop when the sum of all land units is larger than 100. But if we do (1), then essentially we have already reduced the crop and natveg proportionally when the urban expands, because the normalizedcheck_landuse ensures the sum of all land units to be 100.

I made a slide to show the current workflow and my thoughts:1.pdf
If my understandings are correct - doing (1) and (2) still reduce the crop and natveg proportionally when urban expands, right?

@keerzhang1
Copy link
Contributor

I wonder if we can do the following to make the expanding urban replace PCT_NATVEG first and then PCT_CROP:

(1) In mksurfdat.F90, normally the normalizedcheck_landuse will ensure the lake[1]+wet[1]+glacier[1]+urban[y]+natveg[y]+crop[y] to be 100. We modify the normalizedcheck_landuse so that it adjusts the natveg and crop assuming the lake,wetland, glacier, and urban all remain constant as the first year. Hence, it ensures lake[1]+wet[1]+glacier[1]+urban[1]+natveg[y]+crop[y] to be 100. Also, the normalizedcheck_landuse still replace bare soil with urban first, and then the natveg and crop proportionally.

call normalizencheck_landuse(ldomain)

(2) In the landuse.timseries data, we still generate the PCT_URBN which is the transient urban fractions (urban[y]).

(3) Then in the dynLandunitAreaMode.F90, the Landunit_sum might be >100 when urban expands, and this expanded urban will first take the istsoil, and then istcrop.

I also made a slide to show this workflow 2.pdf. Does this make sense? Thank you!

@olyson
Copy link
Contributor

olyson commented Nov 8, 2021

Lei, can you propose a meeting for us to discuss this further? I'd suggest inviting @lawrencepj1 and maybe also @billsacks as well. I don't know enough about the history behind some of the choices made in mksurfdata to judge whether these proposed changes are reasonable.

@billsacks
Copy link
Member

@keerzhang1 thank you very much for all of your thoughts on this. I really appreciate that you are thinking through this carefully. I agree with @olyson that a meeting makes sense at this point (partly to discuss your recent proposals, which I don't fully understand, and partly to discuss this issue more generally), though I'll share some more thoughts in writing first:

I have spent some more time looking through the relevant code and thinking about this issue. It does look like PCT_NATPFT + PCT_CROP = 100% is generally true prior to normalizencheck_landuse. I am left with more questions than answers at the moment, though I am starting to realize that my last comment (about reducing PCT_NATPFT and PCT_CROP proportionally) may not have made sense.

I think the important point here is what's done to create the raw datasets. I'm hoping that @lawrencepj1 can provide some insight here. At first I thought that urban should be treated the same as other special landunits. But now I'm realizing that what's really important is keeping the mksurfdat code consistent with the process for creating the raw datasets, and specifically the process for rescaling PCT_NATPFT and PCT_CROP to sum to 100% on the raw datasets. The current mksurfdat code, where urban is treated differently from other special landunits in that it preferentially replaces bare ground, makes sense if urban is treated differently from other special landunits in the creation of the raw datasets – and specifically, if urban areas are represented as bare ground PFTs on the raw datasets.

It would help me to know how various situations are represented on the raw datasets; for example, these three grid cells:

Grid cell 1: in reality:

  • 20% desert (i.e., truly bare ground)
  • 20% c3 grass
  • 20% soybean
  • 20% lake
  • 20% urban

Grid cell 2: in reality:

  • 30% soybean
  • 30% lake
  • 40% urban

Grid cell 3: in reality:

  • 50% lake
  • 50% urban

I'd particularly like to know what PCT_NATPFT, PCT_CROP and PCT_PFT look like on the raw datasets in each of these cases. It seems like the important point is making sure that the mksurfdat code is set up in a way to reproduce reality in each of these situations (and others) given the process for creating the raw datasets.

I do see the point that, for transient, it's a problem if we reduce PCT_CROP on the landuse timeseries file when urban expands. It seems like this could be an issue for other landunits, too – like the upcoming feature to enable transient lakes. This might argue for preferentially taking all special landunit area from PCT_NATVEG, but that might in turn require a change in how the raw datasets are constructed to keep everything consistent.

This may actually require some even bigger rethinking, like:

  • Do away with the convention that PCT_CROP and PCT_NATVEG add to 100% before normalizencheck_landuse, instead treating them the same as other landunits. (That may or may not help with the issues I'm struggling with. I'm not sure what other problems that might create, though.)
  • Or maybe we should give up on trying to renormalize percentages to 100% in mksurfdat, instead saving all of this renormalization to 100% to runtime... this might particularly make sense for cases where the areas of some landunits are only known at runtime (e.g., for glacier) – so we could use the same code for preferentially decreasing certain landunit areas in different points of the code. Again, this might create some new problems, so needs some careful thought....

One other comment is that, whatever we decide here, if there are behavior changes to the standard operation of mksurfdat, those should be done in their own branch/PR, not folded into the big transient urban PRs: we want to keep answer changes separate from answer-preserving feature additions. If it is easiest for you to fold all of these things together initially, then we should make sure we have a plan for separating them later (whether manually or with the help of git).

@Face2sea
Copy link

Face2sea commented Nov 9, 2021

@keerzhang1 @billsacks @lawrencepj1 @olyson Thank you all very much for these helpful information! @olyson yes, it is definitely a great idea to have a meeting to discuss these issues! I will create a doodle for everyone to find out a time shortly.

@lawrencepj1
Copy link

Hi Everyone @olyson @keerzhang1 @billsacks

Yes I agree a meeting would be good. In regard to Bill's comments on the raw datasets we currently have a PCT_URBAN specified but this is from MODIS and does not have the required breakdown into the urban classes. It could easily be included in the next version of the CLM raw data so that the nat veg, crop and other fractions are accounted for before mksurfdata as an alternative strategy if that was helpful.

cheers
Peter

@olyson
Copy link
Contributor

olyson commented Nov 9, 2021

As Peter implies, the original raw multi-density urban dataset was independently derived from considerations (mainly population density) other than, for example, remotely-sensed data such as from MODIS. At the time of its implementation in mksurfdata, it was assumed that urban should replace bare ground preferentially because of their spectral similarity in remote sensing derived datasets of that period.

@Face2sea
Copy link

@olyson @billsacks @lawrencepj1 @keerzhang1 @fang-bowen Here is the when2meet link for our meeting to discuss this further: https://www.when2meet.com/?13594604-RvqkA. Could you please indicate your availabilities in the link? I will find out a common time for everybody to set up the meeting. Thanks!

@billsacks
Copy link
Member

As we discussed on Monday: For now @keerzhang1 will do something simple in mksurfdata_map, getting dynamic urban to work without trying to address the issues identified above. However, before too long, we should overhaul how the PCT areas are determined in mksurfdata_map. I have opened a new issue for this: #1555 .

@keerzhang1 @Face2sea and @fang-bowen : thank you very much for bringing these issues to our attention and starting these important conversations around what is the right thing to do in mksurfdata_map!

@billsacks
Copy link
Member

See also #1572

@olyson
Copy link
Contributor

olyson commented Sep 8, 2022

Tasks 2, 3, and 4 have essentially been completed. Task 5 (testing) has also been completed but will need to be repeated once the items below have been completed. Task 1 has been expanded below to include a new building temperature stream file and code to determine nlevurb from the surface datasets as originally described in PR #591 which was completed but never implemented into master.

1a. Incorporate new raw urban surface datasets for present day (2000), historical, and five SSPs (SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP4-3.4, SSP5-8.5) developed primarily by @keerzhang1 from Gao and O'Neill (2021) and Gao and Pesaresi (2022). These datasets are available but have PCT_URBAN calculated with respect to the land area of the gridcell and will likely need to be replaced with PCT_URBAN calculated with respect to the total area of the gridcell based on recent discussions. The urban properties (morphological, radiative, and thermal) in these datasets are described in Oleson and Feddema (2019).

1b. A new building temperature stream file. It is available in inputdata at /glade/p/cesmdata/cseg/inputdata/lnd/clm2/urbandata/CLM50_tbuildmax_Oleson_2016_0.9x1.25_simyr1849-2106_c181017.nc, but needs to be converted to cdf5.

1c. Code changes in CTSM and mksurfdata (presumably mksurfdata_esmf) to read in nlevurb from the surface dataset instead of being hardcoded in clm_varpar.F90. This will ensure backward compatibility with older urban datasets (which have nlevurb=5 as opposed to the new datasets which have nlevurb=10).

The plan is to implement 1a-c on a branch off of ctsm5.2.mksurdata.

@slevis-lmwg
Copy link
Contributor

@olyson #1853 is now merged (in the ctsm5.2.mksurfdata branch).
Feel free to close this issue if appropriate.

@olyson
Copy link
Contributor

olyson commented Sep 28, 2022

Thanks @slevisconsulting ! I'm going to close this because the tasks have been completed and I assume that the final surface and landuse timeseries datasets will be generated and incorporated into the namelists using the ctsm5.2.mksurfdata branch and then merged into main. Any objections?

@olyson olyson closed this as completed Sep 28, 2022
@samsrabin samsrabin added the science Enhancement to or bug impacting science label Aug 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new capability or improved behavior of existing capability science Enhancement to or bug impacting science
Projects
None yet
Development

No branches or pull requests

9 participants