b40.rcp6_0.1deg.001


Run Specifications


===================
General Information
===================

   Purpose of Run: ensemble member 1 of rcp6.0 runs

   Scientific Lead: Jim Hurrell

   Software Engineering Lead: Mariana Vertenstein

   Assigned to: mai
 
   Date: 2010-11-09

   Run Length:  96 years
 
=========================
Case Creation Information (all fields are required)
=========================

   CCSM tag:   cesm1_0_beta10

   Case Name:  b40.rcp6_0.1deg.001

   Machine:    bluefire

   Compset:    B_RCP6.0_CN
 
   Resolution: f09_g16


=============================
Pre-Configuration Information
=============================

   Runtype: hybrid
   RUN_STARTDATE = 2005-01-01
   RUN_REFCASE   = b40.20th.track1.1deg.008 
   RUN_REFDATE   = 2005-01-01
           
   env_conf.xml mods 
   -----------------
   xmlchange -file env_conf.xml -id RUN_TYPE -val 'hybrid'
   xmlchange -file env_conf.xml -id RUN_STARTDATE -val '2005-01-01'
   xmlchange -file env_conf.xml -id RUN_REFCASE -val 'b40.20th.track1.1deg.008'

   env_conf.xml mods 
   _________________

   * none


   env_mach_pes.xml mods
   _____________________
      
   component       comp_pes    root_pe   tasks  x threads (stride)
   ---------        ------     -------   ------   ------   ------
   cpl = cpl        320         0        320    x 1       (1     )
   glc = sglc       1           0        1      x 1       (1     )
   lnd = clm        128         320      128    x 1       (1     )
   ice = cice       320         0        320    x 1       (1     )
   atm = cam        448         0        448    x 1       (1     )
   ocn = pop2       64          448      64     x 1       (1     )
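
The layout above can be cross-checked with a few lines of Python. This is only a sketch: the `layout` dictionary is copied by hand from the table, the helper names are made up for illustration, and since every component uses 1 thread, tasks and PEs are treated as the same thing.

```python
# (root_pe, tasks) per component, copied from the env_mach_pes.xml table.
# All components run with 1 thread, so one task occupies one PE.
layout = {
    "cpl": (0, 320),
    "glc": (0, 1),
    "lnd": (320, 128),
    "ice": (0, 320),
    "atm": (0, 448),
    "ocn": (448, 64),
}


def pe_range(name):
    """Inclusive (first, last) task indices occupied by a component."""
    root_pe, tasks = layout[name]
    return root_pe, root_pe + tasks - 1


def total_tasks():
    """Highest occupied task index plus one, i.e. the size of the job."""
    return max(root_pe + tasks for root_pe, tasks in layout.values())


def overlaps(a, b):
    """True if components a and b share at least one task."""
    a_first, a_last = pe_range(a)
    b_first, b_last = pe_range(b)
    return a_first <= b_last and b_first <= a_last


if __name__ == "__main__":
    for name in layout:
        print(name, pe_range(name))
    print("total tasks:", total_tasks())
```

With these numbers the job spans 512 tasks: ice (0-319) and lnd (320-447) together cover the same tasks as atm (0-447), while ocn sits on its own block (448-511).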


==============================
Post-Configuration Information
==============================

   Buildconf
   _________

  * none


======================
SourceMods Information
======================

  * none (but see comments below)


==========================
Performance/Cost Estimates
==========================

  * 13.1 years/day across 8 nodes
 
====================
Special Instructions
====================

  * none
 

====================
Pre-Run Instructions
====================

  * Run create_production_test
 
  * Run debug smoke test

  * Add NCAR Software Levels info to checklist 

================
Run Instructions
================

  Run Length: 96 years

  Account key:  93300473 (for now, the same account as for the rcp4.5 runs)

  Priority/Targeted queue:  regular

  Other:

================
Diagnostics Plan
================

  * vs b40.20th.track1.1deg.008 (1986-2005) at 2041-2060 and 2081-2100


======================
Additional Information
======================

  * none




Run Checklist


Complete the following checklist prior to beginning the production run:



1.  Update status file: /web/web-data/cseg/ccsm4_0_runs/b40.rcp6_0.1deg.001/status.html:
    assigned
    pending
    running
    completed
    stopped


2.  Document NCAR software levels at beginning of run (use the spinfo command on bluefire)
***************************************************
NCAR SOFTWARE LEVELS: Tue Nov  9 14:50:56 MST 2010.
***************************************************
AIX:                  bos.mp              5.3.10.1
CSM:                  csm.core            1.7.1.4
LoadLeveler:          LoadL.full          3.5.1.3
GPFS:                 gpfs.base           3.2.1.14
VSD:                  rsct.vsd.vsdd       4.1.0.23
POE:                  ppe.poe             5.1.1.3
PESSL:                pessl.rte.smp       3.3.0.2
ESSL:                 essl.rte.smp        4.4.0.1
FORTRAN:              xlfrte              12.1.0.8
PERL:                 perl.rte            5.8.2.100
C:                    xlC.rte             10.1.0.3


3.  Complete the following table, as necessary, showing
    the component liaison's name and the date the setup
    was approved.
 

   Component         Liaison/                         Date Approved
                     Reviewer
   ================+================================+==================

      atm            hannay                           2010-11-09

      cpl            [kauff,mvertens,tcraig,other]    ----

      ice            dbailey                          ----

      lnd            [erik,slevis]                    ----

      ocn            [njn01,bates,gokhan]             ----

      env_ file      [mvertens,other]                 ----
      settings

      data           [strand,other]                   ----
 

4.  Create_production_test completed    mai     2010-11-09
 

5.  Debug smoke test completed          mai     2010-11-09


6.  Performance review completed [who,when]
 




Comments

On 24 Nov 2010 the run died during model date Oct 2100 (less than three
months from the finish of the run) with the following error from the
lnd.log.101124-115436 file:

 clm2: completed timestep  1678961
(shr_tInterp_getFactors)  ERROR illegal linear times:             -NaNQ       0.00000000            -NaNQ
(shr_sys_abort) ERROR: (shr_tInterp_getFactors)  illegal itimes
(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
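
The timestep count in the log can be turned back into a model date as a sanity check. The sketch below rests on assumptions that are not stated in this document: a 1800-second CLM timestep, a 365-day no-leap calendar, and a step counter that starts at zero on RUN_STARTDATE (2005-01-01). The function name `model_date` is made up for illustration.

```python
STEP_SECONDS = 1800    # assumption: 30-minute CLM timestep
DAYS_PER_YEAR = 365    # assumption: no-leap calendar
START_YEAR = 2005      # RUN_STARTDATE = 2005-01-01
MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def model_date(nstep):
    """Return (year, month, day) reached after nstep timesteps."""
    elapsed_days = nstep * STEP_SECONDS // 86400
    year = START_YEAR + elapsed_days // DAYS_PER_YEAR
    day_of_year = elapsed_days % DAYS_PER_YEAR  # zero-based
    month = 1
    for length in MONTH_LENGTHS:
        if day_of_year < length:
            break
        day_of_year -= length
        month += 1
    return year, month, day_of_year + 1


if __name__ == "__main__":
    print(model_date(1678961))  # (2100, 10, 31) under the assumptions above
```

Under those assumptions, timestep 1678961 falls on 31 October 2100, which agrees with the reported model date of Oct 2100.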

After consulting with Erik Kluzek, Mariana Vertenstein suggested a patch to
shr_tInterp_mod.F90. From an email sent by Mariana to Andy Mai:

   I have looked at the csm_share ChangeLog - and I think the only thing
   that would be worth trying is to ONLY change shr_tInterp_mod.F90 with the
   following diffs (< are the changes you want to put in)

   115c115
   <    integer(SHR_KIND_I8)   :: spd         ! seconds per day
   ---
   >    integer(SHR_KIND_IN)   :: spd         ! seconds per day
   143c143
   <    itimein = int(edayin-edayin,SHR_KIND_I8)*spd + int(sin-sin,SHR_KIND_I8)
   ---
   >    itimein = (edayin-edayin)*spd + sin-sin
   147c147
   <    itime1 = int(eday-edayin,SHR_KIND_I8)*spd + int(s1-sin,SHR_KIND_I8)
   ---
   >    itime1 = (eday-edayin)*spd + s1-sin
   151c151
   <    itime2 = int(eday-edayin,SHR_KIND_I8)*spd + int(s2-sin,SHR_KIND_I8)
   ---
   >    itime2 = (eday-edayin)*spd + s2-sin


This change allowed the run to finish after restarting from 2099-01-01-00000.
Because this is all integer arithmetic, the change is bit-for-bit (bfb) except in
cases where the intermediate results on the right-hand side overflowed in the
original code. The NaNQs in the printed error message are a separate source of
confusion: the print statement used an "F" (real) format specification to print
three integers. This will be corrected in cesm1_0_beta12.
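
To make the overflow argument concrete, the sketch below emulates the two versions of the arithmetic in Python. It assumes that SHR_KIND_IN is a 32-bit signed integer and that overflow wraps around in two's complement, which is typical hardware behaviour but is not guaranteed by the Fortran standard. The function names are made up for illustration, and the actual dates that triggered the failure are not recorded here.

```python
INT32_MAX = 2**31 - 1
SPD = 86400  # seconds per day, as in shr_tInterp_mod.F90


def wrap_int32(value):
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31


def itime_32bit(day_diff, sec_diff):
    """Original form: every intermediate result is held in 32 bits."""
    return wrap_int32(wrap_int32(day_diff * SPD) + sec_diff)


def itime_64bit(day_diff, sec_diff):
    """Patched form: operands are widened before the multiply.

    Python integers do not overflow, so this stands in for SHR_KIND_I8.
    """
    return day_diff * SPD + sec_diff


# Largest day difference whose product with SPD still fits in 32 bits.
MAX_SAFE_DAYS = INT32_MAX // SPD

if __name__ == "__main__":
    print("max safe day difference:", MAX_SAFE_DAYS)
    for days in (MAX_SAFE_DAYS, MAX_SAFE_DAYS + 1):
        print(days, itime_32bit(days, 0), itime_64bit(days, 0))
```

Under these assumptions the two forms agree for day differences up to 24855 (about 68 years of 365 days), which is why the patch is bfb in the common case. One day more and the 32-bit product wraps to a negative number while the 64-bit form stays correct.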

