A Hybrid Coordinate Ocean Model (HYCOM) for Data-Assimilative Ocean Modeling
Year 1 Implementation Plan
--------------------------
November 1999

This implementation plan summarizes the first-year goals of the consortium for a) the release of HYCOM 1.0, b) the global and basin-scale simulations, c) the data assimilation approach, and d) evaluation of the results.

1. HYCOM 1.0 RELEASE

1.a SCALABILITY

First priority: a scalable HYCOM that runs at a minimum on the Cray T3E, the Origin 2000, and various types of workstations. HYCOM must run on the Origin for global simulations (LANL) and on the T3E for Atlantic simulations (Miami/NRL). It must also run on workstations for testing purposes.

There are currently two scalable implementations of MICOM. MP-MICOM uses MPI or SHMEM and can also run on all workstation types. The more recent SC-MICOM uses a two-level parallelization scheme, threads and MPI, which is more flexible than MP-MICOM. However, SC-MICOM uses SGI threads and so only runs on SGI computers. There is also a workstation-only version of MICOM. Multiple versions are difficult to keep consistent, so our goal is a single source code for HYCOM (and MICOM).

The long-term goal (by the end of year 1) is a single source code implementing both MICOM and HYCOM (replacing all previous versions) that runs on a wide range of machines. We expect this to be based on SC-MICOM. A preliminary port of SC-HYCOM already exists. We expect it to take 3-6 months to generalize this code to all machine types. In the interim (within 4-6 weeks), we will prepare an MP-HYCOM (based on MP-MICOM) that we expect to be the standard "beta" code for initial simulations and all model physics/numerics development until a portable SC-HYCOM (HYCOM 1.0) is ready. The numerical kernels are almost identical, so the switch from MP-HYCOM to SC-HYCOM should be relatively easy. However, MP-HYCOM will not have all of the re-designed user interface that is planned for HYCOM 1.0 to make MICOM/HYCOM easier to use. MP-HYCOM will be configured to allow it to run on workstations as well as on the Origin and T3E. No workstation-only code will be released.

The most error-prone part of the interface between the computational kernels and the scalability code (in both MP-HYCOM and SC-HYCOM) is updating the halos that allow the kernels to ignore other tiles and then correctly "consuming" these halos as the computation progresses. TOPAZ is a software tool that provides guidance on these issues. We will confirm that all released versions of SC-HYCOM conform to TOPAZ guidance, and will include an option to run TOPAZ as part of the standard makefile for SC-HYCOM.

The primary distribution method for HYCOM will be via the HYCOM web page (unlimited public release). We will investigate the use of "patches" to simplify updating from one minor release to the next. Major releases will consist of complete .tar files.

One of the major features of the redesigned user interface will be a portable file format already in use by two NRL ocean models (NLOM and NCOM). In order to improve I/O performance, the actual arrays are stored as raw 32-bit IEEE REALs (no control words) in one file, while any "header" information is written to a separate file. The array file is always identical to that produced using direct access I/O from a single workstation. The header file contains either plain text or unformatted sequential Fortran records holding only 32-bit IEEE REAL values. The two files by convention have the same filename root but different extensions (e.g., name.a and name.b). Distributing information across two files has worked well in practice, but there is the possibility that a header file will be used with the wrong array file. We will prevent this by including a "checksum" for each array in the header file. In NLOM/NCOM, the checksum is each array's minimum and maximum value, but we will use a more comprehensive checksum for HYCOM if one can be found that is portable and efficiently calculable in parallel. The single array file will act as the archive and portability medium, but on some machine types it may be necessary to split each array file into many small local files (one per node) for input and to combine many small files on output. This will be done where necessary as part of the standard model run script. Files of raw binary floats are very easy to directly import into many visualization packages, but we will develop utilities to convert them to netCDF files. The conversion will include the ability to select a sub-region for output.

Implementation: Wallcraft, O'Keefe, Bleck
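For illustration, a minimal sketch of reading one field from such an array/header file pair and checking it against a min/max checksum is given below. The routine names, array dimensions, byte order, and field layout are assumptions made for the example, not the conventions of the released utilities.

    # Minimal sketch (not the released utilities): read one 2-D field from a raw
    # 32-bit IEEE array file and check it against a min/max "checksum" taken
    # from the companion header file.  Dimensions, byte order, and field layout
    # are illustrative assumptions; HYCOM 1.0 defines the actual conventions.
    import numpy as np

    def read_field(array_file, idm, jdm, field_index=0, byte_order=">"):
        """Read field number field_index, assuming fields of idm*jdm
        consecutive 32-bit IEEE REALs with no control words."""
        dtype = np.dtype(byte_order + "f4")
        offset = field_index * idm * jdm * 4      # bytes to skip
        with open(array_file, "rb") as f:
            f.seek(offset)
            data = np.fromfile(f, dtype=dtype, count=idm * jdm)
        return data.reshape((jdm, idm))

    def matches_header(field, hmin, hmax, tol=1e-4):
        """NLOM/NCOM-style consistency check: the header stores each
        array's minimum and maximum value."""
        return abs(field.min() - hmin) <= tol and abs(field.max() - hmax) <= tol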
1.b CODE DEVELOPMENT

MIXED LAYER MODEL: The K-Profile Parameterization (KPP) mixing model has been incorporated into HYCOM, and initial tests are underway. This version of HYCOM will be made available to NOPP participants in November 1999. Tests have demonstrated that the KPP model locally conserves T, S, u, and v to at least 15 decimal places during each time step. Initial plots suggest that the model estimates of both mixed layer thickness and turbulent boundary layer thickness are not seriously degraded by coarse and highly uneven vertical resolution.

Other tests to be performed before the initial release include (1) consistency checks such as global conservation of T, S, and momentum, and (2) performance comparisons to both MICOM and HYCOM configured with the Kraus-Turner bulk mixed layer model. Other tests to be performed later include (1) sensitivity studies to vertical grid resolution; (2) sensitivity studies to horizontal resolution; and (3) sensitivity studies to the temporal resolution of the surface forcing (monthly vs. synoptic).

The KPP model has been configured to archive several additional variables for analysis, including (1) turbulent boundary layer thickness, defined as the maximum penetration depth of near-surface turbulent eddies and estimated by a bulk Richardson number criterion; (2) mixed layer thickness, presently estimated as the vertically interpolated depth where density has increased by a specified threshold value; and (3) bulk mixed layer T, S, density, u, and v, all averaged over the water column above the mixed layer base. Mixed layer thickness defined by a delta-temperature criterion will also be incorporated.

Other mixed layer models will be considered, such as the Price-Weller-Pinkel (PWP) bulk Richardson number model. These additional models may or may not be incorporated into HYCOM. For example, the PWP model does not work well at low vertical resolution, which is a distinct disadvantage in comparison to the KPP model. Any other mixed layer model that is selected for inclusion in HYCOM will be subjected to the same evaluation tests as KPP before being released to outside users. The HYCOM version with mixing governed by the bulk Kraus-Turner model will also be made available. To the greatest extent possible, the choice of mixed layer model will be made user-selectable.
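As an illustration of the mixed layer thickness diagnostic described above (the vertically interpolated depth at which density has increased by a specified threshold relative to the surface), a minimal sketch follows. The profile representation and the threshold value are assumptions for the example only, not the archived model diagnostic.

    # Illustrative sketch of the density-threshold mixed layer depth criterion.
    # The layer-center representation and default threshold are assumptions.
    import numpy as np

    def mixed_layer_depth(sigma, z, dsigma=0.03):
        """Depth at which density first exceeds its surface value by dsigma,
        obtained by linear interpolation between adjacent levels.

        sigma : layer densities from the surface downward
        z     : corresponding depths (positive downward)
        """
        sigma = np.asarray(sigma, dtype=float)
        z = np.asarray(z, dtype=float)
        target = sigma[0] + dsigma
        for k in range(1, len(sigma)):
            if sigma[k] >= target:
                frac = (target - sigma[k - 1]) / (sigma[k] - sigma[k - 1])
                return z[k - 1] + frac * (z[k] - z[k - 1])
        return z[-1]   # threshold never reached: mixed layer reaches the bottom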
During year one of the NOPP project, mixed layer evaluations will be performed using the low-resolution Atlantic model, the 1/3 degree Atlantic simulation, and the high-resolution IAS simulations.

Implementation: Halliwell
Evaluation: Halliwell, Chassignet, Smith, Bleck, Romanou, Wallcraft

MODE SPLITTING: Ocean models typically separate barotropic and baroclinic modes to improve numerical efficiency. The "explicit" splitting method employed in MICOM has recently been refined by R. Higdon (Oregon State). The new method is computationally somewhat more expensive than the traditional one but has been shown to reduce the "sloshing" in the barotropic component of the circulation. It is also computationally more stable, allowing a reduction in the time smoothing (Robert-Asselin filter) presently employed to ensure numerical stability. HYCOM 1.0 will incorporate Higdon's mode splitting method.

Implementation: Bleck

SEA-ICE/POLAR GRID: Projecting the earth's surface onto a logically rectangular computational grid is possible in ocean models by placing the poles of the projection on land. Various alternatives for doing so have recently been developed. The specific variant proposed for HYCOM is to smoothly connect a conventional Mercator projection grid, covering all of the Southern Hemisphere and the Northern Hemisphere to 50N, to a "bipolar" grid generated by placing two poles on opposite sides of the 50N parallel ("Pan Am" grid). Extending the model domain to the North Pole and to the edge of the Antarctic continent allows consideration of the dynamic and thermodynamic effects of sea ice in the model. For the purpose of this project, inclusion of a full-fledged sea ice model is deemed unnecessary. We intend to incorporate ice effects in the form of observed sea ice coverage coupled with T/S relaxation under the ice to observed climatological conditions. The question of whether HYCOM 1.0 should include the global grid extension was left open.

Implementation: Bleck

EQUATORIAL WAVE GUIDE: Ocean dynamics near the equator are modulated by Rossby and Kelvin waves traveling along the equator, as well as by equatorial Ekman suction. Capturing the correct phase speed and latitudinal structure of higher modes of the equatorial waves is facilitated by reducing the meridional mesh size in the equatorial belt. This option will be incorporated in HYCOM 1.0.

Implementation: Bleck

OPEN BOUNDARY CONDITIONS: The released version of HYCOM will incorporate two types of boundary conditions for areas not bounded by land (boundary at 65/70N for the quasi-global domain, user's choice for basin domains). The first choice of boundary condition will be the traditional "buffer zone" approach, in which open ocean boundaries are treated as closed but are outfitted with buffer zones in which temperature T and salinity S are linearly relaxed toward climatological (or any other observed) values. These buffer zones restore the T and S fields to observations in order to approximately recover the vertical shear of the currents through geostrophic adjustment. There are no constraints on the barotropic circulation. The second boundary condition will be "open" in the sense that relaxation to mass fluxes, interface depths, T, S, and density is prescribed in a finite-width sponge zone, and that the barotropic pressure and velocities are advected into/out of the domain via characteristics.

Implementation: Smith, Chassignet, Bleck
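The buffer-zone relaxation described above amounts to a simple nudging of T and S toward climatology with a strength that decreases away from the boundary. A minimal sketch follows, under the assumption of a linear variation of the relaxation strength across the zone; the parameter names and array layout are illustrative only, not the released scheme.

    # Illustrative sketch of buffer-zone ("sponge") relaxation toward climatology.
    # Linear variation across the zone and the parameter names are assumptions.
    import numpy as np

    def relax_to_climatology(field, clim, dist, width, tau, dt):
        """Nudge a T or S field toward climatology inside a buffer zone.

        field, clim : 2-D model field and climatological field
        dist        : 2-D distance from the open boundary (grid points)
        width       : buffer-zone width (grid points)
        tau         : relaxation time scale at the boundary (seconds)
        dt          : model time step (seconds)
        """
        # weight = 1 at the boundary, decreasing linearly to 0 at the inner edge
        w = np.clip(1.0 - dist / float(width), 0.0, 1.0)
        return field + (dt / tau) * w * (clim - field)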
2. MODEL CONFIGURATIONS

2.1 GLOBAL HYCOM

HYCOM will be configured globally for the reanalysis. The first configuration will be a repeat of the "classic MICOM" 1.4 degree simulation described in Sun and Bleck (1999), forced with a climatology based mostly on COADS. A comparison to the MICOM solution as well as to observed data will be carried out for a 50-year climatological run.

The second configuration will be identical to the first, except for a) the grid resolution in the equatorial wave guide, which will vary in the latitudinal direction from 1/3 degree at the equator to 1.4 degrees at 15N and 15S, and b) the northern boundary, which will be placed farther north at 70N (DYNAMO northern latitude), with relaxation to observations in the Greenland/Norwegian Seas. Comparison to the first configuration and to data will be performed for a 50-year run.

The third climatological integration will be a repeat of the second one (with equatorial wave guide and 70N boundary), but forced with ECMWF climatology superimposed with typical daily wind variations derived from the full data set (50-year integration). This 50-year experiment will be followed by a 20-year integration forced with the twice-daily ECMWF forcing from 1979-1999. In the first experiment, the E-P forcing will consist of observed precipitation and computed evaporation (as in Sun and Bleck, 1999). The drift and the necessity to add a small relaxation to observed surface salinity in the second and third experiments will be evaluated.

Implementation and evaluation: Thacker, Bleck, Chassignet

2.2 NORTH ATLANTIC HYCOM

This series of basin experiments will build on the DYNAMO experience (1/3 degree grid spacing). The domain configuration will be from 70N to 20S with relaxation to observed monthly values (modified Levitus as in DYNAMO) at the northern boundary and to monthly climatology (Levitus) at the southern boundary. Three 20-year climatological simulations will be performed with forcing from a) COADS, b) ECMWF, and c) NCEP. The ECMWF and NCEP runs will then be integrated for an additional 20 years with daily winds from the 1979-1999 period. Freshwater flux will consist of observed precipitation and computed evaporation, plus a small relaxation to monthly climatological surface salinity. The model output will be saved daily for the surface values (including mixed layer depth) and every three days for the interior quantities. If time permits, a 1/12 degree simulation with ECMWF forcing will be initiated (to be compared to a companion MICOM run).

Analysis: performance metrics (see Appendix for evaluation criteria); comparison to DYNAMO and CME results

Implementation and evaluation: Chassignet, Hurlburt

2.3 MODEL CONFIGURATION FOR DATA ASSIMILATION

The basic configuration will be the 1/3 degree DYNAMO domain forced by the 1979-1999 ECMWF forcing.
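For reference, the meridional refinement of the equatorial wave guide used in the second and third global configurations (1/3 degree at the equator, increasing to 1.4 degrees at 15N and 15S) could be generated along the lines of the sketch below. The linear taper is purely illustrative and is not the grid-generation algorithm that will actually be used in HYCOM.

    # Illustrative sketch of meridional refinement in the equatorial wave guide:
    # 1/3 degree at the equator, 1.4 degrees poleward of 15N/15S.  The linear
    # taper between those latitudes is an assumption for illustration.
    def meridional_spacing(lat, dy_eq=1.0 / 3.0, dy_bg=1.4, lat_taper=15.0):
        """Meridional grid spacing (degrees) as a function of latitude."""
        frac = min(abs(lat) / lat_taper, 1.0)
        return dy_eq + frac * (dy_bg - dy_eq)

    # Example: latitudes of grid rows from the equator northward to 70N
    lats = [0.0]
    while lats[-1] < 70.0:
        lats.append(lats[-1] + meridional_spacing(lats[-1]))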
3. DATA ASSIMILATION METHODOLOGY

3.1 OVERVIEW

Several assimilation routines are presently available for the MICOM and NLOM codes. These include optimal interpolation, PMOA, and adaptive and Markov Random Field implementations of the Kalman filter. During the first year, these techniques will be applied to a set of controlled assimilation experiments (1/3 degree resolution over the DYNAMO domain) for evaluation, using the SSH data. Ultimately, it is desirable for HYCOM to have a small number of standardized options for data assimilation that are stable in terms of inter-variable balance and general enough for various data types.

Such efforts will pay off in the future by providing a common platform for systematic inversion of the observation operators, and may even be relevant to such seemingly unrelated tasks as the implementation of open boundary schemes. Given the pieces we have so far, we have two options for such a platform. One is a variational scheme similar to 4DVAR, using the adjoint code from the adaptive filter. The other is a Kalman filter implementation that can unify the existing statistically based methods (PMOA, adaptive, MRF). For the latter, an ensemble or Monte Carlo approach seems to give the most flexibility as well as the least code development effort. Either way, one must be careful given the labor-intensive nature of this long-term task.

The ensemble implementation relies on multiple forward model runs for the time updating of the covariance matrix. Its flexibility is mostly due to the fact that the assimilation scheme is oblivious to the numerics of the model, as it depends only on the model outputs. The number of statistical runs is an issue in practice. Although the literature reports that a hundred or so runs are sufficient, this would probably capture only the major EOFs of the covariance matrix. The smaller scale correlations would then be supplemented in the proposed scheme by other existing methods, such as the MRF.

Implementation: Chin, Baraille, Mariano, Smedstad, Thacker

3.2 APPROACH

Several first-year tasks have been identified for the coarser resolution global data assimilation effort. Since the high resolution model is initially limited to the Atlantic basin, the global simulation will also be used to provide boundary conditions at the open boundaries. Thus the emphasis will be on achieving accurate simulations near the open boundaries. Also, because of the importance of the tropics for climate prediction, latitudinal resolution will be increased near the equator and special attention will be given to the tropics.

Before assimilating data, unconstrained simulations will be compared to data. Results with climatological forcing will be compared to the Levitus climatology. Then, simulations with ECMWF forcing for 1979-1999 will be compared closely to XBT data, in order to detect systematic model-data differences that might be corrected and to establish the nature of the error covariances that will provide the basis for assimilation. Codes for assimilating data will be prepared in such a way that simple methods can be replaced by more sophisticated methods as they become available.

Coarse resolution implementation: Thacker

For the North Atlantic basin experiment, the data assimilation experiments will consist of a comparison and evaluation of four techniques in the 1/3 degree DYNAMO domain. The evaluation will consist of two primary experiments: twin experiments with simulated Sea Surface Height (SSH) data, and the assimilation of measured SSH. The SSH anomalies from TOPEX/Poseidon and ERS 1 and 2 will be used in the assimilation experiments. The archived data at NRL/Stennis will be extracted and made available for the assimilation experiments. The mean sea surface height (SSH) to be added to the altimetric anomalies will be determined by comparing the mean SSH from different products. The mean SSH from the 1/3 degree HYCOM over the time period of the satellite data (1993-1999) will be compared to the mean dynamic height calculated from BTs over the same period.
A high-resolution (1/12 degree) MICOM/HYCOM model mean will also be used, as well as a mean from a high-resolution QG model. The most realistic position of the Gulf Stream from these products will be combined with the mean from the 1/3 degree HYCOM using a rubber-sheeting technique. This mean will be used in the assimilation of the real altimetry data. In both experiments, HYCOM will be forced using ECMWF winds from 1979-1999. Assimilation of either simulated SSH data or real data for the period 1993 through 1999 will be performed.

The four data assimilation techniques are:

Optimal Interpolation (OI): OI will be implemented to assimilate satellite altimetry observations track by track. The implementation will be based on the technique used in the Navy Layered Ocean Model (NLOM), which includes an optimal interpolation of the model/data SSH differences, a vertical projection of the surface information, and a geostrophic correction to the velocities outside the equatorial region. The vertical projection will be done using either isopycnal modes or the Cooper-Haines technique.

Implementation: Smedstad

The Parameter Matrix Objective Analysis (PMOA) algorithm: An OI scheme with correlation parameters that vary in space and time for space-time interpolation of SSH, a vertical projection of the surface information, and a geostrophic correction to the velocities outside the equatorial region.

Implementation: Halliwell, Mariano

Adaptive Filter: The algorithm estimates the unknown parameters at each data assimilation step by minimizing the forecast errors under more demanding hypotheses on both model and observation errors. The estimation process requires the use of the adjoint of a linearized version of the MICOM (HYCOM) model. In its simplest formulation, the filter has an Optimal Interpolation-like structure for the gain. The filter will be used to estimate vertical correlation coefficients of the forecast error covariance matrix.

Implementation: Baraille

The Markov Random Field Information Filter (MRFIF): The horizontal (cross-)covariance functions of SSH and velocities in an extended information filter are parameterized as second-order spatial Markov Random Fields. A vertical projection scheme is used to correct lower-layer thicknesses and velocities.

Implementation: Chin

Vertical Projection: It will be important to transfer the surface information from the altimetric observations to the interior of the ocean. A statistical inference technique will be used to infer lower-layer pressure anomalies from sea surface height data. Issues concerning which model variable to update (i.e., density or layer thickness) will be investigated. Vertical projection coefficients from the Adaptive Filter will also be evaluated.

Implementation: Smedstad, Mariano

An alternative approach to the vertical projection of the surface observations will be the use of synthetic profiles calculated with the Modular Ocean Data Assimilation System (MODAS). MODAS is built on a subset of the MOODS profile database, producing a variable-resolution climatology plus statistical regressions that relate subsurface T and S to satellite-measured surface temperature and sea surface height. This would give corrections to both the temperature and salinity at all depths of the model.
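All of the statistically based techniques above share an Optimal Interpolation-like analysis step, in which model/data SSH differences are weighted by prescribed error covariances. A schematic sketch of that common step is given below; the matrices and their sizes are placeholders for illustration, not the track-by-track implementations, covariance parameterizations, or vertical projections described above.

    # Schematic of the Optimal Interpolation-like analysis step shared by the
    # statistically based techniques above.  The covariance matrices B and R
    # and the observation operator H are placeholders for illustration.
    import numpy as np

    def oi_analysis(xf, y, H, B, R):
        """Return xa = xf + K (y - H xf) with gain K = B H^T (H B H^T + R)^-1.

        xf : forecast state vector (n,)
        y  : observations (p,), e.g. along-track SSH anomalies
        H  : (p, n) observation operator
        B  : (n, n) forecast error covariance
        R  : (p, p) observation error covariance
        """
        innovation = y - H @ xf
        K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
        return xf + K @ innovation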
Skill assessment of each data assimilation method will use the following data sets: (a) the RSMAS two-day 18-km blend of satellite and in-situ SST; (b) velocities from near-surface drifters; (c) SSH from tide gauges; (d) WOCE hydrographic sections; (e) mixed-layer depths from hydrography and profiling floats; and (f) forecast errors. A suite of 1st- and 2nd-order statistics will be used to compare the HYCOM results obtained with the four techniques to the benchmark HYCOM simulation with no data assimilation. The results of the statistical evaluation and the computational costs of each data assimilation technique will be analyzed to determine which technique gives us the most "bang for the buck."

4. EVALUATION

The project will develop a suite of model-data comparisons that can be routinely applied to atmospherically-forced and data-assimilative model runs. This includes acquiring appropriate data sets and casting them into a meaningful, user-friendly form for model and data assimilation evaluation. Forecasts from a data-assimilative initial state will be used in the evaluation of data-assimilative model runs, both very short forecasts, which allow evaluation just prior to the update with new data, and longer forecasts. In this way assimilated data can be used for more than measuring a fit to the assimilated data. This will be particularly effective for data that are non-repeating (e.g., hydrography) and for data with large temporal gaps (e.g., satellite altimetry).

In addition, the model-data comparisons will be designed to meet the following goals:

1) assessing the accuracy of critical measurable fields such as SST and SSH;
2) assessing the model climatology, including means, variability statistics, and spectral properties;
3) assessing model ability to represent the evolution of observed interannual, seasonal, and intraseasonal variability in both atmospherically-forced and data-assimilative model runs;
4) assessing the dynamical performance of the model in different regions and dynamical regimes, such as a) the Gulf Stream (including ring generation and the nonlinear recirculation gyres), b) the tropical Atlantic EUC and upwelling regimes, c) the North Brazil Current retroflection and ring generation, d) variability and passage transports in the IAS, e) the strengths, pathways, and variability of upper ocean and deep current systems and of the meridional overturning, f) water mass formation in the subpolar North Atlantic (including seasonal variation in MLD and overflows from the GIN Seas and the Mediterranean Sea), and g) impacts of bottom topography on the upper ocean circulation and upper ocean and abyssal ocean interaction in general for the various dynamical regimes;
5) assessing model deterministic vs. non-deterministic responses to atmospheric forcing, including wave propagation, regional variations, and variations as a function of time scale;
6) assessing data assimilation skill using both real and model-simulated data, including fit to the data, verification against independent data, verification of error estimates from the assimilation scheme, the ability to constrain different dynamical regimes, impact on model forecast skill, and sensitivity to model resolution; and
7) assessing model forecast skill in various dynamical regimes, including sensitivity to atmospheric forcing, initial state accuracy/data input, and model resolution.

There will also be some assessment of the accuracy of the atmospheric forcing and the atmospheric forcing methodology, both directly and by assessing model sensitivity.
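A minimal sketch of the kind of 1st- and 2nd-order point statistics referred to above is given below; the specific metrics, variable names, and handling of matched model/observation pairs are illustrative assumptions, not the final evaluation suite.

    # Illustrative examples of 1st- and 2nd-order point-comparison statistics.
    # Metric names and the matched model/observation inputs are assumptions.
    import numpy as np

    def point_statistics(model, obs):
        """Bias, RMSE, and correlation for matched model/observation values."""
        model = np.asarray(model, dtype=float)
        obs = np.asarray(obs, dtype=float)
        diff = model - obs
        return {
            "bias": float(diff.mean()),                    # 1st order
            "rmse": float(np.sqrt((diff ** 2).mean())),    # 2nd order
            "correlation": float(np.corrcoef(model, obs)[0, 1]),
        }

    # e.g. compare an assimilative run and the no-assimilation benchmark:
    # point_statistics(ssh_assimilated_at_obs_points, ssh_observed)
    # point_statistics(ssh_benchmark_at_obs_points, ssh_observed)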
In performing the model-data comparisons, those that assess model/data assimilation skill in representing oceanic features will generally be favored over those that provide only statistics of point comparisons, although performing some of the latter is unavoidable. Much of what is needed to perform the climatological assessments is already in place via past participation in ONR's Data Assimilation and Model Evaluation Experiment - North Atlantic Basin (DAMEE-NAB) (Chassignet et al., 1999; Hurlburt and Hogan, 1999; Paiva et al., 1999a,b).

Data sets of particular interest in measuring model ability to represent the evolution of observed oceanic variability include SST, SSH or surface dynamic height, T&S as a function of depth, current velocities and transports, and oceanic frontal locations. For SST the primary data sets are a) IR SST from the Pathfinder Project (1985-1995), analyses by Mariano et al. (1985-pres.), plus recent IR data holdings by RSMAS and the U.S. Navy, b) time series of SST at buoys (globally at least 40 [7 in the Atlantic model domain] provide daily time series for multiple years over the time frame 1980-1998), and c) hydrographic profiles including PALACE floats. Satellite IR imagery is the primary data source for automated IR frontal analysis (Cayula and Cornillon, 1995). Ocean color imagery could also be used, but that is not under consideration for at least the first two years.

Sea surface height will be obtained from satellite altimetry as corrected by the NRL group (TOPEX/POSEIDON and ERS-2 so far), tide gauge time series (especially useful for HYCOM evaluation over shelf regions), and IES/bottom pressure gauge time series. Surface dynamic height is obtained from hydrography, especially CTDs (e.g., WOCE) and PALACE float data. Most of this data is already in hand. The hydrographic database will also be used for evaluation of HYCOM subsurface temperature, salinity, and mixed layer depth. Current velocities will be obtained from limited current meter data, surface and near-surface (WOCE) drifter data, and SOFAR/RAFOS floats. Current transports will be obtained from combinations of the preceding data (most already calculated) and from calibrated cable data across the Florida Straits.

Implementation: Hurlburt, Mariano, Thacker, Chassignet

APPENDIX: Performance metrics for GODAE

1. Operational utility measures
2. Model climatology, mean and variability statistics, including spectral character
3. Dynamical performance
4. Forecast skill
5. Fit to assimilated data, including error statistics of the assimilation
6. Discrepancy between the model and data just prior to assimilation
7. Comparison to unassimilated data
   a. Same quantity (field)
      i. Same data type
      ii. Different data type
   b. Ability to constrain a sparsely observed field or a field for which the data are not assimilated
8. Real-time metrics vs. metrics applied to applications using historical data
9. Metrics measuring the ability to represent oceanic features vs. point by point statistics
10. Observing system assessment vs. data-assimilative model assessment
11. Assessment of data impact, atmospheric and oceanographic