\majorprofPhilip Smith
\firstreaderRob Stoll
\secondreaderJames Sutherland
\thirdreaderJeremy Thornock
\fourthreaderKevin Whitty
\deptchairJoAnn Lighty
\graddeanCharles Wright
\deantitleDean of Graduate School
\degreeDoctor of Philosophy
\monthAugust
\year2011
\submitdateAugust 2011
\frontmatterformat
AN INSTRUMENTALIST APPROACH TO VALIDATION: A QUANTITATIVE ASSESSMENT OF A NOVEL COAL GASIFICATION MODEL
Charles Martin Reid
\copyrightpage
\committeeapproval
\finalreadingapproval
The explosion in computing power and its application to complex multiphysics problems has led to the emergence of computer simulation as a new way of extending the inductive methods of science. Many fields, particularly combustion, have been greatly changed by the ability of simulation to explore in great detail the implications of theories. But problems have also arisen; a philosophical foundation for establishing belief in simulation predictions, particularly important for complex multiphysics systems where experimental data is sparse, is sorely lacking. Toward the end of establishing such a foundation, a comprehensive philosophical approach to model validation, called instrumentalism, is proposed.
A framework for verification and validation/uncertainty quantification (V&V/UQ) of codes is presented in detail, and is applied to a novel entrained flow coal gasification model implemented in the massively parallel simulation tool Arches. The V&V/UQ process begins at the mathematical model. The novel coal gasification model, which utilizes the direct quadrature method of moments (DQMOM) for the solid phase and large eddy simulation (LES) for the gas phase and accounts for coupling between the gas and solid phases, is described in detail. A verification methodology is presented in the larger context of validation and uncertainty quantification, and applied to the Arches coal gasification model.
A six-step validation framework is adopted from the literature and applied to the validation of the Arches gasification model. One important aspect of this framework is model reduction, creating surrogate models for complex and expensive multiphysics simulators. A procedure for constructing surrogate response surface models is applied to the Arches gasification model, with several statistical analysis techniques used to determine the goodness of fit of the coal gasification response surface.
This response surface is then analyzed using two methods: the Data Collaboration methodology, an approach from the literature; and a Monte Carlo analysis of the response surface. These analyses elucidate regions of parameter space where the simulation tool makes valid predictions. The Monte Carlo analysis also yields probabilities of simulation validity, given input parameter values. These probabilities are used to construct a prediction interval, which can then be used to compute the probability of a valid simulation prediction.
\endabstract
\body
\parindent 2em \parskip 0pt
1 INTRODUCTION
The basic tool for the manipulation of reality is the manipulation of words.
― Philip K. Dick
1.1 Model Validation
The evolution of the field of simulation has been occurring at an exponential pace, largely following Moore’s Law. The proliferation of simulation as a methodology for exploring theories and their implications has been just as rapid. There is a great deal of optimism among the scientific community about the potential of simulation to change the face of science
[25]. However, it has, like many fields of science at some point in their history, reached a point where a crisis of faith is nearly inevitable: many concepts inherent in constructing computational models and quantitatively determining levels of belief in their results have yet to be firmly established. There are many epistemological problems with computer simulations that are either ignored or are implicitly answered incorrectly, and without addressing these questions, simulation cannot mature as a science.
Other fields, such as mathematics, have experienced a similar cycle: the advent of a tool (e.g. the calculus); its widespread use and a growing feeling that it is capable of providing answers to almost any question; a crisis of faith precipitated by epistemological questions (e.g., “Do we actually know what a differential is?” “How can this concept be applied successfully if it isn’t even defined rigorously?”); and a subsequent improvement of the tool through increased rigor and better definitions. While some are addressing the epistemological questions of simulation
[41, 128, 40, 180], most are ignoring them and treating simulations as magic answer boxes (more magical than an ordinary black box). The rush to publish often trumps the need to find answers to philosophical questions.
Validation is central to all of these epistemological questions. By addressing validation, it will be possible to advance simulation science beyond its current capabilities, not just by bringing more power to bear on problems, but by developing a consistent approach to how one determines when a model is true and how much belief one should place in model predictions.
1.2 Coal Gasification
1.2.1 Motivation
Coal is an abundant and increasingly important source for domestic energy production in the United States; the Energy Information Administration estimates that 28% of the world’s coal is located in the United States, more coal than is found in Russia, China, or India
[5]. Electrical power from coal accounts for 42% of the world’s electricity
[5], and 51% of the United States’ electricity
[1]. However, while coal is abundant and ubiquitous, it is a major source of pollution: although 51% of electricity in the U.S. comes from coal, CO2 emissions from coal accounted for 80% of CO2 emissions from electrical utilities [3]. Coal is also a source of black carbon, another contributor to global warming, as well as heavy metal compounds like mercury. Cleaner and more efficient utilization of coal by utilities is therefore critical. Many ideas for CO2 separation or mitigation have been proposed, ranging from gasification and chemical looping to oxy-fuel combustion and underground thermal treatment of coal.
Gasification of coal offers a versatile and clean method for converting coal into gaseous fuel. In gasification, the solid fuel (coal) is oxidized in a fuel-rich environment at a high temperature and pressure. Under these conditions, the coal is broken apart into a gaseous mixture of CO and H2, which compose syngas fuel, the primary product of coal gasification, along with other products such as CO2 and H2O. In addition to producing combustible gaseous fuel, coal gasifiers are also more efficient than traditional coal-fired boilers, both in thermal conversion of energy and in power cycle design. Additionally, gasification provides a method for converting fossil fuels to chemical feedstocks such as ammonia or methanol.
However, coal gasification is still poorly understood. Coal is an extremely complex fuel, and the physical processes occurring in a gasifier span enormous length and time scales and involve large amounts of energy. Comprehensive models describing coal gasification must account for a large number of coupled physical processes. In order to attain better understanding of gasification for the design and retrofit of applied-scale gasifiers, simulation tools that can handle these complex systems must be developed and the accuracy of their predictions quantified. For this reason, computer simulation has the potential to offer much-needed insight into coal gasification and offer a predictive capability to industry. Development of large-scale computational models and assessment of their predictive capabilities is a critical step in this process.
1.2.2 Coal Gasification and Combustion Models
Coal has been utilized as an energy source by humans for centuries. Despite this, coal combustion and gasification are not well-understood problems, and it is likely that coal reserves will run out before they become so. The common treatment of coal combustion and gasification is through empiricism; descriptions of the fundamental physical mechanisms driving coal processes have received attention only recently.
The existing body of literature related to coal utilization is substantial, in part because of the many facets of the problem. Anderson et al.
[4] compiled an extensive amount of information addressing characterization and utilization of coal, but did not address modeling of coal systems. Smoot and Pratt
[160] gave an overview of the major physical processes governing coal combustion and gasification, and included some mathematical models describing these processes. Several researchers have compiled these mathematical models into comprehensive computer models. Smoot and Smith
[159] provided an extensive review of such modeling strategies, and implemented them to create a computer model for coal combustion and gasification, PCGC-2 and -3
[157, 69]. All of these references are widely cited and have formed an established starting point for much research in the coal community.
Comprehensive coal models must address a complex multi-physics problem by incorporating a multitude of sub-models. Additionally, there are a large number of controlling physical and chemical mechanisms in coal gasification
[158, 159]. These varied physical processes include gas-phase turbulent mixing
[171, 134, 49], turbulent particle mixing
[14, 44, 99], convective and radiative heat transfer (both from the gas and from the particles)
[44, 115], coal devolatilization
[13, 93, 177, 8, 64], and heterogeneous char oxidation
[155, 179, 42, 116].
The primary emphasis in this work is on the implementation of a novel combination of the direct quadrature method of moments (DQMOM) with large eddy simulation (LES) to simulate coal gasification. This implementation was performed in a massively parallel simulation tool called Arches and includes physical models for the dispersed coal phase, gas phase turbulence and combustion models, and coupling between the two phases. The mathematical formulation is covered in great detail in Chapter
2↓ and in the Appendices.
1.3 A Need for Epistemology
Simulation science is one of the newest branches of science to emerge. This branch of science applies computers to the numerical solution of mathematical models, composed of systems of equations, to create representations of reality. Scientific understanding of the world around us is bounding forward, leading to increasingly complex models accounting for more physical phenomena and incorporating more mathematics. This forward progress is matched by a tremendous increase in capability of computational hardware, as well as a corresponding increase in the complexity and scale of scientific software. While the numerical methods underlying computational implementations of mathematical models have been around much longer than computers, the scale at which these methods can be applied has increased by many orders of magnitude, opening up entirely new domains of application.
But despite the rapid growth and drastic changes to science that simulation has brought, simulation science has not reached a stage of maturity enjoyed by other, more established fields. The question of how to quantify how well a computer model matches reality (or even how well a computer model matches its corresponding mathematical formulation) is still being debated. While some progressive scientific journals have adopted policies that take modest steps forward, and while some authors have made urgent and long-standing calls for standards, a consistent epistemology for simulation is still lacking.
Loosely defined, epistemology is the study of knowledge. It poses the questions: When do we consider simulation results true? Why do we believe simulation results? How do we justify our belief in simulation results? For mature fields, such as mathematics or scientific experimentation, a consistent epistemology has already been established, and is well developed as a result of decades or centuries of debate. Such fields have experienced crises of faith precipitated by epistemological questions; these crises lead to debate, a proliferation of new methods, and a general strengthening of the field’s foundations.
A debate of the epistemic foundations of mathematics has been ongoing for over a century
[114]. Bertrand Russell, who, along with Alfred North Whitehead, attempted to construct a consistent epistemology for mathematics, famously said that “mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true”
[145]. Addressing such questions almost never results in more certainty. But the result of confronting such difficult epistemological questions is the strengthening of the scientific field in question and the development of new methods to address or account for these uncertainties.
1.3.1 The Role of Probability
Scientific experimentation has also confronted epistemological questions, which has led to the understanding of experimental error and bias, and has contributed immeasurably to probability theory. Probability provides a language in which to couch experimental observations and their associated uncertainties, a significant acknowledgement that experimental measurements do not exactly measure truth, but rather make statements or put conditions on truth. Concepts of probability underlying quantification of experimental uncertainty can be traced as far back as Jacob Bernoulli in the 17th century, who presented one of the first mathematical approaches to measurement of uncertainty
[16]. The application of probability to astronomical measurements was the topic of a letter from Thomas Simpson, read to the Royal Society, entitled “On the Advantage of Taking the Mean of a Number of Observations, in Practical Astronomy,” which began:
My Lord,
It is well known to your Lordship, that the method practiced by astronomers, in order to diminish the errors arising from the imperfections of instruments, and of the organs of sense, by taking the Mean of several observations, has not been so generally received, but that some persons, of considerable note, have been of opinion, and even publicly maintained, that one single observation, taken with due care, was as much to be relied on as the Mean of a great number. [154]
Probability theory provides the language and the tools needed to address epistemological questions. Epistemology of simulation is clearly not black-and-white: simulations must predict many quantities, some of which are field values, vectors, or tensors; different simulations are expected to match experimental data to varying levels (i.e. the cheaper the model, the less agreement is expected); there are varying levels of confidence in the model predictions of experimental data; and the goal of simulations is typically to make predictions about a real system for which there is no data. Each of these challenges is distinct, but probability theory provides mechanisms for dealing with all of them in a quantitative and objective way.
As an example of how probability can contribute to a consistent system of epistemology, Bayesian statistics can be used to formally incorporate evidence to determine its impact on hypotheses. Bayes’ theorem is defined as:
\[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \]
where A and B are events (these can be thought of as hypotheses and data, respectively), P(A) and P(B) are the prior probabilities of A and B, and P(A∣B) is the conditional probability of event A given event B, and vice-versa for P(B∣A). This can be applied, for example, to a case of confirming a hypothesis H1, given data observed from an experiment E1, when there is a set of alternative hypotheses Hi, i = 2…N. Bayes’ theorem can be applied to determine P(H1∣E1), the probability of hypothesis H1 being correct conditioned on the experimental data E1. Applying the theorem yields:
\[ P(H_1 \mid E_1) = \frac{P(E_1 \mid H_1)\, P(H_1)}{P(E_1)} \]
but P(E1) can be expressed as:
\[ P(E_1) = \sum_{i=1}^{N} P(E_1 \mid H_i)\, P(H_i) \]
(this assumes that all possible hypotheses explaining the experiment have been proposed). This makes P(H1∣E1):
\[ P(H_1 \mid E_1) = \frac{P(E_1 \mid H_1)\, P(H_1)}{\sum_{i=1}^{N} P(E_1 \mid H_i)\, P(H_i)} \]
Next, if new experimental data E2 is gathered that disproves a hypothesis H2, then P(H2∣E2) = 0. In this case, the probability of H1 conditioned on the accumulated evidence can be re-expressed as:
\[ P(H_1 \mid E_1 \cap E_2) = \frac{P(E_1 \cap E_2 \mid H_1)\, P(H_1)}{P(E_1 \cap E_2 \mid H_1)\, P(H_1) + \sum_{i=3}^{N} P(E_1 \cap E_2 \mid H_i)\, P(H_i)} \]
This can be continued, with additional experimental data E3 gathered that disproves hypothesis H3, and so on, until eventually all probabilities P(E1∩⋯∩EN∣Hi), i = 2…N, are zero. This makes the probability of the hypothesis:
\[ P(H_1 \mid E_1 \cap \cdots \cap E_N) = \frac{P(E_1 \cap \cdots \cap E_N \mid H_1)\, P(H_1)}{P(E_1 \cap \cdots \cap E_N \mid H_1)\, P(H_1) + \delta}, \qquad \delta = \sum_{i=2}^{N} P(E_1 \cap \cdots \cap E_N \mid H_i)\, P(H_i), \]
so that, for δ = 0, the hypothesis is proven. The condition of δ = 0 rests on the assumption that all possible hypotheses explaining the experimental data have been proposed; if there is an alternative hypothesis that explains the experimental data E1∩⋯∩EN, then δ ≠ 0 and the probability of H1 is no longer 1. Thus, a Bayesian statistical framework for epistemology allows for justification of inductive logic, and quantitative adjustment of justification for beliefs based on new data.
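As a concrete numerical illustration of this updating process, consider the following minimal sketch; the hypothesis names, prior values, and likelihood values are purely illustrative assumptions, not taken from any experiment.
\begin{verbatim}
# Minimal sketch of Bayesian elimination of hypotheses (illustrative values only).
priors = {"H1": 0.25, "H2": 0.25, "H3": 0.25, "H4": 0.25}

# P(E | Hi) for the accumulated evidence E = E1 and E2 and ...;
# a zero entry means the evidence has disproved that hypothesis.
likelihoods = {"H1": 0.6, "H2": 0.0, "H3": 0.0, "H4": 0.0}

# Bayes' theorem: P(Hi | E) = P(E | Hi) P(Hi) / sum_j P(E | Hj) P(Hj)
evidence = sum(likelihoods[h] * priors[h] for h in priors)
posteriors = {h: likelihoods[h] * priors[h] / evidence for h in priors}

print(posteriors)  # P(H1 | E) -> 1.0 once all alternatives are ruled out (delta = 0)
\end{verbatim}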
Sherlock Holmes, a famous Bayesian, stated: “Most people, if you describe a train of events to them, will tell you what the result would be. They can put those events together in their minds, and argue from them that something will come to pass. There are few people, however, who, if you told them a result, would be able to evolve from their own inner consciousness what the steps were that led to that result. This power is what I mean when I talk of reasoning backward, or analytically”
[37]. Probability provides a language for logical justification of such inductive logic. Induction allows statements to be made about risky or uncertain outcomes, and as such, it is an entirely appropriate language for uncertain systems and for making predictions under uncertain conditions. As the famous theoretical physicist Wolfgang Pauli put it, “the inductive inferences of the natural sciences are always probability inferences”
[129].
1.3.2 The Validation Casino
Like gamblers at the roulette table, scientists are making decisions about the outcome of uncertain systems, with some wager on the line (often this is human health, or the safety or efficiency of a particular system). The difference, however, is that scientists are continually forced to make such wagers, while the gambler can walk away from the betting table at any time. This analogy can be used to demonstrate the importance of answering philosophical questions and using tools such as probability, as well as the fallacy of ignoring them.
The initial wave of euphoria that has resulted from the advent and application of simulation is much like the wave of euphoria of a first-time gambler who wins big. Finally, he or she has found a magical way to double or triple their cash. With a streak of successes, the euphoria becomes greater, and the gambler wagers more money. But eventually, this ideal system comes crashing down around the gambler. Betting more and more, he or she begins to lose big. This causes a loss of faith in the magical money-making scheme. However, after some time, and much experience, the gambler understands that the system poses rewards as well as pitfalls and traps, and begins to understand these rewards and pitfalls. The gambler is then able to develop a betting system to get around the pitfalls and collect the rewards.
Mother Nature runs a crooked game. But as gamblers know, a crooked game provides opportunities for betting systems. With enough observations, gamblers can begin to develop models of the roulette wheel, starting with simple models (“The roulette wheel lands on black 45% of the time and red 55% of the time”), and progressing to more complex models (“The probability of the roulette wheel landing on the number N is given by the following formula...”). Posing philosophical questions is an important part of this process, too: how much does one trust the roulette wheel model? Such questions address levels of belief in models. Implicit in this question is, how much is one willing to bet that the model of the roulette wheel is correct? This question addresses the values that are held by the decision-maker.
1.4 Parlance
1.4.1 What Is Truth?
When Pontius Pilate asked the question, “What is truth?,” he was looking for an explanatory answer. However, the question will be answered here with a mere definition. Truth can be divided into rational truth and empirical truth. Rational truth, which may also be called mathematical truth, is logic-based, while empirical truth arises from observations of physical reality. This distinction between rational and empirical truth has been made by many philosophers, including Immanuel Kant (who referred to rational truth statements as “analytic
a priori” statements and empirical truth statements as “synthetic
a posteriori” statements)
[71], David Hume (who referred to them as “relations of ideas” and “matters of fact,” respectively)
[70], and Alfred Ayer (who divided these truths into analytic statements and empirically verifiable statements, respectively)
[11]. This distinction between rational truth and empirical truth is of utmost importance for the purpose of verification and validation/uncertainty quantification (V&V/UQ). Verification operates in the realm of rational truth (also called mathematical truth), while validation operates in the realm of empirical truth.
1.4.2 What Is Reality?
Empirical truth may be loosely defined as what is really “out there,” outside of ourselves. Rationalism and realism are philosophies that presume that meaningful, objective statements can be made about this reality, independent of ourselves. This equates reality to empirical truth. Phenomenalists and empiricists, on the other hand, hold that statements about reality are statements about subjective reality, that the only “out there” is “in here,” and that we cannot make meaningful objective statements independent of ourselves because we cannot have knowledge beyond ourselves. Reality, they state, is what we perceive, and nothing more: reality is not empirical truth, it is empirical observation. Such questions may seem pedantic, but it will be shown in Chapter
4↓ and elsewhere that the answers to such questions have strong implications for the validation process and how one judges a model.
1.4.3 What Is a Model?
There is a common perception, both in and outside of science, that the universe is governed by certain “laws,” many of which have been “found” and can be expressed in mathematical form (e.g., Newton’s Laws of Motion). However, such laws are not laws at all; they are, in fact, models. Models are simplified descriptions of reality. Various levels of detail in the model description are possible, ranging from mental rules of thumb to multi-scale, multi-phase mathematical models. But models should never be mistaken for empirical truth.
Let there be no misunderstanding: models must follow the laws of reality, not the other way around. Even such fundamental equations as governing equations, e.g., the continuity equation or the Navier-Stokes equation, are merely models. And, as George Box stated, “Essentially, all models are wrong, but some are useful”
[17].
1.4.4 Error vs. Uncertainty
It is important to distinguish between the use of the terms “error” and “uncertainty.” Error refers to the deviation of a measured or calculated quantity from the truth, whether mathematical or empirical truth. For an equation with an exact, analytical solution, this is straightforward to calculate, because the true value of the solution can be evaluated with arbitrary accuracy. Error can be defined as:
e = y − ŷ
where
e is the error,
y is the true value of a quantity (true in either the mathematical or empirical sense), and
ŷ is the approximation of
y (measured or computed). Mathematical error refers to an error
e for which
y is a mathematically true quantity and
ŷ is a model, or computed, value, whereas empirical error refers to an error for which
y is an empirically true quantity and
ŷ is an experimental measurement.
Uncertainty, in contrast, is used when the truth cannot be calculated or measured, and so the error must be approximated. Uncertainty is an interval which is believed to bound the truth (or the error, a quantity defined by truth), with some level of belief. In the case of experimental measurements, this level of belief (also called a confidence level) is used to construct information about reality. A full statement of uncertainty thus consists of an interval that bounds the error,
U : l ≤ e ≤ u
together with a level of belief B in those bounds, where l and u are the lower and upper uncertainty bounds, respectively, and B is the level of belief that the error e is bounded by l and u. Like error, uncertainty can be mathematical uncertainty (treated as synonymous with numerical uncertainty for the purposes of this work), which is a bound on mathematical error, or empirical uncertainty, which is a bound on empirical error.
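If the level of belief is interpreted probabilistically (one possible formalization, adopted here only as an assumption for illustration), a full uncertainty statement can be written compactly as:
\[ U:\; l \leq e \leq u, \qquad P(l \leq e \leq u) = B, \]
so that B expresses the strength of the belief that the interval [l, u] contains the error e.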
1.4.5 Reliability
As mentioned in the Validation Casino allegory, making risky decisions based on model predictions involves two questions, one explicit and one implicit. The explicit question is, “When can the model be trusted?” The implicit question is, “How much should be wagered on the outcome of the model?” A level of belief or reliability in a quantity may be established to help answer these questions about predictivity. This important concept of level of belief is addressed in Chapter
6↓, where a quantitative level of belief is given for the validity of a model.
1.5 Dissertation Roadmap
The dissertation consists of several pieces. Chapter
2↓ gives a detailed mathematical description of the coal gasification model, including the large eddy simulation (LES) gas phase turbulence model, the direct quadrature method of moments (DQMOM) dispersed phase model used to model the coal particles, and the relevant physical models for the coal particles. Several derivations relevant to equations given in Chapter
2↓ are covered in Appendices
10↓,
11↓,
12↓, and
13↓.
Chapter
3↓ begins the coverage of the verification and validation/uncertainty quantification (V&V/UQ) procedure by addressing the verification methodology used for the Arches coal gasification model. The numerical error and uncertainty in the gasification model are quantified, and a discussion of verification in the larger context of V&V/UQ is given.
Chapter
4↓ covers the overall post-verification validation process, whereby the agreement of a model with experimental data is quantified. A framework is adopted from the literature and applied to validation of the Arches coal gasification model.
Chapter
5↓ covers one of the steps of the validation model that is particularly critical for expensive computational models such as Arches, which is creation of surrogate models for use in the validation procedure. Response surfaces for each coal gasification system response are constructed, and a detailed statistical analysis is performed to quantify goodness of fit of the response surfaces.
Chapter
6↓ utilizes these response surfaces in the validation analysis. Two validation methodologies are used, the Data Collaboration approach (a validation methodology from the literature) and a Monte Carlo sampling approach. These methodologies are used to explore the characteristics of the response surfaces and determine where in parameter space the Arches gasification model makes valid predictions. The Monte Carlo results are then used to construct a prediction interval, which is a prediction of the probability of a model response being valid.
2 COAL GASIFICATION MODEL FORMULATION
Our present analytical methods seem unsuitable for the solution of the important problems arising in connection with nonlinear partial differential equations and, in fact, with virtually all types of nonlinear problems in pure mathematics. The truth of this statement is particularly striking in the field of fluid dynamics...
― John von Neumann
This chapter establishes the mathematical bedrock in which the Arches coal gasification model is anchored. The chapter begins with a description of the governing equations of the multiphase coal gasification system being modeled. This begins with large eddy simulation (LES), which filters the governing equations to remove small-scale, high-frequency turbulent motions. The LES governing equations implemented in the Arches model are described (Section
2.1↓). Next, a detailed description of the coal particle is given, starting with the single particle probability density function (PDF), which describes the probability of a single particle having certain independent variable, or internal coordinate, values, such as temperature or composition (Section
2.2.1↓). This single particle PDF can be extended to describe all particles in a system, which is the particle number density function (NDF). The NDF describes the number of particles in a population having certain internal coordinate values. The transport equation for the number density function is a central equation in the multiphase direct quadrature method of moments (DQMOM) and its implementation in the LES coal gasification model (Sections
2.2.2↓,
2.2.3↓, and
2.2.4↓). The direct quadrature method of moments provides a method for tracking the transport and evolution of the particle NDF. The governing equations for quantities pertinent to coal systems (particularly the coal gas mixture fraction) are also given. The equations describing the solid phase reactions, physics, and chemistry are also described.
Next, the discretization of the equations describing the solid phase coal is described, and the relevant DQMOM equations are given. This provides a solid phase flow description to supplement the solid phase physics description. These two descriptions of the solid phase complement each other; this relationship is also described.
Finally, the Arches computational LES tool, which is the model that is extended to simulate coal gasification using DQMOM-LES, is briefly described.
2.1 Large Eddy Simulation Equations
A dispersed phase model was implemented in a large eddy simulation turbulence code. However, in order to cover the implementation of any dispersed-phase model, the implementation of the gas phase turbulence model must first be covered, as different turbulence modeling methodologies resolve and model flow field quantities very differently. Turbulence models can generally be classified into three groups: direct numerical simulation (DNS) models, Reynolds-averaged Navier Stokes (RANS) equation models, and large eddy simulation (LES) models
[134].
DNS resolves all relevant length and time scales of turbulence, covering multiple orders of magnitude in length scales, and thereby minimizes the dependence of the results on the models used. DNS also utilizes high-order numerical methods to minimize the impact of numerical error and uncertainty on the simulation results. However, it is severely limited in its range of applicability due to the extremely high cost of resolving such a large range of length scales and including high-fidelity physical submodels.
RANS models, which solve a time-averaged governing equation, offer an alternative that is computationally tractable for realistic large-scale problems with complex geometries. However, the tradeoff is that RANS does not resolve any length or time scales of the flow; all effects of turbulence on the flow field are smeared out by a time-averaging process, and are replaced with models. Because of its computational feasibility, it has become ubiquitous in the computational fluid dynamics (CFD) community.
Large eddy simulation
[150] provides a middle ground between RANS and DNS. Given that only 0.02% of scales are large and energy-containing
[134], LES resolves only these large scales, and models small scales. This approach is based on the assumption that the fluid is locally isotropic below a certain scale (the Kolmogorov hypothesis
[94]). This procedure is done using a low-pass filter kernel, where the smallest resolved scale is the filter width. Models for scales smaller than the filter width are denoted sub-filter scale (SFS) models.
The large eddy simulation equations in the computational LES tool are implemented in a finite volume formulation. First, the mass balance may be written:
where
ρ is density,
V is a control volume,
u is the fluid velocity vector, and
Sρ is a mass source term (0 in most cases, but not when there is a phase change in the system).
Next, applying a box filter (following Pope’s definition
[134]),
and a Favre filter following
[46],
the filtered continuity equation becomes
(Note that because velocities are solved on a staggered mesh, the treatment of the velocities is slightly more complex than presented here, because they are face-filtered quantities; the reader is referred to
[163] for further details.) No unclosed turbulent subgrid term appears in the filtered continuity equation due to the Favre filtering definition
(3.3↑).
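Written in generic differential form (a sketch, not the exact finite volume form implemented in Arches, which is detailed in [163]), the box filter, the Favre filter, and the resulting filtered continuity equation take the standard forms:
\[ \bar{\phi}(\mathbf{x},t) = \int \phi(\mathbf{x}',t)\, G(\mathbf{x}-\mathbf{x}')\, d\mathbf{x}', \qquad \tilde{\phi} = \frac{\overline{\rho\phi}}{\bar{\rho}}, \]
\[ \frac{\partial \bar{\rho}}{\partial t} + \frac{\partial (\bar{\rho}\,\tilde{u}_j)}{\partial x_j} = \overline{S_\rho}, \]
where G is a top-hat (box) kernel whose support is the filter width.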
The same operations may be performed on the momentum equation, where the quantity τ is the deviatoric stress tensor,
\[ \tau_{ij} = 2\mu S_{ij} - \frac{2}{3}\mu \frac{\partial u_k}{\partial x_k}\,\delta_{ij}, \]
Sij is the strain-rate tensor,
\[ S_{ij} = \frac{1}{2}\left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right), \]
and the second term in τij, the trace term, may be incorporated into the pressure term ∇p and computed as part of a pressure projection algorithm, as is done in Arches [163]. The source term Sρu is a momentum source term that accounts for momentum transfer from other phases. Applying filtering yields the LES momentum equation.
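In generic differential form (a sketch, not the staggered-mesh discrete form solved in Arches), the filtered momentum equation can be written as:
\[ \frac{\partial (\bar{\rho}\,\tilde{u}_i)}{\partial t} + \frac{\partial (\bar{\rho}\,\tilde{u}_i \tilde{u}_j)}{\partial x_j} = -\frac{\partial \bar{p}}{\partial x_i} + \frac{\partial}{\partial x_j}\!\left( \bar{\tau}_{ij} - \tau_{ij}^{\mathrm{SFS}} \right) + \overline{S_{\rho u_i}}, \qquad \tau_{ij}^{\mathrm{SFS}} = \bar{\rho}\left( \widetilde{u_i u_j} - \tilde{u}_i \tilde{u}_j \right), \]
where the sub-filter scale stress must be supplied by an SFS model.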
The energy balance is given by:
where
Sh is an enthalpy source term from another phase; when filtered, this becomes:
where $q_h^{\mathrm{SGS}}$ is the subgrid enthalpy dissipation containing the unresolved effects of turbulence on the enthalpy.
Finally, the mixture fraction equation, given by
where
D is the diffusivity and
Sf is a mixture fraction source term, can be filtered, yielding the filtered mixture fraction equation:
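In the same generic form (a sketch under the same caveat as above), the mixture fraction equation and its filtered counterpart can be written as:
\[ \frac{\partial (\rho f)}{\partial t} + \frac{\partial (\rho u_j f)}{\partial x_j} = \frac{\partial}{\partial x_j}\!\left( \rho D \frac{\partial f}{\partial x_j} \right) + S_f, \]
\[ \frac{\partial (\bar{\rho}\,\tilde{f})}{\partial t} + \frac{\partial (\bar{\rho}\,\tilde{u}_j \tilde{f})}{\partial x_j} = \frac{\partial}{\partial x_j}\!\left( \bar{\rho} D \frac{\partial \tilde{f}}{\partial x_j} - q_f^{\mathrm{SFS}} \right) + \overline{S_f}, \qquad q_f^{\mathrm{SFS}} = \bar{\rho}\left( \widetilde{u_j f} - \tilde{u}_j \tilde{f} \right), \]
with the sub-filter scalar flux requiring a model.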
2.2 Coal Particle Equations
To begin, a single coal particle description will be established. From this single particle description, a description of a large population of coal particles will be derived. Transport equations describing the evolution of this population will be presented, and extra terms coming about due to the large eddy simulation filtering will be detailed.
Following Smoot and Smith
[159], a single coal particle can be characterized using several particle independent variables. These are denoted:
- Raw coal, αcj
- Char, αhj
- Particle size, dpj
- Ash (mineral matter), αaj
- Particle temperature, Tpj
- Particle velocity vector, upj
The above quantities use the subscript
j to denote the
jth particle. The variable
r can be used to denote reaction rates, so that
rhj would be the net char reaction rate for the
jth particle. Using this nomenclature, physical processes important to coal particles can be depicted using Figure
2.1↓. The raw coal can react to form gaseous volatile matter in devolatilization reactions (subscript
v) and solid char; the solid char can be oxidized to form more gaseous products. Water contained in the particle will evaporate and form steam. The ash mass is fixed, and ash is treated as inert.
2.2.1 Single-Particle Probability Density Function (PDF)
The single particle PDF is a starting point from which an approach for treating the entire solid phase can be formulated. At a particular location
(x, t) = (x0, t), the particle PDF is a joint velocity-scalar PDF (the velocity random variable vector
u denoting the particle velocity vector and the scalar random variable vector
ζ denoting the internal coordinate vector). The
Nξ-dimensional PDF (3 dimensions from the velocity sample space
v and
Nξ − 3 dimensions from the internal coordinate sample space
ξ) is defined following Section
8.3↓ and denoted as
puζ. The transport equation for
puζ can be written (following Section
8.3.2↓) as:
The quantities
⟨Ai∣v, ξ⟩ and
⟨Gi∣v, ξ⟩ are conditional quantities that describe the “velocity” of the PDF in the phase space
(v, ξ). That is,
Ai is defined by:
and because the right side will depend on
v and
ξ, the quantity is a distribution. The particular value of
Ai depends on
vand
ξ, and can be expressed as:
Likewise, Gi is defined by:
and is also a distribution, with a particular value of
Gi expressed as a conditional quantity, depending on the value of
v and
ξ:
These expressions are posed in the same form as most Lagrangian single-particle models, which are composed of ordinary differential equations for the internal coordinates of individual particles, with a large number of representative particles tracked in this way. Thus, Lagrangian single-particle models can also be utilized to supply these terms in Eulerian formulations, which use a fixed frame of reference.
It should also be noted that because the variable
vi represents the entire velocity sample space, the particular value of velocity in the transport equation
(3.11↑) is dependent on the value of
pu, ζ(v, ξ;x, t) and is a full distribution.
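In the standard form used for joint velocity-scalar PDFs, the transport equation for puζ referred to above can be sketched as:
\[ \frac{\partial p_{u\zeta}}{\partial t} + v_i \frac{\partial p_{u\zeta}}{\partial x_i} + \frac{\partial}{\partial v_i}\!\left( \langle A_i \mid \mathbf{v}, \boldsymbol{\xi} \rangle\, p_{u\zeta} \right) + \frac{\partial}{\partial \xi_j}\!\left( \langle G_j \mid \mathbf{v}, \boldsymbol{\xi} \rangle\, p_{u\zeta} \right) = 0, \]
where Ai = dui ⁄ dt is the particle acceleration and Gj = dξj ⁄ dt is the rate of change of the jth internal coordinate, both conditioned on the phase space location (v, ξ).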
2.2.2 Population PDF: Number Density Function (NDF)
The single-particle PDF can be applied to a population of particles, and when this is done, it is called the number density function (NDF). The number density function describes the number of particles as a function of spatial location and of the particle independent variables, called internal coordinates (for example, particle size, particle composition, etc.). This gives the number density function units of [# ⁄ (m3⋅ units of internal coordinates)]. The vector of internal coordinate random values is denoted by ζ, and the internal coordinate sample space is denoted by ξ. When the particle velocities are considered as internal coordinates, the random values are denoted by u, and the particle velocity sample space is denoted by v. In the case that the particle velocities are not considered as internal coordinates, an ensemble average velocity is used.
The full NDF as a function of internal coordinates, as well as space and time, is denoted f(v, ξ;x, t). At a fixed location in space and time (x, t) = (x0, t), the number of particles at that point in space and time is denoted np and is given by:
NDFs can be separated into two classes: univariate and multivariate. Univariate NDFs are only functions of one internal coordinate, so the internal coordinate sample space ξ is a single dimension:
f(ξ;x, t).
Multivariate NDFs, however, are functions of multiple internal coordinates, so the internal coordinate sample space has
Nξ dimensions:
or, including the velocity as an internal coordinate,
As indicated, the NDF applies to a population of particles, and arises from applying the single-particle PDF puζ to each particle in the population. The particle PDF denotes the probability of the velocity-scalar vector taking on a particular value. At a fixed point in space and time, (x0, t0), the PDF is related to the NDF:
Relationship
(3.16↑) can be used to re-express this as:
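Written out in the notation above (a sketch consistent with these definitions), the particle number density and the NDF-PDF relationship take the forms:
\[ n_p(\mathbf{x},t) = \int\!\!\int f(\mathbf{v}, \boldsymbol{\xi}; \mathbf{x}, t)\, d\mathbf{v}\, d\boldsymbol{\xi}, \qquad f(\mathbf{v}, \boldsymbol{\xi}; \mathbf{x}, t) = n_p(\mathbf{x},t)\, p_{u\zeta}(\mathbf{v}, \boldsymbol{\xi}; \mathbf{x}, t). \]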
2.2.3 NDF Transport Equation
The PDF transport equation, given by
can be multiplied by the function
np(x, t), and combined with a number balance equation so it commutes into each derivative, to yield an NDF transport equation:
where
h is a source term representing the birth and death of particles in the domain. This is zero for coal systems and will be ignored.
The NDF transport equation velocity vi, like the PDF transport equation velocity, represents the entire velocity variable sample space, so the particular value ui that it takes on depends on the distribution f(v, ξ;x, t).
2.2.4 Filtered NDF Transport Equation
The operations described above can be performed on the filtered PDF transport equation
(9.33↓) to yield the filtered NDF transport equation:
The subgrid scalar flux
τsgs, k represents flux of the number density as a result of unresolved turbulent velocity fluctuations
vk − \widetildevk. Likewise, the subgrid scalar fluxes
τsgs, uk and
τsgs, ζk both represent the subgrid flux of the number density in phase space
(v, ξ).
2.3 Method of Moments Discretization
In order to track a continuous distribution like the NDF using a scalar transport equation framework, it is necessary to discretize the NDF using a set of scalars. One set of statistically significant scalars that can be used to represent the NDF is its set of moments. Every distribution has a number of moments, with the kth moment of a univariate PDF p(ξ) of a random variable ξ being defined as:
This quantity can be interpreted physically as the expected value of
ξk, given its distribution
p(ξ). This can also be extended to the NDF by using equation
(3.19↑):
Note that these definitions can also be extended to multivariate distributions, in which case the moment is indexed by a multi-element index, k = {k1, k2, …, kNξ}; in this case, the kth moment is defined in terms of the PDF as:
and is defined in terms of the NDF as:
(where the velocity
v is incorporated into the internal coordinate vector
ξ for notational convenience).
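Written explicitly, and consistent with the definitions above, the univariate and multivariate moments take the forms:
\[ m_k = \int \xi^k\, p(\xi)\, d\xi, \qquad m_k(\mathbf{x},t) = \int \xi^k\, f(\xi;\mathbf{x},t)\, d\xi, \]
\[ m_{k_1 \cdots k_{N_\xi}}(\mathbf{x},t) = \int \xi_1^{k_1} \xi_2^{k_2} \cdots \xi_{N_\xi}^{k_{N_\xi}}\, f(\boldsymbol{\xi};\mathbf{x},t)\, d\boldsymbol{\xi}, \]
where the first form applies to the PDF, the second to the univariate NDF, and the third to the multivariate NDF (with the velocity incorporated into ξ).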
As discussed in Appendix B, the use of moments to represent distributions leads to a closure problem, because the transport equation for any given moment involves still higher-order moments. Thus, an arbitrary moment cannot be expressed only in terms of lower-order moments. There are several methods for circumventing the closure problem, most of which require an assumed form for the particle source terms or an assumed NDF shape. However, the heart of the problem is that the moments consist of integrals over the distribution, which is unknown. Gaussian quadrature provides an efficient way to approximate these integrals, providing closure for the moment transport equations. Using Gaussian quadrature, the integrals can be expressed in terms of weights and abscissas, which in turn can be expressed in terms of a finite set of lower-order moments.
2.3.1 Quadrature Approximation
Quadrature approximates the integral of a function, known only at a set of tabulated points, as a weighted sum over N abscissas. In Gaussian quadrature, the N abscissas are the zeros of the degree-N orthogonal polynomial associated with the weight function, and the integrand is effectively approximated by an interpolating polynomial of degree 2N − 1 [136]. There are several common quadrature formulations, including the midpoint rule (the unknown function is assumed to be a constant, or zero-order polynomial), the trapezoid rule (the unknown function is assumed to be a straight line, or first-order polynomial), and Simpson’s rule (the unknown function is assumed to be a second-order polynomial). Note that while the unknown function does not have to be a polynomial, the quadrature approximation becomes much better if it is (and exact if the unknown function is a polynomial of degree 2N − 1 or less). The general N-point quadrature formula can be written as:
\[ \int g(r)\, w(r)\, dr \approx \sum_{\alpha=1}^{N} w_\alpha\, g(r_\alpha) \]
where
g(r) is an arbitrary function of the variable
r. As
N increases, the quadrature approximation usually becomes more accurate. This equation can also be extended to a multivariate function
g(r), an arbitrary function of the
D-element vector
r = [r1, r2, …, rD] to yield:
The weights are common to all internal coordinates
r because the weight function
w(r) is binned into
N discrete weights, and this weight function is common to all internal coordinates.
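As a concrete numerical illustration of the quadrature idea (using Gauss-Legendre points and weights on [-1, 1] as a stand-in for a problem-specific weight function), a minimal sketch is:
\begin{verbatim}
import numpy as np

# N-point Gauss-Legendre quadrature on [-1, 1]:
#   integral of g(r) dr  ~=  sum_alpha  w_alpha * g(r_alpha)
# Exact when g is a polynomial of degree 2N - 1 or less.
N = 3
abscissas, weights = np.polynomial.legendre.leggauss(N)

def g(r):
    # A degree-5 polynomial: the 3-point rule integrates it exactly.
    return 3.0 * r**5 + r**2 + 1.0

approx = np.sum(weights * g(abscissas))
exact = 2.0 / 3.0 + 2.0  # the odd term integrates to zero over [-1, 1]
print(approx, exact)
\end{verbatim}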
2.3.2 The Quadrature Method of Moments
The implementation of the method of moments using quadrature to provide closure is called the quadrature method of moments (QMOM). QMOM breaks the moment integrals
(3.24↑) into a series of discrete weighted abscissas, and sums over all these weighted abscissas in order to evaluate the integral. QMOM provides closure for the method of moments because the weights and abscissas can be expressed in terms of lower-order moments of the NDF, eliminating the need to introduce successively higher order moments.
Applying equation
(3.28↑), the NDF can be treated as a weighting function. Using the quadrature formulation, the internal coordinate vector is binned into
N discrete values or phases (the abscissas of the quadrature approximation), and the NDF is binned into
N discrete weights. If the value of the NDF is small at a given quadrature node
α, the internal coordinate abscissa at that point in space and time
⟨ξ⟩α has a small corresponding weight
wα.
This section will focus only on univariate distributions, because QMOM can only treat univariate distributions (a limitation discussed below). Mathematically, the univariate NDF can be expressed as the weighted sum of a set of delta functions. The quadrature approximation of a univariate average number density function f(ξ;x, t) in this form is:
where
wα is the weight of phase
α. The moment transform of the quadrature approximation of this univariate NDF can then be taken (by multiplying by
ξk and integrating over all of
ξ-space):
Using QMOM, the equation for the
kth moment of the pdf
p(ξ;x, t), given by equation
(3.24↑), is approximated as:
where
pα is the probability of environment
α, and the corresponding equation for the
kth moment of the NDF
f(ξ;x, t), given by equation
(3.25↑), is approximated as:
meaning
pα is related to the weights as:
The quadrature approximation provides closure for the moments because the
N weights and
N abscissas can be written in terms of
2N moments.
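In summation form (a sketch consistent with the description above), the quadrature-approximated univariate NDF, its moments, and the relationship between the environment probabilities and the weights are:
\[ f(\xi;\mathbf{x},t) \approx \sum_{\alpha=1}^{N} w_\alpha\, \delta\!\left( \xi - \langle \xi \rangle_\alpha \right), \qquad m_k \approx \sum_{\alpha=1}^{N} p_\alpha \langle \xi \rangle_\alpha^{k} \;\;\text{(PDF)}, \]
\[ m_k \approx \sum_{\alpha=1}^{N} w_\alpha \langle \xi \rangle_\alpha^{k} \;\;\text{(NDF)}, \qquad p_\alpha = \frac{w_\alpha}{\sum_{\beta=1}^{N} w_\beta}, \]
so that the 2N moments m0, …, m2N−1 determine the N weights and N abscissas.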
The primary weakness of QMOM is that the transformation process of going from moments to weights and abscissas relies on the product difference (PD) algorithm
[61], which utilizes properties of univariate distributions, and can only be applied to univariate distributions. The QMOM cannot be arbitrarily extended to multivariate distributions. Some authors have extended QMOM to bivariate distributions by combining the PD algorithm with principal component analysis (PCA)
[190], or by using a conjugate gradient minimization algorithm for
(Nξ + 1)N dimensions (which quickly becomes computationally intractable for large numbers of quadrature points or internal coordinates)
[188]. However, these suffer from similar weaknesses as QMOM - they cannot be arbitrarily applied to multivariate distributions without a great increase in algorithm complexity, as well as computational cost.
2.3.3 The Direct Quadrature Method of Moments
Because QMOM cannot be easily applied to multivariate distributions, and because the coal particle NDF is multivariate, an alternative method is needed that will apply to an arbitrary number of internal coordinates. The direct quadrature method of moments (DQMOM) is a more general approach than QMOM and will satisfy these requirements. While QMOM tracks the moments themselves, and thus requires an inversion process to go from the moments to the corresponding weights and abscissas, DQMOM tracks the weights and abscissas directly, eliminating the inversion process that is so troublesome for multivariate distributions.
DQMOM Equations
The direct quadrature method of moments involves several steps to go from the multivariate coal particle NDF
f(ξ;x, t) to the set of transport equations used to track the NDF. First, the quadrature approximation is applied to the NDF, yielding a representation of the NDF using weights and abscissas (Section
2↓). This is then used to write the quadrature-approximated NDF transport equation. Next, the effect of the quadrature approximation on the NDF velocity is described, and a system of notation for the quadrature-approximated NDF velocity is presented. Next, the moment transform of the quadrature approximated NDF is taken in order to yield a set of independent moment transport equations. However, rather than solve these moment transport equations directly, they are re-expressed in the form of weight and weighted abscissa transport equations and a linear system that provides the source terms for these transport equations. The process of going from the NDF to the weight and weighted abscissa transport equations and the DQMOM linear system is demonstrated in its entirety in Appendix C. The derivation of the moment-transformed quadrature-approximated NDF transport equations are then covered in Appendix D. The construction of the linear system, which results from the moment-transformed quadrature-approximated NDF transport equations, is described in detail in Appendix E.
Quadrature-Approximated NDF
To begin a derivation of the DQMOM equations, the quadrature approximation is applied to a multivariate NDF, f(ξ;x, t), since the DQMOM can handle multivariate distributions. The multivariate NDF quadrature approximation is given by:
where, in the first definition (as with the multivariate NDF moment definition
(3.27↑)), the velocity
v is incorporated into the internal coordinate vector
ξ for notational convenience, and, as with the univariate NDF quadrature approximation
(3.30↑), both
wα and
⟨ξ⟩α depend on space and time, but the dependence is omitted for clarity of notation. This quadrature approximated NDF can be plugged into the NDF transport equation, but first the proper approach and notation for the quadrature approximated NDF velocity should be introduced.
Quadrature-Approximated NDF Velocity
The quadrature-approximated NDF is composed of several environments, indexed by α. Each environment consists of a number of particles, equivalent to the weight wα of the environment, each with a unique set of properties; ξ for the univariate NDF, and (v, ξ) for the multivariate NDF. The properties for the αth environment are denoted ⟨ξ⟩α or (⟨v⟩α, ⟨ξ⟩α), respectively. The environment-averaged velocity ⟨vi⟩α is:
where
aq is defined as an arbitrary weighting factor subject to the constraint
∑qaq = 1, and
ui, q is the value of velocity for the
qth particle of the
αth environment. Likewise, the environment-averaged internal coordinate values
⟨ξj⟩α are:
This averaging procedure can also be applied to find the environment-averaged velocities
Ai and
Gi:
Using these environment-averaged quantities leads to the convection terms being expressed somewhat differently; the spatial convection term for the NDF is expressed as:
while the velocity and phase space convection terms are expressed, respectively, as:
These can be used to apply the quadrature approximation to the NDF transport equation
(3.22↑). Likewise, for the univariate case, the convection term becomes:
Each of these environment averages will also have an associated diffusive flux term to account for fluxes due to velocities deviating from the environment-averaged velocities; these are defined for an arbitrary quantity
φ, for
vi,
Ai, and
Gi, respectively:
These diffusive fluxes create additional diffusive terms,
These diffusive fluxes can be modeled with simple gradient diffusion models as,
Quadrature-Approximated NDF Transport Equation
The quadrature approximated NDF transport equation can be derived for both the univariate and multivariate case by combining the NDF transport equation
(3.22↑) with the univariate and multivariate quadrature approximations
(3.25↑) and
(3.27↑), to yield the quadrature-approximated NDF transport equations. This procedure is performed in Appendix C. The resulting univariate quadrature-approximated NDF transport equation is given by a set of three equations:
where
Γx, α is the spatial diffusivity of the NDF, as in equation
(3.50↑), and
ςα = wα⟨ξ⟩α is the weighted abscissa for environment
α. The third equation is given by:
where
Cα is defined by
Sξ is the sum of the environment-averaged phase space convection terms, given by:
and Dξ the associated phase space diffusive terms, given by:
Similarly, the multivariate quadrature-approximated NDF transport equation is given by a set of equations:
where
ςnα = wα⟨ξn⟩α is the weighted abscissa for the
nth internal coordinate. The last equation in the set is given by:
where $C_{mn\alpha} = \Gamma_{x_i,\alpha}\, \frac{\partial \langle \xi_m \rangle_\alpha}{\partial x_i}\, \frac{\partial \langle \xi_n \rangle_\alpha}{\partial x_i}$, and Sξ and Dξ are the sums of the phase space convective and diffusive terms, respectively:
Moment-Transformed Quadrature-Approximated NDF Transport Equation
The quadrature-approximated NDF transport equation (equation
(3.55↑) for the univariate case, equation
(3.61↑) for the multivariate case) is a single equation, but there are multiple unknowns to be determined (weights and abscissas). In order to obtain a number of independent equations equal to the number of weights and abscissas, a set of independent moments are chosen, and the moment transform of the quadrature-approximated NDF transport equation yields a set of independent equations, equal in number to the number of independent moments. This procedure requires a number of moments equal to
2N in the univariate case (
N weights and
N abscissas), and
(Nξ + 1)N in the multivariate case (
N weights and
Nξ × N abscissas), where
Nξ is the number of internal coordinates and
N the number of DQMOM environments.
The process of taking the moment transform of the quadrature-approximated NDF is covered in detail in Appendix D. The results from this procedure are the univariate moment-transformed quadrature-approximated NDF transport equation:
and the multivariate moment-transformed quadrature-approximated NDF transport equation:
Both of these systems of equations are linear due to the quadrature approximation, and both can be rewritten as a matrix system, Ax = B. This linear system can be solved for the weight and weighted abscissa transport equation source terms, aα and bnα in the above equations. The procedure for constructing and solving this linear system is covered in great detail in Appendix E. Some special cases lead to simplified linear systems that are much easier to solve; these special cases are covered in Appendix F.
The DQMOM solution procedure is as follows (a schematic code sketch of the per-cell update follows this list):
1. For each internal coordinate i, the distribution is characterized by two sets of values, the weights wα and the weighted abscissas wα⟨ξi⟩α. The starting values for these variables are obtained from the previous time step or from the initial distribution (initial conditions for the weights and weighted abscissas).
2. Using the weights’ and weighted abscissas’ values, the matrix system Ax = B is solved at each point in space to yield the source terms for the transport equations for the weights and the weighted abscissas.
3. The weights and weighted abscissas are updated to their new values at the next time step.
4. These new values are used in step 2.
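The per-cell solve in step 2 and the update in step 3 can be sketched as below. The helper build_linear_system is a hypothetical placeholder for the construction of A and B from the chosen moment set (Appendix E), the ordering of the unknowns is an assumption, and a simple explicit update stands in for the full transport solve.
\begin{verbatim}
import numpy as np

def dqmom_cell_update(weights, weighted_abscissas, build_linear_system, dt):
    """Schematic DQMOM update at a single grid cell.

    weights            : array of N environment weights, w_alpha
    weighted_abscissas : (N_xi, N) array of w_alpha * <xi_n>_alpha
    build_linear_system: hypothetical helper returning (A, B) for the
                         chosen moment set
    dt                 : time step
    """
    # Solve the DQMOM linear system A x = B for the source terms.
    A, B = build_linear_system(weights, weighted_abscissas)
    x = np.linalg.solve(A, B)

    N = weights.size
    a = x[:N]                                    # weight source terms, a_alpha
    b = x[N:].reshape(weighted_abscissas.shape)  # weighted-abscissa source terms, b_{n,alpha}

    # Advance the transported quantities (spatial transport terms omitted here).
    new_weights = weights + dt * a
    new_weighted_abscissas = weighted_abscissas + dt * b
    return new_weights, new_weighted_abscissas
\end{verbatim}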
DQMOM Equation Simplifications
While there are many terms in the DQMOM equations given above, many of them may be safely neglected. First, the terms Cmmα and Cmnα, given by:
may be neglected. Physically, these terms may be interpreted as diffusion of weights into different environments due to strong gradients in the abscissas. This term would play a more significant role if there were spatial diffusion of internal coordinate quantities among particles (e.g. heat or momentum transfer between particles), but this is insignificant in the case of entrained flow gasifier systems, because the coal particles are extremely dilute.
The phase space diffusion term Dξ, defined as:
represents the diffusion in phase space, due to the deviation from the environment-averaged phase space velocity
⟨Gj⟩α of the actual phase space velocities of the particles composing environment
α. This term becomes important when the particle distributions become wider, meaning there will be larger deviations from the environment averages, and when the number of environments
N decreases. In most cases, the number of environments used to simulate coal particle systems is no less than seven (to ensure that
N ≥ Nξ, following the recommendation of
[107]). For this reason, the phase space diffusion term is assumed to be insignificant.
2.4 Equations for Reacting Coal Systems
Given the NDF, the internal coordinate values for a given particle may be found; given those, the heat transfer, devolatilization reactions, and char oxidation reactions can be modeled. Specific sub-models for heat transfer and particle reactions are described below. However, a brief discussion of the general approach for treatment of the particle and gas phases is warranted.
The particle reactions can be generally described using a simple reaction schematic. The jth particle undergoes M devolatilization and L char oxidation reactions (only one evaporation reaction is assumed),
\[ (\text{raw coal})_j \;\xrightarrow{\;k_{jm}\;}\; Y_{jm}\,(\text{volatile gas}) + (1 - Y_{jm})\,(\text{char}) \]
\[ \frac{\nu_{\text{oxidizer}}}{\nu_{\text{product}}}\,(\text{char})_j + (\text{oxidizer}) \;\xrightarrow{\;k_{jl}\;}\; (\text{volatile gas})_l \]
\[ (\text{moisture}) \;\xrightarrow{\;\;}\; (\text{steam}) \]
where
m = 1…M and
l = 1…L;
kjmis the reaction rate for the
mth devolatilization reaction; and
kjl is the reaction rate for the
lth char oxidation reaction. The net reaction rate for the coal particle,
rj, can then be written
where
rhjl is the char reaction rate for the
jth particle and the
lth char oxidation reaction,
rvjm is the devolatilization reaction rate for the
jth particle and the
mth devolatilization reaction,
rhj and
rvj are the net char oxidation and devolatilization reactions (respectively),
rwj is the evaporation rate, and
rj is the net reaction rate for the
jth particle.
The gaseous products released from the coal can be described in the gas phase using varying levels of complexity. One general approach is the solids progress variable approach [21], a model chosen for its generality. Using this approach, the volatile coal gas and gas-phase reactions are tracked through the use of mixture fractions. Given $N$ gas streams, $N - 1$ mixture fractions may be used to characterize the mixing of the streams. Thus, a system with a single inlet would have two streams (one feed gas stream and one coal gas stream); a system with a primary and a secondary inlet would have three streams (two feed gas streams and one coal gas stream); and so on. Multiple streams for coal off-gas (one for a given reaction or class of reactions) may also be used. The mixture fraction for an $N$-stream system will be defined as:
where $m_j$ is the mass originating in the $j$th stream. Note that for coal gasification, these mixture fractions are not conserved quantities, as mass is introduced into the system via the coal particles. The mixture fraction source term comes from the coal particle reaction rates (the reaction rate from which the source originates depends on the formulation of the solids progress variable model).
2.4.1 Mixture Fraction Definitions
An entrained flow gasifier has three gas streams mixing: a primary inlet, a secondary inlet, and the volatile gas released from the coal particles. Three streams can be characterized using two mixture fractions, and this characterization is consistent with the coal particle model in Figure 2.1↑ because the char oxidation and devolatilization gases released by the coal particle are assumed to be identical in elemental composition.
Denoting $m_p$ as the mass of gas originating in the primary, $m_s$ as the mass of gas originating in the secondary, and $m_c$ as the mass of gas originating from the coal, the mixture fractions describing this system can be defined as follows:
- Primary-Secondary Mixture Fraction: the fraction of primary gas relative to the combined primary and secondary gas; this quantity is conserved, because no primary or secondary mass is generated by the coal particles:
- Coal Gas Mixture Fraction: the fraction of coal gas relative to the total gas-phase mass; this quantity is not conserved, because the coal particles generate mass $m_c$:
- Primary Mixture Fraction: the fraction of primary feed mass relative to the total gas-phase mass; this quantity is not conserved because, like the coal gas mixture fraction 3.74↑, it contains mass generated by the coal particles:
The transport equation for $\eta_c$ follows 3.10↑; the source term for the $\eta_c$ transport equation can be written:
and is equal to the net amount of volatiles released by the coal particles. The expressions for the reaction rates $r^v_j$ are given in the next section.
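As a small numerical illustration of the mixture fraction definitions above, the sketch below evaluates the three mixture fractions from the local gas-phase masses originating in the primary, secondary, and coal off-gas streams; the function name and sample values are illustrative only.

def mixture_fractions(m_p, m_s, m_c):
    """Mixture fractions for a primary/secondary/coal-gas system.

    m_p, m_s, m_c are the local gas-phase masses originating in the primary
    inlet, the secondary inlet, and the coal off-gas, respectively.
    """
    m_total = m_p + m_s + m_c
    f = m_p / (m_p + m_s)      # primary-secondary mixture fraction (conserved)
    eta_c = m_c / m_total      # coal gas mixture fraction (not conserved)
    eta_p = m_p / m_total      # primary mixture fraction (not conserved)
    return f, eta_c, eta_p

# Example: equal primary and secondary feed mass with 20% coal off-gas by mass.
print(mixture_fractions(0.4, 0.4, 0.2))   # -> (0.5, 0.2, 0.4)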
2.5 Coal Models
2.5.1 Coal Reaction Rates
The coal particle is assumed to undergo several reaction processes. The first reaction is a devolatilization reaction, in which the raw coal in the particle is converted to both volatile gases and solid char. The second is char oxidation, in which the char in the coal particle is oxidized by the gas phase. The third is coal moisture evaporation, in which any water contained in the coal evaporates into the gas phase. The overall reaction rate for the jth coal particle rj can be described by:
where $l$ is the reaction index among the $N_{rxn,h}$ total char reactions, making $r^h_{jl}$ the reaction rate of the $l$th char reaction for particle $j$; $m$ is the reaction index among the $N_{rxn,v}$ devolatilization reactions, making $r^v_{jm}$ the reaction rate of the $m$th devolatilization reaction for particle $j$; and $r^w_j$ is the evaporation rate of water from particle $j$.
A simple reaction schematic for coal is as follows:
$$(\text{raw coal})_j \xrightarrow{k_{jm}} Y_{jm}(\text{volatiles}) + (1 - Y_{jm})(\text{char})$$
$$\frac{\nu_{\text{oxidizer}}}{\nu_{\text{product}}}(\text{char}) + (\text{oxidizer})_l \xrightarrow{k_l} (\text{gaseous products})_l$$
$$(\text{water}) + (\text{coal particle})_j \xrightarrow{k_w} (\text{steam}) + (\text{coal particle})_j.$$
Net Raw Coal Reaction Rate
The net raw coal reaction rate is given by:
where equation 3.78↑ shows that the net raw coal reaction rate is the sum of the reaction rates for each of the $N_{rxn,v}$ devolatilization reactions, and equation 3.78↑ shows the contribution of the raw coal reaction rate to the char and volatile production rates, respectively. The relationship between these two quantities is given by:
where $Y_m$ is the fraction of raw coal that reacts to volatiles.
The reaction rate for raw coal in a particle j is therefore given by:
Net Volatile Production Rate
The net volatile production rate due to devolatilization reactions is given by:
Net Char Reaction Rate
The net char reaction rate is given by:
where the first term represents the rate of generation of char due to devolatilizing raw coal, with a contribution from each of the $N_{rxn,v}$ devolatilization reactions (see equation 3.80↑), and the second term represents the consumption of char due to char oxidation reactions, with a contribution from each of the $N_{rxn,h}$ char reactions. This reaction rate expression is given by Smoot and Smith as [159]:
On the right-hand side, $r_j$ is the overall reaction rate for particle $j$. The char reaction rate $r^h_{jl}$ appears in this term, as well as on the left-hand side, meaning the equation is implicit with respect to $r^h_{jl}$. The other variables are: $A_j$, the particle surface area; $MW_{h_j}$, the molecular weight of the compound being oxidized (i.e., carbon); $\nu_{\text{oxidizer}}$ and $\nu_{\text{product}}$, the number of oxygen atoms in the oxidizing agent and product gas of the char oxidation, respectively (the more oxygen atoms available in the oxidizer, the faster the oxidation rate; the more oxygen atoms required for the reaction, the slower the reaction rate); $k^c_{jl}$, the mass transfer coefficient for particle $j$ and the reactants of char reaction $l$; $k_{jl}$, the reaction rate of char reaction $l$ for particle $j$; $\xi_j$, the particle surface area factor for particle $j$; $C_{o,lg}$, the molar concentration of oxidizer (for reaction $l$) in the bulk gas phase; and $C_g$, the molar concentration of the bulk gas phase. If the bulk reaction term $r_j$ is ignored, this expression becomes:
Note that both equation 3.84↑ and equation 3.85↑ account for both diffusion (through the mass transfer coefficient $k^c_{jl}$) and reaction (through the reaction rate $k_{jl}$), but equation 3.84↑ accounts for the effect of the overall reaction rate (including devolatilization reactions and moisture evaporation) on the mass transfer, and vice versa, whereas equation 3.85↑ ignores this effect.
The reaction rate of char for a particle j is given by:
2.5.2 Coal Devolatilization Model
Coal particle devolatilization is described using the two-step devolatilization model presented by Kobayashi et al. [93], hereafter referred to as the Kobayashi model.
The Kobayashi model addresses the need to describe the pyrolysis of coal as a function of temperature in the early stages of the combustor. This model introduces a set of two competing parallel first-order reactions that describe the conversion of raw coal into gas-phase volatiles and char. The reactions for the devolatilization of raw coal in this model are expressed as
$$(\text{raw coal}) \xrightarrow{k_1} Y_1(\text{volatiles}) + (1 - Y_1)(\text{char})$$
$$(\text{raw coal}) \xrightarrow{k_2} Y_2(\text{volatiles}) + (1 - Y_2)(\text{char}),$$
where $Y$ is a stoichiometric coefficient. The values for $Y_1$ and $Y_2$ are determined from the volatile fraction of the proximate analysis ($Y_1$) and the fraction devolatilized at high temperatures ($Y_2$, often near unity).
The rate expression for the depletion of raw coal in the solid phase for a particle is
and conversely the rate of addition of coal gas to the gaseous phase is
The rate constants $k_1$ and $k_2$ are modeled with an Arrhenius form as
where $E_2 \gg E_1$. The values of these constants are given by Kobayashi et al. [93] as:
$$A_1 = 2 \times 10^5 \ \mathrm{s^{-1}}$$
$$A_2 = 1.3 \times 10^7 \ \mathrm{s^{-1}}$$
$$E_1 = -25{,}000 \ \mathrm{kcal/kmol}$$
$$E_2 = -40{,}000 \ \mathrm{kcal/kmol}$$
with $R = 1.987 \ \mathrm{kcal/(kmol \cdot K)}$, $T$ in units of K, and $k$ in units of $\mathrm{s^{-1}}$.
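A minimal sketch of evaluating the Kobayashi rates with the constants tabulated above is shown below. The depletion and volatile-production expressions are the standard two-step forms (the displayed rate equations above are not reproduced here), the sign convention follows the table (E is listed as negative, so k = A exp(E/(RT))), and the particle temperature, raw coal mass, and Y1, Y2 values are illustrative.

import math

R = 1.987                   # kcal/(kmol K)
A1, E1 = 2.0e5, -25000.0    # 1/s, kcal/kmol (as tabulated above)
A2, E2 = 1.3e7, -40000.0    # 1/s, kcal/kmol

def kobayashi_rates(T_p, m_rawcoal, Y1=0.4, Y2=1.0):
    """Two-step devolatilization rates at particle temperature T_p [K].

    Returns the rate constants, the raw coal depletion rate, and the rate of
    volatile (coal gas) production; Y1 and Y2 are illustrative coefficients.
    """
    k1 = A1 * math.exp(E1 / (R * T_p))   # slow, low-temperature route
    k2 = A2 * math.exp(E2 / (R * T_p))   # fast, high-temperature route
    d_rawcoal_dt = -(k1 + k2) * m_rawcoal
    d_volatiles_dt = (Y1 * k1 + Y2 * k2) * m_rawcoal
    return k1, k2, d_rawcoal_dt, d_volatiles_dt

# At 2000 K the second (high activation energy) reaction dominates.
print(kobayashi_rates(T_p=2000.0, m_rawcoal=1.0e-6))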
2.5.3 Char Oxidation Models
After the raw coal in a particle has devolatilized, it forms volatile gas and char. The char contains large amounts of carbon and is oxidized by oxygen, steam, hydrogen, and carbon dioxide. The physical process of char oxidation is influenced by many factors: the composition of the coal particle, the temperature of the particle, the microstructure of the particle after devolatilization, the temperature history of the particle, the size of the particle, and so on. These effects cannot be modeled individually, so a global reaction approach is taken, in which many of these processes are lumped into a global reaction rate parameter. The global reaction rate, incorporating both diffusion and reaction effects, is given by equation (3.84↑).
Again, following Smoot and Smith [159], the reaction rate can be expressed as:
where $r_{jl}$ is the reaction rate for a particle for char reaction $l$, $k_{jl}$ is the global reaction rate for char reaction $l$, $C_{ox,surf}$ is the concentration of oxidizer at the surface of the particle, and $n$ is the reaction order. This is combined with the expression for the diffusion of oxidizer to the surface of the particle:
where $MW_{ol}$ is the molecular weight of the oxidizer for char reaction $l$, $C_{olg}$ is the concentration of the oxidizer for char reaction $l$ in the bulk gas phase, $C_{olp}$ is the concentration of the oxidizer for char reaction $l$ at the surface of the particle, and $r_d$ is the total diffusion rate (which includes $r_{dlo}$; thus, this equation is implicit in $r_{dlo}$).
A straightforward method for modeling the char oxidation reaction rate kjl is to assume a reaction rate constant of the form:
where the pre-exponential factor $A_l$ and activation energy $E_l$ correspond to char reaction $l$ and are assumed to be the same for all particles.
2.5.4 Particle Velocity Model
In a gas-solid flow, the particle motion is affected by the drag force, which can be described by the Stokes drag law. For a mesoscale particle, when the other additional mass forces are omitted, the momentum equation for the particle can be expressed as an ordinary differential equation:
where $i$ denotes the $i$th direction, $g$ is the gravity force acting on the particle, $F_v$ represents the other body forces acting on the particle, $u_p$ is the particle velocity, and $f_{drag}$ is the drag force coefficient, which is closely related to the particle Reynolds number:
$$f_{drag} = \begin{cases} 1 & Re_p < 1 \\ 1 + 0.15\,Re_p^{0.687} & 1 < Re_p < 1000 \\ 0.0183\,Re_p & Re_p > 1000 \end{cases}$$
and the particle Reynolds number $Re_p$ is defined as:
$$Re_p = \frac{\rho_p d_p |u_p - u_g|}{\mu_g}$$
where $\rho_p$ is the particle density, $d_p$ the particle diameter, $u_p$ the particle velocity, $u_g$ the gas velocity, and $\mu_g$ the gas dynamic viscosity. In equation (3.93↑), $\tau_p$ is the particle relaxation time.
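The drag correlation and Reynolds number above map directly to a short routine; the sketch below is illustrative only, and the relaxation time uses the common Stokes form τp = ρp dp²/(18 μg) as an assumption, since the defining expression referenced with equation (3.93↑) is not reproduced here.

def particle_drag(rho_p, d_p, u_p, u_g, mu_g):
    """Particle Reynolds number, drag coefficient, and drag acceleration.

    Re_p and f_drag follow the piecewise correlation above; tau_p uses the
    standard Stokes relaxation time rho_p * d_p**2 / (18 * mu_g), assumed here.
    """
    Re_p = rho_p * d_p * abs(u_p - u_g) / mu_g   # as defined in the text
    if Re_p < 1.0:
        f_drag = 1.0
    elif Re_p < 1000.0:
        f_drag = 1.0 + 0.15 * Re_p**0.687
    else:
        f_drag = 0.0183 * Re_p
    tau_p = rho_p * d_p**2 / (18.0 * mu_g)
    # Drag contribution to du_p/dt written in the common relaxation form.
    dudt_drag = f_drag * (u_g - u_p) / tau_p
    return Re_p, f_drag, tau_p, dudt_drag

# Example: a 50-micron particle with a 0.05 m/s slip velocity.
print(particle_drag(rho_p=1300.0, d_p=50e-6, u_p=1.95, u_g=2.0, mu_g=3e-5))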
2.5.5 Particle Heat Transfer
Particle Heatup Model
The particle heatup can be modeled as follows. The particle is heated by convection, radiation, and reaction enthalpy changes:
where $Q^r_j$ represents the net radiation to a particle, $Q_j$ represents energy transfer due to convection and conduction between the gas and the particle, and $r_j h_{jg}$ represents both the energy lost by the particle due to lost mass and the enthalpy released when the coal is converted to volatile gas and water vapor. Typically 100% of the (negative) heat of reaction and vaporization is contributed to the particle enthalpy, and 0% is contributed to the gas enthalpy.
Convection
The convection term can be expressed as:
where $T_p$ is the particle temperature, $T_g$ is the gas temperature, $Nu$ is the Nusselt number, $k_g$ is the gas thermal conductivity, and $B_j$ is the heat transfer transpiration parameter.
Kunii and Levenspiel [97] and Kreith [95] reported the following correlation for $Nu$:
Additionally, the heat transfer transpiration parameter is given by the expression:
where $r_j$ is the net reaction rate for the $j$th particle and $C_{p,g}$ is the gas heat capacity. Values of the thermal conductivity of the gas are given up to 30,000 K by Yos [191].
Merrick [111] reported a function for the heat capacity of raw coal and char:
where $MW$ is the molecular weight of raw coal or char, respectively, and $g_1$ is defined as:
The heat capacity of ash is given by:
Radiation
The radiative flux is given by:
where $Q_{incident}$ is the incident radiative flux to the particle and $Q_{emitted}$ is the radiative heat flux emitted by the particle. This equation can be rewritten as:
where $A_{particle}$ is the absorption coefficient of the particle, defined as
$K_{abs,p}$ is the absorption cross-section, $F_{sum}$ is the sum of all fluxes entering a given volume, and $E_b$ is the blackbody emissive power, given by the Stefan-Boltzmann law:
2.6 Arches Coal Gasification Model
While LES does not incur as high a computational expense as DNS, resolving even a reduced range of time-evolving length scales of the flow still carries a substantial cost. For this reason, many LES codes are designed for high-performance computing environments. The Arches LES code [163, 6] is one such massively parallel code, built within the Uintah computational framework. The framework is written in C++ and uses the Message-Passing Interface (MPI), both widely used tools for parallel scientific computing and software development [88]. These tools provide Arches and the Uintah framework with the ability to scale to large numbers of processors.
Arches is able to handle complex multiphysics problems through scalability and the use of sheer computational power. The design philosophy behind Arches is to remove computational limitations that stand in the way of better resolution and more accurate but more expensive models. Toward this end, several advanced multiphysics models are implemented in Arches. The DQMOM method is also implemented in Arches, and is able to utilize the scalability of Arches through the use of the transport equation framework to track the coal particle NDF. Extensive verification work has been performed on Arches, both on the fundamental CFD level and on the DQMOM level, to confirm that the algorithms are all implemented correctly and exhibit expected behavior.
3 MODEL VERIFICATION
God forbid that Truth should be confined to Mathematical Demonstration!
― William Blake
3.1 Overview
3.1.1 A Definition
It is beneficial to begin a discussion of verification by first defining it. The word “verification” comes from the Latin verificare (to make true), itself formed from verus (true) and facere (to make or do). Indeed, verification is the act of making a code match truth, but the “truth” that this etymology refers to is mathematical in nature, independent of reality. Section 1.4.1↑ covered some terminology regarding truth; this terminology will be used in what follows. For the process of verification, it is important to partition rational and empirical truth, and to perform verification in a regime entirely free from physical reality, i.e., entirely within the realm of rational truth. The terms “rational truth” and “mathematical truth” will be treated as interchangeable in the discussion of verification.
Verification, then, is the attempt to make a computational implementation of a mathematical model match mathematical truth, and also to quantify how well it does so. Verification is defined for the purposes of this work as “the assessment of the accuracy of the solution to a computational implementation of a mathematical model.” This definition is based on that given by Oberkampf, Trucano, and Hirsch [184], with the stipulation that “accuracy” refers to accuracy with respect to mathematical truth, not empirical truth. Verification seeks to answer the question of whether the equations that compose the mathematical model are being solved correctly, and to quantify or estimate the error resulting from the computational implementation of that mathematical model; it does not answer the question of whether the equations can be used to accurately describe physical reality (the activity answering that question is validation). Verification is thus concerned with the mathematics, not the physics, of the model. Roache [141] states that code verification “can and should be completed without appeal to physical experiments” (emphasis in original).
Verification has two separate but equally important parts [101, 141, 144]: code verification and solution verification. Code verification is intended to accomplish two goals: first, to ensure that the implementation of the mathematical model is free of mistakes; and second, to use exact solutions to quantify the discretization error associated with the implemented discrete operators and to verify that they exhibit expected behavior. An important part of the first goal is implementing procedures and utilizing tools to control source code changes; this is called software quality assurance (SQA, discussed in Section 3.2.1↓) [68]. SQA contributes several methodologies for finding user mistakes in code, including regression tests. Other methods, such as the method of manufactured solutions (MMS, discussed in Section 3.2.3↓), provide additional means of identifying user mistakes in code. The second aspect of code verification utilizes known solutions to the implemented governing equations in order to quantify numerical error and ensure that it behaves as expected (specifically, that it shrinks as the discrete elements shrink, and at the rate expected given the discrete operators implemented in the code). Solution verification has the goal of estimating numerical error in the intended-use regime, leading to results that are more directly applicable, but this also eliminates the availability of exact solutions. Because exact solutions are unavailable, solution verification quantifies numerical uncertainty, not numerical error.
Typically, code verification is carried out after major development on a code has occurred, or when a release version of the code is being prepared; that is, it occurs only once per development cycle. Solution verification, however, occurs for the application of the code to each intended use. Several approaches for performing both parts of verification will be presented.
3.1.2 Error vs. Uncertainty in Verification
As discussed in Section 1.4.4↑, the difference between error and uncertainty lies in the availability of a true value $y$ with which to compute $y - \hat{y}$. Code verification is intended to quantify numerical error; the simulations run as part of code verification consist of cases with known solutions $y$. Therefore code verification consists entirely of quantification of numerical error. In contrast, solution verification attempts to quantify numerical error in the intended-use regime, where exact solutions $y$ are unavailable. For this reason, solution verification quantifies numerical uncertainty. The quantification of numerical uncertainty is fundamentally different from the uncertainty quantification that is part of validation, due to the nature of the error being bounded by the uncertainty analysis. Solution verification creates uncertainty bounds for numerical error, using high-fidelity simulations as a surrogate for the mathematical truth $y$, whereas validation creates uncertainty bounds for empirical error, using empirical observations as $y$.
Numerical uncertainty does, however, play a role in the validation process; this role is discussed in Section 3.3.5↓.
3.1.3 Numerical Error Taxonomy
In order to assess the precision of a solution to a computational model, it is important first to discuss the quantitative measure of precision: error. Verification error, as defined above and in the introduction, is
$$e = y - \hat{y},$$
where $y$ is the mathematically true value of a quantity and $\hat{y}$ is the calculated value of the same quantity.
Many previous studies have recognized the importance of splitting the verification error $e$ into contributions from respective processes; Roache [141] presented justification for creating an error taxonomy, a system by which various sources of verification error are classified. Doing so is an important and useful first step in the verification process.
Many references have attributed verification error to different sources, with some overlap among them [38, 92, 59, 189, 48]. However, all of the accounts given are inadequate to taxonomically (systematically) describe the various error sources in computational fluid dynamics (CFD) simulations. Other systematic accounts of error have attempted to include validation “errors” such as physical modeling errors “caused by inaccuracies in the mathematical model of the physics, completely separate of numerical issues” [33]. However, this is not an error: the task of determining whether a mathematical model for the physics is inaccurate is the process of validation. To confound the activity of validation with error is confusing and misleading. Another general taxonomy given by Roache [141] classifies verification error based on order; that is, errors that are ordered in the discretization element $\Delta$, errors ordered in nondiscretization numerical parameters, nonordered errors, and so on. This is a significant improvement over existing taxonomies, as it provides a categorical way of thinking about error.
The primary problem with these taxonomies is that they are somewhat arbitrary. This problem is common to many taxonomies. The solution is not just to classify existing errors, but to create a consistent approach so that errors not yet included can be systematically classified and approaches to quantifying those errors can be formulated. To create a systematic approach, the procedure of computational implementation of the mathematical model is partitioned into separate steps (Figure 3.1↓). Each step introduces different errors, which are classified by the step in the procedure at which they are introduced.
The steps involved include the starting point, the true mathematical model; the solution to the mathematical model, $y$; the discrete formulation, that is, the mathematical formulation of the discrete model; the discrete implementation, where the actual values of the discretization elements $\Delta x$, $\Delta t$, $N_{cells}$, etc. are chosen; the numerical solution of the discrete equations, which results in some set of mathematical operations, e.g., solving a linear system $Ax = B$; the implementation of these mathematical operations on a finite-precision machine architecture; and a final step of post-processing of the computed solution to extract $\hat{y}$. Each type of error can be examined to determine the level in the process at which it is actually introduced, using Figure 3.1↓ as a guide. Based on the level, different methodologies for error quantification can be applied. The level of primary interest is that of discrete implementation, which quantifies the amount of error introduced through the discrete representation of the mathematical model. This can be quantified using a grid convergence study, in which the size of the discrete elements is decreased to examine whether and how the error decreases. This type of analysis yields an order of convergence with respect to each numerical parameter. Knupp and Salari [92] cover error quantification techniques for other types of verification error.
It is of critical importance to recognize that not all errors are independent; many errors are tightly coupled or are subsets of other errors. It is also critical to recognize the cost and the difficulty associated with quantifying all sources of error. The cost of verification must be weighed against the need for increased accuracy in order to determine how far verification must go.
3.1.4 Errors vs. Mistakes in Verification
A form of “error” conspicuously missing from the error taxonomy proposed in Section 3.1.3↑ and Figure 3.1↑ is coding mistakes made by users. These, however, are fundamentally different from the errors classified by the proposed taxonomy. The errors classified by the taxonomy are quantifiable deviations from a true mathematical value. Mistakes in coding, on the other hand, result from a lack of precision on the user’s part.
Because of this fundamental difference, these user mistakes are not included in the error taxonomy. The error taxonomy requires that the given procedure, covering the process of transforming the mathematical model into the solution obtained from its computational implementation, be free of mistakes. This, however, does not imply that it is free of error!
3.2 Code Verification
Code verification has two goals. First, it ensures that the computational implementation of the mathematical model is rigorous and free of mistakes. Second, it quantifies and verifies the order of error convergence with respect to the discrete elements by using exact solutions to the governing equations combined with grid convergence studies. Regarding the first goal, the computational implementation of mathematical models describing physical phenomena is nontrivial, especially where high-level object-oriented languages, computational frameworks, third-party libraries, complex coupled systems of equations, and supplementary submodels are used. Thus, it is currently impossible for developers to perform code verification by visual inspection of source code (if it ever was). Methodologies have been developed to facilitate finding coding mistakes. Several of these methodologies are addressed in the following sections and applied to the intended-use computational model, Arches. Both goals of code verification are discussed. First, Section 3.2.1↓ covers aspects of software quality assurance and how it can support verification activities by exerting a substantial degree of control over source code and changes to it. Several aspects of the second goal of quantifying error are then covered, starting with methodologies for obtaining exact solutions to the governing equations (Section 3.2.3↓) and moving to grid convergence analysis (Section 3.2.4↓). Results from the code verification grid convergence analysis performed on the Arches coal gasification model are then presented (Section 3.2.5↓).
3.2.1 Software Quality Assurance
Software quality assurance (SQA) is the set of processes by which source code is managed and development activities are conducted. While SQA is not a code verification methodology, it provides a scaffolding for code development and for code verification, which is software-based. SQA can be divided into three categories, with the various activities discussed by Heroux [68] grouped into each category:
- Coding Support
  - Source code management
    - Basic management: version control
    - Advanced source code management: branches, tags, and releases
  - Mailing lists to support code development, testing, and usage
- Coding Practices
  - Easy-to-write, source-centric documentation
  - Team and pair programming
  - Build and configuration tools
  - Procedures
    - Checklists to standardize and improve procedures for repeated tasks
    - Continual process improvement
- Code Development and Repair
  - Issue- and bug-tracking software
  - Test-driven development
Coding Support
Coding support activities utilize various tools in order to facilitate code development. Having a version-tracking system is important for ensuring that bug fixes and other corrected mistakes are proliferated through all developers’ code, rather than being lost during the reconciliation of various versions of a code. However, beyond basic source code management, advanced source code management, using features like branches, tags, and releases, should also be used. Branches can be used for the development of significant features or improvements independent of a main source tree. Tags can be used to mark milestones in the code, essentially providing archived working snapshots of the code at various significant points. In a similar vein, releases are specific major or minor versions of the code released to the public.
Just as usage of version control software to control a code base makes changes public, mailing lists make the process of code development a more transparent and, to the degree it is desired, a more democratic process. Communication about code is visible to all concerned, and mailing list archives can also serve as supplementary documentation.
Coding Practices
One of the biggest weaknesses of projects is lack of documentation. Typically, documentation is abandoned or put off until later. If a documentation effort is not neglected, projects will usually create standalone documentation that is entirely disconnected from the code, and in a format that is awkward, unwieldy, or difficult to navigate or search, such as plain text, info files, or LaTeX. This leads to two potential problems, covered below, and often the first of these problems leads to the second.
Detailed documentation is often written in spurts, and represents documentation for a snapshot of the code in time. While this documentation can be useful, it easily becomes obsolete and requires periodic spurts to bring the documentation up to date. These spurts must also be frequent, as the code is only as useful as the documentation is correct and up-to-date. Maintaining useful documentation thus requires significant effort. Because creation of documentation is secondary to the actual purpose of the code, the costs of maintaining detailed documentation quickly grow to outweigh the perceived benefits. This is the first problem.
The need for documentation that requires less maintenance will usually lead to documentation that is less specific. Vague documentation can cover an abstraction or a concept whose specific interface may change drastically, but whose central idea remains the same, without having to be updated. However, vague documentation is not an improvement over detailed, out-of-date documentation: it is only marginally useful, because it is too vague to yield the specific information that users often need. This, in turn, leads to underutilization of documentation, making all documentation effort pointless. This is the second problem.
Ideally, documentation should be simple to create (i.e., in a transparent and easy-to-use format) and modular. This allows for arbitrary content creation and content association. Furthermore, there should be at least some portion of the documentation process that is automated and drawn from the code directly. Several tools have emerged that facilitate this style of documentation. First, wikis allow for arbitrary content creation and content association. They also modularize content, and are editable by multiple collaborators. Most wiki systems also record historical information (revisions), making them useful tools for archiving discussions. If content on a wiki loses usefulness or relevance, it does not disappear; the page containing the content simply becomes infecund, with nothing linking into or out of it. Second, documentation systems such as Doxygen are able to directly parse the actual source code to generate documentation. Even for code that is devoid of any comments, Doxygen still produces useful documentation for classes, class members, and methods, and creates hierarchical diagrams for classes related by inheritance. This alone makes it an invaluable tool, even without effort on the part of the users. Further, if comments with metadata are added to the source code, Doxygen can parse this information and supplement the automatically generated documentation with information written by the developer. This makes the documentation source-centric, and adding documentation is as easy as adding comments in the code. Finally, integration of Doxygen with wikis is possible with many wiki software packages.
Coding Procedures
Coding procedures are required to make coding activity fruitful. Coding procedures help to establish standardized approaches for coding tasks. The tasks that can be covered by these coding procedures include virtually anything, but their chief utility comes from giving new developers a starting point for tasks essential to code development. Particular checklists might cover tasks such as a pre-check-in procedure, updating gold standards for regression tests, creating formal release versions, or resolving check-in conflicts. Continual process improvement builds on this idea: these checklists are never in a “final form,” but are improved each time they are used. The same philosophy applies to these checklists as applies to documentation: checklists should be easy to create, improve, and associate with other content.
Code Development and Repair
The heart of SQA’s role in code verification lies in these code development and repair activities. First, tracking issues and bugs in an organized system greatly increases efficiency in finding and fixing code mistakes. If a code bug is identified but not fixed, it may be forgotten. With a tracking system, not only is the bug or issue identified, but there is a space created specifically for that issue. This space can be used for discussion, specific individuals can be assigned to fix a bug, timelines can be planned, and ideas discussed. Many wiki systems can incorporate bug tracking systems, making documentation, checklists, and issue-tracking part of a common system and making content association particularly useful.
The second primary activity of SQA is testing. Tests should be written in conjunction with, or even before, development of a new code feature begins. If the test is written first, the code is then developed in such a way that, after some time, it passes the test as expected. This test-oriented approach has many advantages. When tests are written before code is written, the test usually reflects a clear perspective of what the end user expects to see; the new code can be designed around these expected user inputs. Code development that is not test-centric can lead to irrelevant algorithmic or other low-level details being exposed to the tests.
Test-oriented development also ensures that there is a thorough suite of tests that covers all capabilities of a code. Over the lifetime of a large computational model, many new submodels will be added and linked together, many new approaches incorporated, and possibly new library and framework objects used. Due to the increasingly high probability of mistakes with increasing code complexity, it is important to test the functionality of many parts of the code in order to ensure it is working as expected.
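As a small illustration of the test-oriented workflow described above, a regression-style test compares a computed result against an archived gold standard within a tolerance; the function, values, and tolerance below are hypothetical stand-ins, not part of the Arches test suite.

import numpy as np

def decay_profile(k, times):
    """Stand-in for a model output: exponential decay sampled at given times."""
    return np.exp(-k * np.asarray(times))

def test_decay_profile_regression():
    """The computed profile must match an archived reference within tolerance.

    Here the gold standard is hard-coded; in practice it would be loaded from
    a file produced by a previously verified version of the code.
    """
    gold = np.array([1.0, 0.60653066, 0.36787944])   # reference for k = 1
    assert np.allclose(decay_profile(1.0, [0.0, 0.5, 1.0]), gold, rtol=1e-6)

if __name__ == "__main__":
    test_decay_profile_regression()
    print("regression test passed")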
Last, but not least, is the advantage of self-documentation. By writing tests as (or before) new code is developed, a library of example tests is written and added to the code base. This is just as useful as, if not more useful than, documentation of the features exercised by the tests. In this way the test-oriented approach to code development leads to self-documenting input files; as was mentioned above (Section 3.2.1↑), this is an ideal documentation methodology.
3.2.2 Code Verification Criteria
Knupp and Salari provide a list of evaluation criteria for determining when a code is verified [144, 151]:
- Expert judgement
- Error quantification
- Consistency and convergence
- Order of accuracy
Expert judgement (equivalent to an “eyeball norm”) is the process of checking whether results look right. This is a strictly nonrigorous process, in that it involves a subjective judgement of the code results. It also covers the insufficiently stringent process of verification by visual inspection. Care must be taken not to confound expert judgement regarding verification with expert judgement regarding validation.
The remaining evaluation criteria all require an exact solution to assess the results of code verification. Error quantification pins a quantitative number on the amount of error in a computation, but this is only the first of several requirements for code verification to be achieved. Consistency and convergence both look at how the error changes with decreasing discrete element size. Consistency is a statement of the relationship between the partial differential equation (PDE) $Gu = F$ and its discrete representation $G_\Delta u = F$ (where $G$ and $G_\Delta$ are the continuous and discrete operators, respectively, and $u$ and $F$ are functions). Consistency is achieved if the quantity $G\varphi - G_\Delta \varphi$ goes to zero for any smooth function $\varphi$ [169]; in other words, does the discrete operator approach the continuous operator as the discrete element shrinks? Convergence is a statement about the behavior of the error as the discrete elements go to zero; that is, how does the error shrink as the discrete element shrinks?
Order of accuracy is a measure of how quickly the error goes to zero, measured by the rate at which the error shrinks with shrinking discrete element size; for example, the temporal discretization operator has an error proportional to $\Delta t^n$, where $n$ is the order of the temporal scheme. Just as important, however, is confirming that the observed behavior matches the expected theoretical order of accuracy of the discrete operator used. Because the order of accuracy is very sensitive to code mistakes, confirming the order of accuracy (or failing to confirm it) is a useful way to discover code mistakes.
3.2.3 Exact Solution Methodologies
Following are a few of the most common methodologies for obtaining exact solutions to the governing equations of the code being verified; each has unique advantages, mentioned in the respective sections. Exact solutions can be used with grid convergence analysis to identify errors at the level of the discrete implementation (see the error taxonomy in Figure 3.1↑ above). The three methodologies are: analytical solutions, in which the governing equations are often simplified in order to obtain a mathematical function that satisfies them; the method of manufactured solutions, in which the mathematical solution is “manufactured” and the difficulty of obtaining analytical solutions avoided; and benchmark solutions, which utilize expensive, high-quality solutions to the same or similar governing equations.
Analytical Solutions
One code verification methodology is to find an exact mathematical solution to the set of model equations and boundary conditions. However, analytical solutions are difficult to obtain for realistic problems, and often make gross modeling assumptions in order to arrive at a simplified set of equations. Analytical solutions provide an exact expression for the solution to the set of mathematical equations that are computationally implemented, thus allowing the exact value of error (as exact as a computer evaluation can get, at least) in the computational model to be calculated. As an example, an analytical solution to the two-dimensional Navier Stokes equation
Method of Manufactured Solutions
The method of manufactured solutions is a powerful approach to manufacturing solutions to partial differential equations by adding source terms, first proposed by Steinberg and Roache (Steinberg and Roache 1985, symbolic manipulation and computational fluid dynamics). The approach is described as follows. For a governing equation or other PDE,
a solution $\varphi$ is manufactured:
This solution is an arbitrary function. Because $\varphi$ is not a solution to the original governing equation (4.5↑), an additional source term is added:
such that
which is trivial to compute. The result is that the additional source term $Q$ cancels out the remaining terms in the governing equations,
This may be understood better with an example. Let the governing equation $Gu = F$ be a one-dimensional inviscid convection equation with constant velocity $V$:
Now let the assumed solution $\varphi$ be an arbitrary and simple function:
Then the source term $Q$ is computed as:
This source term is then added to the governing equation, which is straightforward to do in most computational fluid dynamics codes, so that the governing equation being solved becomes:
The function $\varphi = \sin(xt)$ is an exact solution to this equation.
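The algebra in this example is easily automated with a symbolic toolkit; the short sketch below (using SymPy, one possible choice) applies the convection operator to the manufactured solution φ = sin(xt) to recover the source term Q.

import sympy as sp

x, t, V = sp.symbols('x t V')

phi = sp.sin(x * t)                                      # manufactured solution
Q = sp.simplify(sp.diff(phi, t) + V * sp.diff(phi, x))   # operator applied to phi
print(Q)   # -> V*t*cos(t*x) + x*cos(t*x)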
Boundary conditions can also be verified using the method of manufactured solutions. For example, periodic boundary conditions can be exercised by selecting a manufactured solution such that $\varphi(x = L_x) = \varphi(x = 0)$. Similarly, Dirichlet boundary conditions can be verified by setting the constant boundary condition equal to the value of the function at the given boundary. Neumann boundary conditions of the form
can be verified by analytically computing the derivative of $\varphi$ and setting it equal to $c$. This will typically yield a function, which is then implemented as the boundary condition.
Different functions can be used to exercise different terms in the governing equations. For example, if the above example were coded in a three-dimensional CFD code, it would only exercise the time integrator and the computation of the $x$ convection term. If a function such as
were chosen, it would exercise the computation of all convection terms. Likewise, if the function did not contain $t$, the function would be invariant in time.
This is one of the most useful features of the method of manufactured solutions: it makes it possible not just to verify the overall order of convergence of error for the entire code using a grid convergence study, but also to verify particular terms of the governing equations and to perform grid convergence studies that isolate individual terms (and the individual discretization schemes corresponding to those terms). This provides a very powerful method for debugging codes to find errors affecting the order of convergence. However, it is important to understand that MMS cannot help to identify every type of coding error; it can only be used to identify those errors affecting grid convergence. As Salari and Knupp [92] point out, however, most bugs not identified by MMS are either straightforward to identify using other techniques or do not significantly affect the solution.
Salari and Knupp [151] give a detailed discussion of the use of manufactured solutions, including several considerations for selecting them, for example choosing functions that are infinitely differentiable (e.g., periodic functions) so that all derivatives are exercised.
Benchmark Solutions
Verification of codes can also be performed using benchmark solutions, which are very expensive, very high resolution solutions of sets of partial differential equations. The model being verified attempts to solve either the same governing equations with approximations introduced, or a reduced version of them. The verification process then consists of comparing the resulting solutions to the benchmark. For example, an LES computation solves the Navier Stokes equations only at large scales, and uses a model to represent the unresolved small scales. This LES solution could be verified by comparing it to a benchmark numerical solution, one which resolved the entire range of length scales, utilized high order spatial discretization schemes, and implemented physical models as good or better than the models used in the LES model, in order to determine the error introduced through the approximations made in the LES model, including modeling the unresolved small scales.
One example of benchmark solutions in the turbulence and CFD communities is direct numerical simulation (DNS). DNS fits the description of the benchmark solution just given; DNS uses high-order numerical methods for discretization, high-resolution grids that resolve all relevant scales of the turbulence, and detailed physical models to obtain solutions to coupled sets of equations. Many computational models are then compared to DNS results in order to investigate how well they can reproduce these high-quality solutions.
3.2.4 Code Verification Grid Convergence Analysis
Once the appropriate methodology for solution generation and error calculation has been selected, it can be used to satisfy the “error quantification” criterion listed above (Section 3.2.2↑). However, in order to satisfy the “consistency and convergence” and “order of accuracy” criteria, a grid convergence study must be performed.
A grid convergence study is performed following the Richardson Extrapolation Estimation (REE) technique, which postulates a functional form for the numerical error in a discretized model and its dependence on numerical parameters. First, an output quantity of interest is chosen which has an exact solution, denoted $y$ (this quantity may or may not be a vector). The model (or simulation) prediction of $y$ is denoted $y_M$ and is a function of the numerical parameters, denoted by $x$; thus, $y_M(x)$. The quantity $x$ may or may not be a vector, but is typically the single parameter $h$. In this case, a form for the model solution can be postulated:
where $f(x)$ is an error function. The error function is postulated to have the form $f(h) = \alpha h^p + \varepsilon$, where $\alpha$ is a constant, $p$ is the order of convergence, and $\varepsilon$ is the error resulting from the power-function representation. This gives the model prediction the form:
It is of interest to determine the order of convergence $p$. This can be done by approximating the exact solution $y$ with the highest-fidelity model prediction available, denoted $y_{M\infty}$, defined as
where $h_\infty$ is the smallest grid size used in the grid convergence study. The exact solution can then be approximated as $y \approx y_{M\infty}$, which, upon substitution into (4.15↑), yields:
Next, if this equation is written for two grid resolutions $h_1$ and $h_2$ and the two are combined, an expression for $p$, the order of convergence with respect to grid resolution $h$, can be obtained:
where $y_{M h_i} = y_M(h_i)$. In order to determine the value of the exponent $p$, a minimum of three simulations must be performed: one at $h_1$, one at $h_2$, and one at $h_\infty$.
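The procedure above reduces to a few lines of arithmetic; the sketch below computes the observed order p from two coarse-grid predictions and the finest-grid prediction used as the surrogate for the exact solution (the synthetic data and function name are illustrative).

import math

def observed_order(y_h1, y_h2, y_inf, h1, h2):
    """Observed order of convergence p from three simulations.

    y_inf (finest grid) stands in for the exact solution, so
    p = ln[(y_h1 - y_inf) / (y_h2 - y_inf)] / ln(h1 / h2).
    """
    return math.log((y_h1 - y_inf) / (y_h2 - y_inf)) / math.log(h1 / h2)

# Synthetic check: data following y = 1 + 0.3*h**2 recovers p close to 2
# (not exactly 2, because y_inf only approximates the exact solution).
y = lambda h: 1.0 + 0.3 * h**2
print(observed_order(y(0.20), y(0.10), y(0.01), 0.20, 0.10))   # ~2.01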
3.2.5 Code Verification Grid Convergence Results
In order to perform code verification, a grid convergence analysis was performed for several manufactured solutions. Each manufactured solution was intended to exercise a different part of the code. Two such manufactured solutions and grid convergence study results are presented here. For each grid convergence study, a set of 10 grids was used: 0.05, 0.08, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20, 0.35, and 0.50 cm.
The first set of manufactured solutions exercised only a single convection term; there were three, referred to as MMS X, MMS Y, and MMS Z. For each manufactured solution, only the velocity in the direction of interest was nonzero (e.g., for MMS X, $u_y = u_z = 0$). These three manufactured solutions are given by:
with corresponding source terms:
Only results for MMS X are presented; the results for the other two convection terms were identical. Figure 3.2↓ shows the observed order of convergence (left) and the converging error norm (right). The order of convergence was better than the theoretical value of 2.0 for both the $L_2$ and $L_\infty$ error norms. The grid convergence plots confirm that the code is in the asymptotically convergent regime for the grid sizes used in the study.
The second grid convergence study presented was performed for a manufactured solution that exercised all three convective terms, denoted MMS XYZ. This manufactured solution was given by:
with a corresponding source term:
Figure 3.3↓ shows the grid convergence study results for MMS XYZ. The grid convergence study for MMS XYZ exhibits an order of convergence of 2, although the largest grid resolutions deviate somewhat. Because the order of convergence is worse for the $L_\infty$ norm, this indicates that the discrepancy is possibly due to a local source of error growing with the grid resolution at order 1 or 1.5. However, the discrepancy only occurs for the largest two grids; the majority of grid resolutions in the study are in the asymptotically convergent regime.
The results of the code verification grid convergence study indicated that all but the largest two grids (0.50 and 0.35 cm) were in the asymptotically convergent regime, and that these two grids were close to the asymptotically convergent regime. This conclusion is based on the results from both the MMS X and MMS XYZ grid convergence studies. Based on this result, the grid resolutions selected for use in the solution verification grid convergence study (discussed below) were 0.14, 0.16, 0.18, and 0.20 cm. Smaller resolutions were not used due to the anticipated prohibitive cost of grids finer than 0.14 cm once coal particle physics, gas phase chemistry, and large sets of transport equations were added to the simulations.
3.3 Solution Verification
Solution verification applies to the regime of intended use. In this regime, no analytical or exact solutions are available, making an exact quantitative assessment of numerical error impossible. This means that solution verification yields numerical uncertainty - that is, a set of bounds on the numerical error with some level of confidence that the real numerical error is bounded - and not numerical error. The quantification of numerical uncertainty in the intended use regime, while more difficult than an evaluation of numerical error for simpler analytical or manufactured solutions, is far more useful, since statements about numerical error (or bounds on numerical error) can only be safely applied in the regime in which the verification was performed.
Determining the numerical uncertainty is an important first step in uncertainty quantification. The numerical uncertainty may be shrunk, but the cost of doing so is inversely proportional to the resulting size of the numerical uncertainty bounds. The size of the numerical uncertainty bounds helps to determine the level of verification, which is the amount of numerical uncertainty in the model predictions. This level of verification, in turn, dictates the highest level of validation that is possible. “Level of validation” refers to the narrowness of the experimental uncertainty bounds, and therefore to how difficult it is for the model to make a prediction that matches them [29]. If the numerical uncertainty is far larger than the experimental uncertainty bounds, then the model results cannot be validated. In this way, verification ultimately controls the level of validation that can be achieved. These concepts are developed further in Section 3.3.5↓.
3.3.1 Solution Verification Grid Convergence Analysis
While quantification of the numerical uncertainty in the intended-use regime is more challenging, there are methods that can be used to approximate the numerical uncertainty, given the right assumptions and the right information. A very common technique used in verification of a code in the regime of intended use is a grid convergence analysis, discussed in Section 3.2.4↑, which postulates a functional form for the numerical error and determines the parameters in the postulated functional form. Grid convergence analysis is typically applied to a single numerical parameter (the grid resolution $h$), but this section develops a grid convergence analysis for two numerical parameters: the grid resolution $h$ and the number of DQMOM environments $N$.
First, an output quantity of interest is chosen which has an exact solution $y$. The simulation prediction of $y$ is denoted $y_M$. A form for the model solution can be postulated:
where $x$ is the vector of numerical parameters and $f(x)$ is the error function.
In the classical application of grid convergence analysis, in which $x = h$, the error function is postulated to have the form $f(h) = \alpha h^p + \varepsilon$. However, in the DQMOM method there is an additional numerical parameter of interest, $N$, the number of environments used to represent the particle distribution. In this case, the postulated error function is:
$$f(h, N) = \alpha h^p + \beta N^{-q} + \gamma h^r N^{-s} + \varepsilon,$$
where $\alpha$, $\beta$, $\gamma$, $p$, $q$, $r$, and $s$ are constants. The first term represents the functional dependence of convergence on the grid size, the second represents the functional dependence of convergence on the number of quadrature nodes used to represent the particle NDF (proportional to the inverse of $N$, because the error decreases with increasing $N$), and the third represents the interaction effect of these two variables on solution convergence.
It is of interest to determine the orders of convergence with respect to the numerical parameters, that is, to find $p$, $q$, $r$, and $s$. This can be done by approximating the exact solution $y$ with the highest-fidelity model prediction available, denoted $y_{M\infty}$, defined as
where $h_\infty$ is the smallest grid size used and $N_\infty$ is the largest number of environments used. This makes equation (4.28↑):
The next step is to approximate the exact solution using this highest-fidelity model $y_{M\infty}$, but the procedure depends on whether or not the interaction term $\gamma h^r N^{-s}$ is important. The two cases are addressed here.
Case A: Insignificant Interaction Effects
If the interaction effects are insignificant, then the interaction term can be lumped into the error term, like so:
This is done for all cases. Next, each term is isolated by lumping all other terms into $\varepsilon$, the order of the isolated term is determined, and the process is then repeated for each term.
Starting with the determination of the order of the term $\alpha h^p$, the term $\beta N^{-q}$ can be lumped into the error term:
which, upon substitution into equation (4.32↑) (and dropping the error term), yields:
Next, if this equation is written for two grid resolutions $h_1$ and $h_2$, and these are combined via division, an expression for $p$ can be obtained:
where $y_{M h_i} = y_M(x \mid h = h_i)$ and $y_{M\infty}$ is given by (4.31↑).
The order of the term $\beta N^{-q}$ can be determined next, by lumping the remaining term $\alpha h^p$ into the error term:
which, upon substitution into equation (4.32↑), yields:
Next, if this equation is written for two numbers of environments $N_1$ and $N_2$, and these are combined via division, an expression for $q$ can be obtained:
where $y_{M N_i} = y_M(x \mid N = N_i)$.
In order to determine the values of the exponents $p$ and $q$, simulations must be performed with at least 3 unique values of each numerical parameter. Performing more than 3 simulations helps confirm that the observed convergence behavior is consistent.
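A minimal sketch of Case A is given below: the order with respect to each parameter is estimated separately, here by holding the other parameter at its highest-fidelity value (one possible choice, not prescribed by the text), with synthetic data used to check that the expected orders are recovered.

import math

def order_wrt_h(yM, h1, h2, h_inf, N_inf):
    """Observed order p with respect to h, with N held at N_inf."""
    y_inf = yM(h_inf, N_inf)
    return math.log((yM(h1, N_inf) - y_inf) / (yM(h2, N_inf) - y_inf)) / math.log(h1 / h2)

def order_wrt_N(yM, N1, N2, h_inf, N_inf):
    """Observed order q with respect to N, with h held at h_inf."""
    y_inf = yM(h_inf, N_inf)
    return math.log((yM(h_inf, N1) - y_inf) / (yM(h_inf, N2) - y_inf)) / math.log(N2 / N1)

# Synthetic responses with error 0.3*h**2 + 0.8/N recover p ~ 2 and q ~ 1.
yM = lambda h, N: 5.0 + 0.3 * h**2 + 0.8 / N
print(order_wrt_h(yM, 0.20, 0.10, 0.01, 64))   # ~2
print(order_wrt_N(yM, 2, 4, 0.01, 64))         # ~1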
Case B: Significant Interaction Effects
In the case that interaction effects are significant, the expressions for $p$ and $q$ in equations (4.36↑) and (4.39↑) do not hold, due to the nondistributive properties of the log operator. In this case, principles of regression must be used in order to fit the simulation results to a function with a specified form. The function to be fit is the error function, which has 8 free parameters, $y_{M\infty}$, $\alpha$, $\beta$, $\gamma$, $p$, $q$, $r$, and $s$:
It is the error function $y_M - y_{M\infty}$, not the simulation output $y_M$, that is being regressed.
In order to regress the simulation results onto the specified function, an experimental design should be used to select optimal parameter values for the simulation evaluations; because solution verification occurs near the intended-use regime, simulation evaluations are not cheap, and parameter combinations must be chosen with care. Once the parameter combinations are specified and the simulations are run, the results are regressed onto the function. The apparent orders of $h$ and $N$ are $\min(p, r)$ and $\min(q, s)$, respectively.
One method to determine the orders of convergence p, q, r, and s is to guess their values, then regress the errors yM − yM∞ to the function (4.40↑) after substituting the guessed values. The goodness of fit of the regression can then be assessed using statistical quantities, with the final values of p, q, r, and s being those from the best regression. While an in-depth discussion of goodness of fit and its metrics is given in Section 5.1.4↓, two important statistical quantities used to determine the orders of convergence are given here. The first is the R2 coefficient, which measures the correlation between the regressed model predictions and the points on which the model was regressed. An R2 value of 1 means the function matches the regression inputs perfectly; an R2 of 0 or less means that the fit is worse than a constant. Using the p, q, r, and s that result in the maximum R2 value is equivalent to verification method #3 given by Logan and Nitta [101], with the exception that the error function (4.40↑) has multiple numerical parameters. Another statistical quantity that can be used to judge the goodness of fit of a regressed function is the mean squared error (MSE), which describes the average deviation between the response surface approximation and the actual simulation results. This is the same as the procedure described by Eca [38], and is used in Logan and Nitta's methods #6, #7, and #8; Logan and Nitta use the MSE but call it the "least squares error term." For a grid convergence study with D degrees of freedom and P unspecified numerical parameters, the MSE is defined as:
It should be noted that the number of degrees of freedom can be increased by gathering more responses from the system (although these responses should all be relevant to the intended use; see Section 23↓ below), while simultaneously enforcing the assumption that the same presumed functional form, that is, the same values for the parameters in equation (4.54↓), holds for all responses.
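A minimal sketch of this guess-and-regress procedure, assuming the additive power-law error form sketched above, follows; for fixed exponent guesses the model is linear in (yM∞, α, β, γ), so each candidate is fit by ordinary least squares and scored by its R2 and MSE. The data arrays below are synthetic placeholders, not Arches results.

import itertools
import numpy as np

# Hedged sketch: for each candidate set of exponents (p, q, r, s), the assumed
# error model
#     yM ~ yM_inf + alpha*h**p + beta*N**(-q) + gamma*(h**r)*(N**(-s))
# is linear in (yM_inf, alpha, beta, gamma), so it is fit by least squares and
# scored by R2 (maximize) or MSE (minimize).  This illustrates the idea only;
# it is not the exact form of equations (4.40) or (4.54).

def fit_orders(h, N, yM, candidates=(1, 2)):
    best = None
    for p, q, r, s in itertools.product(candidates, repeat=4):
        X = np.column_stack([np.ones_like(h), h**p, N**(-float(q)),
                             (h**r) * N**(-float(s))])
        coef, *_ = np.linalg.lstsq(X, yM, rcond=None)
        resid = yM - X @ coef
        mse = np.mean(resid**2)
        r2 = 1.0 - np.sum(resid**2) / np.sum((yM - yM.mean())**2)
        if best is None or r2 > best[0]:
            best = (r2, mse, (p, q, r, s))
    return best

# Quick synthetic check: data generated from known orders p=1, q=2, r=1, s=1
# (with noise-free data the correct exponents should give the best fit).
h = np.array([0.0014, 0.0020, 0.0016, 0.0018, 0.0016, 0.0018, 0.0014, 0.0020])
N = np.array([10.0, 10.0, 9.0, 9.0, 6.0, 6.0, 3.0, 3.0])
yM = 0.26 + 40.0 * h + 0.5 / N**2 + 5.0 * h / N
r2, mse, orders = fit_orders(h, N, yM)
print(f"R2 = {r2:.4f}, MSE = {mse:.3e}, (p, q, r, s) = {orders}")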
Determination of Interaction Effects
The interpretation of grid convergence results (and, more generally, factorial design results) to determine the significance of interaction effects utilizes concepts that are used in later sections (specifically, in constructing surrogate models). Thus, important concepts and calculation procedures related to interaction effects of variables on system responses are described in Section 5.3.3↓, which also provides detail about the experimental design techniques used to optimize input parameter combinations in order to best analyze these interaction effects. However, a brief explanation of the process is given below, with enough information to interpret the results of the grid convergence study performed for the coal gasification simulation tool (Section 3.3.3↓); emphasis is placed on the results, with a more detailed treatment of interaction effects and experimental design left to Section 5.3.3↓.
Picking a Response
The purpose of solution verification is to quantify the numerical uncertainty in a simulation result. But in a transient simulation solving dozens of variables, what quantities should be used to determine convergence criteria? When yM is a vector, which yM should be used?
Determination of a system response for use in solution verification should follow a simple convention: the variable used for solution verification should be the system response of interest. That is, whatever quantity is being compared to experimental data is the quantity whose order of convergence should be determined. The solution verification is thus driven, ultimately, by the intended use. The reason for this, as covered in a later section (Section 3.3.5↓), is that the numerical uncertainty plays a role in the validation process. In order to complete validation, in which the model prediction yM of a quantity y is compared with data d, it is necessary to have the estimated numerical uncertainty from the solution verification procedure for the quantity yM. This means that computing the numerical uncertainty in, say, zM, an unrelated quantity predicted by the computational model, is not ultimately useful for validation, although it may be useful for better understanding model behavior.
Solution Verification Scenario
The solution verification scenario was similar in all respects to the final gasification simulation cases used for validation, except that the domain was shortened substantially. The cylindrical gasifier had a diameter of d = 0.2 m, and the validation cases used a domain with an axial length of L = 1.2 m; for verification, the domain was shortened so that it was a cube, with the axial length set to L = 0.2 m. Uncertainty in input parameters was determined for the purposes of validation (Section 4.4.1↓), but for verification, the average value of each input parameter was used. The only parameters modified were the two numerical parameters h and N.
The responses used were the same responses used in the final validation analysis: time-averaged concentration profiles of three species, CO, CO2, and H2.
3.3.2 Solution Verification Grid Convergence Design
In order to determine the order of convergence with respect to the two numerical parameters h (grid size) and N (number of DQMOM environments), a 2-factor, 4-level fractional (half) factorial experimental design was used. Basic information about fractional and full factorial experimental designs is given in Section 5.3.5↓ and in [20]. The important aspects of the solution verification experimental design matrix are presented in Tables 3.1↓ and 3.2↓.
In order to deal with each variable having 4 levels, each variable h and N was split up into two variables, hA, hB, NA, and NB, each having two levels. Combined, this yields 4 levels for h and 4 levels for N. Table 3.1↓ shows the combinations of coded values that make up the 4 levels. Next, because a fractional factorial design is being run, the defining contrast for the fractional factorial is defined as:
This gives the design the characteristic of resolution IV, meaning the design can be denoted 2^(4−1)_IV. A full factorial design would consist of 2^4 = 16 design points, yielding the average effect, 4 main effects, 6 two-factor interactions, 4 three-factor interactions, and 1 four-factor interaction. However, the half factorial design with the defining contrast (4.42↑) aliases the four-factor interaction with a constant, and the three-factor interaction effects with the single-factor main effects; for example, the relationship
means that the computed main effect for hA is confounded with the three-way interaction effect between hB, NA, and NB. This means that only the magnitude of the sum of the hA and hBNANB effects can be determined; these two individual effects cannot be determined separately without additional runs. Likewise, the design also aliases two-factor interactions with each other, meaning that the computed interaction effect between hA and hB is confounded with the interaction effect between NA and NB, so that only the magnitude of the sum of the hAhB and NANB effects can be determined. Table 3.2↓ shows the half factorial design, for which only the defining contrast of I = +1 is listed. For more details about the defining contrast, see Section 5.3.5↓ and [20].
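A minimal sketch of the alias structure implied by this defining contrast, obtained by multiplying the defining relation by each effect, is:
\[
I = h_A h_B N_A N_B
\;\;\Longrightarrow\;\;
h_A = h_B N_A N_B,\quad
h_B = h_A N_A N_B,\quad
N_A = h_A h_B N_B,\quad
N_B = h_A h_B N_A,
\]
\[
h_A h_B = N_A N_B,\qquad
h_A N_A = h_B N_B,\qquad
h_A N_B = h_B N_A,
\]
consistent with the confounding pattern described above.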
hA | hB | Meaning
+1 | +1 | h = 0.0014
+1 | -1 | h = 0.0016
-1 | +1 | h = 0.0018
-1 | -1 | h = 0.0020

NA | NB | Meaning
+1 | +1 | N = 10 env
+1 | -1 | N = 9 env
-1 | +1 | N = 6 env
-1 | -1 | N = 3 env

Table 3.1 Coded values and corresponding variable values for the grid convergence analysis experimental design.
Case | h [m] | N | hA | hB | NA | NB | hAhBNANB
A | 0.0014 | 10 | +1 | +1 | +1 | +1 | +1
B | 0.0020 | 10 | -1 | -1 | +1 | +1 | +1
C | 0.0016 | 9 | +1 | -1 | +1 | -1 | +1
D | 0.0018 | 9 | -1 | +1 | +1 | -1 | +1
E | 0.0016 | 6 | +1 | -1 | -1 | +1 | +1
F | 0.0018 | 6 | -1 | +1 | -1 | +1 | +1
G | 0.0014 | 3 | +1 | +1 | -1 | -1 | +1
H | 0.0020 | 3 | -1 | -1 | -1 | -1 | +1

Table 3.2 Coded and uncoded values for the half factorial design matrix for the grid convergence analysis.
A half-factorial design was selected, first of all, because solution verification runs are near the intended use regime, and are therefore expensive. Second, a full factorial was judged to be unnecessary, since a half factorial would still yield information about the importance of the interaction effect between h and N. This interaction effect was necessary to quantify because it determined which of the two approaches above (Sections 20↑ and 21↑) would be used to determine the order of convergence. Furthermore, in the case that the interaction effect was unimportant, the design would yield four sample points for each numerical parameter, more than the minimum three required to determine the orders of convergence p and q.
3.3.3 Significance of Interaction Effect
As mentioned, Sections 5.3.3↓ and 5.3.4↓ cover the procedure for calculating the significance of interaction effects, so only the results are presented here. In order to determine the importance of the interaction between N and h and its impact on the grid convergence error function, the effects of the main parameters N and h were computed and compared to the interaction effect. Main effects much larger than the interaction effects would lead to the order of grid convergence being determined using the procedure described in Section 20↑, while main effects on the same order as the interaction effects would lead to the order of grid convergence being determined using the procedure described in Section 21↑. The grid convergence study used a 2-factor, 4-level, 1/2 fractional factorial design; the design matrix is given above, and the design procedure for this type of experimental design is covered in detail in Section 5.3.5↓. Each variable was split into two 2-level variables, making the design a 4-factor, 2-level, 1/2 fractional factorial design with variables NA, NB, hA, and hB.
To determine the importance of the N × h interaction on the results of the grid convergence, the main effects MNA, MNB, MhA, and MhB had to be calculated, from which the four interaction terms INAhA, INAhB, INBhA, and INBhB were then calculated. To determine the average effects of N and h, these quantities were averaged to yield MN, Mh, and INh.
The main effects can be calculated by finding the difference in the system response at the two levels of the variable of interest, averaged over all variables except the variable of interest; for the variable NA, the system response averaged over all variables except the variable of interest is denoted by yi∙∙∙, defined by equation (6.51↓). The main effect of variable NA can be calculated as follows:
where the letter codings come from Table 3.2↑; the remaining main effects can be calculated as:
The interaction terms are calculated for NA, NB, hA, and hB according to equation (6.56↓), starting with the first interaction term INAhA:
Likewise, the other interaction effects can be determined:
Note that INAhB = INBhA and INAhA = INBhB, due to the aliasing of two-factor interactions mentioned above.
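A minimal sketch of this calculation for the coded design of Table 3.2↑ is given below, using the common convention that an effect is the average response at the +1 level of a (possibly product) coded column minus the average response at the −1 level; the exact expressions are equations (6.51↓) and (6.56↓), and the response vector here is a synthetic placeholder.

import numpy as np

# Hedged sketch: main and interaction effects for the 2^(4-1) design of
# Table 3.2 (cases A..H), using the convention that an effect is the mean
# response at the +1 level minus the mean response at the -1 level of the
# corresponding coded (or product) column.  The response `y` is synthetic.

design = {
    "hA": np.array([+1, -1, +1, -1, +1, -1, +1, -1]),
    "hB": np.array([+1, -1, -1, +1, -1, +1, +1, -1]),
    "NA": np.array([+1, +1, +1, +1, -1, -1, -1, -1]),
    "NB": np.array([+1, +1, -1, -1, +1, +1, -1, -1]),
}

def effect(column, y):
    """Mean of y at the +1 level minus mean of y at the -1 level."""
    return y[column == +1].mean() - y[column == -1].mean()

def analyze(y):
    main = {name: effect(col, y) for name, col in design.items()}
    # Two-way interactions; hA*NA is aliased with hB*NB, and hA*NB with hB*NA.
    inter = {
        "hA*NA": effect(design["hA"] * design["NA"], y),
        "hA*NB": effect(design["hA"] * design["NB"], y),
    }
    M_h = 0.5 * (main["hA"] + main["hB"])            # averaged main effect of h
    M_N = 0.5 * (main["NA"] + main["NB"])            # averaged main effect of N
    I_Nh = 0.5 * (inter["hA*NA"] + inter["hA*NB"])   # averaged N-h interaction
    return M_h, M_N, I_Nh

y = np.random.default_rng(0).normal(size=8)          # stand-in response values
print(analyze(y))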
Finally, in order to determine the effects of the variables of interest, N and h, rather than of the variables used in the factorial design (NA, NB, hA, and hB), the main effects and two-way interaction effects were each averaged: the main effects of NA and NB, defined by equations (4.45↑) and (4.46↑), are averaged to yield the main effect of the variable N; the corresponding main effects of hA and hB are averaged to yield the main effect of the variable h; and the four two-way interaction effects, beginning with equation (4.50↑), are averaged to yield the interaction effect between the variables N and h. The resulting main and interaction effects are shown in Figure 3.4↓.
The results were not surprising: the grid resolution exhibited the strongest effect on the results. The number of environments also exhibited an effect on the response. The interaction between N and h, while small for some responses, was overall of equal importance to the main effects. For this reason, the interaction effects were not ignored; the full form of the error function, equation (4.40↑), was regressed, using the procedures described in Section 21↑.
3.3.4 Solution Verification Grid Convergence Results
From Figure 3.4↑ it is obvious that the error function is covered by Case B (Section 21↑), meaning the interaction between N and h is significant and cannot be ignored when computing the order of convergence. The 8 points from the fractional factorial design must be regressed to equation (4.40↑), following one of the procedures mentioned in Section 21↑. In order to determine the orders of convergence p, q, r, and s, both criteria from Logan and Nitta were used (minimization of the mean square error and maximization of the R2 coefficient). The results of the analysis are shown in Figures 3.5↓ through 3.10↓. Each plot has a fixed value of r and s, indicated on the plot.
The procedure for the analysis was as follows. Values of p, q, r, and s were selected, and the data resulting from the Arches verification simulations (Table 3.2↑) were regressed to equation (4.54↑). Plots were then created of the R2 and mean square error for each combination of values of p, q, r, and s, shown in Figures 3.5↓ through 3.10↓, with the maximum values of R2 and MSE indicated by the solid line, and the mean values of R2 and MSE indicated by the dotted line. The results, presented in Table 3.3↓, were consistent among all responses. They indicate, first of all, a consistent order of convergence of p = 1 with respect to h, and r = 1 with respect to h and its interaction with the number of DQMOM environments N, at all locations. They also indicate that the order of convergence with respect to N is q = 1 at x = 10 cm, and q = 2 at x = 20 cm. However, the value of s, the order of convergence with respect to N and its interaction with the grid size h, exhibits the reverse trend: s = 2 at x = 10 cm and s = 1 at x = 20 cm. This indicates that while the observed (apparent) order of convergence with respect to both h and N is 1, the convergence with respect to N exhibits some second order behavior.
Location | Species | p | q | r | s
x = 10 cm | [CO2] | 1 | 1 | 1 | 2
x = 10 cm | [CO] | 1 | 1 | 1 | 2
x = 10 cm | [H2] | 1 | 1 | 1 | 2
x = 20 cm | [CO2] | 1 | 2 | 1 | 1
x = 20 cm | [CO] | 1 | 2 | 1 | 1
x = 20 cm | [H2] | 1 | 2 | 1 | 1

Table 3.3 Orders of convergence computed as part of the solution verification grid convergence study for the Arches coal gasification model.
The solution verification reveals that the observed order of grid convergence does not match the theoretical order of convergence with respect to h; this is not surprising, however, given the complexities introduced between the code verification case and the solution verification case. These include coal gasification physics, particle tracking, various boundary conditions, variable density, velocity, and pressure, time averaging, multiple responses, derived quantities used as responses (that is, the responses were values tabulated on the independent variables being tracked using scalar transport equations, rather than the independent variables themselves), and so on.
One valuable piece of information still lacking from the solution verification is a numerical uncertainty estimate and an associated level of belief; without these, it is impossible to determine the level of verification. Both can, however, be obtained from the solution verification results using the grid convergence index, and the results of this procedure are described in the next section.
3.3.5 Numerical Uncertainty and Convergence Indices
Numerical uncertainty plays a unique role in the validation process. Just as numerical, or mathematical, truth is separate and distinct from empirical truth, so too is numerical uncertainty separate and distinct from empirical uncertainty. Numerical uncertainty is linked to numerical error; it bounds it. As a result, the sources of numerical uncertainty are no different from the sources of numerical error (Figure 3.1↑). Numerical uncertainty creates an interval that bounds the true numerical error εnumerical in the simulation,
Unumerical : lε ≤ εnumerical ≤ uε,
with some level of belief B in the uncertainty bounds Unumerical, written Unumerical∣B, where the true numerical error is defined as:
and yM denotes the exact mathematical solution to the model equations. (Note that empirical truth and reality play no role whatsoever in the verification process, nor in defining numerical error and numerical uncertainty.)
Most general approaches to establishing a belief or confidence level B in a partially known quantity utilize Student's t-distribution, which is used to estimate the true mean of a population from a small sample; this leads to an estimate of the error. However, this is impractical for estimating the numerical error, since the approach is based on an estimate of the standard deviation, which is zero for simulations run with the same input parameters. Furthermore, the error is not normally distributed: it is a strong function of the numerical parameters. The error is part of a highly nonlinear system, and the tasks of determining bounds on the numerical error at a high level of belief, and even determining the magnitude of the numerical error, both part of solution verification, are nontrivial. Indeed, as Roache says, “a well-founded probability statement of the error estimate, such as a statistician would prefer (e.g., a 2σ limit) is not likely to be forthcoming for practical PDE problems” [141].
Grid Convergence Index
Roache proposed an alternative method for establishing a level of belief in the uncertainty bounds, based on grid convergence studies conducted at several grid resolutions; he called this the grid convergence index (GCI) [125, 141]. While less confident and less precise than the typical 95% confidence level often reported from confidence interval construction, it provides, at least, an estimate of the uncertainty bounds on the numerical error for a given grid size. To begin, the error between solutions on two grids (grid 1, a fine grid, and grid 2, a coarse grid), computed from a grid convergence study, is given by:
where yMhi = yM(x∣h = hi). The solution on the fine grid can be used to estimate the exact solution:
where R = h2 ⁄ h1 is the grid refinement ratio, and h2 > h1. Given that the numerical study contained two parameters, and that an order of convergence for h with respect to its interaction with N was also computed, the exponent of R in the denominator could alternatively be (p + r)/2. However, in order to be more conservative, the value used should be the apparent order with respect to h, that is, min(p, r). Thus, because the results of the solution verification showed p = r = 1, R^p − 1 is used in the denominator. From this, an estimated error for the computation on the fine grid is:
and a normalized error for the computation on the fine grid is:
This estimate of the error leads to a (normalized) GCI, which is the error estimate multiplied by a safety factor Fs:
However, a more useful GCI is one that does not use a normalized error:
This is easier to interpret because it provides a direct estimate of the error in the fine grid solution yMh1. Different definitions make more sense in different situations, but in most cases the nonnormalized GCI (4.61↑) is easier to apply.
The recommended safety factor for a study with 3 or more grids is Fs = 1.25, while the recommended safety factor for a study with only 2 grids is Fs = 3.0. Likewise, based on a similar approximation of the exact solution using the coarse grid solution:
the GCI can be defined for the coarse grid:
The GCI is intended to indicate the value of |ε| that would result in the same E1 for a grid convergence study with p = 2 and a refinement ratio of 2. That is, the GCI is equal to E1 for h2 = 2h1 and p = 2, and GCI = |ε| for the same case if the safety factor is 1.
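A minimal sketch of these definitions, assuming the standard Roache-style forms implied by the preceding discussion (with the difference taken between the coarse- and fine-grid solutions), is:
\[
\epsilon = y_{M,h_2} - y_{M,h_1}, \qquad
E_1 = \frac{\epsilon}{R^{p} - 1}, \qquad
E_1^{*} = \frac{E_1}{y_{M,h_1}},
\]
\[
\mathrm{GCI}^{*}_{\mathrm{fine}} = F_s \left|E_1^{*}\right|, \qquad
\mathrm{GCI}_{\mathrm{fine}} = F_s \left|E_1\right|, \qquad
\mathrm{GCI}_{\mathrm{coarse}} = F_s\,R^{p}\left|E_1\right|.
\]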
The GCI can be used to determine numerical uncertainty bounds in order to supplement the computation of the error function discussed above in Section 21↑. What the GCI provides is a ceiling on this error function, so that:
Thus, Tables 3.4↓, 3.5↓, and 3.6↓ provide estimates of the numerical uncertainty in each simulation prediction with respect to the grid resolution.
Environment Convergence Index
In order to supplement Roache’s grid convergence index to also obtain an estimate of the numerical uncertainty based on the number of DQMOM environments, an environment convergence index (ECI) was created. Following a similar procedure, the estimated error between two computations with different numbers of environments N1 > N2 can be expressed as:
and the solution with the highest number of environments can be used to estimate the exact solution:
where R = N1 ⁄ N2 and N1 > N2. As before, in order to keep the ECI conservative, the power of R is equal to the apparent order of convergence with respect to N, min(q, s). Because the solution verification results showed that the apparent order of convergence with respect to N was always 1, Q = 1. Next, the estimated error for the finely resolved DQMOM computation is:
with the normalized error for the finely resolved computation given by:
This leads to a normalized ECI:
and a (more useful) nonnormalized ECI, given by:
Likewise, the ECI for a smaller N may be computed using a solution with a larger N as:
As with the nonnormalized GCI, this provides an indication of the numerical uncertainty due to the number of environments selected. This quantity is computed for the solution verification results, with the ECI for each response given by Tables 3.7↓, 3.8↓, and 3.9↓.
3.3.6 Convergence Index Results
Interpreting the GCI results reported in Tables 3.4↓, 3.5↓, and 3.6↓, the numerical uncertainty is clearly spatially dependent. The numerical uncertainty closer to the inlet (x = 10 cm) is higher by a factor of 2-5 for each response than the numerical uncertainty at x = 20 cm. While a complete assessment of the experimental uncertainty will be presented in a later section, the numerical uncertainty at x = 20 cm is lower than the experimental uncertainty used in the validation analysis, while the numerical uncertainty at x = 10 cm is comparable to or higher than the experimental uncertainty used. This indicates that the numerical uncertainty in the region near the inlet is likely to be higher than in the downstream region. However, given that the location of the first experimental measurement in the gasifier was at x = 21 cm, the numerical uncertainty due to h is not significant enough to make the level of verification lower than the level of validation.
Grid Resolution | Radial Profile, x = 10 cm | Radial Profile, x = 20 cm
h [m] | [CO2]Mi | εij | GCI | GCI/yMi | [CO2]Mi | εij | GCI | GCI/yMi
0.0020 | 0.283 | 0.079 | 0.086 | 33% | 0.270 | 0.029 | 0.031 | 12%
0.0018 | 0.272 | 0.035 | 0.052 | 20% | 0.266 | 0.012 | 0.018 | 7%
0.0016 | 0.265 | 0.010 | 0.025 | 10% | 0.263 | 0.003 | 0.007 | 3%
0.0014 | 0.262 | - | 0.025 | 10% | 0.262 | - | 0.007 | 3%

Table 3.4 Grid convergence index at each grid resolution for the [CO2] response. The response reported is for the highest value of N available at the given resolution. The reported GCI is GCI(coarse grid) (compared to the h = 0.0014 m grid) for all grids except h = 0.0014 m, and GCI(fine grid) (compared to the h = 0.0016 m grid) for the h = 0.0014 m grid.
Grid Resolution | Radial Profile, x = 10 cm | Radial Profile, x = 20 cm
h [m] | [CO]Mi | εij | GCI | GCI/yMi | [CO]Mi | εij | GCI | GCI/yMi
0.0020 | 0.408 | 0.086 | 0.159 | 36% | 0.434 | 0.027 | 0.051 | 11%
0.0018 | 0.431 | 0.035 | 0.088 | 20% | 0.442 | 0.010 | 0.026 | 6%
0.0016 | 0.442 | 0.010 | 0.044 | 10% | 0.445 | 0.002 | 0.008 | 2%
0.0014 | 0.446 | - | 0.044 | 10% | 0.446 | - | 0.008 | 2%

Table 3.5 Grid convergence index at each grid resolution for the [CO] response. The response reported is for the highest value of N available at the given resolution. The reported GCI is GCI(coarse grid) (compared to the h = 0.0014 m grid) for all grids except h = 0.0014 m, and GCI(fine grid) (compared to the h = 0.0016 m grid) for the h = 0.0014 m grid.
Grid Resolution | Radial Profile, x = 10 cm | Radial Profile, x = 20 cm
h [m] | [H2]Mi | εij | GCI | GCI/yMi | [H2]Mi | εij | GCI | GCI/yMi
0.0020 | 0.0071 | 0.0991 | 0.0032 | 41% | 0.0076 | 0.0359 | 0.0012 | 15%
0.0018 | 0.0075 | 0.0424 | 0.0019 | 24% | 0.0077 | 0.0077 | 0.0006 | 8%
0.0016 | 0.0078 | 0.0108 | 0.0008 | 11% | 0.0078 | 0.0078 | 0.0002 | 2%
0.0014 | 0.0079 | - | 0.0008 | 11% | 0.0079 | - | 0.0002 | 2%

Table 3.6 Grid convergence index at each grid resolution for the [H2] response. The response reported is for the highest value of N available at the given resolution. The reported GCI is GCI(coarse grid) (compared to the h = 0.0014 m grid) for all grids except h = 0.0014 m, and GCI(fine grid) (compared to the h = 0.0016 m grid) for the h = 0.0014 m grid.
The ECI shows results similar to the GCI: the numerical uncertainty due to N is significant only near the inlet, at x = 10 cm, and decreases to a negligible amount at x = 20 cm. As with the GCI, the numerical uncertainty at x = 10 cm is approximately equal to the experimental uncertainty used in the validation analysis, while at x = 20 cm it falls to a small fraction of that uncertainty. For this reason, the numerical uncertainty due to N is not significant enough to make the level of verification lower than the level of validation.
Environments | Radial Profile, x = 10 cm | Radial Profile, x = 20 cm
N | [CO2]Mi | εij | ECI | ECI/yMi | [CO2]Mi | εij | ECI | ECI/yMi
3 | 0.285 | 0.085 | 0.040 | 15% | 0.265 | 0.008 | 0.004 | 1%
6 | 0.270 | 0.029 | 0.024 | 9% | 0.265 | 0.011 | 0.009 | 3%
9 | 0.265 | 0.010 | 0.031 | 12% | 0.263 | 0.003 | 0.009 | 3%
10 | 0.262 | - | 0.031 | 12% | 0.262 | - | 0.009 | 3%

Table 3.7 Environment convergence index at each number of DQMOM environments for the [CO2] response. The response reported is for the highest value of h available for the given value of N. The reported ECI is ECI(small N) (compared to the N = 10 solution) for all N except N = 10, and ECI(large N) (compared to the N = 9 solution) for the N = 10 solution.
Environments | Radial Profile, x = 10 cm | Radial Profile, x = 20 cm
N | [CO]Mi | εij | ECI | ECI/yMi | [CO]Mi | εij | ECI | ECI/yMi
3 | 0.408 | 0.087 | 0.069 | 15% | 0.444 | 0.006 | 0.004 | 1%
6 | 0.433 | 0.029 | 0.041 | 9% | 0.443 | 0.008 | 0.011 | 2%
9 | 0.442 | 0.010 | 0.054 | 12% | 0.445 | 0.002 | 0.011 | 2%
10 | 0.446 | - | 0.054 | 12% | 0.446 | - | 0.011 | 2%

Table 3.8 Environment convergence index at each number of DQMOM environments for the [CO] response. The response reported is for the highest value of h available for the given value of N. The reported ECI is ECI(small N) (compared to the N = 10 solution) for all N except N = 10, and ECI(large N) (compared to the N = 9 solution) for the N = 10 solution.
Environments | Radial Profile, x = 10 cm | Radial Profile, x = 20 cm
N | [H2]Mi | εij | ECI | ECI/yMi | [H2]Mi | εij | ECI | ECI/yMi
3 | 0.0071 | 0.0974 | 0.0014 | 17% | 0.0078 | 0.0072 | 0.0001 | 1%
6 | 0.0076 | 0.0329 | 0.0008 | 10% | 0.0078 | 0.0098 | 0.0002 | 3%
9 | 0.0078 | 0.0108 | 0.0011 | 13% | 0.0078 | 0.0025 | 0.0002 | 3%
10 | 0.0079 | - | 0.0011 | 13% | 0.0079 | - | 0.0002 | 3%

Table 3.9 Environment convergence index at each number of DQMOM environments for the [H2] response. The response reported is for the highest value of h available for the given value of N. The reported ECI is ECI(small N) (compared to the N = 10 solution) for all N except N = 10, and ECI(large N) (compared to the N = 9 solution) for the N = 10 solution.
3.3.7 Numerical Uncertainty and Validation
One topic that has not yet been covered is the role of verification, particularly solution verification and the numerical uncertainty obtained from it, in the validation process. The numerical uncertainty provides an estimate of the numerical error: the amount that the solution to a computational implementation of a mathematical model deviates from the exact solution to that mathematical model. This numerical uncertainty is referred to as a level of verification. This gets at the question, to what degree is a model result truly due to the model, as opposed to numerical error?
A similar question may be posed for empirical uncertainty: how much of an empirical observation is due to the observed quantity itself, as opposed to observational error? These two questions are linked. Just as the amount of numerical uncertainty sets the level of verification, so too does the amount of empirical uncertainty set the level of validation, discussed further in Section 4.4.3↓. In order to perform validation at a certain level, one must also perform verification at a corresponding level. It is thrilling to obtain model results that compare well to high quality experimental data (that is, to have a high level of validation); but if the model has not been verified, or has been poorly verified, the simulation may be solving the governing equations incorrectly, or may suffer from significant numerical bias, making the validation ineffectual.
The last step in the validation process is comparison of the numerical uncertainty to the simulation uncertainty. Any validation procedure should determine valid values (lower and upper bounds) for the simulation input parameters; it should also return an estimate of the corresponding simulation output lower and upper bounds. In the case of the data collaboration method (discussed in Chapter 6↓), the initial input parameter uncertainty bounds are reduced to valid ranges, and simulation output lower and upper bounds corresponding to these input uncertainty bounds are estimated. Once these output “empirical” uncertainty bounds (empirical in the sense that they are valid for, and correspond to, a set of empirical data) are estimated, they should be compared to the numerical uncertainty bounds, and a determination should be made about whether the verification level corresponds to the validation level, or whether further numerical refinement is needed. As already mentioned, if the level of verification is much lower than the level of validation, no conclusions can be drawn about the validity of the model.
3.4 Conclusions
The material covered in this chapter began with an explanation of how the error vs. uncertainty discussion in Section 1.4.4↑ applies to verification, through the distinction between numerical error and numerical uncertainty. The distinguishing feature of verification activities is that they deal entirely with rational, or mathematical, truth; two verification activities, code verification and solution verification, are intended to quantify numerical error and numerical uncertainty, respectively.
Before discussing code or solution verification, various sources of numerical error, and a system for thinking about and categorizing them, were presented in Section 3.1.3↑, and a novel error taxonomy useful for thinking fundamentally about error and its sources was presented (Figure 3.1↑). This was a substantial improvement over existing error taxonomy approaches due to its emphasis on tying each type of error to its associated step in the process of implementing and solving a discrete computational version of a mathematical model.
Two parts of verification were covered: code verification (Section 3.2↑) and solution verification (Section 3.3↑). Code verification quantifies numerical error and examines the behavior of the numerical error as the grid size is decreased, using a grid convergence study. In order to quantify numerical error, the mathematical model (that is, the principal equations that make up the mathematical model) is solved for simple problems with known mathematical solutions. This can be done using analytical solutions, which are difficult to obtain; manufactured solutions, which lend themselves well to computational fluid dynamics frameworks; or benchmark solutions, high resolution numerical solutions of the same or a similar set of governing equations. The method of manufactured solutions was used for the code verification grid convergence study, and the results demonstrated that the code converged with respect to grid size h with the expected theoretical order of accuracy corresponding to the discrete operator implemented.
Solution verification analyzed the numerical uncertainty, also using a grid convergence study, but for a problem closer to the intended use, and varying two numerical parameters (Section 3.3.1↑). This grid convergence study was run with 4 grid resolutions h and 4 different numbers of DQMOM environments N, with the combinations of each parameter selected according to a half factorial design, for a total of 8 cases. The analysis of the solution verification results revealed a significant interaction between the two numerical parameters (Figure 3.4↑). The resulting orders of convergence p, q, r, and s, as expressed by equation (4.32↑), were computed by guessing values of each and regressing the model solutions yMhi, Ni to the functional form. The selected orders of convergence were those that maximized the R2 coefficient and minimized the mean squared error, equivalent to methods #3 and #7 described by Logan and Nitta [101] and described in Section 21↑. It was found that in the intended use regime the apparent orders p and r, corresponding to h, were both 1, and that q and s were alternately 2 and 1, so that the apparent order with respect to N was always 1 but exhibited some second order behavior (the orders of convergence for each response and each location are reported in Table 3.3↑). The solution verification also provided a numerical uncertainty estimate in the form of the GCI (Section 3.3.5↑). The numerical uncertainty was spatially dependent, but was small enough near the axial location corresponding to the first experimental measurement that the level of verification can be treated as much smaller than the level of validation.
4 VALIDATION FRAMEWORK
The purpose of the experiment is not to verify a proposed theory but to replace a computation from an unquestioned theory by direct measurement. . . . Thus, wind tunnels are used . . . as computing devices . . . to integrate the nonlinear partial differential equations of fluid dynamics.
― John von Neumann
4.1 What Is Validation?
Oberkampf et al. [184] define validation as “the assessment of the accuracy of a computer simulation by comparison to experimental data.” It is a test of whether and how well a computer simulation can reproduce empirical observations. Validation should be preceded by verification (see Chapter 3), as verification ensures that a code is mistake-free and numerically convergent (and otherwise numerically well-behaved). Establishment of a validation metric is an area of active research, but it is a difficult task, not least because of fundamental conceptual mistakes and ambiguity. For this reason, a discussion of concepts underlying validation should precede a specification of the validation approach used.
In order to validate a computer simulation, it is important to first specify what computer simulation is, what it entails, where it fits into scientific methodology, and its relation to experimental data. Misconceptions about simulation often lead to unrealistic expectations or abuse of computer simulations, misinterpretation of results, and validation procedures that are performed incorrectly. To avoid such misconceptions, a clear description of simulation and its role in the scientific method follows.
4.1.1 Simulation as an Extension of Theory
Traditionally, the scientific method has been interpreted as a one-way communication of information: from data to theory. In his Logic of Scientific Discovery, Popper presents his falsifiability view, which is essentially the same process: experiments are performed and data is gathered; hypotheses are generated to explain the data; and as new data is gathered, hypotheses are either falsified (that is, proven false through contradiction of the observed data) and discarded or refined, or they are not falsified because they do not contradict the observed data. This is the process of science most often disseminated to the non-scientific public.
Unfortunately, however, this view of science is overly simplistic, idealistic, and seriously flawed. While there is substantial information transfer from experiments to theory, there is also substantial information transfer from theory to experiments, in order to interpret experimental data. For example, measured quantities often take on different meanings depending on the paradigm of accepted scientific theory (the quantity “mass,” for instance, has differing meanings in the Newtonian and Einsteinian paradigms [96]). Theories can be and are used to run verifiability tests on experimental data [178], or even to throw suspicion on particular experimental instrumentation or techniques [51]. As Roache stated, “every observation is laden with theory” [141].
Simulation has recently elbowed its way onto this complicated scene and is now providing an additional approach to exploring scientific questions. But, like every branch of science before it, simulation is being applied before fundamental questions about how it works have been answered (or, in some cases, posed). What is the relationship between simulation and experiment? Simulation and theory? How does simulation change the way experiments are performed? How does it change the way theories are formulated? When can a simulation result be trusted, and how is that trust established? Can a simulation be right “to a degree”? As stated in Section 1.3↑, at the heart of these questions is a need for an epistemology of simulation.
Simulation is colloquially cited as a “third pillar of science” in online forums, YouTube videos, and class syllabi, and claims that simulation is a third pillar of science have even been made in the scientific literature [184, 25]. With this perspective, simulation is often used in place of experimental data, with interpretations of simulation results often sounding exactly the same as interpretations of physical results. Simulations of systems lacking experimental data are run to gain insight (so-called in silico experiments). Indeed, this approach is tempting, since many features of simulations are shared by experiments: experimental data sets share common traits with extracted simulation results, simulations are seen as virtual experimental facilities, and computational results are often presented in such a way that they bear a striking resemblance to real physical phenomena. Questions of whether the simulation results are reliable and trustworthy are often answered by resolving more scales and spending more computational power to obtain higher resolution solutions.
However, much like the falsifiability view of the scientific method, viewing simulation in this way is overly simplistic, hazardous to the scientific process, and a gross overinflation of simulation's capabilities. In reality, computer simulations are simply an extension of scientific theory. Using a computer simulation, it is possible to explore the implications of a mathematical model in much greater depth than was possible a century ago, but computer simulations are never real; they are merely extensions of scientific theory. Depending on the mathematical model, the computer simulation may fall further from or closer to reality. For example, direct numerical simulation (DNS) is a widely used technique in fluid dynamics that takes a first-principles approach to modeling, and may be considered closer to reality than Reynolds-averaged Navier-Stokes (RANS) models, which use ad hoc closure methods to make solutions more economically feasible. But no simulation ever falls in the realm of reality. While simulations can be used to understand systems for which data are sparse or lacking, simulations should never be thought of as independent of theory; simulations are an extension of theory. It is also absurd to call simulation a mature science [25] when such important observations as “simulation is not independent of theory” are routinely ignored or overlooked by scientists specializing in simulation.
An important question to ask is, what is the source of this misinterpretation of simulation? From what source does the eagerness to treat simulation as surrogate experimental data spring? The answer can be found by examining the purpose of science as a whole. All of science is an attempt to understand the reasons for things that happen. Understanding these reasons can inductively lead to principles that can be utilized to deductively predict the behavior of systems, which in turn allow civilization to design useful things like skyscrapers and satellites, or create useful processes like converting chemical or geological materials into electricity. The process of science culminates in the practical application of scientific principles to make useful predictions about reality, and improve society through these predictions.
Simulation is no exception: one significant goal of simulation is to extend theory and predictivity beyond experimentation, and help understand systems for which little or no experimental data are available. However, as stated above, fundamental questions, routinely deferred by researchers, are the chief roadblocks to simulation’s development into a more mature science. The process of validation is at the heart of each of these questions. This is why validation is so important: it forms the critical step of establishing trust in a model, which must occur between the construction of the model and the use of the model to make predictions.
4.1.2 Validation Metric
In order to perform validation, a metric is required. Validation metrics should have a number of characteristics, a subject discussed in Section 6.1.1↓, but the chief among these should be the use of truth criteria. The point belabored in Section 4.1.1↑ was that simulation cannot and does not represent reality (empirical truth); so in order to establish any trust in a computational model, experimental data must be used. Simulations must match the criteria placed on empirical truth by experiments in order to be considered valid; any appeal to alternative principles or emphasis on alternative metrics should be rejected as a validation criterion. One recurring theme of validation is that, while principles and metrics that appeal to something other than truth criteria (experimental data) may be useful, they should not be considered validation.
One statement supporting the choice of such a metric comes from Ernst Mach. Mach, a proponent of an extreme form of empiricism called phenomenalism, made the following statement in his 1893 book Science of Mechanics:
The function of science, as we take it, is to replace experience. Thus, on the one hand, science must remain in the province of experience, but, on the other, must hasten beyond it, constantly expecting confirmation, constantly expecting the reverse. Where neither confirmation nor refutation is possible, science is not concerned. (p. 586, Science of Mechanics)
Mach’s statement embodies not just a proper philosophy of science, but also a proper philosophy of validation: recognizing, on the one hand, that validation is inherently limited to the experimental data available, but on the other hand, that validation is ultimately aimed at establishing trust in a simulation in order to extend the model beyond experience.
While the use of truth as the sole metric of validation may seem obvious straight off, it is not always treated as such. Models are often rejected straight away on the basis of the assumptions that have gone into the model, without regard to whether the model matches reality or not. Further, models are sometimes accepted regardless of their inability to match data; as an example, in [166], Oberkampf pointed out that, using the validation metric of Coleman et al. [165, 143], a simulation can be validated by “increasing the experimental uncertainty” or by “increasing uncertainty in data used from previous analyses”. He then stated, “as pointed out by Roache and Oberkampf and Trucano, this makes no sense.” Roache, too, made the reverse argument [30]: that validation becomes increasingly difficult as the experimental uncertainty bounds shrink. He advocated adding a tolerance to the experimental uncertainty in such cases to widen the empirical uncertainty bounds and make validation easier to achieve, albeit at a different level of validation. Such perspectives are misguided because the criteria used to judge the models are not truth criteria. Roache's suggestion, in particular, of adding a tolerance is ill-considered; it intentionally throws away (or worse, contaminates) information about reality, and is a complete departure from the activity of validation. Roache's point is well-intentioned: codes can still be useful even if they do not match extremely rigorous truth criteria. However, it is dangerous to alias non-validation activities, such as Roache's tolerance test, with validation activities, namely comparing simulation results to experimental data.
To illustrate this approach, a simple heat transfer problem is considered. If the temperature profile of an object is being measured with a low grade thermocouple at infrequent intervals, it is very easy for a predictive temperature model to be validated, that is, to match what is known about the true temperature, which, in contrast to Oberkampf’s claim, makes perfect sense. As the temperature measurements increase in frequency and precision, it becomes increasingly difficult for the model to match what is known about the true temperature. This is perfectly reasonable, despite Roache’s protests; if the model cannot predict correct values, it should not be trusted. If Roache’s approach of adding a tolerance to the experimental data is taken, this is equivalent to stating: “The thermocouple I am using measures temperature with a certain degree of accuracy; but I will fudge the instrument error, and pretend that I am using a lower grade thermocouple, in order to validate my model.” This approach to validation is disingenuous.
Experimental observations are the king in the chess game of model validation. The importance of experimental observations stems from the fact that they are the only source of quantitative information about empirical truth and about reality. They dictate the cost of validation, and the level of validation that may be achieved. And validation, like chess, results in either a win or a loss, a yes or a no: yes, the model matches the truth criteria, and is therefore validated; or no, the model does not match the truth criteria, and is therefore invalidated. Validation is therefore a binary metric.
4.1.3 Instrumentalism
The use of truth as the sole metric of validation was supported in part by citing a quotation from Mach's Science of Mechanics. Mach's system of scientific philosophy is best characterized as instrumentalism [106, 87]. Instrumentalism holds that theories and models are merely instruments through which scientists interpret empirical observations. Just as simulations are extensions of theory, so too are theories extensions of mathematics, and mathematics an extension of our “rational sense.” Each of these tools may be thought of as a “rational instrument.” (So, too, are experimental instruments extensions of our empirical senses, and therefore “empirical instruments.”) The quality of each of these rational instruments is grounded entirely in its ability to match what is known about the truth: truth criteria, or experimental data. Mach distilled a central precept of instrumentalism into an excellent 1882 lecture to the Imperial Academy of Sciences in Vienna, entitled “The Economic Nature of Physical Inquiry.” He said:
In reality, the [model] always contains less than the fact itself, because it does not reproduce the fact as a whole but only in that aspect of it which is important for us, the rest being intentionally or from necessity omitted. (p. 193, Popular Scientific Lectures, [105])
In other words, to validate a model, data is chosen that reflects some aspect of the system that is interesting or important, because it is only the instrument’s reflection of this aspect of the system that is being validated, that is being made trustworthy.
Instrumentalism can be contrasted with two other dominant paradigms in the philosophy of science, namely realism and empiricism. Realism approaches models and theories as logical systems composed of synthetic statements, based on principles of logic so fundamental that the logical systems cannot be refuted by experimental data; the challenge of realism lies in finding the correct synthetic statements [135, 100, 114, 172]. Another way to express realist views is that reality has an inherent mathematical structure or order, that the universe follows rules, and that there exist physical “laws” that describe the universe. The ability of mathematics to describe the universe is cited as evidence in support of realism. A realist would stipulate that “the principles of logic and mathematics represent the only domain in which certainty is attainable” [172]. On the opposite end of the spectrum from realism is empiricism, which is critical of any system grounded in purely analytical or synthetic statements. Empiricism does not just reject analytical statements in judging a model's ability to match reality: it goes so far as to reject any assumptions underlying a model that are not based on empirical statements. In the empiricist view, the empirical validity of the model output rests on the empirical validity of the model's underlying principles and assumptions.
Instrumentalism, in contrast to both, sees the value of a model not in its empirical grounding, nor in its basis on logical, synthetic principles, but rather in its predictive capability. Naylor, who calls instrumentalism “positive economics,” explains the driving philosophy behind instrumentalism by quoting Milton Friedman; Friedman makes the point that the emphasis on the details of a model's assumptions often makes validation more complex than it should be:
The difficulty in the social sciences of getting new evidence for this class of phenomena and of judging its conformity with the implications of the hypothesis makes it tempting to suppose that other, more readily available, evidence is equally relevant to the validity of the hypothesis - to suppose that hypotheses have not only “implications” but also “assumptions” and that the conformity of these “assumptions” to “reality” is a test of the validity of the hypothesis different from or additional to the test by implications. This widely held view is fundamentally wrong and productive of much mischief. Far from providing an easier means for sifting valid from invalid hypotheses, it only confuses the issue, promotes misunderstanding about the significance of empirical evidence for economic theory, produces a misdirection of much intellectual effort devoted to the development of consensus on tentative hypotheses in positive economics. (Essays in Positive Economics, [54])
In other words, comparison to empirical observation is the primary, and only, test of validity. Friedman is claiming “that it makes no difference whatever to what extent the assumptions falsify reality” [172].
To illustrate the approach of each philosophy, consider an analog clock whose gears have stopped; this clock will tell the correct time of day twice a day. Each philosophy will have a different approach to determining whether this clock is correct:
- The realist would say: “The mechanism of this clock appears to be broken, according to my schematics of the clock gears. Therefore, the clock will always tell the incorrect time; it will be unconditionally wrong as long as the gears are not functioning properly.”
- The empiricist would say: “Clocks are supposed to move their hands, but this clock is not. Therefore, this clock will not tell the correct time; it will be unconditionally wrong as long as it is not exhibiting normal clock behavior.”
- The instrumentalist would say: “We are interested in knowing what time it is. Therefore, we shall compare the reading of the clock to a well-established standard time, and make a judgement about whether the clock is correct. The more correct readings the clock gives, the more confidence may be placed in the clock. Disassembling an invalidated clock reveals information about other clocks, but disassembling a validated clock can reveal information about principles of time.”
Each philosophy has its obvious advantages and disadvantages. However, the advantage of the instrumentalist approach is that it is less presumptive. While this example presents obvious good and bad choices, it is because we already understand clocks and time very well; real world problems are vastly more complex, and the real world equivalents of normal clock behavior, the concept of time, and a well-established standard time are almost never available.
Taking an instrumentalist perspective on validation does not preclude the use of rationalism or empiricism in the various stages of model construction; each philosophy has its uses. For the process of model validation, however, instrumentalism is the only tenable philosophical approach. It provides a consistent perspective, which is of particular importance given the many complexities and difficult scenarios encountered while performing validation.
4.2 What Is Empirical Uncertainty?
Any discussion of validation must also include a discussion of uncertainty. As defined in the introduction (Section 1.4.4↑), uncertainty quantification provides bounds on an error relative to a true value when the true value is unknown or unmeasurable. Solution verification, covered in Section 3.3↑, is an activity that quantifies the numerical uncertainty (bounds on the mathematical or numerical error). Validation, on the other hand, quantifies the uncertainty bounding the empirical error; this is referred to as the empirical uncertainty.
This chapter will first categorize different sources of empirical uncertainty and describe the procedure of validation and uncertainty analysis. Like the error taxonomy of Section 3.1.3↑, this is not intended to be comprehensive; rather, it is intended to provide a cohesive framework for thinking about uncertainty, where it enters into the validation process, and what effect it ultimately has. Then a variety of methods for treating uncertainty mathematically will be reviewed to provide a perspective on the unique mathematical problems posed by validation and some formulations for dealing with these problems. Finally, the “level of belief,” or confidence, associated with the uncertainty interval will be discussed.
4.2.1 Uncertainty Taxonomy
As mentioned, uncertainty quantification and analysis can focus on either empirical uncertainty, which bounds the empirical error when comparing a model to experimental data, or numerical uncertainty, which bounds the numerical error resulting from a comparison of a numerical implementation of a model with the corresponding exact mathematical solution. The focus here is on empirical uncertainty, since the quantification of numerical uncertainty was already covered in the solution verification section (Section 3.3↑). Several of the uncertainties referred to here do not strictly follow the definition of uncertainty given in Section 1.4.4↑; they in fact bound a true value y, rather than an error y − ŷ, but the uncertainty concept is easily extended to such cases.
When analyzing uncertainty in a system, uncertainties can be classified as either input uncertainty or output uncertainty. Input uncertainty refers to uncertainty that feeds into a system, for example through imperfectly known boundary conditions. Empirical input uncertainties can be classified as scenario uncertainties, which include uncertainty in boundary conditions as well as uncertainty in material physical properties. Model input uncertainties, on the other hand, consist of three types:
- Submodel uncertainty: uncertainty in what choice to make for submodel forms or submodel parameters and what effect they will have on the empirical error.
- Numerical parameter uncertainty: uncertainty in what choice to make for numerical parameters such as grid resolution h and what effect they will have on the empirical error.
- Scenario parameter uncertainty: uncertainty in what values to use for boundary conditions and other scenario parameters, and the effect they will have on the empirical error.
Model input uncertainties can be thought of as fundamentally different from empirical input uncertainties: model input uncertainties are uncertainties of choice, rather than uncertainties of imperfect knowledge. This difference alludes to a significant difference in uncertainty analysis of models and uncertainty analysis of experiments, covered shortly. Output uncertainty is a resulting uncertainty in an experimental observation de or a model prediction yMe originating from a number of sources, including input uncertainty propagated through the entire system.
Empirical uncertainty analysis is an attempt to answer the question: “How well does one know what the true observation y is?” Empirical uncertainty expresses a lack of information about the true observation. Model uncertainty analysis attempts to answer the question: “How well does one know what the true model prediction yM is?” It expresses lack of information about the model prediction. The goal of model validation is not to reduce the level of experimental uncertainty; for the purposes of validation, the experimental uncertainty is what it is. The purpose of model validation is to reduce the level of model uncertainty until it matches the truth criteria.
Questions of this form complicate the validation process. Validation answers the question of whether a model can match data. But while validation is a binary measure, it is an uncertain binary measure. This necessitates a probabilistic mathematical treatment of uncertainty.
4.2.2 Mathematical Treatment of Uncertainty
The mathematical treatment of uncertainty has a storied history dating back to Legendre and Gauss
[168]. Approaches can be generally classified in two ways: set based, or probability based. Set theory approaches to uncertainty describe specific sets of events, which may or may not have particular properties. Depending on their properties, they are placed in different sets, based on a logical variable: “this thing has this property,” or “this thing does not have this property.” Further consideration can be given to properties that are graded (ordinal variables), or that have multiple possible unordered values (nominal variables) (see
[15]). This approach has been extended in many fields, and includes such diverse approaches as fuzzy logic
[193], which considers non-discrete (fuzzy) inclusion in sets of “this thing has this property;” interval analysis
[113], which examines the behavior of functions for sets (intervals) of values; and mathematical programming, which involves selecting optimal elements of sets. The data collaboration method, the validation methodology used for validation of the Arches coal gasification model in Chapter
6↓, can be classified as a set-based approach to uncertainty.
Probabilistic approaches to uncertainty describe it from a statistical perspective: given a population of atoms Ai, the probability P(B) of an event B can be described as the number of atoms Ai that conform to, or follow, B, divided by the total number of atoms in the population. This simple idea can be extended and generalized to create probability systems (see, e.g., Jaynes
[84]). There have been many useful extensions of probabilistic approaches to uncertainty, just as with set theory. These include such approaches as stochastic processes and stochastic calculus
[75].
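As a toy illustration of the contrast (the response function, parameter names, and ranges below are hypothetical and are not taken from the gasification model), a set-based analysis propagates intervals and reports bounds, while a probabilistic analysis propagates distributions and reports probabilities:

```python
import numpy as np

# Toy response, monotone in both inputs, so evaluating the interval endpoints
# bounds the output exactly (interval analysis is more involved in general).
def response(e_act, t_wall):
    return np.exp(-e_act / 2.5) * t_wall / 1000.0

e_lo, e_hi = 1.0, 3.0          # hypothetical activation-energy interval
t_lo, t_hi = 1000.0, 1400.0    # hypothetical wall-temperature interval (K)

# Set-based view: inputs are intervals, the output is an interval.
y_bounds = (response(e_hi, t_lo), response(e_lo, t_hi))

# Probabilistic view: inputs are random variables, the output is a distribution.
rng = np.random.default_rng(0)
y_samples = response(rng.uniform(e_lo, e_hi, 10_000),
                     rng.uniform(t_lo, t_hi, 10_000))

print("interval bounds:         ", np.round(y_bounds, 3))
print("95% probability interval:", np.round(np.percentile(y_samples, [2.5, 97.5]), 3))
```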
Some researchers argue that their chosen probability or set approach is superior to other approaches, such as Lindley
[36] with probabilistic methods (Bayesianism), or Klir
[91] with set based methods (fuzzy set theory). However, ideas and methods from both approaches provide valuable ways of thinking about uncertainty. In burgeoning fields such as uncertainty quantification, it is self-defeating to argue over which approaches are “better.” Much like the philosophy of instrumentalism, which takes the high road and bypasses the conflict between the deeply entrenched rationalists and empiricists by using both rationalism and empiricism to achieve the end goal of obtaining validated models, so too should the high road be taken for a mathematical treatment of uncertainty, and the best features of both approaches used to obtain an accurate and useful description of reality.
4.3 Approaches to Validation
A review of various approaches to validation is critical to understanding the issues related to validation and uncertainty quantification. It is, of course, acknowledged that the field of model validation is large and difficult to cover comprehensively (see
[187] and
[141] for two such attempts); that is not the goal here. The focus in this overview is to summarize papers whose conclusions or contributions are important to highlight, or whose approach is novel. Once this is done, it will become clear that while many authors have contributed ideas for validation metrics or introduced new ones, there is often no clear way to reconcile the different approaches. For this reason, a framework from the literature is adopted that leads to a better understanding of the relationship between the various approaches.
The concept of validation of computer simulations appeared in the literature as early as 1967. Naylor et al.
[172] discussed validation (which, during its early development, was also called “verification,” with the two terms often used interchangeably) and presented several validation measures to quantify goodness of fit. Even at this very early stage of validation of computer simulations, it was recognized that empirical observations played a central role in validation: “Although the construction and analysis of a simulation model, the validity of which has not been ascertained by empirical observation, may prove to be of interest for expository or pedagogical purposes (e.g., to illustrate particular simulation techniques), such a model contributes nothing to the understanding of the system being simulated.” Naylor et al. also highlighted the central role of probability theory in the process of validating models.
Much interest and early development in simulation validation emerged from operations research, and in particular military operations research
[27, 43, 153, 83, 82, 126]. Military operations applications share interesting parallels with the systems of interest in the present study: expensive and sparse data; significant bias; large numbers of known and unknown variables
[43]. These studies also raised precisely the same concerns that later arose in discussions of validating engineering simulations. The fact that validation is inextricably linked to a model's intended use was addressed by Hodges
[82]: “the appropriate form of quality assurance for a model depends fundamentally on how the model is used, so any attempt to define a single validation standard and procedure for all models in all uses will surely fail.”
Another field of study that has made significant contributions to the validation literature is nuclear reactor design. This field is particularly concerned, not just with validation, but with degree of validation and predictivity. Griffin
[153] discussed the use of computer simulations to design nuclear reactors and formulated the idea of levels of confidence in validation, rather than attempting to create a single metric for all models: "it is apparent there is no such thing as absolute verification [and validation] of a computer program... Rather than talking about verification [and validation], it would seem more appropriate to talk about level of confidence." This led in part to the adoption of the terminology "level of validation," which is now common in the model validation field.
Several journals also began to adopt guidelines that required attention be given to uncertainty quantification. In 1986, the Journal of Fluids Engineering (JFE) published guidelines for authors to give attention to quantification of numerical uncertainty, but the guidelines did not make any statements regarding empirical uncertainty. In 1987, the American Nuclear Society (ANS) adopted guidelines for validation and quantification of uncertainty. Likewise, NASA researchers also published definitions of verification and validation in the context of aerospace applications
[2, 108]. Adoption of uncertainty quantification guidelines by societies and journals raises the bar for peer-reviewed publications and grants, and helps to institutionalize good practice in their respective fields.
Perhaps the most pithy, if not rigorous, definitions of verification and validation were given by Boehm
[12, 141]: verification is “solving the equations right” and validation is “solving the right equations.”
The decade of 1990-1999 saw a proliferation of validation into many new applications in systems modeled using partial differential equations, particularly in aerospace engineering. The groundwork for validation and uncertainty quantification was laid by researchers investigating verification, quantification of numerical error and numerical uncertainty. Many of these questions and concepts were then extended to model validation. Oberkampf
[121] proposed a framework for thinking about verification and validation of engineering codes, acknowledging the need for separate treatment of numerical errors and modeling errors. The framework applied experimental, numerical, and analytical approaches to all aspects. This framework was further detailed in
[120]. However, as mentioned in Section
3.1.3↑, the work blurs the distinction between the very important activities of verification and validation - that is, the assessment of numerical uncertainty and of empirical uncertainty - as well as the distinction between the concepts of error and uncertainty (Section
1.4.4↑). Karniadakis
[59], in addressing numerical uncertainty, made the same mistake. For example, he refers to modeling error as stemming from lack of knowledge about “the precise constitutive laws and thus the corresponding governing equations.” In fact, this is model uncertainty, as there are no “true” governing equations for any system (Sections
1.4.2↑ and
1.4.3↑). Additionally, he refers to boundary condition error as stemming from the use of an incorrect boundary condition. However, this is an
uncertainty in the scenario, not an error, because there is no “true” boundary condition. Any boundary condition
error would come only from incorrectly-implemented or incorrectly-formulated boundary conditions (see
3.1.3↑), not from using a variety of correctly-implemented boundary conditions.
Marvin
[108] made important progress toward establishing (or, re-establishing) the definitions of verification and validation in the field of engineering simulations such that they correspond more closely to those in the field of operations research (the same meaning they now carry). Marvin referred to two important aspects of comparisons between simulations and experimental results: numerical and physical. He labeled the numerical aspects verification and the physical aspects validation. Marvin stated that "the accuracy of a computation depends on two principal considerations: 1) the physical realism of the governing equations and boundary conditions [validation] and 2) the accuracy of the numerical solution of these equations [verification]." In addition, he made the important observation that numerical accuracy could be evaluated in the absence of experiments, but validation could not. Furthermore, in a very important step forward, Marvin recognized the importance of
validation experiments, which are experiments intended primarily for use in validating models. Because the interests of experimentalists and modelers diverge so much, most legacy experimental data cannot be used for validation, as the systems are not well-characterized (e.g., measurements of system input variables or boundary conditions do not have the level of accuracy or detail that is needed). He also recognized the potential for the internet to be used to create large repositories of experimental data to help facilitate validation.
Coleman et al.
[29] proposed a validation methodology that was an excellent example of a synergistic approach: the validation metric incorporated both numerical and empirical uncertainties, both measurement and simulation errors, and confounded all of them into a single uncertainty quantity, which the authors called the error,
E, which was the difference between the data measurement
D and the simulation response
S,
E = D − S. Various forms of uncertainty were accounted for, and validation was the process of reducing the error
E below the value of each of these uncertainties: in short, getting the simulation predictions to fall within the error bounds of the data measurements.
Kleijnen
[79, 81, 80] advocates the use of mathematical statistics to compare simulation results to experimental observations, and discusses application to both transient and steady-state computations. His approach is centered much more on statistics, and the applications emphasized are in the field of operations research; however, he proposes many unique and interesting approaches to validation and related questions. Additional statistical approaches to validation utilized concepts from design of experiments and applied them to the design of "computer experiments"
[86, 85, 182, 137].
Roache
[125, 141] combined much of the existing literature on verification and validation and synthesized it in a single cohesive way.
One notable paper dealing with verification and validation of simulation models is by Oreskes et al.
[124] in
Science. The paper muddles many concepts in verification and validation (for example, by confusing the terms "verification" and "validation," using non-technical dictionary definitions for the terms, and ignoring definitions established in the literature). The paper also comes to a somewhat absurd conclusion, that "any scientist who is asked to use a model to verify or validate a predetermined result should be suspicious" (even if the predetermined result is an analytical solution or a set of experimental observations; this conclusion is untenable). Although the paper helped to bring verification and validation into the public eye, it was not the best paper to have done so. Further criticism of
[124] may be found in Roache
[141].
The years after 2000 saw a great proliferation of validation throughout the literature, with significant steps taken toward establishing validation as a more legitimate and more mature science. Large leaps forward in computational power led to the rapid rise of simulation as a standard methodology for modeling, and correspondingly there was a surge in papers dealing not just with validation issues but with the issues of validating large and expensive computer models, including surrogate modeling
[78, 45, 152, 76], computer experiment design
[77], optimization
[112], and efficient exploration of sample space
[66, 65]. There was also a greatly increased emphasis on uncertainty in the 2000s
[58, 102, 132, 50, 67, 47, 9, 140], including many applications of new and existing mathematical methods to deal with uncertainty
[35, 73, 91, 31].
Oberkampf published many interesting and useful papers that covered a wide range of topics related to validation and uncertainty quantification. These topics included validation experiments
[122, 186], comprehensive coverage of the validation field and its associated terminology
[173, 184], approaches for representation of uncertainty
[74, 67, 65], and proposed validation metrics
[186, 183, 185].
Coleman et al.
[165] also continued to develop their approach described above, with lively discussion; Coleman and Stern incorporated many of these ideas about model validation into Chapter 7 of an excellent reference on experiments and experimental uncertainty
[31].
Another interesting approach to model validation grew out of the process control, mathematical programming, and optimization community. This approach originates from the need for a control system to be robust, that is, to remain stable for all possible values of a number of variables, each within a given range. Monte Carlo methods are not powerful enough to determine extremes (or worst-case scenarios) of parameter combinations, so it is of interest to be able to compute lower and upper bounds of a system response given lower and upper bounds of input parameters.
These ideas were developed into the data collaboration approach and applied to problems such as the GRI Mechanism
[147]. This approach synthesized many ideas, such as surrogate modeling, set based treatment of uncertainty, and the need for quantitative measures of model validation, while utilizing ideas from fields largely unexposed to the validation and uncertainty quantification community (e.g., robust control theory). An overview of this approach is given in
[147], while an in-depth treatment is given in Feeley’s thesis
[148]. This approach is discussed in greater detail and applied to the problem of coal gasification in Chapter
6↓.
4.3.4 The Need for a Framework
Nearly every method discussed above has the weakness of providing only a piece of the validation process. However, this is not the fault of those introducing the methods, and it does not imply that they did not perform validation correctly. To be fair, validation is extremely dependent on the problem, the experimental data, and the simulation tool being validated: each case must proceed differently. Thus, each approach presented in the literature must of necessity omit details and focus only on the piece of the overall validation process that is of interest.
There is, however, an evident lack of validation frameworks in the literature; that is, approaches that comprehensively cover everything from the often crucial first step of selecting variables, through the important intermediate steps, to the last step of deciding what to do once the simulation is validated. While this is, as mentioned, entirely dependent on the problem, the experimental data, and the simulation tool, frameworks must be flexible by design in order to be broadly useful.
Additionally, a framework provides a way of synthesizing a system in which various approaches and ideas can be combined; for example, the metamodeling approach to validation of Kleijnen
[76] with those of Oberkampf and Barone
[183] or Coleman
[29].
One such framework was presented by Bayarri et al.
[104]. This framework, which will be referred to as the NISS framework, was intended for expensive computational models, but less expensive than the Arches computer model (Section
2.6↑). Many of the ideas apply to a very expensive model like Arches, as well as to very cheap models that take on the order of seconds to run.
The NISS framework consists of six steps:
-
Specification of model input parameters and creation of input/uncertainty map
-
Determination of evaluation criteria
-
Data collection and design of experiments
-
Approximation of computer model output using metamodel
-
Analysis of model output and comparison to experimental data
-
Feedback and feed forward of information to present and future validation activities
A detailed description of each step will be omitted and left to the Bayarri paper. However, the application of each step to the intended problem, coal gasification, and the Arches model is presented in the section that follows.
4.4 Application of NISS Framework to Coal Gasification
4.4.1 Step 1: Creation of Input Uncertainty Map
Following the framework of Bayarri et al.
[104], the first step of validation is to generate an input/uncertainty map, which lists all parameters that have the potential to affect the model, along with their associated ranges of uncertainty. This combines modeling, scenario, and numerical parameters into a single list. Because of the importance of this step, as well as the difficulty of making the right selections, it is best performed with a group of experts, both experimental and modeling. It should also draw on any prior studies of the physical phenomena of interest, so that as much information as possible informs the selection of potentially active parameters. The map is also revisited when validation studies have been completed, in order to incorporate the additional information those studies provide.
In order to construct an input uncertainty map, which is a listing of all potentially active parameters in a system ranked in order of anticipated importance, several relevant gasification studies were consulted
[156, 162, 161, 89, 117, 118, 23, 22, 158]. Smith
[156] provided an extremely useful digest of the results of his sensitivity studies of the RANS coal combustion code PCGC. His conclusions regarding gasification included:
-
Parameter coupling played a strong role in coal particle burnout
-
Recirculation and devolatilization strongly affected local gas temperatures
-
Coal gas mixture fraction was significantly affected by devolatilization, and was also affected by recirculation and by strong multiparameter coupling
In addition, Smith offered the following recommendations:
-
Future modeling efforts should focus on particle devolatilization and oxidation mechanisms
-
Sensitivity and other future studies should focus on furnaces at industrial or industrially relevant scales.
Many of the conclusions and recommendations of Smith were incorporated into the formulation of the present validation study.
Additionally, several papers from the group operating the BYU gasifier whose data was being used for validation were analyzed to obtain information about experimental uncertainties, measurement techniques, and specific conclusions about the gasifier. For example, Brown provided several conclusions about the effect of coal types on the experimental results; Soelberg
[161] reported several uncertainties for quantities of interest; Nichols
[89] and Sowa
[162] provided detailed information about the gasifier facility, probes, injectors, sampling methods, and procedures; and Sowa
[162] provided a very detailed experimental error analysis, including both experimental verification (quantifying and reducing instrument bias error) and repeat gasification experiments to provide greatly improved estimates of variance and experimental error bounds. Each of these references was used to better understand the gasifier, some of the issues associated with its operation, and potentially important scenario parameters.
After these conclusions were reviewed, an initial list of important parameters was created, and a roundtable discussion was held with experimentalists and modelers to determine which parameters would be most useful to investigate for the validation. The following variables were decided upon as the primary variables of interest:
-
Kobayashi devolatilization model activation energy E2
-
Kobayashi devolatilization model Arrhenius factor A2
-
Mass-mean particle diameter dp
-
Gasifier wall temperature Twall
It was anticipated that these variables would have the largest effect on the flow.
In addition, this study utilized a “sequential experimentation” technique
[142], in which the effects of many parameters on the system response were first investigated using a low-order statistical model (a screening study, discussed further in Section
5.3.6↓), and the effects of the most important parameters were then modeled using progressively higher-order statistical models. This technique was used to determine the functional form of the system responses, as well as to provide justification for the response surfaces constructed for the simulation model (see Section
5.4↓). Because the screening step allowed investigation of up to seven parameters, the following parameters were also included as potentially important:
-
Coal feed rate ṁcoal
-
Char-CO2 oxidation reaction activation energy Echar − CO2
All of the above parameters were investigated using the sequential technique just described. The results of this sequential assembly process are reported in Section
5.4↓. The mean value of each variable, along with the uncertainty range explored in the screening design, is given in Table
4.1↓.
Variable | Units | Mean Value | Uncertainty | Uncertainty Range | References
E2 | J/kmol | 2.0 × 10^8 | 50% | 1.0 × 10^8 ≤ E2 ≤ 3.0 × 10^8 | [93, 177]
A2* | 1/s | 1.0 × 10^11 | 27%* | 1.0 × 10^8 ≤ A2 ≤ 1.0 × 10^14 | [93, 177]
dp | μm | 37 | 10% | 33.3 ≤ dp ≤ 40.7 | [158, 160]
Twall | K | 1200 | 16% | 1000 ≤ Twall ≤ 1400 | [161, 158]
ṁcoal | kg/hr | 22.1 | 10% | 19.9 ≤ ṁcoal ≤ 24.3 | [161, 23, 162]
Echar − CO2 | J/kmol | 9.3 × 10^7 | 60% | 3.7 × 10^7 ≤ Echar − CO2 ≤ 1.5 × 10^8 | [159]
Table 4.1 Model input/uncertainty map: mean values and their associated prior uncertainties (* = log scale).
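For bookkeeping purposes, the input/uncertainty map can be stored as a small data structure and mapped to the coded −1/+1 units commonly used in two-level screening designs. The sketch below is illustrative: the parameter names and bounds follow Table 4.1, while the helper functions are hypothetical.

```python
# Input/uncertainty map from Table 4.1: parameter name -> (low, high) bounds.
input_uncertainty_map = {
    "E2":        (1.0e8, 3.0e8),     # J/kmol
    "A2":        (1.0e8, 1.0e14),    # 1/s (range explored on a log scale)
    "dp":        (33.3, 40.7),       # micrometers
    "Twall":     (1000.0, 1400.0),   # K
    "mdot_coal": (19.9, 24.3),       # kg/hr
    "Echar_CO2": (3.7e7, 1.5e8),     # J/kmol
}

def to_coded(name, value):
    """Map a value in natural units onto the coded interval [-1, +1]."""
    lo, hi = input_uncertainty_map[name]
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def to_natural(name, coded):
    """Map a coded level in [-1, +1] back to natural units."""
    lo, hi = input_uncertainty_map[name]
    return lo + 0.5 * (coded + 1.0) * (hi - lo)

print(to_coded("Twall", 1200.0))   # 0.0 -> the mean wall temperature
print(to_natural("E2", +1.0))      # 3.0e8 -> the upper bound on E2
```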
4.4.2 Step 2: Determination of Evaluation Criteria
The evaluation criteria provide the bookends for validation: where does one begin, and where does one end? They address the question of what the validation is intended to accomplish. All validation activities have a common goal: make the model match the data. Thus, to determine the evaluation criteria, one must first define the system response of interest: the quantity that the model is expected to reproduce. In models of complex systems with many inputs and outputs, particular inputs and particular outputs will be of interest, since a model cannot be all things for all purposes; there will always be an intended use for the model. This intended use dictates which data are important for the validation and which data are irrelevant.
Applying the instrumentalist approach to validation, the determination of evaluation criteria is straightforward. The evaluation criteria must be comparison to the information known about empirical truth: the data, all the data, and nothing but the data. By “data” is meant experimental observations of the system of interest relevant to the intended use of the simulation tool.
If data are not used as the evaluation criteria, what else may be used? Nothing else will do. If data are not related to the intended use of the simulation tool, those data should not be used in the validation. However, although it is easy to settle on the type of evaluation criteria, the data can be very different for each of two classes of experiments: traditional experiments and validation experiments.
Traditional Experiments
Traditional experiments are experiments run independent of any modeling activity, and they are run in order to accomplish a variety of goals, including improvement of understanding of a physical process, construction or improvement of mathematical model parameters (e.g., transport properties), and quality or safety tests of systems
[122]. When these types of experiments are run, there is no input from modelers about what inputs are important, so these quantities are poorly quantified, if they are quantified at all. There is also no determination of what system responses would be most useful for a computer model to predict, so the system response that is measured is typically useful only for the particular goal of the particular experimental campaign being run. In addition, data is typically reported in journal articles, where length limitations prevent reporting of detailed information about the experimental setup or the quantitative results. Thus, modelers must resort to using tricks with rulers or magnifying glasses to convert qualitative plots into quantitative data. They are also forced to make gross assumptions about scenario parameters, which often turn out to be the input parameters of principal importance.
Validation Experiments
In order to overcome the difficulties associated with validation using traditional experimental data, a new type of experiment, called a validation experiment, was proposed
[109, 108, 122, 183]. These experiments are designed by both experimentalists and modelers with the primary goal of validating a computer model. All inputs to the computer model are determined so that they can be quantified as part of the experimental measurements. This type of experiment has the potential to greatly improve agreement between models and experimental data through better characterization of input values and associated uncertainties. Without such characterization, model responses may vary wildly due, for example, to assumptions about scenario parameter values and uncertainties. With good quantification of computer model inputs, there can be much more confidence in attributing disagreement between the model and experimental data to deficiencies in the model, rather than incorrect parameter values.
Obviously, there is a much greater preference to use validation experimental data over traditional experimental data. These types of experiments are not always possible due to various financial, institutional, and personnel challenges. It would be very useful (both for validation and for improved interpretation and understanding of results) if characterization of scenario parameters, such as boundary and initial conditions, and their uncertainties, received greater attention. Online databases and archives are excellent ways to disseminate all relevant quantitative experimental results without the restrictions of a scientific journal’s page limitation, thereby addressing many of the deficiencies of traditional experiments. The impetus for such changes in attitude and approach must come from the community, but should be incorporated into the policies of scientific journals in order to provide motivation to apply such procedures. It also must take place on a management and funding level; Paul Davis
[127] stated that validation experiments (what he calls verification, validation, and accreditation, VV&A) are “very important and [have] long been inadequately funded by any measure. By explicitly budgeting for ’serious’ VV&A, the Department of Defense would create incentives that do not now exist for model developers. Without such incentives, VV&A may improve only marginally, despite the suggestions and exhortations from this and other studies.”
4.4.3 Step 3: Design of Experiments and Data Collection
This step consists of two portions, and it is a step of particular importance when the model, the data, or both are expensive to evaluate. The two parts of this step, experimental activities and modeling activities, will be discussed in turn.
The experimental activities that compose step 3 depend on the type of experiments that are used: validation experiments, or traditional experiments. If validation experiments are used, then the experimental campaign can be designed to support the model validation activities (see discussion of validation experiments above, Section
4.4.2↑). Validation experiments should, first of all, report a range of uncertainty in the observed system responses, as part of an “uncertainty budget” that accounts for how much uncertainty there is in each experimental observation, and determines the sources of that uncertainty (for example, instrument error or input variable uncertainty that is propagated through the system and impacts the response). Additional useful experimental activities involve exploring various regions of parameter space. This allows for a greater level of validation, as the model must be more robust (match experimental data in a larger regime of parameter space).
Traditional experiments will sometimes report uncertainty, although rarely with an accompanying analysis indicating the sources and magnitudes of error. While they will almost always explore a range of parameter space, the modeler has no input on the experiment design process, so the range of parameter space that is explored may not be of interest for the simulation effort. Additionally, quantities critical for the model, such as boundary or initial conditions, are often not reported, or little effort is expended to control them. (This is understandable, since these quantities are often not of great interest to the experimentalist.) Care must be taken when selecting a traditional-experiment data set.
The second piece of step 3 is the modeling piece: design of experiments and data collection for the computer simulation. The simulation data collection usually begins with selecting a set of parameter combinations at which to sample the simulation, using a space-filling design (e.g., Latin Hypercube) to cover a wide range of parameter space. This is followed by supplemental simulations to explore more local and more interesting regions of parameter space. The particular set of parameter combinations that are sampled depends on the assumed functional form of the response, which may be determined from the results of the space-filling design (see Chapter
5↓ for a broader overview of experimental design). If the simulation is extremely cheap to sample, a space-filling Monte Carlo method may be adopted, where
only the space-filling step is performed. On the other extreme, if the simulation is extremely expensive to sample, the space-filling step is skipped; a functional form is assumed and an experimental design for the simulation samples is selected based on the assumed functional form. More details on the design points for the simulation are provided in Section
5.3.2↓.
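A brief sketch of the space-filling step is given below, assuming SciPy's quasi-Monte Carlo module is available; the bounds are those of four factors from Table 4.1, and the number of runs is arbitrary.

```python
from scipy.stats import qmc

# Parameter hypercube for four factors from Table 4.1: E2, dp, Twall, mdot_coal.
l_bounds = [1.0e8, 33.3, 1000.0, 19.9]
u_bounds = [3.0e8, 40.7, 1400.0, 24.3]

# Space-filling step: a Latin hypercube design over the parameter hypercube.
sampler = qmc.LatinHypercube(d=4, seed=1)
unit_design = sampler.random(n=20)                   # 20 runs in [0, 1]^4
design = qmc.scale(unit_design, l_bounds, u_bounds)  # rescale to natural units

# Each row is one run: a parameter combination at which to sample the simulator.
for run in design[:3]:
    print([round(float(v), 2) for v in run])
```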
Description of BYU Gasifier Data and Uncertainty
The BYU gasifier data used to validate the Arches gasification model come from Brown
[22, 139, 23] and are data from a traditional experiment. The data were originally gathered to investigate the effect of coal type on gasification. The data consist of time-averaged radial profiles of three species, CO,
CO2, and
H2, at 7 axial locations: 0.21 m, 0.36 m, 0.51 m, 0.67 m, 0.81 m, 1.21 m, and 1.73 m. A separate study, with different operating conditions, reported carbon conversions and effluent concentrations of CO and
H2, but no radial concentration profiles were reported.
Sowa
[162] discusses sources of experimental uncertainty in the BYU gasifier. Sowa performed several types of measurements: first, experimental verification experiments to quantify instrument error and identify systematic (bias) error, for which he reported standard deviations. These measurements included species mole fractions, carbon conversion, feed mass flowrates, and solids composition measurements. Sowa also performed computations using a Monte Carlo error propagation technique to estimate the propagation of input uncertainty and its effect on the system responses. Finally, Sowa investigated the amount of uncertainty in the actual gasification experiments by repeating measurements over the course of the same and different experiments. He reports the pairwise differences for carbon conversion and CO concentration for a subset of these measurements.
Sowa was primarily investigating the effect of the injector, and the pairwise differences exhibit a sensitivity to the injector design. There were, however, additional factors that were known sources of uncertainty for these pairwise differences, including the carbon conversion measurement location (near-wall vs. near-centerline), sample volume (too large a sample volume would capture gas with sharp gradients that was not yet well-mixed), difficulty with measuring and adjusting the coal feed rate, difficulty with reproducing coal feed rate conditions across experiments, and difficulty diagnosing and correcting sampling bias. Sowa did not, however, specify which of these sources of uncertainty corresponded to which specific measurements.
After analyzing the pairwise differences from the experimental repeats, Sowa created an uncertainty budget, comparing them with the measurement uncertainties and the computational estimates of input uncertainty propagation. The measurement uncertainties and propagated input uncertainties were expected to balance the observed experimental error, but Sowa found they did not, and he concluded that there were remaining sources of uncertainty for which he had not accounted. He listed several experimental uncertainties beyond the control of the experimenters, several of which overlap with those mentioned above. These included lack of control over the coal feed rate, lack of knowledge of the effect of probe disturbances on the flow field, problems correcting for gas sampling bias, and non-steady-state conditions in the reactor. The first uncertainty listed was addressed by Brown
[23], who stated that one type of coal had inconsistent moisture content due to being pulverized far in advance of the experiments. The last uncertainty listed was addressed by Whitty
[90], who showed that the time to reach steady state in a fluidized bed gasifier as determined by temperature and composition measurements was much different from the time to reach steady state as determined by bed carbon content, and that this was an important factor impacting the state of the gasifier. The implication is that the reactor may have appeared to be at steady state when looking at one variable, but not while looking at another.
Sowa also performed an interesting comparison of various instrument models - that is, the models converting the experimentally observed quantity measured by the instrument into something more physically useful (in this case, carbon conversion). Significant differences were observed in all but one model comparison, meaning that all but two of the models disagreed with each other.
Brown
[23] showed that there was a 7% sensitivity to a coal feed rate range of
20.8 ≤ ṁ ≤ 27.3 kg/hr. Sowa
[162] also reported uncertainties for several coal feed rates ranging from 27 to 34
kg/hr, with uncertainties ranging from 7% for the highest feed rate to 20% for the lowest feed rate. Additionally, Sowa provided standard deviations for reactor pressures and
O2-coal ratios, all determined from repeat tests.
Uncertainties in the system response measurements (CO,
CO2, and
H2) were also quantified and reported
[89, 161].
4.4.4 Step 4: Surrogate Models
The fourth step in the validation framework is to construct a cheaper and simpler surrogate model for the more complex model. This activity, sometimes called metamodeling, is one of the most critical steps in the validation procedure. An enormous concentration of resources and effort is spent developing and running large scale and expensive models like the Arches coal gasification model. The surrogate model distills the results of these thousands of CPU hours into a simple polynomial that approximates the output of the more complex model. However, this activity is fraught with problems. Trying to represent the output of an enormously complex nonlinear model using a model as simple as a quadratic polynomial is difficult to do, and even more difficult to justify.
These difficulties make a statistical analysis imperative for surrogate model design: it provides justification for the selected response surface, it indicates the variables of chief importance, and it makes analysis of the model results tractable. For this reason, an extensive treatment of surrogate models is given in Chapter
5↓.
4.4.5 Step 5: Analysis of Model Results
Much like step four of the framework, the fifth step is very important. The surrogate model generated in step four can be used in a number of different validation procedures, some of which were covered in Section
4.3↑. However, the approach adopted is the data collaboration (DC) method. This uses a set-based treatment of uncertainty, and uses mathematical programming (optimization) techniques to address several questions relevant to model validation. In addition to addressing the question of whether a model is validated, the DC method also attempts to provide information about where additional runs should sample the input parameter space, provide a means for comparing models objectively, and provide an uncertainty bounds on simulation results.
As with step four, the significance of step five is such that Chapter
6↓ covers it exclusively.
4.4.6 Step 6: Feedback and Feed Forward
The last step in the validation procedure is not intended to be the last step. With each preceding step, more information about the model is obtained. Even after a model has been validated using a set of data, improvement of the model continues; the model is validated against other data; weaknesses of the existing model are uncovered. This can also be extended to multiscale and hierarchical approaches to multiphysics problems
[163]. In these systems, validation performed at each scale can either provide information for new validation activities at the same hierarchical level, or the information can be transferred among scales in the hierarchy (either up or down scales). For example, validation at low levels in the hierarchy may provide initial parameter sets for validation activities at higher scales; likewise, validation activities at higher scales can provide indication as to which submodels are controlling and need to be improved.
4.5 Conclusions
The approach to validation presented here began with a general discussion about computer simulation. The question of whether simulation is a third branch of science that has joined experimentation and theory as a “new” method was definitively answered in the negative. Viewing simulation as an extension of theory is an important perspective for validation and for deciding on appropriate validation metrics. The choice of a validation metric was discussed, and applying the instrumentalist philosophy of validation, the choice is clear: simulations must be validated using experimental data, and only experimental data. The role of rationalism and empiricism in the development of models in computational fluid dynamics and other fields is very important; but for model validation activities, they must not play a role. For model validation, the only appropriate validation metric is agreement with experimental data.
Empirical uncertainty was then defined and discussed. Various approaches to treating empirical uncertainty in the context of validation were covered, from early literature on validation of computer simulations dating from the infancy of computers
[172] to the plethora of recent papers on the subject, as the field has advanced rapidly to keep up with the pace of computer hardware and the growing power of computer simulations. However, it can be difficult to get a handle on the entire field, primarily because many disjointed approaches seem to be trying to accomplish the same thing, or borrowing the same ideas, but speaking different languages. To help rectify these difficulties, a six-step validation framework proposed by Bayarri
[104] was adopted as a way of systematically approaching validation and utilizing the many approaches to different aspects of validation that are available in the literature.
The initial steps of this framework (Steps 1-3) were applied to coal gasification to determine the active input variables and their ranges of uncertainty, as well as to assess the uncertainty in the experimental data being used. All of this information feeds into the later steps of the framework, presented in later chapters. Step 4 is covered in Chapter
5↓, while Step 5 is covered in Chapter
6↓.
5 SURROGATE MODELS FOR SIMULATIONS
With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.
― John von Neumann
5.1 Surrogate Models: Filling a Need
The term “black box” has become widely used in statistical modeling and analysis. The understanding of the black box model, from the statistical perspective, does not come from understanding the governing equations or the mathematical model, but rather from sampling the black box model for various combinations of input parameters, then using statistical procedures to understand the output better and, often, to construct a surrogate model to approximate the system output, with which to make predictions about the system. These methods work very well for simple models, which can usually be sampled using a “brute force” technique like Monte Carlo. However, more care must be taken for sampling and approximating expensive black box models.
Expensive computer models for scientific computing consist of underlying mathematical models, sometimes simple but more often complex, which are solved discretely for a large number of discrete elements. The solution process obfuscates the effects of input parameters on system responses, and there are often large numbers of input parameters. It also obfuscates the effects of input uncertainties on system responses - if a given input variable has an associated range of uncertainty, that uncertainty will be propagated through the system in a way that is nonlinear and difficult to predict. These models present a formidable challenge for understanding a physical system using a simulation tool. It is also often infeasible to apply optimization procedures, which are commonly used to determine the effects of input variables and associated uncertainties on system responses. Typically, optimization procedures require hundreds or thousands of function evaluations, and this can be infeasible even for some moderately cheap computer codes (those taking on the order of minutes to run).
In situations such as these, it is useful to have a simpler model, also called a “metamodel”
[45], that is much cheaper to evaluate and much simpler in form, such as a polynomial. These can be constructed based on limited information about the larger scale computer simulation model. Statisticians have developed many such models and methods for constructing them. These methods sample functions using a very small number of samples, or use limited information about a function, in order to maximize the amount of information that can be extracted. Many such techniques fall under the category of “design of experiments.” As the name indicates, many of the original applications consisted of determining the effect of input parameters (or operating conditions) on a system response, with the intention of adjusting the operating conditions to optimize the response. This has uses for industrial applications, in optimizing a process to minimize material waste or maximize profit. However, the application for validation is oriented more toward optimizing the input variables to have a corresponding response that agrees with experimental data.
Box and Wilson
[18] first proposed the response surface methodology (RSM) in 1951. RSM uses polynomial surfaces to represent the response
y as a function of an input variable vector
x. Polynomials are very general and work well for many responses, and as a result RSM has thrived as a widely used modeling technique for complex systems. Other techniques have also emerged, each with their own sets of advantages and disadvantages.
The design methodology used also depends on the cost of running a simulation; the term “expensive” is extremely relative. Some models take only a few seconds to run (for example, integrating a simple ordinary differential equation (ODE) in time), but can be considered “expensive” compared to a polynomial model, or if performed many times. Other models may take up to an hour to perform, making them relatively more expensive, but still allowing for several thousand function evaluations. The simulations discussed in the present work take several days to perform, and are thus greatly restricted in the number of function evaluations that can be obtained. While a general methodology that applies to all ranges of simulations will be presented, the main focus will be on extremely expensive function evaluations.
5.1.1 Terminology
Surrogate model design and assembly utilizes statistical terminology to describe the elements of the process. The analyst is interested in coming to a better understanding of a "true" function
y = f(θ)
where f(⋅) is some process of interest and θ is a vector of variables representing the state of the system. This may be a real process, such as a chemical reactor or the behavior of a group of test subjects, in which case θ is difficult to characterize. Alternatively, f(⋅) may be a computer simulation, in which case θ may still be difficult to characterize, but is more easily quantifiable. The quantity y is referred to as the response of the system f(⋅).
The role of surrogate models is to create a new function
ŷ = g(ξ)
that exhibits a subset of the characteristics of f(⋅), typically approximating the system response, ŷ(ξ) ≈ y(θ), for some set of system states θ ∈ Θ. Surrogate models g are also typically intended to be cheaper or less complex than the system of interest f.
In most cases, a subset of variables ξi are chosen such that they overlap with the variables representing the state of the system, ξi = θi. These variables are called factors and are shared between f and g. Each factor is assigned a range of possible values, composing a parameter hypercube Θ = {θi:αi ≤ θi ≤ βi}. For each variable, discrete values in the parameter hypercube are chosen at which to sample f. These values are referred to as levels corresponding to each factor. A given experimental design requires several evaluations of the system response at different parameter value combinations; each of these evaluations is referred to as a run.
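A minimal sketch of this vocabulary, using a two-level full factorial design for three hypothetical factors (each row of the design is one run, and each factor takes one of two coded levels):

```python
from itertools import product

factors = ["E2", "dp", "Twall"]   # three factors (names are illustrative)
levels = (-1, +1)                 # two coded levels per factor

# A full factorial design: every combination of levels, 2^3 = 8 runs.
runs = [dict(zip(factors, combo)) for combo in product(levels, repeat=len(factors))]

print(len(runs))   # 8
print(runs[0])     # {'E2': -1, 'dp': -1, 'Twall': -1}
```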
Depending on the functional form of the surrogate model, and depending on the surrogate model assembly procedure, these terms may take on different meanings. An overview of various classes of surrogate models is presented in the next section; the class of surrogate model that is of primary interest is the response surface class. The use of factors, levels, and the parameter hypercube for response surfaces will be discussed in greater detail below.
5.1.2 Classes of Surrogate Models
Surrogate models (or, metamodels) come in all forms for many different uses. A surrogate model is a model that attempts to duplicate the output of a more expensive model or a more complex system (whose output is denoted y(x)) using a cheap, simple model (whose output is denoted ŷ(x)). A few representative surrogate models will be discussed here, but this is not intended as a comprehensive overview of surrogate models.
Surrogate models contribute to numerical uncertainty, covered in Section
3.3.5↑. The surrogate model will always be inadequate to precisely describe the actual output of the expensive model, and as a result it will introduce some amount of error. However, this error cannot be quantified, except at the points in parameter space where the function is sampled (error analysis is covered in Section
5.1.4↓ and error analysis examples are given in the Section discussing surrogate model construction for the coal gasification model, Section
5.4↓). A significant part of this error analysis is reducing the number of degrees of freedom that the model requires, in order to better estimate the error (see Section
5.1.5↓). The fewer parameters in the model, the more residual degrees of freedom remain and the better the estimate of the error. When the number of parameters equals the number of runs, no residual degrees of freedom remain and no estimate of the surrogate model error can be obtained.
Surrogate models can be extremely simple: for example, regressing a number of data points to a line creates a surrogate model ŷ = ax + b. This is a simple model that approximately duplicates a more complex physical system (whatever system produced the data points). These models are least squares models. Least squares models are posed as a search for an unknown β that will minimize the sum of the squared error of the linear equation:
yj = β0 + β1xj + ϵj,  j = 1, …, n
or, in linear algebra form,
y = Xβ + ϵ.
The general least squares regression solution utilizes the pseudoinverse of X, and is expressed as:
β̂ = X⁺y = (XᵀX)⁻¹Xᵀy.
These linear regression models are very widely used.
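A brief sketch of the pseudoinverse solution applied to synthetic data is given below; the straight-line "system" is invented purely to show the mechanics.

```python
import numpy as np

# Synthetic observations from a noisy line, standing in for an expensive system.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 12)
y = 1.5 + 2.0 * x + 0.05 * rng.standard_normal(x.size)

# Design matrix with an intercept column; the least squares coefficients are
# obtained from the pseudoinverse of X, as in the expression above.
X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.pinv(X) @ y

print(np.round(beta_hat, 2))   # close to [1.5, 2.0]
# Same answer as the dedicated least squares solver:
print(np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0]))
```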
Generalized linear models (GLM) are a unifying (and therefore large) class of surrogate models. GLM covers univariate (scalar
y) and multivariate (vector
y) models; maximum likelihood methods, such as the Newton-Raphson method for computing parameter estimates; Bayesian methods for linear model parameter estimation; and others. GLMs can also be used in concert with analysis of variance (ANOVA) models
[142, 103].
Splines provide an additional way of creating piecewise polynomials to connect knots (sequences of points with known values)
[24]. For
N knots
xi, a sequence of
N − 1 polynomials with matching endpoints (typically cubic, but also linear functions, as well as special polynomials such as Chebyshev polynomials) is constructed, one for each interval
[xi, xi + 1]. Splines have the advantage of being smooth and of being guaranteed to pass through all of the knots. They can be applied to arbitrarily high dimensions, and to problems with a multivariate
y [123]. They are also common features of many scientific software packages such as SciPy and the GNU Scientific Library. Some excellent references for concepts behind splines are
[24, 98]. A reference covering spline algorithms is
[167]. More advanced concepts in splines are covered by
[123].
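A short sketch using SciPy's cubic spline interpolator follows; the knot values are synthetic stand-ins for expensive-model output.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Knots: parameter values at which the expensive function has been evaluated.
x_knots = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y_knots = np.sin(x_knots)          # stand-in for the expensive-model responses

spline = CubicSpline(x_knots, y_knots)

print(np.allclose(spline(x_knots), y_knots))  # passes through every knot
print(float(spline(0.75)))                    # cheap prediction between knots
```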
Response surfaces provide a less robust but more general approach to fitting data with polynomials. Response surface methodology constructs a polynomial surface that minimizes the surrogate modeling error y − ŷ. This can be expressed generally as
y = g(x) + ϵ
where g(x) is a polynomial function. The error term ϵ is unnecessary for the applications of interest, because the computer models being approximated are deterministic and do not exhibit any variation when computer simulations are repeated with the same inputs; thus E(y) = y (where E is the expectation operator).
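A minimal sketch of fitting a full quadratic response surface in two coded factors is shown below; the "simulator" response is a synthetic polynomial, so the fitted coefficients can be checked by inspection.

```python
import numpy as np

# A 3x3 grid of runs in coded units for two factors.
x1, x2 = np.meshgrid([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
x1, x2 = x1.ravel(), x2.ravel()

# Deterministic "simulator" output (synthetic, so the answer is known).
y = 3.0 + 1.2 * x1 - 0.8 * x2 + 0.5 * x1 * x2 + 0.3 * x1**2

# Quadratic response surface: intercept, main effects, interaction, pure quadratics.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(coeffs, 3))   # recovers [3.0, 1.2, -0.8, 0.5, 0.3, 0.0]
```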
Bayesian methods for regression can also be used to find models ŷ. In the normal Bayesian linear model, the response is assumed to follow a normal process (as in ordinary linear regression), so it has the form:
y ∣ β, σ2, X ∼ N(Xβ, σ2I)
where I is the identity matrix. A prior distribution for β and σ2, denoted p(β, σ2∣X), is constructed or presumed, from which a posterior distribution p(β, σ2∣Y) is determined by multiplying the prior by the likelihood function
(6.7↑). This posterior distribution can then be sampled to predict
β, and can also be updated as additional information is gathered. This can be viewed as a more general approach by posing traditional linear regression models
(6.4↑) as a “frequentist” or
a posteriori approach that assumes there is enough information available to determine
β, whereas Bayesian models use an approach that does not make any presumptions about completeness of data, but rather creates a probabilistic framework in which models may be continuously improved using new information. An excellent overview of Bayesian regression is provided by
[57].
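A compact sketch of the conjugate update for the normal linear model is given below, under the simplifying assumption that the noise variance σ2 is known (the general treatment places a prior on σ2 as well); the prior values and data are arbitrary.

```python
import numpy as np

def posterior_beta(X, y, sigma2, prior_mean, prior_cov):
    """Posterior mean and covariance of beta for y ~ N(X beta, sigma2 I)
    with a Gaussian prior beta ~ N(prior_mean, prior_cov) and known sigma2."""
    prior_prec = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / sigma2)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / sigma2)
    return post_mean, post_cov

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 15)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.5 * x + 0.1 * rng.standard_normal(x.size)

mean, cov = posterior_beta(X, y, sigma2=0.01,
                           prior_mean=np.zeros(2), prior_cov=10.0 * np.eye(2))
print(np.round(mean, 2))   # posterior mean near [1.0, 2.5]
```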
Neural networks are another surrogate modeling approach that has emerged recently for applications such as machine learning. The idea behind a neural network originates from the way the human brain works; a neural network is essentially a coarse model replicating the behavior of the human brain. A neuron can be crudely modeled as a transistor with multiple connections in and out: when a signal passes through the neuron, it acts as a gateway, either stopping the signal if it is below some threshold, or passing it through if it is above the threshold. These units are then combined into networks, with multiple layers of parallel neurons and connections between neurons at various layers, to transform input data into output data. The neural network is trained by feeding large amounts of input data with known outputs into the network and iteratively adjusting the weights and thresholds of the neurons until the network correctly predicts the known outputs. Using neural networks to represent black box functions is somewhat precarious, as the neural network is itself a black-box function, but the ultimate judge of the model should be its performance. The biggest disadvantage of neural networks, however, is the need for large amounts of training data. While this is not a problem in situations with huge amounts of data (as in many applications of machine learning), it essentially rules out neural networks as surrogate models for expensive black box simulations.
One does not have to choose between these methods; many can be combined. For example, Bayesian methods can be combined with splines (see Chapter 3 of
[131] or
[60]); Gaussian estimation models can be combined with a response surface methodology
[86]; Bayesian methods can be combined with Gaussian estimation models
[26, 110]; etc.
Simpson et al.
[170] provided a survey of the response surface methodology and compared it with kriging and neural network methods. They concluded, first, that the response surface methodology has a storied history of success in modeling computer simulations and has been used for a wide range of phenomena; for this reason, it is well understood and well established. The authors recommended response surface methodology for deterministic applications with a small number of well-behaved factors. Second, they concluded that kriging was the best choice for highly nonlinear models with a large number of factors (up to 50). Beyond 50 factors, the authors recommend neural networks, despite the high cost of the required data.
5.1.3 Surrogate Model Training
The process of training a surrogate model differs from one family of surrogate models to another. For example, it was mentioned that neural networks require a large amount of training data. Other surrogate models, however, require only small amounts of training data to fit a function (some are even designed to use minimal training data, e.g., Plackett-Burman or other screening designs for linear models
[133]; see Section
5.3.6↓ below). The response surface methodology utilizes statistical theory to optimize sampling points and maximize information obtained about the behavior of the response of the expensive function, part of the field of experimental design
[19, 20, 142, 45, 152]. The underlying principle for surrogate model training, however, is that without proper training there can be little confidence in the model’s performance.
This presents a dilemma: a high-quality surrogate model is absolutely necessary, lest the expense of constructing the surrogate model go to waste; but higher surrogate model quality necessitates higher cost. There is a simple resolution to the dilemma: the available resources determine the level of accuracy of the surrogate model, just as the available resources determine the level of validation that can be achieved for a model. Sections
5.1.4↓ and
5.1.5↓ below discuss ways of ensuring, first, that the selected model has an appropriate level of accuracy, and second, that the information gained about expensive models is used to the maximum degree.
5.1.4 Goodness of Fit
Assessing the goodness of fit of surrogate models is one of the most important tasks in the entire validation process. If a poor surrogate model is used, an enormous amount of computational work is wasted. Particularly for the goal of developing predictive tools, two questions are highly relevant:
-
How biased are the surrogate model parameters or coefficients, and how accurately do they represent the more complex model?
-
What are appropriate methods for checking the need for a more complex model?
Quantification of goodness of fit can provide the information needed to address both of these questions, although the second question is addressed in greater detail by the section on sequential assembly of response surfaces (Section
5.3.2↓).
Residuals
Residuals are perhaps the most obvious quantities for determining the goodness of fit of a model. For a surrogate model prediction ŷ of a response y, the residual is defined as:
e = y − ŷ.
Despite its simplicity, this metric can reveal much information. For example, one way of determining whether an appropriate functional form of a surrogate model has been chosen is to plot the residuals y − ŷ against the system response y. If the selected surrogate model is linear and the residuals exhibit a quadratic trend, this indicates that a quadratic surrogate model will lead to a better fit of the data. Graphical representations of the residuals can reveal complex relationships between the model and the data that numerical quantities such as R² coefficients cannot capture. The residuals can be plotted against several different quantities to reveal these relationships. In addition, residual plots can help to identify trends in the data variance.
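To make the residual check concrete, the following sketch (not taken from the dissertation; the sampled function and the linear fit are hypothetical) computes residuals for a deliberately too-simple surrogate and plots them against the system response, so that the quadratic trend described above becomes visible:

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical system response sampled at 20 points (mildly quadratic)
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x + 0.8 * x**2

# Fit a deliberately too-simple (linear) surrogate model
coeffs = np.polyfit(x, y, deg=1)
y_hat = np.polyval(coeffs, x)

# Residuals e = y - y_hat; a quadratic trend in this plot suggests that
# the surrogate is missing a quadratic term
residuals = y - y_hat
plt.scatter(y, residuals)
plt.xlabel("system response y")
plt.ylabel("residual y - y_hat")
plt.show()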
Variance
One fundamental quantity that can reveal information about the goodness of fit of a model is the variance. Analysis of variance (ANOVA) models are often used to determine goodness of fit by comparing the sources of variance in a set of system responses and a corresponding set of predictions of those responses. The ANOVA tests the hypothesis that the mean of the system responses is the same as the mean of the response predictions. The approach involves computation of a number of quantities (some discussed further below), such as the sum of squares of error and F-statistic, which can be used to quantify the believability of the means hypothesis. The principle behind ANOVA models is that the variance yields much useful information for determining goodness of fit.
For an experimental system, the variance measures the deviation of experimental observations from a mean (expected value) given a set of input parameters. That is, given a set of input parameters x, the mean for a system with j = 1…n observations is defined as:
ȳ(x) = E(y(x)) = (1 ⁄ n) Σⱼ yⱼ(x),
where E(⋅) is the expectation operator (interchangeable with a top bar, as in ȳⱼ). The variance measures the expected deviation from this value,
σ² = E[(y(x) − ȳ(x))²],
and can be thought of as a measure of the width of a distribution, or of the scatter of a set of values.
Computer simulations, however, are deterministic, so a simulation that is repeated for the same set of input parameters x will give the same result each time (zero variance). In this case, the variance refers to the variance of the model fit, denoted s² [142, 132]. For a system with n observations and a surrogate model with p coefficients, s² is defined as:
s² = (1 ⁄ (n − p)) Σᵢ (yᵢ − ŷᵢ)².
The variance of any system can be split into two parts: random error and bias error. Several approaches can be taken to do this. For example, one may use an ANOVA approach, which partitions the variance into contributions from various effects, to compute a lack-of-fit sum of squares. This separates the variance of the fit, s², into the sum of squares due to random or pure error (that is, the deviation of a given system response y(x) for the same parameter values x from the mean ȳ(x) corresponding to those parameter values) and the sum of squares due to bias or lack-of-fit error (the deviation of the model prediction ŷ(x) for a set of parameter values x from the mean system response ȳ(x) corresponding to those parameter values).
The separation of error into these two parts can be expressed symbolically for a function of one variable y(x), with a corresponding surrogate model prediction ŷ(x), as:
Σᵢ Σⱼ (yᵢⱼ − ŷᵢ)² = Σᵢ Σⱼ (yᵢⱼ − ȳᵢ∙)² + Σᵢ nᵢ (ȳᵢ∙ − ŷᵢ)²
(the surrogate model prediction is the same for every repeated observation j at a given value of x, i.e. ŷᵢⱼ = ŷᵢ, because the surrogate model is deterministic), where i indexes the discrete values of the variable of interest x, j indexes the nᵢ measurements of the response at the corresponding value of x, yᵢⱼ is the jth measured system response for the ith value of x, and ȳᵢ∙ indicates the mean value for the ith value of the variable x, i.e. averaged over the j measurements of the response. The first term on the right-hand side is the pure error sum of squares and the second is the lack-of-fit sum of squares. This can also be extended to multiple variables x by increasing the number of subscripts i to account for the number of discrete values of each variable (for example, see [142]). From this, it can be shown that the variance of the model fit s² is an estimator of σ², where s² = σ² if the model is correct, and s² = σ² + bias if the model is incorrect (see [119]). For computer simulations, the system response y is deterministic, so the random (pure) error for the expensive function can also be eliminated, since every repeated observation equals its mean, yᵢⱼ = ȳᵢ∙, which leaves only bias error. Thus the equation for s² (6.11↑) quantifies the surrogate model bias error.
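A minimal sketch of this decomposition, assuming replicated observations are available (the settings, responses, and linear surrogate below are hypothetical), is:

import numpy as np

def pure_error_and_lack_of_fit(x, y, y_hat):
    # Split the residual sum of squares into pure-error and lack-of-fit parts.
    # x: factor settings (may contain repeats); y: observed responses;
    # y_hat: deterministic surrogate predictions at each setting.
    ss_pe = 0.0   # pure error: scatter of repeats about their own mean
    ss_lof = 0.0  # lack of fit: deviation of the surrogate from those means
    for xi in np.unique(x):
        mask = (x == xi)
        y_bar = y[mask].mean()
        ss_pe += np.sum((y[mask] - y_bar) ** 2)
        ss_lof += mask.sum() * (y_bar - y_hat[mask][0]) ** 2
    return ss_pe, ss_lof

# Hypothetical replicated experiment at three settings, with a linear surrogate
x = np.array([0.0, 0.0, 0.5, 0.5, 1.0, 1.0])
y = np.array([1.1, 0.9, 1.6, 1.4, 2.2, 1.8])
y_hat = 1.0 + 1.0 * x
print(pure_error_and_lack_of_fit(x, y, y_hat))

For a deterministic simulation each setting has a single response equal to its own mean, so the pure-error term vanishes and only the lack-of-fit (bias) term remains.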
Chi-Squared Statistic
The χ² statistic is another standard measure of goodness of fit for the parameters of a model, originating from maximum likelihood estimation. The quantity χ² characterizes the deviation of the observed quantities' frequency distribution from the expected quantities' frequency distribution. For example, for a set of observations y of a system, there is a "true" distribution P(x⋆), with parameter vector x⋆, describing the probability of obtaining that set of observations for that system. Likewise, for the surrogate model predictions of the same system ŷ(x), there is a distribution with the same form (the form is assumed known, one of the weak points of the χ² method) and estimated parameter values x. To obtain the χ² measure, the distribution of observations is assumed to be Gaussian, such that the probability of a single system observation Pobs(x) is given by
Pobs(x) ∝ exp[ − (yᵢ − ŷᵢ(x))² ⁄ (2σᵢ²) ],
and the probability of a set of observations is the product of the individual probabilities of each observation. The constant term in front does not depend on the parameter values x, so the exponential sum must be minimized in order to maximize the probability P(x), which will make it most closely match the "true" distribution P(x⋆). This sum is the goodness of fit parameter χ²:
χ² = Σᵢ (yᵢ − ŷᵢ(x))² ⁄ σᵢ².
This quantity can be minimized to find the optimal values of x by creating a set of equations based on the partial derivatives ∂χ² ⁄ ∂x. The corresponding numerical uncertainty bounds for the estimated parameters can also be obtained using linear algebra (the result is commonly referred to as the covariance matrix) or from the condition (for the ith parameter) that χ² increases by one above its minimum value when that parameter is perturbed by one standard deviation. Several quantities can affect the value of χ², including random error (defined above), the values assigned to the response uncertainties σ, the ability of the functional form of the surrogate model ŷ(x) to accurately describe the system response, and the approximated parameter values x̂.
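A minimal sketch of the χ² computation (the observations, uncertainties, and surrogate predictions below are hypothetical) is:

import numpy as np

def chi_squared(y_obs, y_model, sigma):
    # Sum of squared deviations, each weighted by its reported uncertainty
    return np.sum(((y_obs - y_model) / sigma) ** 2)

# Hypothetical measured responses with uncertainties, compared against
# surrogate predictions at the same parameter values
y_obs = np.array([0.21, 0.35, 0.48])
sigma = np.array([0.02, 0.03, 0.02])
y_model = np.array([0.23, 0.33, 0.50])
print(chi_squared(y_obs, y_model, sigma))  # of order the number of points for a good fit

Minimizing this quantity over the parameter values x (for example with a standard optimizer) yields the maximum likelihood estimates described above.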
R-Squared
Additional measures of goodness of fit include correlation coefficients, or R-squared values. These can be calculated as correlation coefficients for two variables, denoted rᵢⱼ, or as multiple correlation coefficients (most common), called the R-squared coefficient and denoted R². Two-variable correlation coefficients can be defined as:
rᵢⱼ = σᵢⱼ ⁄ (σᵢ σⱼ),
where σᵢⱼ is the covariance between variables i and j and σₖ is the standard deviation of variable k. For a model ŷ(x) that is a function of a set of input parameters, this quantity is useful for quantifying the correlation between ŷ and a single input parameter xᵢ, or the correlation between two input parameters xᵢ and xⱼ. R-squared coefficients can be computed using several quantities that have appeared already:
R² = 1 − SS_error ⁄ SS_total,  SS_total = Σᵢ (yᵢ − ȳ)²,
where SS_error = Σᵢ (yᵢ − ŷᵢ)² is the residual sum of squares, n is the number of system responses or observations y, and ȳ = (1 ⁄ n) Σᵢ yᵢ is the overall average of the system responses (note that this is different from the ȳᵢ∙ average used above). For circumstances where the number of degrees of freedom is of the same order as the number of parameters in the regression model, an adjusted R-squared value, R²ₐ, adjusts for statistical bias and is more appropriate:
R²ₐ = 1 − (df_total ⁄ df_error) (SS_error ⁄ SS_total),
where df_total and df_error are the numbers of degrees of freedom of the system and of the error, respectively. For a system with n observations being regressed to a model with p parameters, this becomes:
R²ₐ = 1 − [(n − 1) ⁄ (n − p)] (1 − R²).
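The following sketch (with hypothetical data) computes both quantities for a fit with p parameters:

import numpy as np

def r_squared(y, y_hat, p):
    # Returns (R^2, adjusted R^2) for n observations and p fitted parameters
    n = len(y)
    ss_err = np.sum((y - y_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_err / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return r2, r2_adj

y = np.array([1.0, 1.9, 3.2, 3.9, 5.1])
y_hat = np.array([1.1, 2.0, 3.0, 4.0, 5.0])  # predictions from a 2-parameter fit
print(r_squared(y, y_hat, p=2))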
5.1.5 Budgeting and Spending Degrees of Freedom
When constructing a response surface for an expensive function or simulation, degrees of freedom are as precious as gold. When the number of function evaluations is very small, each observed response contributes a valuable additional degree of freedom, and each degree of freedom, in turn, can be used to extract additional information from the system. Thus, there is a balance to be struck between the number of degrees of freedom and the number of parameters in the surrogate model: as the number of parameters increases, the number of available degrees of freedom decreases. For a system with N degrees of freedom and a surrogate model with p parameters, p degrees of freedom are used to determine the parameters. The remaining degrees of freedom may be used to determine the random (pure) error; the more degrees of freedom spent on this, the better the estimate. Degrees of freedom may be used to remove blocking effects (that is, effects of unintentional changes in things like operating conditions, important system characteristics, or operators for different sets of runs in the same experimental design). They may also be used to isolate and identify particular surrogate model inadequacies that are thought to be important, for example, to determine the need for a cubic term xᵢ³ for a particular variable in a response surface; these can be thought of as surrogate model bias errors, that is, the surrogate not accounting for important characteristics of the system. Many examples of uses of additional degrees of freedom are provided by (Box and Draper, RSM etc).
5.2 Response Surface Methodology
The development of linear models and regression techniques goes back over two centuries, to Laplace and Legendre [168]. A linear model can be expressed in a very general way as a linear relationship between inputs and outputs:
y = Xβ + ε,
where y is a vector or matrix of observed data (responses), X is the matrix of input variables, β is the matrix of coefficients for the model equation being regressed (the form of which is implicitly contained in X), and ε is the vector of residuals, equal to y − ŷ = y − Xβ.
Generalized linear models (GLM), a term first introduced by Nelder and Wedderburn [72], are a more general extension of linear models such as (6.22↑). GLMs for data (yᵢ, xᵢ), i = 1…N have the form:
yᵢ = zᵢᵀβ + εᵢ,
where y is the vector of values yᵢ, z is the design vector, which is a function of the inputs xᵢ, β is a vector of unknown parameters, and the error term ε is assumed to be a normally distributed zero-mean error term with constant variance σ², commonly indicated using the notation:
ε ∼ N(0, σ²).
The expected response is denoted μ, with μ = E(y). GLMs are also characterized by a response function and a link function, which create a map between the linear predictor η = zᵀβ and the expected response:
μ = h(η),  η = g(μ),
where h is the response function and g = h⁻¹ is the link function. This can also be extended to the multivariate case (where y is a matrix instead of a vector),
y = Zβ + ε,
where Z is the design matrix, a function of the xᵢ.
It follows from this that the form of linear model mentioned above (called a "linear model" if x is a scalar and a "linear multiple model" if x is a vector) is a special case in which each row of the design matrix has the form:
zᵢᵀ = (1, xᵢ₁, xᵢ₂, …, xᵢₖ),
and the coefficient matrix β is given by:
β = (β₀, β₁, …, βₖ)ᵀ.
Likewise, polynomial models (called "polynomial models" if x is a scalar and "polynomial multiple models" if x is a vector) of the form (two dimensional in this example):
y = β₀ + β₁x₁ + β₂x₂ + β₁₁x₁² + β₂₂x₂² + β₁₂x₁x₂ + ε
have a design matrix whose rows take the form
zᵢᵀ = (1, xᵢ₁, xᵢ₂, xᵢ₁², xᵢ₂², xᵢ₁xᵢ₂).
These can be further generalized to multivariate versions of each model, where y is a matrix instead of a vector and the system has multiple response variables, all within the framework of GLMs.
Response surfaces, then, are simply an extension of the GLM framework to multivariate polynomial multiple models, that is, polynomial models with multiple outputs and multiple inputs. Stating the model in a general form, the basis functions (single polynomial terms) are defined by:
1, x₁, x₂, …, xₖ, x₁x₂, x₁x₃, …, x₁², x₂², …,
and so on, including all higher-order interaction effects. These basis functions form a row, and one row is written for each observed response y. This can then be expressed as part of a linear model:
y = Zβ + ε.
5.2.1 RSM: For and Against
The adequacy and appropriateness of the response surface methodology is a subject of debate. RSM models have many positive qualities, but these are balanced by negative qualities. RSM models utilize polynomials, which are ubiquitous in science and engineering and are easy to implement. The choice of polynomial models is also easy to justify using a Taylor series: any smooth function can be locally represented by a polynomial series, and often only a few terms are needed to obtain an accurate estimate. Polynomial models can also easily handle many dimensions, and the elimination of variables representing interaction effects is trivial.
However, RSM models have several disadvantages. Polynomial models often lead to spurious fits of data, especially as the degree of the polynomial approaches the number of data points. A function must have roughly the right form to be well approximated by a low-order polynomial, and many functions encountered in practice do not. In addition, the range of application is limited quite strictly to the domain over which the surrogate is fit; polynomial fits typically diverge rapidly just outside the boundaries of that range, so extrapolation is unreliable. Furthermore, the number of coefficients grows rapidly with the polynomial degree and, when all interaction terms are retained, exponentially with the number of variables. While the added flexibility may appear to be an advantage, it comes at a price: if the number of coefficients grows exponentially, so too must the amount of data gathered. Polynomials grow very expensive very fast.
Additional justification could be provided by using low-dimensional models to explore the shape of the response and assuming that this shape is the same for the low-dimensional and high-dimensional models (see Section
5.3.1↓). However, this was not done for the present study.
5.2.2 Construction, Regression
Matrix notation can be used to describe the construction and regression process for response surfaces. Following the notation used in the previous section, the vector of system responses to be fit is denoted y, and contains i rows, where i is the number of observations of the system. For a multivariate system with j responses, y contains j columns. The matrix X contains the input parameters; for a model with k input parameters, X contains k + 1 columns (one for each variable, plus one column of 1's representing the constant effect) and i rows, one set of input variable values for each observation. The matrix of model coefficients is β, and contains j columns, one for each system response, and k + 1 rows, one for each column of X. As above, the linear model is expressed as:
y = Xβ + ε,
or
y = Xβ.
This can also be expressed in terms of the sizes of each component:
(i × j) = (i × (k + 1)) ((k + 1) × j).
Note that the vector ε in (6.35↑) is ignored when computing β. While it is tempting to simply invert X to solve for β, X can only be inverted if it is square. Thus, equation (6.36↑) must first be multiplied by Xᵀ to create a square and invertible matrix:
XᵀXβ = Xᵀy,
and the second step is to isolate β:
β = (XᵀX)⁻¹Xᵀy,
which is the expression for the solution to the linear regression equation (6.35↑). This technique is used to construct response surfaces. More details can be found in [119].
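A minimal numerical sketch of this regression (the design and responses are hypothetical; in production code a QR-based routine such as numpy.linalg.lstsq is numerically preferable to forming XᵀX explicitly) is:

import numpy as np

def fit_response_surface(X, y):
    # Solve the normal equations: beta = (X^T X)^{-1} X^T y
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical 2-factor linear model with a constant column of ones
x1 = np.array([-1.0, -1.0, 1.0, 1.0])
x2 = np.array([-1.0, 1.0, -1.0, 1.0])
X = np.column_stack([np.ones_like(x1), x1, x2])
y = 3.0 + 0.5 * x1 - 1.5 * x2
print(fit_response_surface(X, y))  # approximately [3.0, 0.5, -1.5]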
5.2.3 Variable Normalization
Typically, before performing a regression on a set of input parameters, or factors, the factors are normalized. If each factor falls in the range [ − 1, 1] or [0, 1], the regression procedure becomes much easier. Any input parameter xᵢ with a range of values αᵢ ≤ xᵢ ≤ βᵢ can be transformed either linearly or logarithmically. A linearly varying variable xᵢ can be transformed to the variable x̂ᵢ ∈ [ − 1, + 1] using the formula:
x̂ᵢ = (2xᵢ − (βᵢ + αᵢ)) ⁄ (βᵢ − αᵢ),
and a logarithmically varying variable xᵢ can be transformed to the variable x̂ᵢ ∈ [ − 1, + 1] using the formula:
x̂ᵢ = (2 log xᵢ − (log βᵢ + log αᵢ)) ⁄ (log βᵢ − log αᵢ).
If the variable should be transformed to a more general range x̃ᵢ ∈ [ − s, + s], this can be accomplished by:
x̃ᵢ = s x̂ᵢ.
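A short sketch of these transformations (the example ranges are hypothetical) is:

import numpy as np

def normalize_linear(x, lo, hi):
    # Map x in [lo, hi] linearly onto [-1, +1]
    return (2.0 * x - (hi + lo)) / (hi - lo)

def normalize_log(x, lo, hi):
    # Map a logarithmically varying x in [lo, hi] onto [-1, +1]
    return (2.0 * np.log(x) - (np.log(hi) + np.log(lo))) / (np.log(hi) - np.log(lo))

def rescale(x_hat, s):
    # Stretch a [-1, +1] variable onto the more general range [-s, +s]
    return s * x_hat

# Example: a wall temperature in [900, 1400] K, and a parameter spanning two decades
print(normalize_linear(1150.0, 900.0, 1400.0))  # 0.0 (midpoint of the range)
print(normalize_log(1.0e5, 1.0e4, 1.0e6))       # 0.0 (geometric midpoint)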
5.3 Response Surface Assembly
The process of response surface assembly for surrogate models depends entirely on the cost of the system being represented by the surrogate model. In order to determine how to sample the system being represented by the surrogate model, knowledge of the underlying functional form of the system becomes essential. However, complex systems are expensive to sample, and knowledge of the underlying functional form is therefore unavailable. This is the information gap that is filled by the use of response surface methodology to construct surrogate models.
However, even under the constraints of the assumptions underlying the response surface methodology, the goal is to minimize the number of assumptions that must be made at the outset, and to gather information piecemeal so that the response surface can be assembled in stages. This allows assumptions to be checked at each successive order of the assembled response surface. This goal is complemented by the goal of minimizing the number of function samples. By assembling response surfaces in pieces, results can be analyzed and choices made to reduce the number of variables, terms, or orders of terms in the surrogate model.
5.3.1 “When I am weak, then am I strong.”
Expensive, complex models (also called expensive functions) are designed to model physical systems. Likewise, surrogate models are designed to model expensive, complex models. However, surrogate model design suffers from the curse of dimensionality: as the number of input parameters increases linearly, the number of samples covering this multidimensional space must increase exponentially. Furthermore, the very process of selecting samples rests on such perilous assumptions as, “it is assumed that the response of this complex system can be modeled using a quadratic polynomial.” Assembling response surfaces suffers from a catch-22: in order to know how to sample the expensive function, a surrogate model must be picked (in other words, a functional form of the system response must be guessed). But in order to pick a surrogate model appropriate to the expensive function, the expensive function must be sampled many times. So the question naturally arises: are we doomed to wander in the desert of ignorance?
In fact, some models of physical systems are designed to be accurate physical models but with a very low cost. For such cheap models, or functions, space-filling designs can be used to determine a functional form for the system response, and can make choosing a surrogate model for the function very easy. But if the physical model is cheap to evaluate, why construct a surrogate model for it?
The answer lies not in the cheap physical model, but in the expensive physical model: the two are connected, in that they both attempt to model the same physical phenomena. By reducing the dimensions of an expensive physical model to yield a cheap (or low dimensional) physical model, the shape of the system response can still be approximated, and it can be assumed that this shape is the same for the two models. The choice of surrogate model used to represent the response of a complex system can then be informed and justified by a low dimensional model of that system.
For example, one may use a Reynolds-Averaged Navier-Stokes (RANS) code (which eliminates the temporal dimension, possibly a spatial dimension if the simulation is two dimensional, and all resolved turbulent scales, making the computation very cheap) to investigate the functional form of a system response. The functional form for this system response would then be used to decide how to sample a more expensive physical model of the system, such as large eddy simulation (LES). While the RANS simulations will be less reliable, they are far more economical for the initial exploration of parameter space using space-filling designs such as Latin hypercube or Monte Carlo sampling. They can also lead to a reduction in the range of each parameter explored by the expensive LES model, which typically leads to a more accurate surrogate model.
5.3.2 Sequential Assembly
Construction of response surfaces for very expensive computer simulations like Arches must proceed in a piecemeal manner, with each piece informing the next. This is particularly the case with response surface models, which lend themselves well to piecemeal construction, also called sequential assembly. A polynomial can be thought of as consisting of several "layers." For example, a full quadratic polynomial in 3 variables, given by
y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₁₂x₁x₂ + β₂₃x₂x₃ + β₁₃x₁x₃ + β₁₁x₁² + β₂₂x₂² + β₃₃x₃²,
can be broken up into several layers:
-
main effects, {β₁x₁, β₂x₂, β₃x₃};
-
interaction effects, {β₁₂x₁x₂, β₂₃x₂x₃, β₁₃x₁x₃}; and
-
quadratic effects, {β₁₁x₁², β₂₂x₂², β₃₃x₃²}.
Various experimental design techniques for determining the values of polynomial model coefficients are discussed below. Most of these experimental design techniques lend themselves to a straightforward but dangerous “TV dinner” approach to surrogate modeling: “throw it in the microwave, wait for a while, and consume whatever comes out,” the analogy being that an experimental design will be selected, the samples gathered, the data regressed, and the resulting response surface consumed without regard to quality. This typically happens when the wielder has no knowledge of, or does not make efficient use of, statistical science.
A superior approach to response surface assembly is to make better use of statistical science. This is the philosophy behind sequential assembly.
The first step in the sequential assembly process is to determine the main effects using a screening design, such as a Plackett Burman or highly-fractionated factorial design. These consist of sets of multiples of 8 runs, with the total number depending on the number of variables being screened. They are intended to provide estimates of the main effect of a large number of variables on the system response, without the need for large numbers of runs. Such screening designs are covered in Section
5.3.6↓. Next, the higher order interaction effects are investigated using fractional and full factorial designs. Fractional factorial designs reveal less information, but require fewer samples, and can be used to assess whether a full factorial is necessary. Full factorial designs provide enough function samples to determine main effects and interaction effects for linear models. Factorial designs are covered in Section
5.3.5↓ (this section precedes the screening design section because several concepts central to factorial design are required to understand screening designs). Composite designs require additional function samples in order to estimate quadratic effects, and are covered in Section
5.3.7↓. Additional higher-order effects can be explored using extensions of the above methods.
In addition to providing a more solid justification for the variables being investigated by higher order experimental designs, which are more expensive (requiring a justification of the cost), sequential assembly also allows for incorporation of a pyramidal structure: a large number of variables can be screened in the first step, with progressively fewer variables included in subsequent steps. This allows one to maximize information, minimize function evaluations, and re-use existing information at each level. Also, as mentioned in Section
5.1.5↑, each degree of freedom provides an additional piece of information about, or an improvement upon, the surrogate model, so each of these steps’ samples can be designed to yield desired information.
Box, Hunter, and Hunter phrase the need for a sequential approach particularly well:
The “one-shot” philosophy of experimentation described in much statistical teaching and many textbooks would be appropriate for situations where irrevocable decisions must be made based on data from an individual [computer] experiment that cannot be augmented. Particularly in the social sciences, many problems are of this kind and one-shot experimentation is the only option. However, this is much less common in industrial investigations. It is the goal of this book to emphasize the great value of experimental design as a catalyst to the sequential process of scientific learning. [emphasis in original] (Statistics for Experimenters: Design, Innovation, and Discovery, [17])
5.3.3 Computing Effects: Dot Method
Notation
In order to compute the effects of one or more variables on a system response, some notation must first be established. Let the response of a system that is a function of input variables x be denoted y. This response may be supplemented with a number of subscripts. For a system with p parameters, y will have p subscripts, or p + 1 subscripts if there are repeated observations of the response at a fixed combination of parameter values (as in an experiment), where the last subscript indexes the experimental repeats. Each subscript indicates the value (level) of the corresponding parameter for the observed system response. The number of values that a subscript can take is equal to the number of levels for that factor, denoted n_levels,p for the pth variable.
Let i index the first input variable x₁, j index the second input variable x₂, and so on. Then the response at the combination of the ith, jth, etc. levels of the input variables is denoted:
yᵢⱼₖ….
The average over a particular variable is indicated by replacing the index letter with a dot; thus the average over the various values of x₁ would be indicated by:
y∙ⱼₖ….
For example, a 4-factor experimental design with parameters A, B, C, and D, each with 2 levels, would have a system response
yᵢⱼₖₗ,
where i, j, k, and l index the levels of A, B, C, and D, respectively, and the set of level values is most typically { − 1, + 1}, but may also be thought of as {0, 1}, or any other preferred designation of upper and lower levels.
Main Effects
The average response for a level is denoted with a dot, so that the average response over the variable indexed by i is denoted y∙ⱼₖₗ. The average response over all levels of all factors is designated y∙∙∙∙, and for the two-level, four-factor example it is computed as:
y∙∙∙∙ = (1 ⁄ 2⁴) Σᵢ Σⱼ Σₖ Σₗ yᵢⱼₖₗ,
and the marginal response (that is, the average response for a particular level of a particular variable), for example yᵢ∙∙∙, can also be calculated:
yᵢ∙∙∙ = (1 ⁄ 2³) Σⱼ Σₖ Σₗ yᵢⱼₖₗ.
The marginal response may also be indicated in some situations as yᵢ. Average responses can also be calculated for fractional factorial designs. If a (1 ⁄ 2)ᵏ fractional factorial design (discussed in Section 5.3.5↓ below) is performed, only 2⁴⁻ᵏ runs are available, so the average response formula (6.51↑) is divided by (1 ⁄ 2)ᵏ, i.e. the divisor becomes 2⁴⁻ᵏ:
y∙∙∙∙ = (1 ⁄ 2⁴⁻ᵏ) Σ yᵢⱼₖₗ,
where the sum runs over the 2⁴⁻ᵏ runs actually performed.
For a two-level design, the set of level values { + 1, − 1} may be denoted { + , − } to ease notation. In this case the main effect of a factor A may be computed as:
MA = ȳ₊ − ȳ₋,
the difference between the average response at the high level of A and the average response at the low level of A, and the effect of A at a fixed level of a second factor B (a conditional effect, needed for the two-factor interaction below) may be computed as:
MA(B+) = ȳ(A+, B+) − ȳ(A−, B+),  MA(B−) = ȳ(A+, B−) − ȳ(A−, B−).
Note that the shorthand ȳ₊ and ȳ₋ may be used in some situations to indicate the marginal response at a particular level, but only when it refers to the marginal response for all variables at that level (for example, in Section 5.4.1↓); otherwise, the less ambiguous dot notation will be used to specify which variable is held constant at a particular level and which variables are averaged.
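A minimal sketch of these calculations for a two-level design (the design and responses are hypothetical) is:

import numpy as np

def main_effect(levels, y):
    # Mean response at the high (+1) level minus mean response at the low (-1) level
    levels = np.asarray(levels, dtype=float)
    y = np.asarray(y, dtype=float)
    return y[levels > 0].mean() - y[levels < 0].mean()

# Hypothetical 2^2 factorial design in factors A and B
A = [-1, +1, -1, +1]
B = [-1, -1, +1, +1]
y = [10.0, 14.0, 11.0, 17.0]
print(main_effect(A, y))                  # main effect of A
print(main_effect(np.multiply(A, B), y))  # interaction contrast, using the A*B column

Applying the same function to the product column A*B yields the interaction effect discussed next.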
Interaction Effects
Once these conditional effects are computed, they can be used to determine the interaction effect between the two variables:
IAB = [MA(B+) − MA(B−)] ⁄ 2.
This interaction effect IAB can then be compared to the main effects MA and MB to determine the relative significance of the A-B interaction in the response, relative to A or B alone.
Determining the significance of interaction effects is an important part of sequential assembly of the experimental design. It is also important in other situations, such as the experimental design done in Chapter
3↑, to determine the importance of cross-interaction effects when determining the order of convergence of the error function (Section
3.3.1↑).
5.3.4 Computing Effects: Yates’ Method
Yates’ Method is a method for obtaining main and interaction effects for a full factorial design that generalizes to
n-way interaction effects. In order to use Yates’ Method for a
2n full factorial design, a table containing the factor levels and observations is constructed (Table
5.1↓). Next, a set of
n columns is constructed. Each column is constructed in two parts, an additive part and a subtractive part. For the first column, the
ith entry for the
2n − 1 additive rows are created according to the formula:
and the
jth entry for the
2n − 1 subtractive rows created according to the formula:
Run | x1 | x2 | x3 | Response | C1 | C2 | C3 | Divisor | Value | Name
1 | − | − | − | y1 | C1,1 = y2 + y1 | C2,1 = C1,2 + C1,1 | C3,1 = C2,2 + C2,1 | 8 | C3,1 ⁄ 8 | I
2 | + | − | − | y2 | C1,2 = y4 + y3 | C2,2 = C1,4 + C1,3 | C3,2 = C2,4 + C2,3 | 4 | C3,2 ⁄ 4 | M(x1)
3 | − | + | − | y3 | C1,3 = y6 + y5 | ⋮ | ⋮ | 4 | ⋮ | M(x2)
4 | + | + | − | y4 | C1,4 = y8 + y7 | ⋮ | ⋮ | 4 | ⋮ | I(x1x2)
5 | − | − | + | y5 | C1,5 = y2 − y1 | C2,5 = C1,2 − C1,1 | C3,5 = C2,2 − C2,1 | 4 | ⋮ | M(x3)
6 | + | − | + | y6 | C1,6 = y4 − y3 | ⋮ | ⋮ | 4 | ⋮ | I(x1x3)
7 | − | + | + | y7 | C1,7 = y6 − y5 | ⋮ | ⋮ | 4 | ⋮ | I(x2x3)
8 | + | + | + | y8 | C1,8 = y8 − y7 | C2,8 = C1,8 − C1,7 | C3,8 = C2,8 − C2,7 | 4 | C3,8 ⁄ 4 | I(x1x2x3)
Table 5.1 Use of factor levels and responses to obtain multiway interaction effects using Yates' Method.
Column C2 is constructed by performing the same operations, but on column C1:
C₂,ᵢ = C₁,₂ᵢ + C₁,₂ᵢ₋₁
for the additive entries, and
C₂,₂ⁿ⁻¹₊ⱼ = C₁,₂ⱼ − C₁,₂ⱼ₋₁
for the subtractive entries. Likewise, column C3 is constructed by performing the same operations on column C2, and so on.
Finally, each entry of the last column in the set is divided by 2ⁿ if its row represents the overall average (the identity I), and by 2ⁿ⁻¹ if its row represents a main or interaction effect. The effect that a row represents is determined by which factors are at their high level (indicated with a + in the table). For this case, the first row represents the identity I (the overall average), since no factor is at a high level. Row 2 has only x1 at a high level, so row 2 represents the main effect of variable 1, M(x1). Row 4 has two variables at high levels, so it represents the interaction effect between variables x1 and x2, I(x1x2).
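A compact sketch of Yates' Method (assuming the responses are supplied in standard order, as in Table 5.1; the example responses are hypothetical) is:

import numpy as np

def yates(y):
    # Yates' algorithm for a 2^n full factorial with responses in standard order.
    # Returns the overall average followed by the effects in standard-order
    # labeling: M(x1), M(x2), I(x1x2), M(x3), I(x1x3), I(x2x3), I(x1x2x3), ...
    y = np.asarray(y, dtype=float)
    n = int(np.log2(len(y)))
    col = y.copy()
    for _ in range(n):
        sums = col[0::2] + col[1::2]      # additive half: sums of adjacent pairs
        diffs = col[1::2] - col[0::2]     # subtractive half: differences of pairs
        col = np.concatenate([sums, diffs])
    divisors = np.full(len(y), 2.0 ** (n - 1))
    divisors[0] = 2.0 ** n                # first row gives the overall average
    return col / divisors

# Hypothetical responses y1..y8 for a 2^3 design in standard order
print(yates([60.0, 72.0, 54.0, 68.0, 52.0, 83.0, 45.0, 80.0]))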
5.3.5 Fractional and Full Factorial Designs
Factorial designs are intended to sample a system response enough times to determine all coefficients in a linear surrogate model, in which the maximum degree of any variable in any term is 1. For a surrogate model with n variables, the number of terms (and therefore the number of undetermined coefficients) is 2ⁿ, so a full factorial design requires 2ⁿ runs to fully specify the surrogate model (this is typical for factorial designs in which each variable has 2 levels; factorial designs can, however, be extended to variables with more than 2 levels, as discussed below). This number becomes prohibitively expensive even for moderate n, which is the idea behind the curse of dimensionality: as the number of variables increases linearly, the number of samples required increases exponentially.
The construction of a factorial design consists of assigning two discrete levels, or possible values, to a set of n variables. One run is created for each unique combination of these levels. Each variable is assigned two levels, a low level and a high level, typically indicated by { − , + } or { − 1, + 1}. A table is created with one column for each variable; the first column is populated by alternating between { − , + } every 2⁰ = 1 row, the second column by alternating between { − , + } every 2¹ = 2 rows, the third column by alternating between { − , + } every 2² = 4 rows, and so on. Eventually a table is generated that includes every possible combination of low and high levels of every variable, for a total of 2ⁿ rows, corresponding to the 2ⁿ runs required by a full factorial design.
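A short sketch of this construction (generic; not tied to any particular study in this work) is:

import itertools
import numpy as np

def full_factorial(n):
    # All 2^n combinations of low/high levels coded -1/+1.  Reversing each run
    # makes the first column alternate fastest, matching the construction above.
    runs = list(itertools.product([-1, +1], repeat=n))
    return np.array([run[::-1] for run in runs])

print(full_factorial(3))  # 8 rows x 3 columns of -1/+1 levels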
However, two-variable interaction terms often have much smaller effects than main effects; three-variable interactions are often unimportant when compared to two-variable interactions; and so on. Additionally, sometimes the system is understood well enough that certain interaction terms are known to be unimportant. In reality, many of the 2ⁿ samples are unnecessary. This is the idea behind fractional factorial designs: unnecessary interaction terms are aliased with other terms or with constants. In this way, two terms are combined into a single estimable quantity, for example the constant and a five-way interaction are estimated together as
β₀ + β₁₂₃₄₅,
and it is assumed that the interaction term coefficient β₁₂₃₄₅ ≈ 0, so that the combined coefficient estimates β₀ alone.
To apply this to the factorial design table described above, which has one column for each unique variable, the table is extended to also include columns for each variable interaction. For a three-variable factorial design, with three columns, one each for x1, x2, and x3, four columns are added: x1x2, x1x3, x2x3, and x1x2x3. The value of { − , + } in each of these columns is the product of the corresponding variable columns; thus if for row i, x1 = + , x2 = − , and x3 = − , then x1x2 = − , x1x2x3 = + , and so on. Fractional designs are equivalent to eliminating the column of values for one independent variable and instead using a column representing the values of an interaction effect. If a fractional factorial design were being run for the 3-variable case, the variable x3 would be eliminated; the interaction columns (in this case, one, x1x2) would be created; and rather than creating a new x3 column, the values of x3 would be taken from the column x1x2. In this case, the main effect of the variable x3 would be aliased with the interaction effect x1x2, meaning a statistical analysis will yield information about the effect of x3 plus the effect of x1x2, but no information about each independent effect is available. The defining contrast of this design can be found by starting with the identity used to set the levels of x3:
x3 = x1x2.
Next, an identity may be used for 2-level designs: because the levels are coded ±1, any effect column squared is a column of 1's, so a squared effect becomes the identity, e.g. x3² = 1. Multiplying (6.63↑) by x3 gives:
I = x1x2x3,
where I is the column of 1's. I is called the defining contrast of the fractional factorial design, and is a compact way of uniquely identifying the factorial design. The resolution of a fractional factorial, denoted by Roman numerals, is defined by the number of variables appearing in the defining contrast equation; equation (6.64↑) is a resolution III design. A fractional factorial with a larger number of variables, say 5, with a defining contrast I = x1x2x3x4x5, would be a resolution V fractional factorial design.
One way to think about a factorial design is that each run exercises a different combination of levels of the variables. By aliasing one term's coefficient (say, x1x2x3x4x5) with a constant, that term no longer needs to be exercised at each of its levels, and the number of runs is cut in half: there is no longer any need for the runs that would only change x1x2x3x4x5 to determine its effect on the response. By aliasing one term with a constant, the number of runs is reduced to 2ⁿ⁻¹. Likewise, other terms can be grouped with other terms; this idea can be generalized to k aliased terms, in which case the number of runs reduces to 2ⁿ⁻ᵏ, and this is a (1 ⁄ 2)ᵏ fractional factorial design, since the number of runs is reduced by the factor 2⁻ᵏ.
Factorial designs are most commonly applied to variables with two levels, but it is possible to extend them to variables with more than two levels. This is easiest to do when the number of levels is a power of 2: a variable with L levels can be broken up into log₂(L) variables with 2 levels each. For example, for a factorial design in two variables A and B, each with four levels, the factorial design can be performed with 2 two-level variables representing the full effect of A and 2 two-level variables representing the full effect of B:
A → (A1, A2),  B → (B1, B2).
In this case, interaction effects such as A1B1 or A1A2B1 do not represent the full interaction effect of the original variables A and B; they represent only partial information about the interaction effect. Only all four variables combined (A1A2B1B2) represent the full interaction effect of the original variables (AB).
It is also possible to extend factorial designs to variables with numbers of levels that are not powers of two, but this is not trivial. Additional details are given by Mason
[142].
5.3.6 Screening Designs
Screening designs are intended to yield maximum information about main effects with the smallest number of runs possible. This is done by aliasing high order effects with low order effects, and assuming that the low order effects are dominant. By reducing the number of independent effects, the number of degrees of freedom required to specify the system is likewise reduced. If enough effects are aliased, then a large number of parameters may be screened. The utility of screening designs stems from the common rule-of-thumb assumption that main effects are more significant than interaction effects, and that interaction effects become less significant as the number of factors involved in the interaction increases. For this reason, it is assumed that if a main factor is found to be significant in a screening design, its significance is unlikely to be due to a large and important interaction effect being aliased with the main effect.
In order to perform a screening design, the desired number of runs, which should be a power of 2, is selected, typically 2³ = 8. Once the number of runs is set, a full 2³ factorial design is created for 3 factors, following the procedure detailed in Section 5.3.5↑ and given in Table 5.2↓. This is also called the L8 orthogonal array. There are 3 factors, yielding 2³ − 1 = 7 total main and interaction terms.
Run | A | B | C | AB | BC | AC | ABC
1 | + | + | + | + | + | + | +
2 | − | + | + | − | + | − | −
3 | + | − | + | − | − | + | −
4 | − | − | + | + | − | − | +
5 | + | + | − | + | − | − | −
6 | − | + | − | − | − | + | +
7 | + | − | − | − | + | − | +
8 | − | − | − | + | + | + | −
Table 5.2 L8 orthogonal array, used for creation of 8-run screening designs.
The screening design construction technique is best explained by illustration. For an experiment or computer simulation with 4 factors A, B, C, and D, a full factorial design would require 2⁴ = 16 runs to determine the average, the 4 main effects, the 6 two-factor interactions, the 4 three-factor interactions, and the 1 four-factor interaction. However, it is desirable to use a screening design so that only 2⁴⁻¹ = 8 runs are required. To do this, the variable D is aliased with one of the interaction terms, so that the headings of four of the columns in the L8 orthogonal array become A, B, C, and D. The remaining columns are ignored.
For example, if D is aliased with the interaction term ABC, then the L8 orthogonal array becomes the array shown in Table 5.3↓. In this case, the defining relationship can be derived from the relation D = ABC as follows:
D = ABC
D² = ABCD
I = ABCD.
This is the defining contrast of the screening design. Because there are 4 letters in the defining contrast, this screening design is a resolution IV design. If D were aliased with a different interaction term, such as AB, the result would be a resolution III design:
I = ABD.
Run | A | B | C | D = ABC
1 | + | + | + | +
2 | − | + | + | −
3 | + | − | + | −
4 | − | − | + | +
5 | + | + | − | −
6 | − | + | − | +
7 | + | − | − | +
8 | − | − | − | −
Table 5.3 Example L8 orthogonal array for a 4-factor screening design.
Using this defining contrast, one can also determine all aliased effects. For example, starting with the defining contrast and multiplying by A shows that the main effect of A is aliased with the interaction effect of BD:
I = ABD
A = A²BD = BD.
Multiplying by B shows that B is aliased with the interaction effect AD:
I = ABD
B = AB²D = AD,
and so on. If the results of a screening study indicate that the main effect of A is significant, one should interpret this as the main effect A plus the interaction effect BD being important.
A Combinatoric Tie-In
Pascal's Triangle gives a convenient way of thinking about the polynomial models underlying two-level factorial designs (polynomials in which each variable appears to at most the first power), via the binomial coefficient C(n, k). In the nth row of Pascal's Triangle, which contains n + 1 entries, the kth entry C(n, k) gives the number of k-way interaction terms in the polynomial (that is, the number of terms in the kth layer). The total number of terms in such an n-variable polynomial, which is the sum of all entries in the nth row, is 2ⁿ (hence the number of design points in a full factorial design). Thus a 2ⁿ-run screening design can be used to screen up to 2ⁿ − 1 variables (all terms in the polynomial, excluding the constant, can be aliased to a variable).
Pascal's Triangle has also been generalized to higher dimensions; this provides similar combinatoric rules for polynomials of correspondingly higher degree.
In this way, the procedure described above for 8-run screening designs can be extended. Up to 7 variables can be screened using an L8 orthogonal array (there are 4 interaction terms with which additional variables can be aliased, plus the 3 main variables). But more variables can be screened by adding an additional 8 runs. If a 2⁴ = 16 run screening design were used, up to 15 variables could be screened with 16 runs (compare that to the 2¹⁵ = 32,768 runs that a full factorial design would require!).
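A sketch of this construction (generic; the ordering of the interaction columns, and hence which factor is aliased with which interaction, is an arbitrary choice here) is:

import numpy as np
from itertools import combinations

def screening_design(k, n_factors):
    # Two-level screening design in 2^k runs: build a full 2^k factorial in k
    # base factors, then assign additional factors to interaction columns
    # (n_factors must be at most 2^k - 1).
    base = np.array([[1 if (run >> j) & 1 else -1 for j in range(k)]
                     for run in range(2 ** k)])
    columns = [base[:, j] for j in range(k)]
    for size in range(2, k + 1):                  # add interaction columns
        for combo in combinations(range(k), size):
            columns.append(np.prod(base[:, combo], axis=1))
    return np.column_stack(columns[:n_factors])

# Six factors screened in 8 runs, i.e. a 2^(6-3) fractional factorial
print(screening_design(3, 6))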
5.3.7 Quadratic Designs: Central Composite and Box-Behnken
Central Composite Designs
While factorial designs are intended to reveal information about first order linear models, composite designs provide the information required to build second order (quadratic) linear models. The design is created by picking a median value (e.g., 0) and, optionally, two additional high and low levels (e.g., − a and + a) for each variable that will be quadratic. The function samples are then arranged in a "star" formation in parameter space: each variable is set to its (new) high and low levels while all other variables are held at their median value. The two additional levels are optional because a minimum of 3 points is required to fit a second degree polynomial, so a 3-level design suffices for this purpose. This is easy to visualize in a three parameter space: a factorial design forms the corners of a cube (fractional factorial designs form subsets of opposite corners of the cube), and the composite design forms a six point star, with one additional sample point in the center. When a = 1, the composite design is referred to as a face centered composite design, because the star sample points fall on the faces of the cube formed by the factorial design. The number of runs required by a central composite design is:
2ⁿ + 2n + 1.
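A sketch of the construction (generic; face-centered when a = 1, or rotatable when a is chosen as discussed under Rotatable Designs below) is:

import numpy as np
from itertools import product

def central_composite(n, a=1.0):
    # 2^n cube corners, 2n axial ("star") points at +/- a, and one center point,
    # for 2^n + 2n + 1 runs in total
    corners = np.array(list(product([-1.0, 1.0], repeat=n)))
    axial = np.zeros((2 * n, n))
    for i in range(n):
        axial[2 * i, i] = -a
        axial[2 * i + 1, i] = +a
    center = np.zeros((1, n))
    return np.vstack([corners, axial, center])

n = 3
a_rot = (2 ** n) ** 0.25         # a = n_c^(1/4) for a full 2^n cube portion
print(central_composite(n, a=a_rot).shape)   # (15, 3): 2^3 + 2*3 + 1 runs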
Box-Behnken Designs
A closely related quadratic experimental design technique is the Box-Behnken design [55]. This also places sample points on a parameter hypercube, but no sample points are located at the corners of the hypercube; instead, each Box-Behnken design point is placed at the middle of an edge. Box-Behnken designs sample each hypercube face with 4 sample points, one in the middle of each edge, whereas central composite designs sample each hypercube face with 5 sample points, one in the center of the face and one in each corner of the face. Thus Box-Behnken designs are more economical; for a design in three variables, the Box-Behnken design uses 12 points (plus center points), whereas the central composite design uses 15.
There are advantages to either composite or Box-Behnken designs. Box-Behnken designs are slightly more economical, but they do not provide information about parameter combinations at their extreme values (the corners of the hypercube). Box-Behnken designs are rotatable (a desirable property) by design (more information on rotatability and rotatable designs is given below), and therefore require only 3 levels. Composite designs are not rotatable for a = 1, and so must use 5 levels to be rotatable. However, this is primarily a disadvantage when running experiments at 5 conditions is more expensive than running experiments at 3 conditions, which is not an issue for computer simulations.
More importantly, Box-Behnken designs are much more conducive to the “TV dinner” approach to experiment design (Section
5.3.2↑). Box-Behnken designs make sequential assembly of response surfaces impossible. Once a Box-Behnken design is selected, the user must leap over all intermediate steps, including screening studies, fractional factorial designs, and full factorial designs, and go straight to the quadratic surrogate model. This may lead the user to be more conservative in the variables explored, miss valuable information from screening study steps, and even make incorrect assumptions about important effects. The reason the Box-Behnken design is rotatable is that it does not include a factorial design as a subset; this should be viewed as a disadvantage. Except for cases where the response is known to be quadratic in form and the most important factors have already been determined (which is rarely the case when creating surrogate models for expensive simulations), this design's disadvantages significantly outweigh its advantages, and sequential assembly should be used instead.
Rotatable Designs
For designs with 3 or more levels, one desirable property is for each sample to contribute an equal amount of information about the surrogate model coefficients. For this to be true, the sample points must lie on a hypersphere in parameter space; each point is then equidistant from the center, and thus contributes equal information. Central composite designs are rotatable for certain values of a: those that satisfy the equality
a = n_c^(1 ⁄ 4),
where n_c = 2ⁿ⁻ᵏ is the number of hypercube points, with parameter coordinates of the form (±1, ±1, …, ±1).
3ⁿ factorial designs, on the other hand, are not rotatable. Additionally, central composite designs require 2ⁿ + 2n + 1 runs, fewer than 3ⁿ, so they are also cheaper than 3ⁿ factorial designs. Because Box-Behnken designs are rotatable with a = 1 and therefore require only 3 levels, whereas central composite designs are not and therefore require 5 levels, Box-Behnken designs can be advantageous in certain situations (for example, if it is particularly difficult or expensive to run at extreme combinations of parameters, but this is usually not the case for simulations).
It can be shown (see [142]) that central composite designs are in fact fractional 3ⁿ factorial designs, as are Box-Behnken designs, and that if the two are combined for n = 3, they form a complete 3³ factorial design.
5.4 Response Surfaces for Coal Gasification
In order to accomplish Step 4 in the NISS validation framework (Section
4.4↑), computer simulations of the coal gasifier of Brown
[23, 139] were performed with the Arches model (Section
2.6↑) and a response surface surrogate model was constructed for use in the Data Collaboration validation method (Chapter
The available gasification data consisted of measured concentrations of 3 species (CO, CO2, and H2), arranged in radial profiles of 5 radial measurements (0 cm, 2 cm, 4 cm, 6 cm, and 8 cm from the centerline) at 6 axial locations (21 cm, 36 cm, 51 cm, 67 cm, 81 cm, and 121 cm from the injector). Each sample was gathered over a time period of approximately 30 minutes.
For each system response, one response surface was constructed, resulting in 90 total response surfaces. For the sake of simplicity, clarity, and economy of space, many results presented here are only a representative sample (one species, one spatial region, or an ensemble average).
The process that was applied to construct the Arches coal gasification model response surface was as follows:
-
Perform a screening study and investigate the main effects of 6 variables by gathering 8 function samples. Obtain information about which variables are the most important for constructing a response surface.
-
Reduce the number of variables from 6 to 4, and gather an additional 8 function samples to perform a full 2⁴ factorial design. Obtain a linear response surface, and determine goodness of fit.
-
If the linear model from Step 2 is insufficient, obtain supplementary function samples to construct a quadratic response surface model. Obtain a quadratic response surface, and determine goodness of fit.
This procedure was carried out, and results are presented below.
5.4.1 Gasification Screening Study
The screening study that was run for the coal gasification case used the six variables and ranges listed in the input/uncertainty (I/U) map (Section 4.4.1↑), and an 8-run screening design was used to explore the main effects of these variables, as described in the section on sequential assembly (Section 5.3.2↑). This was a 2⁶⁻³ = 8 run fractional factorial design with the defining contrasts:
I = ABD = BCE = ACF.
These result from letting the levels for D equal the levels for AB, letting the levels for E equal the levels for BC, and letting the levels for F equal the levels for AC. When these three defining contrasts are combined (multiplied together in pairs and all together), they yield the full set of defining contrasts for this screening study:
I = ABD = BCE = ACF = ACDE = BCDF = ABEF = DEF.
Following Section 5.3.6↑, the defining contrast can be used to determine which main effects are aliased with which interaction effects. This is presented in Table 5.4↓. Table 5.5↓ shows the parameter levels used for each screening run.
Aliasing identities for the defining contrast I = ABD = BCE = ACF = ACDE = BCDF = ABEF = DEF:
MA = A + BD + CF + CDE + BEF + ABCE + ADEF + ABCDF
MB = B + AD + CE + CDF + AEF + ABCF + BDEF + ABCDE
MC = C + BE + AF + ADE + BDF + ABCD + CDEF + ABCEF
MD = D + AB + EF + ACE + BCF + BCDE + ACDF + ABDEF
ME = E + BC + DF + ACD + ABF + ABDE + ACEF + BCDEF
MF = F + AC + DE + BCD + ABE + ABDF + BCEF + ACDEF
Table 5.4 Aliasing identities for all main effects. A, B, C, D, E, and F represent E2, A2, Twall, Eh − CO2, dp, and ṁcoal, respectively.
Run | E2 (A) | A2 (B) | Twall (C) | Eh − CO2 (D) | dp (E) | ṁcoal (F)
screen-1 | + | + | + | + | + | +
screen-2 | − | + | + | − | + | −
screen-3 | + | − | + | − | − | +
screen-4 | − | − | + | + | − | −
screen-5 | + | + | − | + | − | −
screen-6 | − | + | − | − | − | +
screen-7 | + | − | − | − | + | −
screen-8 | − | − | − | + | + | +
Table 5.5 Screening study used for the first step of sequential assembly of the Arches coal gasification model response surface.
Analysis of Arches Screening Study Results
A total of 8 Arches simulations were run. Time-averaged concentration profiles were extracted from the temporally and spatially dependent concentration fields computed by the gasification model in Arches. A visual assessment of the comparisons of model predictions to experimental data, with plots comparing experimentally obtained concentration fields with simulation results, is presented in Section 6.6↓. These results show fair agreement: many of the features of the experimental data are captured by the Arches simulations, and incorporating the experimental error into the comparison would further improve the apparent agreement. However, how well the model prediction yMe matches the data de varies significantly with the parameter values. Clearly, a qualitative comparison is insufficient to determine which parameter values are "good" and which ones are not. It is for this reason that a statistical analysis is used to investigate the main effect of each of the six screening study factors.
The main effects for each factor were computed for the entire reactor, and are presented in Table
5.6↓. Determining the factors with the most significant main effects was difficult, given that there were 90 total response surfaces (3 species concentrations, 5 radial location measurements, and 6 axial location measurements), with potentially different rankings of significant effects for each response. For this reason, the gasifier was divided into two zones, the near-injector region (Zone I) and the near-exit region (Zone II). In the first zone, devolatilization was the dominant mechanism, so the factors with the strongest main effects were likely to be those related to the devolatilization reaction. Char oxidation reactions were the dominant mechanism in the second zone. There is no distinct cutoff between the location of Zone I and Zone II, but it was approximated as being halfway through the gasifier (60 cm; see Chapter 7 of
[159]). The main effects were computed for the entire reactor, as well as separately for Zone I (first three axial locations) and Zone II (last three axial locations).
Variable | [CO2] | [CO] | [H2] | Mean Main Effect
E2 | 0.0698 | 0.0494 | 0.0133 | 0.0441
dp | 0.0343 | 0.0276 | 0.0070 | 0.0230
Twall | 0.0246 | 0.0128 | 0.0114 | 0.0163
ṁcoal | 0.0278 | 0.0104 | 0.0085 | 0.0155
Eh − CO2 | 0.0135 | 0.0032 | 0.0025 | 0.0064
A2 | 0.0011 | 0.0008 | 0.0010 | 0.0010
Table 5.6 Overall main effects for each variable on the three responses of interest, computed from the screening study. The main effects are averaged over Zone I and Zone II (all spatial locations) and ranked in order of most to least significant effect.
The contour plots given in Section 6.6↓ give a visual representation of the variation of one response (CO) with all of the factors. The main effects of each factor were calculated (see Table 5.6↑), and from this information the number of factors was reduced from 6 to 4: the 2 factors determined by the statistical analysis to be least important were eliminated, and the 4 factors with the most significant main effects were carried into the next step of the sequential response surface assembly process (the factorial design step).
A word of caution should be interjected before an attempt is made to interpret the results of the screening study, lest one read too much into them: for screening studies, main effects are confounded with many interaction effects (these relationships are given in detail in Table 5.4↑). In a system as complex as a coal gasifier, each variable will interact with several others, so the main effect of a variable may come primarily from one or several interaction effects. A main effect may also be moderate, but appear much more significant due to a number of other moderately important aliased interaction effects. As a response surface is assembled, each step reveals additional information about its main and interaction effects. For this reason, any judgments made during the first stage of sequential assembly about why a main effect was important are stated hypothetically. However, because the rankings of each variable are largely the same throughout the reactor, because interaction effects are typically weaker in magnitude than main effects, and because experience with gasification systems has indicated that the selected active factors will be important (this is, after all, the reason they were chosen for the I/U map), it is justifiable to interpret the screening results as indicating which variables are most important.
Zone I
The main effects in Zone I computed from the screening study are presented in Table
5.7↓. The devolatilization process is likely a very strong influence, as the two most significant main effects,
E2 and
dp, directly control the rate of the devolatilization process. This, in turn, controls the rate of fuel release in the reactor. The devolatilization process starts when cold particles enter the domain, heat up, and devolatilize, releasing their volatile gaseous fuel. The first step, heating, is controlled by the particle size, while the second, the devolatilization reaction, is controlled by
E2, the high-temperature devolatilization reaction activation energy. Note that the main effect for
E2 is nearly three times the main effect for
dp. The mass flowrate and wall temperature main effects are also significant, though about half as much as the particle size main effect. It is not surprising that these factors are significant because they all contribute to the mechanism of particle heating and devolatilization.
Zone I Main Effects
Variable | [CO2]   | [CO]   | [H2]   | ∣Mean Main Effect∣
E2       | -0.0998 | 0.0472 | 0.0174 | 0.0548
dp       | 0.0343  | 0.0231 | 0.0033 | 0.0203
ṁcoal    | 0.0182  | 0.0126 | 0.0085 | 0.0131
Twall    | 0.0132  | 0.0022 | 0.0144 | 0.0099
Eh−CO2   | 0.0055  | 0.0031 | 0.0008 | 0.0032
A2       | 0.0021  | 0.0015 | 0.0019 | 0.0018
Table 5.7 Zone I main effects for each variable on the three responses of interest, computed from the screening study. The main effects are averaged over Zone I and ranked in order of most to least significant effect.
A graphical interpretation of the effects is presented in Figure
5.1↓. This is a quantile plot, a type of plot used to compare data to distributions (in this case, comparing the effects of each variable to a normal distribution). A quantile
Q(f) is a quantity that divides a population into two parts: a fraction
f that have values less than or equal to
Q(f), and a fraction
1 − f that have values greater than
Q(f). In order to construct the quantile plot, the main factor effects (denoted
Mi and referred to as quantiles in this context) are first computed; these quantiles are ordered; and each quantile divides the population into two fractions, one fraction
f whose main effect is less than or equal to
Mi and a second fraction
1 − f whose main effect is greater than
Mi. Each quantile is then plotted against the corresponding quantile for a standard normal distribution, and this plot is the quantile plot.
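As a minimal sketch of this construction in Python (the effect values below are hypothetical placeholders rather than the values of Table 5.6, and the availability of numpy, scipy, and matplotlib is assumed):
\begin{verbatim}
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

# Hypothetical main effects, for illustration only.
effects = {"E2": 0.050, "dp": 0.023, "Twall": 0.013,
           "mdot_coal": 0.010, "Eh_CO2": 0.003, "A2": 0.001}

labels = list(effects.keys())
values = np.array(list(effects.values()))
order = np.argsort(values)          # indices that sort the effects
ordered = values[order]             # ordered quantiles M_i

# Plotting positions: fraction f_i of the population at or below M_i.
n = len(ordered)
f = (np.arange(1, n + 1) - 0.5) / n

# Corresponding quantiles of a standard normal distribution.
normal_q = norm.ppf(f)

plt.scatter(normal_q, ordered)
for q, m, idx in zip(normal_q, ordered, order):
    plt.annotate(labels[idx], (q, m))
plt.xlabel("Standard normal quantiles")
plt.ylabel("Ordered main effects")
plt.show()
\end{verbatim}
Effects that follow the presumed normal behavior fall close to a straight line in the resulting plot; points far from that line stand out as unusually strong (or weak) effects.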
In a quantile plot, if the quantiles from the presumed (normal) distribution (
x axis) match the quantiles from the actual distribution (
y axis), the points for each main effect will lie on a line. Points deviating significantly from the line indicate that the main effect represented by that point deviates significantly from the assumed (normal) behavior. From Figure
5.1↓, it can be seen that the main effect for each variable on CO roughly follows the presumed normal distribution. However, interpretation of this plot comes with a very strong caveat: it is critical to remain conscious of the aliasing of main effects with interaction effects, as specified by Table
5.4↑. What looks like a strong or weak main effect may in fact be a strong or weak interaction effect that eclipses the main effect. This can also cause counterintuitive results: suppose, for example,
Twall does not strongly affect the
H2 response. But the main effect of
Twall,
MTwall, is also aliased with the interaction effect of
E2 and
ṁcoal,
IE2ṁcoal. If this interaction effect strongly affects the
H2 response, then it will appear as though
Twall has a strong effect on the
H2 response. All that can really be said is that the
sum of all effects aliased with the main effect is significant. Beyond that, no differentiation can be made without gathering additional information via additional runs.
Box contour plots of the model response
yMe in parameter space are given in Figures
5.2↓,
5.3↓, and
5.4↓. The model predictions are plotted versus the two parameters with the largest main effect for the given response. These plots indicate visually the effect that various variables have on the model response. While this is a crude representation of the surface, using only four points, it can provide some indication as to whether and how much parameters affect the model response. Each plot shows consistently lower model predictions for increasing
E2 and lower model predictions for increasing
dp. These were the two most dominant variables for every point in Zone I. This result indicates that in Zone I of the gasifier there is a single dominant physical mechanism or parameter. As mentioned, it is highly likely this is the devolatilization mechanism. Both
E2 and
dp are controlling parameters in the devolatilization model used. A higher value of
E2 and a higher value of
dp will both suppress devolatilization: a higher E2 increases the activation energy required for the devolatilization reaction, while a larger dp slows particle heating. Suppressed devolatilization leads to slower formation of fuel species like CO, as seen in the plots below. This qualitative behavior matches what is expected.
Zone II
The main effects in Zone II, the char oxidation region in the latter half of the gasifier, are presented in Table
5.8↓. The variables appear largely in the same order, with only
Twall and
ṁcoal switching spots. This is largely due to
Twall having a much increased main effect in Zone II. The quantile plot shows much the same trend:
Twall had a strong effect on both fuels, CO and
H2.
E2 also had a significant effect on all three variables, with the effect being negative for CO and
H2 and positive for
CO2. This is due to the gas phase chemistry; slower devolatilization leads to fuel being released at a different location in the reactor, which affects the temperature, local concentrations of fuel, and the char oxidation process. Despite the fact that much less devolatilization occurs in Zone II than in Zone I of the gasifier, the devolatilization activation energy parameter still has a strong main effect due to its influence over all aspects of the gas phase chemistry. This influence propagates through the entire gasifier.
Zone II Main Effects
Variable | [CO2]  | [CO]   | [H2]   | Mean Main Effect
E2       | 0.0397 | 0.0516 | 0.0092 | 0.0335
dp       | 0.0343 | 0.0321 | 0.0105 | 0.0257
Twall    | 0.0359 | 0.0234 | 0.0084 | 0.0226
ṁcoal    | 0.0373 | 0.0082 | 0.0085 | 0.0180
Eh−CO2   | 0.0215 | 0.0032 | 0.0041 | 0.0096
A2       | 0.0002 | 0.0002 | 0.0002 | 0.0002
Table 5.8 Zone II main effects for each variable on the three responses of interest, computed from the screening study. The main effects are averaged over Zone II and ranked in order of most to least significant effect.
Zone II main and interaction effects can also be visualized. A quantile plot (Figure
5.5↓) visualizes the main and interaction effects on the main system response, while the box contour plots in Figure
5.6↓,
5.7↓, and
5.8↓ provide a visual representation of the effect of the two variables with the largest main effect and their effect on the model prediction,
yMe. These show the trend of higher
E2 leading to lower model predictions, higher
dp leading to lower model predictions, and higher
ṁcoal leading to lower model predictions. As mentioned in the Zone I discussion, the dominance of
E2 at all but the farthest points from the injector indicates that it will play a strong role in the validation process and will be an important part of the final surrogate model constructed to reproduce the predictions of Arches.
Conclusions
Keeping in mind the caveat that these “main effects” are in fact confounded with multiple interaction effects, the variables with the strongest main effects appear to be the wall temperature Twall, the devolatilization activation energy E2, the coal mass flowrate ṁcoal, and the mean particle size dp. The variable A2 had a marginal effect in every case. The variable Eh − CO2 had a more significant effect than A2, but was still marginal in every case.
One intention of using sequential assembly is to enable the reduction of the number of factors for each step. Selecting 4 factors is an economical choice, as it then takes only 8 additional runs to complete a full
2^4 factorial design. Deciding which factors to keep for the next step of the response surface assembly was straightforward: the same four variables were the most significant in both Zone I and Zone II. In cases where the decision is not as straightforward (when, perhaps, one variable is very significant in Zone I and insignificant in Zone II, while another variable is very significant in Zone II but not in Zone I), it is important to lay out decision criteria. First, the variable selected should not just have a strong main effect on the system, but should also have a strong effect on whether or not the model can match the data. For this, the box contour plots (Figures
5.2↑,
5.3↑, and
5.4↑) can help. Furthermore, it is also important to select variables that will affect the system response
where data is available. If the preponderance of data is in Zone I, then the variable with the strongest main effect in Zone I should be chosen. In such cases, it may be particularly useful to look at main effects for individual responses, rather than looking at an average over a spatial region. For this, one may use tables like Table
5.6↑, as well as quantile plots like Figure
5.1↑ to evaluate the main effects graphically.
5.4.2 Gasification Fractional and Full Factorial
The next step in the sequential assembly of the response surface was to perform a fractional factorial design, then a full factorial design, both for a reduced number of factors. It was desirable to reduce the number of design factors to 4 to keep the cost of the response surface assembly economical. For this reason, the 4 most significant main effects mentioned above (E2, ṁcoal, dp, and Twall) were kept as factors for the next sequential assembly step. This decision was based on the significance of the main effect of all four variables in both Zone I (the devolatilization region near the injector) and Zone II (the char oxidation region near the exit) of the gasifier.
The screening study performed in Section
5.4.1↑ was a
2^(6-3) fractional factorial design; for the reduced set of factors, this translated into a 2^(4-1) fractional factorial design. Supplementing it further required an additional 8 runs, listed in Table 5.9↓, which made it a full factorial design. Had the number of starting variables been greater (for example, a 7-variable 2^(7-4) fractional factorial screening study reduced to a 6-variable 2^(6-3), or 1/8, fractional factorial design), or had the original 6 variables been reduced to 5 instead of 4, one could then complement the screening study with an additional 8 runs to create a fractional factorial design, and supplement that fractional factorial design with additional runs (or reduce the number of factors) to form a full factorial design. However, the implemented design (i.e., the reduction in the number of variables from 6 to 4) was selected so that only one additional set of 8 complementary runs was needed to form a full factorial design. The Arches model was run at each of these 8 sets of conditions in order to complete the full factorial design.
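A minimal sketch of this bookkeeping in Python (the variable names and ±1 coding are illustrative; the generator ṁcoal = E2 × Twall is the aliasing relation implied by the discussion below, and the complementary half reverses only the ṁcoal column, as in Table 5.9):
\begin{verbatim}
from itertools import product

# Base 2^3 design in the three independently varied factors (coded -1/+1).
base_factors = ["E2", "Twall", "dp"]
base = [dict(zip(base_factors, levels))
        for levels in product((-1, +1), repeat=3)]

# Half-fraction used in the screening step: mdot_coal generated as E2*Twall,
# so its main effect is aliased with that interaction.
half_fraction = [dict(run, mdot_coal=run["E2"] * run["Twall"]) for run in base]

# Complementary half-fraction: flip the sign of the generated column only,
# leaving the other columns unchanged.
complement = [dict(run, mdot_coal=-run["E2"] * run["Twall"]) for run in base]

# Together the two half-fractions compose the full 2^4 factorial (16 runs).
full_factorial = half_fraction + complement
assert len({tuple(sorted(r.items())) for r in full_factorial}) == 16
\end{verbatim}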
One outstanding question relates to the defining contrast and corresponding aliasing identities for the new, reduced fractional factorial design. In other words, in the original screening study, the main effect of variable
dp was aliased with
E2 × A2, meaning it was not an independently varied factor. Likewise, the main effect of variable
ṁcoal was aliased with
E2 × Twall, and was also not an independently varied factor. The question is, how did this dependence change with the new fractional factorial design (that is, with the change in the number of design variables)? For the new fractional factorial design, which has only 4 variables,
E2,
Twall,
dp, and
ṁcoal, this question is answered by looking at the new fractional factorial design cases in Table
5.5↑. From this, it is clear that eliminating
A2 as a design factor has made
dp an independently varied factor. However, because
Twall was not eliminated as a design factor,
ṁcoal is still not independently varied. Note that the 8 supplementary design points in Table
5.9↓, compared to the design points in Table
5.5↑, do not change with respect to any variables except
ṁcoal. This causes
ṁcoal to be independently varied. Note that for the full
2^4 factorial design, no table analogous to Table
5.4↑ need be presented, since the design is not fractional and consequently has no defining contrast.
Run     | E2 | Twall | dp | ṁcoal
fact-9  | +  | +     | +  | -
fact-10 | -  | +     | +  | +
fact-11 | +  | +     | -  | -
fact-12 | -  | +     | -  | +
fact-13 | +  | -     | -  | +
fact-14 | -  | -     | -  | -
fact-15 | +  | -     | +  | +
fact-16 | -  | -     | +  | -
Table 5.9 Full factorial design for the screening study variables with the 4 largest main effects. An asterisk indicates a run at the specified conditions is already available; Table 5.5↑ contains the screening study design points, while this table contains the complementary design points, which compose a full factorial design when combined with the screening study design points.
Analysis of Arches Factorial Results
The calculated main effects averaged over the entire domain, given in Table
5.10↓, were similar to those calculated for the
2^(6-3) fractional factorial screening study. The most significant main effect was still
E2. However, interestingly, the wall temperature became the second most important parameter. In the full factorial design, no main effects were aliased with interaction effects; this indicates that in the screening study there was likely an interaction effect aliased with the main effect for
Twall that canceled it out or made it appear less significant than it was. Tables
5.11↓ and
5.12↓ show the main effects averaged over Zone I and Zone II of the gasifier, respectively, and show some variation in the value of the main effect, but no variation in the ranking of effect significance.
Naturally, the question arises: what has been learned about the interaction effects? Were any main effects considered significant that should not have been, because they were aliased with significant interaction effects, or vice versa? Because the ranking of significant main effects did not change substantially from the screening study to the factorial design, it is unlikely that any interaction effects were making unimportant main effects look important. In fact, precisely the opposite happened for the only factor whose main effect changed significantly between the screening and factorial design analyses, Twall.
The rankings of the interaction effects, given in Table
5.13↓ for the entire gasifier and represented visually in Figure
5.9↓, indicate that the interaction
E2 × ṁcoal is the most significant. From Table
5.4↑, it can be seen that this interaction effect was aliased with
Twall. In fact, the increase in the main effect of
Twall is approximately equal to the value of the interaction effect of
E2 × ṁcoal,
\[
M_{\mathrm{screen}}(T_{\mathrm{wall}}) = M_{\mathrm{factorial}}(T_{\mathrm{wall}}) - I_{\mathrm{factorial}}(E_2, \dot{m}_{\mathrm{coal}}),
\qquad 0.0163 \approx 0.0247 - 0.0085.
\]
In other words, the interaction effect of
E2 × ṁcoal was canceling out the main effect of
Twall in the screening study. The reason that each interaction effect can be determined is that the reduction in the number of variables reduces the number of degrees of freedom required to completely specify a linear model from 64 to 16, and the 8 supplementary runs listed in Table
5.9↑ provide the appropriate number of degrees of freedom to completely specify the coefficients of a linear model. As a result, all interaction effects can be computed separately from main effects.
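As a sketch of how the main and two-factor interaction effects can be computed once the design is a full factorial and no aliasing remains (the response values y below are hypothetical placeholders, and the ±1 coding and "difference of averages" effect definition are one common convention, not necessarily the exact one used here):
\begin{verbatim}
import numpy as np
from itertools import product, combinations

factors = ["E2", "Twall", "dp", "mdot_coal"]

# Full 2^4 design matrix in coded -1/+1 units (16 runs).
X = np.array(list(product((-1, 1), repeat=len(factors))), dtype=float)

# Hypothetical responses, one per run (placeholders, not Arches results).
rng = np.random.default_rng(0)
y = rng.normal(size=len(X))

def effect(columns):
    """Contrast-based effect: mean response where the product of the listed
    columns is +1, minus the mean response where that product is -1."""
    contrast = np.prod(X[:, columns], axis=1)
    return y[contrast > 0].mean() - y[contrast < 0].mean()

main_effects = {f: effect([i]) for i, f in enumerate(factors)}
interactions = {f"{a} x {b}": effect([factors.index(a), factors.index(b)])
                for a, b in combinations(factors, 2)}
\end{verbatim}
Because every contrast column of a full factorial is orthogonal to every other, each main effect and each interaction effect is estimated independently, which is the de-aliasing property used in this section.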
As before, the interaction effects are also reported for the Zone I and Zone II local averages, in Tables
5.14↓ and
5.15↓, respectively. These are plotted using quantile plots in Figures
5.10↓ and
5.11↓, respectively.
Main Effects
Variable | [CO2]  | [CO]   | [H2]   | Mean Main Effect
E2       | 0.0715 | 0.0498 | 0.0141 | 0.0451
Twall    | 0.0344 | 0.0224 | 0.0173 | 0.0247
dp       | 0.0309 | 0.0283 | 0.0080 | 0.0224
ṁcoal    | 0.0231 | 0.0101 | 0.0077 | 0.0136
Table 5.10 Main effects for each variable on the three responses of interest, as determined by the factorial design. The main effects are averaged over all spatial points and ranked in order of most to least significant effect.
Zone I Main Effects
Variable | [CO2]  | [CO]   | [H2]   | Mean Main Effect
E2       | 0.1018 | 0.0475 | 0.0182 | 0.0558
Twall    | 0.0273 | 0.0147 | 0.0228 | 0.0216
dp       | 0.0333 | 0.0234 | 0.0038 | 0.0201
ṁcoal    | 0.0167 | 0.0121 | 0.0082 | 0.0123
Table 5.11 Zone I main effects for each variable on the three responses of interest, as determined by the factorial design. The main effects are averaged over Zone I and ranked in order of most to least significant effect.
Zone II Main Effects
Variable | [CO2]  | [CO]   | [H2]   | Mean Main Effect
E2       | 0.0412 | 0.0522 | 0.0100 | 0.0345
Twall    | 0.0415 | 0.0301 | 0.0118 | 0.0278
dp       | 0.0284 | 0.0332 | 0.0122 | 0.0246
ṁcoal    | 0.0294 | 0.0081 | 0.0071 | 0.0149
Table 5.12 Zone II main effects for each variable on the three responses of interest, as determined by the factorial design. The main effects are averaged over Zone II and ranked in order of most to least significant effect.
i−j Interaction Effects
Variable      | [CO2]  | [CO]   | [H2]   | Mean Interaction Effect
E2 × ṁcoal    | 0.0091 | 0.0109 | 0.0055 | 0.0085
E2 × dp       | 0.0011 | 0.0223 | 0.0012 | 0.0082
dp × ṁcoal    | 0.0071 | 0.0029 | 0.0016 | 0.0039
Twall × ṁcoal | 0.0002 | 0.0002 | 0.0002 | 0.0002
Twall × dp    | 0.0000 | 0.0000 | 0.0000 | 0.0000
E2 × Twall    | 0.0000 | 0.0000 | 0.0000 | 0.0000
Table 5.13 Two-way interaction effects as determined by the full factorial design. The interaction effects are averaged over all spatial points and ranked in order of most to least significant effect.
Zone I i−j Interaction Effects
Variable      | [CO2]  | [CO]   | [H2]   | Mean Interaction Effect
E2 × ṁcoal    | 0.0127 | 0.0163 | 0.0082 | 0.0124
dp × ṁcoal    | 0.0036 | 0.0043 | 0.0016 | 0.0031
E2 × dp       | 0.0003 | 0.0067 | 0.0003 | 0.0024
Twall × ṁcoal | 0.0001 | 0.0001 | 0.0001 | 0.0001
Twall × dp    | 0.0000 | 0.0000 | 0.0000 | 0.0000
E2 × Twall    | 0.0000 | 0.0000 | 0.0000 | 0.0000
Table 5.14 Zone I two-way interaction effects, as determined by the factorial design. The interaction effects are averaged over Zone I and ranked in order of most to least significant effect.
Zone II i−j Interaction Effects
Variable      | [CO2]  | [CO]   | [H2]   | Mean Interaction Effect
E2 × dp       | 0.0019 | 0.0379 | 0.0021 | 0.0140
E2 × ṁcoal    | 0.0055 | 0.0055 | 0.0029 | 0.0046
dp × ṁcoal    | 0.0106 | 0.0014 | 0.0017 | 0.0046
Twall × ṁcoal | 0.0004 | 0.0004 | 0.0004 | 0.0004
Twall × dp    | 0.0000 | 0.0000 | 0.0000 | 0.0000
E2 × Twall    | 0.0000 | 0.0000 | 0.0000 | 0.0000
Table 5.15 Zone II two-way interaction effects, as determined by the factorial design. The interaction effects are averaged over Zone II and ranked in order of most to least significant effect.
\clearpage
Figure
5.9↓ shows most of the interaction effects clustered near the point
(0, 0), meaning they are not significant. Most of the interaction effects are 1 order of magnitude smaller than the main effects. However, the two most significant interaction effects,
E2 × ṁcoal and
E2 × dp, visually deviate from this pattern. The Zone I quantile plot in Figure
5.10↑ shows
E2 × ṁcoal separated from the cluster of interaction effects for the
CO2 and
H2 responses; the Zone II quantile plot in Figure
5.11↑ shows
E2 × dp clearly deviating from this trend as well for the
CO2 response.
The effect of variables on the comparison residuals, plotted in Figures
5.12↓ through
5.17↓, shows similar trends in all but two locations (at
x = 36 cm,
r = 0 cm and
r = 2 cm). This was also the case in the residual contour plots reported from the screening study.
\clearpage
5.4.3 First-Order Gasification Response Surface
The
2^4 full factorial design detailed in Section
5.4.2↑ yields enough information to completely specify a 16-term linear model containing only first order terms. However, here one runs into the same problem as with computing the main effects of variables: with 90 total responses, it is somewhat unwieldy to present and discuss all relevant results. Therefore, for reasons of style and economy, three representative points out of the 90 total were chosen: two points in the same radial profile and two points on the same axial profile. The relevant calculations are demonstrated only for these three responses. The three points selected were
x = 36 cm and
r = 0 cm,
x = 36 cm and
r = 6 cm, and
x = 81 cm and
r = 0 cm. Also, because nine responses is still an unwieldy number, where appropriate only the CO response was examined. Unless otherwise mentioned, conclusions about the CO response also apply to the other responses.
It is important to begin the discussion of the constructed first order response surface with a review of the terminology being used in order to provide clarity. For any system of interest, there are data available, designated de (the subscript e indexes an experiment, or an experimental measurement). A model is constructed to attempt to replicate that data, and the model’s predictions are denoted yMe(x) (superscript M for model). These predictions are a function of an input parameter vector x. Next, as part of the validation process, a greatly simplified surrogate model is constructed to attempt to replicate the output of the more complex model. The surrogate model prediction is denoted ŷe(θ), where θ is another input parameter vector; this may be a subset of x, contain a few elements in common with x, or have no elements in common with x. The first situation listed is the most common.
An overview of linear regression has already been presented in Section
5.2.2↑; the details are left to that section. The first function onto which the simulation results were regressed was the full 16-term linear model, equation (6.67↑), in which
x1 = E2,
x2 = Twall,
x3 = dp, and
x4 = ṁcoal. Nine polynomials were computed, three for each response at the representative points. For ease of use and ease of regression, the input variable vector
x was normalized; the normalized vector is denoted
x̂. The set of level values was
{0, 1}, so that the “low” level of each variable was 0 and the “high” level of each variable was 1. One of the response surface polynomials is given below and its characteristics analyzed.
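In general form, a 16-term first-order model in the four normalized factors, including all two-, three-, and four-factor interaction terms, can be written as (the coefficient values are specific to each response and location and are determined by the regression):
\[
\hat{y} = \beta_0 + \sum_{i=1}^{4}\beta_i \hat{x}_i
 + \sum_{i<j}\beta_{ij}\hat{x}_i\hat{x}_j
 + \sum_{i<j<k}\beta_{ijk}\hat{x}_i\hat{x}_j\hat{x}_k
 + \beta_{1234}\hat{x}_1\hat{x}_2\hat{x}_3\hat{x}_4 ,
\]
for a total of 1 + 4 + 6 + 4 + 1 = 16 coefficients.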
This response surface has an
R2 value of 1.0 at all spatial locations (as do the response surfaces of other species). This, however, is obvious: if the number of constants is equal to the number of degrees of freedom, the
R2 value will always be 1.0 because the response surface will always fit the data perfectly. (In fact, this is the reason for the adjusted
R2, defined in Section
(5.1.4↑), which is
∞ when the number of points is equal to the number of parameters.) The tradeoff is that one cannot make any statements about the amount of error
y − ŷ that may be associated with a prediction
ŷ, nor can one make any statements about a confidence interval for any of the coefficients.
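For reference, with n sample points and p fitted coefficients (counting the intercept), these quantities have the standard forms, assumed here to match the definitions of Section 5.1.4:
\[
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}, \qquad
R^2_a = 1 - (1 - R^2)\,\frac{n-1}{n-p}, \qquad
\mathrm{MSE} = \frac{\sum_i (y_i - \hat{y}_i)^2}{n-p},
\]
so that when p = n the residuals vanish identically (R2 = 1) and the n − p in the denominators leaves the MSE and the adjusted R2 without a meaningful value.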
It should be noted that in this response surface polynomial, several of these terms have extremely small coefficients, primarily the third and fourth order interaction terms (this is also true of the response surface polynomials for other species and at other locations). In addition to the higher order interaction effects with very small coefficients, several of the interaction terms are known to be insignificant from the effects analysis (Tables
5.13↑ through 5.15↑). For example, the three interaction effects
E2 × Twall,
Twall × dp, and
Twall × ṁcoal are insignificant throughout the entire reactor. In the polynomials above, the coefficients of each of these interaction terms are on the order of
10^-4 to 10^-5 (again, also true of the response surface polynomials for other species and at other locations). It is prudent to utilize the results of the effects analysis when choosing a regression function; function (6.67↑) is a poor choice because it does not use any of this information.
Instead, a new regression function should be chosen, excluding the interaction terms found to be insignificant through the effects analysis and via Yates' method. This was a model of the form:
When this was done, the following polynomials for the CO response at each spatial location of interest were obtained:
Comparing the
R2 values of these two approaches, one sees very little difference: for CO and
CO2, the value of
R2 was 1.0 at all three points listed above.
H2 was also essentially the same, with
R2 values ranging from
0.9992 to 1.0. (No mean square error (MSE) values were reported for the 16-term response surface because none could be estimated.) However, eliminating the unnecessary parameters provided an estimate of the MSE, given for each response and location of interest in Table
(5.16↓).
x    | r   | CO MSE        | CO2 MSE       | H2 MSE
0.36 | 0.0 | 1.062 × 10^-7 | 7.425 × 10^-8 | 5.26 × 10^-9
0.36 | 0.6 | 1.294 × 10^-8 | 4.38 × 10^-9  | 3.274 × 10^-9
0.81 | 0.0 | 9.362 × 10^-8 | 5.88 × 10^-8  | 7.352 × 10^-8
Table 5.16 Mean square error of the 10-term response surface for each response and location of interest.
5.4.4 First-Order Gasification Response Surface With Curvature
In order to obtain a better idea of how linear the responses were, an additional sample point was obtained at the center of the factorial design. A design point in the center gave three points to be fitted in each direction, which provided an estimate of the curvature in the response surface. Furthermore, the additional degree of freedom provided some basis for comparison of the 16-term polynomial response surface
(6.67↑) with the reduced 10-term polynomial response surfaces. The response surfaces were recomputed for the additional sample point, and the
R2, adjusted
R2, and MSE values are reported in Tables
5.17↓ and
5.18↓. The updated response surfaces are plotted in two dimensions; the 16-term response surface
(6.67↑) is plotted in Figures
5.18↓ through 5.20↓, and the 10-term response surface (6.69↑) is plotted in Figures
5.21↓ through
5.23↓.
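One standard way to quantify the curvature indicated by a center point (not necessarily the exact form used here) compares the average of the n_F factorial-corner responses, ȳ_F, with the average of the n_C center-point responses, ȳ_C:
\[
SS_{\mathrm{curvature}} = \frac{n_F\, n_C\,\left(\bar{y}_F - \bar{y}_C\right)^2}{n_F + n_C},
\]
a single-degree-of-freedom quantity that is small when the first-order surface passes near the center-point response.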
x    | r   | CO R2 (R2a)     | CO2 R2 (R2a)    | H2 R2 (R2a)     | CO MSE        | CO2 MSE       | H2 MSE
0.36 | 0.0 | 0.9969 (0.9507) | 0.9982 (0.9714) | 0.9945 (0.912)  | 0.0001647     | 4.768 × 10^-5 | 3.356 × 10^-5
0.36 | 0.6 | 0.996 (0.9358)  | 0.9994 (0.99)   | 0.9906 (0.85)   | 0.0001909     | 5.688 × 10^-6 | 2.968 × 10^-5
0.81 | 0.0 | 0.9995 (0.9924) | 0.9993 (0.9892) | 0.9838 (0.7416) | 9.244 × 10^-6 | 1.476 × 10^-5 | 3.223 × 10^-5
Table 5.17 R2 (adjusted R2 in parentheses) and MSE for the 16-term response surface given by equation (6.67↑), updated to account for the new center design point.
x    | r   | CO R2 (R2a)     | CO2 R2 (R2a)    | H2 R2 (R2a)     | CO MSE        | CO2 MSE       | H2 MSE
0.36 | 0.0 | 0.9969 (0.9929) | 0.9982 (0.9959) | 0.9945 (0.9874) | 2.362 × 10^-5 | 6.875 × 10^-6 | 4.8 × 10^-6
0.36 | 0.6 | 0.996 (0.9908)  | 0.9994 (0.9986) | 0.9906 (0.9786) | 2.727 × 10^-5 | 8.144 × 10^-7 | 4.243 × 10^-6
0.81 | 0.0 | 0.9995 (0.9988) | 0.9993 (0.9984) | 0.9836 (0.9626) | 1.401 × 10^-6 | 2.159 × 10^-6 | 4.668 × 10^-6
Table 5.18 R2 (adjusted R2 in parentheses) and MSE for the 10-term response surface given by equation (6.69↑), updated to account for the new center design point.
While the R2 values were nearly the same for the two response surfaces, the 16-term response surface had slightly worse adjusted R2 values (though still very good), and larger errors, than the 10-term response surface. With the addition of the center point, the R2 values of the two response surfaces were still very close to 1.0, indicating that there was not a significant amount of curvature in the response. Additionally, the MSE values for both response surfaces were extremely small. After this initial analysis of the results, it appeared that a full composite design was not necessary. However, before making this decision, additional statistical analysis was performed to determine whether this was a justifiable hypothesis.
Two additional statistical analyses were performed. First, an analysis of variance (ANOVA) table was created to establish confidence levels for each polynomial coefficient in
(6.69↑), as well as to perform an
F-test of the quadratic versus linear model, which indicated the importance or unimportance of quadratic terms for constructing an accurate response surface. The quadratic model being tested was:
Second, an analysis of the residuals was performed to test whether there were underlying quadratic effects missed by the analysis of variance hypothesis test. This test was graphical, comparing the residuals to the system response; a trend in the residuals indicates that the polynomial model is missing important features and should be improved or changed.
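The F-test of the quadratic versus linear model referred to here is, in its usual partial F-test form (with SSE denoting the residual sum of squares of each model, p the number of fitted coefficients, and n the number of sample points):
\[
F = \frac{\left(SSE_{\mathrm{lin}} - SSE_{\mathrm{quad}}\right)/\left(p_{\mathrm{quad}} - p_{\mathrm{lin}}\right)}
         {SSE_{\mathrm{quad}}/\left(n - p_{\mathrm{quad}}\right)},
\]
and the added quadratic terms are judged significant when this ratio is large relative to the corresponding F distribution (equivalently, when the associated p value is small).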
The ANOVA tables (Tables
5.19↓,
5.20↓, and
5.21↓) provide justification for the hypothesis that no quadratic model is needed to adequately model the system response. The important columns in this table are the last two. The
F value gives the ratio of the variance explained by the terms in question to the unexplained variance. It is essentially a test of a null hypothesis: in the case of the first row, the hypothesis that the data variability can be explained by the linear model, and in the case of the second row, the hypothesis that the additional variability (remaining after the linear model) can be explained by the quadratic terms added to the linear model. The corresponding p value is the probability of observing an F value at least this large if the terms in question actually had no effect (that is, if the null hypothesis were true). What this means is that the evidence that the linear terms are required to explain the data is overwhelming (p on the order of 10^-8), while the evidence that the quadratic terms are required is weaker than a coin flip (p on the order of 0.5 to 0.6). Typically, the test of whether model terms are statistically significant establishes a significance level α, and statistical significance requires a p value that satisfies p < α (a common choice is α = 0.05).
The ANOVA table clearly indicates that the 10-term response surface polynomial
(6.69↑) is statistically significant, and the 14-term quadratic response surface polynomial
(6.73↑) is not.
Source                     | Degrees of Freedom | Sum of Squares (SS) | Mean Sum of Squares (MS) | F value | p value
Linear terms               | 9                  | 0.05                | 0.01                     | 250     | 6.1 × 10^-8
Error                      | 7                  | 0.00017             | 2.4 × 10^-5              |         |
Added quadratic terms      | 13                 | 0.0002              | 1.5 × 10^-5              | 0.63    | 0.60
Error with quadratic terms | 3                  | 8.2 × 10^-7         | 2.7 × 10^-7              |         |
Table 5.19 ANOVA for CO response at x = 36 cm and r = 0 cm.
Source                     | Degrees of Freedom | Sum of Squares (SS) | Mean Sum of Squares (MS) | F value | p value
Linear terms               | 9                  | 0.05                | 0.01                     | 193     | 1.5 × 10^-7
Error                      | 7                  | 0.00019             | 2.7 × 10^-5              |         |
Added quadratic terms      | 13                 | 0.0002              | 1.5 × 10^-5              | 0.58    | 0.61
Error with quadratic terms | 3                  | 8.0 × 10^-8         | 2.7 × 10^-7              |         |
Table 5.20 ANOVA for CO response at x = 36 cm and r = 6 cm.
Source                     | Degrees of Freedom | Sum of Squares (SS) | Mean Sum of Squares (MS) | F value | p value
Linear terms               | 9                  | 0.02                | 0.01                     | 193     | 1.1 × 10^-10
Error                      | 7                  | 9.8 × 10^-6         | 2.7 × 10^-5              |         |
Added quadratic terms      | 13                 | 0.0001968           | 1.5 × 10^-5              | 0.80    | 0.51
Error with quadratic terms | 3                  | 8.0 × 10^-8         | 2.7 × 10^-7              |         |
Table 5.21 ANOVA for CO response at x = 81 cm and r = 0 cm.
The second test performed was a graphical analysis of the residuals. The residuals are defined here as the difference between the Arches computation
yMe (the “data” the surrogate model is intending to reproduce) and the surrogate model prediction
ŷMe. The residuals were plotted for both the 16-term response surface
(6.67↑) (Figures
5.24↓,
5.25↓, and
5.26↓) and the 10-term response surface
(6.69↑). It is clear from the residual plots that the center point is the primary outlying point, but the magnitude of the residual, which is not very high, indicates that there is only slight curvature in the surface; given these results, it would be difficult to justify running the additional 8 runs required by a composite design.
5.4.5 Coal Gasification Response Surface Conclusions
In order to construct a surrogate model for the Arches coal gasification simulation tool, concepts from statistics and experimental design were used to select sample points of Arches in parameter space from which to construct a response surface. A multistage approach called sequential assembly was adopted; this approach utilizes the information obtained at each step in order to optimize both the samples that are gathered and the presumed form of the response surface. Initially, a screening study was used to determine the main effects of six parameters using a small number of runs (8; see Section
5.4.1↑ and Table
5.5↑). The six parameters were devolatilization activation energy
E2 (from the Kobayashi devolatilization model), the devolatilization Arrhenius factor
A2, the wall temperature
Twall, the
CO2 char oxidation reaction activation energy
Eh − CO2, the mass mean particle size
dp, and the solids mass flowrate
ṁcoal. This information was analyzed, and of the six variables, four were retained for additional analysis because they were determined to have the most significant effect on the responses (see Table
5.6↑). It was found that the Arrhenius factor and the char oxidation activation energy had insignificant effects in all regions of the gasifier. It was also found that
E2 was by far the most significant factor; its main effect propagated through the entire gasifier. This confirms the findings of earlier sensitivity studies
[156], which indicate the primary importance of the devolatilization process. In addition, the remaining three factors all have strong influences on the devolatilization process, further confirming that the Arches model results match expectations. However, because main effects are aliased with interaction effects in small screening designs, further analysis was required.
The next stage of the response surface construction was to complement the 8 runs of the screening study with an additional 8 runs to complete a full
2^4 factorial design for the four retained variables (see Section
5.4.2↑ and Table
5.9↑). The results from the full factorial were analyzed to produce a list of important main effects (Tables
5.10↑,
5.11↑, and
5.12↑) and interaction effects (Tables
5.13↑,
5.14↑, and
5.15↑). Because these were determined from a full factorial design, they were not aliased with any other effects. From these results, an initial first order response surface was constructed, of the form
(6.67↑). However, some of the drawbacks of fitting a model with as many constants as degrees of freedom were pointed out, and an alternative model
(6.69↑) with fewer parameters was proposed.
From the results of the factorial design, it was found that the behavior of the response was close to linear and well behaved. Several tests of this finding were performed in order to determine whether further sampling was needed to extend the factorial design to a composite design, which would provide enough information to construct a quadratic response surface. The first test of the linearity of the response surface was a sample at the center of the factorial design, which added an additional point and an additional level for each factor. This point indicated that the curvature of the surface was small, and that the linear response surface finding was likely correct.
Further tests were performed, in the form of analysis of variance tables (Tables
5.19↑ through
5.21↑). These showed that the data was described very well by the linear response surface model, and that the need for a quadratic model to fit the data was highly improbable. The final test was in the form of a graphical analysis of residuals. There were no detectable patterns in the residuals that indicated an underlying quadratic trend missed by the regression or the analysis of variance; the residuals
yMe − ŷMe exhibited no dependence on the response
yMe.
The results of the surrogate model construction were surprising; a highly nonlinear system such as a coal gasifier would not normally be expected to behave in a manner well described by a linear model. However, this surprising result made a thorough statistical analysis all the more important, and multiple tests all confirmed that the result was reasonable, given the set of samples from the full factorial design. It also demonstrated the great advantage of the sequential assembly approach to surrogate model design. Had a quadratic model been assumed from the start and a Box-Behnken or similar quadratic experimental design been adopted, the cost would have been substantially higher because a larger number of simulations would have been run (25 for the Box-Behnken design versus 17 for the sequential assembly approach); moreover, with intermediate analysis difficult or impossible, the superfluousness of many of the runs would not have been known until afterwards.
The final conclusion of the Arches coal gasification response surface construction is that the linear coal gasification response surface models given by the polynomials below are highly appropriate and have been justified through a detailed statistical analysis.
5.5 Conclusions
The validation procedure adopted in Section
4.4↑ consists of six steps. Step 4 is the creation of a surrogate model for expensive computer simulations; the surrogate is intended for use in optimization and other routines that require a large number of model evaluations. Because drawing that many samples directly from a code as expensive as the Arches coal gasification model is entirely impractical, a surrogate model was constructed.
An overview of several varieties and families of surrogate models was given, and the surrogate model family deemed most appropriate was the generalized linear model family, specifically response surface models. Details were given on construction of response surfaces using statistical design of experiment techniques, and a sequential assembly approach was reviewed and adopted. This approach assembles the response surface in a piecemeal fashion, with the construction process consisting of multiple steps. The first step is a screening design intended to calculate main effects of a large number of variables in order to determine which variables are of primary importance to the chosen system response. A detailed statistical analysis is performed to reveal useful information about the behavior of the system response. Subsequent steps sample the function in such a way as to build up the degree of the surrogate model. At each subsequent step, a detailed statistical analysis is performed to extract as much information as possible from each round of function samples. This procedure was demonstrated as part of the construction of a response surface for the Arches coal gasification model. It was determined that the most appropriate response surface was a linear response surface, and several statistical tests were performed in order to confirm this surprising result.
It is worth questioning whether this approach is necessary. In many fields, emphasis has been placed on using Monte Carlo techniques to solve problems that were intractable only a short time ago. Given ever-increasing computing power, why expend so much effort to save the cost of 8 or 16 computations? Why not use a "brute force" approach with cheaper, lower dimensional models instead of an "intelligent design" approach with very expensive models?
Twenty years ago, the field of computational fluid dynamics expressed hope that the extremely expensive problems of that time would eventually be tractable, and cheap enough for Monte Carlo approaches to exploring system responses. Indeed, problems which kept supercomputers of 1980 busy for weeks can now be solved on desktop computers in minutes, making a Monte Carlo approach tractable. However, computational fluid dynamics is still grappling with extremely expensive problems, even with astronomical increases in computing power, new software to parallelize to ever larger systems, and specialty hardware. This is because, no matter how much computing power is available, there will always be difficult and expensive problems. The challenge of constructing accurate surrogate models for expensive computational models should not be avoided in favor of the use of only low dimensional models; expensive models have great potential and much to contribute to scientific understanding of complex systems.
The goal of constructing accurate response surfaces for expensive models is a difficult one. The question was posed earlier: is it impossible? Are we cursed? It seems that the answer is, only if we curse ourselves. We cannot rely on blind faith as a legitimate approach to building models or metamodels (see the discussion of the TV dinner approach to metamodel construction in Section
5.3.2↑). However, we should also not give up hope entirely and resign ourselves to the attitude that all modeling is in vain and that creating accurate models, let alone surrogate models of accurate models, of physical systems is just “too hard” (this is the attitude adopted by the pessimistic paper by Oreskes et al
[124]). Instead, one must utilize the groundwork that has been laid in many scientific and engineering fields, including statistics; one must stand on the shoulders of giants. It is only then that one may see a brighter future for modeling.
6 DATA COLLABORATION METHOD FOR VALIDATION
Essentially, all models are wrong, but some are useful.
― George Box
6.1 The Analysis of Model Results
The last step in the validation procedure is the analysis of simulation model results. Several validation approaches discussed in Section
4.3↑ provide methods for analyzing simulation model results; the methodology selected here is the Data Collaboration (DC) methodology
[53, 147, 52, 146, 148, 174, 175]. This method provides a quantitative assessment of the simulation model, along with additional information, such as insight into the weaknesses of the model and sensitivity of simulation agreement with experimental data to the reported experimental uncertainty.
6.1.1 Important Characteristics of Analysis Methods
In Chapter 4, an overview of several validation approaches was given, in particular the approach of Coleman et al.
[29, 165, 31]. This approach was particularly attractive because of its joint consideration of experimental and numerical uncertainties in the validation process. They first define the comparison error E as the difference between the data D and the simulation S,
\[
E = D - S .
\]
They then define the uncertainty in the comparison error, UE, which combines the simulation and data uncertainties (written here in the usual root-sum-square form),
\[
U_E^2 = U_D^2 + U_S^2 ,
\]
and state that “if the absolute value of E is less than its uncertainty UE, then validation is achieved.” However, because US contains a quantity they call USMA, the uncertainty arising from simulation modeling assumptions, which cannot be quantified, they define an alternative uncertainty metric consisting of all quantities that are quantifiable,
\[
U_V^2 = U_E^2 - U_{SMA}^2 ,
\]
and validation is achieved when the value of E is less than this alternative uncertainty UV,
\[
|E| < U_V .
\]
This is referred to as validation at the level of UV.
Several criticisms of this model have been put forth, by both Roache
[29, 30, 165, 166] and Oberkampf
[165, 166]. The most interesting and useful criticisms are twofold. First, using the proposed validation method, validation becomes increasingly difficult as the uncertainty
UV shrinks; Roache proposes including a tolerance quantity, TOLV, in the validation comparison to get around this fact.
However, the problem can also be posed in reverse: he states that it is a “paradox” that increasing uncertainty in the experiments or the simulation can make validation easier, with the resolution of the paradox being that the level of validation changes. The second criticism is that the approach makes implicit assumptions about the distribution of the experimental uncertainty. These criticisms can be dissected and analyzed to obtain useful characteristics for any system that is to be used for step 5 of the NISS validation framework, analysis of validation results.
The first criticism has two parts: the problem of validation difficulty with shrinking
UV, and the problem of validation ease with growing
UV. A cursory critique of this criticism was presented in Section
4.1.2↑, but it is repeated and expanded here because of its importance. This criticism, and Roache's proposal, are detached from validation (and, in a sense, from reality). Using a tolerance, or artificially increasing the uncertainty bounds to make validation easier, throws away information about reality. Because the extra quantity
TOLV cannot be lumped into the simulation uncertainty term, it must be treated as an addition to the experimental uncertainty. This is equivalent to saying, “This thermocouple takes accurate, unbiased readings that are within
±0.01 K of the actual temperature, but I will treat it as a less accurate instrument whose readings are actually within
±0.15 K so that I can validate my simulation results.” This is not an acceptable approach.
The problem with validation being easier to achieve with growing UV, however, does identify a problematic feature of Coleman’s approach, not addressed by either Roache or Oberkampf: the uncertainty measure UV includes terms for both simulation uncertainty and experimental uncertainty, meaning the final validation verdict is not only dependent on the experiment, but on the simulation as well. This obfuscates the central role of experimental measurements and their associated uncertainties in the process of validation. The validation process is intended to get to the truth, and it is only the experimental measurements and uncertainties that give information about the truth. For this reason, it is only the experimental data and associated uncertainties that are relevant to determining the validity of a computational model.
The second criticism mentioned was that the approach makes implicit assumptions about the distribution of experimental uncertainty. This is in fact a very revealing point: Roache brings up the question of uncertainty distributions, and how the conclusions made in the paper about two models would change if the errors were treated as normally distributed rather than uniformly distributed. In fact, treating the uncertainties as uniformly distributed is the least presumptive approach, and maximizes the information entropy within the uncertainty bounds. Typically, various statements about experimental uncertainties are made: for example, that they are normally distributed, that the mean is zero, and that they are distributed with a constant variance, which means the uncertainty bounds are centered on the data and symmetric. All of these statements rest on assumptions, however. Assuming normally distributed uncertainties is more presumptive than treating the uncertainties as uniformly distributed, and over a bounded interval the Gaussian distribution has a lower information entropy than the uniform distribution.
These criticisms reveal important characteristics that methods used to analyze validation results should have:
- Experimental data and their associated uncertainties are the only source of information about reality, and must not be contaminated.
- A simulation's numerical uncertainty should not appear in the validation metric used to compare simulation results to experimental data.
- Assuming a uniform distribution for uncertainty is the safest and least presumptive treatment of uncertainty.
It will be shown that the methodology selected for analyzing validation results, the DC approach, satisfies these criteria.
6.2 Data Collaboration Method
Because the concepts and the procedure of the DC method are closely related, the nomenclature will be presented with the procedure description.
The DC method begins with a set of experiments
E. For a given experiment
e ∈ E, a quantity is measured; this quantity is called an observable and is denoted
Ye. If there are multiple observables, the set of observables is denoted
Ye. The set of actual values that the observables
Ye take on is denoted by
ye: this is the
true value of the observable that experiment attempts to obtain. For multiple observables, indexed by
j, the true value of the
jth observable is denoted
yje. In reality,
ye cannot be measured exactly; instrumental measurements are imperfect and always have a range of uncertainty associated with them. The set of all values measured in the experiment compose the quantity
de. Each experimental measurement also has uncertainty associated with it. The uncertainty may not be symmetric (as mentioned in Section
6.1.1↑), and may not have a known distribution. Thus a given experimental measurement
dje for a given observable
j and experiment
e has a lower and upper bound on its uncertainty, denoted by
lje and
uje respectively. These are related to the quantities
yje and
dje as follows:
For the reasons mentioned in Section
6.1.1↑, the probability distribution for these uncertainty quantities are treated as uniform. Note that because uncertainty is never composed of exact and hard bounds (extremely high deviations from the true value are improbable but still possible), a decision must be made about where to set
lje and
uje. This may be a standard quantity, such as
1σ or
2σ, a
95% confidence level, or an estimated uncertainty multiplied by a safety factor (these are only some illustrative examples).
The DC approach extends beyond data and incorporates simulation model predictions of the observable yje in addition to the experimentally measured values of yje. The approach is rooted in the concept of a data set unit, which consists of the experimental data (the measured value dje of the observable and its associated lower and upper bounds lje and uje) and an associated model prediction of the observable yje, denoted yMje. These four values form a data set unit, Uje, defined by:
or, to compose an entire data set unit for all observables for an experiment,
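Sketching both definitions, with the unit written simply as the collection of the four quantities named above and the experiment-level unit as the collection over all observables:
\[
U_{je} = \left\{ d_{je},\; l_{je},\; u_{je},\; y^{M}_{je} \right\}, \qquad
U_{e} = \bigcup_{j} U_{je} .
\]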
Also considered, but not explicitly included, are the model’s input parameters
x that are used by the model,
yMje(x), and their associated range of values.
It was stated in Section
4.1.2↑ that the only appropriate validation metric for a simulation model was the truth criteria. To make this more concrete, the truth criteria is equation
(7.7↑). This is the only information known about truth. For this reason, the value of the true observable
ye is replaced with the simulation model’s prediction of the observable
yMe, so that equation
(7.7↑) becomes the validation or consistency criteria:
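That is, with the true observable replaced by the model prediction, the consistency criteria takes the form
\[
d_{je} - l_{je} \;\le\; y^{M}_{je}(x) \;\le\; d_{je} + u_{je} .
\]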
The data
dje, associated uncertainties
lje and
uje, and model prediction
yMje are treated using an integrated approach, which is the reason for the data set unit construct. This integrated approach recognizes that the measured data
de provide the best measure of the truth
ye that we can attain, and the focus is to determine when the model matches these data. The ultimate outcome of the DC approach is a quantitative measure of how well the model can reproduce the experimental data. This measure is called the
consistency (defined below, equation
(7.13↓)).
A model typically consists of a set of coupled differential equations; in this case, the model is the coupled DQMOM-LES code Arches (Section
2.6↑). Any model will require a set of input parameters
x to be specified for simulating a particular system. Each parameter has a range of
a priori uncertainty associated with it, which comprises the
initial parameter set, denoted
H. It is of interest to find the subset of values of
x ∈ H that will satisfy the consistency criteria
(7.10↑). This set of parameter values is called the
feasible set and is denoted
F.
The DC approach treats the uncertainty values using a set-based representation of uncertainty, which assumes no prior information about the probability of different uncertainty values. Other representations of uncertainty, such as Bayesian probability-based representations, can be used to incorporate prior uncertainty probability distributions. To begin, an a priori range of uncertainty in each parameter must be determined. This comprises the initial parameter set, which is a hypercube in parameter space. This initial parameter set, or hypercube, can be written:
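With αi and βi denoting the lower and upper uncertainty bounds on the ith input parameter (the notation used for the I/U map below), the hypercube can be written as
\[
H = \left\{ x \in \mathbb{R}^{n} \;:\; \alpha_i \le x_i \le \beta_i, \quad i = 1, \ldots, n \right\},
\]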
where
n is the number of parameters. When applying the DC approach to complex and expensive models such as Arches, the total number of parameters becomes very large. Because the number of function evaluations grows geometrically with the number of parameters being investigated, the
H and
x actually used in the DC analysis are in fact subsets of the full
H and
x. For simplicity,
H will refer to the dimensionally-reduced hypercube actually used in the analysis.
The feasible set consists of parameter values that satisfy criteria
(7.10↑); it is the intersection of the initial parameter set with the set of parameters
x that will satisfy the criteria
(7.10↑) for each observable:
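Combining the hypercube with the consistency criteria for every observable and experiment, the feasible set takes the form
\[
F = \left\{ x \in H \;:\; d_{je} - l_{je} \le y^{M}_{je}(x) \le d_{je} + u_{je}
\quad \text{for all } j \text{ and } e \right\}.
\]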
There are two potential outcomes of searching for the feasible set
F. The first outcome is that
F is an empty set. This implies that no possible input values will make the model fall between the experimental uncertainty bounds
lje, uje, and therefore the model is said to be inconsistent with the experimental data provided. The second outcome is that a feasible set is returned, and the model is validated
for the given model operating conditions, and for the feasible input parameter values x ∈ F.
In order to quantify the ability of the model to fit the data, a consistency measure CD is also defined as the maximum amount by which the experimental uncertainty bounds may be shrunk, subject to the constraints given above:
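One way to sketch this definition is with an additive shrinkage parameter γ (the precise scaling of the shrinkage differs among presentations of the DC method, so this is a schematic form rather than the exact published definition):
\[
C_D = \max_{\gamma,\; x \in H} \; \gamma
\quad \text{subject to} \quad
d_{je} - l_{je} + \gamma \;\le\; y^{M}_{je}(x) \;\le\; d_{je} + u_{je} - \gamma
\quad \text{for all } j,\, e ,
\]
so that CD is positive when the uncertainty bounds can be tightened while remaining consistent, and negative when they must be relaxed before any model prediction can fit.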
The term “consistency” may be used to refer to
CD, or it may mean
γ if it refers to the consistency for a single model prediction and its comparison to experimental data.
A final word should be said about the “collaboration” aspect of the Data Collaboration approach. Section
4.2.1↑ discussed the ambiguous nature of uncertainty. Some uncertainties can be attributed to either a model or an experiment, but many uncertainties can be flexibly categorized as model input uncertainties in one form, and experimental uncertainties in another form. It is for this reason that collaboration between experimentalists and modelers is important, both for the specific activity of analysis of uncertainty, and more generally for the entire process of model validation. The constraints imposed on the initial parameter set
H come from a variety of sources (experimental observations; experimental verification, or calibration, measurements; numerical studies; existing model validation studies; and sensitivity analyses, to name several). Each of these sources may be complemented, and its usefulness extended, by the insight of both experimentalists and modelers.
6.2.2 Fitting into the NISS Framework
It is also useful to discuss how the DC approach fits into the entire validation framework introduced by Bayarri
[104] and applied in Section
4.4↑. As mentioned in Section
4.3.4↑, many validation methodologies in the literature provide only pieces of the process. The DC approach is no exception, and most details related to the entire model validation process, aside from the actual DC procedure, are omitted from explanations or presentations of the model. However, it fits in well to Step 5 of the NISS framework, as it provides a sophisticated method for comparing experimental data to simulation results. It is also able to utilize information and results from prior steps in the framework.
The first step of the NISS framework is to generate an input/uncertainty map, Table
4.1↑. The starting point is a large list of input variables, only some of which are important; the variables are then ranked, and the list is reduced to the input parameters thought to be most important. This reduced input list is
x. Each input parameter
xi is then assigned lower and upper uncertainty bounds,
αi and
βi. This information composes the hypercube
H.
It is easy to pick an initial parameter set that is too small, or simply wrong. There may be dimensions of the hypercube that are ignored in the analysis, but that are dominant in the real-world application. For this reason, it is critical to establish sound reasoning for the selection of the initial parameter set. This is the primary role of the gasification studies discussed in Section
4.4.1↑, and is performed as part of step 1 of the framework (construction of the input/uncertainty map). Given the review of relevant gasification literature, substantial confidence can be invested in the well-informed initial parameter set
H constructed for the validation of the Arches gasification model.
The second step is to determine the evaluation criteria. This determines what experimental data de is used for the validation procedure. Depending on the experiment type (traditional experiment or validation experiment), this will make the determination of le and ue more or less difficult. The third step is the gathering of data and determination of the experimental uncertainty bounds le and ue.
The fourth step is construction of a surrogate model for the expensive simulation. This is a critical step for the DC method, because the method uses a constrained optimization technique to determine the optimal parameter values given the constraints imposed by the specified input parameter and observable uncertainties.
6.3 An Instrumentalist Approach to Validation
One theme that has been reiterated several times is the centrality of experimental data in the validation process. This is rooted in the adopted validation philosophy of instrumentalism (Section
4.1.3↑). An emphasis on experimental data was one of the attractive features of the validation procedure proposed by Coleman et al.
[29, 165, 31]. As discussed in Sections
4.2.1↑ and
6.1.1↑, experimental uncertainty is the baseline for validation. Only experimental observations and associated uncertainties reveal information about empirical truth (numerical uncertainty plays a different role; see Section
3.3.5↑), and so only the experimental uncertainty should be used when actually comparing the simulation results to experimental data, a conclusion drawn in Section
6.1.1↑.
The philosophy of instrumentalism is reflected in the Data Collaboration approach. First, a statement about truth is made: the experimental observations and their uncertainty bounds define what is accepted as empirically true. Next, the model is held to a high standard, being judged by the same criteria by which truth is judged: its predictions must fall within those same experimental uncertainty bounds. And the only input parameter values that are feasible are those that make the model match the experimental observations of the truth.
This validation methodology can be seen more generally as an inductive approach to model construction, in which the data determines the form of the model. While many scientists have raised issues with the process of inductively constructing models, notably Popper
[135, 87], the fact is that induction is the only practical way forward in many cases. (Indeed, some extreme phenomenalists such as Mach would argue that it is the
only way forward.) The DC method starts from the data and draws conclusions about the model form from the data; instrumentalism is itself a deeply inductive approach, and the DC method fits the instrumentalist philosophy of validation well.
To repeat a quote from Section
4.1.3↑, which provides a general discussion of instrumentalism, Ernst Mach said the following in his 1882 lecture “The Economical Nature of Physical Inquiry:”
In reality, the [model] always contains less than the fact itself, because it does not reproduce the fact as a whole but only in that aspect of it which is important for us, the rest being intentionally or from necessity omitted. (p. 193, Popular Scientific Lectures, [105].)
The DC method is, in many ways, an embodiment of this statement, applied to model validation. First, the “aspect... which is important for us” can be thought of in several different ways. The model is an economical representation of reality; it is trying to make a prediction relevant to the evaluation criteria selected in the second step of the framework (Section
4.4.2↑). Validation, too, is kept economical by reducing the model’s hypercube to only those variables which are postulated to be important for the evaluation criteria.
Second, these simplifications or omissions are made both “intentionally” and “from necessity.” The Arches simulation model uses large eddy simulation to resolve turbulent scales and DQMOM to track the full, multivariate distribution of coal particles; overall, it attempts to omit less and less physics from the problem. On the other hand, the intention in the surrogate model construction procedure is to move in the opposite direction: to omit as much of the physics as possible by concocting a surrogate model that economically reproduces the behavior of the full-scale Arches model: that aspect of the Arches model that is most important to us.
6.4 An Overview of the Data Collaboration Approach
The DC approach described by Feeley
[148] utilizes a systems analysis approach to uncertainty propagation and model validation. The intention of the Data Collaboration approach is to address not just the question of how well the model matches data, but also how the model might be improved, what conditions would be useful for further experimentation, and what impact an additional experiment may have on the accuracy of a model prediction. This is done by approximating the range of the output set through its extreme values, the left and right bounds Lje and Rje on yMje, which both provide a quantitative measure of the effect of input uncertainty propagated through the system. This is of great interest because
yMje can be compared directly to
dj, the data measurement, and its uncertainties
lje and
uje. If a map is constructed between the input uncertainty bounds
αi and
βi for parameter
xi and the output uncertainty bounds
Lje and
Rje for the outputs
yMje, this map can be used to answer some of the questions mentioned above. However, constructing such a map is extremely computationally intensive, making surrogate models necessary; these were covered extensively in Chapter 5, and they provide a way forward through the solution mapping technique. The actual problem being solved is not to find Lje and Rje exactly, but rather to find inner and outer bounds for each, such that L̲je ≤ Lje ≤ L̄je and R̲je ≤ Rje ≤ R̄je. This optimization problem can then be expressed as a quadratic program, that is, optimization of a quadratic function under quadratic constraints. The outer bounds can be found using convex relaxation, a topic covered well by Borichev
[7], while the inner bounds, which are more difficult to find, can be computed using constrained optimization techniques such as branch and bound
[164]. These tie in neatly with several set-based approaches to uncertainty mentioned in Section
4.2.2↑, particularly interval analysis (see
[113] and
[10]).
Approaches to problems of this type are referred to as mathematical programming; programming is a synonym for optimization. For example, linear programming solves linearly constrained problems, posed as maximizing a linear objective function cᵀx subject to a set of linear constraints such as Ax ≤ b.
Similarly, quadratic programming, the technique utilized by the Data Collaboration toolbox to find
7.17↑ and
7.18↑, computes a minimum or maximum of xᵀQx subject to linear inequality constraints, Ax ≤ b, or linear equality constraints, Cx = d (or both). A special case of this (also utilized by the Data Collaboration toolbox) is quadratically constrained quadratic programming (QCQP), which solves the minimization or maximization problem for xᵀQx subject to quadratic constraints of the form xᵀPᵢx + qᵢᵀx + rᵢ ≤ 0.
This subject is covered in Chapter 4 of Boyd
[164], which provides an excellent explanation and several good examples.
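To make the QCQP form concrete, the small sketch below minimizes a quadratic objective subject to a single quadratic constraint using a general-purpose solver; the matrices are arbitrary illustrative values, and the Data Collaboration toolbox itself relies on convex relaxations and branch and bound rather than this generic approach.
\begin{verbatim}
import numpy as np
from scipy.optimize import minimize

# Minimal QCQP sketch: minimize x^T Q x  subject to  x^T P x + q^T x + r <= 0.
# Q, P, q, and r are arbitrary illustrative values, not taken from this work.
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])
P = np.eye(2)
q = np.array([-1.0, -1.0])
r = -1.0

objective = lambda x: x @ Q @ x
# SLSQP expects inequality constraints in the form fun(x) >= 0.
constraint = {"type": "ineq", "fun": lambda x: -(x @ P @ x + q @ x + r)}

result = minimize(objective, x0=np.array([0.5, 0.5]),
                  method="SLSQP", constraints=[constraint])
print(result.x, result.fun)
\end{verbatim}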
Several alternative approaches to programming problems exist, such as the genetic algorithm, which attempts to mimic the process of evolution in searching for optimal solutions by selecting populations of samples and operating on them in stages (or “generations”)
[62]; simulated annealing, which draws its inspiration from the physical process of annealing, with two parameters (local gradient, or “heat,” and global “temperature”) dictating the rapidity and randomness of changes (as the global “temperature” parameter decreases, the changes become increasingly local)
[149]; and neural networks, already covered in the context of metamodeling for expensive functions, which can also be applied to optimization.
The technique ultimately utilized by the DC method combines a QCQP method for determining Lje and Rje with a technique that approximates the output of general models fed to the DC algorithm using piecewise polynomials; this allows an extension of the QCQP method to general, nonquadratic models. The initial hypercube H is thus broken up into several regions, each described by a quadratic model, and the efficient QCQP methods are used on each region of the hypercube. This is handled within the framework of a branch and bound algorithm.
The optimization approach is implemented in the Data Collaboration Matlab toolbox. This toolbox applies the Data Collaboration approach using the optimization procedures and algorithms described in order to determine the consistency measure
CD for the data and model predictions fed to the toolbox. The toolbox is described in greater detail by Feeley
[148] and Russi
[176].
One question that springs from this approach is, why pose the problem in this way? Why require complex algorithms such as branch and bound, instead of using a Monte Carlo approach? This can be posed as a question about the worst case scenarios of the system,
Lj and
Rj. The intention of robust control theory is to provide better (but more conservative) estimates of the worst case scenarios; Ghaoui and Calafiore put it this way: “the worst case analysis seems to be somewhat conservative, but the reader should be aware that the actual worst-case behavior cannot be accurately predicted, in general, by taking random samples”
[39]. Similarly, in the Matlab Robust Control Toolbox User’s Guide, Balas et al. state: “Monte Carlo methods are inherently hit or miss. With Monte Carlo methods, you might need to take an impossibly large number of samples before you hit upon or near a worst-case parameter combination”
[56]. Given simple enough models, however, and given the right assumptions, Monte Carlo methods may be a viable alternative to determining the worst case behavior of the system (see e.g.,
[32]).
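The caution in these quotes is easy to demonstrate with a toy experiment: when the worst case occupies only a small corner of parameter space, the sample maximum from uniform random sampling systematically under-predicts the true maximum. The test function below is arbitrary and unrelated to the gasification model.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def response(x):
    # Arbitrary test function whose worst case is a sharp spike near (1, 1).
    return np.exp(-1.0e4 * ((x[..., 0] - 1.0)**2 + (x[..., 1] - 1.0)**2))

true_worst = response(np.array([1.0, 1.0]))   # equals 1.0 by construction
samples = rng.random((1_000, 2))              # uniform Monte Carlo samples
sampled_worst = response(samples).max()       # typically far below 1.0

print(f"true worst case      : {true_worst:.3f}")
print(f"Monte Carlo estimate : {sampled_worst:.3f}")
\end{verbatim}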
6.5 Data Collaboration for Coal Gasification
6.5.1 A Statement of the Validation Problem
The Data Collaboration approach is intended to complete the validation process by providing a method for Step 5 in the NISS framework (Section
4.4↑). For the particular problem of validating very expensive computer simulations, Step 5 has several pieces of information to integrate. First, and of primary importance, is the experimental data and associated uncertainties. This provides the validation measure. The second piece of information is the evaluation criteria, which is very closely linked to the experimental data and uncertainty. Third, simulations of the system of interest are run using the expensive model; while the experiments that are run require extensive attention to details, the step of running simulations requires equal consideration, due to the expense of the model. For the validation presented here, the expensive model is the Arches coal gasification model, which is described in Section
2.6↑.
The Arches model is an expensive function,
yMe(x), of a set of input parameters
x that can be grouped into three categories: model parameters, scenario parameters, and numerical parameters (Section
4.2.1↑). In the previous chapter, several techniques for constructing surrogate models were covered, and response surfaces for the responses of interest of the Arches coal gasification model were generated. These pieces are integrated into dataset units in step 5, using the procedure described above.
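The notion of a dataset unit can be sketched as a small container that pairs one experimental observation and its uncertainty bounds with the corresponding cheap surrogate; the names and structure below are illustrative, not the Data Collaboration toolbox’s actual data structures.
\begin{verbatim}
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class DatasetUnit:
    """One experimental observation paired with its surrogate model."""
    d_e: float                                     # measured value
    l_e: float                                     # lower experimental bound
    u_e: float                                     # upper experimental bound
    surrogate: Callable[[Sequence[float]], float]  # cheap approximation of y_Me(x)

    def prediction_in_bounds(self, x: Sequence[float]) -> bool:
        """True if the surrogate prediction at x lies within [l_e, u_e]."""
        return self.l_e <= self.surrogate(x) <= self.u_e
\end{verbatim}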
6.5.2 The Expensive Model and the Cheap Model
The model that is being validated is the Arches coal gasification model, described in Section
2.6↑. This model is extremely expensive, incorporating many coupled multiphysics models and solving the governing equations of the flow with very high temporal and spatial resolution. For this reason, it cannot be used in optimization routines or as part of a Monte Carlo sampling study. In order to obtain the best of both worlds (the accuracy of an expensive model like Arches with the cheapness of a polynomial or other function that would typically be used in optimization routines), a response surface was constructed. This procedure was described in Chapter
5↑.
6.5.3 The Input Uncertainties and the Output Uncertainties
The input uncertainty map, which lists all active parameters considered and their associated uncertainties, is presented in Table
4.1↑, along with a discussion of experimental error in the BYU gasifier (Section
4.4.3↑). There were several input uncertainties reported for mass flowrates by Brown and by Sowa, ranging from 7% to 20%, with the percentage increasing as a function of mass flowrate. These uncertainties were based on repeated observations, from which a standard deviation was computed and a confidence interval was constructed. From this information, a mass flowrate uncertainty of 10% was considered reasonable based on the mass flowrate of the simulated gasification case (22.1 kg/hr) and the reported uncertainties. Unlike mass flowrates, input uncertainties for mass mean particle diameter are not reported in any studies, so determination of mass mean particle size (37
μm) and an associated uncertainty range (10%) was based on the coal type used and the range of mass mean particle sizes reported for this type of coal
[158, 160]. The same approach was taken for wall temperature (1200 K with
±200 K, or 16%, uncertainty), based on information provided in
[161, 158]. The model input parameters had larger uncertainties, determined primarily by accounting for the range of model parameter values reported in the literature (see
[93, 177] for reported
E2 and
A2 values, and
[159] for
Echar − CO2 values).
The uncertainties in the system response measurements (CO,
CO2, and
H2) were also reported
[89, 161]. Both sources gave the uncertainty in measurements of
[CO],
[CO2], and
[H2] of
±1.7%. No information on the uncertainty quantification procedure was given, but presumably these were confidence intervals constructed from standard deviations from repeat runs. After several discussions regarding this value of uncertainty, it was concluded that the reported value was dubious, and that it was likely to be much higher in reality. The uncertainty analysis by Sowa
[162] cast doubt on whether
±1.7% could truly account for all of the uncertainty reported and analyzed in that study. Additionally, as mentioned in Section
4.4.3↑, the reporting of traditional experiments often introduces substantial uncertainty simply because values are presented in plots rather than in tabulated, quantitative form. Accounting for all of these factors led to an expansion of the uncertainty to
±10%.
6.6 Qualitative Validation Analysis
A qualitative analysis can be performed by comparing the experimental and computational gasifier concentration profiles (Figures
6.1↓ through
6.4↓ for the screening study runs, covered in Section
5.4.1↑, and Figures
6.5↓ through
6.8↓ for the factorial design runs, covered in Section
5.4.2↑). In addition, Figures 6.9↓ through 6.14↓ show surface plots of the residuals from the comparison between the experimental gasifier data and the Arches model predictions.
\clearpage
Using a qualitative analysis, it is easy to pick out particular cases that appear to match well. However, because several parameters are being changed simultaneously, it is not easy to discern a pattern in which runs produce good results and which produce mediocre ones. Were only one or two parameters being changed, identifying such a pattern would be trivial; with more, the importance of using statistical analysis to determine patterns in the effects of variables on the responses is easily demonstrated.
An analysis of the residual plots yields similar conclusions: while there are general regions where the model prediction is particularly good or particularly poor, no underlying pattern can be identified, due to the number of variables. Even if the experimental uncertainty bounds were added to these plots to compare the level of residual error with the level of experimental error, it would be difficult to draw conclusions about the effect of the parameter values on these regions.
\clearpage
6.7 Data Collaboration Validation Analysis
The validation using the Data Collaboration method proceeded in several steps. The method started with an initial hypercube
H containing the prior bounds for each input parameter, defined in equation
(7.11↑). It then proceeded to reduce this hypercube to satisfy truth constraints (equation
7.7↑), resulting in a set of feasible parameters
F (equation
7.12↑). The validation process begins with an attempt to validate the entire dataset, which is an attempt to find an
F that will satisfy
all experimental observations. Based on the results of this validation, further validation steps fragment the data into subsets, and validation is attempted on each of these subsets. This procedure can reveal specific information about strengths and weaknesses of models, based on how well particular subsets of data are matched, and whether they can be validated.
As an example, it may be found that of the 90 possible responses, there is a single feasible hypercube that leads to validation for 89 of the 90 points, but one point is an extreme outlier, and consistency may be achieved by excluding that outlier (this was the case for the GRI Mech application given by Feeley et al.
[147]; upon further analysis, the outlier data points were revised by the experimentalists who obtained them). Another possibility is that consistency may be achieved for a particular spatial region of the gasifier, or for a particular species. This may point, for example, to physical models needing improvement.
6.7.1 Validation for All Data
The first validation attempt was a validation using all data simultaneously. This validation was unsuccessful; the data collaboration method returned the result that the model was inconsistent with the data. This was the anticipated outcome of the validation process. The next step was to fragment the data to attempt validation on subsets of the data. The first subset attempted was the data grouped by species.
6.7.2 Validation for Data Grouped by Species
The next validation attempt was for each species separately; this consisted of three validations, each with 30 data points. The validation resulted in inconsistency for all three species. The experimental error was increased to a slightly larger value of 10% to see if this would have an effect. When this was done, consistency was achieved for
CO2. The consistency measure
CD computed was 0.01.
H2 and CO were both inconsistent with data. The lower and upper bounds for the feasible set for
CO2,
FCO2, are presented in Table
6.1↓. One somewhat surprising result is that the feasible bounds on
E2, denoted
αFi and
βFi, are reduced from the prior bounds
αi and
βi more than they are reduced for any other variable. This is likely an indication that the effect of uncertainty in
E2 on the ability of the model to match data overwhelms the effect of uncertainty in other parameters. Figures
6.15↓ and
6.16↓ plot the prior left and right bounds LHj and RHj, defined as the minimum and maximum of the surrogate model prediction over the prior hypercube H, in gray, and the feasible left and right bounds Lje and Rje, defined analogously over the feasible set F, in black. The experimental data uncertainty range is also plotted (dotted lines). These plots illustrate the wide range of the prior left and right bounds, compared with the very narrow feasible left and right bounds. For the
CO2 system response, a majority of this feasibility space depends on
E2.
CO2
Parameter | αFi | βFi
E2 [J/kmol] | 1.82 × 10⁸ | 2.3 × 10⁸
Twall [K] | 1000 | 1390
dp [μm] | 33.37 | 38.55
ṁcoal [kg/hr] | 22.1 | 24.1

Table 6.1 The parameters in the feasible set resulting from the validation of the model’s prediction of CO2, FCO2, for the species-only comparison test.
6.7.3 Validation for Data Grouped Spatially
After determining that two of the species concentration profiles were inconsistent even when analyzed separately, it was of interest to explore this inconsistency further. The data were grouped by axial location, and each radial profile validated independently. This revealed that CO was consistent at all axial locations except
x = 36 cm, and
H2 was inconsistent at all axial locations except
x = 21 cm. Results from this consistency analysis are presented in Table
6.3↓ and Figures
6.17↓ and
6.18↓.
The location
x = 36 cm was previously identified as bucking trends seen in other spatial locations when box contour plots of residuals were presented for Zone I in Section
5.3↑.
CO
Parameter | αFi (x = 21 cm) | βFi (x = 21 cm) | αFi (x = 51 cm) | βFi (x = 51 cm) | αFi (x = 67 cm) | βFi (x = 67 cm)
E2 [J/kmol] | 1.35 × 10⁸ | 1.72 × 10⁸ | 1.91 × 10⁸ | 2.03 × 10⁸ | 1.85 × 10⁸ | 2.83 × 10⁸
Twall [K] | 1036 | 1385 | 1050 | 1364 | 1027 | 1344
dp [μm] | 35.9 | 40.6 | 35.4 | 40.2 | 33.4 | 39.5
ṁcoal [kg/hr] | 19.9 | 22.8 | 21.1 | 22.4 | 20.9 | 24.3

Table 6.2 The parameters in the feasible set resulting from the validation of the model’s prediction of CO, FCO, for the consistency test of radial profile points only at x = 21 cm, x = 51 cm, and x = 67 cm (x = 36 cm is excluded because it is inconsistent).
Parameter | αFi (CO, x = 81 cm) | βFi (CO, x = 81 cm) | αFi (CO, x = 121 cm) | βFi (CO, x = 121 cm) | αFi (H2, x = 21 cm) | βFi (H2, x = 21 cm)
E2 [J/kmol] | 1.89 × 10⁸ | 2.99 × 10⁸ | 2.01 × 10⁸ | 2.99 × 10⁸ | 2.02 × 10⁸ | 2.35 × 10⁸
Twall [K] | 1101 | 1361 | 1082 | 1343 | 1123 | 1340
dp [μm] | 35.5 | 40.6 | 34.3 | 40.2 | 34.5 | 40.0
ṁcoal [kg/hr] | 22.9 | 24.3 | 21.8 | 24.3 | 19.9 | 24.3

Table 6.3 The parameters in the feasible set resulting from the validation of the model’s predictions of CO and H2, FCO and FH2, for the consistency test of radial profile points only, at x = 81 cm and x = 121 cm for CO and x = 21 cm (the only consistent radial profile) for H2.
6.7.4 Interpretation
The results do not lead to a clear, definitive interpretation. They suggest that the prediction of CO and CO2 is adequate, while the prediction of H2 is poor except near the inlet. The results also indicate that although the surrogate model predictions of CO could not be validated as a whole, the trend was that surrogate model predictions of CO were consistent throughout nearly the entire reactor, and predictions of CO2 were consistent throughout the entire reactor.
Another interesting trend that appeared for both CO and
CO2 was that
E2 was the parameter whose feasible bounds varied most widely from location to location. This indicates an important distinction between the statistical analysis of the factorial design results (Section
5.4.2↑), which indicated strong main effects from
Twall, and the validation procedure, which indicated that
Twall had nearly negligible effect on the ability of the model to match data; that distinction is that the statistical analysis reveals only
sensitivity trends. Sensitivity trends indicate whether a variable has a strong effect on the system response, which is purely a mathematical question, independent of data. In contrast, the change in a variable’s range from
H to
F indicates its impact, not on the response, but on how well the response matches the data. This is a critical difference, analogous to the difference between verification and validation.
Although the Data Collaboration method provides a valuable way of approaching validation, and provides a useful validation criterion, namely consistency, it yields muddled conclusions about the valid parameter space, particularly when the data cannot be validated as a whole and must be validated in fragments. For example, the Data Collaboration results show that the surrogate model predictions of CO and CO2 compare to the data fairly well, but the H2 predictions do not, and valid parameter ranges were given for each fragment of data. It is unclear how to proceed with this information, given that the ranges are disparate. Some valid parameter values (1.35 × 10⁸ ≤ E2 ≤ 1.72 × 10⁸ for CO at x = 21 cm) have no overlap with others (1.82 × 10⁸ ≤ E2 ≤ 2.3 × 10⁸ for CO2 at all spatial locations). Further fragmentation, e.g., of axial profiles, would likely further confound the interpretation of results. From these results, it is unclear how the validation is to be judged, what the feasible hypercube looks like (or whether there is one), or how to move forward and use these results to make predictions with the model. Some apparent conclusions may be drawn from these results, but such illusory conclusions may fuel confusion.
6.8 Monte Carlo Validation Analysis
An alternative to the Data Collaboration approach alluded to earlier is a Monte Carlo sampling of the surrogate model to explore the consistency subspace. Due to the low dimensionality of the surrogate model, a large enough number of samples is likely to reveal the underlying trends. Furthermore, given the linear behavior of the surrogate model, the Monte Carlo sampling is expected to be straightforward to perform.
A total of 9 million Monte Carlo samples were gathered: 100,000 samples for each of the 90 response surfaces (one response surface for each combination of
x,
r, and species). The computational time required for gathering all of these samples was approximately 2 hours in serial (using a single 2.8 GHz Intel Xeon dual-core processor) and approximately 20 minutes in parallel (using eight 2.8 GHz Intel Xeon dual-core processors). For each sample of the surrogate model, the consistency metric
γ, as defined in equation
(7.13↑), was computed. This
γ is analyzed as a function of the input parameters,
γ(x). It should be noted that input parameters, as presented in this section, are scaled to be in the interval
[0, 1], following the variable normalization procedure described in Section
5.2.3↑.
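A sketch of this sampling loop is shown below. The linear surrogate coefficients and the experimental bounds are placeholders, and the specific normalization of γ (the fraction of the experimental half-width by which the bounds can be shrunk while still containing the prediction) is an assumption consistent with its description in the text.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(42)
n_samples = 100_000

# Placeholder first-order response surface in the scaled inputs x in [0, 1]^4
# (coefficients are illustrative, not the fitted Arches surrogate).
beta0, beta = 0.30, np.array([-0.15, 0.05, 0.02, 0.04])
surrogate = lambda x: beta0 + x @ beta

# Placeholder experimental observation with +/-10% uncertainty bounds.
d_e = 0.32
l_e, u_e = 0.9 * d_e, 1.1 * d_e

x = rng.random((n_samples, 4))          # uniform samples of the scaled inputs
y = surrogate(x)                        # surrogate predictions for each sample

mid, half = 0.5 * (l_e + u_e), 0.5 * (u_e - l_e)
gamma = 1.0 - np.abs(y - mid) / half    # gamma > 0 iff l_e < y < u_e

print("Pr{gamma > 0} ~", np.mean(gamma > 0.0))
\end{verbatim}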
Figure
6.20↓ shows plots of the consistency measure in each parameter subspace. From these plots, one obvious trend appears, just as it appeared in the DC analysis: the dominant influence of
E2 on consistency. Figure 6.20(a)↓ shows a clear trend of larger inconsistencies with higher E2. The trend can also be observed in Figures 6.20(c)↓ and 6.20(b)↓. Also of note is that consistency is achieved somewhere in the sampled range for every parameter value. While this does not contradict the Data Collaboration approach (which searched for a set of parameter combinations that would achieve consistency for all locations simultaneously), it does show that either (a) the surrogate model is able to make good predictions for a large number of parameter combinations and locations, (b) the parameter ranges selected were good choices, or (c) both.
More trends are also seen upon examining the consistency by species; only a subset of plots of consistency for model predictions grouped by species as a function of input parameters
x are presented in Figure
6.21↓. For example, a contradictory trend is seen in the consistency of CO and
CO2 with respect to parameter
E2 (Figures 6.21(a)↓ and 6.21(b)↓): the consistency for species CO increases as E2 decreases; at large E2, there are more inconsistent (gray) points and the consistent (black) points are less densely distributed, while at low E2 the consistent points are more densely distributed. The opposite trend is observed for CO2, with inconsistencies of larger magnitude at lower values of E2. Consistency of CO2 model predictions also exhibits a strong dependence on both Twall and dp (Figure 6.21(e)↓), with the worst agreement between model predictions and data occurring at high values of both. CO, in contrast, exhibits both good and bad agreement with data regardless of the value of Twall (Figure 6.21(d)↓). Looking at
H2, the most densely-distributed consistent predictions occur at high values of
E2, high values of
ṁcoal, and low values of
Twall.
While these scatter plots do not in themselves lead to strong or easy-to-implement conclusions, they are another step in the process of analyzing model agreement with data, and plots such as those in Figure
6.21↓ reveal more about the underlying effects of the input variables on the validity of model predictions than the Data Collaboration toolbox approach does.
In order to further explore the consistency measure and visualize it in a more easily interpretable form, plots of the consistency probability, defined as the probability that the consistency measure is positive, Pr{γ(x) > 0}, were computed. Plots of this probability function are easier to interpret than the scatter plots. The first plots are analogues to Figure
6.20↑ and show the probability of consistency or inconsistency as a function of the input parameters. While the patterns of effect of the input variable value on consistency are visually easier to understand, the plots demonstrate a similar conclusion to Figure
6.20↑: the probability of consistency is roughly 50% across the range of each parameter. However, unlike the scatter plots, conditional probabilities are easier to explore (Figure
6.24↓). These can reveal trends that may be lost by showing all data simultaneously.
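Conditional probabilities of this kind can be estimated directly from the same Monte Carlo samples by binning on a single parameter; the sketch below (reusing placeholder x and gamma arrays like those in the previous sketch) conditions on one scaled input.
\begin{verbatim}
import numpy as np

def conditional_consistency(x_i, gamma, n_bins=10):
    """Estimate Pr{gamma > 0 | x_i in bin} from Monte Carlo samples.

    x_i   : 1-D array of one scaled input parameter, values in [0, 1]
    gamma : 1-D array of consistency measures for the same samples
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    probs = np.empty(n_bins)
    for b in range(n_bins):
        mask = (x_i >= edges[b]) & (x_i < edges[b + 1])
        probs[b] = np.mean(gamma[mask] > 0.0) if mask.any() else np.nan
    return edges, probs

# Example use with the arrays from the previous sketch:
# edges, probs = conditional_consistency(x[:, 0], gamma)
\end{verbatim}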
Furthermore, it is also useful to visualize probability functions for particular species or particular spatial locations. Visualization of the consistency probability for radial profiles of individual species reveals very interesting results. CO shows a marked increase in its consistency probability with axial distance from the injector, while
CO2 shows a sharp decrease in agreement with data near the gasifier exit. It is interesting to note that Figure
6.27↓ shows feasible values of
Twall that match those found by the Data Collaboration toolbox (Table
6.1↑). Investigating these probabilities further leads to visualization of the probability function in two dimensions.
This analysis results in a large amount of data, with 90 four-dimensional probability functions, and this data lends itself to being viewed from many different perspectives. Depending on which perspective is taken, the code may appear validated, or invalidated, or the validation verdict may be inconclusive. If the intention is to find the region of parameter space with the best probability of consistency for a particular response or set of responses, this is possible; but if the intention is to find the region of parameter space with the best possible consistency for the entire set of responses, this proves an unclear goal with an unclear answer, becoming less clear as the number of responses increases. Looking at the probability of consistency for the entire set of responses, the probability of consistency is around 50% (Figure
6.23↑). Were this number 10%, or 90%, making a qualitative judgement about how well the code performs overall would be easy. However, no such statement can be made from Figure
6.23↑.
6.9 Prediction
The figures included here are intended to show that there are many patterns, peaks, troughs, and regions of consistency or inconsistency among the four dimensions and 90 data points. The end goal is to use all of this information to construct a prediction interval with some estimate of the error in that prediction: that is, a prediction with an associated prediction uncertainty. The Data Collaboration method makes predictions by evaluating the surrogate model using parameters from the feasible set, and determining the bounds on the simulation responses from the feasible set. Results from this are shown for
CO2 in Figures
6.15↑ and
6.16↑, and for CO and
H2 in Figures
6.17↑ and
6.18↑. The Data Collaboration approach presumes that, because the parameters are in the feasible set, predictions made using parameter values within the feasible set are “valid” predictions, in the sense that they can be trusted. However, making predictions with the Data Collaboration method does not provide a level of belief or confidence in the predictions. This is in line with the general philosophy of the Data Collaboration method, which is to use a set-based approach, rather than a probabilistic approach.
A probabilistic approach, in contrast, such as the approach exhibited with the Monte Carlo method of evaluating consistency between the surrogate model and the experimental data, provides a system within which a prediction interval may be constructed. This provides a way forward that resolves some of the difficulties mentioned in the close of Section
6.8↑: the information about the consistency probability can be used in conjunction with the surrogate models at each spatial location to construct prediction intervals for given values of parameters.
This also provides a way to reconcile the Data Collaboration approach and the Monte Carlo approach: the Data Collaboration method is valuable for extracting the region of parameter hyperspace that leads to consistency (this can also be extracted from the Monte Carlo results). However, when the Data Collaboration results are inconsistent, the consistent parameter hyperspaces do not overlap; drawing conclusions from the Data Collaboration results is then difficult, and making predictions is not possible (as covered above). In that case, the Monte Carlo validation analysis can be used, as illustrated above, to find a region of parameter space in which the simulation tool will be used to make a prediction, by exploring only the responses of interest. This set of parameter combinations is then used to compute the probability of consistency for each response; these probabilities are then used to construct a prediction interval.
In the following sections, the use of this approach is illustrated. First, concepts underlying the construction of prediction intervals are covered. These concepts are then applied to the results from the Monte Carlo surrogate model evaluation to identify parameter combinations of interest, and the corresponding probabilities of consistency are used to construct a prediction interval with associated prediction uncertainty bounds.
6.9.1 Prediction Interval Construction
Following (loosely) the nomenclature of Young and Smith
[192], let
a denote a set of observations with observed values of the random variable
A,
A1, …, An, for which the goal is to predict the value of a random variable
Z that will follow next in the set of observations. The density of the random variables in
a is given by
f(a∣θ), where
θ is a parameter or parameter vector that specifies the particular form of
f. In this case, the distribution of
Z is given by
g(z∣θ⋆), where
θ⋆ is an unknown parameter; it cannot be assumed that
θ⋆ = θ. A distribution or statistic that is a function of
A1, …, An and
Z that is independent of
θ, called a pivotal, would provide a way of defining a prediction set for
Z.
Let
T = T(A1, …, An, Z) be pivotal, independent of
θ. If
Pr{T ∈ Rα} = 1 − α, where
Rα is a set, then the set
S = {Z : T(A1, …, An, Z) ∈ Rα} defines a prediction set for
Z, with an associated probability of
1 − α that
Z will fall in
S, that is independent of
θ and
θ⋆. Now let a confidence interval for
T be constructed as,
and let an approximation to
G(t∣θ), which is a pivotal estimate, be denoted
G̃(t) ≈ G(t∣θ). Also let a specific
t be denoted,
tα, such that
G̃(tα) = 1 − α. Then equations
(6.9.1↑) and
(7.26↑) become:
The choice of statistic
T is highly dependent on the distribution type, and is generally difficult to find
[130]. For a normal distribution, a pivotal quantity can be found using Cochran’s theorem
[28], and is:
The value of
tα can be determined from the normal distribution; and is:
where
σ is the standard deviation, and zα is the (1 − α/2)th quantile of the standard normal distribution. Because σ is rarely known exactly, particularly for small sample sizes, it is approximated with s, the sample standard deviation.
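With s substituted for σ, the standard normal-theory expressions for the sample standard deviation and the resulting prediction interval (a sketch of the textbook result under the normality assumption, using the zα defined above) are
\[
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(A_i - \bar{A}\right)^2},
\qquad
\bar{A} - z_{\alpha}\, s\,\sqrt{1 + \tfrac{1}{n}}
\;\le\; Z \;\le\;
\bar{A} + z_{\alpha}\, s\,\sqrt{1 + \tfrac{1}{n}},
\]
where the overbar denotes the sample mean of the observations A1, …, An.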
6.9.2 Prediction Intervals for Model Validation
When making predictions using a validation tool, the true quantity of interest is the validation outcome: valid, or invalid? In other words, consistent, or inconsistent? For this reason, the prediction interval, which is a bound on a future prediction Z based on previous observations Ai, should be a prediction about consistency. In this section, the quantity Z is the probability of a valid model prediction, Pr{γ(x) > 0}. This can be constructed based on prior information about whether model predictions are valid.
In order to construct a prediction interval for γ for the Arches gasification simulation tool, it is necessary to obtain a sample standard deviation s. Because simulations are “deterministic,” in the sense that repeating a simulation with the same input variables (and on the same computational system) results in identical simulation responses, there is no variance in the simulation predictions, and therefore no deviation of the probability of consistency Pr{γ(x) > 0} from its mean due to the simulation. For this reason, the variability must come from experimental observations. But more is needed than simply additional experimental observations: multiple observations of the same quantity only contribute to a single pair of lower and upper bounds le and ue corresponding to a single experiment de. What is needed is multiple experiments, each yielding a new de and new le and ue.
Using this, a consistency measure
γ for a given model prediction can be defined. It is a function of the experimental observation
de, as well as the lower and upper bounds
le and
ue, and the simulation input parameters, γ = γ(de, le, ue, x), so that the quantity for which a prediction interval is being constructed can be expressed as the probability Pr{γ(de, x) > 0}.
The probability plots presented in Section
6.8↑ are plots of
Pr{γ(de, x∣xi = x̂i) > 0}, that is, the probability of consistency conditioned on particular values x̂i of the parameter(s)
xi.
The quantity A in equation (7.34↑) corresponds to the probabilities of consistency observed in the prior experiments, with one observation Ai = Pr{γ(de, x) > 0} per experiment e, and the quantity s is the sample standard deviation of these observed probabilities. Using this notation, a prediction interval, with a confidence level of 1 − α, can be constructed for the probability of a valid prediction.
The prediction interval for
Pr(γ(de, x) > 0) can be constructed for
γ conditional on all parameters
xi taking on particular values
x̂i, γ(de, x∣x1 = x̂1, x2 = x̂2, …, xN = x̂N); it can be constructed for γ conditional on particular values of a single parameter, γ(de, x∣x1 = x̂1); it can be constructed for γ conditional on ranges of values of parameters, γ(de, x∣x̂i⁻ ≤ xi ≤ x̂i⁺); etc. This is done by computing the probabilities of these
γ values being greater than 0, using the Monte Carlo analysis described in Section
6.8↑ above, for each experiment
e; these are then used to construct the probability prediction interval
7.40↑.
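As a small numerical sketch of this construction (the per-experiment probabilities below are placeholders, not the values computed from the Soelberg and Brown response surfaces):
\begin{verbatim}
import numpy as np
from scipy.stats import norm

def probability_prediction_interval(probs, alpha=0.05):
    """Normal-theory prediction interval for the probability of a valid
    prediction, Pr{gamma(d_e, x) > 0}, given its value in n prior experiments."""
    probs = np.asarray(probs, dtype=float)
    n = probs.size
    mean, s = probs.mean(), probs.std(ddof=1)   # sample mean and std deviation
    z = norm.ppf(1.0 - alpha / 2.0)             # (1 - alpha/2) normal quantile
    half_width = z * s * np.sqrt(1.0 + 1.0 / n)
    # Probabilities live on [0, 1], so the interval is clipped to that range.
    return max(0.0, mean - half_width), min(1.0, mean + half_width)

# Placeholder consistency probabilities from two prior experiments:
print(probability_prediction_interval([0.45, 0.62]))
\end{verbatim}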
The experiments run in order to obtain a prediction interval should all cover the physical regime of interest for the computational model (e.g., gasification, if developing a gasification simulation tool; or combustion and gasification, if developing a combustion and gasification simulation tool). The scenario parameters for the experiment being explored as active parameters in the validation (in this case,
Twall,
dp, and
ṁcoal) should have an uncertainty range that either overlaps with, or ideally, is identical to, those of the prior experiments being used for validation. If only a subset of these parameters are held constant, it is still possible to construct a prediction interval, but only for
γ conditioned on particular values of those scenario parameters in the same range. Aside from these scenario parameters, other experimental parameters do not need to remain fixed (with the caveat that changing the experimental scenario too much may change the dominant physical mechanisms, such that variables with significant effects for one scenario may not have significant effects for another). Once these alternative experiments are run, a set of experimental data
de is gathered, with which a lower and upper bound,
le and
ue, are constructed. Validation simulations are then run for this new experimental regime, and a response surface constructed, as described in Chapter 5. Exploring this response surface using a Monte Carlo analysis provides probabilities for
γ(de, x) > 0. Once these probabilities have been obtained, the prediction interval given above by equation
(7.40↑) can be constructed. This provides a level of confidence, given the model’s ability to make valid predictions in two experiments, that it will make valid predictions in new experiments in a particular region of parameter space.
6.9.3 Coal Gasification Prediction Interval
The procedure for constructing a prediction interval is demonstrated using simulations of a similar gasification experiment. An additional gasification experiment was run by Soelberg, with data and operating conditions reported by Soelberg
[161] and Rasband
[139]. This experiment was run at conditions similar to those of Brown; Soelberg reported a solids flowrate of
ṁcoal = 0.0066 kg/s, comparable to the 0.0062 kg/s of the Brown experiment.
Twall was also in an identical range. The mass mean particle size reported by Soelberg was
42 μm, slightly outside of the region of parameter space explored with the Brown validation simulations (see Table
4.1↑). This results in an overlap in the parameter interval
[0.3, 1.0]. Validation simulations of the Soelberg system were run as part of prior validation studies exploring the parameters
E2 and
ṁcoal. These simulation results were used in combination with the Brown gasification simulation results to construct a prediction interval.
Soelberg reported radial profile data for
r = 0 cm,
r = 2 cm,
r = 4 cm,
r = 6 cm, and
r = 8 cm, and radial profiles given at
x = 20 cm,
x = 34 cm,
x = 51 cm,
x = 81 cm, and
x = 112 cm, although a few data points are missing (e.g., no data are given at
x = 20 cm and
r = 0 or 2 cm). The radial profiles at
x = 20 cm and
x = 34 cm from the Soelberg gasifier simulations were used in combination with radial profiles obtained from the Brown gasifier simulations at
x = 21 cm and
x = 36 cm. The Soelberg gasification validation simulations were used to construct response surfaces for each response; all response surfaces were first order, due to the fact that only four Soelberg simulations were run (a
2² full factorial experimental design). The conditions for each of the Soelberg validation simulations are reported in Table
6.4↓. (While the simulation code base changed between the time of the Soelberg and Brown simulations, the process of constructing a prediction interval can still be demonstrated. Changes in the code base may be accounted as bias in the response surface model.)
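Because only four runs are available, each Soelberg response surface is first order in the two factors. A minimal least-squares sketch of such a fit is shown below; the factor levels follow the 2² design of Table 6.4 (scaled to [0, 1]), while the response values are placeholders rather than the actual simulation results.
\begin{verbatim}
import numpy as np

# 2^2 factorial design of Table 6.4, scaled to [0, 1]:
# E2 in {1.0e8, 3.0e8} J/kmol and mdot_coal in {21.4, 26.1} kg/hr.
X_design = np.array([[0.0, 0.0],   # case A
                     [1.0, 0.0],   # case B
                     [0.0, 1.0],   # case C
                     [1.0, 1.0]])  # case D

# Placeholder responses (e.g., CO mole fraction at one probe location);
# the actual values come from the Soelberg validation simulations.
y = np.array([0.30, 0.24, 0.33, 0.27])

# First-order model y = b0 + b1*x1 + b2*x2, fit by linear least squares.
A = np.column_stack([np.ones(4), X_design])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print("b0, b1, b2 =", coeffs)
\end{verbatim}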
Using these response surfaces, a Monte Carlo analysis was performed, analogous to the analysis shown above. This was used to find probabilities for γ(de, x) > 0 for the Soelberg experiment and the corresponding simulations. A prediction interval could only be constructed for the E2 and ṁcoal parameter space, due to the fact that multiple error bars, and therefore multiple consistency measures γ, were only obtained as a function of those parameters.
Table 6.4 Conditions for the four Soelberg validation simulations used to construct the prediction interval.
Case | E2 [J/kmol] | ṁcoal [kg/hr]
A | 1.0 × 10⁸ | 21.4
B | 3.0 × 10⁸ | 21.4
C | 1.0 × 10⁸ | 26.1
D | 3.0 × 10⁸ | 26.1
Figures
6.35↓ through
6.47↓ show the consistency probability function with respect to the high temperature devolatilization reaction activation energy model parameter
E2 for both the Soelberg and Brown response surfaces. These probabilities were computed using the Monte Carlo analysis procedure and were then used to construct a prediction interval for the probability of consistency, which is also shown in the figures. The prediction intervals are 95% prediction intervals, meaning that, based on past observations of the probability of consistency (that is, the probability of consistency observed in the Soelberg and Brown gasification simulations), 95% of the predicted probabilities of consistency will fall between the lower and upper prediction interval bounds (both bounds are plotted). The behavior of the prediction interval varies, but it depends strongly on the behavior of the probability of consistency of both simulations. One trend that clearly emerges is that a lower
E2 leads consistently to a prediction of higher consistency.
\clearpage
\clearpage
Similarly, Figures
6.49↓ through
6.62↓ show the consistency probability function with respect to the coal mass flowrate
ṁcoal. The conclusions for
ṁcoal also indicate that a higher mass flowrate of coal leads to a higher probability of consistency, although the prediction intervals are large and therefore more uncertain, and depend on the species response being considered.
It was mentioned that code development had occurred between the Soelberg and Brown simulations being run. It is very encouraging to see a much greater probability of consistency in the Brown simulations, run with the improved code, than in the Soelberg simulations. Additional simulations of the Soelberg gasifier using the improved Arches coal gasification tool would likely yield much better prediction intervals, in addition to the improvements in the Soelberg response surfaces that would result from additional simulations. However, the construction of a prediction interval presents a significant step forward: it determines not just a valid region of parameter space, but a level of confidence in predictions made from regions of parameter space.
An additional concept of importance is the application of prediction intervals to multiscale simulations. While the prediction intervals presented above were constructed using simulations of very similar systems, it is also possible, with some care, to construct a prediction interval in regions of parameter space using simulations of systems at different scales. For example, a single-particle drop tube experiment could be investigated using the Arches simulation tool, and response surfaces generated that are functions of model parameters shared by simulations at different scales (for example, the devolatilization activation energy E2). This would provide valuable insight into the validity of various submodels, such as devolatilization, across scales. Considering that previous studies have found devolatilization models to be of chief importance in simulating gasification systems, this would be a valuable next step.
\clearpage
\clearpage
6.10 Conclusions
The chapter began with a presentation of concepts used in validation of computational models taken from the literature, specifically the Data Collaboration method of Frenklach et al.
[53, 52, 147, 146, 174]. It was shown how these concepts align precisely with the instrumentalist philosophy of validation adopted throughout this work. These concepts were then applied to validating the Arches coal gasification tool.
First, the Data Collaboration toolbox was applied. The approach used by this toolbox was described in Section
6.4↑ as quadratically constrained quadratic programming (optimization), which uses piecewise quadratic functions to represent the underlying functions and determine their global minima and maxima. The toolbox yielded the feasible set of parameter values, and left and right bounds on the model predictions using parameters from this feasible set. It was found that there was no feasible set that satisfied the experimental uncertainty bounds for all data points (30 spatial locations and three species). For this reason, the data were fragmented and grouped by species. The Data Collaboration toolbox found a feasible set for all 30 measurements of
CO2, but no feasible set was found for all CO or
H2 measurements. The data were further fragmented into radial profiles, and a feasible set was found for all but one CO radial profile, while a feasible set was only found for a single
H2 radial profile. It was determined that
E2 had by far the largest effect on the model’s consistency with the data, nearly to the point of exclusion of other variables. The effect of
E2 in the gasifier was very clearly dominant, both in the sensitivity and factorial studies performed in Chapter
5↑ and in the validation consistency analysis, meaning it is significant both to the mathematical model and to the accuracy of the mathematical model’s predictions. Unfortunately, it was awkward to explore the consistencies computed by the toolbox, and a clear path forward was lacking due to the lack of global (or even, in the case of
H2, local) consistency. Perhaps the safest conclusion to draw from the Data Collaboration results is that the devolatilization model is of principal importance in the gasifier simulation, and that future modeling efforts should focus on it.
To explore the response surfaces further, they were sampled using a Monte Carlo sampling method. While a dense Monte Carlo sampling would be impossible for many of the problems to which the Data Collaboration toolbox has been applied, such as the GRI Mech model, which has hundreds of parameters, it is computationally easy with a small number of parameters. The Brown coal gasification response surfaces meet this criterion, as they have only four parameters. This analysis yielded very insightful results, and the visual representation of the valid and invalid parameter spaces was particularly illuminating. The Monte Carlo results were then used to construct consistency probability functions, defined as the probability that the consistency measure is positive, Pr{γ(x) > 0},
where
γ is defined in Section
6.2↑ as the amount by which the experimental uncertainty bounds may be shrunk while still bounding the model prediction. The inequality
γ > 0 provides the desired binary validation measure, discussed in Chapter
4↑. Thus this probability function is a measure of the probability of a model making a valid prediction.
This probability was then used in concert with simulations run on a second, similar gasifier, and these probability functions were combined to construct a prediction interval. This was defined as a region in which
100(1 − α)% of predictions of the probability of consistency would actually fall. These prediction intervals are presented in Figures
6.35↑ through
6.62↑. These figures provide an extremely valuable way, not just to make predictions using a set of parameter values, but to establish a level of confidence in said predictions. As mentioned above, it would be particularly interesting to re-run the Soelberg gasification simulations in order to obtain better predictions and construct an improved response surface, and search for regions of the
E2 × ṁcoal parameter space which have a prediction interval that predicts 100% validity.
7 CONCLUSIONS AND RECOMMENDATIONS
Little by little we subtract
Faith and fallacy from fact,
The illusory from the true
And starve upon the residue.
― Samuel Hoffenstein
7.1 Digest of Concepts and Conclusions
The following sections provide a digest of the most important concepts and conclusions from each of the preceding chapters.
7.1.1 Verification
One of the most important contributions made in the verification chapter was the elucidation of the concepts of numerical error and numerical uncertainty, and the role that both play in the validation procedure. Often, verification and validation are treated as activities connected in name only: verification is performed, the results of a grid convergence study are reported, and no further attention is paid to it. This detracts significantly from the big-picture effort; verification is the first step in validation. In addition to verifying that the theoretical order of convergence is achieved, it is also a validation litmus test: if the numerical uncertainty is so large that it overshadows, or is even approximately equal to, the experimental uncertainty, there is no point in proceeding with validation. The importance of verification as a validation litmus test was embodied in the concept of level of verification.
This is not an issue addressed in the literature: the results of the verification procedure very much dominate the choice of experimental data used to validate the computational model. Validation with an experimental data set implies a certain level of validation, but this level of validation must be larger than the level of verification; otherwise the validation results are meaningless, because the model’s ability or inability to match experimental data may be due entirely to numerical error.
7.1.2 Validation
Of all of the conclusions of Chapter
4↑, the chiefest is the concept of simulation as an extension of theory. While it is tempting to treat simulation results as surrogate experimental data and analyze them as such, it is dangerous to do so. Simulations are purely extensions of theory, capable of exploring in great detail the implications of the hypotheses and assumptions bundled into mathematical models; as such, they are extremely valuable tools. But they can never touch reality, and must always be treated using the approach of Box and Draper: “essentially, all models are wrong, but some are useful”
[20]. The purpose of validation is, essentially, to determine when models are useful.
Many validation approaches exist in the literature. Some of these approaches were summarized in Section
4.3↑. Several conclusions emerge from this review of validation approaches, with the most evident being experimental considerations. There is a significant gap between the goals of experimentalists and the goals of modelers, as discussed in the section covering validation experiments and traditional experiments (Section
4.4.2↑). If the field of simulation science is to advance forward in any significant way, these differences must be reconciled. There are several ways to do this, some of which tie into other conclusions from the validation literature review. First, the use of the internet to share and discuss detailed experimental results in order to bypass length and content limitations of scientific journals has great potential, and has been discussed for well over a decade
[108], but there is a strange deficiency of such databases, and motivation seems to be missing. Existence of such databases could transform many traditional experiments into validation experiments, without a significant change in the goals of the experimentalist gathering the data. Furthermore, discussion of the experiments could lead to much more accurate modeling of boundary conditions and other scenario parameters, which are often neglected in scientific journal articles for the sake of brevity.
The impetus to change some of these characteristics of the scientific community is unlikely to come from the community itself. Paul Davis, in a quote already presented in Section
4.4.2↑ but well worth repeating, said that validation experiments are “very important and [have] long been inadequately funded by any measure. By explicitly budgeting for ’serious’ VV&A, the Department of Defense would create incentives that do not now exist for model developers. Without such incentives, VV&A may improve only marginally, despite the suggestions and exhortations from this and other studies.” Funding agencies exercise an undue leverage over the directions that science takes. Assuming this leverage does not change, motivation to perform more validation experiments, or to provide databases of experimental results, or to make the process of experimental data analysis (or design of experimental campaigns) more of a collaborative effort achieving goals of both experimentalists and modelers, must come from these agencies.
Likewise, scientific journals, which also exercise undue leverage over the direction of scientific development, must also provide impetus for change through their policies. Such moves have been made in the past, such as the 1986 editorial statement in the Journal of Fluids Engineering on control and quantification of numerical error and numerical uncertainty. Similar consideration should be given to policies mandating quantification of experimental uncertainty, as well as disclosure of experimental results in database, rather than solely plot, format. This would provide a significant step forward for simulation validation, whilst simultaneously making the process of experimental measurements more open, and the quantification of experimental uncertainty more honest.
Another consideration is the so-called “open science” movement
[34]; this more democratic approach to science supplements or bypasses traditional forums in favor of a more transparent and open approach. While there is still some debate about the strengths and weaknesses of this approach, e.g. the peer review process of open-access journals, the process need not be perfected to be useful. Open scientific collaboration through the many media available today should be more widespread, and technology used as a tool, not an obstacle. (It is in some ways an embarrassment that new media have been adopted and used to greater ends by political organizations or journalists than by the scientific community, from which the new media originate.)
The adopted validation framework provides an excellent and robust way of consolidating the many approaches to validation and connecting them together; sometimes validation procedures in the literature provide opposing methods to accomplish the same task, but more often, they provide complementary techniques for dealing with different steps in the large and involved process of validation. As covered in the section addressing the need for a framework (Section
4.3.4↑), adoption of a validation framework is a very important part of creating a cohesive validation philosophy; being flexible enough to handle validation of both very cheap and very expensive computational models (and everything in-between) is an essential framework characteristic.
7.1.3 Surrogate Models
The primary conclusion with regard to surrogate models is that a good surrogate model is indispensable: surrogate models are the linchpin of the validation process for expensive, complex computational models. If the surrogate model is bad, the validation is bad. In fact, the level of error introduced by the surrogate model can be thought of as an additional level of verification; if the surrogate model cannot reproduce the behavior of the complex model, the validation is useless.
To construct a good surrogate model, and quantitatively judge the goodness of fit of said model, a thorough statistical analysis of the surrogate model was performed. A statistical analysis was performed at each step in the sequential design, and the analysis revealed underlying linear behavior in the responses of the highly nonlinear coal gasification model, a somewhat surprising result that provides a boost of hope for any modeler facing a daunting task. In order to verify that this was, in fact, a correct conclusion, several statistical tests, including a curvature check, a residuals test, an F-test for statistical significance, and an ANOVA analysis, were performed.
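To make the flavor of these checks concrete, the short sketch below (not the actual Arches analysis; the two coded factors, design points, and responses are invented for illustration) fits a first-order response surface by least squares and computes the overall F statistic and residuals of the kind discussed above.
\begin{verbatim}
# A minimal sketch (not the Arches analysis): fit a linear response surface
# y = b0 + b1*x1 + b2*x2 to illustrative data, then apply an overall F-test
# and a residual check of the kind described in the text.
import numpy as np
from scipy import stats

# Illustrative design points (coded units) and responses -- assumed values.
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1], [0, 0], [0, 0]])
y = np.array([0.42, 0.58, 0.47, 0.66, 0.53, 0.55])

A = np.column_stack([np.ones(len(y)), X])        # design matrix with intercept
beta, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta
residuals = y - y_hat

n, p = A.shape                                   # observations, parameters
ss_reg = np.sum((y_hat - y.mean()) ** 2)         # regression sum of squares
ss_err = np.sum(residuals ** 2)                  # error sum of squares
F = (ss_reg / (p - 1)) / (ss_err / (n - p))      # overall F statistic
p_value = 1.0 - stats.f.cdf(F, p - 1, n - p)

print("coefficients:", beta)
print("F = %.2f, p = %.3f" % (F, p_value))
print("max |residual| = %.3f" % np.abs(residuals).max())
\end{verbatim}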
These statistical analyses also revealed an underlying complication: the amount of data generated by each statistical metric was enormous, with 30 spatial locations and 3 species, for a total of 90 data points, and with 4-6 input variables (dimensions of parameter space). It was cumbersome to digest all of the results of each statistical test. However, this is a critical step: the surrogate model distills the dozens of terabytes and thousands or millions of CPU hours worth of Arches computations into its most essential characteristics; as such, the modeler must make absolutely sure that the surrogate model provides a faithful representation of the Arches computations (that the “level of verification of the surrogate model” is lower than the level of validation), else the computations are all for naught.
Another conclusion from the surrogate models chapter was in regard to the exploration of parameter space. There is a push and pull when selecting the ranges of each parameter to explore: the desire to push the bounds out further and explore larger ranges, based on experience (large ranges for prior distributions of input parameters, particularly model parameters); and the desire to pull the bounds narrower and explore smaller ranges, because surrogate model quality deteriorates rapidly as the ranges grow. Surrogate models rest on assumptions about the response (that it is smooth, that it does not vary sharply, etc.), and these assumptions become increasingly questionable as the parameter ranges widen.
There is a way to address this problem, proposed in Section 5.3.1, which originates from the observation that wider parameter ranges are not inherently a bad idea; they simply increase the number of samples that must be gathered. Cheaper, reduced-dimensional physical models (e.g., RANS, one-dimensional turbulence, ideal reactor network models) should be used to explore wide ranges of parameter space with space-filling designs to reveal the interesting regions of parameter space, while also shedding light on the functional form of the response. This would help to provide better input parameter ranges to the much more expensive physical model (Arches) and provide justification for selecting a particular functional form for the surrogate model, rather than assuming that polynomials will work.
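A minimal sketch of this screening idea follows, assuming a hypothetical cheap stand-in model (cheap_model) and invented parameter ranges; the point is only to illustrate how a space-filling (Latin hypercube) design over wide ranges can suggest narrowed ranges for the expensive simulator.
\begin{verbatim}
# A minimal sketch of the screening idea: sample wide parameter ranges with a
# space-filling (Latin hypercube) design and evaluate a *cheap* stand-in model
# to locate interesting regions before committing expensive Arches runs.
# The cheap_model function and the ranges are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)

def latin_hypercube(n_samples, n_dims):
    """One point per equal-probability stratum in each dimension, shuffled."""
    cut = (np.arange(n_samples) + rng.random((n_dims, n_samples))) / n_samples
    for row in cut:
        rng.shuffle(row)
    return cut.T                                  # shape (n_samples, n_dims)

def cheap_model(x):
    """Hypothetical low-cost stand-in for a reduced-order physical model."""
    return np.exp(-x[:, 0]) * (1.0 + 0.5 * x[:, 1])

lower = np.array([0.1, 0.0])                      # wide, assumed ranges
upper = np.array([5.0, 2.0])
unit = latin_hypercube(200, 2)
samples = lower + unit * (upper - lower)
response = cheap_model(samples)

# Keep only the region where the cheap model predicts "interesting" behavior,
# and pass the narrowed ranges on to the expensive simulator.
mask = response > 0.5
print("suggested narrowed ranges:",
      samples[mask].min(axis=0), samples[mask].max(axis=0))
\end{verbatim}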
7.1.4 Validation Results Analysis
One of the important conclusions of Chapter 6 was that the Data Collaboration approach to validation provided metrics that fit the instrumentalist philosophy of validation very well. However, the “black box” Data Collaboration toolbox hindered interpretation of some of these validation metrics. In order to achieve consistency among data, fragmentation had to occur, and even when the data was fragmented, only some species or some spatial locations had a feasible set that made them consistent. The interpretation of these fragmented and disparate feasible sets was muddled by a lack of experience with the toolbox's algorithms and the resulting lack of transparency. The importance of this step in the validation process led to the need for a more open and easily understood process of validation results analysis. A Monte Carlo analysis of the simple and low-dimensional response surfaces was much easier to visualize and understand, and led to more concrete conclusions about the impact of variables, or combinations of variables, on whether a code made consistent predictions. It also led to the conclusion that the feasible set should not be treated as a crisp set but as a fuzzy set, and it provided a probabilistic way of looking at the Data Collaboration metrics. It is recommended that validation results analyses utilize the Monte Carlo approach, when computationally feasible, to supplement the Data Collaboration results analysis.
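The following sketch illustrates the Monte Carlo analysis in its simplest form, using a hypothetical fitted response surface and assumed experimental bounds rather than the thesis's surrogate and data; it estimates the overall probability of consistency and conditional probabilities over regions of parameter space.
\begin{verbatim}
# A minimal sketch of the Monte Carlo analysis described above: sample the
# input parameter space, evaluate a (hypothetical) response surface, and flag
# each sample as consistent if the surrogate prediction falls within the
# experimental uncertainty bounds.  Surface coefficients and bounds are
# illustrative assumptions, not the thesis values.
import numpy as np

rng = np.random.default_rng(1)

def response_surface(x1, x2):
    """Hypothetical fitted surrogate for one measured quantity."""
    return 0.30 + 0.10 * x1 - 0.05 * x2 + 0.02 * x1 * x2

lo, hi = 0.22, 0.38          # assumed experimental bounds (data +/- uncertainty)
n = 100_000
x1 = rng.uniform(-1.0, 1.0, n)   # coded parameter ranges
x2 = rng.uniform(-1.0, 1.0, n)

pred = response_surface(x1, x2)
consistent = (pred >= lo) & (pred <= hi)

print("overall probability of consistency: %.3f" % consistent.mean())

# Conditional probabilities reveal which regions of parameter space are valid,
# i.e. the "fuzzy" feasible set discussed in the text.
for lab, mask in [("x1 < 0", x1 < 0), ("x1 >= 0", x1 >= 0)]:
    print("P(consistent | %s) = %.3f" % (lab, consistent[mask].mean()))
\end{verbatim}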
A prediction interval was constructed for prediction of the probability of consistency. This led to a more realistic approach to predictions than the Data Collaboration method, which presumes that any prediction made using a parameter combination from the feasible set is valid. The prediction interval quantifies the level of confidence in the prediction. One of the chief recommendations from Chapter 6 that could be implemented in a short amount of time is to run the Soelberg gasification cases using an updated version of the Arches code, perform a full statistical design and analysis of the surrogate model, and repeat the prediction interval construction. Another fruitful area of research would be to recast the construction of the prediction interval, which utilized only elements of frequentist approaches, in terms of Bayesian inference, drawing on the large body of existing work in that area, in order to improve the prediction interval.
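As a rough illustration of the frequentist ingredient of the prediction interval, the sketch below places a normal-approximation confidence interval around an estimated probability of consistency; the sample counts are assumed, and a Bayesian treatment (as recommended above) would replace this interval with a posterior credible interval.
\begin{verbatim}
# A minimal sketch of the frequentist flavor of the prediction-interval idea:
# treat the Monte Carlo consistency indicator as a Bernoulli sample and put a
# normal-approximation confidence interval around the estimated probability of
# a valid prediction.  The counts below are assumed, not thesis results.
import math

n_samples = 5000          # Monte Carlo samples drawn from the region of interest
n_consistent = 4150       # samples whose surrogate prediction matched the data

p_hat = n_consistent / n_samples
z = 1.96                  # ~95% two-sided normal quantile
half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_samples)

print("estimated probability of a valid prediction: %.3f" % p_hat)
print("approximate 95%% interval: (%.3f, %.3f)"
      % (p_hat - half_width, p_hat + half_width))
\end{verbatim}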
7.2 The List
As the tedious old chatterbox Polonius says,
“...to expostulate
What majesty should be, what duty is,
What day is day, night night, and time is time,
Were nothing but to waste night, day, and time;
Therefore, since brevity is the soul of wit,
And tediousness the limbs and outward flourishes,
I will be brief.” (Hamlet, Act 2, Scene ii, 89-92)
The following are recommended for future work:
-
Careful use of terminology, such as “uncertainty” and “error,” so that it follows common technical usage (as opposed to a dictionary definition), and, where a technical definition is only implied or is missing from the literature, provision of an explicit definition;
-
Treatment of simulation, not as a branch of science independent of theory, but as a tool that greatly extends the capability of theory;
-
For scientific journal editors, boards, and peer reviewers, adoption of clearer policies toward the reporting of experimental results and the provision of results in database format, not just as plots, to ease the use of traditional experimental results in model validation;
-
Increased collaboration among scientists, outside of the “X pages or less” forum of scientific journals;
-
A detailed and thorough statistical analysis of surrogate models, rather than a “TV dinner” approach (Section 5.3.2), due to the magnitude of their importance;
-
Use of low-dimensional physical models to explore parameter space with space-filling designs, determine optimal functional forms for surrogate models, and provide narrower input parameter ranges to explore with expensive physical models;
-
Supplementation of a validation analysis using the Data Collaboration toolbox with a validation analysis using a Monte Carlo approach;
-
Use of the prediction interval or similar method for establishing a level of belief in model predictions; and
-
Application of additional probabilistic ideas and concepts (e.g., Bayesian inference or fuzzy sets) to the validation analysis process and construction of prediction belief level.
References
[1] Annual Energy Review 2008. 2009.
[2] CFD Validation Philosophy. 1988.
[3] Carbon Dioxide Emissions from the Generation of Electric Power in the United States. 2000.
[4] Coal Science and Technology (Series). Elsevier Science Publishers, 1986.
[5] International Energy Outlook 2009. 2009.
[6] Parallel Processing for Scientific Computing. SIAM, 2005.
[7] Systems, approximation, singular integral operators, and related topics: International Workshop on Operator Theory and Applications, IWOTA 2000. 2000.
[8] D B Anthony, J B Howard, H C Hottel, H P Meissner. Rapid Devolatilization of Pulverized Coal. Fifth Symposium (International) on Combustion, 5:1303-1317, 1975.
[9] Armen Der Kiureghian, Ove Ditlevsen. Aleatory or epistemic? Does it matter?. Structural Safety, 31:105-112, 2009.
[10] Arnold Neumaier. Interval methods for systems of equations. Cambridge University Press, 1990.
[11] Alfred J. Ayer. Language, Truth, and Logic. Penguin, 1936, 1946.
[12] B. Boehm. Software Engineering Economics. Prentice-Hall, 1981.
[13] Stanley Badzioch, Peter G. W. Hawksley. Kinetics of Thermal Decomposition of Pulverized Coal Particles. Ind. Eng. Chem. Process Des. Develop., 9(4), 1970.
[14] S. Balachandar, John W. Eaton. Turbulent dispersed multiphase flow. Annual Review of Fluid Mechanics, 42:111-133, 2010.
[15] Hans-Walter Bandemer. Mathematics of Uncertainty. Springer, 2006.
[16] Jacob Bernoulli. Ars conjectandi: opus posthumum; accedit Tractatus de seriebus infinitis; et Epistola gallice scripta de ludo pilae reticularis. Impensis Thurnisiorum, 1713.
[17] George Box, J. Stuart Hunter, William G. Hunter. Statistics for Experimenters: Design, Innovation, and Discovery. John Wiley and Sons, 2005.
[18] George Box, K. B. Wilson. On the Experimental Attainment of Optimum Conditions. Journal of the Royal Statistical Society, Series B, 13(1):1-45, 1951.
[19] George Box, Norman Draper. A Basis for the Selection of a Response Surface Design. Journal of the Royal Statistical Society, Series B, 54(287):622-654, 1959.
[20] George Box, Norman Draper. Empirical Model-Building and Response Surfaces. John Wiley and Sons, 1987.
[21] B. Scott Brewster, Larry L. Baxter, L. Douglas Smoot. Treatment of Coal Devolatilization in Comprehensive Combustion Modeling. Energy and Fuels, 2:362-370, 1988.
[22] Blaine W. Brown. Effect of Coal Type on Entrained Gasification. 1985.
[23] Blaine Brown, L. Douglas Smoot, Paul O. Hedman. Effect of coal type on entrained gasification. Fuel, 65:673-678, 1986.
[24] C. de Boor. A Practical Guide to Splines. Springer-Verlag, 1978.
[25] C.-C. Rossow, N. Kroll. Numerical Simulation - Complementing Theory and Experiment as the Third Pillar in Aerodynamics. Hermann Schlichting—100 years: scientific colloquium celebrating the anniversary of his birthday, Braunschweig, Germany 2007:39-58, 2009.
[26] Carla Currin, Toby Mitchell, Max Morris, Don Ylvisaker. Bayesian Prediction of Deterministic Functions, with Applications to the Design and Analysis of Computer Experiments. Journal of the American Statistical Association, 86(416):953-963, 1991.
[27] Charles Hermann. Validation problems in games and simulations with special reference to models of international politics. Behavioral Science, 12(3):216-231, 1967.
[28] W. G. Cochran. The distribution of quadratic forms in a normal system, with applications to the analysis of covariance. Mathematical Proceedings of the Cambridge Philosophical Society, 30(2):178-191, 1934.
[29] H. W. Coleman, F. Stern. Uncertainties and CFD code validation. Journal of Fluids Engineering, 119(4):795-803, 1997.
[30] H. W. Coleman, F. Stern. Uncertainties and CFD code validation: Discussion. Journal of Fluids Engineering, 120:635-636, 1997.
[31] Hugh W. Coleman, W. Glenn Steele. Experimentation, Validation, and Uncertainty Analysis for Engineers. John Wiley and Sons, 2009.
[32] Constantino M. Lagoa, B. R. Barmish. Distributionally Robust Monte Carlo Simulation: A Tutorial Survey. 2002.
[33] D. P. Aeschliman, William Oberkampf, F. G. Blottner. A proposed methodology for computational fluid dynamics code verification, calibration, and validation. 1995.
[34] David Dobbs. Free Science, One Paper at a Time. 2011.
[35] David Moens, Dirk Vandepitte. A survey of non-probabilistic uncertainty treatment in finite element analysis. Comput. Methods Appl. Mech. Engrg., 194:1527-1555, 2005.
[36] Dennis Lindley. The probability approach to the treatment of uncertainty in artificial intelligence and expert systems. Statistical Science, 2(1):17-24, 1987.
[37] Sir Arthur Conan Doyle. The Hound of the Baskervilles. Aladdin, 1902, 2000.
[38] L. Eca, M. Hoekstra. Evaluation of numerical error estimation based on grid refinement studies with the method of manufactured solutions. Computers and Fluids, 38:1580-1591, 2009.
[39] Laurent El Ghaoui, Giuseppe Calafiore. Worst-case simulation of uncertain systems. In Robustness in identification and control (Garulli, A. and Tesi, A., ed.). Springer Berlin, 1999.
[40] Eric Winsberg. Simulated Experiments: Methodology for a Virtual World. Philosophy of Science, 70(1):105-125, 2003.
[41] Eric Winsberg. Simulation and the Philosophy of Science: Computationally Intensive Studies of Complex Physical Systems. 1999.
[42] R. H. Essenhigh, R Froberg, J B Howard. Combustion behavior of small particles. Industrial and Engineering Chemistry, 57(9):32-43, 1965.
[43] William Fain, Janice Fain. Validation of combat models against historical data. 1970.
[44] Liang Shih Fan, Chao Zhu. Principles of Gas-Solid Flows. Cambridge University Press, 1998.
[45] Kai-Tai Fang, Runze Li, Agus Sudjianto. Design and Modeling for Computer Experiments. Chapman and Hall-CRC, 2006.
[46] Alexandre Favre. Turbulence: space-time statistical properties and behavior in supersonic flows. Physics of Fluids, 26(10):2851-2863, 1983.
[47] Scott Ferson, Cliff A. Joslyn, Jon C. Helton, William L. Oberkampf, Kari Sentz. Summary from the epistemic uncertainty workshop: consensus amid diversity. Reliability Engineering and System Safety, 85:355-369, 2004.
[48] Joel Ferziger, Milovan Peric. Further discussion of numerical error in CFD. International Journal for Numerical Methods in Fluids, 23:1263-1274, 1996.
[49] Rodney O. Fox. Computational Models for Turbulent Reacting Flows. Cambridge University Press, 2003.
[50] Francois Hemez, Scott Doebling. Model validation and uncertainty quantification. 2001.
[51] Allan Franklin. The Neglect of Experiment. Cambridge University Press, 1986.
[52] Michael Frenklach, Andrew Packard, Pete Seiler, Ryan Feeley. Collaborative data processing in developing predictive models of complex reaction systems. International Journal of Chemical Kinetics, 36(1):57-66, 2004.
[53] Michael Frenklach, Andrew Packard, Pete Seiler. Prediction uncertainty from models and data. 2002.
[54] Milton Friedman. Essays in Positive Economics. University of Chicago Press, 1953.
[55] G. E. P. Box, D. W. Behnken. Some new three-level designs for the study of quantitative variables. Technometrics, 2(4):455-475, 1960.
[56] Gary Balas, Richard Chiang, Andy Packard, Michael Safonov. Robust Control Toolbox: User's Guide. 2006.
[57] Andrew Gelman, John B. Carlin, Hal S. Stern, Donald B. Rubin. Bayesian Data Analysis. Chapman and Hall-CRC, 2004.
[58] George Em Karniadakis. Quantifying uncertainty in CFD. Journal of Fluids Engineering, 124:2-3, 2002.
[59] George Em Karniadakis. Toward a Numerical Error Bar in CFD. Journal of Fluids Engineering, 117(7):7-9, 1995.
[60] George S. Kimeldorf, Grace Wahba. A Correspondence Between Bayesian Estimation on Stochastic Processes and Smoothing by Splines. The Annals of Mathematical Statistics, 41(2):495-502, 1970.
[61] Roy G. Gordon. Error Bounds in Equilibrium Statistical Mechanics. Journal of Mathematical Physics, 9(5):655-663, 1968.
[62] David E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Professional, 1989.
[63] Kurt Gödel. On Formally Undecidable Propositions of Principia Mathematica and Related Systems. Dover Publications, 1931, 1992.
[64] David M. Grant, Ronald J. Pugmire, Thomas H. Fletcher, Alan R. Kerstein. Chemical Model of Coal Devolatilization Using Percolation Lattice Statistics. Energy and Fuels, 3:175-186, 1989.
[65] J. C. Helton, F. J. Davis, J. D. Johnson. A comparison of uncertainty and sensitivity analysis results obtained with random and Latin hypercube sampling. Reliability Engineering and System Safety, 89:305-330, 2005.
[66] J. C. Helton, F. J. Davis. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliability Engineering and System Safety, 81:23-69, 2003.
[67] J. C. Helton, W. L. Oberkampf. Alternative representations of epistemic uncertainty. Reliability Engineering and System Safety, 85:1-10, 2004.
[68] Michael Heroux, James Willenbring. Barely sufficient software engineering: 10 practices to improve your CSE software. 2009.
[69] Scott Hill, L. Douglas Smoot. A Comprehensive Three-Dimensional Model for Simulation of Combustion Systems: PCGC-3. Energy and Fuels, 7:874-883, 1993.
[70] David Hume. An Enquiry Concerning Human Understanding. P. F. Collier and Son, 1748, 1910.
[71] Immanuel Kant, Paul Guyer (trans.), Allen Wood (trans.). Critique of Pure Reason. Cambridge University Press, 1781, 1999.
[72] J. A. Nelder, R. W. M. Wedderburn. Generalized linear models. Journal of the Royal Statistical Society, Series A, 135(3):370-384, 1972.
[73] J. C. Helton, J. D. Johnson, C. J. Sallaberry, C. B. Storlie. Survey of sampling-based methods for uncertainty and sensitivity analysis. Reliability Engineering and System Safety, 91:1175—1209, 2006.
[74] J. C. Helton, J. D. Johnson, W. L. Oberkampf. An exploration of alternative approaches to the representation of uncertainty in model predictions. Reliability Engineering and System Safety, 85:39-71, 2004.
[75] J. L. Doob. Stochastic Processes. John Wiley and Sons, 1953.
[76] Jack Kleijnen, Robert Sargent. A methodology for fitting and validating metamodels in simulation. European Journal of Operations Research, 120:14-29, 2000.
[77] Jack Kleijnen, Susan Sanchez, Thomas Lucas, Thomas Cioppa. A User's Guide to the Brave New World of Designing Simulation Experiments. INFORMS Journal on Computing, 17(3):263-289, 2005.
[78] Jack Kleijnen. Design and Analysis of Simulation Experiments. Springer, 2008.
[79] Jack Kleijnen. Statistical validation of simulation models. European Journal of Operations Research, 87:21-34, 1995.
[80] Jack Kleijnen. Validation of trace-driven simulation models: regression analysis revisited. 1996.
[81] Jack Kleijnen. Verification and validation of simulation models. European Journal of Operations Research, 82:145-162, 1995.
[82] James Hodges, James Dewar. Is it you or your model talking? A framework for model validation. 1992.
[83] James S. Hodges. Six (or so) things you can do with a bad model. Operations Research, 39(3):355-365, 1991.
[84] E. T. Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, 2003.
[85] Jerome Sacks, Susannah B. Schiller, William Welch. Design for Computer Experiments. Technometrics, 31(1):41-47, 1989.
[86] Jerome Sacks, William Welch, Toby J. Mitchell, Henry Wynn. Design and Analysis of Computer Experiments. Statistical Science, 4(4):409-423, 1989.
[87] Karl Popper. Conjectures and Refutations: The Growth of Scientific Knowledge. Routledge, 1963, 2002.
[88] George E. Karniadakis, Robert M. Kirby. Parallel Scientific Computing in C++ and MPI. Cambridge University Press, 2003.
[89] Kenneth Nichols. Nitrogen pollutant formation in a high pressure entrained coal gasifier. 1987.
[90] Kevin Whitty. Investigation of fuel chemistry and bed performance in a fluidized bed black liquor steam reformer. 2007.
[91] G. J. Klir. On the alleged superiority of probabilistic representation of uncertainty. 1994.
[92] Patrick Knupp, Kambiz Salari. Verification of Computer Codes in Computational Science and Engineering. Chapman and Hall-CRC, 2003.
[93] H. Kobayashi, J.B. Howard, Adel F. Sarofim. Coal Devolatilization at High Temperatures. Sixteenth Symposium (International) on Combustion:411-425, 1976.
[94] A. N. Kolmogorov. The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers. Proceedings: Mathematical and Physical Sciences, 434(1890):9-13, 1990.
[95] F Kreith. Principles of heat transfer. Intext Educational Publishers, 1973.
[96] Thomas Kuhn. The Structure of Scientific Revolutions. University of Chicago Press, 1962, 1970, 1996.
[97] D. Kunii, O. Levenspiel. Bubbling Bed Model: Model for Flow of Gas through a Fluidized Bed. Fluidization Engineering, Wiley, New York, 1968.
[98] Larry Schumaker. Spline Functions: Basic Theory. Cambridge University Press, 2007.
[99] D K Leavitt. Effects of Coal Dust and Secondary Swirl on Gas and Particle Mixing Rates in Confined Coaxial Jets. 1980.
[100] Jarrett Leplin. A Novel Defense of Scientific Realism. Oxford University Press, 1997.
[101] R. W. Logan, C. K. Nitta. Comparing 10 Methods for Solution Verification, and Linking to Model Validation. 2005.
[102] D. Lucor, D. Xiu, C. H. Su, George E. Karniadakis. Predictability and uncertainty in CFD. International Journal for Numerical Methods in Fluids, 43:483-505, 2003.
[103] Ludwig Fahrmeir, Gerhard Tutz. Multivariate Statistical Modelling Based on Generalized Linear Models. Springer, 1994.
[104] M. J. Bayarri, James O. Berger, David Higdon, Marc C. Kennedy, A. Kottas, Rui Paulo, Jerome Sacks, James A. Cafeo, James C. Cavendish, C. H. Lin, J. Tui. A framework for validation of computer models. 2002.
[105] Ernst Mach. Popular Scientific Lectures. Cornell University Library, 2009.
[106] Ernst Mach. The Science of Mechanics: A Critical and Historical Account of its Development. The Open Court Publishing Co., 1893.
[107] Daniele L. Marchisio, R. O. Fox. Solution of population balance equations using the direct quadrature method of moments. Aerosol Science, 36:43-73, 2005.
[108] J. G. Marvin. Perspective on computational fluid dynamics validation. AIAA Journal, 33(10):1778-1787, 1995.
[109] Mary McWherter Walker, William L. Oberkampf. Joint computational/experimental aerodynamics research on a hypersonic vehicle, part 2: computational results. AIAA Journal, 30(8):2010-2016, 1992.
[110] Max D. Morris, Toby J. Mitchell, Don Ylvisaker. Bayesian Design and Analysis of Computer Experiments: Use of Derivatives in Surface Prediction. Technometrics, 35(3):243-255, 1993.
[111] D. Merrick. Mathematical Model of the Thermal Decomposition of Coal. 2. Specific Heats and Heats of Reaction. Fuel, 62:540-6, 1983.
[112] Michael Eldred, Anthony A Giunta, Bart G. van Bloemen Waanders, Jr. Steven F. Wojkiewicz, William E. Hart, Mario P. Alleva. DAKOTA, a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis: Version 3.0 developers manual. 2002.
[113] Ramon E. Moore, R. Baker Kearfott, Michael J. Cloud. Introduction to Interval Analysis. SIAM, 2009.
[114] Morris Kline. Mathematics: The Loss of Certainty. Oxford University Press, 1982.
[115] Greg F. Naterer. Heat Transfer in Single and Multiphase Systems. CRC Press, 2003.
[116] M A Nettleton. Burning rates of devolatilized coal particles. Industrial and Engineering Chemistry Fundamentals, 6(1):20-25, 1967.
[117] Kenneth M. Nichols, Paul O. Hedman, L. Douglas Smoot, Angus U. Blackham. Fate of coal-sulphur in a laboratory-scale coal gasifier. Fuel, 68:242-247, 1989.
[118] Kenneth M. Nichols, Paul O. Hedman, L. Douglas Smoot. Release and reaction of fuel-nitrogen in a high-pressure entrained-coal gasifier. Fuel, 66, 1987.
[119] Norman Draper, Harry Smith. Applied Regression Analysis. John Wiley and Sons, 1998.
[120] William Oberkampf, F. G. Blottner, D. P. Aeschliman. Methodology for computational fluid dynamics code verification/validation. 1995.
[121] William Oberkampf. A proposed framework for computational fluid dynamics code calibration/validation. 1994.
[122] William Oberkampf. What are validation experiments?. Experimental Techniques:35-40, 2001.
[123] Ognyan Kounchev. Multivariate Polysplines: Applications to Numerical and Wavelet Analysis. Academic Press, 2001.
[124] Naomi Oreskes, Kristin Shrader-Frechette, Kenneth Belitz. Verification, validation, and confirmation of numerical models in earth sciences. Science, 263(5147):641-646, 1994.
[125] P. J. Roache. Quantification of uncertainty in computational fluid dynamics. Annual Review of Fluid Mechanics, 29:123-160, 1997.
[126] Dale K. Pace. Naval Modeling and Simulation Verification, Validation, and Accreditation. 1993.
[127] Paul Davis. Generalizing concepts and methods of verification, validation, and accreditation (VVA) for military simulations. 1992.
[128] Paul Humphreys. Computational Models. Philosophy of Science, 69(3):1-11, 2002.
[129] Wolfgang Pauli. Probability and Physics. Dialectica, 8:112-124, 1954.
[130] Pavel Shevchenko. Bayesian Operational Risk Using Bayesian Inference. Springer, 2011.
[131] Peter Congdon. Bayesian Models for Categorical Data. John Wiley and Sons, 2005.
[132] Philip R. Bevington, D. Keith Robinson. Data Reduction and Error Analysis. McGraw-Hill, 2003.
[133] R. L. Plackett, J. P. Burman. The Design of Optimum Multifactorial Experiments. Biometrika, 33(4):305-325, 1946.
[134] Stephen Pope. Turbulent Flows. Cambridge University Press, 2000.
[135] Karl Popper. Quantum Theory and the Schism in Physics. Rowman and Littlefield, 1956, 1982.
[136] William H. Press, Saul A. Teukolsky. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, 1992.
[137] R. A. Bates, R. J. Buck, E. Riccomango, H. P. Wynn. Experimental Design and Observation for Large Systems. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58(1):77-94, 1996.
[138] Randall J. McDermott. Toward One-Dimensional Turbulence Subgrid Closure for Large-Eddy Simulation. 2005.
[139] Mary W. Rasband. PCGC-2 and "The Data Book": A concurrent analysis of data reliability and code performance. 1988.
[140] Patrick J. Roache. Perspective: validation - what does it mean?. Journal of Fluids Engineering, 131:0345031-0345034, 2009.
[141] Patrick J. Roache. Verification and Validation in Computational Science and Engineering. Hermosa Publishers, 1998.
[142] Robert L. Mason, Richard F. Gunst, James L. Hess. Statistical Design and Analysis of Experiments. Wiley-Interscience, 2003.
[143] Robert Wilson, Fred Stern, Hugh Coleman, Eric Paterson. Comprehensive approach to verification and validation of CFD simulations - part 2: application for RANS simulation of a cargo/container ship. Journal of Fluids Engineering, 123:803-810, 2001.
[144] Christopher J. Roy. Review of code and solution verification procedures for computational simulation. Journal of Computational Physics, 205:131-156, 2005.
[145] Bertrand Russell. Recent Work on the Principles of Mathematics. International Monthly, 4, 1901.
[146] Ryan Feeley, Michael Frenklach, Matt Onsum, Trent Russi, Adam Arkin, Andrew Packard. Model discrimination using data collaboration. J. Phys. Chem. A, 110:6803-6813, 2006.
[147] Ryan Feeley, Pete Seiler, Andrew Packard, Michael Frenklach. Consistency of a Reaction Dataset. J. Phys. Chem. A, 108:9573-9583, 2004.
[148] Ryan Patrick Feeley. Fighting the Curse of Dimensionality: A method for model validation and uncertainty propagation for complex simulation models. 2008.
[149] C. D. Gelatt S. Kirkpatrick, M. P. Vecchi. Optimization by Simulated Annealing. Science, 220(4598):671-680, 1983.
[150] Pierre Sagaut. Large Eddy Simulation for Incompressible Flows. Springer, 2006.
[151] Kambiz Salari, Patrick Knupp. Code verification by the method of manufactured solutions. 2000.
[152] Thomas Santner, Brian Williams, William Notz. The Design and Analysis of Computer Experiments. Springer-Verlag, 2003.
[153] Saul I. Gass. Decision-aiding models: validation, assessment, and related issues for policy analysis. Operations Research, 31(4):603-631, 1983.
[154] T. Simpson. A Letter to the Right Honourable George Earl of Macclesfield, President of the Royal Society, on the Advantage of Taking the Mean of a Number of Observations, in Practical Astronomy: By T. Simpson, F. R. S. Philosophical Transactions of the Royal Society (1683-1775), 49:82-83, 1755.
[155] I W Smith. The combustion rates of coal chars: a review. Nineteenth Symposium (International) on Combustion, 19, 1982.
[156] Joseph D. Smith. A detailed evaluation of comprehensive simulation software describing pulverized-coal combustion and gasification using advanced sensitivity analyses and techniques. 1990.
[157] Philip J. Smith, Thomas H. Fletcher, L. Douglas Smoot. Prediction and measurement of entrained flow coal gasification processes. Volume II. User's Manual for a computer program for 2-dimensional coal gasification or combustion (PCGC-2). Final report, 8 September 1983-28. 1985.
[158] L. Douglas Smoot, Blaine W. Brown. Controlling mechanisms in gasification of pulverized coal. Fuel, 66:1249-1256, 1987.
[159] L. Douglas Smoot, Philip J. Smith. Coal Combustion and Gasification. Plenum Press, 1985.
[160] L. Douglas Smoot, David T. Pratt. Pulverized-Coal Combustion and Gasification. Plenum Press, 1979.
[161] Nicholas R. Soelberg, L. Douglas Smoot, Paul O. Hedman. Entrained flow gasification of coal: 1. Evaluation of mixing and reaction processes from local measurements. Fuel, 64:776-781, 1985.
[162] William Sowa. The Effect of Injector Design on the Performance of the Brigham Young University Gasifier. 1987.
[163] Jennifer Spinti, J. N. Thornock, E. G. Eddings, P. J. Smith, A. F. Sarofim. Transport Phenomena in Fires. WIT Press, 2008.
[164] Stephen Boyd. Convex Optimization. Cambridge University Press, 2004.
[165] Fred Stern, Robert Wilson, Hugh Coleman, Eric Paterson. Comprehensive approach to verification and validation of CFD simulations - part 1: methodology and procedures. Journal of Fluids Engineering, 123:793-802, 2001.
[166] Fred Stern, Robert Wilson, Hugh Coleman, Eric Paterson. Comprehensive approach to verification and validation of CFD simulations - part 1: methodology and procedures: Discussion. Journal of Fluids Engineering, 124:809-811, 2002.
[167] Steven Chapra, Raymond Canale. Numerical Methods for Engineers. McGraw-Hill, 2005.
[168] Stephen M. Stigler. The History of Statistics: The Measurement of Uncertainty Before 1900. The Belknap Press of Harvard University Press, 1986.
[169] John Strikwerda. Finite difference schemes and partial differential equations. SIAM, 2004.
[170] T.W. Simpson, J.D. Poplinski, P. N. Koch, J.K. Allen. Metamodels for Computer-based Engineering Design: Survey and recommendations. Engineering with Computers, 17(2):129-150, 2001.
[171] Tennekes, Lumley. A First Course in Turbulence. MIT Press, 1972.
[172] Thomas Naylor, J. M. Finger, James L. McKenney, William E. Schrank, Charles C. Holt. Verification of computer simulation models. Management Science, 14(2), 1967.
[173] Timothy G. Trucano, Martin Pilch, William L. Oberkampf. General concepts for experimental validation of ASCI code applications. 2002.
[174] Trent Russi, Andrew Packard, Ryan Feeley, Michael Frenklach. Sensitivity analysis of uncertainty in model prediction. J. Phys. Chem. A, 112:2579-2588, 2008.
[175] Trent Russi, Michael Frenklach, Andrew Packard. Uncertainty quantification: making predictions of complex reaction systems reliable. Chemical Physics Letters, 499:1-8, 2010.
[176] Trent Russi. Uncertainty quantification with experimental data and complex system models. 2010.
[177] S.K. Ubhayakar, D.B. Stickler, C.W. Von Rosenberg, R.E. Gannon. Rapid Devolatilization of Pulverized Coal in Hot Combustion Gases. Sixteenth Symposium (International) on Combustion, 16:427-436, 1976.
[178] W. V. Quine. Ontological Relativity. Columbia University Press, 1977.
[179] W F Wells, S K Kramer, L D Smoot, A U Blackham. Reactivity and Combustion of Coal Chars. Twentieth Symposium (International) on Combustion, 20, 1984.
[180] Wendy S. Parker. Computer simulation through an error-statistical lens. Synthese, 163:371-384, 2008.
[181] A. N. Whitehead, Bertrand Russell. Principia Mathematica. University Press, 1912.
[182] William J. Welch, Robert J. Buck, Jerome Sacks, Henry Wynn. Screening, Predicting, and Computer Experiments. Technometrics, 34(1):15-25, 1992.
[183] William L. Oberkampf, Matthew F. Barone. Measures of agreement between computation and experiment: validation metrics. 2005.
[184] William L. Oberkampf, Timothy G. Trucano, Charles Hirsch. Verification, validation, and predictive capability in computational engineering and physics. Appl. Mech. Rev., 57(5):345-384, 2004.
[185] William L. Oberkampf, Timothy G. Trucano. Verification and validation benchmarks. Nuclear Engineering and Design, 238:716-743, 2008.
[186] William L. Oberkampf, Timothy G. Trucano. Verification and validation in computational fluid dynamics. Progress in Aerospace Sciences, 38:209-272, 2002.
[187] William L. Oberkampf. Bibliography for Verification and Validation in Computational Simulation. 1998.
[188] Douglas L. Wright, Robert McGraw, Daniel E. Rosner. Bivariate Extension of the Quadrature Method of Moments for Modeling Simultaneous Coagulation and Sintering of Particle Populations. Journal of Colloid and Interface Sciences, 236:242-251, 2001.
[189] J. Xu, S. B. Pope. Assessment of numerical accuracy of PDF/Monte Carlo methods for turbulent reacting flows. Journal of Computational Physics, 152:192-230, 1999.
[190] Choongseok Yoon, Robert McGraw. Representation of generally mixed multivariate aerosols by the quadrature method of moments: I. Statistical foundation. Journal of Aerosol Science, 35:561-576, 2004.
[191] Jerrold M. Yos. Transport Properties of Nitrogen, Hydrogen, Oxygen, and Air to 30,000 K. 1963.
[192] G. A. Young, R. L. Smith. Essentials of Statistical Inference. Cambridge University Press, 2005.
[193] L. A. Zadeh. Fuzzy sets. Information and control, 8(3):338-353, 1965.
\clearpage
\makeappendices
8 GOVERNING EQUATIONS
The primary governing equations for turbulent reacting flow are the species continuity equations, the momentum equation, and the energy equation. These equations are written for a general single-phase formulation, then extended to apply to dilute particle systems.
8.1 Reynolds Transport Theorem
The Reynolds Transport Theorem is the general starting point for deriving a partial differential equation to describe changes in an intensive quantity ψ(x, t). The balance equation over a differential control volume with volume δV can be written by expanding the substantial derivative operator:
where Sψ is a source term representing the net generation of ψ (and is intensive).
Combining these two equations leads to an instantaneous partial differential equation:
8.2 Continuity Equation
When the quantity ψ = 1, the Reynolds Transport Theorem yields continuity equations. A subset of these are the species continuity equations. These are obtained from the continuity equation by letting ψ = ωi, the mass fraction of species i. For n species, n − 1 species continuity equations are independent, since ∑kωk = 1.
8.2.1 Single Phase
The species continuity equations are:
where the subscript i denotes the ith species, the quantity ρi is the mass density of species i, with units (mass of i)/(volume), ui is the velocity of species i, and Si is a source or sink term for the mass density of i due to chemical reactions. If the species continuity equations for all species are added, the overall continuity equation is obtained, which is equal to:
where u is the mass-mean velocity vector [354]. The net (across all species) mass source term for the gas phase is $\rho S = \sum_{i=1}^{N_{\rm species}} \rho_i S_i = 0$, due to conservation of mass.
8.2.2 Multiple Phases
For multiphase systems, a set of species continuity equations must be written for each phase. In this case, the mass source term only sums to zero across all species and all phases, $\rho S = \sum_{p=1}^{N_{\rm phases}} \sum_{i=1}^{N_{\rm species}} \rho_{pi} S_{pi} = 0$. Denoting the volume fraction of phase p by φp, the species continuity equation for phase p and species i is:
making the overall continuity equation for phase p:
For the case of dilute particle systems such as pulverized coal, the gas volume fraction φgas ≈ 1. In this case, the continuity equation for the gas phase can be written:
where ρS is a net mass source term representing the mass entering the gas phase, released by the solid phase (e.g., by devolatilization or evaporation processes).
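For reference, a standard form of the dilute-limit gas-phase continuity equation consistent with this description (written here as a sketch, not necessarily in the exact notation of the original equations) is:
\[
\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x_j}\left(\rho u_j\right) = \rho S .
\]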
8.3 Probability Density Function
8.3.1 Definition
The probability density function (PDF) denotes the density of the probability of a random variable having a particular value at a particular point in its corresponding sample space. Statistical descriptions of turbulence make use of the PDF to describe the probability of a random field (or fields) in a given domain. The PDF of a random variable (say, φ) is given by:
where ψ is the sample space variable corresponding to the random variable φ. If the random variable's value is a function of space and time, the PDF is denoted pφ(ψ; x, t).
It is often desirable to describe the probability of several random variables, rather than a single variable. In this case, the PDF describes the probability density of a vector of random values φ, and is called a joint PDF. The joint PDF of a two-variable system is given by:
Likewise, for the case of an arbitrary number of scalars, the joint PDF is given by:
Furthermore, a joint PDF of scalars and velocity can be written for a random variable u given a velocity sample space v.
The PDF can be used to obtain various moments, including the mean, of a function; for example, the expected value of an arbitrary function of a number of random variables Q(φ), given the joint PDF of φ, pφ, is:
Generally, the kth moment of a function Q(φ) is given by:
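For reference, standard forms of these two definitions, consistent with the notation above (written as a sketch rather than the exact equations of the text), are:
\[
\langle Q(\boldsymbol{\phi})\rangle = \int Q(\boldsymbol{\psi})\,p_{\boldsymbol{\phi}}(\boldsymbol{\psi})\,d\boldsymbol{\psi},
\qquad
\langle Q^k(\boldsymbol{\phi})\rangle = \int Q^k(\boldsymbol{\psi})\,p_{\boldsymbol{\phi}}(\boldsymbol{\psi})\,d\boldsymbol{\psi}.
\]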
8.3.2 PDF Transport Equation
The transport equation for a PDF of an arbitrary number of random variables (u, φ) can be derived in the manner of Pope [355, 356], by equating two expressions for the time derivative of an arbitrary function of all random variables (u, φ). In this section, the spatial and temporal dependence of the probability distribution function will be excluded but implied.
The expectation of the time derivative of an arbitrary function Q(u, φ), using (9.16), is written (assuming the gas density ρ is independent of the random variables, and where the operator ⟨⟩ denotes the expectation operator) as:
which is the expectation of Q(u, φ) given the joint PDF of (u, φ).
A second expression for the same quantity can be written using the chain rule:
The transport equations for each random variable can be written as
These u-space and φ-space convection terms can be written conditioned on the value of (v, ψ), and by integrating over the sample space (following equation (9.16)), they become:
where the time derivative can be taken out of the conditional, since the value of Q(u = v, φ = ψ) is a known function and does not need to be written as conditional on the value of v. Using the chain rule, the quantity inside the integral can be re-expressed as:
It is shown in [357] that the first term is zero for functions that are monotonic at ∞ and for which ⟨AjQ(u, φ)⟩ exists. Assuming the arbitrary function Q satisfies these conditions, this expression finally becomes:
Doing the same for the scalar term:
and using the chain rule again,
Using the same assumption about Q, the first term goes to zero, yielding:
These expressions can now be used in equation (9.21) to write $\langle DQ/Dt \rangle$ as:
which leads to the PDF transport equation:
8.3.3 Filtered PDF Transport Equation
The large eddy simulation turbulence model is formulated by filtering the governing equations using a low-pass filter, so that the smallest scales of the flow are not resolved. This operation gives rise to unclosed “subgrid” terms representing the effects of the filtered scales, which must be modeled. Filtering the number density function similarly leads to a loss of information about the NDF.
Applying the filtering operation to the PDF transport equation, and commuting the filter inside derivatives, yields:
Next, splitting and rearranging terms yields:
where the subgrid scalar fluxes, denoted with τ, are defined as:
and can be modeled using a gradient diffusion model (superscript m denotes modeled):
with the subgrid diffusivities usually modeled using a Prandtl number or Schmidt number approach, that is,
where μsgs is the subgrid-scale viscosity.
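A common form of this closure, written here as a sketch (the turbulent Schmidt number notation $Sc_t$ is an assumption, not necessarily the symbol used in the original equations), is:
\[
\Gamma_{\mathrm{sgs}} = \frac{\mu_{\mathrm{sgs}}}{Sc_t}.
\]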
9 MOMENTS
9.1 Definition
Ultimately, the NDF must be tracked in a computational fluid dynamics (CFD) code. Nearly every CFD code is designed to run in a scalar framework; all higher-order vectors, tensors, etc. must ultimately be expressed as a set of scalars in order to be tracked in existing CFD codes. Thus, the NDF must be decomposed into a set of scalars that characterize it. One such set, the moments of the NDF, describe statistical characteristics of the distribution; the distribution can ultimately be re-constructed from its moments. The kth integer moment mk of a univariate NDF f(ξ; x, t) is defined in terms of the probability density function Pξ⋆(ξ), then the number density function f(ξ; x, t), as:
Physically, this can be interpreted as the expectation of ξk. Thus, the first moment is simply interpreted as the mean value of ξ; the second (central) moment is related to the variance of ξ, the third to the skewness, the fourth to the kurtosis, and so on. If the internal coordinate is the particle diameter L, the 1st moment is physically interpreted as the mean value of the particle diameter L; the 2nd moment is proportional to the surface area, L2; and so on.
The moments mk of each internal coordinate of a multivariate NDF f(ξ; x, t) are defined over all the internal coordinates as:
where the integer vector k is the moment index vector for the kth (multivariate) moment, defined by k = [k1, k2, ⋯, kNξ], ki is the ith index of the kth moment (corresponding to the ith internal coordinate), and Nξ is the number of internal coordinates.
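For reference, standard forms of the univariate and multivariate moment definitions consistent with the description above (written as a sketch) are:
\[
m_k = \int_{\Omega}\xi^k f(\xi;\mathbf{x},t)\,d\xi,
\qquad
m_{\mathbf{k}} = \int_{\Omega}\left(\prod_{i=1}^{N_\xi}\xi_i^{k_i}\right) f(\boldsymbol{\xi};\mathbf{x},t)\,d\boldsymbol{\xi}.
\]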
9.2 Method of Moments for NDF Transport
The method of moments is a method of tracking the NDF of a system of particles. Because the NDF is a full, continuous distribution, it is difficult to track without assuming a functional form for it. Rather than assume a functional form, the moments of the NDF, which are simply scalars, are tracked instead. This method requires tracking various scalars, which is computationally feasible in a scalar framework and which greatly simplifies the process of tracking the NDF. However, the approach has a closure problem that prevents it from being used in practice for any but the most simple systems.
The transport equation for each moment must be written in terms of higher order moments, and the transport equations for these higher order moments must be written in terms of successively higher order moments, etc. Simplifications (models) must be used to express higher order moments only in terms of lower order moments being tracked as a part of the method of moments. Once this is accomplished, the set of moment transport equations becomes a closed set of equations.
9.3 Moment Transport Equation Derivation
The moment transport equation can be derived by applying the moment definition to the NDF transport equation (3.22). This is done by multiplying the entire NDF transport equation by $\prod_{j=1}^{N_\xi}\xi_j^{k_j}$ and integrating over the domain Ω of all internal coordinate values. First, multiplying by the product of the internal coordinates yields:
Next, that product can be taken into the derivatives of the first two terms, because all of the coordinates involved (ξ, x, t) are orthogonal and independent. However, bringing the product of internal coordinates into the derivative in front of the third term involves derivatives of internal coordinates with respect to themselves, meaning the product does not commute into the derivative in the same way.
Integrating over Ω (and dropping the dependencies on ξ, x, and t for simplicity):
Now the third term must be broken up as:
where kj is the index corresponding to internal coordinate ξj for the multivariate moment mk. For a given moment, if there is no ξj (that is, if the index of the moment corresponding to the jth internal coordinate is kj = 0), then the second term in (10.8) will be zero. Substituting this:
Now, using the definition of the multivariate moment (10.2) yields:
The terms on the left-hand side are related to the spatial and temporal changes of each moment, while the right-hand side is related to the changes in phase-space. This is the general form of the multivariate moment transport equation.
10 QUADRATURE-APPROXIMATED NUMBER DENSITY FUNCTION TRANSPORT EQUATION
This appendix starts with the quadrature approximation and the univariate and multivariate NDF transport equations. It then proceeds to derive the quadrature-approximated NDF transport equations. These equations are an important piece of the DQMOM formulation. In Appendix D, the moment transform of these quadrature-approximated NDF transport equations is taken, which yields a set of independent linear equations. In Appendix E, these independent linear equations are formulated in matrix form, and this matrix is a key component of the DQMOM algorithm.
10.1 Univariate Quadrature-Approximated NDF Transport Equation
This section presents a rigorous derivation of the univariate and multivariate weight and weighted abscissa transport equations and univariate quadrature-approximated NDF transport equation. The derivation starts with the univariate NDF transport equation, given as:
Next, the univariate quadrature approximation (3.30) is substituted into this equation, as are the environment-averaged physical space and phase space velocities ⟨vi⟩α (equation (3.36)) and ⟨G⟩α (equation (3.38)). Grouping spatial and temporal derivatives on the left-hand side, this yields:
Because the right side of the equation does not contain derivatives or integrals with respect to x or t, those terms can be replaced with source terms representing phase-space convection and phase-space diffusion, also defined in equations (3.57) and (3.58):
This yields a simplified form of (11.2):
Next, each of these terms can be split up individually, starting with the first term (the summation over α is implied from this point on; note that the equations that follow are single equations, are only valid when summed over all α's, and are not valid for individual α's):
\[
\frac{\partial}{\partial t}\left(w_\alpha\,\delta(\xi-\langle\xi\rangle_\alpha)\right)
= w_\alpha\frac{\partial}{\partial t}\left(\delta(\xi-\langle\xi\rangle_\alpha)\right)
+ \delta(\xi-\langle\xi\rangle_\alpha)\frac{\partial w_\alpha}{\partial t}
\]
and using implicit differentiation to evaluate the derivative of the delta function,
\[
\frac{\partial}{\partial t}\left(\delta(\xi-\langle\xi\rangle_\alpha)\right)
= -\,\delta'(\xi-\langle\xi\rangle_\alpha)\frac{\partial \langle\xi\rangle_\alpha}{\partial t},
\]
the time derivative in (11.5) simplifies to:
Now the spatial derivative term can be split up using the definition of the delta function derivative and the chain rule:
Finally, the diffusion term can be split up in a similar fashion:
Next, using properties of the delta function, the diffusion term becomes:
The last term can be split up as
so that the diffusion term finally becomes:
This equation can be rewritten in terms of the weights wα and the weighted abscissas ςα using two identities. The first identity is for the accumulation term for ⟨ξ⟩α:
\[
\frac{\partial}{\partial t}\left(w_\alpha\langle\xi\rangle_\alpha\right)
= \frac{\partial \varsigma_\alpha}{\partial t}
= w_\alpha\frac{\partial \langle\xi\rangle_\alpha}{\partial t}
+ \langle\xi\rangle_\alpha\frac{\partial w_\alpha}{\partial t}
\]
or, rearranging for the terms that appear in (11.12):
The second identity is for the convection term for ⟨ξ⟩α:
\[
\frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha \varsigma_\alpha\right)
= \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha w_\alpha \langle\xi\rangle_\alpha\right)
= \langle v_i\rangle_\alpha w_\alpha \frac{\partial \langle\xi\rangle_\alpha}{\partial x_i}
+ \langle\xi\rangle_\alpha \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha w_\alpha\right)
\]
which can be rearranged to give:
Finally, the last identity for the diffusion term of ⟨ξ⟩α is:
\[
\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial \varsigma_\alpha}{\partial x_i}\right)
= 2\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\frac{\partial \langle\xi\rangle_\alpha}{\partial x_i}
+ \langle\xi\rangle_\alpha\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\right)
+ w_\alpha\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial \langle\xi\rangle_\alpha}{\partial x_i}\right)
\]
which can be rearranged to yield:
where Cα is a dissipation term, defined as Cα = Γxi, α(∂⟨ξ⟩α)/(∂xi)(∂⟨ξ⟩α)/(∂xi).
The source terms for the transport equations for the weights and weighted abscissas appear in this equation. Upon substituting aα and bα from:
into (11.16), a new form of the NDF transport equation is obtained:
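For reference, the weight and weighted-abscissa transport equations whose source terms aα and bα appear here take the standard DQMOM form given by Marchisio and Fox [107] (written here as a sketch, with spatial diffusion included):
\begin{align*}
\frac{\partial w_\alpha}{\partial t}
+ \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha w_\alpha\right)
- \frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\right) &= a_\alpha, \\
\frac{\partial \varsigma_\alpha}{\partial t}
+ \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha \varsigma_\alpha\right)
- \frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial \varsigma_\alpha}{\partial x_i}\right) &= b_\alpha.
\end{align*}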
10.2 Multivariate Quadrature-Approximated NDF Transport Equation
The transport equations for the NDF weights and weighted abscissas can be derived starting with the multivariate NDF transport equation,
The internal coordinate v has been incorporated into the internal coordinate ξ for simplicity of notation. The quadrature approximation for the multivariate NDF (equation (3.35)) is substituted into equation (11.31), the environment-averaged velocities ⟨vi⟩α (equation (3.36)) and ⟨Gi⟩α (equation (3.38)) are substituted, and spatial and temporal derivatives are grouped on the right side of the equation, yielding:
The terms on the right-hand side can be replaced with the terms:
The summation over α will be dropped and implied for all the following equations, with the same caveat that each equation is only true for the sum over all α’s and is not true for individual α’s. Each term can be split up, starting with the temporal derivative:
\[
\frac{\partial}{\partial t}\left(w_\alpha \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)
= w_\alpha \frac{\partial}{\partial t}\left(\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)
+ \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\,\frac{\partial w_\alpha}{\partial t}
\]
Implicit differentiation can be used to evaluate the derivative of the product of delta functions. It is expressed as:
Using this, the temporal derivative can be expressed as:
The spatial derivative can also be split up:
\[
\frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha w_\alpha \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)
= w_\alpha \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)
+ \langle v_i\rangle_\alpha \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\,\frac{\partial w_\alpha}{\partial x_i}
\]
Next, each term of the spatial derivative will be rearranged. Each term can be treated individually. Starting with the first term:
\[
w_\alpha \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)
= w_\alpha\left[\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\frac{\partial \langle v_i\rangle_\alpha}{\partial x_i}
+ \langle v_i\rangle_\alpha \frac{\partial}{\partial x_i}\left(\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\right]
\]
Equation (11.36) can be used to get:
The second term is simpler, and only requires rearrangement:
The diffusion term can also be broken up:
which becomes:
\begin{align*}
\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial}{\partial x_i}\left(w_\alpha \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\right)
&= 2\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\frac{\partial}{\partial x_i}\left(\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right) \\
&\quad + \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\,\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\right) \\
&\quad + w_\alpha\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial}{\partial x_i}\left(\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\right)
\end{align*}
which can be further reduced to:
\begin{align*}
\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial}{\partial x_i}\left(w_\alpha \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\right)
&= \sum_{m=1}^{N_\xi}\left[2\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\frac{\partial \langle\xi_m\rangle_\alpha}{\partial x_i}
\left(\prod_{j=1,\,j\neq m}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\delta'(\xi_m-\langle\xi_m\rangle_\alpha)\right] \\
&\quad + \prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\,\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\right) \\
&\quad + \sum_{m=1}^{N_\xi}\left[w_\alpha\frac{\partial}{\partial x_i}\left\{\Gamma_{x_i,\alpha}\frac{\partial \langle\xi_m\rangle_\alpha}{\partial x_i}
\left(\prod_{j=1,\,j\neq m}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\delta'(\xi_m-\langle\xi_m\rangle_\alpha)\right\}\right]
\end{align*}
The very last term in this expression will become important later; it can be simplified as:
and the last term in this expression can also be further simplified:
Finally, the diffusion term becomes:
As in Section 10.1, this equation can be rewritten in terms of the weights wα and the weighted abscissas ςmα = wα⟨ξm⟩α by using three identities. First, an identity for the temporal derivative of the weighted abscissa,
can be rearranged to isolate the terms appearing in (11.45):
The second identity is for the weighted abscissa convection term:
\[
\frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha \varsigma_{m\alpha}\right)
= \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha w_\alpha \langle\xi_m\rangle_\alpha\right)
= \langle v_i\rangle_\alpha w_\alpha \frac{\partial \langle\xi_m\rangle_\alpha}{\partial x_i}
+ \langle\xi_m\rangle_\alpha \frac{\partial}{\partial x_i}\left(\langle v_i\rangle_\alpha w_\alpha\right)
\]
which, isolating the terms appearing in (11.45), yields:
The third and final identity is for the weighted abscissa diffusion term:
\begin{align*}
\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial \varsigma_{m,\alpha}}{\partial x_i}\right)
&= \frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial}{\partial x_i}\left(w_\alpha\langle\xi_m\rangle_\alpha\right)\right) \\
&= \frac{\partial}{\partial x_i}\left(w_\alpha\Gamma_{x_i,\alpha}\frac{\partial\langle\xi_m\rangle_\alpha}{\partial x_i}
 + \langle\xi_m\rangle_\alpha\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\right) \\
&= w_\alpha\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial\langle\xi_m\rangle_\alpha}{\partial x_i}\right)
 + 2\Gamma_{x_i,\alpha}\frac{\partial\langle\xi_m\rangle_\alpha}{\partial x_i}\frac{\partial w_\alpha}{\partial x_i}
 + \langle\xi_m\rangle_\alpha\frac{\partial}{\partial x_i}\left(\Gamma_{x_i,\alpha}\frac{\partial w_\alpha}{\partial x_i}\right),
\end{align*}
and isolating the terms appearing in (11.45):
or, representing the dissipation terms using Cmnα = Dx, α(∂⟨ξm⟩α)/(∂xi)(∂⟨ξn⟩α)/(∂xi), this equation becomes:
In this equation, several terms can be isolated as transport equations for the weights wα and weighted abscissa ςα:
where aα and bmα are source terms. Upon substituting the terms on the right-hand side for the terms on the left-hand side in (11.16), and re-expressing the delta function derivatives, the final NDF transport equation (the form of interest) is obtained:
This can be expressed more concisely as:
where Cmnα is a “cross-coordinate” dissipation term, defined as Cmnα = Γxi, α(∂⟨ξm⟩α)/(∂xi)(∂⟨ξn⟩α)/(∂xi).
In summary, for the univariate and multivariate cases, there is a set of equations that provide the starting point for the solution procedure, described in detail in Section 2.3.3. For the univariate case, this set of equations consists of the weight and weighted abscissa transport equations:
and the corresponding univariate quadrature-approximated NDF transport equation is:
For the multivariate case, this set of equations consists of the multivariate weight and weighted abscissa transport equations:
and the multivariate quadrature-approximated NDF transport equation:
11 MOMENT-TRANSFORMED NUMBER DENSITY FUNCTION TRANSPORT EQUATION
11.1 Moment-Transformed Univariate NDF
Because the quadrature-approximated univariate NDF transport equation (11.58) is only a single equation, but the number of moments, weights, and abscissas that must be tracked to maintain a high-accuracy representation of the NDF is larger than one, a set of independent linear equations must be derived from equation (11.58). This can be done by selecting a set of linearly independent moments. The number of moments that must be selected is 2N, since there are N unknown weights and N unknown abscissas.
The quadrature-approximated univariate NDF transport equation is written as:
Next, the moment transform, defined by (3.25), can be taken. Using the properties of the delta function [134, 107],
\begin{align*}
\int_{-\infty}^{\infty}\xi^k\,\delta(\xi-\langle\xi\rangle_\alpha)\,d\xi &= \langle\xi^k\rangle_\alpha \\
\int_{-\infty}^{\infty}\xi^k\,\delta'(\xi-\langle\xi\rangle_\alpha)\,d\xi &= -k\,\langle\xi^{k-1}\rangle_\alpha \\
\int_{-\infty}^{\infty}\xi^k\,\delta''(\xi-\langle\xi\rangle_\alpha)\,d\xi &= k(k-1)\,\langle\xi^{k-2}\rangle_\alpha,
\end{align*}
and multiplying by the denominator of (3.25), the moment-transformed NDF transport equation becomes:
where Sk is the moment transform (for the kth moment) of the phase-space convection source term Sξ (the quantity Sξ is defined in equation (3.57)), defined as:
which, using integration by parts, becomes:
Dk is the moment transform of the phase-space diffusive term Dξ (the quantity Dξ is defined in equation (3.58)), defined as:
Equation (12.2) contains unknowns for each of the N weights and N weighted abscissas, for a total of 2N unknowns, and therefore requires 2N moment indices k. Another way to express this is to say that the quadrature approximation has a degree of freedom for each of the N weights and each of the N abscissa locations, leading to 2N degrees of freedom. This set of equations can alternatively be expressed as a linear system, Ax = B, which is covered extensively in Appendix E.
11.2 Moment-Transformed Multivariate NDF
The same procedure can be done for the multivariate case, starting with the multivariate quadrature-approximated NDF transport equation:
where Sξ and Dξ are the multivariate phase-space convection and diffusion terms, defined by (3.62) and (3.63), respectively.
Next, using the corresponding properties of the multivariate delta function (summations over α are implied):
\[
\idotsint_{-\infty}^{+\infty}\left(\prod_{i=1}^{N_\xi}\xi_i^{k_i}\right)\left(\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)d\boldsymbol{\xi}
= \prod_{m=1}^{N_\xi}\langle\xi_m^{k_m}\rangle_\alpha
\]
\begin{align*}
\sum_{m=1}^{N_\xi}\idotsint_{-\infty}^{\infty}\left(\prod_{i=1}^{N_\xi}\xi_i^{k_i}\right)
\left(\frac{\partial}{\partial\langle\xi_m\rangle_\alpha}\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)d\boldsymbol{\xi}
&= \sum_{m=1}^{N_\xi}\idotsint_{-\infty}^{\infty}\left(\prod_{i=1}^{N_\xi}\xi_i^{k_i}\right)
\left(\prod_{j=1,\,j\neq m}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\delta'(\xi_m-\langle\xi_m\rangle_\alpha)\,d\boldsymbol{\xi} \\
&= \sum_{m=1}^{N_\xi}\left(\prod_{j=1,\,j\neq m}^{N_\xi}\langle\xi_j^{k_j}\rangle_\alpha\right)\left(-k_m\langle\xi_m^{k_m-1}\rangle_\alpha\right)
\end{align*}
\begin{align*}
\sum_{m=1}^{N_\xi}\sum_{n=1}^{N_\xi}\idotsint_{-\infty}^{\infty}\left(\prod_{i=1}^{N_\xi}\xi_i^{k_i}\right)
\left(\frac{\partial^2}{\partial\langle\xi_m\rangle_\alpha\,\partial\langle\xi_n\rangle_\alpha}\prod_{j=1}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)d\boldsymbol{\xi}
&= \sum_{m=1}^{N_\xi}\sum_{n=1}^{N_\xi}\idotsint_{-\infty}^{\infty}\left(\prod_{i=1}^{N_\xi}\xi_i^{k_i}\right)
\left(\prod_{j=1,\,j\neq m,n}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\delta'(\xi_m-\langle\xi_m\rangle_\alpha)\,\delta'(\xi_n-\langle\xi_n\rangle_\alpha)\,d\boldsymbol{\xi} \\
&\quad + \sum_{m=1}^{N_\xi}\idotsint_{-\infty}^{\infty}\left(\prod_{i=1}^{N_\xi}\xi_i^{k_i}\right)
\left(\prod_{j=1,\,j\neq m}^{N_\xi}\delta(\xi_j-\langle\xi_j\rangle_\alpha)\right)\delta''(\xi_m-\langle\xi_m\rangle_\alpha)\,d\boldsymbol{\xi} \\
&= \sum_{m=1}^{N_\xi}\sum_{n=1}^{N_\xi}\left[k_m k_n \langle\xi_m^{k_m-1}\rangle_\alpha\langle\xi_n^{k_n-1}\rangle_\alpha
\left(\prod_{j=1,\,j\neq m,n}^{N_\xi}\langle\xi_j^{k_j}\rangle_\alpha\right)\right] \\
&\quad + \sum_{m=1}^{N_\xi}\left[k_m(k_m-1)\langle\xi_m^{k_m-2}\rangle_\alpha
\left(\prod_{j=1,\,j\neq m}^{N_\xi}\langle\xi_j^{k_j}\rangle_\alpha\right)\right]
\end{align*}
In this case, the moment transform of the multivariate NDF transport equation becomes:
where Cmmα = Dx, α(∂⟨ξm⟩α)/(∂xi)(∂⟨ξm⟩α)/(∂xi) and Cmnα = Dx, α(∂⟨ξm⟩α)/(∂xi)(∂⟨ξn⟩α)/(∂xi). Because the phase-space convection and diffusion terms contain the NDF, the quadrature approximation can be used to simplify the term Sk:
and again using integration by parts, this becomes:
The term Dk can also be simplified by using the quadrature approximation:
Next, using integration by parts twice, this can be simplified to:
As with the univariate moment-transformed quadrature-approximated NDF transport equation, with all simplifications, these equations can be combined to form a linear system, whose construction is covered in great detail in Appendix E.
12 CONSTRUCTION OF LINEAR SYSTEM FOR DQMOM
12.1 Univariate Linear System
Appendix D covered the derivation of the univariate moment-transformed quadrature-approximated NDF transport equation
12.2↑, which can be used to solve for
N weights and
N abscissas, for a total of
2N equations, using a set of
2N independent moments. The transport equations for the moments, originating from the moment transform of the quadrature-approximated NDF transport equation, are independent equations that can be cast in matrix form,
Ax = B
where
x is the vector of unknowns
aα and
bα,
\[
\mathbf{x} = \begin{bmatrix} \mathbf{a} & \mathbf{b} \end{bmatrix}^{T} = \begin{bmatrix} a_1 & \cdots & a_N & b_1 & \cdots & b_N \end{bmatrix}^{T}.
\]
The matrix A contains the coefficients of the unknowns: one column for each element of x at each quadrature node, and one row for each moment:
The matrices A1 and A2 correspond, respectively, to a and b; both are 2N × N matrices, with one row for each moment and one column for each environment. This makes A a 2N × 2N matrix.
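In block form (the explicit entries of A1 and A2 are given in Appendix E), the univariate system therefore reads
\[
\mathbf{A}\,\mathbf{x} = \begin{bmatrix} \mathbf{A}_1 & \mathbf{A}_2 \end{bmatrix}
\begin{bmatrix} \mathbf{a} \\ \mathbf{b} \end{bmatrix} = \mathbf{B},
\]
with A1 multiplying the weight source terms in a and A2 multiplying the weighted-abscissa source terms in b.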
The right-hand side vector B contains vectors for each of the NDF source terms, including diffusion in x space (Cdiff), convection in phase space (S), and diffusion in phase space (Ddiff). B can be expressed as the sum of each of these. The diffusion vector Cdiff can be rewritten as:
where
Ac is a
2N × N matrix with entries:
each row of
Ac corresponds to a moment
k, and each column of
Ac corresponds to a quadrature node
α.
w is an
N × N matrix,
w = diag(w1, …, wN), and
C is an
N × 1 vector,
C = [Cα = 1, … , Cα = N], where Cα is defined by (3.56↑).
Next, the vector S can be rewritten as:
where
G is an
N × 1 vector containing the phase-space convection terms,
G = [Gα = 1, …, Gα = N], and
w and
A2 are the same as above.
Finally, the vector Ddiff can be written as:
where
Γ is an
N × 1 vector of the diffusion coefficients for each environment,
Γ = [ Γξ, α = 1, … , Γξ, α = N ]. Now, the linear system being solved can also be rewritten:
and the last equation can be written as
12.2 Multivariate Linear System
As with the univariate case, the multivariate moment-transformed quadrature-approximated NDF provides a set of independent equations with which to track the
N weights and the
Nξ × N abscissas. The multivariate moment-transformed quadrature-approximated NDF is given by equation
(12.8↑), and is a set of
N(Nξ + 1) independent equations, requiring
N(Nξ + 1) independent moments. These independent equations are linear, due to the quadrature approximation, and can be expressed in the form:
The x matrix is a combination of several smaller matrices. It is defined as:
and the submatrices are defined as:
where
i = 1, 2, ⋯, Nξ. All of the terms in the matrix
x are unknown quantities. The matrix
A is, like the univariate case, composed of several submatrices:
The matrix
A0 is composed of the elements:
with one column for each quadrature node and one row for each moment, making
A0 a matrix of size
N(Nξ + 1) × N. This matrix contains the coefficients of a.
The remaining A submatrices are calculated in a similar fashion; the matrix Aj (where j = 1, ⋯, Nξ) contains terms of the form:
These matrices each have size
(Nξ + 1)N × N, with one row for each moment and one column for each environment. Using equations
(13.15↑) and
(13.16↑), the entire
A matrix can be determined.
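Assembled from these blocks, the coefficient matrix has the layout
\[
\mathbf{A} = \begin{bmatrix} \mathbf{A}_0 & \mathbf{A}_1 & \cdots & \mathbf{A}_{N_\xi} \end{bmatrix},
\]
which is square, of size \( (N_\xi + 1)N \times (N_\xi + 1)N \), since each of the \( N_\xi + 1 \) blocks contributes \( N \) columns (the explicit entries are given above and in Appendix E).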
The B matrix consists of the right-hand side source and sink terms for the number density function. These include vectors for diffusion in x space (Cdiff), convection in phase space (S), and diffusion in phase space (Ddiff). B can be expressed as a linear combination of these vectors. Each of these matrices can be simplified, starting with Cdiff:
Ac is a coefficient matrix of size \( (N_\xi + 1)N \times N_\xi^2 \), defined as:
where each submatrix \( A^{k}_{c}(m,n) \) is a \( 1 \times N \) matrix with elements
\[
\delta_{mn} = \begin{cases} 1 & \text{if } m = n, \\ 0 & \text{if } m \neq n. \end{cases}
\]
The matrix W is a diagonal matrix of size \( N_\xi^2 \times N_\xi^2 \), W = diag(w), where w is an N × N diagonal matrix, w = diag(wα = 1, … , wα = N). Finally, the matrix C is of size \( N_\xi^2 \times 1 \) and contains the diffusion terms,
where the submatrices C(m, n) are N × 1 matrices with elements
This makes
Cdiff a
(Nξ + 1)N × 1 vector. It can be expressed more concisely as:
\begin{align*}
\mathbf{C}^{\mathrm{diff}} = \sum_{m=1}^{N_\xi} \sum_{n=1}^{N_\xi} \sum_{\alpha=1}^{N} \Biggl(
& \delta_{mn} \left\{ k_m (k_m - 1) \left\langle \xi_m^{\,k_m - 2} \right\rangle_\alpha
\left( \prod_{\substack{j=1 \\ j \neq m}}^{N_\xi} \left\langle \xi_j^{\,k_j} \right\rangle_\alpha \right) w_\alpha C^{mm}_{\alpha} \right\} \\
& + \left(1 - \delta_{mn}\right) \left\{ k_m k_n \left\langle \xi_m^{\,k_m - 1} \right\rangle_\alpha \left\langle \xi_n^{\,k_n - 1} \right\rangle_\alpha
\left( \prod_{\substack{j=1 \\ j \neq m,n}}^{N_\xi} \left\langle \xi_j^{\,k_j} \right\rangle_\alpha \right) w_\alpha C^{mn}_{\alpha} \right\} \Biggr)
\end{align*}
with one row for each moment k.
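As an illustrative sketch only, the sum above can be evaluated directly for a given moment index vector. The sketch assumes that, under the quadrature approximation, \( \langle \xi_m^{\,k} \rangle_\alpha \) may be evaluated as \( \langle \xi_m \rangle_\alpha^{\,k} \); the weights, abscissas, and \( C^{mn}_\alpha \) arrays shown are placeholder values, not Arches quantities.
\begin{verbatim}
import numpy as np

def c_diff_row(k, xi, w, C):
    """Evaluate one row of C^diff for the moment index vector k.

    k  : integer moment indices, shape (N_xi,)
    xi : abscissas <xi_m>_alpha, shape (N_xi, N)
    w  : weights w_alpha, shape (N,)
    C  : spatial-diffusion terms C^{mn}_alpha, shape (N_xi, N_xi, N)
    """
    N_xi, N = xi.shape
    total = 0.0
    for m in range(N_xi):
        for n in range(N_xi):
            for a in range(N):
                # product over internal coordinates not carrying reduced exponents
                skip = {m} if m == n else {m, n}
                prod = np.prod([xi[j, a] ** k[j]
                                for j in range(N_xi) if j not in skip])
                # terms with k_m < 2 (delta term) or k_m, k_n < 1 (off-diagonal
                # term) vanish because of the k(k-1) and k_m*k_n prefactors
                if m == n and k[m] >= 2:
                    total += (k[m] * (k[m] - 1) * xi[m, a] ** (k[m] - 2)
                              * prod * w[a] * C[m, m, a])
                elif m != n and k[m] >= 1 and k[n] >= 1:
                    total += (k[m] * k[n]
                              * xi[m, a] ** (k[m] - 1) * xi[n, a] ** (k[n] - 1)
                              * prod * w[a] * C[m, n, a])
    return total

# Placeholder example: N_xi = 2 internal coordinates, N = 2 nodes
xi = np.array([[0.5, 2.0], [1.0, 3.0]])
w = np.array([0.4, 0.6])
C = np.ones((2, 2, 2))          # placeholder C^{mn}_alpha values
print(c_diff_row(np.array([2, 0]), xi, w, C))
\end{verbatim}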
The phase-space diffusion vector Ddiff can be written in a similar way:
where \( \boldsymbol{\Gamma} = \left[\, \Gamma_{\xi=1,\alpha=1} \;\cdots\; \Gamma_{\xi=1,\alpha=N} \;\; \Gamma_{\xi=2,\alpha=1} \;\cdots\; \Gamma_{\xi=N_\xi,\alpha=N} \,\right] \).
Finally, the phase-space convection vector S can be re-written as:
where
As = [ A1 … ANξ ] contains all but one of the submatrices comprising the coefficient matrix A, defined above in equation (13.1↑);
W′ is an
Nξ × Nξ diagonal matrix,
W′ = diag(w) (where
w is an
N × N matrix defined above); and
G is an
Nξ × 1 vector containing the phase-space convection terms; it is defined as:
and the submatrices comprising
G are
N × 1 matrices defined as
This makes
S an
(Nξ + 1)N × 1 vector containing source terms due to phase-space convection.
S can be expressed in compact notation using equation
(12.10↑); this results in:
with one row for each
k. Likewise,
Ddiff can be expressed in compact notation using equation
(12.12↑), which yields:
The linear system for the multivariate system can also be rewritten:
There are also several limiting cases in which the linear system can be further simplified; these are covered in Appendix F.
13 SPECIAL CASES FOR DQMOM LINEAR SYSTEM
There are several special cases that simplify the form of the DQMOM linear system.
13.1 No Birth/Death
For inhomogeneous cases with no birth or death terms, h = 0, and an additional constraint can be implemented: the source term for environment weights can be set equal to zero. That is,
This eliminates the variables in
a as unknowns and makes the weight transport equations
13.1.1 No Birth/Death Only
In the case of no birth or death of particles (and no additional simplifications), the number of unknowns in the matrix system
Ax⋆ = B⋆ is reduced from
(Nξ + 1)N to
NξN (
N weight source terms are eliminated as unknowns). This also reduces the number of moments that must be specified to
NξN, and reduces the size of the matrices given in Appendix
12↑ accordingly. The matrix system becomes:
where each matrix has changed slightly;
A becomes (for the multivariate case):
where the matrix
A0, containing the coefficients of the variables in a, is gone. Likewise,
x′ becomes:
and each matrix composing the parts of
B⋆ is changed because of the reduced number of moments.
B⋆′ becomes an
NξN × 1 vector, rather than an
(Nξ + 1)N × 1 vector.
13.1.2 No Birth/Death, No Dispersion/Diffusion
The lack of birth or death of particles causes h = 0 in B⋆. Coupled with a lack of dispersion, which causes AcWC = 0 and AcWΓ = 0, this will make the entire right-hand side equal to zero, so the system being solved is:
(Note that, as above,
a = 0, and the linear system being solved is a reduced linear system). This case only applies in the absence of gradients for all environments’ internal coordinate values.
Two types of solutions exist for this linear system; the first is the trivial solution,
x⋆ = 0, and the second is the non-trivial solution. Following
[49], the trivial solution can be found by setting
x⋆ = 0, which makes
b⋆i, α = 0. In this case,
and thus the expressions for the weighted abscissa transport equation source terms are not coupled, and it is unnecessary to solve a linear system.
The second, non-trivial solution, as discussed in
[361], arises when there are additional unknowns in the equation. The example covered in
[362] is the case of evaporating droplets, when the number density flux for droplets with zero volume is non-zero. This necessitates an additional variable whose value is non-zero. For this reason, the trivial solution is not satisfactory.
13.1.3 No Birth/Death, Unmixed Moments Only
To begin, the linear system being solved is:
where
A0,
A1, etc. are all defined the same as in equation
(13.16↑), but the rows (each row of
Aj corresponding to one moment) are now split up into
Nξ + 1 groups (indicated by the superscripts), each group containing
N moments. Each
A(n)m is size
N × N, which gives the matrix
A size
(Nξ + 1)N × (Nξ + 1)N. Similarly, the matrix
x⋆ is size
(Nξ + 1)N × 1, and
B⋆ is size
(Nξ + 1)N × 1.
As above, because there are no birth or death processes, the matrix A0 = 0, and the matrix system being solved becomes
where one column, corresponding to the coefficients of
a, and one row, corresponding to an extra group of moments, have been eliminated from the matrix
A; the element of
x⋆ containing
a has been eliminated; and
B⋆′ is (as above) a transformed
B⋆, in which a row has been removed, corresponding to the decreased number of unknown variables (and corresponding decreased number of moments). Thus
A′ is size
NξN × NξN,
x⋆′ is size
NξN × 1, and
B⋆′ is size
NξN × 1.
Next, each moment in the set of moments used is unmixed, meaning only one moment index is nonzero for any particular moment. In order to obtain the same amount of information about each internal coordinate, the same number of nonzero moment indices, namely N, is used for each internal coordinate. For an unmixed moment, all but one of the matrices A(n)m in A′ are zero. This makes the matrix system:
which can be reduced to a set of
Nξ matrix equations of the form
where the matrix
Aj is a Vandermonde matrix of size
N × N,
b⋆j is a vector of
N unknowns, and
B⋆j is a vector of
N source terms.
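A minimal numerical sketch of one such decoupled solve is given below (illustrative only; it assumes the Vandermonde rows contain increasing powers \( \langle \xi_j \rangle_\alpha^{\,k} \), \( k = 0, \ldots, N-1 \), of the abscissas, and all numerical values are placeholders; the exact entries used in the derivation are given by the equation above).
\begin{verbatim}
import numpy as np

# N = 3 quadrature nodes for a single internal coordinate j
abscissas = np.array([0.5, 1.0, 2.0])   # placeholder <xi_j>_alpha values
B_j = np.array([0.0, 1.0, 2.5])         # placeholder source terms B*_j

# Vandermonde matrix in the abscissas: row k contains <xi_j>_alpha**k
A_j = np.vander(abscissas, increasing=True).T

# Solve the small N x N system for the N unknowns b*_j
b_j = np.linalg.solve(A_j, B_j)
print(b_j)
\end{verbatim}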
13.2 Small N, Small Nξ
It should, of course, be mentioned that in the case of simplified physics, an analytical solution may be obtained that circumvents the need to invert the linear system. Alternatively, for small numbers of quadrature nodes N or internal coordinates Nξ, the linear systems are small, and can be inverted by hand for analytical solutions for a and b.
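For example, under the standard DQMOM coefficient form assumed in the sketches above (an assumption; the explicit entries appear in Appendix E), the smallest univariate case, N = 1 with moments k = 0 and k = 1, gives
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} a_1 \\ b_1 \end{bmatrix}
=
\begin{bmatrix} \bar{S}_0 \\ \bar{S}_1 \end{bmatrix}
\quad\Longrightarrow\quad
a_1 = \bar{S}_0, \qquad b_1 = \bar{S}_1,
\]
where \( \bar{S}_0 \) and \( \bar{S}_1 \) are generic labels for the k = 0 and k = 1 moment source terms, so no matrix inversion is required at all.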
13.3 Mixed Moment Choices
The chief difficulty that arises as the number of quadrature nodes N and internal coordinates Nξ increases is finding sets of mixed moments that are linearly independent. Marchisio [107] gives an example for two variables, showing how the covariance is linearly dependent on the variances, so that only two of these three moments may be selected. The origin of the problem lies in the quadrature algorithm; if two or more moments are linearly dependent, the quadrature algorithm, which is attempting to find orthogonal polynomials whose zeros are the abscissas, has too many constraints and not enough information. While an a priori determination of whether two or more moments are linearly dependent can be made for small Nξ, the problem grows exponentially in difficulty with the number of internal coordinates. Each moment may be expressed as a polynomial in the abscissas, so the problem is equivalent to determining whether one multivariate polynomial is a factor of another, which is not, in general, an easy problem to solve. For this reason, one must experiment with different moments to find a set that works. This, too, becomes difficult and cumbersome for large numbers of internal coordinates and quadrature nodes. For this reason (and others), it is recommended that the optimal linear system construction procedure, detailed in Appendix
↓, be used in the construction of the DQMOM linear system.
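Before that procedure is applied, a candidate moment set can also be screened numerically: assemble the coefficient matrix for a trial set of abscissas and inspect its rank and condition number. The sketch below is illustrative only; it assumes the standard multivariate DQMOM coefficient form (the row for moment k carries \( (1 - \sum_i k_i) \prod_i \langle \xi_i \rangle_\alpha^{k_i} \) for \( a_\alpha \) and \( k_j \langle \xi_j \rangle_\alpha^{k_j - 1} \prod_{i \neq j} \langle \xi_i \rangle_\alpha^{k_i} \) for \( b_{j,\alpha} \)) rather than the optimal construction procedure, and the abscissas are arbitrary trial values.
\begin{verbatim}
import numpy as np

def dqmom_matrix(moments, xi):
    """Assemble the (N_xi + 1)N x (N_xi + 1)N DQMOM coefficient matrix.

    moments : list of moment index vectors k, each of length N_xi
    xi      : abscissas <xi_i>_alpha, shape (N_xi, N)
    """
    N_xi, N = xi.shape
    A = np.zeros((len(moments), (N_xi + 1) * N))
    for row, k in enumerate(moments):
        for a in range(N):
            prod_all = np.prod(xi[:, a] ** k)
            A[row, a] = (1.0 - sum(k)) * prod_all        # coefficient of a_alpha
            for j in range(N_xi):
                if k[j] >= 1:
                    coeff = k[j] * xi[j, a] ** (k[j] - 1)
                    coeff *= np.prod([xi[i, a] ** k[i]
                                      for i in range(N_xi) if i != j])
                    A[row, (j + 1) * N + a] = coeff      # coefficient of b_{j,alpha}
    return A

# Candidate moment set for N_xi = 2, N = 2 (6 moments needed)
moments = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2), (1, 1)]
xi = np.random.default_rng(0).uniform(0.5, 2.0, size=(2, 2))  # trial abscissas

A = dqmom_matrix(moments, xi)
print("rank =", np.linalg.matrix_rank(A),
      "cond =", np.linalg.cond(A))   # low rank / huge cond => poor moment set
\end{verbatim}
With this coefficient convention, the candidate set shown, which pairs the covariance moment (1, 1) with both variances, should be reported as rank-deficient (or severely ill-conditioned), consistent with the linear dependence described above.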