Help for package RobustCalibration

Type:

Package

Title:

Robust Calibration of Imperfect Mathematical Models

Version:

0.5.6

Date:

2025-07-15

Maintainer:

Mengyang Gu <mengyang@pstat.ucsb.edu>

Author:

Mengyang Gu [aut, cre]

Description:

Implements full Bayesian analysis for calibrating mathematical models with new methodology for modeling the discrepancy function. It allows for emulation, calibration and prediction using complex mathematical model outputs and experimental data. See the reference: Mengyang Gu and Long Wang, 2018, Journal of Uncertainty Quantification; Mengyang Gu, Fangzheng Xie and Long Wang, 2022, Journal of Uncertainty Quantification; Mengyang Gu, Kyle Anderson and Erika McPhillips, 2023, Technometrics.

License:

GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]

Depends:

methods

Imports:

Rcpp (≥ 0.12.3), RobustGaSP (≥ 0.6.4), nloptr (≥ 1.0.4)

LinkingTo:

Rcpp, RcppEigen

NeedsCompilation:

yes

Repository:

CRAN

Packaged:

2025-07-15 21:22:01 UTC; mengyanggu

RoxygenNote:

5.0.1

Date/Publication:

2025-07-15 22:30:06 UTC

Robust Calibration of Imperfect Mathematical Models

Description

Details

The DESCRIPTION file:

Package:	RobustCalibration
Type:	Package
Title:	Robust Calibration of Imperfect Mathematical Models
Version:	0.5.6
Date:	2025-07-15
Authors@R:	c(person(given="Mengyang",family="Gu",role=c("aut","cre"),email="mengyang@pstat.ucsb.edu"))
Maintainer:	Mengyang Gu <mengyang@pstat.ucsb.edu>
Author:	Mengyang Gu [aut, cre]
Description:	Implements full Bayesian analysis for calibrating mathematical models with new methodology for modeling the discrepancy function. It allows for emulation, calibration and prediction using complex mathematical model outputs and experimental data. See the reference: Mengyang Gu and Long Wang, 2018, Journal of Uncertainty Quantification; Mengyang Gu, Fangzheng Xie and Long Wang, 2022, Journal of Uncertainty Quantification; Mengyang Gu, Kyle Anderson and Erika McPhillips, 2023, Technometrics.
License:	GPL (>= 2)
Depends:	methods
Imports:	Rcpp (>= 0.12.3), RobustGaSP (>= 0.6.4), nloptr (>= 1.0.4)
LinkingTo:	Rcpp, RcppEigen
NeedsCompilation:	yes
Repository:	CRAN
Packaged:	2024-05-30 06:42:08 UTC; mengyanggu
RoxygenNote:	5.0.1
Date/Publication:	2018-05-14 04:26:37 UTC

Index of help topics:

RobustCalibration-package
                        Robust Calibration of Imperfect Mathematical
                        Models
predict                 Prediction for the robust calibration model
predict_MS              Prediction for the robust calibration model for
                        multiple sources
predictobj.rcalibration-class
                        Predictive results for the Robust Calibration
                        class
predictobj.rcalibration_MS-class
                        Predictive results for the Robust Calibration
                        class
rcalibration            Setting up the robust Calibration model
rcalibration-class      Robust Calibration class
rcalibration_MS         Setting up the robust Calibration model for
                        multiple sources data
rcalibration_MS-class   Robust Calibration for multiple sources class
show                    Show an Robust Calibration object.

Robust calibration of imperfect mathematical models and prediction using experimental data

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Bayarri, Maria J and Berger, James O and Paulo, Rui and Sacks, Jerry and Cafeo, John A and Cavendish, James and Lin, Chin-Hsu and Tu, Jian (2007) A framework for validation of computer models. Technometrics. 49, 138–154.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Examples


##---------------------------------------------------
##A simple example where the math model is not biased
##---------------------------------------------------
## the reality 
test_funct_eg1<-function(x){
  sin(pi/2*x)
}



## obtain 25 data from the reality plus a noise
set.seed(1)
## 10 data points are very small, one may want to add more data
n=15
input=seq(0,4,4/(n-1))
input=as.matrix(input)

output=test_funct_eg1(input)+rnorm(length(input),mean=0,sd=0.2)

## plot input and output 
#plot(input,output)
#num_obs=n=length(output)



## the math model 
math_model_eg1<-function(x,theta){
  sin(theta*x)  
}

##fit the S-GaSP model for the discrepancy
##one can choose the discrepancy_type to GaSP, S-GaSP or no discrepancy
##p_theta is the number of parameters to calibrate and user needs to specifiy 
##one may also want to change the number of posterior samples by change S and S_0
##one may change sd_proposal for the standard derivation of the proposal distribution
## one may also add a mean by setting X=... and have_trend=TRUE
model_sgasp=rcalibration(design=input, observations=output, p_theta=1,simul_type=1,
                         math_model=math_model_eg1,theta_range=matrix(c(0,3),1,2)
                         ,S=10000,S_0=2000,discrepancy_type='S-GaSP')


##posterior samples of calibration parameter and value
## the value is 
plot(model_sgasp@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
plot(model_sgasp@post_value,type='l',xlab='num',ylab='posterior value')   


show(model_sgasp)



#------------------------------------------------------------------------------
# Example: an example used in Susie Bayarri et. al. 2007 Technometrics paper
#------------------------------------------------------------------------------

##reality
test_funct_eg1<-function(x){
  3.5*exp(-1.7*x)+1.5
}


##math model
math_model_eg1<-function(x,theta){
  5*exp(-x*theta) 
}

## noise observations (sampled from reality + independent Gaussian noises)
## each has 3 replicates
input=c(rep(.110,3),rep(.432,3),rep(.754,3),rep(1.077,3),rep(1.399,3),rep(1.721,3),
        rep(2.043,3),rep(2.366,3),rep(2.688,3),rep(3.010,3))
output=c(4.730,4.720,4.234,3.177,2.966,3.653,1.970,2.267,2.084,2.079,2.409,2.371,1.908,1.665,1.685,
         1.773,1.603,1.922,1.370,1.661,1.757,1.868,1.505,1.638,1.390,1.275,1.679,1.461,1.157,1.530)


n_stack=length(output)/3
output_stack=rep(0,n_stack)
input_stack=rep(0,n_stack)
for(j in 1:n_stack){
  output_stack[j]=mean(output[ ((j-1)*3+1):(3*j)])
  input_stack[j]=mean(input[ ((j-1)*3+1):(3*j)])
  
}
output_stack=as.matrix(output_stack)
input_stack=as.matrix(input_stack)
## plot the output and stack
#plot(input,output,pch=16,col='red')
#lines(input_stack,output_stack,pch=16,col='blue',type='p')



## fit the model with S-GaSP for the discrepancy
model_sgasp=rcalibration(design=input_stack, observations=output_stack, p_theta=1,simul_type=1,
                         math_model=math_model_eg1,theta_range=matrix(c(0,10),1,2),S=10000,
                         S_0=2000,discrepancy_type='S-GaSP')

#posterior
plot(model_sgasp@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
show(model_sgasp)

Determine whether we accept the proposed poserior sample at one step of MCMC.

Description

Determine whether the proposed poserior sample is accepted at one step of MCMC.

Usage

Accept_proposal(r)

Arguments

r

r is a real number calculated by the ratio of posterior distribution with a new proposed sample of parameters and the posterior distribution with the current parameters.

Value

A bool value. If it is true, the proporsed sample gets accepted; if not, it gets rejected.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

Cholesky decomposition of a symmetric matrix.

Description

This function provides a faster Cholesky decomposition of a symmetric matrix using the Eigen package.

Usage

Chol_Eigen(R)

Arguments

R

R is a symmetric matrix.

Value

lower triangular matrix of Cholesky decomposition

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

Cholesky decomposition of the correlation matrix in GaSP.

Description

This function computes the Cholesky decomposition of the correlation matrix in GaSP.

Usage

Get_R_new(beta_delta,  eta_delta,  R0, kernel_type,  alpha, inv_output_weights)

Arguments

beta_delta

Inverse range parameters.

eta_delta

nugget parameters.

R0

A List of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector.

kernel_type

Type of kernel. matern_3_2 and matern_5_2 are Matern kernel with roughness parameter 3/2 and 5/2 respectively. pow_exp is power exponential kernel with roughness parameter alpha. If pow_exp is to be used, one needs to specify its roughness parameter alpha.

alpha

Roughness parameters in the kernel functions. It is only useful if the power exponential correlation function is used.

inv_output_weights

The inverse of output weights.

Value

lower triangular matrix of Cholesky decomposition of the correlation matrix in GaSP.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

Cholesky decomposition of the covariance matrix in S-GaSP.

Description

This function computes the Cholesky decomposition of the covariance matrix in GaSP.

Usage

Get_R_z_new(beta_delta,  eta_delta, lambda_z,  R0, kernel_type,  alpha, 
            inv_output_weights)

Arguments

beta_delta

a vector of inverse range parameters.

eta_delta

a scalar of nugget parameters.

lambda_z

a scalar parameter controling how close the math model to the reality in squared distance.

R0

a list of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector.

kernel_type

type of kernel. matern_3_2 and matern_5_2 are Matern kernel with roughness parameter 3/2 and 5/2 respectively. pow_exp is power exponential kernel with roughness parameter alpha. If pow_exp is to be used, one needs to specify its roughness parameter alpha.

alpha

roughness parameters in the kernel functions. It is only useful if the power exponential correlation function is used.

inv_output_weights

the inverse of output weights.

Value

lower triangular matrix of Cholesky decomposition of the covariance matrix in S-GaSP.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

Produce the inversion of the covariances of the model discrepancy and the measurement bias.

Description

This function computes inversion of the covariances of the model discrepancy and the measurement bias. This is applicable to model calibration with multiple sources of data and measurement bias.

Usage

Get_inv_all(param,  lambda_z, is_SGaSP, R0, kernel_type,  alpha_list,
p_x, num_sources)

Arguments

param

a list of the current parameters values in MCMC.

lambda_z

the value of lambda_z

is_SGaSP

a vector of integer values to indicate whether it is S-GaSP model or not. 0 means the model is GaSP and 1 means the model is S-GaSP.

R0

A List of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector.

kernel_type

alpha_list

A list of roughness parameters in the kernel functions. It is only useful if the power exponential correlation function is used.

p_x

a list of dimensions of the observable inputs.

num_sources

a integer value of the number of sources.

Value

a list of inverse covariances of discrepancy and measurement bias.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

Natural Logorithm of the posterior.

Description

This function compute the natural Logorithm of the posterior assuming the GaSP or S-GaSP models for the discrepancy function.

Usage

Log_marginal_post(param, L_cur, output,  p_theta,  p_x, X, have_mean, CL, a, b,  cm_obs
,S_2_f,num_obs_all)

Arguments

param

Current parameters in the MCMC.

L_cur

Cholesky decomposition of the covariance matrix.

output

Experimental observations.

p_theta

Number of calibration parameters.

p_x

Number of range parameters.

X

Number of mean discrepancy parameters.

have_mean

Whether the mean discrepancy is zero or not.

CL

Prior parameter in the jointly robust prior.

a

Prior parameter in the jointly robust prior.

b

Prior parameter in the jointly robust prior.

cm_obs

Outputs from the mathematical model.

S_2_f

Variance of the data. This term is useful when there are repeated experiments.

num_obs_all

Total number of observations. If there is no repeated experiment, this is equal to the number of observable inputs.

Value

Natural logorithm of the posterior assuming the GaSP or S-GaSP models for the discrepancy function.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Natural Logorithm of the posterior of the discrepancy in model calibration with multiple sources with measurement bias.

Description

This function compute the natural Logorithm of the posterior assuming the GaSP or S-GaSP models for the discrepancy function.

Usage

Log_marginal_post_delta(param, L, delta,  p_x, CL, a, b)

Arguments

param

current parameters in the MCMC.

L

Cholesky decomposition of the covariance matrix.

delta

a vector of the discrepancy.

p_x

dimension of observable inputs.

CL

Prior parameter in the jointly robust prior.

a

Prior parameter in the jointly robust prior.

b

Prior parameter in the jointly robust prior.

Value

Natural logorithm of the posterior of the discrepancy function.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Natural Logorithm of the posterior with no discrepancy function.

Description

This function compute the natural Logorithm of the posterior assuming no discrepancy function.

Usage

Log_marginal_post_no_discrepancy(param,  output,  p_theta,   X, have_mean, 
inv_output_weights,  cm_obs,S_2_f,num_obs_all)

Arguments

param

Current parameters in the MCMC.

output

Experimental observations.

p_theta

Number of calibration parameters.

X

Number of mean discrepancy parameters.

have_mean

Whether the mean discrepancy is zero or not.

inv_output_weights

Inverse of the weights of the outputs

cm_obs

Outputs from the mathematical model.

S_2_f

Variance of the data. This term is useful when there are repeated experiments.

num_obs_all

Total number of observations. If there is no repeated experiment, this is equal to the number of observable inputs.

Value

Natural logorithm of the posterior assuming no discrepancy function.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

A geophysical model for the ground deformation in Kilauea.

Description

This function produces outputs of ground deformation at given coordinates and parameters.

Usage

Mogihammer(obsCoords, m, simul_type)

Arguments

obsCoords

spatial coordinate system.

m

A five dimensional input parameters. The first two are the location of the magma chamber; the third one is the depth of the chamber; the fourth one is magma storage rate; the last one is the Possion ratio, which is related to the rock properties.

simul_type

If the simul_type is 2, it means it is for the ascending mode InSAR data; if the simul_type is 3, it means it is for the descending mode InSAR data.

Value

A vector of outputs for ground deformation.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

K. R. Anderson and M. P. Poland (2016), Bayesian estimation of magma supply, storage, and eroption rates using a multiphysical volcano model: Kilauea volcano, 2000-2012.. Eath and Planetary Science Letters, 447, 161-171.

K. R. Anderson and M. P. Poland (2017), Abundant carbon in the mantle beneath Hawaii. Nature Geoscience, 10, 704-708.

Sample the model discrepancy.

Description

This function samples a vector of the model discrepancy for the scenario with multiple sources and measurement bias.

Usage

Sample_delta(cov_inv_all,  tilde_output_cur,   param,  p_x,   
num_sources, num_obs, rand_norm)

Arguments

cov_inv_all

a list of inverse covariances of discrepancy and measurement bias.

tilde_output_cur

a list of transformed observations.

param

a list of the current parameters values in MCMC.

p_x

a list of dimensions of the observable inputs.

num_sources

the number of sources.

num_obs

the number of observations.

rand_norm

the vector of i.i.d. standard normal samples.

Value

A vector of samples of model discrepancy.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Sample the variance and mean parameters.

Description

This function Samples the variance and mean parameters assuming the GaSP or S-GaSP models for the discrepancy function.

Usage

Sample_sigma_2_theta_m(param,  L_cur,  output,   p_theta,  p_x,  X,  have_mean, cm_obs
,S_2_f,num_obs_all)

Arguments

param

Current parameters in the MCMC.

L_cur

Cholesky decomposition of the covariance matrix.

output

Experimental observations.

p_theta

Number of calibration parameters.

p_x

Number of range parameters.

X

Number of mean discrepancy parameters.

have_mean

Whether the mean discrepancy is zero or not.

cm_obs

outputs from the mathematical model.

S_2_f

Variance of the data. This term is useful when there are repeated experiments.

num_obs_all

Total number of observations. If there is no repeated experiment, this is equal to the number of observable inputs.

Value

A vector of samples of the variance and mean discrepancy parameters.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Sample the variance and mean parameters with no discrepancy function.

Description

This function samples the variance and mean parameters assuming no discrepancy function.

Usage


Sample_sigma_2_theta_m_no_discrepancy(param,  output,   p_theta,   
                                      X,  have_mean, inv_output_weights, cm_obs,
                                      S_2_f,num_obs_all)

Arguments

param

Current parameters in the MCMC.

output

Experimental observations.

p_theta

Number of calibration parameters.

X

Number of mean discrepancy parameters.

have_mean

Whether the mean discrepancy is zero or not.

inv_output_weights

The inverse of the weights of outputs..

cm_obs

outputs from the mathematical model.

S_2_f

Variance of the data. This term is useful when there are repeated experiments.

num_obs_all

Total number of observations. If there is no repeated experiment, this is equal to the number of observable inputs.

Value

A vector of samples of the variance and trend parameters.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Update the inverse of covariance multiplied by the outputs in the S-GaSP model.

Description

This function update the inverse of R_z multiple the outputs in the S-GaSP model for prediction.

Usage

Update_R_inv_y(R_inv_y,  R0,  beta_delta,   kernel_type,  alpha,  lambda_z,  num_obs)

Arguments

R_inv_y

A vector of inverse of covariance multiplied by the outputs.

R0

A List of matrix where the j-th matrix is an absolute difference matrix of the j-th input vector.

beta_delta

Inverse range parameters.

kernel_type

alpha

Roughness parameters in the kernel functions. It is only useful if the power exponential correlation function is used.

lambda_z

A parameter controling how close the math model to the reality in squared distance.

num_obs

Number of observations.

Value

A vector of the inverse of covariance multiplied by the outputs in the S-GaSP model.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

Evaluation of the mathmatical model at given observed inputs and calibration parameters.

Description

This function evaluates the mathematical model at given observed inputs and calibration parameters.

Usage

mathematical_model_eval(input,theta,simul_type, emulator,
emulator_type,loc_index_emulator,math_model)

Arguments

input

a matrix of the observed inputs.

theta

a vector of calibration parameters.

simul_type

tpye of math model. If the simul_type is 0, it means we use the RobustGaSP R package to emulate the math model. If the simul_type is 1, it means the function of the math model is given by the user. When simul_type is 2 or 3, the mathematical model is the geophyiscal model for Kilauea Volcano. If the simul_type is 2, it means it is for the ascending mode InSAR data; if the simul_type is 3, it means it is for the descending mode InSAR data.

emulator

an S4 class of rgasp model from the RobustGaSP R Package.

emulator_type

a character to specify the type of emulator. 'rgasp' is for computer models with scalar-valued output and 'ppgasp' for computer models with vectorized output.

loc_index_emulator

a vector of the location index from the ppgasp emulator to output. Only useful for vectorized output computer model emulated by the ppgasp emulator.

math_model

a function for the math model to be calibrated.

Value

a vector of outputs from the math model or its emulator at given observed inputs and calibration parameters.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

K. R. Anderson and M. P. Poland (2017), Abundant carbon in the mantle beneath Hawaii. Nature Geoscience, 10, 704-708.

Posterior sampling.

Description

This function performs the posterior sampling for calibration parameters and other parameters in the model.

Usage

post_sample(input, output, R0_list, kernel_type, p_theta, output_weights, par_cur, 
            lambda_z,prior_par,theta_range,S,thinning, X, have_trend,alpha,
            sd_proposal,discrepancy_type, simul_type,emulator,
            emulator_type, loc_index_emulator,math_model,
            S_2_f,num_obs_all)

Arguments

input

a matrix of observed inputs/design points of the experimental data.

output

a vector of experimental data.

R0_list

a List of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector.

kernel_type

p_theta

number of calibration parameters.

output_weights

a vector of the weights of the output.

par_cur

current value of the posterior samples.

lambda_z

a scalar parameter controling how close the math model to the reality in squared distance.

prior_par

a vector of prior parameters in the prior.

theta_range

a matrix for the range of the calibration parameters. The first column is the lower bound and the second column is the upper bound of the calibration parameters.

S

number of MCMC to run.

thinning

the ratio between the number of posterior samples and the number of recorded samples.

X

a matrix for the basis of the mean discrepancy.

have_trend

a bool value. It means the mean discrepancy is zero or not.

alpha

a vector of roughness parameters in the kernel functions. It is only useful if the power exponential correlation function is used.

sd_proposal

a vector for the standard deviation of the proposal distribution.

discrepancy_type

A string for type of discrepancy funcation. It can be chosen from 'no-discrepancy', 'GaSP' or 'S-GaSP'.

simul_type

emulator

an S4 class of rgasp model from the RobustGaSP R Package.

emulator_type

a character to specify the type of emulator. 'rgasp' is for computer models with scalar-valued output and 'ppgasp' for computer models with vectorized output.

loc_index_emulator

a vector of the location index from the ppgasp emulator to output. Only useful for vectorized output computer model emulated by the ppgasp emulator.

math_model

a function for the math model to be calibrated.

S_2_f

Variance of the data. This term is useful when there are repeated experiments.

num_obs_all

Total number of observations. If there is no repeated experiment, this is equal to the number of observable inputs.

Value

A list. The first element is a matrix of the posterior samples after burn-in. The second element is a vector of posterior values after burn-in. The third element is the number of times of the proposed samples are accepted. The first value of the vector is the number of times that the proposed calibration parameters are accepted and the second value is the number of times of the proposed log inverse range parameter and the log nugget parameters are accepted, if there is a specification of the discrepancy function. The fourth element is the number of times the proposed samples of the calibration parameters are outside the range of the calibration parameters.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Posterior sampling.

Description

This function performs the posterior sampling for calibration parameters and other parameters in the model.

Usage

post_sample_MS(model,par_cur_theta, par_cur_individual, emulator,math_model_MS)

Arguments

model

a S4 object of rcalibration_MS.

par_cur_theta

a list of current value of the posterior sample of calibration parameters.

par_cur_individual

a list of the current values of the posterior sample of the individual parameter of multiple sources.

emulator

a list of emulators if specified of multiple sources.

math_model_MS

a list of mathematical models of multiple sources.

Value

A list. The record_post is a vector of posterior values after burn-in samples. The record_theta is a matrix of of the posterior samples of theta after burn-in samples. The individual_par is a list where each element is a matrix of posterior samples of the range and nugget parameters for each source. The accept_S_theta is a vector where each element is the number of accepted posterior samples of calibration parameters. The accept_S_beta is a vector where each element is the number of accepted posterior samples of range and nugget parameters. The count_dec_record is vector where each element is the number of times the proposed samples of the calibration parameters are outside the range of the calibration parameters for each source.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Posterior sampling for the model with no discrepancy function.

Description

This function performs the posterior sampling for calibration parameters and other parameters in the model assuming no discrepancy function.

Usage


post_sample_no_discrepancy(input, output, R0_list,  p_theta, output_weights,
                                par_cur, theta_range,S,thinning, X, have_trend, 
                                alpha,sd_proposal, discrepancy_type, simul_type,emulator,
                                emulator_type, loc_index_emulator,
                                math_model,S_2_f,num_obs_all)

Arguments

input

a matrix of observed inputs/design points of the experimental data.

output

a vector of experimental data.

R0_list

a List of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector.

p_theta

number of calibration parameters.

output_weights

a vector of the weights of the output.

par_cur

current value of the posterior samples.

theta_range

a matrix for the range of the calibration parameters. The first column is the lower bound and the second column is the upper bound of the calibration parameters.

S

number of MCMC to run.

thinning

the ratio between the number of posterior samples and the number of recorded samples.

X

a matrix for the basis of the mean discrepancy.

have_trend

a bool value. It means the mean discrepancy is zero or not.

alpha

a vector of roughness parameters in the kernel functions. It is only useful if the power exponential correlation function is used.

sd_proposal

a vector for the standard deviation of the proposal distribution.

discrepancy_type

A string for type of discrepancy funcation. It can be chosen from 'no-discrepancy', 'GaSP' or 'S-GaSP'.

simul_type

emulator

an S4 class of rgasp model from the RobustGaSP R Package.

emulator_type

a character to specify the type of emulator. 'rgasp' is for computer models with scalar-valued output and 'ppgasp' for computer models with vectorized output.

loc_index_emulator

a vector of the location index from the ppgasp emulator to output. Only useful for vectorized output computer model emulated by the ppgasp emulator.

math_model

a function for the math model to be calibrated.

S_2_f

Variance of the data. This term is useful when there are repeated experiments.

num_obs_all

Total number of observations. If there is no repeated experiment, this is equal to the number of observable inputs.

Value

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Posterior sampling for the model with a discrepancy function

Description

This function performs the posterior sampling for calibration parameters and other parameters in the model, assuming the GaSP or S-GaSP model for the discrepancy function.

Usage

post_sample_with_discrepancy(input, output, R0_list, kernel_type, p_theta, 
                             output_weights, par_cur, lambda_z,prior_par,theta_range,
                             S, thinning,X, have_trend, alpha,sd_proposal, 
                             discrepancy_type, simul_type,emulator,
                             emulator_type, loc_index_emulator,math_model,
                             S_2_f,num_obs_all)

Arguments

input

a matrix of observed inputs/design points of the experimental data.

output

a vector of experimental data.

R0_list

a List of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector.

kernel_type

p_theta

number of calibration parameters.

output_weights

a vector of the weights of the output.

par_cur

current value of the posterior samples.

lambda_z

a scalar parameter controling how close the math model to the reality in squared distance.

prior_par

a vector of prior parameters in the prior.

theta_range

a matrix for the range of the calibration parameters. The first column is the lower bound and the second column is the upper bound of the calibration parameters.

S

number of MCMC to run.

thinning

the ratio between the number of posterior samples and the number of recorded samples.

X

a matrix for the basis of the mean discrepancy.

have_trend

a bool value. It means the mean discrepancy is zero or not.

alpha

a vector of roughness parameters in the kernel functions. It is only useful if the power exponential correlation function is used.

sd_proposal

a vector for the standard deviation of the proposal distribution.

discrepancy_type

A string for type of discrepancy funcation. It can be chosen from 'no-discrepancy', 'GaSP' or 'S-GaSP'.

simul_type

emulator

an S4 class of rgasp object from the RobustGaSP R Package.

emulator_type

a character to specify the type of emulator. 'rgasp' is for computer models with scalar-valued output and 'ppgasp' for computer models with vectorized output.

loc_index_emulator

a vector of the location index from the ppgasp emulator to output. Only useful for vectorized output computer model emulated by the ppgasp emulator.

math_model

a function for the math model to be calibrated.

S_2_f

Variance of the data. This term is useful when there are repeated experiments.

num_obs_all

Total number of observations. If there is no repeated experiment, this is equal to the number of observable inputs.

Value

a list. The first element is a matrix of the posterior samples after burn-in. The second element is a vector of posterior values after burn-in. The third element is the number of times the proposed samples are accepted. The first value of the vector is the number of times that the proposed calibration parameters are accepted and the second value is the number of times of the proposed log inverse range parameter and the log nugget parameters are accepted. The fourth element is the number of times the proposed samples of the calibration parameters are outside the range of the calibration parameters.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Prediction for the robust calibration model

Description

Function to make prediction on Robust Calibration models after the rcalibration class has been constructed.

Usage

## S4 method for signature 'rcalibration'
predict(object, testing_input,X_testing=NULL,
                                 n_thinning=10,
                                 testing_output_weights=NULL,
                                 interval_est=NULL,interval_data=F,
                                 math_model=NULL,test_loc_index_emulator=NULL,...)

Arguments

object

an object of class rcalibration.

testing_input

a matrix containing the inputs where the predict is to perform prediction. To predict one observable input with multiple dimension, user should supply a row vector.

X_testing

a matrix of mean/trend for prediction.

n_thinning

number of points further thinning the MCMC posterior samples.

testing_output_weights

the weight of testing outputs.

interval_est

a vector for the the posterior credible interval. If interval_est is NULL, we do not compute the posterior credible interval. It can be specified as a vector of values ranging from zero to one. E.g. if interval_est=c(0.025, 0.975), the 95 posterior credible interval will be computed.

interval_data

a bool value to decide whether the experimental noise is included for computing the posterior credible interval.

math_model

a function for the math model to be calibrated.

test_loc_index_emulator

a vector of the location index from the ppgasp emulator to output. Only useful for vectorized output computer model emulated by the ppgasp emulator.

...

extra arguments to be passed to the function (not implemented yet).

Value

The returned value is a S4 CLass predictobj.rcalibration.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Examples

#------------------------------------------------------------------------------
# Example: an example used in Susie Bayarri et. al. 2007 Technometrics paper
#------------------------------------------------------------------------------
    

##reality
test_funct_eg1<-function(x){
  3.5*exp(-1.7*x)+1.5
}


##math model
math_model_eg1<-function(x,theta){
  5*exp(-x*theta) 
}


## noise observations (sampled from reality + independent Gaussian noises)
## each has 3 replicates
input=c(rep(.110,3),rep(.432,3),rep(.754,3),rep(1.077,3),rep(1.399,3),rep(1.721,3),
        rep(2.043,3),rep(2.366,3),rep(2.688,3),rep(3.010,3))
output=c(4.730,4.720,4.234,3.177,2.966,3.653,1.970,2.267,2.084,2.079,2.409,2.371,1.908,1.665,1.685,
         1.773,1.603,1.922,1.370,1.661,1.757,1.868,1.505,1.638,1.390,1.275,1.679,1.461,1.157,1.530)


## calculating the average or the stack data
n_stack=length(output)/3
output_stack=rep(0,n_stack)
input_stack=rep(0,n_stack)
for(j in 1:n_stack){
  output_stack[j]=mean(output[ ((j-1)*3+1):(3*j)])
  input_stack[j]=mean(input[ ((j-1)*3+1):(3*j)])
}
output_stack=as.matrix(output_stack)
input_stack=as.matrix(input_stack)
## plot the output and stack output
#plot(input,output,pch=16,col='red')
#lines(input_stack,output_stack,pch=16,col='blue',type='p')



## fit model using S-GaSP for the discrepancy
## one can change S and S_0 for the number of posterior and burn-in samples
## Normallly you may need a larger number of posterior sample
## you can set S=50000 and S_0=5000
## one may also change the sd of the proposal distribution using sd_proposal
model_sgasp=rcalibration(design=input_stack, observations=output_stack, p_theta=1,simul_type=1,
                         math_model=math_model_eg1,theta_range=matrix(c(0,10),1,2),
                         S=10000,S_0=2000,discrepancy_type='S-GaSP')

# one can  fit the GaSP model for discrepancy function by discrepancy_type='GaSP'
# one can  fit a model without the discrepancy function by discrepancy_type='no-discrepancy'

## posterior of the calibration parameter
#plot(model_sgasp@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
show(model_sgasp)

##

## test data set
testing_input=as.matrix(seq(0,6,0.02))

##perform prediction
prediction_sgasp=predict(model_sgasp,testing_input,math_model=math_model_eg1,
                         interval_est=c(0.025,0.975),interval_data=TRUE,
                         n_thinning =20 )

##real test output
testing_output=test_funct_eg1(testing_input)

##the prediction by S-GaSP
min_val=min(prediction_sgasp@mean,prediction_sgasp@interval,output,testing_output)
max_val=max(prediction_sgasp@mean,prediction_sgasp@interval,output,testing_output)

plot(testing_input,prediction_sgasp@mean,type='l',col='blue',xlab='x',ylab='y',
     ylim=c(min_val,max_val) )
lines(testing_input,prediction_sgasp@interval[,1],col='blue',lty=2)
lines(testing_input,prediction_sgasp@interval[,2],col='blue',lty=2)

lines(input,output,type='p')
lines(testing_input,prediction_sgasp@math_model_mean,col='blue',lty=3)

lines(testing_input,testing_output,type='l')

legend("topright", legend=c("reality", "predictive mean","95 percent posterior credible interval",
                            "predictive mean of the math model"),
                            col=c("black", "blue","blue","blue"), lty=c(1,1,2,3),cex=.6)


## MSE if the math model and discrepancy are used for prediction
mean((testing_output-prediction_sgasp@mean)^2)

## MSE if the math model is used for prediction 
mean((testing_output-prediction_sgasp@math_model_mean)^2)



##################################
#the example with a mean structure
##################################

##now let's fit  model with mean
model_sgasp_with_mean=rcalibration(design=input_stack, observations=output_stack,
                                   p_theta=1,X=matrix(1,dim(input_stack)[1],1),
                                   have_trend=TRUE,simul_type=1,
                                   math_model=math_model_eg1,
                                   theta_range=matrix(c(0,10),1,2),
                                   S=10000,S_0=2000,
                                   discrepancy_type='S-GaSP')

#posterior
#plot(model_sgasp_with_mean@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
show(model_sgasp_with_mean)

## test data set
testing_input=as.matrix(seq(0,6,0.02))


prediction_sgasp_with_mean=predict(model_sgasp_with_mean,testing_input, X_testing=matrix(1,dim
(testing_input)[1],1),
math_model=math_model_eg1,n_thinning = 50,
interval_est=c(0.025,0.975),interval_data=TRUE)


##plot for the S-GaSP 
##for this example, with a mean structure, it fits much better
min_val=min(prediction_sgasp_with_mean@mean,output,testing_output,
prediction_sgasp_with_mean@interval[,1])
max_val=max(prediction_sgasp_with_mean@mean,output,testing_output,
prediction_sgasp_with_mean@interval[,2])

plot(testing_input,prediction_sgasp_with_mean@mean,type='l',col='blue',xlab='x',
     ylab='y',ylim=c(min_val,max_val) )
#lines(testing_input,prediction_sgasp_with_mean@interval[,1],col='blue',lty=2)
#lines(testing_input,prediction_sgasp_with_mean@interval[,2],col='blue',lty=2)

lines(input,output,type='p')
lines(testing_input,prediction_sgasp_with_mean@math_model_mean,col='blue',lty=3)
lines(testing_input,prediction_sgasp_with_mean@interval[,1],col='blue',lty=2)
lines(testing_input,prediction_sgasp_with_mean@interval[,2],col='blue',lty=2)

lines(testing_input,testing_output,type='l')

legend("topright", legend=c("reality", "predictive mean", "predictive mean of the math model"),
       col=c("black", "blue","blue"), lty=c(1,1,3),cex=.6)


## MSE if the math model and discrepancy are used for prediction
mean((testing_output-prediction_sgasp_with_mean@mean)^2)

## MSE if the math model is used for prediction 
mean((testing_output-prediction_sgasp_with_mean@math_model_mean)^2)


## Not run: 
  #-------------------------------------------------------------
  #the example with the emulator
  #-------------------------------------------------------------
  
  n_design=80
  
  design_simul=matrix(runif(n_design*2),n_design,2)
  #library(lhs)
  #design_simul=maximinLHS(n=n_design,k=2)
  
  design_simul[,1]=6*design_simul[,1]   ##the first one is the observed input x
  design_simul[,2]=10*design_simul[,2]   ##the second one is the calibration parameter \theta
  
  output_simul=math_model_eg1(design_simul[,1],design_simul[,2])
  
  
  ##this is a little slow compared with the previous model
  
  model_sgasp_with_mean_emulator=rcalibration(design=input_stack, observations=output_stack,
                                              p_theta=1,simul_type=0, 
                                              have_trend=T,X=matrix(1,dim(input_stack)[1],1),
                                              input_simul=design_simul, output_simul=output_simul,
                                              theta_range=matrix(c(0,10),1,2),
                                              S=10000,S_0=2000,discrepancy_type='S-GaSP')
  
  ##now the output is a list
  show(model_sgasp_with_mean_emulator)
  
  ##here is the plot
  plot(model_sgasp_with_mean_emulator@post_sample[,4],type='l',xlab='num',ylab=expression(theta))   
  plot(model_sgasp_with_mean_emulator@post_value,type='l',xlab='num',ylab='posterior value')   
  
  
  prediction_sgasp_with_mean_emulator=predict(model_sgasp_with_mean_emulator,testing_input,
                                              X_testing=matrix(1,dim(testing_input)[1],1),
                                              interval_est=c(0.025,0.975),
                                              interval_data=TRUE)
  
  ##for this example, with a mean structure, it fits much better
  min_val=min(prediction_sgasp_with_mean_emulator@mean,output,testing_output,
              prediction_sgasp_with_mean_emulator@math_model_mean)
  max_val=max(prediction_sgasp_with_mean_emulator@mean,output,testing_output,
              prediction_sgasp_with_mean_emulator@math_model_mean)
  
  plot(testing_input,prediction_sgasp_with_mean_emulator@mean,type='l',col='blue',xlab='x',
       ylab='y',ylim=c(min_val,max_val) )
  #lines(testing_input,prediction_sgasp_with_mean@interval[,1],col='blue',lty=2)
  #lines(testing_input,prediction_sgasp_with_mean@interval[,2],col='blue',lty=2)
  
  lines(input,output,type='p')
  lines(testing_input,prediction_sgasp_with_mean_emulator@math_model_mean,col='blue',lty=3)
  
  lines(testing_input,testing_output,type='l')
  
  legend("topright", legend=c("reality", "predictive mean", "predictive mean of the math model"),
         col=c("black", "blue","blue"), lty=c(1,1,3),cex=.6)
  
  
  ## MSE if the math model and discrepancy are used for prediction
  mean((testing_output-prediction_sgasp_with_mean_emulator@mean)^2)
  
  ## MSE if the math model is used for prediction 
  mean((testing_output-prediction_sgasp_with_mean_emulator@math_model_mean)^2)


## End(Not run)

Prediction for the robust calibration model for multiple sources

Description

Function to make prediction on Robust Calibration models after the rcalibration class has been constructed for multiple sources.

Usage

## S4 method for signature 'rcalibration_MS'
predict_MS(object, testing_input,
                                       X_testing=as.list(rep(0,object@num_sources)),
                                       testing_output_weights=NULL, 
                                       n_thinning=10,
                                       interval_est=NULL,
                                       interval_data=rep(F,length(testing_input)),
                                       math_model=NULL,...)

Arguments

object

an object of class rcalibration_MS.

testing_input

a list of matrices containing the inputs where the predict_MS is to perform prediction. Each element of the list is a matrix of testing inputs for the corresponding source of data. The number of rows of the matrix is equal to the number of predictive outputs for the corresponding source.

X_testing

a list of matrices of mean/trend for prediction if specified. The number of rows of the matrix is equal to the number of predictive outputs for the corresponding source.

testing_output_weights

a list of vecots for the weight of testing outputs for multiple sources.

n_thinning

number of points further thinning the MCMC posterior samples.

interval_est

a list of vectors for the posterior predctive credible interval for multiple sources. If interval_est is NULL, we do not compute the posterior credible interval. It can be specified as a vector of values ranging from zero to one. E.g.

interval_data

a vector of bool values to decide whether the experimental noise is included for computing the posterior credible interval.

math_model

a list of functions for the math model to be calibrated for multiple sources.

...

extra arguments to be passed to the function (not implemented yet).

Value

The returned value is a S4 CLass predictobj.rcalibration.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

K. R. Anderson and M. P. Poland (2017), Abundant carbon in the mantle beneath Hawaii. Nature Geoscience, 10, 704-708.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Examples

#---------------------------------------------------------------------------------------------
# An example for calibrating and predicting mathematical models for data from multiple sources
#---------------------------------------------------------------------------------------------
    


library(RobustCalibration)


##reality
test_funct<-function(x){
  sin(pi*x/2)+2*cos(pi*x/2)
}


##math model from two sources
math_model_source_1<-function(x,theta){
  sin(theta*x) 
}

math_model_source_2<-function(x,theta){
  cos(theta*x) 
}

input1=seq(0,2,2/(10-1))
input2=seq(0,3,3/(10-1))
##
output1=test_funct(input1)+rnorm(length(input1), sd=0.01)
output2=test_funct(input2)+rnorm(length(input2), sd=0.02)

plot(input1, output1)
plot(input2, output2)



design=list()
design[[1]]=as.matrix(input1)
design[[2]]=as.matrix(input2)

observations=list()
observations[[1]]=output1
observations[[2]]=output2


p_theta=1


theta_range=matrix(0,p_theta,2)
theta_range[1,]=c(0, 8)  
simul_type=c(1,1)

math_model=list()

math_model[[1]]=math_model_source_1
math_model[[2]]=math_model_source_2


## calibrating two mathematical models for these two sources
model_sgasp=rcalibration_MS(design=design, observations=observations, p_theta=1,
                            simul_type=simul_type,math_model=math_model,
                            theta_range=theta_range, 
                            S=10000,S_0=2000,
                            discrepancy_type=rep('S-GaSP',length(design)))

#plot(model_sgasp@post_theta[,1],type='l')
mean(model_sgasp@post_theta[,1])

testing_input1=seq(0,2,2/(25-1))

testing_input2=seq(0,3,3/(25-1))

testing_input=list()
testing_input[[1]]=as.matrix(testing_input1)
testing_input[[2]]=as.matrix(testing_input2)


predict_sgasp=predict_MS(model_sgasp, testing_input, math_model=math_model)
  

testing_output1=test_funct(testing_input1)
testing_output2=test_funct(testing_input2)


plot(predict_sgasp@mean[[1]])
lines(testing_output1)

plot(predict_sgasp@mean[[2]])
lines(testing_output2)

Fast prediction when the test points lie on a 2D lattice.

Description

This function computes fast computation when the test points lie on a 2D lattice.

Usage

predict_separable_2dim(object, testing_input_separable,
X_testing=matrix(0,length(testing_input_separable[[1]])*
length(testing_input_separable[[2]]),1), n_thinning=10,  
interval_est = NULL,math_model=NULL,test_loc_index_emulator=NULL,...)

Arguments

object

an object of class rcalibration.

testing_input_separable

a list. The first element is a vector of the coordinate of the latitue and the second element is a vector of the coordinate of the longitude.

X_testing

a matrix of mean/trend for prediction.

n_thinning

number of points thinning the MCMC posterior samples.

math_model

a function for the math model to be calibrated.

test_loc_index_emulator

a vector of the location index from the ppgasp emulator to output. Only useful for vectorized output computer model emulated by the ppgasp emulator.

Value

The returned value is a S4 CLass predictobj.rcalibration.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Fast prediction when the test points lie on a 2D lattice for multiple sources of observations.

Description

This function computes fast computation when the test points lie on a 2D lattice for multiple sources of observations.

Usage

predict_separable_2dim_MS(object, testing_input_separable,
X_testing=NULL, math_model=NULL,...)

Arguments

object

an object of class rcalibration.

testing_input_separable

a list of two. In the first (outer) list, each list is a source of test input. Then in the second (interior) list, The first element is a vector of the coordinate of the latitue and the second element is a vector of the coordinate of the longitude.

X_testing

mean/trend for prediction where the defaul value is NULL.

math_model

a list of functions of math models to be calibrated.

Value

The returned value is a S4 CLass predictobj.rcalibration_MS.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

Mengyang Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.

M. Gu and L. Wang (2018) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. SIAM/ASA Journal on Uncertainty Quantification, 6, 1555-1583.

M. Gu (2019) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . Bayesian Analysis, 14, 857-885.

Predictive results for the Robust Calibration class

Description

S4 class for prediction after Robust rcalibration with or without the specification of the discrepancy model.

Objects from the Class

Objects of this class are created and initialized with the function predict that computes the prediction and the uncertainty quantification.

Slots

mean:: object of class vector. The predictive mean at testing inputs combing the mathematical model and discrepancy function.
math_model_mean:: object of class vector. The predictive mean at testing inputs using only the mathematical model (and the trend if specified).
math_model_mean_no_trend:: object of class vector. The predictive mean at testing inputs using only the mathematical model without the trend.
delta:: object of class vector. The predictive discrepancy function.
interval:: object of class matrix. The upper and lower predictive credible interval. If interval_data is TRUE in the predict.rcalibration, the experimental noise is included for computing the predictive credible interval.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

Predictive results for the Robust Calibration class

Description

S4 class for prediction after Robust rcalibration for multiple sources.

Objects from the Class

Objects of this class are created and initialized with the function predict_MS that computes the prediction and the uncertainty quantification.

Slots

mean:: object of class list. Each element is a vector of the predictive mean at testing inputs combing the mathematical model and discrepancy function for each source.
math_model_mean:: object of class list. Each element is a vector of the predictive mean at testing inputs using only the mathematical model (and the trend if specified).
math_model_mean_no_trend:: object of class list. Each element is a vector of the predictive mean at testing inputs using only the mathematical model without the trend for each source.
interval:: object of class list. Each element is a matrix of the upper and lower predictive credible interval. If interval_data is TRUE in the predict_MS, the experimental noise is included for computing the predictive credible interval.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

Setting up the robust Calibration model

Description

Setting up the Calibration model for estimating the parameters via MCMC with or without a discrepancy function.

Usage

  rcalibration(design, observations, p_theta=NULL, 
  X=matrix(0,dim(as.matrix(design))[1],1), 
  have_trend=FALSE, simul_type=1, input_simul=NULL,output_simul=NULL,simul_nug=FALSE,
  loc_index_emulator=NULL,math_model=NULL, theta_range=NULL,
  sd_proposal=NULL, 
  S=10000,S_0=2000,thinning=1, discrepancy_type='S-GaSP',
  kernel_type='matern_5_2', lambda_z=NA, a=1/2-dim(as.matrix(design))[2], b=1,
  alpha=rep(1.9,dim(as.matrix(design))[2]), 
  output_weights=rep(1,dim(as.matrix(design))[1]),method='post_sample',
  initial_values=NULL,num_initial_values=3,...)

Arguments

design

a matrix of observed inputs where each row is a row vector of observable inputs corresponding to one observation, and the number of field or experimental data is the total number of rows.

observations

a vector of field or experimental data.

p_theta

an integer about the number of parameters, which should be specified by the user.

X

a matrix of the mean/trend discrepancy between the reality and math model. The number of rows of X is equal to the number of observations. The default values are a vector of zeros.

have_trend

a bool value meaning whether we assume a mean/trend discrepancy function.

simul_type

an integer about the math model/simulator. If the simul_type is 0, it means we use RobustGaSP R package to build an emulator for emulation. If the simul_type is 1, it means the function of the math model is given by the user. When simul_type is 2 or 3, the mathematical model is the geophyiscal model for Kilauea Volcano. If the simul_type is 2, it means it is for the ascending mode InSAR data; if the simul_type is 3, it means it is for the descending mode InSAR data.

input_simul

an D x (p_x+p_theta) matrix of design for emulating the math model. It is only useful if simul_type is 0, meaning that we emulate the output of the math model.

output_simul

a D dimensional vector of the math model runs on the design (input_simul). It is only useful if simul_type is 0, meaning that we emulate the output of the math model.

simul_nug

a bool value meaning whether we have a nugget for emulating the math model/simulator. If the math model is stochastic, we often need a nugget. If simul_Nug is TRUE, it means we have a nugget for the emulator. If simul_Nug is FALSE, it means we do not have a nugget for the emulator.

loc_index_emulator

a vector of the location index from the ppgasp emulator to output. Only useful for vectorized output computer model emulated by the ppgasp emulator.

math_model

a function of the math model provided by the user. It is only useful if simul_type is 1, meaning that we know the math model and it can be computed fast. If the math model is computationally slow, one should set simul_type to be 0 to emulate the math model. One can input a function to define a math_model where the first input of the function is a vector of observable inputs and the second input is a vector of calibration parameters. The output of each function is a scalar.

theta_range

a p_theta x 2 matrix of the range of the calibration parameters. The first column is the lower bound and the second column is the upper bound. It should be specified by the user if the simul_type is 0.

sd_proposal

a vector of the standard deviation of the proposal distribution in MCMC. The default value of sd of the calibration parameter is 0.05 times theta_range. The rest is set to be 0.05.

S

number of posterior samples to run.

S_0

number of burn-in samples.

thinning

number of posterior samples to record.

discrepancy_type

characters about the type of the discrepancy. If it is 'no-discrepancy', it means no discrepancy function. If it is 'GaSP', it means the GaSP model for the discrepancy function. If it is 'S-GaSP', it means the S-GaSP model for the discrepancy function.

kernel_type

characters about the type of the discrepancy type of kernel. matern_3_2 and matern_5_2 are Matern kernel with roughness parameter 3/2 and 5/2 respectively. pow_exp is power exponential kernel with roughness parameter alpha. If pow_exp is to be used, one needs to specify its roughness parameter alpha.

lambda_z

a vector value about how close the math model to the reality in squared distance when the S-GaSP model is used for modeling the discrepancy.

a

a scalar of the prior parameter.

b

a scalar of the prior parameter.

alpha

a numeric parameter for the roughness in the kernel.

output_weights

a vector of the weights of the outputs.

method

characters for method of parameter estimation. If it is 'post_sample', the posterior sampling will be used. If it is 'mle', the maximum likelihood estimator will be used.

initial_values

either a vector or a matrix of initial values of parameters. If posterior sampling method is used, it needs to be vector of the initial values of the calibration parameters. If an optimization method is used, it can be a matrix of the calbiration parameters and kernel parameters (log inverse range parameters and the log nugget parameter) to be optimized numerically, where each row of the matrix contains a set of initial values.

num_initial_values

the number of initial values of the kernel parameters in optimization.

...

Extra arguments to be passed to the function (not implemented yet)

Value

rcalibration returns an S4 object of class rcalibration (see rcalibration-class).

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

K. R. Anderson and M. P. Poland (2017), Abundant carbon in the mantle beneath Hawaii. Nature Geoscience, 10, 704-708.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Examples

library(RobustCalibration)

#-------------------------------------------------------------
# an example with multiple local maximum of minimum in L2 loss
#-------------------------------------------------------------

## the reality 
test_funct_eg1<-function(x){
  x*cos(3/2*x)+x
}



## obtain 25 data from the reality plus a noise
set.seed(1)
## 10 data points are very small, one may want to add more data
n=15
input=seq(0,5,5/(n-1))
input=as.matrix(input)

output=test_funct_eg1(input)+rnorm(length(input),mean=0,sd=0.1)

num_obs=n=length(output)



## the math model 
math_model_eg1<-function(x,theta){
  sin(theta*x)+x  
}

##fit the S-GaSP model for the discrepancy
##one can choose the discrepancy_type to GaSP, S-GaSP or no discrepancy
##p_theta is the number of parameters to calibrate and user needs to specifiy 
##one may also want to change the number of posterior samples by change S and S_0
p_theta=1
model_sgasp=rcalibration(design=input, observations=output, p_theta=p_theta,simul_type=1,
                         math_model=math_model_eg1,theta_range=matrix(c(0,3),1,2)
                         ,S=10000,S_0=2000,discrepancy_type='S-GaSP')

##if the acceptance rate is too low or two high, one can adjust sd_proposal, e.g.
#model_sgasp=rcalibration(design=input, observations=output, p_theta=1,simul_type=1,
#                         sd_proposal=c(rep(0.02,p_theta),rep(0.2,dim(input)[2]),0.2)
#                         math_model=math_model_eg1,theta_range=matrix(c(0,3),1,2)
#                         ,S=10000,S_0=2000,discrepancy_type='S-GaSP')

##posterior samples of calibration parameter and value
plot(model_sgasp@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
plot(model_sgasp@post_value,type='l',xlab='num',ylab='posterior value')   


show(model_sgasp)




##one may want to fit a a model with an estimated baseline mean discrepancy by setting 
##X=matrix(1,dim(input_stack)[1],1),have_trend=TRUE

model_sgasp_with_mean=rcalibration(design=input, observations=output, p_theta=1,simul_type=1,
                                   X=matrix(1,dim(input)[1],1),have_trend=TRUE,
                                   math_model=math_model_eg1,theta_range=matrix(c(0,3),1,2),
                                   S=10000,S_0=2000,discrepancy_type='S-GaSP')

show(model_sgasp_with_mean)

##posterior samples of calibration parameter and value
plot(model_sgasp_with_mean@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
plot(model_sgasp_with_mean@post_value,type='l',xlab='num',ylab='posterior value')   


## Not run: 
  #-------------------------------------------------------------
  # an example with multiple local maximum of minimum in L2 loss
  # for combing the emulator
  #-------------------------------------------------------------
  
  ## the reality 
  test_funct_eg1<-function(x){
    x*cos(3/2*x)+x
  }
  
  ## obtain 20 data from the reality plus a noise
  set.seed(1)
  n=20
  input=seq(0,5,5/(n-1))
  input=as.matrix(input)
  
  output=test_funct_eg1(input)+rnorm(length(input),mean=0,sd=0.05)
  
  num_obs=n=length(output)
  
  ## the math model 
  math_model_eg1<-function(x,theta){
    sin(theta*x)+x  
  }
  
  ##let's build an emulator for the case if the math model is too slow
  
  # let's say we can only run the math model n_design times
  n_design=80
  
  design_simul=matrix(runif(n_design*2),n_design,2)
  design_simul[,1]=5*design_simul[,1]   ##the first one is the observed input x
  design_simul[,2]=3*design_simul[,2]   ##the second one is the calibration parameter 
  
  output_simul=math_model_eg1(design_simul[,1],design_simul[,2])
  
  
  
  ##this is a little slow compared with the previous model
  model_sgasp_emulator=rcalibration(design=input, observations=output, p_theta=1,simul_type=0, 
                                    input_simul=design_simul, output_simul=output_simul,
                                    theta_range=matrix(c(0,3),1,2),
                                    S=10000,S_0=2000,discrepancy_type='S-GaSP')
  
  ##now the output is a list
  show(model_sgasp_emulator)

  ##here is the plot
  plot(model_sgasp_emulator@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
  plot(model_sgasp_emulator@post_value,type='l',xlab='num',ylab='posterior value')   

## End(Not run)

Robust Calibration class

Description

S4 class for Robust rcalibration with or without the specification of the discrepancy model.

Objects from the Class

Objects of this class are created and initialized with the function rcalibration that computes the calculations needed for setting up the calibration and prediction.

Slots

p_x:: Object of class integer. The dimension of the observed inputs.
p_theta:: Object of class integer. The calibration parameters.
num_obs:: Object of class integer. The number of experimental observations.
input:: Object of class matrix with dimension n x p_x. The design of experiments.
output:: Object of class vector with dimension n x 1. The vector of the experimental observations.
X:: Object of class matrix of with dimension n x q. The mean/trend discrepancy basis function.
have_trend:: Object of class bool to specify whether the mean/trend discrepancy is zero. "TRUE" means it has zero mean discrepancy and "FALSE"" means the mean discrepancy is not zero.
q:: Object of class integer. The number of basis functions of the mean/trend discrepancy.
R0:: Object of class list of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector.
kernel_type:: A character to specify the type of kernel to use.
alpha:: Object of class vector. Each element is the parameter for the roughness for each input coordinate in the kernel.
theta_range:: A matrix for the range of the calibration parameters.
lambda_z:: Object of class vector about how close the math model to the reality in squared distance when the S-GaSP model is used for modeling the discrepancy.
S:: Object of class integer about how many posterior samples to run.
S_0:: Object of class integer about the number of burn-in samples.
prior_par:: Object of class vector about prior parameters.
output_weights:: Object of class vector about the weights of the experimental data.
sd_proposal:: Object of class vector about the standard deviation of the proposal distribution.
discrepancy_type:: Object of class character about the discrepancy. If it is 'no-discrepancy', it means no discrepancy function. If it is 'GaSP', it means the GaSP model for the discrepancy function. If it is 'S-GaSP', it means the S-GaSP model for the discrepancy function.
simul_type:: Object of class integer about the math model/simulator. If the simul_type is 0, it means we use RobustGaSP R package to build an emulator for emulation. If the simul_type is 1, it means the function of the math model is given by the user. When simul_type is 2 or 3, the mathematical model is the geophyiscal model for Kilauea Volcano. If the simul_type is 2, it means it is for the ascending mode InSAR data; if the simul_type is 3, it means it is for the descending mode InSAR data.
emulator_rgasp:: An S4 class of rgasp from the RobustGaSP package.
emulator_ppgasp:: An S4 class of ppgasp from the RobustGaSP package.
post_sample:: Object of class matrix for the posterior samples after burn-in.
post_value:: Object of class vector for the posterior values after burn-in.
accept_S:: Object of class vector for the number of proposed samples of the calibation parameters are accepted in MCMC. The first value is the number of proposed calibration parameters are accepted in MCMC. The second value is the number of proposed range and nugget parameters are accepted.
count_boundary:: Object of class vector for the number of proposed samples of the calibation parameters are outside the range and they are rejected directly.
have_replicates:: Object of class bool for having repeated experiments (replicates) or not.
num_replicates:: Object of class vector for the number of replicates at each observable input.
thinning:: Object of class integer for the ratio between the number of posterior samples and the number of samples to be recorded.
S_2_f:: Object of class numeric for the variance of the field observations.
num_obs_all:: Object of class integer for the total number of field observations.
method:: Object of class character for posterior sampling or maximum likelihood estimation.
initial_values:: Object of class matrix for initial starts of kernel parameters in maximum likelihood estimation.
param_est:: Object of class vector for estimated range and nugget parameter in parameter estimation.
opt_value:: Object of class numeric for optimized likelihood or loss function.
emulator_type:: Object of class character for the type of emulator. 'rgasp' means scalar-valued emulator and 'ppgasp' means vectorized emulator.
loc_index_emulator:: Object of class vector for location index to output in the ppgasp emulator for computer models with vectorized output.

Methods

show: Prints the main slots of the object.
predict: See predict.
predict_separable_2dim: See predict_separable_2dim.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Setting up the robust Calibration model for multiple sources data

Description

Setting up the Calibration model for estimating the parameters via MCMC for multiple sources.

Usage

  rcalibration_MS(design, observations, p_theta=NULL, index_theta=NULL,
                  X=as.list(rep(0,length(design))), 
                  have_trend=rep(FALSE,length(design)),
                  simul_type=rep(1, length(design)),
                  input_simul=NULL, output_simul=NULL,
                  simul_nug=rep(FALSE,length(design)),loc_index_emulator=NULL,
                  math_model=NULL, 
                  theta_range=NULL, 
                  sd_proposal_theta=NULL, 
                  sd_proposal_cov_par=NULL,
                  S=10000,S_0=2000, thinning=1,measurement_bias=FALSE, 
                  shared_design=NULL,have_measurement_bias_recorded=F,
                            shared_X=0,have_shared_trend=FALSE,
                  discrepancy_type=rep('S-GaSP',length(design)+measurement_bias),
                  kernel_type=rep('matern_5_2',length(design)+measurement_bias),
                  lambda_z=as.list(rep(NA,length(design)+measurement_bias)),
                  a=NULL,b=NULL,alpha=NULL,
                  output_weights=NULL,...)

Arguments

design

a list of observed inputs from multiple sources. Each element of the list is a matrix of observable inputs, where each row is a row vector of observable inputs corresponding to one observation and the number of field or experimental data is the total number of rows.

observations

a list of experimental data from multiple sources. Each element is a vector of observations.

index_theta

a list of vectors for the index of calibration parameter contained in each source.

p_theta

an integer about the number of parameters, which should be specified by the user.

X

a list of matrices of the mean/trend discrepancy between the reality and math model for multiple sources.

have_trend

a vector of bool value meaning whether we assume a mean/trend discrepancy function.

simul_type

a vector of integer about the math model/simulator for multiple sources. If the simul_type is 0, it means we use RobustGaSP R package to build an emulator for emulation. If the simul_type is 1, it means the function of the math model is given by the user. When simul_type is 2 or 3, the mathematical model is the geophyiscal model for Kilauea Volcano. If the simul_type is 2, it means it is for the ascending mode InSAR data; if the simul_type is 3, it means it is for the descending mode InSAR data.

input_simul

a list of matices, each having dimension D x (p_x+p_theta) being the design for emulating the math model. It is only useful if the ith value of simul_type is 0 for the ith source, meaning that we emulate the output of the math model.

output_simul

a list of vectors, each having dimension D x 1 being the math model outputs on the design (input_simul). It is only useful if the ith value of simul_type is 0 for the ith source, meaning that we emulate the output of the math model.

simul_nug

a vectors of bool values meaning whether we have a nugget for emulating the math model/simulator for this source. If the math model is stochastic, we often need a nugget. If simul_Nug is TRUE, it means we have a nugget for the emulator. If simul_Nug is FALSE, it means we do not have a nugget for the emulator.

loc_index_emulator

a list for location index to output in the ppgasp emulator for computer models with vectorized output.

math_model

a list of functions of the math models provided by the user for multiple sources. It is only useful if simul_type is 1, meaning that we know the math model and it can be computed fast. If the math model is computationally slow, one should set simul_type to be 0 to emulate the math model. If defined, each element of the list is a function of math models, where the first input of the function is a vector of observable inputs and the second input is a vector of calibration parameters. The output of each function is a scalar. Each function corresponds to one source of data.

theta_range

sd_proposal_theta

a vector of the standard deviation of the proposal distribution for the calibration parameters in MCMC. The default value of sd of the calibration parameter is 0.05 times theta_range.

sd_proposal_cov_par

a list of vectors of the standard deviation of the proposal distribution for range and nugget parameters in MCMC for each source.

S

an integer about about how many posterior samples to run.

S_0

an integer about about the number of burn-in samples.

thinning

the ratio between the number of posterior samples and the number of recorded samples.

measurement_bias

containing measurement bias or not.

shared_design

A matrix for shared design across different sources of data used when measurement bias exists.

have_measurement_bias_recorded

A bool value whether we record measurement bias or not.

shared_X

A matrix of shared trend when measurement bias exists.

have_shared_trend

A bool value whether we have shared trend when measurement bias exist.

discrepancy_type

a vector of characters about the type of the discrepancy for each source. If it is 'no-discrepancy', it means no discrepancy function. If it is 'GaSP', it means the GaSP model for the discrepancy function. If it is 'S-GaSP', it means the S-GaSP model for the discrepancy function.

kernel_type

a vector of characters about the type of kernel for each data source. matern_3_2 and matern_5_2 are Matern kernel with roughness parameter 3/2 and 5/2 respectively. pow_exp is power exponential kernel with roughness parameter alpha. If pow_exp is to be used, one needs to specify its roughness parameter alpha.

lambda_z

a vector numeric values about how close the math model to the reality in squared distance when the S-GaSP model is used for modeling the discrepancy for each source.

a

a vector of the prior parameter for multiple sources.

b

a vector of the prior parameter for multiple sources.

alpha

a list of vectors of roughness parameters in the kernel for multiple sources.

output_weights

a list of vectors of the weights of the outputs for multiple sources.

...

Extra arguments to be passed to the function (not implemented yet).

Value

rcalibration_MS returns an S4 object of class rcalibration_MS (see rcalibration_MS-class).

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

K. R. Anderson and M. P. Poland (2017), Abundant carbon in the mantle beneath Hawaii. Nature Geoscience, 10, 704-708.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Examples

#------------------------------------------------------------------------------
# An example for calibrating mathematical models for data from multiple sources
#------------------------------------------------------------------------------

library(RobustCalibration)


##reality
test_funct<-function(x){
  sin(pi*x/2)+2*cos(pi*x/2)
}


##math model from two sources
math_model_source_1<-function(x,theta){
  sin(theta*x) 
}

math_model_source_2<-function(x,theta){
  cos(theta*x) 
}

input1=seq(0,2,2/(10-1))
input2=seq(0,3,3/(15-1))
##
output1=test_funct(input1)+rnorm(length(input1), sd=0.01)
output2=test_funct(input2)+rnorm(length(input2), sd=0.02)

plot(input1, output1)
plot(input2, output2)



design=list()
design[[1]]=as.matrix(input1)
design[[2]]=as.matrix(input2)

observations=list()
observations[[1]]=output1
observations[[2]]=output2


p_theta=1


theta_range=matrix(0,p_theta,2)
theta_range[1,]=c(0, 8)  
simul_type=c(1,1)

math_model=list()

math_model[[1]]=math_model_source_1
math_model[[2]]=math_model_source_2


## calibrating two mathematical models for these two sources
model_sgasp=rcalibration_MS(design=design, observations=observations, p_theta=1,
                            simul_type=simul_type,math_model=math_model,
                            theta_range=theta_range, 
                            S=10000,S_0=2000,
                            discrepancy_type=rep('S-GaSP',length(design)))

plot(model_sgasp@post_theta[,1],type='l')
mean(model_sgasp@post_theta[,1])

Robust Calibration for multiple sources class

Description

S4 class for multiple sources Robust rcalibration with or without the specification of the discrepancy model.

Objects from the Class

Objects of this class are created and initialized with the function rcalibration_MS that computes the prediction after calibrating the mathematical models from multiple sources.

Slots

num_sources:: Object of class integer. The number of sources.
p_x:: Object of class vector. Each element is the dimension of the observed inputs in each source.
p_theta:: Object of class integer. The number of calibration parameters.
num_obs:: Object of class vector.Each element is the number of experimental observations of each source.
index_theta:: Object of class list. The each element is a vector of the index of calibration parameters (theta) contained in each source.
input:: Object of class list. Each element is a matrix of the design of experiments in each source with dimension n_i x p_{x,i}, for i=1,...,num_sources.
output:: Object of class list. Each element is a vector of the experimental observations in each source with dimension n_i x 1, for i=1,...,num_sources.
X:: Object of class list. Each element is a matrix of the mean/trend discrepancy basis function in each source with dimension n_i x q_i, for i=1,...,num_sources.
have_trend:: Object of class vector. Each element is a bool to specify whether the mean/trend discrepancy is zero in each source. "TRUE" means it has zero mean discrepancy and "FALSE"" means the mean discrepancy is not zero.
q:: Object of class vector. Each element is integer of the number of basis functions of the mean/trend discrepancy in each source.
R0:: Object of class list. Each element is a list of matrices where the j-th matrix is an absolute difference matrix of the j-th input vector in each source.
kernel_type:: Object of class vector. Each element is a character to specify the type of kernel to use in each source.
alpha:: Object of class list. Each element is a vector of parameters for the roughness parameters in the kernel in each source.
theta_range:: A matrix for the range of the calibration parameters.
lambda_z:: Object of class vector. Each element is a numeric value about how close the math model to the reality in squared distance when the S-GaSP model is used for modeling the discrepancy in each source.
S:: Object of class integer about how many posterior samples to run.
S_0:: Object of class integer about the number of burn-in samples.
prior_par:: Object of class list. Each element is a vector about prior parameters.
output_weights:: Object of class list. Each element is a vector about the weights of the experimental data.
sd_proposal_theta:: Object of class vector about the standard deviation of the proposal distribution for the calibration parameters.
sd_proposal_cov_par:: Object of class list. Each element is a vector about the standard deviation of the proposal distribution for the calibration parameters in each source.
discrepancy_type:: Object of class vector. Each element is a character about the type of the discrepancy in each source. If it is 'no-discrepancy', it means no discrepancy function. If it is 'GaSP', it means the GaSP model for the discrepancy function. If it is 'S-GaSP', it means the S-GaSP model for the discrepancy function.
simul_type:: Object of class vector. Each element is an integer about the math model/simulator. If the simul_type is 0, it means we use RobustGaSP R package to build an emulator for emulation. If the simul_type is 1, it means the function of the math model is given by the user. When simul_type is 2 or 3, the mathematical model is the geophyiscal model for Kilauea Volcano. If the simul_type is 2, it means it is for the ascending mode InSAR data; if the simul_type is 3, it means it is for the descending mode InSAR data.
emulator_rgasp:: Object of class list. Each element is an S4 class of rgasp from the RobustGaSP package in each source.
emulator_ppgasp:: Object of class list. Each element is an S4 class of ppgasp from the RobustGaSP package in each source.
post_theta:: Object of class matrix for the posterior samples of the calibration parameters after burn-in.
post_individual_par:: Object of class list. Each element is a matrix for the posterior samples after burn-in in each source.
post_value:: Object of class vector for the posterior values after burn-in.
accept_S_theta:: Object of class numerical for the number of proposed samples of the calibration parameters are accepted in MCMC.
accept_S_beta:: Object of class vector for the number of proposed samples of the range and nugget parameters in each source are accepted in MCMC.
count_boundary:: Object of class vector for the number of proposed samples of the calibation parameters are outside the range and they are rejected directly.
have_measurement_bias_recorded:: Object of class bool for whether measurement bias will be recorded or not.
measurement_bias:: Object of class bool for whether measurement bias exists or not.
post_delta:: Object of class matrix of samples of model discrepancy.
post_measurement_bias:: Object of class list of samples of measurement_bias if measurement bias is chosen to be recorded.
thinning:: Object of class integer for the ratio between the number of posterior samples and the number of samples to be recorded.
emulator_type:: Object of class vector for the type of emulator for each source of data. 'rgasp' means scalar-valued emulator and 'ppgasp' means vectorized emulator.
loc_index_emulator:: Object of class list for location index to output in ghe ppgasp emulator for computer models with vectorized output.

Methods

predict_MS: See predict_MS.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Product correlation matrix with the product form

Description

Function to construct the product correlation matrix with the product form. This is imported from the RobustGaSP package.

Usage

separable_kernel(R0, beta, kernel_type, alpha)

Arguments

R0

A List of matrix where the j-th matrix is an absolute difference matrix of the j-th input vector.

beta

The range parameters.

kernel_type

A vector specifying the type of kernels of each coordinate of the input. matern_3_2 and matern_5_2 are Matern correlation with roughness parameter 3/2 and 5/2 respectively. pow_exp is power exponential correlation with roughness parameter alpha. If pow_exp is to be used, one needs to specify its roughness parameter alpha. The default choice is matern_5_2. The periodic_gauss means the Gaussian kernel with periodic folding method with be used. The periodic_exp means the exponential kernel with periodic folding method will be used.

alpha

Roughness parameters in the kernel functions.

Value

The product correlation matrix with the product form.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

Show an Robust Calibration object.

Description

Function to print the Robust Calibration model after the rcalibration class has been constructed.

Usage

## S4 method for signature 'rcalibration'
show(object)

Arguments

object

an object of class rcalibration.

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Examples


##-------------------------------------------------
#A simple example where the math model is not biased
##-------------------------------------------------
## the reality 
test_funct_eg1<-function(x){
  sin(pi/2*x)
}



## obtain 15 data from the reality plus a noise
set.seed(1)
## 10 data points are very small, one may want to add more data
n=15
input=seq(0,4,4/(n-1))
input=as.matrix(input)

output=test_funct_eg1(input)+rnorm(length(input),mean=0,sd=0.2)

## plot input and output 
#plot(input,output)
#num_obs=n=length(output)



## the math model 
math_model_eg1<-function(x,theta){
  sin(theta*x)  
}

##fit the S-GaSP model for the discrepancy
##one can choose the discrepancy_type to GaSP, S-GaSP or no discrepancy
##p_theta is the number of parameters to calibrate and user needs to specifiy 
##one may also want to change the number of posterior samples by change S and S_0
##one may change sd_proposal for the standard derivation of the proposal distribution
## one may also add a mean by setting X=... and have_trend=TRUE
model_sgasp=rcalibration(design=input, observations=output, p_theta=1,simul_type=1,
                         math_model=math_model_eg1,theta_range=matrix(c(0,3),1,2)
                         ,S=10000,S_0=2000,discrepancy_type='S-GaSP')


##posterior samples of calibration parameter and value
## the value is 
plot(model_sgasp@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
plot(model_sgasp@post_value,type='l',xlab='num',ylab='posterior value')   


show(model_sgasp)


#-------------------------------------------------------------
# an example with multiple local maximum of minimum in L2 loss
#-------------------------------------------------------------

## the reality 
test_funct_eg1<-function(x){
  x*cos(3/2*x)+x
}



## obtain 15 data from the reality plus a noise
set.seed(1)
n=15
input=seq(0,5,5/(n-1))
input=as.matrix(input)

output=test_funct_eg1(input)+rnorm(length(input),mean=0,sd=0.05)

num_obs=n=length(output)


## the math model 
math_model_eg1<-function(x,theta){
  sin(theta*x)+x  
}

## fit the S-GaSP model for the discrepancy

model_sgasp=rcalibration(design=input, observations=output, p_theta=1,simul_type=1,
                         math_model=math_model_eg1,theta_range=matrix(c(0,3),1,2),
                         discrepancy_type='S-GaSP')


## posterior samples 
plot(model_sgasp@post_sample[,1],type='l',xlab='num',ylab=expression(theta))   
show(model_sgasp)

Robust Calibration of Imperfect Mathematical Models

Description

Details

Author(s)

References

See Also

Examples

Determine whether we accept the proposed poserior sample at one step of MCMC.

Description

Usage

Arguments

Value

Author(s)

References

Cholesky decomposition of a symmetric matrix.

Description

Usage

Arguments

Value

Author(s)

Cholesky decomposition of the correlation matrix in GaSP.

Description

Usage

Arguments

Value

Author(s)

Cholesky decomposition of the covariance matrix in S-GaSP.

Description

Usage

Arguments

Value

Author(s)

Produce the inversion of the covariances of the model discrepancy and the measurement bias.

Description

Usage

Arguments

Value

Author(s)

Natural Logorithm of the posterior.

Description

Usage

Arguments

Value

Author(s)

References

Natural Logorithm of the posterior of the discrepancy in model calibration with multiple sources with measurement bias.

Description

Usage

Arguments

Value

Author(s)

References

Natural Logorithm of the posterior with no discrepancy function.

Description

Usage

Arguments

Value

Author(s)

References

A geophysical model for the ground deformation in Kilauea.

Description

Usage

Arguments

Value

Author(s)

References

Sample the model discrepancy.

Description

Usage

Arguments

Value

Author(s)

References

Sample the variance and mean parameters.

Description

Usage

Arguments

Value

Author(s)

References