rapid_models.gp_diagnostics.cv

Module Contents

Functions

multifold(→ Union[Tuple[None, None, None], ...)

Compute multifold CV residuals for GP regression with noiseless

multifold_cholesky(...)

Compute multifold CV residuals from the Cholesky factor L of the

loo(→ Union[Tuple[None, None, None], ...)

Compute Leave-One-Out (LOO) residuals for GP regression with noiseless

loo_cholesky(...)

Compute Leave-One-Out (LOO) residuals from the Cholesky factor L of the

check_folds_indices(folds, n_max)

Check that the list of index subsets (list of lists) is valid

check_lower_triangular(arr[, argname])

Check that the argument is a 2d numpy array which is lower triangular

check_numeric_array(arr, dim[, argname])

Check that the argument is a numpy array of correct dimension

_multifold_inv(K, Y_train, folds)

Compute multifold cv residuals using matrix inverse (for testing)

rapid_models.gp_diagnostics.cv.multifold(K: nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], Y_train: nptyping.NDArray[nptyping.Shape[N], nptyping.Float], folds: List[List[int]], noise_variance: float = 0.0, check_args: bool = True) Union[Tuple[None, None, None], Tuple[nptyping.NDArray[nptyping.Shape[N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N], nptyping.Float]]]

Compute multifold CV residuals for GP regression with noiseless observations (noise_variance = 0) or with i.i.d. Gaussian noise of fixed variance. Residuals are defined as observed minus predicted.

Parameters:
  • K (2d array) – GP prior covariance matrix

  • Y_train (array) – training observations

  • folds (list of lists) – The index subsets

  • noise_variance (float) – variance of the observational noise. Set noise_variance = 0 for noiseless observations

  • check_args (bool) – Check (assert) that arguments are well-specified before computation

Returns:
  • mean – Mean of CV residuals

  • cov – Covariance of CV residuals

  • residuals_transformed – The residuals transformed to the standard normal space

This function just calls ‘multifold_cholesky()’ with the appropriate Cholesky factor. It is based on the formulation derived in:

[D. Ginsbourger and C. Schaerer (2021). Fast calculation of Gaussian Process multiple-fold crossvalidation residuals and their covariances. arXiv:2101.03108]
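The block-wise formulas from the cited paper can be illustrated with plain NumPy. The following is a minimal sketch, not the library's implementation: it assumes a zero prior mean, forms the precision matrix explicitly, and uses the hypothetical helper name `multifold_residuals_direct`. The residuals of the first fold are compared against a brute-force refit on the remaining points.

```python
import numpy as np


def multifold_residuals_direct(K_obs, y, folds):
    """Multifold CV residuals and their covariance from Q = inv(K_obs)."""
    n = len(y)
    Q = np.linalg.inv(K_obs)
    # D holds inv(Q[I, I]) in the rows/columns of each fold I, zeros elsewhere
    D = np.zeros((n, n))
    for fold in folds:
        block = np.ix_(fold, fold)
        D[block] = np.linalg.inv(Q[block])
    mean = D @ (Q @ y)  # residual of fold I: inv(Q[I, I]) @ (Q @ y)[I]
    cov = D @ Q @ D     # covariance between the residuals of all folds
    return mean, cov


rng = np.random.default_rng(0)
X = rng.uniform(size=(6, 1))
K = np.exp(-0.5 * (X - X.T) ** 2 / 0.3**2)  # squared-exponential prior covariance
K_obs = K + 0.01 * np.eye(6)                # corresponds to noise_variance = 0.01
y = rng.normal(size=6)
folds = [[0, 1], [2, 3], [4, 5]]
mean, cov = multifold_residuals_direct(K_obs, y, folds)

# Brute-force check for the first fold: condition on the other points and predict
test, train = folds[0], folds[1] + folds[2]
pred = K_obs[np.ix_(test, train)] @ np.linalg.solve(K_obs[np.ix_(train, train)], y[train])
brute_force = y[test] - pred
```

The explicit inverse keeps the sketch short; the documented functions work from a Cholesky factor instead.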

rapid_models.gp_diagnostics.cv.multifold_cholesky(L: nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], Y_train: nptyping.NDArray[nptyping.Shape[N], nptyping.Float], folds: List[List[int]], check_args: bool = True) Tuple[nptyping.NDArray[nptyping.Shape[N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N], nptyping.Float]]

Compute multifold CV residuals from the Cholesky factor L of the observation covariance matrix and the training data Y_train (residual = observed - predicted)

Parameters:
  • L (2d array) – lower triangular Cholesky factor of covariance matrix (L L.T = covariance matrix)

  • Y_train (array) – training observations

  • folds (list of lists) – The index subsets

  • check_args (bool) – Check (assert) that arguments are well-specified before computation

Returns:
  • mean – Mean of CV residuals

  • cov – Covariance of CV residuals

  • residuals_transformed – The residuals transformed to the standard normal space

Note:
  • The matrix K = L L.T is the covariance matrix of the predicted observations Y_train

  • For observations including Gaussian noise with fixed variance v, the matrix L L.T is K + v*I, where K[i, j] is the prior covariance of the latent GP between the i-th and j-th training locations

This implementation uses the Cholesky factor instead of the inverse covariance (precision) matrix, but is otherwise equivalent to the formulas derived in

[D. Ginsbourger and C. Schaerer (2021). Fast calculation of Gaussian Process multiple-fold crossvalidation residuals and their covariances. arXiv:2101.03108]
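To make the relation between the prior covariance, the noise variance and the factor L concrete, here is a small NumPy sketch. The kernel, the inputs and the value of `v` are illustrative assumptions; only the identity L L.T = K + v*I comes from the note above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
X = np.linspace(0.0, 1.0, n)[:, None]
K_latent = np.exp(-0.5 * (X - X.T) ** 2 / 0.2**2)  # prior covariance of the latent GP
v = 0.05                                           # assumed fixed noise variance
K_obs = K_latent + v * np.eye(n)                   # covariance of the noisy observations

# Lower triangular factor with L @ L.T == K_obs
L = np.linalg.cholesky(K_obs)

# Solve K_obs @ alpha = y through the two factors instead of forming inv(K_obs).
# np.linalg.solve is a general solver; scipy.linalg.solve_triangular would
# exploit the triangular structure.
y = rng.normal(size=n)
z = np.linalg.solve(L, y)
alpha = np.linalg.solve(L.T, z)
```

A factor built this way has the form expected for the first argument of `multifold_cholesky(L, Y_train, folds)` as documented above.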

rapid_models.gp_diagnostics.cv.loo(K: nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], Y_train: nptyping.NDArray[nptyping.Shape[N], nptyping.Float], noise_variance: float = 0.0, check_args: bool = True) Union[Tuple[None, None, None], Tuple[nptyping.NDArray[nptyping.Shape[N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N], nptyping.Float]]]

Compute Leave-One-Out (LOO) residuals for GP regression with noiseless observations (noise_variance = 0) or with i.i.d. Gaussian noise of fixed variance. Residuals are defined as observed minus predicted. This function just calls ‘loo_cholesky()’ with the appropriate Cholesky factor.

Parameters:
  • K (2d array) – GP prior covariance matrix

  • Y_train (array) – training observations

  • noise_variance (float) – variance of the observational noise. Set noise_variance = 0 for noiseless observations

  • check_args (bool) – Check (assert) that arguments are well-specified before computation

Returns:
  • mean – Mean of LOO residuals

  • cov – Covariance of LOO residuals

  • residuals_transformed – The residuals transformed to the standard normal space
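The third return value is described as the residuals transformed to the standard normal space. One common way to do this, assumed here and not confirmed by this page, is to whiten the residuals with the Cholesky factor of their covariance. The sketch below computes LOO residuals and their covariance directly from the precision matrix (zero prior mean assumed) and then applies that whitening step.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
X = np.linspace(0.0, 1.0, n)[:, None]
K_obs = np.exp(-0.5 * (X - X.T) ** 2 / 0.2**2) + 0.05 * np.eye(n)
y = rng.normal(size=n)

Q = np.linalg.inv(K_obs)           # precision matrix
d = 1.0 / np.diag(Q)
mean = d * (Q @ y)                 # LOO residuals
cov = d[:, None] * Q * d[None, :]  # their covariance, D @ Q @ D with D = diag(d)

# Whitening (assumption): with cov = C @ C.T, the vector inv(C) @ mean has
# identity covariance when the GP model is correct.
C = np.linalg.cholesky(cov)
residuals_transformed = np.linalg.solve(C, mean)
```

If the model is adequate, the transformed residuals should then look like independent standard normal draws, which is what makes them convenient for diagnostic plots.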

rapid_models.gp_diagnostics.cv.loo_cholesky(L: nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], Y_train: nptyping.NDArray[nptyping.Shape[N], nptyping.Float], check_args: bool = True) Tuple[nptyping.NDArray[nptyping.Shape[N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], nptyping.NDArray[nptyping.Shape[N], nptyping.Float]]

Compute Leave-One-Out (LOO) residuals from the Cholesky factor L of the observation covariance matrix and the training data Y_train (residual = observed - predicted)

Parameters:
  • L (2d array) – lower triangular Cholesky factor of covariance matrix (L L.T = covariance matrix)

  • Y_train (array) – training observations

  • check_args (bool) – Check (assert) that arguments are well-specified before computation

Returns:
  • mean – Mean of LOO residuals

  • cov – Covariance of LOO residuals

  • residuals_transformed – The residuals transformed to the standard normal space

Note:
  • The matrix K = L L.T is the covariance matrix of the predicted observations Y_train

  • For observations including Gaussian noise with fixed variance v, the matrix L L.T is K + v*I, where K[i, j] is the prior covariance of the latent GP between the i-th and j-th training locations

This implementation uses the Cholesky factor instead of the inverse covariance (precision) matrix, but is otherwise equivalent to the formulas derived in

[O. Dubrule. Cross validation of kriging in a unique neighborhood. Journal of the International Association for Mathematical Geology, 15 (6):687-699, 1983.]
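The closed-form LOO residual, e_i = (Q y)_i / Q_ii with Q the inverse of the observation covariance, can be evaluated from the Cholesky factor alone. The sketch below is one possible way to do that with NumPy, assuming a zero prior mean; the helper name `loo_residuals_from_cholesky` is hypothetical and the code is not taken from the library. One residual is checked against an explicit refit without that point.

```python
import numpy as np


def loo_residuals_from_cholesky(L, y):
    """LOO residuals e_i = (Q y)_i / Q_ii, with Q = inv(L @ L.T)."""
    n = len(y)
    L_inv = np.linalg.solve(L, np.eye(n))  # inverse of the triangular factor
    q_diag = np.sum(L_inv**2, axis=0)      # diagonal of Q = L_inv.T @ L_inv
    Qy = L_inv.T @ (L_inv @ y)             # Q @ y without forming Q
    return Qy / q_diag


rng = np.random.default_rng(3)
n = 6
X = rng.uniform(size=(n, 1))
K_obs = np.exp(-0.5 * (X - X.T) ** 2 / 0.3**2) + 0.01 * np.eye(n)
y = rng.normal(size=n)

L = np.linalg.cholesky(K_obs)
residuals = loo_residuals_from_cholesky(L, y)

# Brute-force check for point i: predict y[i] from all other points
i = 2
train = [j for j in range(n) if j != i]
pred = K_obs[i, train] @ np.linalg.solve(K_obs[np.ix_(train, train)], y[train])
brute_force = y[i] - pred
```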

rapid_models.gp_diagnostics.cv.check_folds_indices(folds: List[List[int]], n_max: int)

Check that the list of index subsets (list of lists) is valid

Parameters:
  • folds (list of lists) – The index subsets.

  • n_max (int) – Total number of indices.

Raises:

AssertionError – if ‘folds’ does not represent the indices 0, ..., n_max-1 split into non-overlapping subsets
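The condition being checked can be sketched in a few lines of plain Python. The helper `folds_are_valid` below is hypothetical and returns a boolean instead of raising, so it only illustrates the rule, not the library's own check.

```python
def folds_are_valid(folds, n_max):
    """True if the folds together contain each of 0, ..., n_max-1 exactly once."""
    flat = [index for fold in folds for index in fold]
    # Sorting and comparing with range(n_max) rejects missing, repeated
    # and out-of-range indices in a single comparison.
    return sorted(flat) == list(range(n_max))


ok = folds_are_valid([[0, 2], [1, 3]], 4)       # complete and non-overlapping
overlap = folds_are_valid([[0, 1], [1, 2]], 3)  # index 1 appears twice
missing = folds_are_valid([[0], [2]], 3)        # index 1 is missing
```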

rapid_models.gp_diagnostics.cv.check_lower_triangular(arr: Union[nptyping.NDArray[nptyping.Shape[N, N], nptyping.Float], Any], argname: str = 'arr')

Check that the argument is a 2d numpy array which is lower triangular

Parameters:

  • arr (object) – the argument to check

Raises:

AssertionError – if ‘arr’ does not represent a lower triangular matrix
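As an illustration of what such a check involves, the following sketch tests for a 2d NumPy array whose entries above the main diagonal are all zero. The helper `is_lower_triangular` is hypothetical, returns a boolean rather than asserting, and uses an exact comparison with zero; the library's own tolerance and shape rules are not documented on this page.

```python
import numpy as np


def is_lower_triangular(arr):
    """True if arr is a 2d numpy array with only zeros above the main diagonal."""
    if not isinstance(arr, np.ndarray) or arr.ndim != 2:
        return False
    # np.triu(arr, k=1) keeps only the strictly upper triangular part
    return bool(np.all(np.triu(arr, k=1) == 0))


lower = is_lower_triangular(np.array([[1.0, 0.0], [2.0, 3.0]]))
full = is_lower_triangular(np.array([[1.0, 4.0], [2.0, 3.0]]))
not_array = is_lower_triangular([[1.0, 0.0], [2.0, 3.0]])
```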

rapid_models.gp_diagnostics.cv.check_numeric_array(arr: Union[nptyping.NDArray[Any, nptyping.Float], Any], dim: int, argname: str = 'arr')

Check that the argument is a numpy array of correct dimension

Parameters:

  • arr (object) – the argument to check

Raises:

AssertionError – if ‘arr’ does not represent a ‘dim’-dimensional numpy array
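A comparable check can be written with standard NumPy attributes. The sketch below is an assumption about what "numeric array of correct dimension" means (a numeric dtype and `ndim == dim`); the helper name `is_numeric_array` is hypothetical and it returns a boolean rather than asserting.

```python
import numpy as np


def is_numeric_array(arr, dim):
    """True if arr is a numpy array with a numeric dtype and exactly dim dimensions."""
    return (
        isinstance(arr, np.ndarray)
        and arr.ndim == dim
        and bool(np.issubdtype(arr.dtype, np.number))
    )


vector_ok = is_numeric_array(np.zeros(3), 1)
wrong_dim = is_numeric_array(np.zeros((3, 3)), 1)
wrong_dtype = is_numeric_array(np.array(["a", "b"]), 1)
```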

rapid_models.gp_diagnostics.cv._multifold_inv(K, Y_train, folds)

Compute multifold cv residuals using matrix inverse (for testing) (residual = observed - predicted)

Parameters:
  • K (2d array) – covariance matrix

  • Y_train (array) – training observations

  • folds (list of lists) – The index subsets.

Returns:
  • mean – Mean of CV residuals

  • cov – Covariance of CV residuals

  • residuals_transformed – The residuals transformed to the standard normal space

[D. Ginsbourger and C. Schaerer (2021). Fast calculation of Gaussian Process multiple-fold crossvalidation residuals and their covariances. arXiv:2101.03108]