ed.ppc
Aliases:
ed.criticisms.ppc
ed.ppc
ppc(
T,
data,
latent_vars=None,
n_samples=100
)
Defined in edward/criticisms/ppc.py.
Posterior predictive check (Gelman, Meng, & Stern, 1996; Meng, 1994; Rubin, 1984).
PPCs form an empirical distribution for the predictive discrepancy,
$p(T\mid x) = \int p(T(x^{\text{rep}})\mid z) \, p(z\mid x) \, dz,$
by drawing replicated data sets $x^{\text{rep}}$ and calculating $T(x^{\text{rep}})$ for each data set. It then compares this reference distribution to the realized discrepancy $T(x)$.
If data is inputted with the prior predictive distribution, then it is a prior predictive check (Box, 1980).
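The procedure above can be sketched outside of Edward. A minimal NumPy illustration, assuming a simple $x \sim \text{Normal}(z, 1)$ model with posterior samples of the latent mean $z$ already in hand (all names here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed data from x ~ Normal(z, 1).
x = rng.normal(loc=1.0, scale=1.0, size=50)

# Hypothetical posterior samples of the latent mean z
# (its conjugate posterior under a flat prior).
z_post = rng.normal(loc=x.mean(), scale=1.0 / np.sqrt(x.size), size=100)

# Discrepancy function: here, the sample mean.
T = np.mean

# Reference distribution: for each posterior sample of z, draw a
# replicated data set x_rep and compute T(x_rep).
T_rep = np.array([T(rng.normal(loc=z, scale=1.0, size=x.size))
                  for z in z_post])

# Realized discrepancy on the observed data.
T_obs = T(x)
```

Comparing `T_obs` against the spread of `T_rep` is exactly what the returned reference distribution is for.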
Args:
  T: function. Discrepancy function, which takes a dictionary of data and a dictionary of latent variables as input and outputs a tf.Tensor.
  data: dict. Data to compare to. It binds observed variables (of type RandomVariable or tf.Tensor) to their realizations (of type tf.Tensor). It can also bind placeholders (of type tf.Tensor) used in the model to their realizations.
  latent_vars: dict. Collection of random variables (of type RandomVariable or tf.Tensor) bound to their inferred posterior. This argument is used when the discrepancy is a function of latent variables.
  n_samples: int. Number of replicated data sets.
Returns:
  list of np.ndarray. List containing the reference distribution, which is a NumPy array with n_samples elements,
  $(T(x^{\text{rep},1}, z^{1}), \dots, T(x^{\text{rep},n_{\text{samples}}}, z^{n_{\text{samples}}})),$
  and the realized discrepancy, which is a NumPy array with n_samples elements,
  $(T(x, z^{1}), \dots, T(x, z^{n_{\text{samples}}})).$
Examples

# build posterior predictive after inference:
# it is parameterized by a posterior sample
x_post = ed.copy(x, {z: qz, beta: qbeta})

# posterior predictive check
# T is a user-defined function of data, T(data)
T = lambda xs, zs: tf.reduce_mean(xs[x_post])
ed.ppc(T, data={x_post: x_train})

# in general T is a discrepancy function of the data (both response and
# covariates) and latent variables, T(data, latent_vars)
T = lambda xs, zs: tf.reduce_mean(zs[z])
ed.ppc(T, data={y_post: y_train, x_ph: x_train},
       latent_vars={z: qz, beta: qbeta})

# prior predictive check
# run ppc on original x
ed.ppc(T, data={x: x_train})
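The two returned arrays are commonly summarized as a posterior predictive p-value: the fraction of replications whose discrepancy exceeds the realized one. A sketch with stand-in arrays (in practice, the two arrays come from the list ed.ppc returns):

```python
import numpy as np

# Stand-ins for the two arrays returned by ed.ppc:
# the reference distribution T(x_rep, z) ...
T_rep = np.array([0.9, 1.1, 1.3, 0.8, 1.2])
# ... and the realized discrepancy T(x, z).
T_obs = np.array([1.0, 1.0, 1.0, 1.0, 1.0])

# Posterior predictive p-value: extreme values (near 0 or 1)
# indicate model misfit under the chosen discrepancy.
p_value = np.mean(T_rep > T_obs)  # -> 0.6
```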
Box, G. E. (1980). Sampling and Bayes’ inference in scientific modelling and robustness. Journal of the Royal Statistical Society. Series A (General), 383–430.
Gelman, A., Meng, X.-L., & Stern, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 733–760.
Meng, X.-L. (1994). Posterior predictive $p$-values. The Annals of Statistics, 1142–1160.
Rubin, D. B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. The Annals of Statistics, 12(4), 1151–1172.