Preprint

This version is not peer-reviewed.

Expansions for the Conditional Density and Distribution of a Standard Estimate

A peer-reviewed article of this preprint also exists.

Submitted: 29 September 2025
Posted: 30 September 2025


Abstract
Conditioning is a very useful way of using correlated information to reduce the variability of an estimate: inference based on a conditioned estimate can be much more precise than inference based on an unconditioned one. Here we give expansions in powers of $n^{-1/2}$ for the conditional density and distribution of a multivariate standard estimate based on a sample of size $n$. Standard estimates include most estimates of interest, including smooth functions of sample means and other empirical estimates, so these expansions have potential application to a wide range of practical problems. We also show that a conditional estimate is not a standard estimate, so that Edgeworth-Cornish-Fisher expansions cannot be applied to it directly.

1. Introduction and Summary

Suppose that $\hat{w}$ is a standard estimate of an unknown parameter $w \in \mathbb{R}^q$ of a statistical model, based on a sample of size $n$. That is, $\hat{w}$ is a consistent estimate and, for $r \ge 1$, its $r$th order cumulants have magnitude $n^{-(r-1)/2}$ and can be expanded in powers of $n^{-1}$. This is a very large class of estimates, with potential application to a wide range of practical problems. For example, $\hat{w}$ may be a smooth function of one or more sample means, or a smooth functional of one or more empirical distributions. A smooth function of a standard estimate is also a standard estimate: see [29]. [32] gave the multivariate Edgeworth expansions for the distribution and density of
$$X_n = n^{1/2}(\hat{w} - w),$$
in powers of $n^{-1/2}$ about the multivariate normal, in terms of the Edgeworth coefficients of (2.3). (For typos, see p. 25 of [29]. Also replace $\hat{\theta}$ by $\hat{\theta}/\theta$ on the 4th-to-last line of p. 1121 and in (23). To line 3 of p. 1138, add $P_{12} = B_{23}/2$.) [30] gave the Edgeworth coefficients explicitly for the Edgeworth expansions to $O(n^{-2})$. See also [15].
We now turn to conditioning. This is a very useful way of using correlated information to reduce the variability of estimates, and so to make inference on unknown parameters more precise. This is the motivation for this paper. In Section 3 we take $q \ge 2$, and write $w$, $\hat{w}$ and $X_n$ as $(w_1', w_2')'$, $(\hat{w}_1', \hat{w}_2')'$ and $(X_{n1}', X_{n2}')'$, of dimensions $q_1$ and $q_2$. Just as the distribution of $X_n$ allows inference on $w$, the conditional distribution of $X_{n1}$ given $X_{n2}$ allows inference on $w_1$ for a given $w_2$. The covariance of $\hat{w}_1 \mid \hat{w}_2$ can be substantially less than that of $\hat{w}_1$; only when $\hat{w}_1$ and $\hat{w}_2$ are uncorrelated is there no advantage in conditioning.
Theorems 3.1 and 3.2 give our main results: explicit expansions to $O(n^{-2})$ for the conditional density and distribution of $X_{n1}$ given $X_{n2}$, that is, for the conditional density and distribution of $\hat{w}_1 - w_1$ given $\hat{w}_2 - w_2$. In other words, they give the likely position of $w_1$ for any given $w_2$. The main difficulty is integrating the density. Theorem 3.2 does this in terms of $\bar{I}_{1\cdots k}$ of (3.28), the integral of the multivariate Hermite polynomial with respect to the conditional normal density. Note 3.1 gives $\bar{I}_{1\cdots k}$ in terms of derivatives of the multivariate normal distribution. Theorem 3.3 gives $\bar{I}_{1\cdots k}$ in terms of the partial moments of the conditional distribution. If $q_1 = 1$, then Theorem 3.4 gives $\bar{I}_{1\cdots k}$ in terms of the unit normal distribution and density.
Section 4 specialises to the case $q_1 = q_2 = 1$. Examples are the conditional distribution and density of a bivariate sample mean, of entangled gamma random variables, and of a sample mean given the sample variance. Section 5 and Section 6 give conclusions, discussion, and suggestions for future research. Appendix A gives expansions for the conditional moments of $X_{n1} \mid (X_{n2} = x_2)$. It shows that $\hat{w}_1$ given $X_{n2}$ is neither a standard estimate nor a Type B estimate, so that Edgeworth-Cornish-Fisher expansions do not apply to it.
Conditional expansions for the sample mean were given in Chapter 12 of [4], and were used in Sections 2.3 and 2.5 of [13] to show bootstrap consistency.

2. Multivariate Edgeworth Expansions

Suppose that $\hat{w}$ is a standard estimate of $w \in \mathbb{R}^q$ with respect to $n$ ($n$ is typically the sample size). That is, $E\hat{w} \to w$ as $n \to \infty$, where we use $E$ for expected value, and for $r \ge 1$ and $1 \le i_1, \dots, i_r \le q$, the $r$th order cumulants of $\hat{w} = (\hat{w}_1, \dots, \hat{w}_q)$ can be expanded as
$$\bar{k}_{1\cdots r} = k^{i_1 \cdots i_r} = \kappa(\hat{w}_{i_1}, \dots, \hat{w}_{i_r}) \approx \sum_{d=r-1}^{\infty} n^{-d}\, \bar{k}_{d,1\cdots r}, \quad \text{where } \bar{k}_{d,1\cdots r} = k_d^{i_1 \cdots i_r},$$
where $\approx$ indicates an asymptotic expansion, and the cumulant coefficients $\bar{k}_{d,1\cdots r}$ may depend on $n$ but are bounded as $n \to \infty$. So the bar replaces each $i_k$ by $k$. For example, $\bar{k}_{0,1} = w^{i_1}$ and $\bar{k}_{1,12} = k_1^{i_1 i_2}$. We reserve $i_k$ for this bar notation to avoid double subscripts.
As $n \to \infty$, $X_n = n^{1/2}(\hat{w} - w) \to_{\mathcal{L}} X \sim N_q(0, V)$, for $V = (\bar{k}_{1,12})$, $q \times q$,
the multivariate normal on $\mathbb{R}^q$, with density and distribution
$$\phi_V(x) = (2\pi)^{-q/2} (\det V)^{-1/2} \exp(-x' V^{-1} x/2), \quad \Phi_V(x) = \int_{-\infty}^{x} \phi_V(x)\, dx.$$
$V$ may depend on $n$, but we assume that $\det V$ is bounded away from 0.
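As a quick numerical sketch (mine, not the paper's), the density $\phi_V$ can be coded directly from this formula; the covariance matrix below is the $V$ that reappears in the entangled exponential example of Section 4.

```python
import numpy as np

def phi_V(x, V):
    """Multivariate normal density: (2*pi)^(-q/2) * det(V)^(-1/2) * exp(-x' V^{-1} x / 2)."""
    x = np.asarray(x, float)
    q = x.size
    quad = x @ np.linalg.solve(V, x)   # the quadratic form x' V^{-1} x
    return (2 * np.pi) ** (-q / 2) * np.linalg.det(V) ** (-0.5) * np.exp(-quad / 2)

V = np.array([[2.0, 1.0], [1.0, 2.0]])
print(phi_V([0.5, -0.3], V))           # ~0.0780
print(phi_V([0.0, 0.0], V))            # = 1 / (2*pi*sqrt(3)) at the origin
```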
Let $\bar{P}_{r,1\cdots k} = P_r^{i_1 \cdots i_k}$ be the $r$th Edgeworth coefficient of $\hat{w}$,
for $q \ge 1$ and $1 \le r \le 3$. These are Bell polynomials in the cumulant coefficients of (2.1), as defined and given in [30]. Their importance lies in their central role in the Edgeworth expansions of $X_n$ of (2.2).
(When $q = 1$ and $\hat{w}$ is a sample mean, the Edgeworth coefficients were given for all $r$ in [31]. For typos, see pp. 24–25 of [29].) Set $P(A) =$ the probability that $A$ is true. By [32] or [29], for $\hat{w}$ non-lattice, the distribution and density of $X_n$ can be expanded as
$$P(X_n \le x) \approx \sum_{r=0}^{\infty} n^{-r/2} P_r(x), \quad p_{X_n}(x) \approx \sum_{r=0}^{\infty} n^{-r/2} p_r(x), \quad x \in \mathbb{R}^q,$$
where $P_0(x) = \Phi_V(x)$, $p_0(x) = \phi_V(x)$, and for $r \ge 1$,
$$P_r(x) = \sum_{k=1}^{3r} [P_{rk}(x) : k - r \text{ even}],$$
$$p_r(x)/\phi_V(x) = \sum_{k=1}^{3r} [\tilde{p}_{rk} : k - r \text{ even}] = \tilde{p}_r(x), \text{ say},$$
$$P_{rk}(x) = \bar{P}_{r,1\cdots k} \bar{H}^*_{1\cdots k}, \quad \tilde{p}_{rk} = \bar{P}_{r,1\cdots k} \bar{H}_{1\cdots k},$$
$$\bar{H}^*_{1\cdots k} = \bar{H}^*_{1\cdots k}(x, V) = \bar{O}_{1\cdots k}\, \Phi_V(x) = \int_{-\infty}^{x} \bar{H}_{1\cdots k}\, \phi_V(x)\, dx, \quad \bar{O}_{1\cdots k} = (-\bar{\partial}_1) \cdots (-\bar{\partial}_k), \quad \bar{\partial}_k = \partial_{i_k}, \quad \partial_i = \partial/\partial x_i,$$
$$\bar{H}_{1\cdots k} = H_{i_1 \cdots i_k} = \phi_V(x)^{-1}\, \bar{O}_{1\cdots k}\, \phi_V(x) = E(\bar{y}_1 + I\bar{Y}_1) \cdots (\bar{y}_k + I\bar{Y}_k),$$
where $I = \sqrt{-1}$, $y = V^{-1}x$, and $Y = V^{-1}X \sim N_q(0, V^{-1})$.
$\bar{H}_{1\cdots k}(x, V) = \bar{H}_{1\cdots k}$ is the multivariate Hermite polynomial. We use the tensor summation convention: repetition of $i_1, \dots, i_k$ in (2.6) implies their implicit summation over their range $1, \dots, q$. [30] gave $\bar{H}_{1\cdots k}$ explicitly for $k \le 6$, and for $k \le 9$ when $q = 2$.
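As an illustrative check (my own, not from the paper), for $k = 2$ the definition $\bar{H}_{12} = \phi_V(x)^{-1}\partial_{i_1}\partial_{i_2}\phi_V(x)$ gives the closed form $H_{ij} = y_i y_j - V^{ij}$, the bivariate analogue of which appears in Section 3 as $\bar{H}_{q_2,12} = \bar{z}_1\bar{z}_2 - \bar{U}^{12}$. A finite-difference sketch:

```python
import numpy as np

def phi_V(x, V):
    q = len(x)
    return (2*np.pi)**(-q/2) * np.linalg.det(V)**(-0.5) * np.exp(-0.5 * x @ np.linalg.solve(V, x))

def deriv(f, k, h=1e-5):
    """Central-difference partial derivative in coordinate k."""
    def g(x):
        e = np.zeros_like(x); e[k] = h
        return (f(x + e) - f(x - e)) / (2 * h)
    return g

V = np.array([[2.0, 1.0], [1.0, 2.0]])
x = np.array([0.5, -0.3])
f = lambda t: phi_V(t, V)
H12_num = deriv(deriv(f, 0), 1)(x) / phi_V(x, V)     # phi^{-1} d1 d2 phi, numerically
y, Vinv = np.linalg.solve(V, x), np.linalg.inv(V)
H12_closed = y[0] * y[1] - Vinv[0, 1]                # closed form y_i y_j - V^{ij}
print(H12_num, H12_closed)                           # the two agree
```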
Set
$$\bar{\mu}_{1\cdots 2k} = E\, \bar{Y}_1 \cdots \bar{Y}_{2k} = 1 \cdot 3 \cdots (2k-1)\, {\textstyle\sum}\, \bar{V}^{12} \cdots \bar{V}^{2k-1,2k},$$
where $\sum_N \bar{f}_{1\cdots 2k}$ sums $\bar{f}_{1\cdots 2k}$ over all $N$ permutations of $i_1, \dots, i_{2k}$ giving distinct values. For example,
[display equation not recoverable from this extraction] (So the repeated $i_{k+1}, \dots, i_{2k}$ in (2.11) implies their summation over $1, \dots, q$.) $P_2(x)$ and $P_3(x)$ are given explicitly in [30]. So (2.4), with the $\bar{P}_{r,1\cdots k}$ in [30], gives the Edgeworth expansions for the distribution and density of $X_n$ of (2.2) to $O(n^{-2})$. $\tilde{p}_{rk}$ and $P_{rk}$ each have $q^k$ terms, but many are duplicates, as $\bar{P}_{r,1\cdots k}$ is symmetric in $i_1, \dots, i_k$. This is exploited by the notation of Section 4 of [30] to greatly reduce the number of terms in (2.6).
By (2.5), the density of $X_n$ relative to its asymptotic value is
$$p_{X_n}(x)/\phi_V(x) \approx 1 + \sum_{r=1}^{\infty} n^{-r/2}\, \tilde{p}_r(x) = 1 + n^{-1/2}\, \tilde{p}_1(x) + O(n^{-1}), \quad x \in \mathbb{R}^q,$$
and for measurable $C \subset \mathbb{R}^q$,
$$P(X_n \in C) \approx \Phi_V(C) + \sum_{r=1}^{\infty} n^{-r/2}\, p_{rC}, \quad \text{where}$$
$$p_{rC} = E\, \tilde{p}_r(X) I(X \in C) = \int_C \tilde{p}_r(x)\, \phi_V(x)\, dx = \sum_{k=1}^{3r} [\tilde{p}_{rk}(C) : k - r \text{ even}],$$
$$\tilde{p}_{rk}(C) = E\, \tilde{p}_{rk}(X) I(X \in C) = \int_C \tilde{p}_{rk}(x)\, \phi_V(x)\, dx = \bar{P}_{r,1\cdots k} \bar{H}_{1\cdots k}(C), \quad \text{and} \quad \bar{H}_{1\cdots k}(C) = E\, \bar{H}_{1\cdots k}(X, V) I(X \in C) = \int_C \bar{H}_{1\cdots k}\, \phi_V(x)\, dx.$$
If $C = -C$, then for $r$ odd, $\bar{H}_{1\cdots k}(C) = \tilde{p}_{rk}(C) = p_{rC} = 0$, so that
$$P(X_n \in C) \approx \Phi_V(C) + \sum_{r=1}^{\infty} n^{-r}\, p_{2r,C} = \Phi_V(C) + n^{-1} p_{2C} + O(n^{-2}).$$
Examples 3 and 4 of [30] gave $p_{2C}$ for
$$C = \{x : x' V^{-1} x \le u\} \quad \text{and} \quad C = \{x : |(V^{-1/2} x)_j| \le u_j,\ j = 1, \dots, q\}.$$
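For the first region $C$, the leading term $\Phi_V(C)$ is a $\chi_q^2$ probability, which is easy to confirm by simulation (a sketch under my own choice of $V$ and $u$, not a computation from [30]):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
V = np.array([[2.0, 1.0], [1.0, 2.0]])
L = np.linalg.cholesky(V)
X = rng.standard_normal((500_000, 2)) @ L.T        # X ~ N_2(0, V)
u = 3.0
# indicator of the ellipsoidal region x' V^{-1} x <= u
inside = np.einsum('ij,ij->i', X @ np.linalg.inv(V), X) <= u
print(inside.mean(), chi2(df=2).cdf(u))            # Monte Carlo vs chi-square CDF
```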

3. The Conditional Density and Distribution

For $q = q_1 + q_2$, $q_1 \ge 1$ and $q_2 \ge 1$, partition $w$, $\hat{w}$, $X \sim N_q(0, V)$, $X_n = n^{1/2}(\hat{w} - w)$, $x$ and $y = V^{-1}x$ as $(w_1', w_2')'$, $(\hat{w}_1', \hat{w}_2')'$, $(X_1', X_2')'$, $(X_{n1}', X_{n2}')'$, $(x_1', x_2')'$ and $(y_1', y_2')'$, where $w_i, \hat{w}_i, X_i, X_{ni}, x_i, y_i$ are vectors of length $q_i$. Partition $V$ and $V^{-1}$ as $(V_{ij})$ and $(V^{ij})$, $2 \times 2$, where $V_{ij}$ and $V^{ij}$ are $q_i \times q_j$.
[display equation not recoverable from this extraction] Now we come to the main purpose of this paper. Theorem 3.1 expands the conditional density of $X_{n1|2}$ about the conditional density of $X_{1|2}$. Its derivation is straightforward, the only novel feature being the use of Lemma 3.2 to find the reciprocal of a series, using Bell polynomials. Theorem 3.2 integrates the conditional density to obtain the expansion for the conditional distribution of $X_{n1|2}$ about the conditional distribution of $X_{1|2}$, in terms of $\bar{I}_{1\cdots k}$ of (3.28) below, the integral of the Hermite polynomial $\bar{H}_{1\cdots k}$ of (2.8) with respect to the conditional normal density. Note 3.1 gives $\bar{I}_{1\cdots k}$ in terms of derivatives of the multivariate normal distribution. Theorem 3.3 gives $\bar{I}_{1\cdots k}$ in terms of the partial moments of the conditional normal distribution. For $X_{1|2}$ of (3.1), set [display equation not recoverable from this extraction]
Lemma 3.1.  
The elements of $(V^{ij}) = V^{-1}$ are
[display equation not recoverable from this extraction]
PROOF. $VV^{-1} = V^{-1}V = I_q$ gives 8 equations relating $\{V_{ij}\}$ and $\{V^{ij}\}$. Now solve for $\{V^{ij}\}$. □
So $A_1 = 0_{q_1 \times q_2}$, and $A_2 = V^{22} B V_{22}^{-1} = V_{22}^{-1}$ for $B = V_{22} - V_{21} V_{11}^{-1} V_{12} = (V^{22})^{-1}$.
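The Schur-complement identities behind Lemma 3.1 can be checked numerically (a generic sketch; the positive definite test matrix below is my own, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
V = A @ A.T + 5 * np.eye(5)                 # a positive definite test matrix
q1 = 2
V11, V12 = V[:q1, :q1], V[:q1, q1:]
V21, V22 = V[q1:, :q1], V[q1:, q1:]
Vinv = np.linalg.inv(V)
B = V22 - V21 @ np.linalg.inv(V11) @ V12    # Schur complement: B = (V^{22})^{-1}
print(np.allclose(np.linalg.inv(B), Vinv[q1:, q1:]))                              # V^{22} = B^{-1}
print(np.allclose(Vinv[:q1, q1:], -Vinv[:q1, :q1] @ V12 @ np.linalg.inv(V22)))    # V^{12} = -V^{11} V12 V22^{-1}
```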
Since $Q = V_{11} - V_{1|2} \ge 0_{q_1 \times q_1}$, in the sense that $x' Q x \ge 0$ for $x \in \mathbb{R}^{q_1}$, $X_{1|2}$ is less variable than $X_1$, and $X_{n1|2}$ is less variable than $X_{n1}$, unless $X_1$ and $X_2$ are uncorrelated, that is, unless $V_{12}$ is a matrix of zeros.
The conditional density of $X_{n1|2}$ is
$$p_{n1|2}(x_1) = p_{X_n}(x)/p_{X_{n2}}(x_2) = \phi_{1|2}(x_1)(1 + S)/(1 + S_2),$$
where
$$S = p_{X_n}(x)/\phi_V(x) - 1 \approx \sum_{r=1}^{\infty} n^{-r/2}\, \tilde{p}_r(x) \text{ of (2.6)}, \quad S_2 = p_{X_{n2}}(x_2)/\phi_{V_{22}}(x_2) - 1 \approx \sum_{r=1}^{\infty} n^{-r/2} f_r, \quad \text{for } f_r = p_r^*(x_2),$$
where $p_r^*(x_2)$ is $\tilde{p}_r(x)$ of (2.6) for $X_{n2}$, and $\phi_{1|2}(x_1)$ is the density of $X_{1|2}$ of (3.1). By (4)–(6) of Section 2.5 of [1], for $V_0$ of (3.4),
$$\phi_{1|2}(x_1) = \phi_V(x)/\phi_{V_{22}}(x_2) = \phi_{V_0}(u), \quad \text{where } u = x_1 - \mu_{1|2} \in \mathbb{R}^{q_1}.$$
So the distribution of $X_1 \mid (X_2 = x_2)$ is
$$\Phi_{1|2}(x_1) = \Phi_{V_0}(u), \quad \text{for } V_0 \text{ of (3.4)}.$$
For $\mu_{1|2}$ of (3.3), $V_{1|2}$ of (3.4), and $v \in \mathbb{R}^{q_1}$, set
$$x_1(x_2, v) = \mu_{1|2} + V_{1|2}^{1/2} v = V_{12} V_{22}^{-1} x_2 + V_{1|2}^{1/2} v.$$
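The factorization $\phi_{1|2}(x_1) = \phi_V(x)/\phi_{V_{22}}(x_2) = \phi_{V_0}(u)$ can be verified numerically in the bivariate case (a sketch with my own test point; $V$ is again the Section 4 matrix):

```python
import numpy as np

def phi(x, V):
    """Normal N(0, V) density at x; accepts scalars or vectors."""
    x, V = np.atleast_1d(np.asarray(x, float)), np.atleast_2d(np.asarray(V, float))
    q = x.size
    return (2*np.pi)**(-q/2) * np.linalg.det(V)**(-0.5) * np.exp(-0.5 * x @ np.linalg.solve(V, x))

V = np.array([[2.0, 1.0], [1.0, 2.0]])
x1, x2 = 0.7, -0.4
mu12 = V[0, 1] / V[1, 1] * x2            # mu_{1|2} = V12 V22^{-1} x2
V0 = V[0, 0] - V[0, 1]**2 / V[1, 1]      # V_{1|2} = V11 - V12 V22^{-1} V21
lhs = phi([x1, x2], V) / phi([x2], [[V[1, 1]]])
rhs = phi([x1 - mu12], [[V0]])
print(lhs, rhs)                           # the two sides agree
```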
Corollary 3.1.  
Suppose that $q_1 = 1$. Then for $v = V_0^{-1/2} u$ of (3.9), [display equation not recoverable from this extraction]
If $q_1 = 1$, this gives an asymptotic conditional confidence limit for $\hat{w}_1 - w_1$ given $\hat{w}_2 - w_2$. So if $q_1 = 1$, by (2.14), for $x_1(x_2, v)$ of (3.11), two-sided limits are
$$P(x_1(x_2, -v) < X_{n1|2} < x_1(x_2, v)) = 2\Phi(v) - 1 + O(n^{-1}), \quad \text{if } v > 0.$$
Set $\bar{H}_{q,1\cdots k} = \bar{H}_{1\cdots k} = \bar{H}_{1\cdots k}(x, V)$, and $\bar{H}_{q_2,1\cdots k} = \bar{H}_{1\cdots k}(x_2, V_{22})$.
So $\bar{H}_{q_2,1\cdots k}$ is given by replacing $y = V^{-1}x$ and $(V^{ij}) = V^{-1}$ in $\bar{H}_{q,1\cdots k}$ by
$z = V_{22}^{-1} x_2$ and $(U^{ij}) = V_{22}^{-1}$. For example, $\bar{H}_{q_2,12} = \bar{z}_1 \bar{z}_2 - \bar{U}^{12}$.
By (2.5) and (2.6), for $r \ge 1$, $p_r^*(x_2)$ of (3.8) is given by
$$p_r^*(x_2) = \sum_{k=1}^{3r} [p_{rk}^* : k - r \text{ even}], \quad \text{where } p_{rk}^* = \bar{P}_{r,1\cdots k} \bar{H}_{q_2,1\cdots k},$$
and implicit summation in (3.14) for $i_1, \dots, i_k$ is now over $q_1 + 1, \dots, q$. So,
$$p_1^*(x_2) = \sum_{k=1,3} p_{1k}^*, \quad p_{11}^* = \sum_{i_1 = q_1+1}^{q} \bar{k}_{1,1} \bar{H}_{q_2,1}, \quad p_{13}^* = \sum_{i_1, i_2, i_3 = q_1+1}^{q} \bar{k}_{2,1\cdots 3} \bar{H}_{q_2,1\cdots 3}/6,$$
where $\bar{H}_{q_2,1} = \bar{z}_1$, $\bar{H}_{q_2,1\cdots 3} = \bar{z}_1 \bar{z}_2 \bar{z}_3 - \sum_3 \bar{U}^{12} \bar{z}_3$, and
$$p_2^*(x_2) = \sum_{k=2,4,6} p_{2k}^*, \quad p_3^*(x_2) = \sum_{k=1,3,5,7,9} p_{3k}^*, \quad \text{for } p_{rk}^* \text{ of (3.14)}.$$
Ordinary Bell polynomials. For a sequence $e = (e_1, e_2, \dots)$ from $\mathbb{R}$, the partial ordinary Bell polynomial $\tilde{B}_{rs} = \tilde{B}_{rs}(e)$ is defined by the identity
$$S^s = \sum_{r=s}^{\infty} z^r \tilde{B}_{rs}(e), \quad s = 0, 1, 2, \dots,\ z \in \mathbb{R}, \quad \text{where } S = \sum_{r=1}^{\infty} z^r e_r. \quad \text{So } \tilde{B}_{r0} = \delta_{r0},\ \tilde{B}_{r1} = e_r,\ \tilde{B}_{rr} = e_1^r,\ \tilde{B}_{32} = 2 e_1 e_2,$$
where $\delta_{00} = 1$ and $\delta_{r0} = 0$ for $r \ne 0$. They are tabled on p. 309 of [7]. To obtain (3.7), we use
Lemma 3.2.  
Take $\tilde{B}_{rs}(e)$ of (3.15). Set $S_2 = \sum_{r=1}^{\infty} z^r f_r$ for $f_r \in \mathbb{R}$. Then
$$(1 + S_2)^{-1} = \sum_{r=0}^{\infty} z^r C_r, \quad \text{where } C_r = B_r^*(f), \quad B_r^*(e) = \sum_{s=0}^{r} (-1)^s \tilde{B}_{rs}(e).$$
So $B_0^*(e) = 1$, $B_1^*(e) = -e_1$, $B_2^*(e) = -e_2 + e_1^2$, $B_3^*(e) = -e_3 + 2 e_1 e_2 - e_1^3$,
[display equation not recoverable from this extraction]
PROOF.
$$(1 + S_2)^{-1} = \sum_{s=0}^{\infty} (-S_2)^s, \quad \text{and} \quad (-S_2)^s = \sum_{r=s}^{\infty} z^r (-1)^s \tilde{B}_{rs}(f).$$
Now swap summations. □
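As a small check (not in the paper), the coefficients $C_r = B_r^*(f)$ of Lemma 3.2 can equivalently be generated by the recursion $C_r = -\sum_{i \ge 1} f_i C_{r-i}$, obtained by multiplying both sides of the lemma by $1 + S_2$; the test values below are arbitrary:

```python
def reciprocal_coeffs(f, R):
    """Coefficients C_0..C_R of (1 + sum_r f_r z^r)^(-1), via C_r = -sum_i f_i C_{r-i}."""
    C = [1.0]
    for r in range(1, R + 1):
        C.append(-sum(f[i - 1] * C[r - i] for i in range(1, min(r, len(f)) + 1)))
    return C

f1, f2, f3 = 0.3, -0.2, 0.5
C = reciprocal_coeffs([f1, f2, f3], 3)
# closed Bell-polynomial forms: C1 = -f1, C2 = -f2 + f1^2, C3 = -f3 + 2 f1 f2 - f1^3
print(C[1], -f1)
print(C[2], -f2 + f1**2)
print(C[3], -f3 + 2*f1*f2 - f1**3)
```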
Theorem 3.1.  
Take $\tilde{p}_r(x)$ of (2.6) and $C_r = B_r^*(f)$ of (3.16) with
$$f_r = p_r^*(x_2) \text{ of (3.14)}.$$
The conditional density $p_{n1|2}(x_1)$ of (3.7), relative to $\phi_{1|2}(x_1)$ of (3.9), is
$$p_{n1|2}(x_1)/\phi_{1|2}(x_1) \approx \sum_{r=0}^{\infty} n^{-r/2} D_r, \quad \text{where } D_r = C_r \ast \tilde{p}_r(x),$$
and for sequences $(a_0, a_1, \dots)$ and $(b_0, b_1, \dots)$, $a_r \ast b_r = \sum_{i=0}^{r} a_i b_{r-i}$. So,
$$D_0 = \tilde{p}_0(x) = 1, \quad D_1 = C_1 + \tilde{p}_1(x), \quad D_2 = C_2 + C_1 \tilde{p}_1(x) + \tilde{p}_2(x),$$
[display equation not recoverable from this extraction]
PROOF. This follows from (3.7) and Lemma 3.2. □
So $D_0, \dots, D_3$ of (3.20) and (3.21) give the conditional density to $O(n^{-2})$. We call (3.19) the relative conditional density. We now give our main result, an expansion for the conditional distribution of $X_{n1} \mid (X_{n2} = x_2)$. As noted, Theorem 3.2 gives this in terms of $\bar{I}_{1\cdots k}$ of (3.28) below, an integral of the Hermite polynomial $\bar{H}_{1\cdots k}$ of (2.8); Note 3.1 gives $\bar{I}_{1\cdots k}$ in terms of derivatives of the multivariate normal distribution; and Theorem 3.3 gives $\bar{I}_{1\cdots k}$ in terms of the partial moments of the conditional distribution $\Phi_{1|2}(x_1)$ of (3.10). When $q_1 = 1$, Theorem 3.4 gives $\bar{I}_{1\cdots k}$ in terms of $\Phi(v)$ and $\phi(v)$ for
$$v = V_{1|2}^{-1/2} u = V_{1|2}^{-1/2}(x_1 - \mu_{1|2}) \in \mathbb{R}^{q_1}.$$
Theorem 3.2.  
Take $C_r$ and $D_r$ of Theorem 3.1. Set $\tilde{p}_0(x) = 1$. The conditional distribution of $X_{n1}$ given $X_{n2}$, about $\Phi_{1|2}(x_1)$ of (3.10), has the expansion [display equation not recoverable from this extraction]
PROOF. (3.26) holds by (2.6). (3.27) holds by (2.6). Now use (3.9). □
[display equation not recoverable from this extraction] for $C_r$ of (3.17). $g_1 = g_{11} + g_{13}$ is given by $\bar{I}_1, \bar{I}_{1\cdots 3}$; $g_2 = g_{22} + g_{24} + g_{26}$ is given by $\bar{I}_{12}, \bar{I}_{1\cdots 4}, \bar{I}_{1\cdots 6}$; and $g_3 = \sum (g_{3k} : k = 1, 3, 5, 7, 9)$ is given by $\bar{I}_1, \bar{I}_{1\cdots 3}, \dots, \bar{I}_{1\cdots 9}$.
Note 3.1. Set $\partial = \prod_{i=q_1+1}^{q} \partial_i$. By (3.9), [display equation not recoverable from this extraction] Comparing $\bar{L}_{1\cdots k}$ with the Hermite function $\bar{H}^*_{1\cdots k}$ of (2.7), we can call $\bar{L}_{1\cdots k}$ the partial Hermite function. When $q = 2$, see (4.1).
By (3.25), $G_r$ in (3.23) is given by $C_r$ of (3.18) and $g_r$ of (3.26). Viewing $\bar{H}_{q,1\cdots k}$ as a polynomial in $x_1 = \mu_{1|2} + u$ for $u$ of (3.9), $\bar{I}_{1\cdots k}$ is linear in
$$\int_{-\infty}^{x_1} x_{i_1} \cdots x_{i_s}\, \phi_{1|2}(x_1)\, dx_1 = \int_{-\infty}^{u} (\mu_{1|2} + u)_{i_1} \cdots (\mu_{1|2} + u)_{i_s}\, \phi_{V_0}(u)\, du$$
for $0 \le s \le k$, $1 \le i_1, \dots, i_s \le q_1$. So $\bar{I}_{1\cdots k}$ can be expanded in terms of the partial moments of [display equation not recoverable from this extraction] This has only $q_1$ integrals, while (2.12) has $q$ integrals.
Lemma 3.3.  
For $u = x_1 - \mu_{1|2}$, $y = V^{-1}x = \alpha + \Lambda u$, where
$$\Lambda = \binom{V^{11}}{V^{21}} \in \mathbb{R}^{q \times q_1}, \quad \alpha = \binom{\alpha_1}{\alpha_2}, \quad \alpha_1 = 0_{q_1}, \quad \alpha_2 = V_{22}^{-1} x_2.$$
PROOF. $y = \binom{y_1}{y_2}$, where $y_i = V^{i1} x_1 + V^{i2} x_2 = \alpha_i + V^{i1} u$, and $\alpha_i = A_i x_2$ for $A_i$ of (3.6). □
Our main result, Theorem 3.2, gave the conditional distribution expansion in terms of $\bar{I}_{1\cdots k}$ of (3.28). Note 3.1 gave these in terms of the derivatives of $\Phi_V(x)$. We now give $\bar{I}_{1\cdots k}$ in terms of $\bar{J}_{1\cdots k}$, the partial moments of the conditional distribution $\Phi_{1|2}(x_1)$ of (3.10). As in (2.10), for any $\pi = (m, \dots, n)$, set $\sum_N c_\pi = c_\pi$ summed over all, $N$ say, permutations of $\pi$ giving distinct $c_\pi$. For example, $\sum_2 c_{23} = c_{23} + c_{32}$.
Theorem 3.3.  
Take $\bar{J}_{1\cdots k}(x, V)$ of (2.11), $u$ of (3.10), $M$ of (3.31), $\bar{M}_{a\cdots b}$ of (3.32), $\Lambda, \alpha$ of (3.33), and
$1 \le i_1, \dots, i_k \le q$. Set
$$\bar{K}_{1\cdots k} = K_{i_1 \cdots i_k} = \int_{-\infty}^{u} (\Lambda u)_{i_1} \cdots (\Lambda u)_{i_k}\, \phi_{V_0}(u)\, du = \Lambda_{i_1 j_1} \cdots \Lambda_{i_k j_k} M_{j_1 \cdots j_k},$$
where $j_1, \dots, j_k$ sum over their range $1, \dots, q_1$. So,
$$\bar{K}_{1\cdots k} = \bar{\Lambda}_{1,k+1} \cdots \bar{\Lambda}_{k,2k} \bar{M}_{k+1 \cdots 2k}. \quad \text{For example, } \bar{K}_1 = \bar{\Lambda}_{12} \bar{M}_2, \quad \bar{K}_{12} = \bar{\Lambda}_{13} \bar{\Lambda}_{24} \bar{M}_{34}, \quad \bar{K}_{123} = \bar{\Lambda}_{14} \bar{\Lambda}_{25} \bar{\Lambda}_{36} \bar{M}_{456}.$$
[display equation not recoverable from this extraction]
PROOF. Since $x_1 = \mu_{1|2} + u$, $y = \Lambda x_1 + \binom{V^{12}}{V^{22}} x_2 = \alpha + \Lambda u \in \mathbb{R}^q$. Substitute $y = \alpha + \Lambda u$ into the expressions for $\bar{H}_{1\cdots k}$. Now multiply by $\phi_{V_0}(u)$ and integrate from $-\infty$ to $u$. □
This gives the $\bar{I}_{1\cdots k}$ needed for $g_1, g_2, G_1, G_2$. The $\bar{I}_{1\cdots k}$ for $k = 7, 9$ needed for $g_3, G_3$ can be written down similarly in terms of the partial moments, using $\bar{H}_{q,1\cdots k}$ for $k = 7, 9$. We now show that if $q_1 = 1$, we only need the partial moments of $\Phi(v)$ at $v$ of (3.22), and that these are easily written in terms of $\Phi(v)$ and $\phi(v)$ times a polynomial in $v$ of (3.22).
The case $q_1 = 1$. Here $w_1$, $\hat{w}_1$, $X_1$, $X_{n1}$ and $V_{11}$ are scalars.
Theorem 3.4.  
For $q_1 = 1$ and $1 \le k \le 6$, $\bar{I}_{1\cdots k}$ is given by Theorem 3.3 with [display equation not recoverable from this extraction] where a dot denotes multiplication. Also, $G_0 = \Phi(v)$.
PROOF. For $v$ of (3.22), by (3.9), $\phi_{1|2}(x_1) = \sigma^{-1} \phi(v)$. (3.37) follows by integration by parts. By (3.34), $K_{1\cdots k} = \bar{\Lambda}_1 \cdots \bar{\Lambda}_k M_{1\cdots k}$, where
$$M_{1\cdots k} = \int_{-\infty}^{u} u^k\, d\Phi(u/\sigma) = \sigma^k \gamma_k.$$
That $G_0 = \Phi(v)$ follows from (3.25). □
By (3.23), for $C_r$ of (3.18) and $v$ of (3.22), the conditional distribution of $X_{n1|2}$ is
$$P(X_{n1|2}/\sigma \le v) \approx \Phi(v) + \sum_{r=1}^{\infty} n^{-r/2} G_r, \quad \text{where } G_r = C_r \ast g_r,$$
as in (3.29), and $g_r$ is given by (3.26) in terms of the integrated Hermite polynomial $\bar{I}_{1\cdots k}$ of (3.28), given by Theorems 3.3 and 3.4.

4. The Case $q_1 = q_2 = 1$

Theorem 3.2 gave the conditional Edgeworth expansion in terms of $\bar{I}_{1\cdots k}$ of (3.28). Theorem 3.3 gave the $\bar{I}_{1\cdots k}$ needed for $g_{rk}$ of (3.27) and $G_1, G_2$ of (3.23) in terms of the partial moments $\bar{M}_{a\cdots b}$ of (3.32). When $q_1 = 1$, Theorem 3.4 gave $\bar{I}_{1\cdots k}$ in terms of $\Phi(v)$ and its partial moments for $v$ of (3.22). But now $q = 2$, so that $i_1, \dots, i_k = 1$ or 2. So for $(I, y, Y)$ of (2.9), we switch notation to
$$H_{ab} = \phi_V(x)^{-1} (-\partial_1)^a (-\partial_2)^b \phi_V(x) = E(y_1 + IY_1)^a (y_2 + IY_2)^b, \quad H^*_{ab} = (-\partial_1)^a (-\partial_2)^b \Phi_V(x) = \int_{-\infty}^{x} H_{ab}\, \phi_V(x)\, dx.$$
So $H^*_{ab} = H_{a-1,b-1}\, \phi_V(x)$ if $a, b \ge 1$,
$$H^*_{10} = -\partial_1 \Phi_V(x) = -\int_{-\infty}^{x_2} \phi_V(x)\, dx_2, \quad H^*_{a0} = (-\partial_1)^{a-1} H^*_{10} \text{ if } a \ge 2, \quad H^*_{01} = -\partial_2 \Phi_V(x) = -\int_{-\infty}^{x_1} \phi_V(x)\, dx_1, \quad H^*_{0b} = (-\partial_2)^{b-1} H^*_{01} \text{ if } b \ge 2,$$
[display equation not recoverable from this extraction] for $\bar{L}_{1\cdots k}$ of (3.30). Similarly, write (2.1) as
$$\kappa_{ab}(\hat{w}_1, \hat{w}_2) \approx \sum_{d=a+b-1}^{\infty} n^{-d} k_{abd}, \quad \text{for } a + b \ge 1, \quad \text{where } k_{abd} = k_d^{1^a 2^b},$$
and [display equation not recoverable from this extraction] Also, we switch from $\bar{P}_{r,1\cdots k}$ to
$$P_r(ab) = \binom{a+b}{a} P_r^{1^a 2^b},$$
given for $r \le 3$ in Section 4 of [30]. So,
$$\tilde{p}_{rk} = \sum_{b=0}^{k} P_r(k-b, b) H_{k-b,b}, \quad P_{rk}(x) = \sum_{b=0}^{k} P_r(k-b, b) H^*_{k-b,b}.$$
So $\tilde{p}_{r1} = P_r(10) H_{10} + P_r(01) H_{01}$, $\tilde{p}_{11} = k_{101} y_1 + k_{011} y_2$, and $\tilde{p}_{13} = P_1(30) H_{30} + P_1(21) H_{21} + P_1(12) H_{12} + P_1(03) H_{03}$.
$P_r(ba)$ is just $P_r(ab)$ with 1 and 2 reversed. For the other $\tilde{p}_{rk}$ and $P_{rk}(x)$ needed for $r \le 3$, see Section 4 of [30]. Our main result for this section, Theorem 4.3, gives simple formulas for $I_{ab}$ and for $g_r$ of (3.26), the main ingredients needed in Theorem 3.2 for the expansion of the conditional distribution.
Theorem 4.1.  
The conditional density of $X_{n1|2}$ of (3.1) is given by Theorem 3.1, where $f_r = p_r^*(x_2)$ is given by (3.14) in terms of
$$p_{rk}^* = P_r(0k) H_k^*, \quad \text{where } H_k^* = H_k(z, V_{22}) \text{ and } z = V_{22}^{-1} x_2.$$
For example,
$$H_1^* = z, \quad H_2^* = z^2 - V_{22}^{-1}, \quad H_3^* = z^3 - 3 z V_{22}^{-1}, \quad H_4^* = z^4 - 15 z^2 V_{22}^{-1} + 3 V_{22}^{-2},$$
$$H_5^* = z^5 - 10 z^3 V_{22}^{-1} + 15 z V_{22}^{-2}, \quad H_6^* = z^6 - 15 z^4 V_{22}^{-1} + 45 z^2 V_{22}^{-2} - 15 V_{22}^{-3}.$$
PROOF This follows from Theorem 3.1. □
Theorem 4.2 gives a laborious expression for the conditional distribution; Theorem 4.3 then gives a huge simplification.
Theorem 4.2.  
The conditional distribution of $X_{n1|2}$ of (3.1) is given by Theorem 3.2, with $\Lambda, \sigma, \gamma_s$ of Theorem 3.4, as follows. For $k - r$ even, $g_{rk}$ of (3.27) is given by
$$g_{rk} = \sum_{b=0}^{k} P_r(k-b, b) I_{k-b,b},$$
where $I_{ab}$ of (4.2) is given for $a + b = k$ as follows, in terms of $\Lambda_i = V^{i1}$:
$$K_{ab} = \Lambda_1^a \Lambda_2^b \sigma^{a+b} \gamma_{a+b}, \quad \text{and} \quad J_{k0} = \sum_{s=0}^{k} \binom{k}{s} \alpha_1^{k-s} K_{s0}, \quad \text{for } K_{s0} = (\Lambda_1 \sigma)^s \gamma_s.$$
For $k = 1$: $I_{10} = J_{10} = \alpha_1 \gamma_0 + \Lambda_1 \sigma \gamma_1$.
For $k = 2$: $I_{20} = J_{20} - \gamma_0 V^{11}$, $I_{11} = J_{11} - \gamma_0 V^{12}$, $J_{11} = \alpha_1 \alpha_2 \gamma_0 + \sigma \gamma_1 \sum_2 \alpha_1 \Lambda_2 + \Lambda_1 \Lambda_2 \sigma^2 \gamma_2$.
For $k = 3$: $I_{30} = J_{30} - 3 J_{10} V^{11}$, $I_{21} = J_{21} - (2 J_{10} V^{12} + J_{01} V^{11})$, $J_{21} = \alpha_1^2 \alpha_2 \gamma_0 + X_{21} + X_{12} + K_{21}$, where $X_{21} = \alpha_1^2 K_{01} + 2 \alpha_1 \alpha_2 K_{10}$, $X_{12} = 2 \alpha_1 K_{11} + \alpha_2 K_{20}$.
For $k = 4$: $I_{40} = J_{40} - 6 J_{20} V^{11} + 3 \gamma_0 (V^{11})^2$; $I_{31} = J_{31} - S_6 + \gamma_0 S_3$, where $S_6 = 3 J_{20} V^{12} + 3 J_{11} V^{11}$, $S_3 = 3 V^{11} V^{12}$, $J_{31} = \alpha_1^3 \alpha_2 \gamma_0 + X_{31} + X_{22} + X_{13} + K_{31}$, where $X_{31} = \alpha_1^3 K_{01} + 3 \alpha_1^2 \alpha_2 K_{10}$, $X_{22} = 3 \alpha_1^2 K_{11} + 3 \alpha_1 \alpha_2 K_{20}$, $X_{13} = 3 \alpha_1 K_{21} + \alpha_2 K_{30}$; $I_{22} = J_{22} - S_6 + \gamma_0 S_3$, where $S_6 = J_{20} V^{22} + 4 J_{11} V^{12} + J_{02} V^{11}$, $S_3 = \mu_{1122} = V^{11} V^{22} + 2 (V^{12})^2$, $J_{22} = \alpha_1^2 \alpha_2^2 \gamma_0 + X_{31} + X_{22} + X_{13} + K_{22}$, where $X_{31} = 2 \alpha_1^2 \alpha_2 K_{01} + 2 \alpha_1 \alpha_2^2 K_{10}$, $X_{22} = \alpha_1^2 K_{02} + 4 \alpha_1 \alpha_2 K_{11} + \alpha_2^2 K_{20}$, $X_{13} = 2 \alpha_1 K_{12} + 2 \alpha_2 K_{21}$.
For $k = 5$: $I_{50} = J_{50} - 10 J_{30} V^{11} + 15 J_{10} (V^{11})^2$; $I_{41} = J_{41} - S_{10} + S_{15}$, where $S_{10} = 6 J_{21} V^{11} + 4 J_{30} V^{12}$, $S_{15} = 12 J_{10} V^{11} V^{12} + 3 J_{01} (V^{11})^2$, $J_{41} = \alpha_1^4 \alpha_2 \gamma_0 + X_{41} + X_{32} + X_{23} + X_{14} + K_{41}$, where $X_{41} = 4 \alpha_1^3 \alpha_2 K_{10} + \alpha_1^4 K_{01}$, $X_{32} = 6 \alpha_1^2 \alpha_2 K_{20} + 4 \alpha_1^3 K_{11}$, $X_{23} = 6 \alpha_1^2 K_{21} + 4 \alpha_1 \alpha_2 K_{30}$, $X_{14} = 4 \alpha_1 K_{31} + \alpha_2 K_{40}$; $I_{32} = J_{32} - S_{10} + S_{15}$, where $S_{10} = 3 J_{12} V^{11} + 6 J_{21} V^{12} + J_{30} V^{22}$, $S_{15} = 3 J_{10} \mu_{1122} + 6 J_{01} V^{11} V^{12}$, $J_{32} = \alpha_1^3 \alpha_2^2 \gamma_0 + X_{41} + X_{32} + X_{23} + X_{14} + K_{32}$, where $X_{41} = 3 \alpha_1^2 \alpha_2^2 K_{10} + 2 \alpha_1^3 \alpha_2 K_{01}$, $X_{32} = 3 \alpha_1 \alpha_2^2 K_{20} + 6 \alpha_1^2 \alpha_2 K_{11} + \alpha_1^3 K_{02}$, $X_{23} = 3 \alpha_1^2 K_{12} + 6 \alpha_1 \alpha_2 K_{21} + \alpha_2^2 K_{30}$, $X_{14} = 3 \alpha_1 K_{22} + 2 \alpha_2 K_{31}$.
For $k = 6$: $I_{60} = J_{60} - 15 J_{40} V^{11} + 45 J_{20} (V^{11})^2 - 15 \gamma_0 (V^{11})^3$; $I_{51} = J_{51} - S_{15} + S_{45} - \gamma_0 S'_{15}$, where $S_{15} = 5 V^{12} J_{40} + 10 V^{11} J_{31}$, $S_{45} = 30 V^{11} V^{12} J_{20} + 15 (V^{11})^2 J_{11}$, $S'_{15} = 15 (V^{11})^2 V^{12}$, $J_{51} = \alpha_1^5 \alpha_2 \gamma_0 + X_{51} + X_{42} + X_{33} + X_{24} + X_{15} + K_{51}$, where $X_{51} = \alpha_1^5 K_{01} + 5 \alpha_1^4 \alpha_2 K_{10}$, $X_{42} = 5 \alpha_1^4 K_{11} + 10 \alpha_1^3 \alpha_2 K_{20}$, $X_{33} = 10 \alpha_1^3 K_{21} + 10 \alpha_1^2 \alpha_2 K_{30}$, $X_{24} = 10 \alpha_1^2 K_{31} + 5 \alpha_1 \alpha_2 K_{40}$, $X_{15} = \alpha_2 K_{50} + 5 \alpha_1 K_{41}$; $I_{42} = J_{42} - S_{15} + S_{45} - \gamma_0 S'_{15}$, where $S_{15} = V^{22} J_{40} + 6 V^{11} J_{22} + 8 V^{12} J_{31}$, $S_{45} = 3 (V^{11})^2 J_{02} + 6 \mu_{1122} J_{20} + 24 V^{11} V^{12} J_{11}$, $S'_{15} = 3 (V^{11})^2 V^{22} + 12 V^{11} (V^{12})^2$, $J_{42} = \alpha_1^4 \alpha_2^2 \gamma_0 + X_{51} + X_{42} + X_{33} + X_{24} + X_{15} + K_{42}$, where $X_{51} = 2 \alpha_1^4 \alpha_2 K_{01} + 4 \alpha_1^3 \alpha_2^2 K_{10}$, $X_{42} = \alpha_1^4 K_{02} + 8 \alpha_1^3 \alpha_2 K_{11} + 6 \alpha_1^2 \alpha_2^2 K_{20}$, $X_{33} = 4 \alpha_1^3 K_{12} + 12 \alpha_1^2 \alpha_2 K_{21} + 4 \alpha_1 \alpha_2^2 K_{30}$, $X_{24} = \alpha_2^2 K_{40} + 8 \alpha_1 \alpha_2 K_{31} + 6 \alpha_1^2 K_{22}$, $X_{15} = 2 \alpha_2 K_{41} + 4 \alpha_1 K_{32}$;
$I_{33} = J_{33} - S_{15} + S_{45} - \gamma_0 S'_{15}$, where $S_{15} = 3 V^{11} J_{13} + 9 V^{12} J_{22} + 3 V^{22} J_{31}$, $S_{45} = 9 V^{12} V^{22} J_{20} + 9 \mu_{1122} J_{11} + 9 V^{11} V^{12} J_{02}$, $S'_{15} = 6 V^{11} V^{12} V^{22} + 3 V^{12} \mu_{1122}$, $J_{33} = \alpha_1^3 \alpha_2^3 \gamma_0 + X_{51} + X_{42} + X_{33} + X_{24} + X_{15} + K_{33}$, where $X_{51} = 3 \alpha_1^2 \alpha_2^3 K_{10} + 3 \alpha_1^3 \alpha_2^2 K_{01}$, $X_{42} = 3 \alpha_1 \alpha_2^3 K_{20} + 9 \alpha_1^2 \alpha_2^2 K_{11} + 3 \alpha_1^3 \alpha_2 K_{02}$, $X_{33} = \alpha_1^3 K_{03} + 9 \alpha_1^2 \alpha_2 K_{12} + 9 \alpha_1 \alpha_2^2 K_{21} + \alpha_2^3 K_{30}$, $X_{24} = 3 \alpha_1^2 K_{13} + 9 \alpha_1 \alpha_2 K_{22} + 3 \alpha_2^2 K_{31}$, $X_{15} = 3 \alpha_1 K_{23} + 3 \alpha_2 K_{32}$.
Also, $J_{ba}, I_{ba}$ are $J_{ab}, I_{ab}$ with $\alpha_1, \Lambda_1 = V^{11}$ and $\alpha_2, \Lambda_2 = V^{21}$ of (3.33) reversed, before setting $\alpha_1 = 0$ and $\alpha_2 = z = V_{22}^{-1} x_2$ by (3.13). For example, by (4.9), for $\Lambda, \sigma, \gamma_s$ of Theorem 3.4,
$$I_{10} = \alpha_1 \gamma_0 + \Lambda_1 \sigma \gamma_1 = V^{11} \sigma \gamma_1, \quad I_{01} = \alpha_2 \gamma_0 + \Lambda_2 \sigma \gamma_1 = z \gamma_0 + V^{21} \sigma \gamma_1,$$
[display equation not recoverable from this extraction]
PROOF This follows from Theorems 3.3 and 3.4. □
This gives the $I_{ab}$ needed for $g_1, g_2, G_1, G_2$ for the conditional distribution of (3.23)–(3.25) to $O(n^{-3/2})$. The $I_{ab}$ needed for $g_3, G_3$ can be written down similarly. We now give a much simpler method for obtaining $g_{rk}$ of (3.27), and so $g_r$ by (3.26), and the $G_r$ needed for (3.23) by (3.24). Theorem 4.3 gives $g_{rk}$ and $g_r$ in terms of $I_{0k}$ of (4.2). Theorem 4.4 gives $I_{0k}$ in terms of $J_{0k}$ of (4.11), a function of $(\Lambda, \sigma, \gamma_s)$ of Theorem 3.4.
Theorem 4.3.  
For $v$ of (3.22), $I_{ab}$ of (4.2) is given by
$$I_{ab} = -H_{a-1,b}\, \sigma^{-1} \phi(v), \quad \text{for } a \ge 1.$$
For $k \ge r \ge 1$ and $k - r$ even, $g_{rk}$ of (3.27) is given by
$$g_{rk} = P_r(0k) I_{0k} - b_{rk}\, \sigma^{-1} \phi(v), \quad \text{for } b_{rk} = \sum_{a=1}^{k} P_r(a, k-a) H_{a-1,k-a}.$$
So by (3.26), for $r \ge 1$, $g_r$ of (3.25) is given by
$$g_r = \sum_{k=1}^{3r} [P_r(0k) I_{0k} - b_{rk}\, \sigma^{-1} \phi(v) : k - r \text{ even}].$$
PROOF. By (4.8), $g_{rk} = P_r(0k) I_{0k} + \sum_{a=1}^{k} P_r(a, k-a) I_{a,k-a}$.
By (3.9), $\phi_{1|2}(x_1)/\phi_V(x) = \theta^{-1}$, where $\theta = \phi_{V_{22}}(x_2)$ and $\phi_{1|2}(x_1) = \sigma^{-1} \phi(v)$. So, for $a \ge 1$,
$$H_{ab}\, \phi_V(x) = (-\partial_1)^a (-\partial_2)^b \phi_V(x) = -\partial_1 [H_{a-1,b}\, \phi_V(x)], \quad \text{so} \quad I_{ab} = \theta^{-1} \int_{-\infty}^{x_1} H_{ab}\, \phi_V(x)\, dx_1 = -\theta^{-1} H_{a-1,b}\, \phi_V(x) = -H_{a-1,b}\, \sigma^{-1} \phi(v).$$
This proves (4.12). So,
$$g_{rk} = P_r(0k) I_{0k} - \theta^{-1} \phi_V(x) \sum [P_r(ab) H_{a-1,b} : a + b = k,\ a \ge 1].$$
(4.13) follows, and (4.14) now follows from (3.14). □
Note 4.1. $b_{rk}$ is just $\tilde{p}_{rk}$ of (4.3) with $(H_{0b}, H_{ab})$ replaced by $(0, H_{a-1,b})$ for $a \ge 1$.
So for $r = 1, 2, 3$, $b_{rk}$ is given in terms of the $P_r(\cdot)$ of Section 3 by [display equation not recoverable from this extraction] This gives $g_{rk}$ and $g_r$ of (3.26) for $r \le 3$, and so the conditional distribution $P_{1|2}(x_1)$ of (3.23), to $O(n^{-2})$, in terms of $I_{0k}$ of (4.2) and the coefficients $P_r(ab)$.
Theorem 4.4.  
The $I_{0k}$ needed for $g_1, g_2, g_3$ of (4.14) and (3.24) are given in terms of $\gamma_0 = \Phi(v)$, $v$ of (3.22), and $J_{0k}$ of (4.11), by
$$I_{01} = J_{01}, \quad I_{02} = J_{02} - \gamma_0 V^{22}, \quad I_{03} = J_{03} - 3 J_{01} V^{22}, \quad I_{04} = J_{04} - 6 J_{02} V^{22} + 3 \gamma_0 (V^{22})^2,$$
$$I_{05} = J_{05} - 10 J_{03} V^{22} + 15 J_{01} (V^{22})^2, \quad I_{06} = J_{06} - 15 J_{04} V^{22} + 45 J_{02} (V^{22})^2 - 15 \gamma_0 (V^{22})^3,$$
$$I_{07} = J_{07} - 21 J_{05} V^{22} + 105 J_{03} (V^{22})^2 - 105 J_{01} (V^{22})^3, \quad I_{08} = J_{08} - 28 J_{06} V^{22} + 210 J_{04} (V^{22})^2 - 420 J_{02} (V^{22})^3 + 105 \gamma_0 (V^{22})^4,$$
$$I_{09} = J_{09} - 36 J_{07} V^{22} + 378 J_{05} (V^{22})^2 - 1260 J_{03} (V^{22})^3 + 945 J_{01} (V^{22})^4.$$
PROOF. For $k \le 6$, the $I_{0k}$ follow from Theorem 3.2. By the proof of Theorem 3.3, $I_{0k}$ can be read off [30] and the univariate Hermite polynomials $H_k(u)$, given in terms of $I = \sqrt{-1}$ by expanding
$$H_k = H_k(u) = \phi(u)^{-1} (-d/du)^k \phi(u) = E(u + IN)^k, \quad k \ge 0. \quad \Box$$
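The identity $H_k(u) = E(u + IN)^k$ can be illustrated numerically in the unit-variance case, where $H_k$ reduces to the probabilists' Hermite polynomial $\mathrm{He}_k$ (a Monte Carlo sketch of my own, with arbitrary $u$ and $k$):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

rng = np.random.default_rng(2)
Z = rng.standard_normal(1_000_000)
u, k = 0.7, 4
mc = np.mean((u + 1j * Z) ** k).real      # E(u + I N)^k by simulation, I = sqrt(-1)
exact = hermeval(u, [0] * k + [1])        # He_4(u) = u^4 - 6 u^2 + 3
print(mc, exact)
```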
To summarise, the conditional density of X n 1 2 of (3.1), is given by Theorem 4.1, and the conditional distribution is given by (3.23), (3.27) in terms of g r of (4.14) and I 0 k of Theorem 4.4.
Example 4.1.  
Conditioning when $\hat{w} \in \mathbb{R}^2$ is the mean of a sample with cumulants $\kappa_{ab}$. The non-zero $P_r(ab)$ were given in Example 6 of [30]. So $b_{rk} = 0$ for $(rk) = (11), (22), (31), (33)$, and for $r = 1, 2, 3$ the other $b_{rk}$ are given by (4.15)–(4.18), starting
$$6 b_{13} = \kappa_{30} H_{20} + 3 \kappa_{21} H_{11} + 3 \kappa_{12} H_{02},$$
[display equation not recoverable from this extraction] The relative conditional density is given to $O(n^{-2})$ by (3.19) in terms of $\tilde{p}_r$ of (2.6), $\tilde{p}_{rk}$ of (4.3), $f_r = p_r^*(x_2)$ of (3.14) for $r \le 3$, and $H_k^*$ of (4.4) for $k \le 9$. So,
$$f_1 = p_{13}^* = P_1(03) H_3^*, \quad P_1(03) = \kappa_{03}/6; \quad f_2 = p_{24}^* + p_{26}^*, \quad p_{24}^* = P_2(04) H_4^*, \quad P_2(04) = \kappa_{04}/24, \quad p_{26}^* = P_2(06) H_6^*, \quad P_2(06) = \kappa_{03}^2/72;$$
$$f_3 = \sum_{k=5,7,9} p_{3k}^*, \quad p_{3k}^* = P_3(0k) H_k^*, \quad P_3(05) = \kappa_{05}/120, \quad P_3(07) = \kappa_{04} \kappa_{03}/144, \quad P_3(09) = (\kappa_{03}/6)^3.$$
The conditional distribution is given by (3.38) with $g_r$ of (4.14), starting
$$G_0 = g_0 = \Phi(v), \quad g_1 = \kappa_{03} I_{03}/6 - b_{13}\, \sigma^{-1} \phi(v), \quad \text{for } v \text{ of (3.22)}, \quad \text{with } \sigma^2 = V_0 = \kappa_{20} - \kappa_{11}^2/\kappa_{02}, \quad \mu_{1|2} = \kappa_{11} \kappa_{02}^{-1} x_2,$$
$I_{03}$ of Theorem 4.4, and $b_{13}$ of (4.19). As noted, this is a far simpler result than using Theorem 4.2. Similarly,
$$g_2 = \kappa_{04} I_{04}/24 + \kappa_{03}^2 I_{06}/72 - (b_{24} + b_{26})\, \sigma^{-1} \phi(v), \quad g_3 = \sum_{k=5,7,9} [P_3(0k) I_{0k} - b_{3k}\, \sigma^{-1} \phi(v)],$$
for $b_{24}, b_{26}$ of (4.20) and (4.21), and $b_{3k}$ above.
Example 4.2.  
We now build on the entangled gamma model of Example 7 of [30], which gave the $P_r(ab)$ needed. Let $G_0, G_1, G_2$ be independent gamma random variables with means $\gamma = \gamma_0, \gamma_1, \gamma_2$. For $i = 1, 2$, set $X_i = G_0 + G_i$, $w_i = E X_i = \gamma + \gamma_i$, and let $\hat{w}$ be the mean of a random sample of size $n$ distributed as $(X_1, X_2)$. So $E\hat{w} = w$, and $n\hat{w} =_{\mathcal{L}} (G_{n0} + G_{n1}, G_{n0} + G_{n2})$, where $G_{n0}, G_{n1}, G_{n2}$ are independent gamma random variables with means $n\gamma, n\gamma_1, n\gamma_2$. The $r$th order cumulants of $X = (X_1, X_2)$ are $\kappa_{i^r} = (r-1)!\, w_i$, and otherwise $(r-1)!\, \gamma$. Now suppose that $\gamma_i \equiv 1$, the entangled exponential model. So $q = 2$, and $X_{n1}$ and $X_{n2}$ have correlation 1/2,
$$V = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad V_{12} V_{22}^{-1} = 1/2, \quad V_{1|2} = 3/2, \quad V^{-1} = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}/3, \quad P(X_{n1} \mid (X_{n2} = x_2) < x_1(x_2, u)) = \Phi(u) + O(n^{-1/2}),$$
for $x_1(x_2, v)$ of (3.11); that is, $x_1(x_2, u) = x_2/2 + (3/2)^{1/2} u$. Figure 4.1 plots the conditional asymptotic quantiles of $X_{n1|2}$, that is, $x_1(x_2, u)$, for $\Phi(u) = .01, .025, .1, .5, .9, .975, .99$. So although labelled as $x_1$ versus $x_2$, the figure can be viewed as showing, to $O(n^{-1/2})$, given $n$ and $\hat{w}$, the likely value of $w_1 = \hat{w}_1 - n^{-1/2} x_1$ for a given value of $w_2 = \hat{w}_2 - n^{-1/2} x_2$. In fact, by (3.12), $X_{n1|2}$ lies between the outer limits with probability $.98 + O(n^{-1})$.
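The quantile lines of Figure 4.1 are straight lines in $x_2$ and are easy to tabulate (a sketch; `cond_quantile` is my own name, and the levels are those listed above):

```python
import numpy as np
from scipy.stats import norm

def cond_quantile(x2, p):
    """Asymptotic conditional quantile x1(x2, u) = x2/2 + sqrt(3/2) * u, with Phi(u) = p."""
    return x2 / 2 + np.sqrt(1.5) * norm.ppf(p)

for p in (0.01, 0.025, 0.1, 0.5, 0.9, 0.975, 0.99):
    print(p, np.round(cond_quantile(np.linspace(-3, 3, 5), p), 3))
```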
We now give $C_r$ of (3.17), $D_r$ of (3.19), $H_k^*$ and $p_{rk}^*$ of (4.4), and the $g_r$ for $G_r$, the coefficients of the expansion for the conditional distribution of (3.23). So,
$$P_1(03) = 2/3, \quad P_1(21) = 1, \quad P_2(04) = 1/2, \quad P_2(31) = 1, \quad P_2(22) = 3/2, \quad P_2(06) = 2/9, \quad P_2(51) = 2/3, \quad P_2(42) = 7/6, \quad P_2(33) = 26/3,$$
$$P_3(05) = 2/5, \quad P_3(41) = 1, \quad P_3(32) = 2, \quad P_3(07) = 1/3, \quad P_3(61) = P_3(52) = 5/2, \quad P_3(43) = 3, \quad P_3(09) = 8/27, \quad P_3(81) = 4/27, \quad P_3(72) = 10/27, \quad P_3(63) = 47/756, \quad P_3(54) = 59/945.$$
By Theorem 4.4, to 3 decimal places,
$$I_{03} = J_{03} - 2 J_{01} = .586, \quad I_{04} = J_{04} - 4 J_{02} + 4\gamma_0/3 = .871, \quad I_{05} = J_{05} - 20 J_{03}/3 + 20 J_{01}/3 = .709, \quad I_{06} = J_{06} - 10 J_{04} + 20 J_{02} - 40\gamma_0/9 = 3.187,$$
$$I_{07} = J_{07} - 14 J_{05} + 140 J_{03}/3 - 280 J_{01}/9 = 12.857, \quad I_{08} = J_{08} - 56 J_{06}/3 + 280 J_{04}/3 - 1120 J_{02}/9 + 560\gamma_0/27 = 12.077,$$
$$I_{09} = J_{09} - 24 J_{07} + 168 J_{05} - 1120 J_{03}/3 + 560 J_{01}/3 = 28.278.$$
By Note 4.1, the $\tilde{p}_{rk}$ of Example 7 of [30], symmetry, and (4.14),
$$b_{13} = 5 H_{20}/3 + H_{11}, \quad b_{24} = 3 H_{40}/2 + 2 H_{31}, \quad b_{26} = [7 H_{50} + 12 H_{41} + 19 H_{32}]/9, \quad b_{35} = [3 H_{40} + 2 H_{31} + H_{22}]/5,$$
$$b_{37} = [9 H_{60} + 19 H_{51} + 30 H_{42} + 18 H_{33}]/6, \quad b_{39} = [44 H_{80} + 83 H_{71} + 206 H_{62} + 159 H_{53}]/27,$$
$$g_1 = 2 I_{03}/3 - b_{13}\, \sigma^{-1} \phi(v), \quad g_2 = I_{04}/2 + 2 I_{06}/9 - (b_{24} + b_{26})\, \sigma^{-1} \phi(v), \quad g_3 = 2 I_{05}/5 + I_{07}/3 + 8 I_{09}/27 - (b_{35} + b_{37} + b_{39})\, \sigma^{-1} \phi(v).$$
Let us work through two numerical examples to get the conditional distribution to $O(n^{-2})$. We build on Example 7 of [30]. By Theorem 4.1, if $x_2 = 1$, then $z = 1/2$,
$$H_3^* = -5/8, \quad H_4^* = -17/16, \quad H_5^* = 41/32, \quad H_6^* = 89/64, \quad H_7^* = -461/2^7, \quad H_9^* = 6481/2^9,$$
$$C_1 = -f_1 = 5/12, \quad p_{24}^* = -17/32, \quad p_{26}^* = 89/288, \quad f_2 = 121/144, \quad p_{35}^* = 41/80, \quad p_{37}^* = -461/384, \quad p_{39}^* = 6481/1728, \quad f_3 = 52921/17280, \quad C_2 = 83/72, \quad C_3 = 39571/17280.$$
We worked to 8 significant figures but display fewer. If $x = (1, 1)$, then
$$D_1 = -113/324 = -.349, \quad D_2 = 120199/(2^3 3^8) = 2.290, \quad D_3 = 8896102087/(2^7 3^{12} 5) = 26.156.$$
So to O ( n 2 ) the relative conditional density of (3.19) for n = 4 , 16 , 64 , is
$$(1,1,1) - (2^{-1}, 4^{-1}, 8^{-1})(.349) + (4^{-1}, 16^{-1}, 64^{-1})(2.290) + (8^{-1}, 64^{-1}, 2^{-9})(26.156)$$
$$= (1 - .174 + .573 + 3.269,\ 1 - .087 + .143 + .409,\ 1 - .044 + .036 + .051),$$
so that for $n = 4$ and 16 we can include only two terms, and for $n = 64$, only three. We now give the first three $g_r, G_r$ needed by (3.23) for the conditional distribution to $O(n^{-2})$. By (3.36),
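The truncation rule used here (stop just before the first term that grows in magnitude) can be sketched in a few lines of Python; the $D_r$ values are those displayed above, with the sign pattern of the display:

```python
# Partial terms of the expansion 1 - n^(-1/2)D1 + n^(-1)D2 + n^(-3/2)D3,
# truncated just before the first term larger in magnitude than its predecessor.

def series_terms(D, n):
    """Terms of 1 - n^(-1/2)*D[0] + n^(-1)*D[1] + n^(-3/2)*D[2]."""
    signs = [-1, 1, 1]  # sign pattern from the display above
    return [1.0] + [s * D[r] * n ** (-(r + 1) / 2) for r, s in zip(range(3), signs)]

def usable_terms(terms):
    """Number of leading terms before divergence sets in (terms start growing)."""
    k = 1
    while k < len(terms) and abs(terms[k]) < abs(terms[k - 1]):
        k += 1
    return k

D = [0.349, 2.290, 26.156]  # D_1, D_2, D_3 for x = (1, 1)
for n in (4, 16, 64):
    t = series_terms(D, n)
    print(n, [round(v, 3) for v in t], usable_terms(t))
```

This reproduces the counts above: two usable terms for $n = 4$ and 16, three for $n = 64$.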
$\sigma^2 = 3/2$, $\sigma = 1.225$. By (3.3), $\mu_{1|2} = x_2/2$.
For $x = (1,1)$, $\mu_{1|2} = 1/2$, and by ( ), $v = 6^{-1/2} = .408$, $G_0 = g_0 = \gamma_0 = \Phi(v) = .658$, $\gamma_1 = \phi(v) = .367$, $\gamma_2 = .509$, $\gamma_3 = .795$, $\gamma_4 = 1.501$, $\gamma_5 = 3.191$, $\gamma_6 = 7.500$, $\gamma_7 = 19.150$, $\gamma_8 = 52.500$, $\gamma_9 = 153.200$. From $K_{0s} = (6^{-1/2})^s \gamma_s$: $K_{01} = .150$, $K_{02} = .0848$, $K_{03} = .0541$, $K_{04} = .0417$, $K_{05} = .0362$, $K_{06} = .03472$, $K_{07} = .0362$, $K_{08} = .0405$, $K_{09} = .0483$. So by ( ), $J_{01} = .479$, $J_{02} = .0203$, $J_{03} = .372$, $J_{04} = .0738$, $J_{05} = .0731$, $J_{06} = .0713$, $J_{07} = .0718$, $J_{08} = .441$, $J_{09} = .269$. So for $x = (1,1)$, $b_{13} = 13/27 = .481$, $g_1 = .246$, $b_{24} = 47/54$, $b_{26} = 2726/2187$, $g_2 = .696$, $b_{35} = 10/27$, $b_{37} = 9371/4374$, $b_{39} = 163806/59049 = 2.774$, $g_3 = 3.375$. By (3.24), $G_1 = g_1 + C_1 g_0 = .0281$, $G_2 = .040$, $G_3 = .762$.
For example for n = 4 , 16 , 64 , to O ( n 2 ) , P ( X n 1 < 1 | X n 2 = 1 ) =
$$.658 + .0141 - .01000 + .0952, \quad n = 4,$$
$$.658 + .00703 - .00250 + .0119, \quad n = 16,$$
$$.658 + .00351 - .000625 + .00149, \quad n = 64,$$
so that divergence begins with the 4th term.
If $x_2 = 2$, then $z = 1$, $H_3^* = 1/2$, $H_4^* = 23/4$, $H_6^* = 23/8$, $H_5^* = 1/4$, $H_7^* = 29/8$, $H_9^* = 175/16$, $C_1 = f_1 = 1/3$, $p_{24}^* = 23/8$, $p_{26}^* = 23/36$, $f_2 = 161/72$, $p_{35}^* = 1/10$, $p_{37}^* = 29/24$, $p_{39}^* = 175/54$, $f_3 = 2303/1080 = 2.132$, $C_2 = 17/8 = 2.125$, $C_3 = 733/1080 = .679$. If $x = (2,2)$, then $D_1 = 37/81 = .457$, $D_2 = .387$, $D_3 = 13.313$.
So to O ( n 2 ) the relative conditional density of (3.19) for n = 4 , 16 , 64 , is
$$(1,1,1) - (2^{-1}, 4^{-1}, 8^{-1})(.457) + (4^{-1}, 16^{-1}, 64^{-1})(.387) + (8^{-1}, 64^{-1}, 2^{-9})(13.313)$$
$$= (1 - .228 + .0969 + 1.664,\ 1 - .114 + .0242 + .208,\ 1 - .0571 + .00605 + .0260),$$
so that we can include only three terms. Finally, we give the first three $g_r, G_r$ needed by (3.23) for the conditional distribution to $O(n^{-2})$.
For $x = (2,2)$, $\mu_{1|2} = 1$, $v = (2/3)^{1/2} = .816$, $G_0 = \gamma_0 = \Phi(v) = .793$, $\gamma_1 = \phi(v) = .286$, $\gamma_2 = .559$, $\gamma_3 = .762$, $\gamma_4 = 1.522$, $\gamma_5 = 3.176$, $\gamma_6 = 7.511$, $\gamma_7 = 19.142$, $\gamma_8 = 52.505$, $\gamma_9 = 153.190$. From $K_{0s} = (6^{-1/2})^s \gamma_s$: $K_{01} = .117$, $K_{02} = .0932$, $K_{03} = .0519$, $K_{04} = .0423$, $K_{05} = .0360$, $K_{06} = .0348$, $K_{07} = .0362$, $K_{08} = .0405$, $K_{09} = .0483$. So by (4.11), $J_{01} = .910$, $J_{02} = 1.0028$, $J_{03} = 1.055$, $J_{04} = 1.097$, $J_{05} = 1.133$, $J_{06} = 1.168$, $J_{07} = 1.204$, $J_{08} = 1.249$, $J_{09} = 1.293$. $I_{03} = .764$, $I_{04} = 1.877$, $I_{05} = 3.877$, $I_{06} = 6.731$, $I_{07} = 6.263$, $I_{08} = 276.110$, $I_{09} = 848.735$, $b_{13} = 11/27$, $g_1 = .605$, $b_{24} = 26/27$, $b_{26} = 1660/2187$, $g_2 = .771$, $b_{35} = 138/405$, $b_{37} = 20128/4374$, $b_{39} = 1795048/3^{10}$, $g_3 = 224.802$. By ( ), $G_1 = g_1 + C_1 g_0 = .0180$, $G_2 = 2.463$, $G_3 = 4.204$.
For example for n = 4 , 16 , 64 , to O ( n 2 ) , P ( X n 1 < 2 | X n 2 = 2 ) =
$$.793 + .00902 - .616 + .525, \quad n = 4,$$
$$.793 + .00451 - .154 + .131, \quad n = 16,$$
$$.793 + .00226 - .0385 + .0164, \quad n = 64,$$
so that divergence begins with the 3rd term.
Example 4.3.  
Conditioning when the distribution of w ^ is symmetric about w. Then for r odd, C r = D r = g r k = g r = 0 . By (3.19), the conditional density is
$$p_{n1|2}(x_1) = \sigma^{-1}\phi(v)\,[1 + n^{-1} D_2 + O(n^{-2})], \quad \text{where } D_2 = \tilde p_2(x) - p_2^*(x_2),$$
for p ˜ 2 ( x ) of Example 1 of [30], H k * of (4.4), and
p 2 * ( x 2 ) = k 022 H 2 * / 2 + k 043 H 4 * / 24 .
By (3.38), the conditional distribution of X n 1 2 is
$$\Phi(v) + n^{-1} G_2 + O(n^{-2}), \quad \text{where } G_2 = g_2 - p_2^*(x_2)\,\Phi(v), \quad g_2 = \sum_{k=2,4} [\,P_2(0k)\, I_2(0k) - b_{2k}\,\sigma^{-1}\phi(v)\,],$$
for b 2 k of (4.16) and (4.17).
Example 4.4.  
Discussions of pivotal statistics advocate using the distribution of a sample mean given the sample variance; so $q = 2$. Let $\hat w_1, \hat w_2$ be the usual unbiased estimates of the mean and variance from a univariate random sample of size $n$ from a distribution with $r$th cumulant $\kappa_r$, so that $w_1 = \kappa_1$, $w_2 = \kappa_2$. By the last two equations of Section 12.15 and (12.35)-(12.38) of [26], the cumulant coefficients needed for $\bar P_r^{1\cdots k}$ of (2.3) for $r \le 3$ (the coefficients needed for the conditional density to $O(n^{-2})$), in terms of $(i_1^{j_1} i_2^{j_2} \cdots) = \kappa_{i_1}^{j_1} \kappa_{i_2}^{j_2} \cdots$, are
$k_{201} = \kappa_2$, $k_{111} = \kappa_3$, $k_{021} = \kappa_4 + 2\kappa_2^2$, so
$$V = \begin{pmatrix} \kappa_2 & \kappa_3 \\ \kappa_3 & \kappa_4 + 2\kappa_2^2 \end{pmatrix},$$
$k_{101} = k_{011} = 0$, $k_{302} = \kappa_3$, $k_{212} = k_{122} = 0$, $k_{032} = (6) + 12(24) + 4(3^2) + 8(2^3)$, $k_{202} = k_{112} = 0$, $k_{022} = 2(2^2)$, $k_{403} = (4)$, $k_{313} = (5)$, $k_{223} = k_{133} = 0$, $k_{043} = (8) + 24(26) + 32(35) + 32(4^2) + 144(2^2 4) + 96(2\,3^2) + 48(2^4)$, $k_{102} = k_{012} = k_{303} = k_{213} = k_{123} = 0$, $k_{033} = 12(24) + 16(2^3)$, $k_{504} = k_{324} = k_{234} = k_{144} = 0$, $k_{414} = (6)$, $k_{054} = (10) + 40(28) + 80(37) + 200(46) + 96(5^2) + 480(2^2 6) + 1280(235) + 1280(2\,4^2) + 960(3^2 4) + 1920(2^3 4) + 1920(2^2 3^2) + 384(2^5)$.
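The bracket notation can be checked mechanically. The following sketch (plain Python, our own function names) evaluates $k_{032}$, $k_{043}$ and $k_{054}$ for a normal population, where all cumulants beyond $\kappa_2$ vanish, so only the pure-$\kappa_2$ brackets survive:

```python
# The bracket (i1^j1 i2^j2 ...) stands for kappa_{i1}^{j1} * kappa_{i2}^{j2} * ...
# Below, k[r] holds kappa_r; the functions transcribe the displayed coefficients.

def k032(k):
    return k[6] + 12*k[2]*k[4] + 4*k[3]**2 + 8*k[2]**3

def k043(k):
    return (k[8] + 24*k[2]*k[6] + 32*k[3]*k[5] + 32*k[4]**2
            + 144*k[2]**2*k[4] + 96*k[2]*k[3]**2 + 48*k[2]**4)

def k054(k):
    return (k[10] + 40*k[2]*k[8] + 80*k[3]*k[7] + 200*k[4]*k[6] + 96*k[5]**2
            + 480*k[2]**2*k[6] + 1280*k[2]*k[3]*k[5] + 1280*k[2]*k[4]**2
            + 960*k[3]**2*k[4] + 1920*k[2]**3*k[4] + 1920*k[2]**2*k[3]**2
            + 384*k[2]**5)

s2 = 2.0                                 # kappa_2 of a N(0, s2) population
kappa = {r: 0.0 for r in range(1, 11)}
kappa[2] = s2                            # all higher cumulants vanish
print(k032(kappa), k043(kappa), k054(kappa))
# For the normal these reduce to 8*s2**3, 48*s2**4, 384*s2**5.
```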
(3.19) gives $D_r$ in terms of $\tilde p_r$ and $p_r^*$, that is, in terms of $\tilde p_{rk}$ and $p_{rk}^*$ of (3.14), in terms of $P_r(ab)$. In this example, many of these are 0. The non-zero $P_r(ab)$ are, in the order needed,
$P_1(30) = \kappa_3/6$, $P_1(03) = k_{032}/6$, $P_2(02) = \kappa_2^2$, $P_2(40) = \kappa_4/24$, $P_2(04) = k_{043}/24$, $P_2(31) = \kappa_5/6$, $P_2(60) = \kappa_3^2/72$, $P_2(06) = k_{032}^2/72$, $P_2(33) = \kappa_3 k_{032}/36$, $P_3(03) = k_{033}/6$, $P_3(05) = k_{054}/120 + k_{022}k_{032}/12$, $P_3(70) = \kappa_3\kappa_4/144$, $P_3(07) = k_{032}k_{043}/144$, $P_3(61) = \kappa_3 k_{313}/36$, $P_3(43) = k_{032}\kappa_4/144$, $P_3(34) = (\kappa_3 k_{043} + k_{022}k_{313})/144$, $P_3(90) = (\kappa_3/6)^3$, $P_3(09) = (k_{032}/6)^3$, $P_3(63) = 3\kappa_3^2 k_{032}/6^3$, $P_3(36) = 3\kappa_3 k_{032}^2/6^3$. So, $\tilde p_{11} = 0$, $\tilde p_{13} = P_1(30)H_{30} + P_1(03)H_{03}$, $\tilde p_{22} = P_2(02)H_{02}$, $\tilde p_{24} = P_2(40)H_{40} + P_2(04)H_{04} + P_2(31)H_{31}$, $\tilde p_{26} = P_2(60)H_{60} + P_2(06)H_{06} + P_2(33)H_{33}$, $\tilde p_{31} = 0$, $\tilde p_{33} = P_3(30)H_{30} + P_3(03)H_{03}$, $\tilde p_{35} = P_3(05)H_{05}$, $\tilde p_{37} = P_3(70)H_{70} + P_3(61)H_{61} + P_3(43)H_{43} + P_3(34)H_{34} + P_3(25)H_{25} + P_3(16)H_{16} + P_3(07)H_{07}$, $\tilde p_{39} = P_3(90)H_{90} + P_3(63)H_{63} + P_3(36)H_{36} + P_3(09)H_{09}$. Also, $b_{13} = P_1(30)H_{20}$, $b_{22} = P_2(20)H_{10} + P_2(11)H_{01}$, $b_{24} = P_2(40)H_{30} + P_2(31)H_{21}$, $b_{26} = P_2(60)H_{50} + P_2(33)H_{23}$, $b_{31} = 0$, $b_{33} = P_3(30)H_{20}$, $b_{35} = 0$, $b_{37} = P_3(70)H_{60} + P_3(61)H_{51} + P_3(43)H_{33} + P_3(34)H_{24}$, $b_{39} = P_3(90)H_{80} + P_3(63)H_{53} + P_3(36)H_{26}$.
For $r = 1, 2, 3$, $\tilde p_r(x)$ is now given by (2.13), $p_r^*(x)$, and Section 2 of [30]. By (2.4) and (3.19), this gives the conditional density $p_{n1|2}(x_1)$ to $O(n^{-2})$, and (4.14) gives the $g_r$ needed for the conditional distribution $P_{n1|2}(x_1)$ to $O(n^{-2})$.

5. Conclusions

[30] gave the density and distribution of X n = n 1 / 2 ( w ^ w ) to O ( n 2 ) , for w ^ R q any standard estimate, in terms of functions of the cumulant coefficients k ¯ j 1 r of (2.1), called the Edgeworth coefficients, P ¯ r 1 k .
Most estimates of interest are standard estimates, including smooth functions of sample moments, like the sample skewness, kurtosis, correlation, and any multivariate function of k-statistics. (These are unbiased estimates of cumulants and their products, the most common example being that for a variance.) Unbiased estimates are not needed for Edgeworth expansions, although unbiasedness does simplify the Edgeworth coefficients, as seen in Examples 4.1, 4.2 and 4.4. However, unbiased estimates are not available for most parameters or functions of them, such as the ratio of two means or variances, except in special cases of exponential families. [29] gave the cumulant coefficients for smooth functions of standard estimates.
As noted, conditioning is a basic and very useful way to use correlated information to reduce the variability of an estimate. Section 3 gave the conditional density and distribution of $X_{n1}$ given $X_{n2}$ to $O(n^{-2})$, where $(X_{n1}, X_{n2})$ is any partition of $X_n = n^{1/2}(\hat w - w)$. The expansion (3.19) gave the conditional density of any multivariate standard estimate. Our main result, an explicit expansion (3.23) for the conditional distribution to $O(n^{-2})$, is given in terms of the leading $\bar I^{1\cdots k}$ of (3.28). These are given explicitly by Theorems 3.3 and 3.4.
When $q_1 = q_2 = 1$, Theorem 4.1 simplified the conditional density expansion, and Theorem 4.3 gave a major simplification, expressing the coefficients of the conditional distribution expansion in terms of the $I_{0k} = I_2(0k)$ of Theorem 4.4.
Cumulant coefficients can also be used to obtain estimates of bias O ( n k ) for k 2 : see [34,35,37].

6. Discussion

A good approximation to the distribution of an estimate is vital for accurate inference. It enables one to explore the distribution's dependence on the underlying parameters. Our analytic method avoids the need for simulation, jackknife, or bootstrap methods, while providing greater accuracy than any of them. [13] used the Edgeworth expansion to show that the bootstrap gives accuracy to $O(n^{-1})$. [12] said that "2nd order correctness usually cannot be bettered", but this is not true of our analytic method. Simulation, while popular, can at best shine a light on behaviour when the number of parameters is small, and then only over a limited part of their range.
Estimates based on a sample of independent, but not identically distributed, random vectors are also generally standard estimates. For example, for a univariate sample mean $\bar X = n^{-1}\sum_{j=1}^n X_{jn}$ where $X_{jn}$ has $r$th cumulant $\kappa_{rjn}$, we have $\kappa_r(\bar X) = n^{1-r}\bar\kappa_r$, where $\bar\kappa_r = n^{-1}\sum_{j=1}^n \kappa_{rjn}$ is the average $r$th cumulant. For some examples, see [22,23] and [32]. The last is for a function of a weighted mean of complex random matrices. For conditions for the validity of multivariate Edgeworth expansions, see [24] and its references, and Appendix C of [30].
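The averaging identity $\kappa_r(\bar X) = n^{1-r}\bar\kappa_r$ holds because cumulants of independent variables add, and dividing by $n$ scales the $r$th cumulant by $n^{-r}$. A minimal numerical check for $r = 2$ (the variances below are illustrative values only):

```python
# Check kappa_2(Xbar) = n^(1-2) * kappa_bar_2: the variance of the mean of
# independent (not identically distributed) variables is n^(-1) times the
# average variance.
variances = [1.0, 4.0, 2.5, 0.5]      # var(X_jn), j = 1..n (illustrative)
n = len(variances)
var_mean = sum(variances) / n**2      # var(n^-1 * sum_j X_jn), by independence
avg_var = sum(variances) / n          # kappa_bar_2
print(var_mean, n ** (1 - 2) * avg_var)  # the two agree
```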
While the use of Edgeworth-Cornish-Fisher expansions is widespread, few papers address how to deal with their divergence for small sample sizes. [8] and [11] avoided this question, as it did not arise in their examples. In contrast, we confronted it in Example 4.2, in the examples of Withers (1984), and in Example 7 of [30].
We now turn to conditioning. Conditioning on $\hat w_2$ makes inference on $w_1$ more precise by reducing the covariance of the estimate: the covariance of $\hat w_1 \mid \hat w_2$ can be substantially less than that of $\hat w_1$. [3] (pp. 34-36) argues that an ideal choice is a $w_2$ for which the distribution of $\hat w_2$ does not depend on $w_1$, but this is generally not possible outside some exponential families. An example where it holds is when $w_1$ and $w_2$ are location and scale parameters: on p. 54 they essentially suggest choosing $w_2 = n\,\mathrm{var}\,\hat w_1$. This is our motivation for Example 4.4. For some examples, see [2]. Their (7.5) gave a form for the 3rd-order expansion for the conditional density of a sample mean to $O(n^{-3/2})$, but did not attempt to integrate it.
Tilting (also known as small-sample asymptotics, or saddlepoint expansions) was first used in statistics by [9]. He gave an approximation to the density of a sample mean that is good on the whole line, not just in the region where the Central Limit Theorem approximation holds. A conditional distribution by tilting was first given by [25] to $O(n^{-1})$, for a bivariate sample mean. Compare [2]. For some other results on conditional distributions, see [5,10,14,21], [20], Chapter 4 of [6], and [17].
Future directions. 1. The results here give the first step for constructing confidence intervals and confidence regions of higher-order accuracy: see [15] and [28]. What is needed next is an application of [29] to obtain the cumulant coefficients of $\hat\theta_i = \hat V_{ii}^{-1/2}(\hat w_i - w_i)$, $i = 1, 2$, or those of $\hat\theta = \hat V^{-1/2}(\hat w - w)$. This should be straightforward.
2. When $q_1 = 1$, our expansion for the conditional distribution of $X_{n1|2}$ of (3.1) can be inverted using the Lagrange Inversion Theorem, to give expansions for its percentiles. This should be straightforward. (The quantile expansions of [8] and Withers (1984) do not apply, as Appendix A shows that conditional estimates of standard estimates are not standard estimates.)
3. Here we have only considered expansions about the normal. However, expansions about other distributions can greatly reduce the number of terms needed, by matching the leading bias coefficient. The framework for this is [32], building on [15]. For expansions about a matching gamma, see [33,36].
4. The results here can be extended to tilted (saddlepoint) expansions by applying the results of [32]. The tilted version of the multivariate distribution and density of a standard estimate are given by Corollaries 3, 4 there, and that of the conditional distribution and density follow from these. For the entangled gamma of Example 4.2, this requires solving a cubic. See also [16].
5. A possible alternative approach to finding the conditional distribution is to use conditional cumulants, when these can be found. Section 6.2 of [18] uses conditional cumulants to give the conditional density of a sample mean to $O(n^{-3/2})$. Section 5.6 of [19] gave formulas for the first four cumulants conditional on $X_2 = x_2$, but only when $X_1$ and $X_2$ are uncorrelated. He says that this assumption can be removed, but gives no details. This is unlikely to give an alternative to our approach: as well as giving expansions for the first three conditional cumulants, Appendix A shows that the conditional estimate is not a standard estimate.
6. Lastly we discuss numerical computation. We used [27] for our calculations. Its input is $V_{11}, V_{12}, V_{22}$ and $y_1, y_2$, not $V_{11}, V_{12}, V_{22}$ and $x_1, x_2$. There is a function sub2(sb1, sb2) which takes as arguments the two subscripts of $\mu$ and returns its value. If the global variables mu20, mu02, mu11 are symbolic variables (defined using sympy), then it returns the answer in terms of those; if they are numeric, then it returns a numeric answer. There is another function, biHermite(n, m, y1, y2), which takes the two subscripts of $H$: if y1 and y2 are symbolic, it returns a symbolic answer, and if they are numeric, a numeric answer. A numerical example is given by Example 4.2, that is, the case $V_{11} = 2$, $V_{12} = 1$, $V_{22} = 2$ and $x_1 = x_2 = 1$ or $x_1 = x_2 = 2$.
Similar software for numerical calculations for Theorems 4.1, 4.3 and 4.4 would be invaluable, as would software for applying the Lagrange Inversion Theorem. (We mention, for R 4.4.1, the package mvtnorm: dmvnorm for the density of the multivariate normal, pmvnorm for its distribution function, qmvnorm for quantiles, and rmvnorm to generate multivariate normal variables.) On bivariate Hermite polynomials, see cran.r-project.org/web/packages/calculus/vignettes/hermite.html
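For readers without access to [27], a few lines of sympy reproduce bivariate Hermite polynomials from the standard definition $H_{ab}(y;V)\,\phi_V(y) = (-\partial/\partial y_1)^a(-\partial/\partial y_2)^b\,\phi_V(y)$, where $\phi_V$ is the bivariate normal density. The function name and interface below are illustrative sketches, not those of [27]:

```python
import sympy as sp

y1, y2 = sp.symbols('y1 y2')

def bi_hermite(a, b, V):
    """H_ab(y; V) = (-d/dy1)^a (-d/dy2)^b phi_V(y) divided by phi_V(y)."""
    if a + b == 0:
        return sp.Integer(1)
    y = sp.Matrix([y1, y2])
    Q = (y.T * V.inv() * y)[0, 0]
    phi = sp.exp(-Q / 2)                      # normalizing constant cancels
    d = sp.diff(phi, *([y1] * a + [y2] * b))  # repeated partial derivatives
    return sp.simplify((-1) ** (a + b) * d / phi)

V = sp.Matrix([[2, 1], [1, 2]])               # the V of Example 4.2
print(sp.expand(bi_hermite(1, 0, V)))         # 2*y1/3 - y2/3, i.e. (V^-1 y)_1
print(sp.expand(bi_hermite(2, 0, V)))         # (V^-1 y)_1^2 - (V^-1)_11
```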

Appendix A Conditional Moments

Here we give expansions for the conditional moments of $X_{n1|2}$ of (3.1), in terms of the conditional normal moments of $X_{1|2}$ of (3.1). We also show that
w ^ 1 2 = w ^ 1 | ( X n 2 = x 2 )
is neither a standard estimate of w 1 , nor a Type B estimate, as defined below.
Consider the case $q_1 = 1$ and apply (3.5).

Non-central moments.
Theorem A1. 
Take $C_r, D_r$ of Theorem 3.1, and set $\tilde p_0(x) = 1$. For $s > 0$, the $s$th conditional moment of $X_{n1|2}$ of (3.1) about $\Phi_{1|2}(x_1)$ of (3.10) has the expansion
$$m_{ns} = E X_{n1|2}^s = n^{s/2} E(\hat w_{1|2} - w_1)^s \approx \sum_{r=0}^{\infty} n^{-r/2} G_{rs}, \quad \text{where } G_{rs} = \sum_{i=0}^{r} C_i\, g_{r-i,s} \ (C_0 = 1), \quad g_{rs} = E X_{1|2}^s P_r, \quad \text{and } P_r = \tilde p_r(x) \text{ at } x_1 = X_{1|2}.$$
So $G_{0s} = g_{0s} = M_s = E X_{1|2}^s$, and, for $r \ge 1$,
$$g_{rs} = \sum_{k=1}^{3r} [\, g_{rks} : k - r \ \text{even}\,], \quad \text{where, for } \tilde p_{rk} \text{ of (2.6)},$$
$$g_{rks} = E X_{1|2}^s \tilde P_{rk} = \bar P_r^{1\cdots k}\, \bar I_s^{1\cdots k}, \qquad \tilde P_{rk} = \tilde p_{rk} \ \text{at } x_1 = X_{1|2},$$
and for $1 \le i_1, \ldots, i_k \le q$, $\bar I_s^{1\cdots k} = I_s^{i_1 \cdots i_k} = E X_{1|2}^s \bar H^{1\cdots k}(X_{1|2})$, where $\bar H^{1\cdots k}(X_{1|2}) = \bar H^{1\cdots k}$ at $x_1 = X_{1|2}$.
PROOF. This follows from Theorem 3.1. □
So by (A3), the sth conditional moment of X n 1 2 is
m n s = M s + n 1 / 2 G 1 s + n 1 G 2 s + O ( n 3 / 2 ) , where G 1 s = C 1 M s + g 1 s , G 2 s = C 2 M s + C 1 g 1 s + g 2 s , g 1 s = g 11 s + g 13 s , g 2 s = g 22 s + g 24 s + g 26 s ,
of (A5) and (A6). For example,
$$g_{11s} = \bar k_1^1\, E X_{1|2}^s \bar H^1(X_{1|2}), \quad \text{and} \quad g_{13s} = \bar k_2^{1\cdots 3}\, E X_{1|2}^s \bar H^{1\cdots 3}(X_{1|2})/6.$$
So $\hat w_{1|2}$ of (A1) is not a standard estimate, as by (A3) the expansion for its mean is a power series in $n^{-1/2}$, not $n^{-1}$. Is it a Type B estimate? These are defined as for a standard estimate, but with cumulant expansions being series in $n^{-1/2}$, not $n^{-1}$. We shall see. Take $q_2 = q_1 = 1$. By Theorem 4.2, for $P_r(ab)$ of (2.3), $g_{rks}$ of (A4) is given by
$$g_{rks} = \sum_{b=0}^{k} P_r(k-b, b)\, I_{k-b,b}^s, \quad \text{where } I_{ab}^s = E X_{1|2}^s H_{ab}(X_{1|2}), \ \text{and } H_{ab}(X_{1|2}) = H_{ab} \ \text{at } x_1 = X_{1|2}.$$
For example,
$$g_{r1s} = P_r(10)\, I_{10}^s + P_r(01)\, I_{01}^s, \qquad g_{r3s} = P_r(30)\, I_{30}^s + P_r(21)\, I_{21}^s + P_r(12)\, I_{12}^s + P_r(03)\, I_{03}^s.$$
Finding the  I a b s .
The H a b needed are given in Appendix B of [30] in terms of
$$y = V^{-1}x: \quad y_i = V^{i1}x_1 + V^{i2}x_2, \quad \text{that is,} \quad y_1 = \mu_{20}x_1 + \mu_{11}x_2, \quad y_2 = \mu_{11}x_1 + \mu_{02}x_2,$$
writing $\mu_{jk}$ for the elements of $V^{-1}$.
For example,
$$H_{10} = y_1 = \mu_{20}x_1 + \mu_{11}x_2, \qquad H_{01} = y_2 = \mu_{11}x_1 + \mu_{02}x_2,$$
$$H_{30} = y_1^3 - 3y_1\mu_{20} = (\mu_{20}x_1 + \mu_{11}x_2)^3 - 3(\mu_{20}x_1 + \mu_{11}x_2)\mu_{20},$$
$$H_{03} = y_2^3 - 3y_2\mu_{02} = (\mu_{11}x_1 + \mu_{02}x_2)^3 - 3(\mu_{11}x_1 + \mu_{02}x_2)\mu_{02},$$
$$H_{21} = y_2(y_1^2 - \mu_{20}) - 2y_1\mu_{11}, \qquad H_{12} = y_1(y_2^2 - \mu_{02}) - 2y_2\mu_{11}.$$
Let us write $H_{ab}$ in terms of $M_s$ of (A2), as
$$H_{ab} = \sum_{k=0}^{a+b} H_{ab}^k\, x_1^k. \quad \text{Then } I_{ab}^s = \sum_{k=0}^{a+b} [\, H_{ab}^k M_{s+k} : s+k \ \text{even}\,].$$
So $I_{10}^s = H_{10}^0 M_s + H_{10}^1 M_{s+1}$ and $I_{01}^s = H_{01}^0 M_s + H_{01}^1 M_{s+1}$: for odd $s$, $I_{10}^s = H_{10}^1 M_{s+1} = \mu_{20} M_{s+1}$, $I_{01}^s = H_{01}^1 M_{s+1} = \mu_{11} M_{s+1}$; and for even $s$, $I_{10}^s = H_{10}^0 M_s = \mu_{11}x_2 M_s$, $I_{01}^s = H_{01}^0 M_s = \mu_{02}x_2 M_s$.
So , H 10 0 = μ 11 x 2 , H 10 1 = μ 20 , H 01 0 = μ 02 x 2 , H 01 1 = μ 11 , H 30 0 = ( μ 11 x 2 ) 3 3 μ 11 x 2 μ 20 , H 30 1 = 3 μ 20 [ ( μ 11 x 2 ) 2 μ 20 ] , H 30 2 = 3 μ 20 2 μ 11 x 2 , H 30 3 = μ 20 3 , H 03 0 = ( μ 02 x 2 ) 3 3 μ 02 x 2 μ 20 , H 03 1 = 3 μ 11 [ ( μ 02 x 2 ) 2 μ 11 ] , H 03 2 = 3 μ 11 2 μ 02 x 2 , H 03 3 = μ 11 3 , H 21 0 = μ 02 x 2 [ ( μ 11 x 2 ) 2 μ 20 ] 2 μ 11 2 x 2 , H 21 1 = μ 11 [ ( μ 11 x 2 ) 2 μ 20 ] + 2 μ 20 μ 11 ( μ 02 x 2 2 1 ) , H 21 2 = μ 20 μ 22 x 2 since μ 22 = μ 20 μ 02 + 2 μ 11 2 , H 21 3 = μ 11 μ 20 2 , H 12 0 = μ 11 x 2 [ ( μ 02 x 2 ) 2 μ 02 ] 2 μ 02 μ 11 x 2 , H 12 1 = μ 20 [ ( μ 02 x 2 ) 2 μ 02 ] + 2 μ 11 2 ( μ 02 x 2 2 1 ) , H 12 2 = μ 11 μ 22 x 2 , H 12 3 = μ 20 μ 11 2 .
To get a general formula for H a b k , set
$$c_1 = V^{11}x_1, \quad c_2 = iV^{11}X_1, \quad c_3 = V^{12}x_2, \quad c_4 = iV^{12}X_2. \quad \text{So } y_1 + iY_1 = c_1 + c_2, \quad y_2 + iY_2 = c_3 + c_4,$$
$$H_{ab} = E(c_1 + c_2)^a (c_3 + c_4)^b = \sum_{j=0}^{a} \binom{a}{j} c_1^{a-j} \sum_{k=0}^{b} \binom{b}{k} c_3^{b-k} C_{jk}, \quad \text{where } C_{jk} = E c_2^j c_4^k = i^{j+k} (V^{11})^j (V^{12})^k \mu_{jk}', \quad \mu_{jk}' = E X_1^j X_2^k. \quad \text{So } H_{ab}^{a-j} = \binom{a}{j} (V^{11})^{a-j} \sum_{k=0}^{b} \binom{b}{k} c_3^{b-k} C_{jk},$$
where $C_{jk} = 0$ if $j + k$ is odd; $\mu_{jk}'$ is just $\mu_{jk}$ of Appendix B of [30] with $V$ replaced by $V^{-1}$.
Central moments. Set m s ( X ) = E X s and μ s ( X ) = E ( X E X ) s .
For m s = m s 1 2 of (A3), set
$$\mu_s = \mu_{s\,1|2} = \mu_s(X_{n1|2}) = n^{s/2}\mu_s(\hat w_{1|2}).$$
So by ( ), $\mu_2 = m_2 - m_1^2 \approx \sum_{r=0}^{\infty} n^{-r/2}\mu_{2r}$, where $\mu_{2r} = G_{r2} - \sum_{i+j=r} G_{i1}G_{j1}$, and $\mu_3 = m_3 - 3m_1m_2 + 2m_1^3 \approx \sum_{r=0}^{\infty} n^{-r/2}\mu_{3r}$, where $\mu_{3r} = G_{r3} - 3\sum_{i+j=r} G_{i1}G_{j2} + 2\sum_{i+j+k=r} G_{i1}G_{j1}G_{k1}$.
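The passage from non-central to central coefficients is plain series convolution. A minimal sketch (illustrative coefficient values only):

```python
# Convert non-central coefficient series G[r][s] into central-moment
# coefficients mu_2r = G_r2 - sum_{i+j=r} G_i1 * G_j1, by series convolution.

def conv(a, b, r):
    """Coefficient of n^(-r/2) in the product of two series a and b."""
    return sum(a[i] * b[r - i] for i in range(r + 1))

def mu2_coeffs(G1, G2):
    """mu_2r for r = 0..len-1, from the series coefficients of m_1 and m_2."""
    return [G2[r] - conv(G1, G1, r) for r in range(len(G1))]

# Self-check on an exact case: if m_2 = m_1^2 exactly, then mu_2 = 0,
# so every mu_2r must vanish.
G1 = [1.0, 1.0, 0.0]
G2 = [conv(G1, G1, r) for r in range(3)]
print(mu2_coeffs(G1, G2))  # [0.0, 0.0, 0.0]
```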
Is the conditional estimate w ^ 1 2 a Type B estimate? This requires its rth cumulant to have magnitude O ( n 1 r ) for r 1 . This is true for r = 1 and 2 but not for r = 3 , as μ r ( w ^ 1 2 ) has magnitude O ( n r / 2 ) , since μ s 1 2 = O ( 1 ) .

References

  1. Anderson, T. W. (1958) An introduction to multivariate analysis. John Wiley, New York.
  2. Barndorff-Nielsen, O.E. and Cox, D.R. (1989). Asymptotic techniques for use in statistics. Chapman and Hall, London.
  3. Barndorff-Nielsen, O.E. and Cox, D.R. (1994). Inference and asymptotics. Chapman and Hall, London.
  4. Bhattacharya, R.N. and Rao, Ranga R. (2010). Normal approximation and asymptotic expansions, SIAM edition.
  5. Booth, J., Hall, P. and Wood, A. (1992) Bootstrap estimation of conditional distributions. Annals Statistics, 20 (3), 1594–1610. [CrossRef]
  6. Butler, R.W. (2007) Saddlepoint approximations with applications, pp. 107–144, Cambridge University Press. [CrossRef]
  7. Comtet, L. Advanced Combinatorics; Reidel: Dordrecht, The Netherlands, 1974.
  8. Cornish, E.A. and Fisher, R. A. (1937) Moments and cumulants in the specification of distributions. Rev. de l’Inst. Int. de Statist. 5, 307–322. Reproduced in the collected papers of R.A. Fisher, 4. [CrossRef]
  9. Daniels, H.E. (1954) Saddlepoint approximations in statistics. Ann. Math. Statist. 25, 631–650.
  10. DiCiccio, T.J., Martin, M.A. and Young, G.A. (1993) Analytical approximations to conditional distribution functions. Biometrika, 80 4, 781–790.
  11. Fisher, R. A. and Cornish, E.A. (1960) The percentile points of distributions having known cumulants. Technometrics, 2, 209–225. [CrossRef]
  12. Hall, P. (1988) Rejoinder: Theoretical comparison of bootstrap confidence intervals. Annals Statistics, 16 (3), 981–985.
  13. Hall, P. (1992) The bootstrap and Edgeworth expansion. Springer, New York.
  14. Hansen, B.E. (1994) Autoregressive conditional density estimation. International Economic Review, 35 (3), 705–730. [CrossRef]
  15. Hill, G.W. and Davis, A.W. (1968) Generalised asymptotic expansions of Cornish-Fisher type. Ann. Math. Statist., 39, 1264–1273. [CrossRef]
  16. Jing, B. and Robinson, J. (1994) Saddlepoint approximations for marginal and conditional probabilities of transformed variables. Ann. Statist., 22, 1115–1132. [CrossRef]
  17. Kluppelberg, C. and Seifert, M.I. (2020) Explicit results on conditional distributions of generalized exponential mixtures. Journal Applied Prob., 57 3, 760–774. [CrossRef]
  18. McCullagh, P. (1984) Tensor notation and cumulants of polynomials. Biometrika, 71 (3), 461–476.
  19. McCullagh, P., (1987) Tensor methods in statistics. Chapman and Hall, London.
  20. Moreira, M.J. (2003) A conditional likelihood ratio test for structural models. Econometrica, 71 (4), 1027–1048. [CrossRef]
  21. Pfanzagl, P. (1979). Conditional distributions as derivatives. Annals Probability, 7 (6), 1046–1050.
  22. Skovgaard, I.M. (1981a) Edgeworth expansions of the distributions of maximum likelihood estimators in the general (non i.i.d.) case. Scand. J. Statist., 8, 227-236.
  23. Skovgaard, I. M. (1981b) Transformation of an Edgeworth expansion by a sequence of smooth functions. Scand. J. Statist., 8, 207-217.
  24. Skovgaard, I. M. (1986) On multivariate Edgeworth expansions. Int. Statist. Rev., 54, 169–186.
  25. Skovgaard, I.M. (1987) Saddlepoint expansions for conditional distributions, Journal of Applied Prob., 24 (4), 875–887. [CrossRef]
  26. Stuart, A. and Ord, K. (1991). Kendall’s advanced theory of statistics, 2. 5th edition. Griffin , London.
  27. Teal, P. (2024) A code to calculate bivariate Hermite polynomials. https://github.com/paultnz/bihermite/blob/main/hermite8.py.
  28. Withers, C.S. (1989) Accurate confidence intervals when nuisance parameters are present. Comm. Statist. - Theory and Methods, 18, 4229–4259. [CrossRef]
  29. Withers, C.S. (2024) 5th-Order multivariate Edgeworth expansions for parametric estimates. Mathematics, 12,905, Advances in Applied Prob. and Statist. Inference. https://www.mdpi.com/2227-7390/12/6/905/pdf.
  30. Withers, C.S. (2025) Edgeworth coefficients for standard multivariate estimates. New Perspectives in Mathematical Statistics, 2nd Edition. Axioms 2025.
  31. Withers, C.S. and Nadarajah, S.N. (2009) Charlier and Edgeworth expansions via Bell polynomials. Probability and Mathematical Statistics, 29, 271–280.
  32. Withers, C.S. and Nadarajah, S. (2010) Tilted Edgeworth expansions for asymptotically normal vectors. Annals of the Institute of Statistical Mathematics, 62 (6), 1113–1142. [CrossRef]
  33. Withers, C.S. and Nadarajah, S. (2011) Generalized Cornish-Fisher expansions. Bull. Brazilian Math. Soc., New Series, 42 (2), 213–242. DOI: 10.1007/s00574-011-0012-9. Some typos: p. 217, line 7: replace "stem" by "step". p. 220: replace the first two words "That is," by "Suppose now that". p. 220, line 6: after "replace", insert "$Y_n$ by $-Y_n$,". p. 226: replace lines 5–7, "Suppose that ... This is", as follows: "Suppose that for $\nu$ in $N^p$ and $|\nu| = \sum_{j=1}^p \nu_j$, $l_\nu = n^{a(|\nu|)}\lambda_\nu$ satisfies $l_\nu = O(1)$, where $a(r) = r/2 - I(r \ge 3)$ as $n \to \infty$. (7.3) This is". p. 226: replace $\kappa_r$ on the LHS of the 4th displayed equation by $k_r$. p. 226: replace $k_r$ on the RHS of the 6th displayed equation by $K_r$. p. 227: replace $r$ in (7.5) and the following equation by $|\nu|$. p. 227: replace "variance" in (7.6) by "covariance".
  34. Withers, C.S. and Nadarajah, S. (2012) Nonparametric estimates of low bias. REVSTAT Statistical Journal, 10 (2), 229–283.
  35. Withers, C.S. and Nadarajah, S. (2014a) Bias reduction: The delta method versus the jackknife and the bootstrap. Pakistan Journal of Statist., 30 (1), 143–151.
  36. Withers, C.S. and Nadarajah, S. (2014b) Expansions about the gamma for the distribution and quantiles of a standard estimate. Methodology and Computing in Applied Prob., 16 (3), 693-713. DOI 10.1007/s11009-013-9328-9 For typos, see p25–26 of Withers (2024). [CrossRef]
  37. Withers, C.S. and Nadarajah, S. (2023) Bias reduction for standard and extreme estimates. Commun. Statistics - Simulation and Comp., 52 (4), 1264–1277. [CrossRef]
Figure 4.1. x 1 ( x 2 , v ) = x 2 / 2 + ( 3 / 2 ) 1 / 2 v of (3.11) for Φ ( v ) = . 01 , . 1 , 0 , . 9 , . 99 - courtesy of Dr Paul Teal: