Overview
In this post we will look at attempting to calculate the expected value of the product of non-independent, jointly normal random variables.
$$E\left[\prod_{i=1}^{n} X_i\right], \qquad \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} \sim \mathcal{N}(\mu, \Sigma)$$
The simplest thing to do is something of a cop-out: if we implicitly assume independence we get the following result.
$$E\left[\prod_{i=1}^{n} X_i\right] = \prod_{i=1}^{n} E[X_i] = \prod_{i=1}^{n} \mu_i$$
This can be useful as it is very simple to compute, but this assumption isn't always valid.
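As a trivial baseline, here is a one-line sketch of this approximation in Python (the means are made up purely for illustration):

```python
import numpy as np

mu = np.array([1.0, 2.0, 3.0])       # assumed means, purely illustrative
independence_approx = np.prod(mu)    # E[prod X_i] ~ prod mu_i, ignoring all covariance
```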
Calculating the integral for the expectation directly and analytically is difficult.
We will use the following theorem:
(From Wikipedia)
Isserlis' Theorem
If $(X_1, \ldots, X_n)$ is a zero-mean multivariate normal vector, then
$$E[X_1 \cdots X_n] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} E[X_i X_j] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} \operatorname{Cov}(X_i, X_j)$$
where the sum is over all pairings $p$ of $\{1, \ldots, n\}$, i.e. all distinct ways of partitioning $\{1, \ldots, n\}$ into pairs $\{i, j\}$.
Note: if $n \in 2\mathbb{N} + 1$ (i.e. $n$ is odd), there are no valid pairings, so the expectation is $0$.
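As an aside, here is a minimal Python sketch of the pairing sum in the theorem. The helper names `pair_partitions` and `isserlis` are my own, and `cov` is assumed to be a nested list or NumPy array of covariances indexed by variable position; this is an illustration, not optimized code.

```python
def pair_partitions(indices):
    """Yield every way of partitioning `indices` into unordered pairs.

    Yields nothing when the number of indices is odd (no valid pairings),
    and yields a single empty pairing for the empty set."""
    indices = list(indices)
    if not indices:
        yield []
        return
    if len(indices) % 2 == 1:
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for sub in pair_partitions(remaining):
            yield [(first, partner)] + sub


def isserlis(cov, indices):
    """E[prod of X_i for i in `indices`] for a ZERO-MEAN normal vector:
    the sum over all pairings of products of covariances."""
    total = 0.0
    for pairing in pair_partitions(indices):
        term = 1.0
        for i, j in pairing:
            term *= cov[i][j]
        total += term
    return total
```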
This is close to what we want, but unfortunately we don't have a zero-mean normal vector; we have arbitrary finite means.
We define the function $H : 2^T \to \mathbb{R}$, where $T$ is the index set of our random variables and $2^T$ is its power set.
$$H(S) = E\left[\prod_{i \in S} X_i\right]$$
For convenience we will also define
$$H(\varnothing) = 1$$
We already know the solution when $\operatorname{card}(S) = 1$:
$$H(\{s\}) = E[X_s] = \mu_s$$
We will also define $I : 2^T \to \mathbb{R}$, where $P_S^2$ denotes the set of pairings of $S$:
$$I(S) = \sum_{p \in P_S^2} \prod_{\{i,j\} \in p} \operatorname{Cov}(X_i, X_j)$$
We will be using the fact that translation does not affect the covariance.
Again, for convenience, we will also define
$$I(\varnothing) = 1$$
Let $\operatorname{card}(T) = n$, and refer to the elements of $T$ as $\{1, \ldots, n\}$.
Let $n \in 2\mathbb{N}$. Since the centred variables $X_i - \mu_i$ are zero-mean and have the same covariance matrix, Isserlis' theorem applies to them:
$$
\begin{aligned}
E[(X_1 - \mu_1) \cdots (X_n - \mu_n)] &= I(T) \\
E\left[\sum_{S \subseteq T} \prod_{i \in T} \bigl(\mathbb{1}(i \in S)\, X_i + \mathbb{1}(i \in T \setminus S)\, (-\mu_i)\bigr)\right] &= I(T) \\
\sum_{S \subseteq T} E\left[\prod_{i \in T} \bigl(\mathbb{1}(i \in S)\, X_i + \mathbb{1}(i \in T \setminus S)\, (-\mu_i)\bigr)\right] &= I(T) \\
\sum_{S \subseteq T} \prod_{i \in T \setminus S} (-\mu_i)\, E\left[\prod_{i \in S} X_i\right] &= I(T) \\
\sum_{S \subseteq T} \prod_{i \in T \setminus S} (-\mu_i)\, H(S) &= I(T) \\
H(T) + \sum_{S \subset T} \prod_{i \in T \setminus S} (-\mu_i)\, H(S) &= I(T) \\
H(T) &= I(T) - \sum_{S \subset T} \prod_{i \in T \setminus S} (-\mu_i)\, H(S)
\end{aligned}
$$
Let $n \in 2\mathbb{N} + 1$.
$$H(T) = -\sum_{S \subset T} \prod_{i \in T \setminus S} (-\mu_i)\, H(S)$$
We replaced $I(T)$ with $0$, as mentioned previously.
This recursive definition only makes calls with strictly smaller sets, and we have defined the base cases, so the recursion terminates.
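A minimal sketch of this recursion in Python, memoised over subsets; it reuses the hypothetical `isserlis` helper from the earlier sketch in the role of $I(S)$, and all names are illustrative rather than from any library.

```python
from functools import lru_cache
from itertools import combinations

def expected_product(mu, cov):
    """E[prod_i X_i] for X ~ N(mu, cov), via
    H(T) = I(T) - sum over strict subsets S of T of prod_{i in T\\S}(-mu_i) * H(S).
    Relies on `isserlis` from the earlier sketch as I(S); it already returns
    0 for odd-sized sets and 1 for the empty set."""

    def proper_subsets(T):
        # Every strict subset of T, including the empty set.
        items = sorted(T)
        for r in range(len(items)):
            for combo in combinations(items, r):
                yield frozenset(combo)

    @lru_cache(maxsize=None)
    def H(T):                              # T is a frozenset of indices into mu / cov
        if len(T) == 0:
            return 1.0                     # H(empty set) = 1
        if len(T) == 1:
            (s,) = T
            return mu[s]                   # H({s}) = E[X_s] = mu_s
        total = isserlis(cov, sorted(T))   # I(T); automatically 0 when |T| is odd
        for S in proper_subsets(T):
            coeff = 1.0
            for i in T - S:
                coeff *= -mu[i]
            total -= coeff * H(S)
        return total

    return H(frozenset(range(len(mu))))
```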
Now we wish to solve this recurrence relation.
Our base cases are defined.
Let's calculate $H(T)$ with $T = \{1, 2\}$.
$$
\begin{aligned}
H(T) &= I(T) - \sum_{S \subset T} \prod_{i \in T \setminus S} (-\mu_i)\, H(S) \\
H(\{1,2\}) &= I(\{1,2\}) - \bigl((-\mu_1) H(\{2\}) + (-\mu_2) H(\{1\}) + \mu_1 \mu_2 H(\varnothing)\bigr) \\
H(\{1,2\}) &= I(\{1,2\}) + \mu_1 \mu_2
\end{aligned}
$$
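As a quick sanity check of this $n = 2$ case, with numbers made up purely for illustration, we can compare the closed form $\operatorname{Cov}(X_1, X_2) + \mu_1 \mu_2$ against a Monte Carlo estimate:

```python
import numpy as np

mu = np.array([1.0, -2.0])
cov = np.array([[1.0, 0.3],
                [0.3, 2.0]])

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, cov, size=1_000_000)

mc_estimate = samples.prod(axis=1).mean()   # Monte Carlo estimate of E[X1 * X2]
closed_form = cov[0, 1] + mu[0] * mu[1]     # I({1,2}) + mu_1 * mu_2
print(mc_estimate, closed_form)             # should agree up to sampling error
```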
We proceed by strong induction on $\operatorname{card}(T)$, with the inductive hypothesis:
$$H(T) = \sum_{S \subseteq T} \left( \prod_{i \in T \setminus S} \mu_i \right) I(S)$$
We have already seen that this is satisfied for sets of size 1 and 2.
Now for the inductive step, which splits into two cases.
Case 1: $n \in 2\mathbb{N}$
$$
\begin{aligned}
H(T) &= I(T) - \sum_{S \subset T} \prod_{i \in T \setminus S} (-\mu_i)\, H(S) \\
&= I(T) - \sum_{S \subset T} \prod_{i \in T \setminus S} (-\mu_i) \sum_{D \subseteq S} \left(\prod_{i \in S \setminus D} \mu_i\right) I(D) \\
&= I(T) - \sum_{S \subset T} \left(\prod_{i \in T \setminus S} (-\mu_i)\right) \sum_{D \subseteq S} \left(\prod_{i \in S \setminus D} \mu_i\right) I(D) \\
&= I(T) - \sum_{S \subset T} \sum_{D \subseteq S} \left(\prod_{i \in T \setminus S} (-\mu_i)\right) \left(\prod_{i \in S \setminus D} \mu_i\right) I(D) \\
&= I(T) - \sum_{S \subset T} \sum_{D \subseteq S} (-1)^{\operatorname{card}(T \setminus S)} \left(\prod_{i \in T \setminus S} \mu_i\right) \left(\prod_{i \in S \setminus D} \mu_i\right) I(D) \\
&= I(T) - \sum_{S \subset T} \sum_{D \subseteq S} (-1)^{\operatorname{card}(T \setminus S)} \left(\prod_{i \in T \setminus D} \mu_i\right) I(D)
\end{aligned}
$$
In our nested summation we have $(-1)^{\operatorname{card}(T \setminus S)}$ multiplying a term that depends only on $D$ and $T$, not on $S$. As $S \subset T$, $1 \le \operatorname{card}(T \setminus S) \le n$. For a fixed $T$ and $D$ with $\operatorname{card}(D) = m$, there are exactly $\binom{n-m}{j}$ sets $S$ with $D \subseteq S \subset T$ and $\operatorname{card}(T \setminus S) = j$, so we can swap the order of summation and group the terms by $D$:
$$
\begin{aligned}
H(T) &= I(T) - \sum_{D \subset T} \sum_{j=1}^{n-m} \binom{n-m}{j} (-1)^j \left(\prod_{i \in T \setminus D} \mu_i\right) I(D) \\
H(T) &= I(T) - \sum_{D \subset T} (-1) \left(\prod_{i \in T \setminus D} \mu_i\right) I(D) \\
H(T) &= I(T) + \sum_{D \subset T} \left(\prod_{i \in T \setminus D} \mu_i\right) I(D) \\
H(T) &= \sum_{D \subseteq T} \left(\prod_{i \in T \setminus D} \mu_i\right) I(D)
\end{aligned}
$$
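The collapse of the inner binomial sum to $-1$ in the second line uses the standard alternating-sum identity (with $k = n - m \ge 1$):

$$\sum_{j=1}^{k} \binom{k}{j} (-1)^j = \left(\sum_{j=0}^{k} \binom{k}{j} (-1)^j\right) - \binom{k}{0} = (1 - 1)^k - 1 = -1$$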
Case 2: $n \in 2\mathbb{N} + 1$
$$
\begin{aligned}
H(T) &= -\sum_{S \subset T} \prod_{i \in T \setminus S} (-\mu_i)\, H(S) \\
&= -\sum_{S \subset T} \sum_{D \subseteq S} (-1)^{\operatorname{card}(T \setminus S)} \left(\prod_{i \in T \setminus D} \mu_i\right) I(D) \\
&= -\sum_{D \subset T} \sum_{j=1}^{n-m} \binom{n-m}{j} (-1)^j \left(\prod_{i \in T \setminus D} \mu_i\right) I(D) \\
&= -\sum_{D \subset T} (-1) \left(\prod_{i \in T \setminus D} \mu_i\right) I(D) \\
&= \sum_{D \subset T} \left(\prod_{i \in T \setminus D} \mu_i\right) I(D) \\
&= \sum_{D \subseteq T} \left(\prod_{i \in T \setminus D} \mu_i\right) I(D)
\end{aligned}
$$
We can extend the sum from $D \subset T$ to $D \subseteq T$ in the last step because $I(T) = 0$, the size of the set being odd.
This completes the proof by induction.
$$H(T) = \sum_{S \subseteq T} \left( \prod_{i \in T \setminus S} \mu_i \right) I(S)$$
We now have an explicit form for the expected value.
This, however, is quite expensive to compute: the outer summation makes the calculation exponential in the size of $T$.
$$H(T) = \sum_{S \subseteq T} \left( \prod_{i \in T \setminus S} \mu_i \right) \left( \sum_{p \in P_S^2} \prod_{\{i,j\} \in p} \operatorname{Cov}(X_i, X_j) \right)$$
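A direct sketch of the closed form (again reusing the hypothetical `isserlis` helper from earlier as $I(S)$; the names are illustrative) makes the cost visible: there are $2^n$ subsets $S$, each with its own pairing sum.

```python
from itertools import combinations

def expected_product_closed_form(mu, cov):
    """E[prod_i X_i] via H(T) = sum over S subseteq T of (prod_{i in T\\S} mu_i) * I(S).
    Uses `isserlis` from the earlier sketch as I(S)."""
    n = len(mu)
    T = set(range(n))
    total = 0.0
    for r in range(n + 1):
        for S in combinations(sorted(T), r):     # all 2^n subsets of T
            mean_part = 1.0
            for i in T - set(S):
                mean_part *= mu[i]               # prod of means outside S
            total += mean_part * isserlis(cov, S)
    return total
```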
We can see that if $\forall i : \mu_i = 0$, every term with $S \neq T$ vanishes and the summation collapses to the $S = T$ term, recovering Isserlis' theorem.
In an attempt to speed this up, one idea is to set up a matrix where the determinant of a $2 \times 2$ block is $0$ if the entries are not matching row and column pairs, and is the covariance if they are.