
Statistical preprocessing of the data

Figure 6: Simulated scalp potentials due to three dipole sources mapped onto 32 channels (electrodes). Channels are numbered left to right, top to bottom. The first channel is the reference electrode. These signals are the input data for the ICA algorithm. The locations of these 32 electrodes are shown in Fig. 3.

Figure 7: Singular values of the covariance matrix. It appears that only the first four singular values contribute to the signal subspace, with the rest constituting the noise subspace.

In EEG experiments, electric potential is measured with an array of electrodes (typically 32, 64, or 128) positioned primarily on the top half of the head, as shown in Fig. 3. The data are typically sampled every millisecond during an interval of interest.

For a given electrode configuration, the time-dependent data can be arranged as a matrix, where every column corresponds to a sampled time frame and every row corresponds to a channel (electrode). For example, the data obtained by 32 electrodes over 180 ms can be sampled in 180 frames and represented as a (32 × 180) matrix. Below we will refer to this matrix as Φ(t_i), where instead of a continuous variable t we have sampled time frames t_i.
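As a concrete illustration (not part of the original experiment), the following Python/NumPy sketch arranges such recordings into a 32 × 180 data matrix; the array name phi and the random placeholder values are assumptions made purely for illustration:

```python
import numpy as np

n_channels = 32   # electrodes
n_frames = 180    # one sample per millisecond over a 180 ms interval

# Placeholder values standing in for the measured scalp potentials;
# each row is one channel (electrode), each column one sampled frame t_i.
rng = np.random.default_rng(0)
phi = rng.standard_normal((n_channels, n_frames))

print(phi.shape)  # (32, 180)
```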

Before performing source localization, we will preprocess the EEG activation maps in order to decompose them into several independent activation maps. The source for each activation map will then be localized independently. This is accomplished as follows:

- First, we will process the raw signals, Φ(t_i), in order to reduce the dimensionality of the data and to remove some of its noise. The projection of the data onto the signal subspace will be referred to as Φ_s.
- The signal subspace, Φ_s, will then be decomposed into statistically independent terms, S_i.
- Each independent activation, S_i, will be assumed to be due to a single stationary dipole, which we will then localize using a parameterized search algorithm.

As outlined above, the first step in processing the raw EEG data, Φ(t_i), is to decompose the data into signal and noise subspaces by applying the Principal Component Analysis (PCA) method [30] (in the signal processing literature it is also known as the Karhunen-Loève transform). The decomposition is achieved by finding the eigendecomposition of the data covariance matrix:

\[ C \;=\; \frac{1}{N}\,\Phi\,\Phi^T \;=\; \sum_{i} \lambda_i\, u_i\, u_i^T \]

and constructing signal and noise subspaces [15]; here N is the number of time frames, and λ_i and u_i are the singular values and singular vectors of C. The noise subspace is spanned by the singular vectors whose singular values fall below a chosen noise threshold ν:

\[ C \;=\; \underbrace{\sum_{\lambda_i \ge \nu} \lambda_i\, u_i\, u_i^T}_{\text{signal subspace}} \;+\; \underbrace{\sum_{\lambda_i < \nu} \lambda_i\, u_i\, u_i^T}_{\text{noise subspace}} . \]

Having constructed the subspaces, we can project the original data onto the signal subspace by:

\[ \Phi_s \;=\; \sum_{\lambda_i \ge \nu} u_i\, u_i^T\, \Phi , \]

where λ_i and u_i are the signal subspace singular values and singular vectors.
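A minimal NumPy sketch of this preprocessing step might look as follows; the function name, the explicit noise threshold noise_tol, and the zero-mean assumption are illustrative choices rather than details taken from the original experiment:

```python
import numpy as np

def project_signal_subspace(phi, noise_tol):
    """Project the data matrix onto the PCA signal subspace.

    phi       : (channels x frames) data matrix, assumed zero-mean
    noise_tol : singular values below this threshold are treated as noise
    """
    n_frames = phi.shape[1]
    C = phi @ phi.T / n_frames          # data covariance matrix
    lam, U = np.linalg.eigh(C)          # eigendecomposition (ascending order)
    lam, U = lam[::-1], U[:, ::-1]      # sort singular values descending
    U_s = U[:, lam >= noise_tol]        # signal-subspace singular vectors
    phi_s = U_s @ (U_s.T @ phi)         # projection onto the signal subspace
    return phi_s, lam, U_s

# One may equivalently work with the reduced coordinates U_s.T @ phi
# (one row per retained component) before applying ICA.
```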

Though PCA allows us to estimate the number of dipoles, in the presence of noise it does not necessarily give an accurate result [15]. In order to separate out any remaining noise, as well as each statistically independent term, we will use the recently derived infomax technique, Independent Component Analysis (ICA). (It is worth noting that PCA not only filters noise out of the data, but also performs a preliminary step of the ICA decomposition by decorrelating the channels, or removing linear dependence, i.e., ⟨Φ_i Φ_j⟩ = ⟨Φ_i⟩⟨Φ_j⟩. ICA then makes the channels independent, i.e., ⟨Φ_i^n Φ_j^m⟩ = ⟨Φ_i^n⟩⟨Φ_j^m⟩ for any powers n and m.)
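The distinction can be checked empirically: decorrelation constrains only the second-order cross-moments, while independence constrains cross-moments of all orders. A small illustrative helper (the function name is hypothetical):

```python
import numpy as np

def excess_cross_moment(x, y, n, m):
    """Estimate <x^n y^m> - <x^n><y^m>; this is ~0 for independent x and y."""
    return np.mean(x**n * y**m) - np.mean(x**n) * np.mean(y**m)

# After PCA, the principal-component coordinates (rows of U_s.T @ phi above)
# have ~0 second-order cross-moments (decorrelation), but higher-order terms
# such as (n, m) = (2, 1) need not vanish. After ICA, the recovered
# activations should give ~0 for all powers n and m.
```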

There are several assumptions one needs to make about the sources in order to apply the ICA algorithm in electroencephalography [19]:

- the sources must be independent (signals come from statistically independent brain processes);
- there is no delay in signal propagation from the sources to the detectors (conducting media without delays at source frequencies);
- the mixture is linear (Poisson's equation is linear);
- the number of independent signal sources does not exceed the number of electrodes (we expect to have fewer strong sources than our 32 electrodes).

It follows that, since the PCA-processed EEG recordings Φ_s are linear combinations of the source signals S, they can be expressed as:

\[ \Phi_s \;=\; M\, S , \]

where M is the so-called ``mixing'' matrix and each row of S is a source's time activation. What we would like to find is an ``unmixing'' matrix W, such that:

\[ S \;=\; W\, \Phi_s , \]

or, in other words, W = M^{-1}; but we do not know M, and the only data we have is the Φ_s matrix.
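To make the blind-separation setting concrete, a toy forward model along these lines might look as follows; the particular source waveforms and the random mixing matrix are purely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(180) / 1000.0                  # 180 frames, 1 ms apart

# Three hypothetical independent source activations (rows of S)
S = np.vstack([
    np.sin(2 * np.pi * 10 * t),              # 10 Hz oscillation
    np.sign(np.sin(2 * np.pi * 7 * t)),      # square wave
    rng.laplace(size=t.size),                # super-Gaussian, noise-like source
])

M = rng.standard_normal((4, 3))              # unknown "mixing" matrix
phi_s = M @ S                                # observed mixtures: Phi_s = M S

# ICA sees only phi_s and must estimate W such that W @ phi_s recovers S,
# up to row scaling and permutation.
```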

Under the assumption of independent sources, ICA allows us to construct such a W matrix; however, since neither the mixing matrix nor the sources are known, W can be recovered only up to scaling and permutation (i.e., WM is not an identity matrix, but rather is equal to DP, where D is a diagonal scaling matrix and P is a permutation matrix). This problem is often referred to as Blind Source Separation (BSS) [18, 31, 32, 33].

Figure 8: ICA activation maps obtained by unmixing the signals from the signal subspace. We observe that there are only three independent patterns, indicating the presence of only three separate signals in the original data; the fourth component is noise.

The ICA process consists of two phases: the learning phase and the processing phase. During the learning phase, the ICA algorithm finds a weighting matrix W which minimizes the mutual information between channels (variables), i.e., makes the output signals statistically independent in the sense that their multivariate probability density function becomes equal to the product of the marginal probability density functions (p.d.f.) of the individual variables. This is equivalent to maximizing the entropy of the non-linearly transformed output vector y:

\[ y \;=\; g\!\left( W\, \Phi_s \right) , \]

where g is some non-linear function.

There exist several different ways to estimate the W matrix. For example, the Bell-Sejnowski infomax algorithm [18] adjusts the weights according to the gradient of the entropy. Below we use a modification of this rule proposed by Amari, Cichocki and Yang [20], which uses the natural gradient rather than the absolute gradient of H(y). This allows us to avoid computing matrix inverses and speeds up convergence:

\[ \Delta W \;=\; \varepsilon \left( I \;+\; \hat{y}\, u^T \right) W , \qquad u = W\,\Phi_s , \]

where the vector ŷ is defined as:

\[ \hat{y} \;=\; 1 - 2\, g(u) \;=\; 1 - 2\, y , \]

and for the nonlinear function g we used:

\[ g(u) \;=\; \frac{1}{1 + e^{-u}} . \]

In the above equation ε is a learning rate and I is the identity matrix [33]. The learning rate decreases during the iterations, and we stop when ε becomes smaller than a pre-defined tolerance.
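A compact sketch of this learning phase is given below. The update follows the natural-gradient rule above; the initialization, batch averaging over frames, learning-rate schedule, and stopping tolerance are illustrative assumptions rather than the exact settings used here:

```python
import numpy as np

def logistic(u):
    return 1.0 / (1.0 + np.exp(-u))

def infomax_ica(phi_s, lr=1e-3, decay=0.99, tol=1e-6):
    """Natural-gradient infomax ICA learning phase (batch form).

    phi_s : (components x frames) signal-subspace data
    Returns the estimated unmixing matrix W.
    """
    n, n_frames = phi_s.shape
    W = np.eye(n)                              # initial guess for W
    I = np.eye(n)
    while lr > tol:
        u = W @ phi_s                          # current unmixed activations
        y_hat = 1.0 - 2.0 * logistic(u)        # score function for the logistic g
        W = W + lr * (I + (y_hat @ u.T) / n_frames) @ W
        lr *= decay                            # learning rate decreases each pass
    return W
```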

Figure 9: The projection of the first three activation maps from Fig. 8 (as well as the original signals from Fig. 6) onto the 32 electrodes.

The second phase of the ICA algorithm is the actual source separation. Independent components (activations) can be computed by applying the unmixing matrix W to the signal subspace data:

\[ S \;=\; W\, \Phi_s . \]

Projection of the independent activation maps S back onto the electrodes, one at a time, can be done by:

\[ \Phi^{(i)} \;=\; W^{-1}\, S^{(i)} , \]

where Φ^(i) is the set of scalp potentials due to just the i-th source. To form S^(i) we zero out all rows of S but the i-th; that is, all but the i-th source are ``turned off''. In practice we will not need the full time sequence Φ^(i) in order to localize the i-th source, but rather simply a single instant of activation. For this purpose, we set the i-th source to be a unit source (i.e., S_i = 1), so that the channel values of Φ^(i) are simply given by the corresponding (i-th) column of W^{-1}.
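Continuing with the notation of the earlier sketches (again an illustration, not the original implementation), the separation and back-projection steps might be written as:

```python
import numpy as np

def back_project(W, S, i):
    """Scalp potentials due to the i-th independent source alone.

    W : unmixing matrix estimated in the learning phase
    S : independent activations, S = W @ phi_s
    i : index of the source to keep; all other rows are "turned off"
    """
    S_i = np.zeros_like(S)
    S_i[i] = S[i]                       # keep only the i-th activation
    return np.linalg.inv(W) @ S_i       # Phi^(i) = W^{-1} S^(i)

# Example use, following the sketches above:
#   W = infomax_ica(phi_s)
#   S = W @ phi_s
#   phi_i = back_project(W, S, 0)
# For localization a single instant suffices: with a unit activation in row i,
# the channel values are simply the i-th column of W^{-1}:
#   scalp_map_i = np.linalg.inv(W)[:, i]
```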

 

