
Statistical preprocessing of the data

Figure 6: Simulated scalp potentials due to three dipole sources mapped onto 32 channels (electrodes). Channels are numbered left to right, top to bottom. The first channel is the reference electrode. These signals are the input data for the ICA algorithm. The locations of these 32 electrodes are shown in Fig. 3.

Figure 7: Singular values of the covariance matrix. It appears that only the first four singular values contribute to the signal subspace, with the rest constituting the noise subspace.

In EEG experiments, electric potential is measured with an array of electrodes (typically 32, 64, or 128) positioned primarily on the top half of the head, as shown in Fig. 3. The data are typically sampled every millisecond during an interval of interest.

For a given electrode configuration, the time-dependent data can be arranged as a matrix, where every column corresponds to a sampled time frame and every row corresponds to a channel (electrode). For example, the data obtained by 32 electrodes over 180 ms can be sampled in 180 frames and represented as a (32 × 180) matrix. Below we will refer to this matrix as Φ(t_i), where instead of a continuous variable t we have sampled time frames t_i.
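As a concrete illustration (not part of the original experiment), the following Python/NumPy sketch arranges such recordings into a 32 × 180 data matrix; the array name phi and the random placeholder values are assumptions made purely for illustration:

```python
import numpy as np

n_channels = 32   # electrodes
n_frames = 180    # one sample per millisecond over a 180 ms interval

# Placeholder values standing in for the measured scalp potentials;
# each row is one channel (electrode), each column one sampled frame t_i.
rng = np.random.default_rng(0)
phi = rng.standard_normal((n_channels, n_frames))

print(phi.shape)  # (32, 180)
```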

Before performing source localization, we will preprocess the EEG activation maps in order to decompose them into several independent activation maps. The source for each activation map will then be localized independently. This is accomplished as follows:

- First, we will process the raw signals, Φ(t_i), in order to reduce the dimensionality of the data and to remove some of its noise. The projection of the data onto the signal subspace will be referred to as Φ_s.
- The signal subspace, Φ_s, will then be decomposed into statistically independent terms, S_i.
- Each independent activation, S_i, will be assumed to be due to a single stationary dipole, which we will then localize using a parameterized search algorithm.

As outlined above, the first step in processing the raw EEG data, Φ(t_i), is to decompose the data into signal and noise subspaces by applying the Principal Component Analysis (PCA) method [30] (in the signal processing literature it is also known as the Karhunen-Loève transform). The decomposition is achieved by finding the eigendecomposition of the data covariance matrix:

\[ C \;=\; \frac{1}{N}\,\Phi\,\Phi^T \;=\; \sum_{i} \lambda_i\, u_i\, u_i^T \]

and constructing signal and noise subspaces [15]; here N is the number of time frames, and λ_i and u_i are the singular values and singular vectors of C. The noise subspace is spanned by the singular vectors whose singular values fall below a chosen noise threshold ν:

\[ C \;=\; \underbrace{\sum_{\lambda_i \ge \nu} \lambda_i\, u_i\, u_i^T}_{\text{signal subspace}} \;+\; \underbrace{\sum_{\lambda_i < \nu} \lambda_i\, u_i\, u_i^T}_{\text{noise subspace}} . \]

Having constructed the subspaces, we can project the original data onto the signal subspace by:

\[ \Phi_s \;=\; \sum_{\lambda_i \ge \nu} u_i\, u_i^T\, \Phi , \]

where λ_i and u_i are the signal subspace singular values and singular vectors.
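A minimal NumPy sketch of this preprocessing step might look as follows; the function name, the explicit noise threshold noise_tol, and the zero-mean assumption are illustrative choices rather than details taken from the original experiment:

```python
import numpy as np

def project_signal_subspace(phi, noise_tol):
    """Project the data matrix onto the PCA signal subspace.

    phi       : (channels x frames) data matrix, assumed zero-mean
    noise_tol : singular values below this threshold are treated as noise
    """
    n_frames = phi.shape[1]
    C = phi @ phi.T / n_frames          # data covariance matrix
    lam, U = np.linalg.eigh(C)          # eigendecomposition (ascending order)
    lam, U = lam[::-1], U[:, ::-1]      # sort singular values descending
    U_s = U[:, lam >= noise_tol]        # signal-subspace singular vectors
    phi_s = U_s @ (U_s.T @ phi)         # projection onto the signal subspace
    return phi_s, lam, U_s

# One may equivalently work with the reduced coordinates U_s.T @ phi
# (one row per retained component) before applying ICA.
```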

Though PCA allows us to estimate the number of dipoles, in the presence of noise it does not necessarily give an accurate result [15]. In order to separate out any remaining noise, as well as each statistically independent term, we will use the recently derived infomax technique, Independent Component Analysis (ICA). (It is worth noting that PCA not only filters noise out of the data, but also performs a preliminary step of the ICA decomposition by decorrelating the channels, or removing linear dependence, i.e., ⟨Φ_i Φ_j⟩ = ⟨Φ_i⟩⟨Φ_j⟩. ICA then makes the channels independent, i.e., ⟨Φ_i^n Φ_j^m⟩ = ⟨Φ_i^n⟩⟨Φ_j^m⟩ for any powers n and m.)
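The distinction can be checked empirically: decorrelation constrains only the second-order cross-moments, while independence constrains cross-moments of all orders. A small illustrative helper (the function name is hypothetical):

```python
import numpy as np

def excess_cross_moment(x, y, n, m):
    """Estimate <x^n y^m> - <x^n><y^m>; this is ~0 for independent x and y."""
    return np.mean(x**n * y**m) - np.mean(x**n) * np.mean(y**m)

# After PCA, the principal-component coordinates (rows of U_s.T @ phi above)
# have ~0 second-order cross-moments (decorrelation), but higher-order terms
# such as (n, m) = (2, 1) need not vanish. After ICA, the recovered
# activations should give ~0 for all powers n and m.
```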

There are several assumptions one needs to make about the sources in order to apply the ICA algorithm in electroencephalography [19]:

- the sources must be independent (signals come from statistically independent brain processes);
- there is no delay in signal propagation from the sources to the detectors (conducting media without delays at source frequencies);
- the mixture is linear (Poisson's equation is linear);
- the number of independent signal sources does not exceed the number of electrodes (we expect to have fewer strong sources than our 32 electrodes).

It follows that, since the PCA-processed EEG recordings Φ_s are linear combinations of the source signals S, they can be expressed as:

\[ \Phi_s \;=\; M\, S , \]

where M is the so-called ``mixing'' matrix and each row of S is a source's time activation. What we would like to find is an ``unmixing'' matrix W, such that:

\[ S \;=\; W\, \Phi_s , \]

or, in other words, W = M^{-1}; but we do not know M, and the only data we have is the Φ_s matrix.
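To make the blind-separation setting concrete, a toy forward model along these lines might look as follows; the particular source waveforms and the random mixing matrix are purely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(180) / 1000.0                  # 180 frames, 1 ms apart

# Three hypothetical independent source activations (rows of S)
S = np.vstack([
    np.sin(2 * np.pi * 10 * t),              # 10 Hz oscillation
    np.sign(np.sin(2 * np.pi * 7 * t)),      # square wave
    rng.laplace(size=t.size),                # super-Gaussian, noise-like source
])

M = rng.standard_normal((4, 3))              # unknown "mixing" matrix
phi_s = M @ S                                # observed mixtures: Phi_s = M S

# ICA sees only phi_s and must estimate W such that W @ phi_s recovers S,
# up to row scaling and permutation.
```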

Under the assumption of independent sources, ICA allows us to construct such a W matrix; however, since neither the mixing matrix nor the sources are known, W can be recovered only up to scaling and permutation (i.e., WM is not an identity matrix, but rather is equal to DP, where D is a diagonal scaling matrix and P is a permutation matrix). This problem is often referred to as Blind Source Separation (BSS) [18, 31, 32, 33].

Figure 8: ICA activation maps obtained by unmixing the signals from the signal subspace. We observe that there are only three independent patterns, indicating the presence of only three separate signals in the original data; the fourth component is noise.

The ICA process consists of two phases: the learning phase and the processing phase. During the learning phase, the ICA algorithm finds a weighting matrix W which minimizes the mutual information between channels (variables), i.e., makes the output signals statistically independent in the sense that their multivariate probability density function becomes equal to the product of the marginal probability density functions (p.d.f.) of the individual variables. This is equivalent to maximizing the entropy of the non-linearly transformed output vector y:

\[ y \;=\; g\!\left( W\, \Phi_s \right) , \]

where g is some non-linear function.

There exist several different ways to estimate the W matrix. For example, the Bell-Sejnowski infomax algorithm [18] adjusts the weights according to the gradient of the entropy. Below we use a modification of this rule proposed by Amari, Cichocki and Yang [20], which uses the natural gradient rather than the absolute gradient of H(y). This allows us to avoid computing matrix inverses and speeds up convergence:

\[ \Delta W \;=\; \varepsilon \left( I \;+\; \hat{y}\, u^T \right) W , \qquad u = W\,\Phi_s , \]

where the vector ŷ is defined as:

\[ \hat{y} \;=\; 1 - 2\, g(u) \;=\; 1 - 2\, y , \]

and for the nonlinear function g we used:

\[ g(u) \;=\; \frac{1}{1 + e^{-u}} . \]

In the above equation ε is a learning rate and I is the identity matrix [33]. The learning rate decreases during the iterations, and we stop when ε becomes smaller than a pre-defined tolerance.
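A compact sketch of this learning phase is given below. The update follows the natural-gradient rule above; the initialization, batch averaging over frames, learning-rate schedule, and stopping tolerance are illustrative assumptions rather than the exact settings used here:

```python
import numpy as np

def logistic(u):
    return 1.0 / (1.0 + np.exp(-u))

def infomax_ica(phi_s, lr=1e-3, decay=0.99, tol=1e-6):
    """Natural-gradient infomax ICA learning phase (batch form).

    phi_s : (components x frames) signal-subspace data
    Returns the estimated unmixing matrix W.
    """
    n, n_frames = phi_s.shape
    W = np.eye(n)                              # initial guess for W
    I = np.eye(n)
    while lr > tol:
        u = W @ phi_s                          # current unmixed activations
        y_hat = 1.0 - 2.0 * logistic(u)        # score function for the logistic g
        W = W + lr * (I + (y_hat @ u.T) / n_frames) @ W
        lr *= decay                            # learning rate decreases each pass
    return W
```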

Figure 9: The projection of the first three activation maps from Fig. 8 (as well as the original signals from Fig. 6) onto the 32 electrodes.

The second phase of the ICA algorithm is the actual source separation. Independent components (activations) can be computed by applying the unmixing matrix W to the signal subspace data:

\[ S \;=\; W\, \Phi_s . \]

Projection of the independent activation maps S back onto the electrodes, one at a time, can be done by:

\[ \Phi^{(i)} \;=\; W^{-1}\, S^{(i)} , \]

where Φ^(i) is the set of scalp potentials due to just the i-th source. To form S^(i) we zero out all rows of S but the i-th; that is, all but the i-th source are ``turned off''. In practice we will not need the full time sequence Φ^(i) in order to localize the i-th source, but rather simply a single instant of activation. For this purpose, we set the i-th source to be a unit source (i.e., S_i = 1), so that the channel values of Φ^(i) are simply given by the corresponding (i-th) column of W^{-1}.
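Continuing with the notation of the earlier sketches (again an illustration, not the original implementation), the separation and back-projection steps might be written as:

```python
import numpy as np

def back_project(W, S, i):
    """Scalp potentials due to the i-th independent source alone.

    W : unmixing matrix estimated in the learning phase
    S : independent activations, S = W @ phi_s
    i : index of the source to keep; all other rows are "turned off"
    """
    S_i = np.zeros_like(S)
    S_i[i] = S[i]                       # keep only the i-th activation
    return np.linalg.inv(W) @ S_i       # Phi^(i) = W^{-1} S^(i)

# Example use, following the sketches above:
#   W = infomax_ica(phi_s)
#   S = W @ phi_s
#   phi_i = back_project(W, S, 0)
# For localization a single instant suffices: with a unit activation in row i,
# the channel values are simply the i-th column of W^{-1}:
#   scalp_map_i = np.linalg.inv(W)[:, i]
```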

 

