![]() |
Institut für Astronomie und AstrophysikAbteilung AstronomieSand 1, D-72076 Tübingen, GermanyNew Address! -- Neue Adresse! |
![]() |
PCA
Carry out a Principal Components Analysis (Karhunen-Loeve Transform)
Results can be directed to the screen, a file, or output variables
See notes below for comparison with the intrinsic IDL function PCOMP.
PCA, data, eigenval, eigenvect, percentages, proj_obj, proj_atr,
[MATRIX =, TEXTOUT = ,/COVARIANCE, /SSQ, /SILENT ]
data - 2-d data matrix, data(i,j) contains the jth attribute value
for the ith object in the sample. If N_OBJ is the total
number of objects (rows) in the sample, and N_ATTRIB is the
total number of attributes (columns) then data should be
dimensioned N_OBJ x N_ATTRIB.
/COVARIANCE - if this keyword is set, then the PCA will be carried out
on the covariance matrix (rare), the default is to use the
correlation matrix
/SILENT - If this keyword is set, then no output is printed
/SSQ - if this keyword is set, then the PCA will be carried out on
on the sums-of-squares & cross-products matrix (rare)
TEXTOUT - Controls print output device, defaults to !TEXTOUT
textout=1 TERMINAL using /more option
textout=2 TERMINAL without /more option
textout=3 .prt
textout=4 laser.tmp
textout=5 user must open file
textout = filename (default extension of .prt)
eigenval - N_ATTRIB element vector containing the sorted eigenvalues
eigenvect - N_ATRRIB x N_ATTRIB matrix containing the corresponding
eigenvectors
percentages - N_ATTRIB element containing the cumulative percentage
variances associated with the principal components
proj_obj - N_OBJ by N_ATTRIB matrix containing the projections of the
objects on the principal components
proj_atr - N_ATTRIB by N_ATTRIB matrix containing the projections of
the attributes on the principal components
MATRIX = analysed matrix, either the covariance matrix if /COVARIANCE
is set, the "sum of squares and cross-products" matrix if
/SSQ is set, or the (by default) correlation matrix. Matrix
will have dimensions N_ATTRIB x N_ATTRIB
This procedure performs Principal Components Analysis (Karhunen-Loeve
Transform) according to the method described in "Multivariate Data
Analysis" by Murtagh & Heck [Reidel : Dordrecht 1987], pp. 33-48.
Keywords /COVARIANCE and /SSQ are mutually exclusive.
The printout contains only (at most) the first seven principle
eigenvectors. However, the output variables EIGENVECT contain
all the eigenvectors
Different authors scale the covariance matrix in different ways.
The eigenvalues output by PCA may have to be scaled by 1/N_OBJ or
1/(N_OBJ-1) to agree with other calculations when /COVAR is set.
PCA uses the non-standard system variables !TEXTOUT and !TEXTUNIT.
These can be added to one's session using the procedure ASTROLIB.
The intrinsic IDL function PCOMP (introduced in V5.0) duplicates most
most of the functionality of PCA, but uses different conventions and
normalizations. Note the following:
(1) PCOMP requires a N_ATTRIB x N_OBJ input array; this is the transpose
of what PCA expects
(2) PCA uses standardized variables; use /STANDARIZE keyword to PCOMP
for a direct comparison.
(3) PCA (unlike PCOMP) normalizes the eigenvectors by the square root
of the eigenvalues.
(4) PCA returns cumulative percentages; the VARIANCES keyword of PCOMP
returns the variance in each variable
Perform a PCA analysis on the covariance matrix of a data matrix, DATA,
and write the results to a file
IDL> PCA, data, /COVAR, t = 'pca.dat'
Perform a PCA analysis on the correlation matrix. Suppress all
printing, and save the eigenvectors and eigenvalues in output variables
IDL> PCA, data, eigenval, eigenvect, /SILENT
TEXTOPEN, TEXTCLOSE
Immanuel Freedman (after Murtagh F. and Heck A.). December 1993
Wayne Landsman, modified I/O December 1993
Converted to IDL V5.0 W. Landsman September 1997
Fix MATRIX output, remove GOTO statements W. Landsman August 1998
Changed some index variable to type LONG W. Landsman March 2000
[Home Page] [Software, Documentation] [IDL Documentation] [Quick Reference] [Feedback]