Institut für Astronomie und Astrophysik, Abteilung Astronomie, Sand 1, D-72076 Tübingen, Germany
PCA
Carry out a Principal Components Analysis (Karhunen-Loeve Transform)
Results can be directed to the screen, a file, or output variables. See the notes below for a comparison with the intrinsic IDL function PCOMP.
PCA, data, eigenval, eigenvect, percentages, proj_obj, proj_atr, [MATRIX =, TEXTOUT = ,/COVARIANCE, /SSQ, /SILENT ]
data - 2-d data matrix, data(i,j) contains the jth attribute value for the ith object in the sample. If N_OBJ is the total number of objects (rows) in the sample, and N_ATTRIB is the total number of attributes (columns) then data should be dimensioned N_OBJ x N_ATTRIB.
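As a minimal sketch of the layout described above, the following builds a synthetic sample. The variable names `n_obj` and `n_attrib`, the sample sizes, and the random values are illustrative assumptions and are not part of PCA itself.

```idl
; Hypothetical sample: 100 objects, each with 4 attributes
n_obj    = 100
n_attrib = 4

; First index runs over objects, second index over attributes
data = RANDOMN(seed, n_obj, n_attrib)

; Make attribute 1 track attribute 0 so the sample has some structure
data[*, 1] = data[*, 0] + 0.1*RANDOMN(seed, n_obj)

; Check the dimensions before calling PCA
HELP, data
```

An array built this way can be passed to PCA directly as the `data` parameter.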
/COVARIANCE - if this keyword is set, then the PCA will be carried out on the covariance matrix (rare); the default is to use the correlation matrix
/SILENT - if this keyword is set, then no output is printed
/SSQ - if this keyword is set, then the PCA will be carried out on the sums-of-squares & cross-products matrix (rare)
TEXTOUT - controls the print output device, defaults to !TEXTOUT
    textout=1   TERMINAL using /more option
    textout=2   TERMINAL without /more option
    textout=3   <program>.prt
    textout=4   laser.tmp
    textout=5   user must open file
    textout = filename (default extension of .prt)
eigenval - N_ATTRIB element vector containing the sorted eigenvalues
eigenvect - N_ATTRIB x N_ATTRIB matrix containing the corresponding eigenvectors
percentages - N_ATTRIB element vector containing the cumulative percentage variances associated with the principal components
proj_obj - N_OBJ by N_ATTRIB matrix containing the projections of the objects on the principal components
proj_atr - N_ATTRIB by N_ATTRIB matrix containing the projections of the attributes on the principal components
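One common use of these outputs is to keep only as many components as are needed to reach a chosen share of the variance. The sketch below assumes that `data` is already defined, that `percentages` is expressed on a 0 to 100 scale, and that the second index of `proj_obj` runs over the components; the 90 percent threshold and the names `n_keep` and `reduced` are illustrative choices.

```idl
; Assumes DATA is already defined as an N_OBJ x N_ATTRIB array
PCA, data, eigenval, eigenvect, percentages, proj_obj, proj_atr, /SILENT

; First component at which the cumulative percentage reaches 90
; (assumption: PERCENTAGES is on a 0-100 scale)
idx = WHERE(percentages GE 90.0, count)
IF count GT 0 THEN n_keep = idx[0] + 1 ELSE n_keep = N_ELEMENTS(percentages)

; Projections of every object on the retained components
; (assumption: second index of PROJ_OBJ runs over components)
reduced = proj_obj[*, 0:n_keep-1]
PRINT, 'Components kept: ', n_keep
```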
MATRIX = analysed matrix, either the covariance matrix if /COVARIANCE is set, the "sum of squares and cross-products" matrix if /SSQ is set, or (by default) the correlation matrix. The matrix will have dimensions N_ATTRIB x N_ATTRIB
This procedure performs Principal Components Analysis (Karhunen-Loeve Transform) according to the method described in "Multivariate Data Analysis" by Murtagh & Heck [Reidel : Dordrecht 1987], pp. 33-48.

Keywords /COVARIANCE and /SSQ are mutually exclusive.

The printout contains only (at most) the first seven principal eigenvectors. However, the output variable EIGENVECT contains all the eigenvectors.

Different authors scale the covariance matrix in different ways. The eigenvalues output by PCA may have to be scaled by 1/N_OBJ or 1/(N_OBJ-1) to agree with other calculations when /COVAR is set.

PCA uses the non-standard system variables !TEXTOUT and !TEXTUNIT. These can be added to one's session using the procedure ASTROLIB.

The intrinsic IDL function PCOMP (introduced in V5.0) duplicates most of the functionality of PCA, but uses different conventions and normalizations. Note the following:
(1) PCOMP requires a N_ATTRIB x N_OBJ input array; this is the transpose of what PCA expects
(2) PCA uses standardized variables; use the /STANDARDIZE keyword to PCOMP for a direct comparison
(3) PCA (unlike PCOMP) normalizes the eigenvectors by the square root of the eigenvalues
(4) PCA returns cumulative percentages; the VARIANCES keyword of PCOMP returns the variance in each variable
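The notes on PCOMP and on covariance scaling can be sketched as follows. This is an illustration only: it assumes `data` is already defined as an N_OBJ x N_ATTRIB array, the variable names are hypothetical, and no claim is made that the printed values agree exactly, since the two routines use different normalizations.

```idl
; Assumes DATA is an N_OBJ x N_ATTRIB array, as PCA expects
n_obj = (SIZE(data, /DIMENSIONS))[0]

; PCA on the correlation matrix (the default), with no printout
PCA, data, eigenval, eigenvect, percentages, /SILENT

; PCOMP wants N_ATTRIB x N_OBJ input, hence the transpose (note 1);
; /STANDARDIZE matches PCA's use of standardized variables (note 2);
; VARIANCES returns per-variable variances, not cumulative percentages (note 4)
derived = PCOMP(TRANSPOSE(data), /STANDARDIZE, $
                EIGENVALUES=pc_eigenval, VARIANCES=pc_var)

; Inspect the two sets of eigenvalues side by side
PRINT, eigenval
PRINT, pc_eigenval

; With /COVARIANCE, PCA eigenvalues may need rescaling before comparison
PCA, data, cov_eigenval, /COVARIANCE, /SILENT
PRINT, cov_eigenval/n_obj          ; scaled by 1/N_OBJ
PRINT, cov_eigenval/(n_obj - 1)    ; scaled by 1/(N_OBJ-1)
```

Which of the two scalings is appropriate depends on the convention used by the calculation being compared against.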
Perform a PCA analysis on the covariance matrix of a data matrix, DATA, and write the results to a file:
IDL> PCA, data, /COVAR, t = 'pca.dat'

Perform a PCA analysis on the correlation matrix. Suppress all printing, and save the eigenvectors and eigenvalues in output variables:
IDL> PCA, data, eigenval, eigenvect, /SILENT
TEXTOPEN, TEXTCLOSE
Immanuel Freedman (after Murtagh F. and Heck A.). December 1993
Wayne Landsman, modified I/O December 1993
Converted to IDL V5.0 W. Landsman September 1997
Fix MATRIX output, remove GOTO statements W. Landsman August 1998
Changed some index variable to type LONG W. Landsman March 2000