Sengupta95 Sengupta, S. K. and J. Boyle, 1995: Nonlinear principal component analysis of climate data. PCMDI Report 29, Program for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, 26 pp.


In traditional principal component analysis (PCA), a few significant linear combinations of the original variables are extracted to arrive at a parsimonious description of a complex data set obtained from climate observations, analyses, or general circulation model (GCM) output. These combinations are uncorrelated variables that are used in practice to understand the principal modes of variation in the climatological process under study. If we drop the requirements of linearity and uncorrelatedness, a greater data reduction is possible, allowing us to deal with fewer modes of variation. These nonlinear functions can be obtained by using a series of auto-associative feed-forward neural networks in which the residuals from one network serve as both the input and the target output of the next. It can be shown that in special cases such networks yield ordinary principal components. We have explored this methodology to gain a better understanding of precipitation data over the US, observed over land and the bordering oceans, for the decade 1979 to 1988. A careful comparison with the linear counterpart has been made. The improvement in data reduction is noticeable but not overwhelming. Certain details in the modes of variation are more pronounced in the nonlinear representation. The leading nonlinear mode captures the seasonal cycle more clearly than the leading linear mode; in the linear case, the seasonal cycle is shared with subsequent modes of the PCA. The principal linear and nonlinear modes of the observational data have been compared with the corresponding modes of data obtained from a GCM simulation. We conclude that nonlinear principal component analysis (NLPCA) based on auto-associative neural networks is potentially a more effective data reduction tool than conventional PCA. Also, the principal modes of variation of the precipitation data of the continental US are better differentiated by NLPCA than by ordinary PCA.
NLPCA should be tried as an alternative method, especially when linear PCA fails to show meaningful patterns in climatological data analysis.
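
The sequential, residual-based extraction described in the abstract can be illustrated with a minimal sketch of the linear special case only. Everything here is an assumption made for illustration and is not taken from the report: the function names (`make_demo_data`, `fit_linear_autoencoder`, `sequential_modes`), the single-unit bottleneck, plain gradient descent, the learning rate, and the synthetic three-variable data set standing in for a climate field. Each network is auto-associative (its input is also its target), and the residual it leaves behind is passed to the next network. Because the networks are linear, the decoder directions should line up with the ordinary principal axes; a nonlinear variant would place nonlinear hidden layers around the bottleneck.

```python
import numpy as np


def make_demo_data(n=500, seed=0):
    """Synthetic stand-in for a data matrix (rows = times, columns = variables)."""
    rng = np.random.default_rng(seed)
    # Independent scores with standard deviations 3, 1 and 0.3 ...
    scores = rng.normal(size=(n, 3)) * np.array([3.0, 1.0, 0.3])
    # ... rotated by a random orthogonal matrix so the axes are not trivial.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return scores @ q.T


def fit_linear_autoencoder(X, n_steps=5000, lr=0.01, seed=0):
    """Fit x_hat = (x @ a) * b, a one-unit linear bottleneck, by gradient descent."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    a = rng.normal(scale=0.1, size=p)  # encoder weights
    b = rng.normal(scale=0.1, size=p)  # decoder weights
    for _ in range(n_steps):
        z = X @ a                      # bottleneck values, shape (n,)
        err = np.outer(z, b) - X       # reconstruction error, shape (n, p)
        grad_b = err.T @ z / n         # gradient of the mean squared error w.r.t. b
        grad_a = X.T @ (err @ b) / n   # gradient of the mean squared error w.r.t. a
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


def sequential_modes(X, n_modes=2, **fit_kwargs):
    """Extract modes one at a time; each network is trained on the previous residual."""
    residual = X - X.mean(axis=0)
    modes = []
    for k in range(n_modes):
        a, b = fit_linear_autoencoder(residual, seed=k, **fit_kwargs)
        z = residual @ a
        residual = residual - np.outer(z, b)  # what this network failed to explain
        modes.append((z, b))
    return modes, residual


X = make_demo_data()
modes, residual = sequential_modes(X, n_modes=2)

# Compare the decoder directions with the principal axes from an SVD.
Xc = X - X.mean(axis=0)
principal_axes = np.linalg.svd(Xc, full_matrices=False)[2]
for k, (_, b) in enumerate(modes):
    cosine = abs(b @ principal_axes[k]) / np.linalg.norm(b)
    print(f"mode {k + 1}: |cos(angle to principal axis)| = {cosine:.4f}")
print(f"unexplained variance fraction: {residual.var() / Xc.var():.4f}")
```

In this sketch the second network never sees the original data, only what the first network left unexplained, which is the chaining idea the abstract describes. The agreement with the SVD reflects the linear special case; it says nothing about how much a nonlinear network would gain on real precipitation data.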