PACE package for Functional Data Analysis and Empirical Dynamics (written in Matlab)
Version 2.17 (released June, 2015)
Maintainer as of April 2019: Jianing Fan jngfan at ucdavis.edu
PACE is a versatile package that provides implementations of various methods of Functional Data Analysis (FDA)
and Empirical Dynamics in Matlab. The core of this package is Functional Principal Component Analysis
(FPCA), a key technique for functional data analysis, for sparsely or densely sampled random trajectories and time courses,
via the Principal Analysis by Conditional Estimation (PACE) algorithm. PACE is useful for the analysis
of data that have been generated by a sample of underlying (but usually not fully observed) random trajectories. It
does not rely on pre-smoothing of individual trajectories, a step that is especially problematic when functional data
are sparsely sampled. PACE provides options for linear and nonlinear functional regression and correlation,
for Longitudinal Data Analysis, for the analysis of stochastic processes from
samples of realized trajectories, and for the analysis of underlying dynamics through empirical differential equations.
The development of PACE has been supported by various NSF grants. PACE 1.0 was written by Fang Yao, and subsequent major improvements were made by Bitao Liu. The PACE project is coordinated by Hans-Georg Müller and Jane-Ling Wang.
Contributors and developers include Kehui Chen, Andrea Gottlieb, Shuang Wu, Alexander Peterson, Hao Ji, Xiongtao Dai, Chun-Jui Chen, Andrew Farris, Wenwen Tao, Xiaoke Zhang, Jinjiang He, Cong Xu, Rona Tang,
Wenjing Yang, Jeng-Min Chiou, Joel Dubin, Yaqing Chen, Jianing Fan.
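
For orientation, the conditional estimation step that gives PACE its name can be sketched as follows. This follows the standard formulation described in reference [2]; the notation below is chosen for illustration and is not taken from the package's own documentation. Noisy observations Y_ij of the i-th trajectory at times t_ij are modeled through a Karhunen-Loève expansion, and each FPC score is estimated by its conditional expectation given the observations of that trajectory, under a joint Gaussian assumption:

```latex
\begin{align*}
Y_{ij} &= X_i(t_{ij}) + \varepsilon_{ij}, \qquad
X_i(t) = \mu(t) + \sum_{k=1}^{\infty} \xi_{ik}\,\phi_k(t), \\
\widehat{\xi}_{ik} &= \widehat{E}\left[\xi_{ik} \mid \mathbf{Y}_i\right]
 = \widehat{\lambda}_k\, \widehat{\boldsymbol{\phi}}_{ik}^{\top}\,
   \widehat{\Sigma}_{\mathbf{Y}_i}^{-1}\left(\mathbf{Y}_i - \widehat{\boldsymbol{\mu}}_i\right),
\qquad
\left(\widehat{\Sigma}_{\mathbf{Y}_i}\right)_{jl}
 = \widehat{G}(t_{ij}, t_{il}) + \widehat{\sigma}^2\,\delta_{jl}.
\end{align*}
```

Here Y_i is the vector of the N_i observations of trajectory i, mu_i and phi_ik are the mean function and the k-th eigenfunction evaluated at t_i1, ..., t_iN_i, lambda_k is the k-th eigenvalue, G is the covariance function of the trajectories, and sigma^2 is the measurement error variance. The mean, covariance and eigen-components are estimated from the pooled data of all subjects, so the score estimate remains usable when an individual trajectory has only a few observations.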
PACE 2.17 includes the following options for Functional Data Analysis. Numbers [1], [2] etc. refer to the references listed below.
(1) Fitting of both sparsely and densely sampled random functions from noisy measurements by Functional Principal Component Analysis (FPCA), including spaghetti plots to view the sample of functions (PACE) [1] [2] [5] [10]
(2) Fitting of derivatives for Empirical Dynamics for both sparsely and densely sampled random functions (PACE-DER) [17] [20]
(3) Functional linear regression, fitting functional linear regression models for both sparsely and densely sampled random trajectories, for cases where the predictor is a random function and the response is a scalar or a random function (PACE-REG) [3] [13]
(4) Diagnostics and bootstrap inference for functional linear regression (PACE-REG) [9]
(5) Functional regression models where the responses are random curves and the predictors are covariate vectors; the response principal components are regressed on the predictors through separate single index models, which are fitted by semiparametric quasi-likelihood regression using the SPQR algorithm (PACE-FQR) [21] [22] [23]
(6) Response-adaptive regression (RARE) for the case where both responses and predictors are sparsely sampled random trajectories (PACE-RARE). A version of RARE is also provided for an alternative implementation of FPCA with mean and covariance updating [24]
(7) Assessing functional dependence and functional correlation for multivariate functional observations through functional singular value decomposition and singular components (PACE-SVD) [25]
(8) Generalized functional linear regression (GFLM), where the response is a scalar generalized variable such as binary or Poisson; can also be used for classification of functional data via binary regression (PACE-GLM) [4] [5] [7]
(9) Functional quadratic and polynomial regression (PACE-QuadReg) for functional predictors and scalar responses [19]
(10) Functional Additive Modeling (FAM), an additive generalization of functional linear regression for more flexible functional regression modeling, for the case of functional predictors and both functional and scalar responses (PACE-FAM) [12]
(11) Modeling longitudinal data with repeated generalized responses (binary, Poisson etc.), which are derived from a latent Gaussian process by a link function (PACE-GRM) [11]
(12) The functional variance process, a generalization of variance functions useful for functional variance and volatility modeling (PACE-FVP) [6] [30]
(13) Time-synchronization based on iterated pairwise warping (alignment, registration) for both sparsely and densely sampled functions (PACE-WARP) [8] [15] [16]
(14) Conditional functional distances for sparsely sampled functional data, which complement the L2 distance and can be used for functional clustering and other distance-based applications (PACE) [14]
(15) Linear dynamic modeling and fitting of a first order stochastic differential equation with varying coefficient functions and smooth drift process to implement empirical dynamics (PACE-DYN) with subpackage DiffEq [18] [20]
(16) Dynamical correlation, a correlation measure for repeatedly observed pairs of random functions (PACE-dynCorr) [26]
(17) Functional quantile regression, i.e., quantile regression for scalar responses and functional predictors, e.g. for functional median regression (PACE-Quantile) [27]
(18) Stringing: reordering the components of high-dimensional vector data by MDS, thus transforming the high-dimensional vectors into functional data (PACE-Stringing) [28]
(19) Nonlinear dimension reduction and Manifold Learning: perform nonlinear dimension reduction via ISOMAP or Penalized-ISOMAP, using geodesic distances and multidimensional scaling techniques, and obtaining manifold component representations (PACE-Mani) [29]
(20) Volatility: model volatility trajectories for high frequency observations in financial markets (PACE-Fvola) [6] [30]
(21) Covariate adjusted FPCA for longitudinal data, implementing two methods to accommodate covariate information in FPCA (CaFPCA, standalone package) [31]
(22) Functional single index models for longitudinal data, implementing a new single-index model that reflects the time-dynamic effects of the single index for longitudinal and functional response data (FSIM, standalone package) [32]
(23) Implementation of two-stage FPCA for repeatedly measured functional data, also useful for functional data in two arguments, especially when the design is sparse in one of the arguments (PACE-repf) [33]
(24) Estimation of the stickiness coefficient for sparsely observed functional data (PACE-stick) [34]
(25) Implementation of data-adaptive nonlinear dynamic differential equations from observed functional data, a generalization of linear dynamic modeling in (15) above (PACE-nonlindyn) [35]
(26) Functional clustering by k-center method (PACE-kcfc) [36]
If you use the program, please refer to the respective articles below where the core methodology is described.
[1] Yao, F., Müller, H.G., Clifford, A.J., Dueker, S.R., Follett, J., Lin, Y.,
Buchholz, B., Vogel, J.S. (2003). Shrinkage estimation for functional principal component scores, with application to the population kinetics of plasma folate. Biometrics 59, 676-685.
[2] Yao, F., Müller, H.G., Wang, J.L. (2005). Functional data analysis for sparse longitudinal data. J. American Statistical Association 100, 577-590.
[3] Yao, F., Müller, H.G., Wang, J.L. (2005). Functional linear regression analysis for longitudinal data. Annals of Statistics 33, 2873-2903.
[4] Müller, H.G., Stadtmüller, U. (2005). Generalized functional linear models. Annals of Statistics 33, 774-805.
[5] Müller, H.G. (2005). Functional modeling and classification of longitudinal data. Scandinavian J. Statistics 32, 223-240.
[6] Müller, H.G., Stadtmüller, U., Yao, F. (2006). Functional variance processes. Journal of the American Statistical Association 101, 1007-1018.
[7] Leng, X., Müller, H.G. (2006). Classification using functional data analysis for temporal gene expression data. Bioinformatics 22, 68-76.
[8] Leng, X., Müller, H.G. (2006). Time ordering of gene co-expression. Biostatistics 7, 569-584.
[9] Chiou, J., Müller, H.G. (2007). Diagnostics for functional regression via residual processes. Computational Statistics & Data Analysis 51, 4849-4863.
[10] Müller, H.G. (2009). Functional modeling of longitudinal data. In: Longitudinal Data Analysis (Handbooks of Modern Statistical Methods), Ed. Fitzmaurice, G., Davidian, M., Verbeke, G., Molenberghs, G., Wiley, New York, 223-252.
[11] Hall, P., Müller, H.G., Yao, F. (2008). Modeling sparse generalized longitudinal observations via latent Gaussian processes. Journal of the Royal Statistical Society B 70, 703-723.
[12] Müller, H.G., Yao, F. (2008). Functional additive models. Journal of the American Statistical Association 103, 426-437.
[13] Müller, H.G., Chiou, J.M., Leng, X. (2008). Inferring gene expression dynamics via functional regression analysis. BMC Bioinformatics 9:60.
[14] Peng, J., Müller, H.G. (2008). Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions. Annals of Applied Statistics 2, 1056-1077.
[15] Tang, R., Müller, H.G. (2008). Pairwise curve synchronization for high-dimensional data. Biometrika 95, 875-889.
[16] Tang, R., Müller, H.G. (2009). Time-synchronized clustering of gene expression trajectories. Biostatistics 10, 32-45.
[17] Liu, B., Müller, H.G. (2009). Estimating derivatives for samples of sparsely observed functions, with application to on-line auction dynamics. J. American Statistical Association 104, 704-717.
[18] Müller, H.G., Yang, W. (2010). Dynamic relations for sparsely sampled Gaussian processes. Test 19, 1-29.
[19] Müller, H.G., Yao, F. (2010). Functional quadratic regression. Biometrika 97, 49-64.
[20] Müller, H.G., Yao, F. (2010). Empirical dynamics for longitudinal data. Annals of Statistics 38, 3458-3486.
[21] Chiou, J.M., Müller, H.G., Wang, J.L. (2003). Functional quasi-likelihood regression with smooth random effects. J. Royal Statistical Society B 65, 405-423.
[22] Chiou, J.M., Müller, H.G., Wang, J.L. (2004). Functional response models. Statistica Sinica 14, 675-693.
[23] Chiou, J., Müller, H.G. (2004). Quasi-likelihood regression with multiple indices and smooth link and variance functions. Scandinavian J. Statistics 31, 367-386.
[24] Wu, S., Müller, H.G. (2011). Response-adaptive regression for longitudinal data. Biometrics.
[25] Yang, W., Müller, H.G., Stadtmüller, U. (2011). Functional singular component analysis. J. Royal Statistical Society B 73, 303-324.
[26] Dubin, J., Müller, H.G. (2005). Dynamical correlation for multivariate longitudinal data. J. American Statistical Association 100, 872-881.
[27] Chen, K., Müller, H.G. (2011). Conditional quantile analysis when covariates are functions, with application to growth data. J. Royal Statistical Society B 74, 67-89.
[28] Chen, K., Chen, K., Müller, H.G., Wang, J.L. (2011). Stringing high-dimensional data for functional analysis. J. American Statistical Association 106, 275-284.
[29] Chen, D., Müller, H.G. (2012). Nonlinear manifold representations for functional data. Annals of Statistics 40, 1-29.
[30] Müller, H.G., Sen, R., Stadtmüller, U. (2011). Functional data analysis for volatility. J. Econometrics 165, 233-245.
[31] Jiang, C.R., Wang, J.L. (2010). Covariate adjusted functional principal components analysis for longitudinal data. Annals of Statistics 38, 1194-1226.
[32] Jiang, C.R., Wang, J.L. (2011). Functional single index model for longitudinal data. Annals of Statistics 39, 362-388.
[33] Chen, K., Müller, H.G. (2012). Modeling repeated functional observations. Journal of the American Statistical Association 107, 1599-1609.
[34] Gottlieb, A., Müller, H.G. (2012). A stickiness coefficient for longitudinal data. Computational Statistics & Data Analysis 56, 4000-4010.
[35] Verzelen, N., Tao, W., Müller, H.G. (2012). Inferring stochastic dynamics from functional data. Biometrika 99, 533-550.
[36] Chiou, J.M., Li, P.L. (2007). Functional clustering and identifying substructures of longitudinal data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 69, 679-699.