Principal components analysis for sparsely observed correlated functional data using a kernel smoothing approach

Debashis Paul and Jie Peng

University of California, Davis

Abstract:

In this paper, we consider the problem of estimating the covariance kernel and its eigenvalues and eigenfunctions from sparse, irregularly observed, noise-corrupted and (possibly) correlated functional data. We present a method based on pre-smoothing of individual sample curves through an appropriate kernel. We show that the naive empirical covariance of the pre-smoothed sample curves gives a highly biased estimator of the covariance kernel along its diagonal. We address this problem by estimating the diagonal and off-diagonal parts of the covariance kernel separately. We then present a practical and efficient method for choosing the kernel bandwidth, based on an approximation to the leave-one-curve-out cross-validation score. We prove that, under standard regularity conditions on the covariance kernel and assuming i.i.d. samples, the risk of our estimator under L2 loss achieves the optimal nonparametric rate when the number of measurements per curve is bounded. We also show that, even when the sample curves are correlated in such a way that the noiseless data have a separable covariance structure, the proposed method remains consistent, and we quantify the role of this correlation in the risk of the estimator.

Keywords: Functional data analysis, principal component analysis, kernel smoothing, cross-validation, consistency
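
The diagonal bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator. It assumes a toy rank-one model X_i(t) = xi_i * sqrt(2) * sin(pi * t) with standard normal scores, a uniform design on [0, 1], Gaussian noise and an Epanechnikov kernel; the function names and all parameter values are hypothetical. It compares the naive squared pre-smoothed curve at a point t with a version that keeps only products of distinct measurements, a standard device for removing the same-measurement terms that inflate the diagonal.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)


def simulate_sparse_curves(n, m, sigma, rng):
    """Toy data (an assumption, not the paper's setting).

    X_i(t) = xi_i * sqrt(2) * sin(pi * t), so C(s, t) = 2 sin(pi s) sin(pi t).
    Each curve is observed at m uniform random points with N(0, sigma^2) noise.
    """
    T = rng.uniform(0.0, 1.0, size=(n, m))   # observation times
    xi = rng.normal(size=(n, 1))             # one random score per curve
    X = xi * np.sqrt(2.0) * np.sin(np.pi * T)
    Y = X + sigma * rng.normal(size=(n, m))  # noisy measurements
    return T, Y


def diagonal_estimates(T, Y, t, h):
    """Two estimates of C(t, t) from kernel-weighted measurements.

    The pre-smoothed curve is (1/m) * sum_j K_h(t - T_ij) * Y_ij, which is
    adequate here only because the design density is uniform (equal to 1).
    """
    m = T.shape[1]
    A = epanechnikov((t - T) / h) / h * Y    # K_h(t - T_ij) * Y_ij
    s1 = A.sum(axis=1)                       # sum over j
    s2 = (A**2).sum(axis=1)                  # same-measurement (j = k) terms
    naive = np.mean((s1 / m) ** 2)           # includes the j = k terms
    cross_only = np.mean((s1**2 - s2) / (m * (m - 1)))  # j != k terms only
    return naive, cross_only


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, Y = simulate_sparse_curves(n=20000, m=5, sigma=0.5, rng=rng)
    naive, cross_only = diagonal_estimates(T, Y, t=0.5, h=0.1)
    print("true C(0.5, 0.5):", 2.0)
    print("naive estimate:", naive)
    print("distinct-pairs estimate:", cross_only)
```

In this toy model the true value is C(0.5, 0.5) = 2. The same-measurement terms contribute a quantity of order 1/(m h) to the naive estimate, which does not vanish when the number of measurements per curve m stays bounded and the bandwidth h shrinks, so the naive value should sit well above 2 while the distinct-pairs value should stay close to it.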