===========
 Description:
 ===========
 
             This is the main program to perform Functional Principal Component 
             Analysis (FPCA) via PACE (principal analysis via conditional expectation).
             The principal component scores can be estimated through conditional expectation 
             or via classical integration. For the latter, one can choose a shrinkage method 
             for estimated scores. See Reference [2].  For shrinkage, see Reference [1].
 
 ======
 Usage:
 ======
 
  function [no_opt,sigma,lambda,phi,eigen,xi_est,xi_var,mu,bw_mu,xcov,bw_xcov,
            xcovfit, AIC,BIC,FVE,y_pred, y_predOrig, out1,out21,...
            y,t, regular, rho_opt, mucopy, phicopy, eigencopy, out1copy, out21copy, xcovcopy, xcovfitcopy, xcorr]...
         = PCA(y,t,bwmu,bwmu_gcv, bwxcov,bwxcov_gcv,ntest1,ngrid1,selection_k,
           FVE_threshold, maxk,control,regular,error, ngrid,method,shrink,...
           newdata,kernel, numBins, yname, screePlot, designPlot, corrPlot, rho, verbose,method_mu)
 
 ================
 Input Arguments:
 ================
 
 Input y:          1*n cell array, y{i} is the vector of measurements for the
                   ith subject, i=1,...,n.
 Input t:          1*n cell array, t{i} is the vector of time points for the
                   ith subject for which corresponding measurements y{i} are
                   available, i=1,...,n.
 
 Input bwmu:       scalar bwmu>=0, bandwidth for mean curve mu,
                   0:  use cross-validation or generalized cross-validation
                       to choose bandwidth automatically.      [Default]
                   bwmu(>0): user-specified bandwidth.
 
 Input bwmu_gcv:   For choice bwmu = 0, two types of cross-validation can
                   be performed: One-curve-leave-out cross-validation (CV)
                   or  generalized cross-validation (GCV)
                   0: CV   (may be time-consuming)
                   1: GCV  (faster)                            [Default]
 
 Input bwxcov:     1*2 vector, bandwidths for covariance surface used for
                   smoothing of cov(X(t),X(s))
                   bwxcov(i): ith coordinate of bandwidth vector, i=1,2.
                   bwxcov(1)==0 & bwxcov(2)==0: use cross-validation (CV)
                   or generalized cross-validation (GCV) for automatic
                   selection.                                  [Default]
                   bwxcov(1)>0 & bwxcov(2)>0: user-specified bandwidths.
 
 Input bwxcov_gcv: If setting bwxcov = [0 0], automatic bandwidth selection
                   by CV or GCV choices
                   0: CV method (may be time-consuming, can be accelerated by 
                      choosing small values for ntest1 and ngrid1).
                   1: GCV method (faster)                      [Default]
 
 Input ntest1:     integer(<=n), number of curves used for CV when choosing
                   bandwidths for smoothing the covariance surface. The subjects 
                   in the test set are randomly selected from n subjects. Small 
                   ntest1 will accelerate CV at less accuracy. [Default is 30.]
 
 Input ngrid1:     integer, number of support points for the covariance surface 
                   in the CV procedure (selecting bandwidths of covariance).
                   Note that input ntest1 and ngrid1 provide options to save
                   computing time when using CV or GCV.        [Default is 30.]
 
 Input selection_k: the method of choosing the number of principal components K.
                    'AIC1': use AIC criterion with pseudo-likelihood of
                            measurements (marginal likelihood).
                    'AIC2': use AIC criterion with likelihood of measurements
                            conditional on estimated random coeffecients.
                    'BIC1': use BIC criterion with pseudo-likelihood of
                            measurements (marginal likelihood).
                                                                [Default]
                    'BIC2': use BIC criterion with likelihood measurements
                            conditional on estimated random coeffecients.
                    'FVE' (fraction of variance explained) : use scree plot
                            approach to select number of principal
                            components), see "FVE_threshold" below.
                     positive integer K: user-specified number of principal 
                                         components
 
                    Note: BIC1 and FVE produce the most parsimonious models.
 
 Input FVE_threshold:  a positive number that is between 0 and 1 [Default is 0.85.]
                       It is used with the option selection_k = 'FVE' to select
                       the number of principal components that explain at least
                       "FVE_threshold" of total variation (the fraction
                       of variance explained).
 
 Input maxk:      integer, the maximum number of principal components to consider
                  if using automatic methods to choose K, i.e., 'AIC1', 'AIC2',
                  'BIC1' or 'BIC2' defined by selection_k.      [Default is 20.]
                  Note: when selection_k = 'FVE' or 'AIC_R', maxk is ignored.
 
 Input control:   'auto', Select K by minimizing AIC or BIC, or find the
                          first K such that the FVE_threshold is exceeded. [Default]
                  'look', a scree plot (FVE% Vs No. of PC) will be generated based
                          on K <= 15. User will be prompted to enter user-specified
                          K after viewing scree plot. This can be combined
                          with any setting of selection_k.
 
 Input regular:   0, sparse (or irregular) functional data.      
                  1, regular data with missing values
                  2, completely balanced (regular) data.
                  [], automatically chosen based on the design in t.   [Default]
 
 Input error:     0, no additional measurement error assumed.
                  1, additional measurement error is assumed.    [Default]
 
 Input ngrid:     integer, number of support points in each direction of 
                  covariance surface when performing principal component 
                  analysis ( ngrid > K).                    [Default is 51.]
 Input method:    used for computing random effects \xi_{ik}
                  'CE': conditional expectation method           [Default]
                  'IN': classical integration method
                  Note: 'CE' can be applied for sparse data or regular data, but
                        'IN' only in the case of regular data.
 
 Input shrink:    indicator of whether applying shrinkage to estimates of random
                  coefficients (for regular data only)
                  0:  no shrinkage when method = 'CE' or error = 0 [Default]
                  1:  shrinkage when method = 'IN' and error =1, otherwise, this
                      will be re-set to 0.
 
 Input newdata:   a row vector of user-defined output time grids for
                  all curves. This corresponds to "out1" in the output argument
                  If newdata = [], then "out1" corresponds to the set of distinct
                  time points from the pooled data.
                  "newdata" is supposed to be a vector in ascending order on
                   the domain of the functions.                    [Default is []]
 
 Input kernel:    a character string to define the kernel to be used in the
                  1-D or 2-D smoothing
                  kernel = 'epan'  ==> Epanechnikov kernel [Default for dense designs with n_i >= 20]
                           'rect'  ==> Rectangular kernel
                           'gauss'  ==> Gaussian kernel    [Default for sparse designs, regular designs with
                                                            missings, dense designs for n_i < 20]
                  Note: The Gaussian kernel is overall best for sparse designs but is slower than the other kernels 
                        and if computational speed is of concern then one may wish to use the Epanechnikov kernel 
                        also for the case of sparse designs.
 
 Input numBins:   0: no binning
                  a positive interger (>= 10): prebin the data with user-defined
                  number of bins. When numBins < 10, no binning will be performed.
 		  []:  prebin the data with the following rule    [Default]
 
                  i) When the input data is regular = 1 or 2
                     m = max of n_i, where n_i is the number of repeated measurements
                     for ith subject.
                  ii) regular = 0
                     m = median of n_i
 
                  When m <= 20 subjects, no binning.
                  When n <= 5000 subjects and m <= 400, no binning.
                  When n <= 5000 subjects and m > 400, numBins = 400.
                  When n > 5000 subjects, compute
 
                  m* = max(20, (5000-n)*19/2250+400)
 
                  if m > m*, numBins = m*
                  if m <= m*, no binning
 
                  This option accelerates analysis, especially for data with
                  many time measurements.
 
 Input yname:     a character string which denotes the name of the current
                  function to be estimated.               [Default is []]
                  It is used in the title part of the scree plot output.
                  If it is set to [], then yname is set to be the same
                  name as the first input argument from PCA() or FPCA().
                  For example, if this analysis is for function y,
                  then yname = 'y'.
 
 Input screePlot: indicator of whether to create the scree plot
                  1  a scree plot will be created         
                  0  no scree plot will be created        [Default]
 Input designPlot: indicator of whether to create the design plot
                  1  a design plot will be created
                  0  no design plot will be created       [Default]
                  Interpretation of design plot: All subareas of the
                  domain x domain support square of the covariance surface need
                  to be covered more or less homogeneously with pairs of design points.
 Input corrPlot:  indicator of whether to create the correlation surface plot
                  1  a correlation surface plot will be created
                  0  no correlation surface plot will be created       [Default]
 
 Input rho:       truncation threshold for the iterative residual that is used
                  in the estimation of FPC scores. (see FPCscore.pdf under Help/ for more details)
                  -1:  compute unadjusted FPC scores (as in previous PACE versions)
                  >0:  user-defined choice of rho
                  0:   do not set truncation threshold, but use iterative residuals for sigmanew (see in output description below)
                  'cv-random':  a character string which specifies to use randomized leave-one-measurement-out CV approach to find
                         the optimal value of rho.        
                  Note that this choice contains a random element and therefore the analysis is not exactly
                   replicable when running  the program twice.
                  'cv': use non-randomized leave-one-measurement-out CV approach to find the optimal value of rho. [default]
 
 Input verbose:   a character string
                  'on': display diagnostic messages       [Default]
                  'off': suppress diagnostic messages
 
 Input xmu:       1*N vector, user-defined mean functions valued at distinct
                  input time points, in ascending order from all subjects, corresponding to out1. 
 
 
 Input method_mu: method to estimate mu
                 'RARE': using response-adaptive method to update
                  estimated       
                 'PACE': using the original method   [default]

 ========
  Details:
 ========
        1)       bwmu = 0 and bwmu_gcv = 1 ==> use GCV method to choose
                 the bandwidth for mean function
 
                 For Gaussian kernel, the optimal bandwidth from GCV is multiplied
                 by 1.1.
 
                 bwmu = 0 and bwmu_gcv = 0 ==> use CV method to choose the
                 bandwidth for mean function
 
                 bwmu > 0 then bwmu_gcv will be ignored, subjective
                 bandwidth choice
 
        2)       bwxcov = [0 0] and bwxcov_gcv = 1 ==> use GCV method to
                 choose the bandwidth for cov function
 
                 For Gaussian kernel, the optimal bandwidth from GCV is multiplied
                 by 1.1.
                 
                 bwxcov = [0 0] and bwxcov_gcv = 0 ==> use CV method to
                 choose the bandwidth for cov function
 
                 bwxcov(1) > 0 and bwxcov(2) > 0 then bwxcov_gcv will be
                 ignored, subjective bandwidth choice
 
                 If eigenvalues estimation is of primary interest, you may want
                 to undersmooth the covariance surface for better estimates 
                 of eigenvalues.
 
        3)       If error=1, AIC1, BIC1, AIC2, BIC2, FVE all can be used
                 for choosing K.
 
                 If error=0, only AIC1, BIC1 and FVE can be used by
                 definition of these criteria.
 
        4)       If some arguments are not needed or you want to use
                 default values, just use [] for these arguments.
 
                 Alternatively, use setOptions() to define nessary input
                 arguments and call PCA through FPCA.
 
        5)       In general, any output in the form of 1-dimensional
                 vectors will be a row vector and any output related to
                 1-dimensional cell arrays will be a row cell array.
 
 =================
 Output Arguments:
 =================
 Output no_opt:   integer, automatically or subjectively selected value of K, the number of selected components.
 
 Output sigma:    scalar, estimate of measurement error variance if
                  error=1, while it is [] if error=0.
 
 Output lambda:   1*K vector, estimated eigenvalues (variances of functional principal components scores).
 
 Output phi:      N*K matrix, estimated principal component functions
                  valued at distinct input time points with ascending
                  order of all subjects, corresponding to out1
 
 Output eigen:    ngrid*K matrix, estimated principal component functions,
                  valued at out21, ngrid of the pooled distinct time points
                  with ascending order of all subjects,
                  phi is an interpolated version of eigen at out1
 
 Output xi_est:   n*K matrix, predictions for random coeffecients (PC
                  scores) for n subjects.
 
 Output xi_var:   K*K matrix, Var(PC score)-Var(estimated PC score). The
                  omega matrix in equation (7) of the paper, which is used
                  to construct the point-wise C.I. for X_i(t)
 
 Output mu:       1*N vector, estimated mean functions valued at distinct
                  input time points (newdata = []), in ascending order from
                  all subjects, corresponding to out1; when newdata is defined,
                  corresponds to the time points from newdata, same as
                  out1.
 
 Output bw_mu:    scalar(>0), automatically or subjectively selected
                  bandwidth for smoothing mean curve.
 
 Output xcov:     ngrid*ngrid matrix, smoothed covariance surface (diagnal
                  removed), corresponding to out21
 
 Output bw_xcov:  1*2 vector(>0), automatically or subjectively selected
                  bandwidths for smoothing covariance surface.
 
 Output xcovfit:  ngrid * ngrid matrix, fitted covariance surface, based
                  on truncated estimate of eigenvalues ("lambda") and
                  principal component functions ("eigen"), corresponding
                  to out21
 
 Output xcorr:    ngrid * ngrid matrix, fitted correlation surface, based
                  on truncated estimate of eigenvalues ("lambda") and
                  principal component functions ("eigen"), corresponding
                  to out21
 
 Output AIC:      1*K vector, AIC values obtained when choosing K from
                  K=1 to K=maxk, where AIC(K) is the minimum. If AIC
                  method is not applied, it is []
 
 Output BIC:      1*K vector, BIC values obtained when choosing K from
                  K=1 to K=maxk, where BIC(K) is the minimum. If BIC
                  method is not applied, it is []
 
 Output FVE:      1*ngrid vector of fraction of variance explained
 
 Output y_pred:   cell array, y_pred{i} is the vector of predictions for
                  the ith subject evaluated at time points from the output
                   grid vector "out1".
 
 Output y_predOrig: cell array, y_predOrig{i} is the vector of predictions
                    for the ith subject at the same time points as the input.
 
 Output out1:     1*N vector, distinct input time points with ascending
                  order from all subjects if newdata = []; otherwise, it
                  is the same as newdata.
 
 Output out21:    1*ngrid vector, a grid of time points for which the
                  smoothed covariance surface assumes values, i.e.,
                  ngrids from out1.
 
 Output y:        if no binning is performed, same as the input y
                  if binning is performed, 1 * n cell array, y{i} is a vector
                  of measurements after binning for subject i, i = 1,...,n
 
 Output t:        if no binning is performed, same as the input t
                  if binning, 1 * n cell array, t{i} is a vector of
                  time points after binning for subject i, i = 1,...,n
                  Each value of t{i} corresponds to the midpoints of
                  each bin.
 
 Output regular:  if no binning is performed or regular = 2, this is the
                  same as the input
                  if binning is performed and regular = 0, it will be
                  reset to regular = 1. In other words, after binning,
                  the sparse and irregular case is analyzed as regular
                  with missings by sampled data.
 
 Output rho_opt:  if rho is set as 'cv' or 'cv-random', then rho_opt is the optimal rho obtained
                   from the CV method, otherwise, it is the same as the input rho
                  When rho is 'cv', rho_opt is non-random for a give data set
                  When rho is 'cv-random', rho_opt can be different due to the randomly leave-out measurements
                  for the same data set.
 
 Output sigmanew: if rho is set as >0, 0 or 'cv', then sigmanew is the iterative
                  residual sum of squares (see FPCscore.pdf for more details). It
                  can be used as an estimate of the variance of the measurement
                  errors.
                  if rho is set as -1, then sigmanew is set to the same as output
                  sigma.
  See also FPCA, setOptions, showOptionNames, example