Functional Data Analysis

Suneeta Godbole; Andrew Leroux; Ashley Brooks-Russell; Prem S. Subramanian; Michael J. Kosnett; Julia Wrobel



This method is extracted from research article: Digit Biomark, Apr 2024

A Study of Pupil Response to Light as a Digital Biomarker of Recent Cannabis Use

DOI: 10.1159/000538561


Functional data analysis (FDA) is a field of statistics that models entire functions (e.g., full trajectories or time series of the pupillary light response) rather than predefined features extracted from them [23, 24]. In our analysis, a single functional unit is the pupillary light response trajectory for a single subject. This functional unit is denoted y_i(t) when the trajectory is modeled as the outcome and x_i(t) when it is modeled as a predictor, where i indexes the participant and t is the time at which the measurement was taken. For example, if participant 1 has the pupillary light response trajectory shown in Figure 1, with a pupil change of −25.3% at 2 s after the start of the light test, then y_i(t) = y₁(2) = −25.3. Similarly, at 5 s after the start of the light test, y₁(5) = −14.9.
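The notation can be made concrete with a minimal sketch that stores one trajectory on a time grid and evaluates y₁(t) by linear interpolation. Only the values at 2 s and 5 s come from the example above; the grid, the remaining values, and the helper name `evaluate` are hypothetical.

```python
# Hypothetical trajectory for one participant: percent change in pupil size
# sampled on a grid of times (seconds). Only the values at 2 s and 5 s come
# from the worked example in the text; the rest are made up for illustration.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y1 = [0.0, -18.0, -25.3, -21.0, -17.5, -14.9]


def evaluate(times, values, t):
    """Return y(t) by linear interpolation between neighbouring grid points."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the observed time range")
    for k in range(len(times) - 1):
        t0, t1 = times[k], times[k + 1]
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return (1 - w) * values[k] + w * values[k + 1]


print(evaluate(times, y1, 2.0))  # value observed at 2 s
print(evaluate(times, y1, 5.0))  # value observed at 5 s
```

Treating the trajectory as a function rather than a list of features means it can be queried at any time within the test, not only at the sampled points.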

We use a functional logistic regression (LogRegr) model to discriminate between those who recently smoked cannabis (combining individuals with daily and occasional use patterns) and those who did not. Functional LogRegr [24–26] relates binary responses y_i (e.g., recent cannabis use vs. no use) to a functional covariate x_i(t) (the pupil response trajectory for the i-th participant). This model is analogous to LogRegr and is given by

$$ \operatorname{logit}\{P(y_i = 1)\} = \beta_0 + \int \beta_1(t)\, x_i(t)\, dt \tag{1} $$

The coefficient β₁(t) can be thought of as a weight function: larger absolute values at a given time t indicate that the pupillary light response at that time during the light test is more strongly associated with the outcome (recent cannabis use). When exponentiated, β₁(t) is interpreted as an odds ratio (OR) at each time t. The integral effectively takes a weighted average of the covariate over the test time, with weights given by β₁(t). This model can be used to detect recent cannabis use by leveraging the full pupillary light response trajectory.
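The role of the integral can be sketched numerically. The example below assumes the linear predictor has the standard form β₀ + ∫ β₁(t) x_i(t) dt and approximates the integral with the trapezoidal rule. The time grid, the coefficient function `beta1`, the trajectory `x_i`, and the function names are hypothetical; they are not estimates from the study.

```python
import math


def trapezoid(times, values):
    """Approximate the integral of a sampled function with the trapezoidal rule."""
    total = 0.0
    for k in range(len(times) - 1):
        total += 0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
    return total


def predicted_probability(beta0, beta1, x, times):
    """P(y = 1) under logit(p) = beta0 + integral of beta1(t) * x(t) dt."""
    integrand = [b * v for b, v in zip(beta1, x)]
    eta = beta0 + trapezoid(times, integrand)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical grid, coefficient function, and trajectory (not study estimates).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
beta1 = [0.0, 0.02, 0.05, 0.05, 0.02, 0.0]
x_i = [0.0, -18.0, -25.3, -21.0, -17.5, -14.9]

# With a zero weight function the trajectory carries no information.
p_zero_weight = predicted_probability(0.0, [0.0] * len(times), x_i, times)
p_weighted = predicted_probability(0.0, beta1, x_i, times)
print(p_zero_weight, p_weighted)
```

Times where β₁(t) is near zero contribute almost nothing to the predicted probability, which is why the coefficient function reads as a weight over the test.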

We compare the functional LogRegr model to a traditional LogRegr model that includes three scalar features of the trajectory as predictors: (a) minimal constriction; (b) rebound dilation; and (c) the slope of the rebound from the point of minimal constriction to the end of the test as calculated in [15]. For rebound dilation, a larger magnitude of area under the curve corresponds to less rebound dilation. We compare the two models in their ability to detect recent cannabis use and expect better detection from the functional LogRegr model. Area under the receiver operating characteristic curve (AUC) is used to compare the ability of each model to discriminate between recent cannabis use and no use, where values closer to 1 indicate better discrimination. The statistical significance of differences between AUCs was calculated with a Mann-Whitney U-statistic [27].
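The link between AUC and the Mann-Whitney U-statistic can be illustrated with a short sketch: the AUC equals the proportion of (recent use, no use) pairs in which the recent-use participant receives the higher predicted score, with ties counted as half. The scores below are hypothetical, and the sketch computes only a single AUC, not the significance test for a difference between two AUCs used in [27].

```python
def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the Mann-Whitney U statistic divided by n_pos * n_neg.

    Counts the pairs in which the positive case scores higher than the
    negative case; ties count as half a pair.
    """
    u = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                u += 1.0
            elif sp == sn:
                u += 0.5
    return u / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities from some fitted model.
recent_use = [0.9, 0.8, 0.4]
no_use = [0.7, 0.3, 0.2]
print(auc_mann_whitney(recent_use, no_use))
```

An AUC of 0.5 corresponds to scores that rank the two groups no better than chance, and 1 to perfect separation.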

We use function-on-scalar regression (FoSR) to model average pupil response trajectories for participants with no cannabis use, occasional cannabis use, and daily cannabis use. FoSR is analogous to linear regression and relates functional responses y_i(t) to scalar covariates x_i (e.g., age, cannabis use group, gender). The FoSR model is

$$ y_i(t) = \beta_0(t) + \beta_1(t)\, I(\text{use group} = \text{occasional}) + \beta_2(t)\, I(\text{use group} = \text{daily}) + \varepsilon_i(t) \tag{2} $$

Indicators of cannabis use group are denoted by I(use group = occasional) and I(use group = daily), which take values of 1 for subjects in the specified category and 0 otherwise. Coefficients β₀(t), β₁(t), and β₂(t) are akin to regression coefficients in linear regression, defined at each time t during the pupillary light response test. The intercept β₀(t) is interpreted as the average trajectory of a participant in the no-use control group. β₁(t) and β₂(t) are the average differences at a specific time t between the occasional use and no-use groups, and the daily use and no-use groups, respectively. The error term ε_i(t) is normally distributed and independent across participants, but the errors may be correlated over time t.
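When group indicators are the only covariates, the coefficient functions have a simple pointwise reading: at each time, β₀(t) is the no-use group mean and β₁(t), β₂(t) are differences in group means. The sketch below computes these pointwise quantities on a shared time grid. It is a simplification for intuition only: it ignores the smoothing over t and the within-trajectory error correlation that an FoSR fit would account for, and all data and function names are hypothetical.

```python
def mean_curve(curves):
    """Pointwise mean of equally sampled trajectories (one list per participant)."""
    n = len(curves)
    return [sum(c[k] for c in curves) / n for k in range(len(curves[0]))]


def pointwise_fosr(no_use, occasional, daily):
    """Return (beta0, beta1, beta2) evaluated on the sampling grid."""
    beta0 = mean_curve(no_use)
    beta1 = [m - b for m, b in zip(mean_curve(occasional), beta0)]
    beta2 = [m - b for m, b in zip(mean_curve(daily), beta0)]
    return beta0, beta1, beta2


# Hypothetical trajectories on a 3-point time grid, two participants per group.
no_use = [[0.0, -30.0, -10.0], [0.0, -34.0, -14.0]]
occasional = [[0.0, -28.0, -16.0], [0.0, -30.0, -18.0]]
daily = [[0.0, -24.0, -18.0], [0.0, -28.0, -22.0]]

beta0, beta1, beta2 = pointwise_fosr(no_use, occasional, daily)
print(beta0)  # average no-use trajectory
print(beta1)  # occasional minus no-use, at each time
print(beta2)  # daily minus no-use, at each time
```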

The mean-centered time from initiation of cannabis smoking to the pupillary light response test, referred to as a time delay (TD), is included in a second FoSR model to explore how the shape of the pupil response trajectory changes over time as cannabis effects potentially become less pronounced. Cannabis use groups were combined to form one “recent use” group, which is compared with the no-use group. This model is given by

where y_i(t), β₀(t), and ε_i(t) have the same interpretation as in the previous FoSR model (Equation 2). β₁(t) is interpreted as the average difference in trajectories at a specific time t comparing recent cannabis use to no use at the average TD, and β₂(t) is the additional average difference at a specific time t for each additional minute of TD.
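One model form consistent with these interpretations is sketched below. It is an illustration, not the article's equation: the indicator name and, in particular, the way TD enters (here multiplied by the recent-use indicator, so that it only shifts the recent-use trajectory) are assumptions.

```latex
y_i(t) = \beta_0(t)
       + \beta_1(t)\, I(\text{recent use}_i)
       + \beta_2(t)\, I(\text{recent use}_i)\, \mathrm{TD}_i
       + \varepsilon_i(t)
```

Because TD is mean-centered, setting TD_i = 0 in this form leaves β₁(t) as the recent-use versus no-use difference at the average delay, and each additional minute of delay adds β₂(t) to that difference.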

This article is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC) (http://www.karger.com/Services/OpenAccessLicense). Usage and distribution for commercial purposes requires written permission.
