Data analysis

Joanne Y. Zhou
Alexander Richards
Kornel Schadl
Amy Ladd
Jessica Rose

Nine candidate kinematic metrics (Figure 2) were considered for the construction of the golf SPI, including peak rotational velocity pre-impact, at impact, and post-impact of the pelvis, upper torso, and X-prime. Before performing the index computation, the kinematic metrics were normalized to zero mean and unit variance using the professional group's distribution.
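As a minimal sketch of this normalization step, the snippet below z-scores every metric with the mean and standard deviation of the professional swings only, so amateur values are expressed relative to the pro distribution. The function name, the synthetic data, and the use of the sample standard deviation (ddof=1) are assumptions made for illustration and are not taken from the protocol.

```python
import numpy as np

def normalize_to_pro(metrics, is_pro):
    """Z-score each metric with the professional group's mean and SD."""
    metrics = np.asarray(metrics, dtype=float)
    is_pro = np.asarray(is_pro, dtype=bool)
    pro_mean = metrics[is_pro].mean(axis=0)
    pro_sd = metrics[is_pro].std(axis=0, ddof=1)  # sample SD: an assumption
    return (metrics - pro_mean) / pro_sd

# Synthetic stand-in data: 30 swings x 9 metrics, first 20 swings labeled pro.
rng = np.random.default_rng(0)
metrics = rng.normal(loc=500.0, scale=80.0, size=(30, 9))
is_pro = np.arange(30) < 20
z = normalize_to_pro(metrics, is_pro)
```

After this step the professional rows have zero mean and unit variance in every column, while the amateur rows generally do not.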

Box plots of the individual extracted kinematic metrics for pro vs. amateur golf swings, showing the median, minimum, maximum, and 25th and 75th percentiles.

All combinations of up to three kinematic metrics were tested using the Kaiser-Meyer-Olkin (KMO) test to determine whether each set of variables was suitable for factor analysis (41). Combinations with KMO values below 0.6 were considered inadequate and were excluded from further analysis. For each combination suitable for factor analysis, a distinct copy of the original dataset containing only the kinematic metrics in that combination was used for further processing.
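The screening step can be sketched as follows, using the standard overall KMO statistic: the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations. The function names and synthetic data are hypothetical, and which swings enter the test is not stated in the protocol, so the sketch simply takes whatever array it is given. Only sizes two and three are enumerated because the statistic needs at least two variables.

```python
from itertools import combinations

import numpy as np

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure for the columns of `data`."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations come from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + p2)

def suitable_combinations(data, max_size=3, threshold=0.6):
    """Return the column-index combinations whose KMO reaches the threshold."""
    kept = []
    for size in range(2, max_size + 1):
        for combo in combinations(range(data.shape[1]), size):
            if kmo(data[:, list(combo)]) >= threshold:
                kept.append(combo)
    return kept

# Synthetic stand-in: three metrics sharing one factor plus one unrelated metric.
rng = np.random.default_rng(0)
factor = rng.normal(size=200)
shared = factor[:, None] + 0.3 * rng.normal(size=(200, 3))
data = np.hstack([shared, rng.normal(size=(200, 1))])
kept = suitable_combinations(data)
```

With this formula a pair of metrics always scores exactly 0.5, because the partial correlation of two variables equals their ordinary correlation, so under a 0.6 cutoff only triplets can pass in this sketch.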

To reduce potential bias and better capture differences between the two groups, the amateur group's data were oversampled using the Synthetic Minority Over-sampling Technique (42), because the professional group contained more than twice as many participants as the amateur group. A sample from the amateur group and its nearest neighbor in the feature space were chosen, and a randomly selected point between the two vectors was added to the dataset. This process was repeated until the number of samples in the two groups was equal.
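The oversampling procedure described above can be sketched in plain NumPy. This is an illustration that follows the text literally (a single nearest neighbor, searched among the original amateur samples only) and not the authors' implementation; the function name, the seed, and the synthetic data are assumptions.

```python
import numpy as np

def oversample_minority(minority, n_target, seed=0):
    """Append interpolated samples until the group has n_target rows."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    # Pairwise Euclidean distances; the diagonal is masked so that a
    # sample is never chosen as its own nearest neighbor.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)

    rows = [minority]
    for _ in range(n_target - len(minority)):
        i = rng.integers(len(minority))  # pick a random original sample
        gap = rng.random()               # random position along the segment
        new_point = minority[i] + gap * (minority[nearest[i]] - minority[i])
        rows.append(new_point[None, :])
    return np.vstack(rows)

# Synthetic stand-in: 25 pro swings and 10 amateur swings with 3 metrics each.
rng = np.random.default_rng(1)
pro = rng.normal(size=(25, 3))
amateur = rng.normal(loc=1.0, size=(10, 3))
balanced_amateur = oversample_minority(amateur, n_target=len(pro))
```

Because every synthetic point lies on a segment between two real amateur swings, the oversampled set never leaves the range spanned by the original amateur data.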

To decouple interrelated parameters into independent components, a principal component analysis (PCA) was performed for each combination. The principal components of each swing were calculated by applying the PCA transformation to the dataset, with the PCA parameters determined from the professional group's subset of kinematic metrics.
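A minimal sketch of fitting the PCA on the professional swings and then applying the same transformation to every swing is shown below. It assumes that all components are retained and that no whitening is applied, neither of which is specified in the protocol; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_pca(pro):
    """Estimate the PCA centre and axes from the professional swings only."""
    pro = np.asarray(pro, dtype=float)
    mean = pro.mean(axis=0)
    _, singular, components = np.linalg.svd(pro - mean, full_matrices=False)
    explained_var = singular ** 2 / (len(pro) - 1)
    return mean, components, explained_var

def apply_pca(data, mean, components):
    """Project swings onto the axes estimated from the pro group."""
    return (np.asarray(data, dtype=float) - mean) @ components.T

# Synthetic stand-in: correlated metrics for 20 pro and 10 amateur swings.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.6, 0.3], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
pro = rng.normal(size=(20, 3)) @ mixing
amateur = rng.normal(loc=1.0, size=(10, 3)) @ mixing

mean, components, explained_var = fit_pca(pro)
pro_scores = apply_pca(pro, mean, components)
amateur_scores = apply_pca(amateur, mean, components)
```

The pro scores are uncorrelated by construction, whereas the amateur scores are only expressed in the pro group's coordinate system and may remain correlated.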

To generate the SPI, the logarithm of the Euclidean distance between each swing's principal components and the average pro swing vector was calculated. A candidate SPI was then obtained by scaling these values so that the mean and standard deviation of the professional golf swings were 100 and 10, respectively.
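The index computation can be sketched as below. Two details are assumptions because the protocol does not state them: the sample standard deviation (ddof=1) is used, and the scale factor is positive, so swings farther from the average pro swing receive larger values. The function name and the synthetic scores are hypothetical.

```python
import numpy as np

def candidate_spi(scores, is_pro):
    """Log-distance to the mean pro swing, rescaled to pro mean 100, SD 10."""
    scores = np.asarray(scores, dtype=float)
    is_pro = np.asarray(is_pro, dtype=bool)
    pro_center = scores[is_pro].mean(axis=0)
    log_dist = np.log(np.linalg.norm(scores - pro_center, axis=1))
    pro_log = log_dist[is_pro]
    # Positive scaling is an assumption; the orientation is not specified.
    return 100.0 + 10.0 * (log_dist - pro_log.mean()) / pro_log.std(ddof=1)

# Synthetic stand-in principal-component scores: 20 pro and 10 amateur swings.
rng = np.random.default_rng(0)
is_pro = np.arange(30) < 20
scores = rng.normal(size=(30, 3))
scores[~is_pro] += 3.0  # amateurs placed farther from the pro centre
spi = candidate_spi(scores, is_pro)
```

A swing that coincides exactly with the pro centre would have zero distance and an undefined logarithm, so real data may need a guard for that case.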

Each candidate SPI was evaluated using logistic regression with leave-one-out cross-validation to assess its predictive performance in classifying pro vs. amateur swings. The area under the receiver operating characteristic curve (AUC) of the cross-validated model was used to determine the optimal set of kinematic metrics.
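The evaluation can be sketched with a small NumPy-only logistic regression so that the example stays self-contained; in practice scikit-learn's LogisticRegression and LeaveOneOut would be the more usual tools. The sketch assumes the candidate SPI is the only predictor, adds a small L2 penalty (an assumption that keeps the fit stable when the groups separate well), and pools the held-out probabilities into a single AUC. All names and data are hypothetical.

```python
import numpy as np

def fit_logistic(x, y, l2=1.0, n_iter=50):
    """Intercept and slope via Newton's method, L2 penalty on the slope."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    penalty = np.diag([0.0, l2])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
        grad = X.T @ (p - y) + penalty @ w
        hess = (X * (p * (1 - p))[:, None]).T @ X + penalty
        w -= np.linalg.solve(hess, grad)
    return w

def auc(y, score):
    """Probability that a random positive outranks a random negative."""
    pos, neg = score[y == 1], score[y == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def loocv_auc(spi, is_pro):
    """Leave-one-out predicted probabilities pooled into one AUC."""
    spi = np.asarray(spi, dtype=float)
    y = np.asarray(is_pro, dtype=float)
    prob = np.empty(len(spi))
    for i in range(len(spi)):
        train = np.arange(len(spi)) != i
        mu, sd = spi[train].mean(), spi[train].std()
        w = fit_logistic((spi[train] - mu) / sd, y[train])
        prob[i] = 1.0 / (1.0 + np.exp(-(w[0] + w[1] * (spi[i] - mu) / sd)))
    return auc(y, prob)

# Synthetic stand-in SPI values: 40 pro swings and 40 amateur swings.
rng = np.random.default_rng(0)
spi = np.concatenate([rng.normal(100, 10, 40), rng.normal(130, 15, 40)])
is_pro = np.arange(80) < 40
score = loocv_auc(spi, is_pro)
```

Repeating this for every candidate SPI and keeping the combination with the highest cross-validated AUC mirrors the selection rule described above.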
