Statistics

Marie S. A. Fernandez
Hedi A. Soula
Mylene M. Mariette
Clémentine Vignal

All statistical tests were performed using R software (R Core Team, 2014). Linear mixed models were built with the lmer function (lme4 R package), and generalized mixed models were built with the glmer function (lme4 R package) (Bates et al., 2014). Model outputs from the “Anova” (car library) (Fox and Weisberg, 2011) and “summary” functions are presented.

Before being interpreted, each model was checked, paying particular attention to its residuals. For generalized linear models with a Poisson family, overdispersion was tested with the “overdisp.glmer” function of the “RVAideMemoire” package (Hervé, 2014); if the model was overdispersed, we used a negative binomial family instead. Model validity was also checked with the “plotresid” function from the same package before interpreting the model results.
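As a rough illustration of the overdispersion check described above, the Python sketch below computes a Pearson dispersion ratio for a Poisson fit. It is not the authors' R code and does not reproduce “overdisp.glmer”; the function name, the counts, and the intercept-only model (every fitted value equals the sample mean) are assumptions made for illustration.

```python
from statistics import mean

def pearson_dispersion(counts, n_params=1):
    """Pearson chi-square divided by the residual degrees of freedom.

    Assumes an intercept-only Poisson model, so every fitted value is
    the sample mean and the Poisson variance equals that mean.
    """
    mu = mean(counts)
    chi2 = sum((y - mu) ** 2 / mu for y in counts)
    resid_df = len(counts) - n_params
    return chi2 / resid_df

# Hypothetical vocalization counts, for illustration only.
counts = [2, 0, 15, 1, 30, 4, 0, 22]
ratio = pearson_dispersion(counts)
print(f"dispersion ratio = {ratio:.2f}")
```

A ratio well above 1 suggests more variance than the Poisson family allows, which is the situation in which a negative binomial family was used instead.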

We chose to build biologically relevant models and we kept the full model as recommended by Forstmeier and Schielzeth (2011).

When possible we added information about the quantification of the biological effect given by the models. Confidence intervals were computed with the “confint.merMod” function of the lme4 package. We used the “profile” method for the linear mixed models and the “Wald” method for the negative binomial models.
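To make the “Wald” method concrete, the Python sketch below computes a Wald interval from a coefficient and its standard error. It is an illustration under stated assumptions, not the authors' code: the function name and the numeric values are hypothetical, and the “profile” method is not shown because it requires refitting the model.

```python
from statistics import NormalDist

def wald_ci(estimate, std_error, level=0.95):
    """Wald interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical coefficient and standard error, for illustration only.
lower, upper = wald_ci(0.80, 0.25)
print(f"95% Wald CI: [{lower:.3f}, {upper:.3f}]")
```

The Wald interval is symmetric around the estimate by construction, whereas a profile interval need not be.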

We only kept random factors that had a non-null variance in the model. If we were interested in the significance of the random factors included in the model, we used the following method. We first looked at the values of their residuals in the model summary (“summary” function in lme4 package). We then built two different models: one model including the random factor, and one model without the random factor. We compared these models using the “Anova” function, and if these models were not significantly different we assumed that the random factor effect was not significant. All random factors with non-null variance were kept in the models even if they had no significant effect.
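The comparison of models with and without a random factor can be sketched as a likelihood ratio test. The Python example below assumes that the comparison amounts to such a test with one degree of freedom (one variance parameter removed). The function name and the log-likelihood values are hypothetical, and this is not the authors' R code.

```python
from math import erfc, sqrt

def lrt_one_df(loglik_with, loglik_without):
    """Likelihood ratio test for one extra variance parameter.

    Returns the test statistic and the p-value from a chi-square
    distribution with 1 degree of freedom.
    """
    stat = 2.0 * (loglik_with - loglik_without)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    p_value = erfc(sqrt(stat / 2.0))  # chi-square(1) survival function
    return stat, p_value

# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_one_df(-120.3, -121.1)
print(f"LRT statistic = {stat:.2f}, p = {p:.3f}")
```

With these illustrative values the two models would not differ significantly at the 0.05 threshold, so the random factor effect would be treated as non-significant while still being kept in the model if its variance is non-null.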

First, for the general vocal activity of the group, we ran a Principal Component Analysis (PCA) on six parameters: the number of bursts, the average vocalization rate in bursts, the mean burst duration, the total duration of bursts, the inter-burst interval, and the latency to burst. We found two axes with eigenvalues above 1 that together explained 88.5% of the data variability. The first axis describes the general pattern of how bursts were distributed in time (61.7%), and the second axis the density of vocalizations during the recording, both within bursts and overall (26.8%) (Figure 2).
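The reported percentages can be cross-checked against the eigenvalue criterion with a little arithmetic. The Python sketch below assumes the PCA was run on standardized variables (a correlation matrix), in which case the eigenvalues sum to the number of variables. That assumption is not stated in the text, and the variable names are illustrative.

```python
n_variables = 6  # parameters entered in the PCA (from the text)
explained = {"PC1": 0.617, "PC2": 0.268}  # proportions reported in the text

# Assumption: on a correlation matrix the eigenvalues sum to the number
# of variables, so eigenvalue = proportion explained * number of variables.
eigenvalues = {axis: share * n_variables for axis, share in explained.items()}
remaining = n_variables - sum(eigenvalues.values())

for axis, value in eigenvalues.items():
    print(f"{axis}: implied eigenvalue = {value:.3f}")
print(f"sum left for the other four axes = {remaining:.3f}")
```

Under that assumption both retained axes have implied eigenvalues above 1, and the four remaining axes together account for less than 1, which is consistent with only two axes passing the threshold.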

General vocal activity between group types. (A) Boxplot of the PC2 values in each group type, from the PCA including six parameters describing the bursts of vocal activity. Linear mixed effect models were built. Detailed sample sizes and model results are given in Table 1. Boxes are median, first and third quartiles (Q1 and Q3 respectively). The upper whisker is located at the *smaller* of the maximum x value and Q3 + 1.5 interquartile range (IQR), whereas the lower whisker is located at the *larger* of the smallest x value and Q1 − 1.5 IQR. Individual points more extreme in value than Q3 + 1.5 IQR are plotted separately at the high end, and those below Q1 − 1.5 IQR are plotted separately on the low end. (B) Variable loadings of the PCA including six parameters on bursts. The first two axes (with eigenvalues above 1) explained 88.5% of the data variability. *p < 0.05.

We built one linear mixed model per PCA axis (PCi) with the following structure:

PCi ~ GroupType + (1|GroupID) + (1|Day) + (1|StartTime)

GroupType had two levels: 2Juv2Ad and 4YAd. The random factors were the group identity (GroupID), the day of recording (Day), and the hour of the recording start (StartTime).

The group type 4YAd always had the same sex ratio (2 females and 2 males). As a second step, we restricted the analysis to the group type 2Juv2Ad alone to study the potential influence of the group sex ratio [possible sex ratios for juveniles: 2 males (2M), 2 females (2F), or 1 male and 1 female (1F1M)].

We built the following generalized linear mixed model (negative binomial family):

The response variable was the number of vocalizations. The factor Sex had two levels, M or F. We used a negative binomial model because the model using a Poisson distribution presented overdispersion. The model indicated an interaction between GroupType and Sex at the significance threshold, so we examined it using the lsmeans R function.

We built a second model to study the influence of being a juvenile or an adult for GroupType = 2Juv2Ad.

The factor JuvAd had two levels: Juv or Ad.

For groups including juveniles, as several factors were linked, we had to build additional models to deal with confounding effects. We built a model using juvenile data only to test the influence of the sex on the number of vocalizations. As the factor SexRatio was strongly linked to the factor Sex we did not include it in this model:

We then built a model using the females' data only to test the difference between adult and juvenile females (as the males were juveniles only).

First we built a model in order to compare the cross-correlation between group types (2Juv2Ad and 4YAd):

The distance between two birds could be 1 or 2 (1: birds were on the same edge of the square, 2: birds were placed on the diagonal). The factor Sex1Sex2 had three levels: FF, MM, or FM and represented the sexes of both birds from which we computed the cross-correlation.
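The coding of the two dyad-level factors can be sketched as follows. This Python example is only an illustration: the corner numbering (0 to 3 going around the square) and both function names are assumptions, not part of the original protocol.

```python
def dyad_distance(pos1, pos2):
    """Distance class between two corners of a square.

    Corners are numbered 0-3 going around the square (an assumed
    encoding): adjacent corners share an edge (1), opposite corners
    lie on the diagonal (2).
    """
    if pos1 == pos2:
        raise ValueError("two birds cannot occupy the same corner")
    return 2 if (pos1 - pos2) % 4 == 2 else 1

def dyad_sexes(sex1, sex2):
    """Order-independent label with three levels: FF, MM or FM."""
    return "".join(sorted((sex1, sex2)))

print(dyad_distance(0, 1), dyad_distance(0, 2), dyad_sexes("M", "F"))
```

Sorting the two sexes before joining them ensures that a male-female dyad receives the same label regardless of which bird is listed first.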

As the interaction between the group type and the sex was significant we first separated the dataset by group type and analyzed them separately:

GroupType = 2Juv2Ad:

The factor Sex1Sex2 was strongly linked to two other factors: JuvAd (three levels: JuvJuv, AdAd, JuvAd), which indicated whether the dyad of birds comprised only juveniles, only adults, or one juvenile and one adult, and SexRatio (as the sex ratio could differ between groups). We therefore first built the following model including the factors SexRatio and JuvAd:

cc ~ JuvAd + Dist + SexRatio + JuvAd:Dist + JuvAd:SexRatio + (1|GroupID) + (1|Day) + (1|Bird1ID) + (1|Bird2ID) + (1|StartTime)

We then separated the dataset by sexes to assess the difference between the cross-correlations of two juveniles and two young adults. As we had only one data point per bird in this case, the only remaining random factor was Day. For each value of Sex1Sex2 (MM, FM, FF) we built the following model:

We first built a model to compare the maximum transition probabilities between group types (2Juv2Ad and 4YAd):

As the interaction between GroupType and Sex1Sex2 was significant we analyzed the group types separately, as we did for the cross-correlation.
