The primary aim was to assess the extent of variability among the nine pathologists in histological grading of DCIS, based on review of the H&E‐stained slides. Slides judged to be of insufficient quality for any histological variable by more than 50% of the participating pathologists were excluded from analysis (n = 12).
Because each slide was evaluated by every pathologist, generalised linear mixed models (GLMMs) for a cross‐classified data structure were used to calculate kappa values as a chance‐corrected measure of association between pathologists (κma) [11, 12]. κma values were obtained by taking into account both the level of exact concordance, i.e. where pathologists assigned exactly the same grade to a slide, and the level of disagreement among the pathologists' classifications. κma values were interpreted according to the criteria suggested by Landis and Koch [13], on a scale where 0.00 represents pure chance and 1.00 perfect agreement: <0.00, no agreement; 0.00–0.20, poor to slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; and 0.81–1.00, almost perfect agreement.
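The Landis and Koch thresholds above can be expressed as a simple lookup; the sketch below (in Python, for illustration only, not part of the study's R analysis) maps a κma value to its agreement label:

```python
def landis_koch(kappa):
    """Map a chance-corrected kappa value to the Landis & Koch (1977) label."""
    if kappa < 0.00:
        return "no agreement"
    if kappa <= 0.20:
        return "poor to slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

print(landis_koch(0.45))  # moderate
```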
We modelled each histological variable separately. To analyse the influence of slide and pathologist characteristics on each histological variable, the GLMMs were adjusted for guidelines used, experience, country, and whether the dominant or the highest grade was assigned in case of heterogeneous DCIS (characteristics of the pathologists), and for the origin of the slide, both country and centre (characteristics of the slides). As all pathologists from the same country used the same guidelines (except in the USA; see supplementary material, Table S3), including both ‘country of pathologists’ and ‘guidelines’ in the same multivariable model resulted in collinearity. We therefore used guidelines rather than country as a covariate to evaluate variation. The κma values from the different adjusted models were compared with those of the intercept‐only models. The ordinal package in the open‐source software R (R Core Team, 2018, Vienna, Austria) was used for all calculations.
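The collinearity between country and guidelines arises because one factor fully determines the other, making the combined design matrix rank-deficient. A minimal sketch (the country and guideline labels here are hypothetical, chosen only to mimic the study's one-to-one mapping):

```python
import numpy as np

# Hypothetical labels: each pathologist's country determines the guideline used
# (as in the study, apart from the USA), so the two factors are redundant.
countries  = ["NL", "NL", "UK", "UK", "DE"]
guidelines = ["Dutch", "Dutch", "UK-NHS", "UK-NHS", "German"]

def one_hot(labels):
    """One-hot encode a list of categorical labels (columns in sorted order)."""
    levels = sorted(set(labels))
    return np.array([[1.0 if x == lv else 0.0 for lv in levels] for x in labels])

# Stack both encoded factors into a single design matrix.
X = np.hstack([one_hot(countries), one_hot(guidelines)])

# Because each guideline column duplicates a country column, the matrix has
# fewer independent columns than total columns: the model cannot separate
# the two effects, so only one factor can be kept as a covariate.
print(X.shape[1], np.linalg.matrix_rank(X))  # 6 columns, rank 3
```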