The CellCognition installation instructions are given in Additional file 9.
The general steps for the analysis with CecogAnalyzer—the graphical user interface of CellCognition—are described in detail by Sommer and Gerlich and Held et al. [51, 52]. A schematic drawing of the machine learning pipeline is shown in Fig. 4.
Fig. 4 The machine learning pipeline for analysis of microscopy data. Reproduced with permission from [51]
Image pre-processing aims to remove artifacts introduced by the microscope or camera [51]. In this step, smoothing filters are typically used to remove pixel noise, and cellular signal intensity levels are normalized by image flat-field correction [51].
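For illustration only, a minimal Python sketch of these two operations might look as follows; this is not the CellCognition implementation, and the function name, the availability of a flat-field reference image, and the choice of a Gaussian filter are our own assumptions:

```python
import numpy as np
from scipy import ndimage as ndi

def preprocess(raw, flat_field, sigma=1.0):
    """Illustrative pre-processing: flat-field correction, then smoothing."""
    # Normalize the illumination profile to mean 1 and divide it out,
    # flattening uneven illumination across the field of view.
    corrected = raw.astype(np.float64) / (flat_field / flat_field.mean())
    # Gaussian filter as a generic smoothing step against pixel noise.
    return ndi.gaussian_filter(corrected, sigma=sigma)
```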
The objects of interest, which form the basis for classification, are then detected by image segmentation using the object detection parameters (Additional file 10) [53]. Object detection is mainly based on pixel intensities, shape information and distance between objects. A nuclear marker (e.g. Hoechst) is generally used for the primary object detection, in order to distinguish individual cells.
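As a simplified stand-in for this step (CellCognition itself offers more elaborate segmentation options, such as local adaptive thresholding and splitting of touching nuclei; the global Otsu threshold and size cut-off below are our own simplifications), primary object detection on the nuclear channel could be sketched in Python as:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.morphology import remove_small_objects

def detect_nuclei(hoechst, min_size=50):
    """Toy primary object detection on a nuclear-stain image."""
    # Global intensity threshold separates nuclei from background.
    mask = hoechst > threshold_otsu(hoechst)
    # Discard specks smaller than a plausible nucleus (size in pixels).
    mask = remove_small_objects(mask, min_size=min_size)
    # Connected-component labeling: one integer label per detected nucleus.
    labels, n_cells = ndi.label(mask)
    return labels, n_cells
```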
Secondary object regions are derived for a secondary marker (fluorescent dye, e.g. pHrodo™ Red) on the basis of the primary segmentation marker. These secondary regions are expanded areas around the primary regions corresponding to the nuclear marker. The dimensions of these expanded areas are specified based on typical cell size and commonly include the entire cytoplasmic area.
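A hedged sketch of this region expansion is given below, using scikit-image's expand_labels (available from scikit-image 0.18 onward; the expansion distance is a placeholder to be set from the typical cell size, as described above):

```python
from skimage.segmentation import expand_labels

def secondary_regions(primary_labels, expansion_px=20):
    """Grow each nuclear label outward to approximate the cytoplasm."""
    # expand_labels dilates every labeled object by up to `distance` pixels
    # without merging neighboring objects, mirroring the expanded areas
    # around the primary (nuclear) regions described in the text.
    return expand_labels(primary_labels, distance=expansion_px)
```

A larger expansion_px would be used for larger cells, consistent with the increased expansion size reported below for HeLa versus HCT 116 cells.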
Gray-value normalization is essential to object detection: it excludes background noise while ensuring that no specific signal is lost. The lower range value (arbitrary units) used for the normalization is the value in the 16-bit image that corresponds to 0 in the 8-bit image, whereas the higher range value corresponds to 255 in the 8-bit image. These values are generally chosen based on fluorescence intensity measurements of the background and of the maximum intensity values for a few randomly chosen time-lapse images.
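The mapping itself is a simple linear rescaling with clipping; a minimal sketch (the function name is ours) is:

```python
import numpy as np

def normalize_16_to_8(img16, lo, hi):
    """Map the 16-bit range [lo, hi] linearly onto the 8-bit range [0, 255]."""
    # Pixels at or below `lo` become 0; pixels at or above `hi` become 255.
    scaled = (img16.astype(np.float64) - lo) / (hi - lo) * 255.0
    return np.clip(scaled, 0, 255).astype(np.uint8)
```

For example, normalize_16_to_8(frame, 100, 800) corresponds to the 100–800 window chosen for the pHrodo™ Red channel later in this section.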
The third step of the pipeline, feature extraction, is performed for the objects of interest (cells) and the corresponding fluorescence channels. The main features extracted are size, circularity, geometry and texture. Advanced statistical features used for this step are described in Additional file 11 [53].
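For a flavor of what such per-object features look like in practice, the sketch below computes a few of them with scikit-image's regionprops; this is only a small, illustrative subset of the feature set CellCognition actually extracts (see Additional file 11):

```python
import math
from skimage.measure import regionprops

def basic_features(labels, intensity_img):
    """A handful of per-object features: size, circularity, geometry, intensity."""
    feats = []
    for p in regionprops(labels, intensity_image=intensity_img):
        feats.append({
            "size": p.area,
            # Circularity is 1.0 for a perfect disk, lower for ragged shapes.
            "circularity": 4 * math.pi * p.area / (p.perimeter ** 2 + 1e-9),
            "eccentricity": p.eccentricity,       # geometry descriptor
            "mean_intensity": p.mean_intensity,   # crude intensity/texture proxy
        })
    return feats
```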
The support vector machine classifier can be trained for discrimination of different object classes. The user defines these classes and manually annotates example objects for each class (e.g. cells with different characteristics) in the classification browser. Examples are picked by visual observation of the object characteristics and are recorded with the set of features specified in the feature extraction step. For instance, cells in different phases of cell division can be distinguished by defining separate object classes and providing representative examples for each of the respective classes.
The machine learning algorithm is then trained on this example set to discriminate the different object classes. Cross-validation is performed on the training data set to ensure agreement between human and computer annotation.
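In scikit-learn terms (a stand-in for CecogAnalyzer's built-in support vector machine; the kernel choice and fold count below are our own assumptions), training and cross-validating such a classifier could be sketched as:

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_classifier(X, y):
    """Train an SVM on annotated examples and report cross-validated accuracy.

    X: feature matrix (n_examples x n_features); y: class labels."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
    clf.fit(X, y)                              # final fit on all examples
    return clf, scores.mean()
```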
Only after confirming that the machine learning algorithm has inferred the rules needed to discriminate the classes (i.e., a low cross-validation error) is the full data set analyzed.
For this study, CellCognition has been adapted to distinguish two classes of cells: cells with internalized NWs and cells with no internalized NWs. The settings used for the pipeline described in this study are provided in Additional file 9.
In our case, the Hoechst 33342 nuclear stain was used as a reference marker for image segmentation and the primary object detection step. A screenshot of the primary object detection step is shown in Fig. 5a. The red contours define individual cells and mark their nuclear regions.
Fig. 5 Screenshots of the main image analysis steps performed with CecogAnalyzer. a Object detection—primary channel. Object detection processing step for the primary channel corresponding to Hoechst 33342 fluorescence. The contours in red correspond to the nuclear region of the cells and thereby define the number of cells in each time frame. b Object detection—secondary channel. Object detection processing step for the secondary channel corresponding to pHrodo™ Red fluorescence. The contours in green correspond to the area around the nucleus in which the pHrodo™ Red signal (displayed in white) can be detected. c Manual annotation in the annotation browser. Examples picked for the two classes “nanonegative” and “nanopositive.” Hoechst 33342 fluorescence in blue, pHrodo™ Red signal in red. “Nanopositive” cells indicated by “2”, “nanonegative” cells by “1.” d Automatic classification of cells approximately 3 h post-incubation with NWs. “Nanopositive” cells indicated by the yellow contour. “Nanonegative” cells indicated by the green contour
The secondary object region in this study was defined by an expanded area around the primary region corresponding to the Hoechst 33342 nuclear marker. A screenshot of the object detection for the secondary pHrodo™ Red channel is shown in Fig. 5b. The dimensions of this area were specified based on the typical cell size, and secondary object detection was based on the signal from the pHrodo™ Red channel in this expanded area. This signal is displayed in white in Fig. 5b and is found inside the green contours corresponding to the secondary object regions.
It should be noted that for HeLa cells, which are larger than HCT 116 cells, the expansion size was increased. A detailed description of the parameters used for object detection is found in Additional file 10 [53].
For the gray-value normalization of the pHrodo™ Red channel, the range 100–800 was chosen (Min: 100; Max: 800) based on fluorescence intensity measurements of the background and of the maximum intensity values for a few arbitrarily chosen time-lapse images, using ImageJ.
Using a minimum value of 100 ensured that specific signal originating from well-defined spots in the pHrodo™ Red channel was not lost, while background noise was significantly reduced. As the software requires input from the user for initial annotation, such background noise can affect this process and the overall accuracy of the analysis. In the experiment with HCT 116 cells and Ni NWs, however, the background noise was higher than in the other conditions, and a minimum value of 200 was therefore used instead of 100. We recommend that, for each particular experiment and condition, a separate normalization be performed based on measurements of the background intensity for a few arbitrarily chosen time-lapse images. We also recommend using FACS at early time points after incubation with nanostructures (3 and 6 h) to validate the chosen fluorescence intensity normalization window. The choice of the minimum value in the gray-value normalization step is a critical step in the presented pipeline.
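As an illustration of how such a background-based minimum might be estimated programmatically (the published protocol used manual ImageJ measurements; the function name, the percentile, and the averaging over frames below are our own assumptions, not part of the protocol):

```python
import numpy as np

def suggest_min(frames, percentile=99.0):
    """Suggest a normalization minimum from background-dominated frames.

    Takes a high percentile of the pixel intensities in each frame and
    averages across frames, approximating a manual background reading."""
    return float(np.mean([np.percentile(f, percentile) for f in frames]))
```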
The maximum value of 800 was used for all experiments, following fluorescence intensity measurements of the brightest spots in the pHrodo™ Red channel for a few arbitrarily chosen time-lapse images.
The gray-value normalization of the pHrodo™ Red channel ensured that low-intensity signal potentially originating from pHrodo™ Red-tagged NWs bound to the cell surface was excluded, thus addressing limitations of this dye previously noted in the literature [27].
For the feature extraction step, the basic shape features and convex hull features were not used in the case of the secondary pHrodo™ Red channel; gathering more features does not necessarily improve performance and makes the classification exponentially more complex [51].
Classification is central to machine learning and is the key step in the pipeline presented in this study.
For this project, the support vector machine classifier was trained for the discrimination of two different object classes—“nanopositive” (cells with internalized NWs) and “nanonegative” (cells with no internalized NWs).
These classes were defined by manual annotation of approximately 20–50 example objects for each class, as shown in Fig. 5c. The “nanonegative” class was delimited in green and marked with label “1”, and the “nanopositive” class in yellow with label “2”.
As shown in Fig. 5c, the examples were picked by visual observation of the object characteristics. The “nanopositive” class is defined by the red fluorescence signal associated with the blue nuclear reference marker and found in the respective expanded region, while the “nanonegative” class is defined by the absence of such signal.
Cross-validation was performed on the training data set and additional examples were picked to ensure an agreement between human and computer annotation of at least 97 % accuracy.
Upon ensuring that the machine learning algorithm inferred the rules to discriminate the classes, the full data set was analyzed. Each full data set consisted of 145 image frames corresponding to 145 different time points within a 24 h time interval.
Figure 5d shows an example classification performed automatically by the program for an image frame corresponding to approximately 3 h post-incubation with NWs. The “nanopositive” cells are displayed in yellow and the “nanonegative” cells in green.
The object counts for the two classes previously defined, representing the number of “nanopositive” and “nanonegative” cells, were output by the software in a .txt file for each time point of the 24 h time-lapse. The respective object counts allowed for a quantification of the number of cells with internalized NWs. Different areas with distinct populations of cells within the same cell dish were used for the analysis. The percentages of “nanopositive” cells for each area were calculated, averaged and reported with standard deviation.
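The final quantification is then simple arithmetic; a minimal sketch for one time point over several imaged areas (the data layout is our own assumption, read from the .txt output):

```python
import numpy as np

def nanopositive_stats(counts_per_area):
    """Mean and standard deviation of the percentage of 'nanopositive' cells.

    counts_per_area: list of (nanopositive, nanonegative) counts, one pair
    per imaged area of the cell dish."""
    pct = [100.0 * pos / (pos + neg) for pos, neg in counts_per_area]
    return float(np.mean(pct)), float(np.std(pct))

# Example: three areas at one time point.
mean_pct, std_pct = nanopositive_stats([(30, 70), (25, 75), (40, 60)])
```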