The PRFSVM model combines three components for image analysis: PSPNet, ResNet50, and Fuzzy SVM. It is intended to address several aspects of image analysis, namely image segmentation, classification, and uncertainty handling using Fuzzy SVM approaches. PRFSVM is therefore more than an integration of these three components; it aims to provide a comprehensive solution for a wide range of image analysis tasks.
In this study, we employed a well-established semantic segmentation architecture from the literature, the Pyramid Scene Parsing Network (PSPNet) introduced by Zhao et al.43. PSPNet is a semantic segmentation network designed for complex scenes in which global context information is critical for identifying related objects. It uses a multi-step strategy to segment leaf diseases44,45. It takes a maize leaf image as input and applies a series of convolutional layers to extract features, which capture information ranging from fine surface details of the leaf to broader context. Crucially, PSPNet includes a pyramid pooling module that aggregates features at different spatial scales. This enables the network to capture both local and global context within the image, which is critical for distinguishing disease-affected from healthy regions. Following this contextual enrichment, the network performs semantic segmentation, assigning each pixel a label that indicates whether it belongs to a healthy or diseased area. Subsequent post-processing may further improve the accuracy of the segmentation. The end result is a segmented image in which colours or labels distinguish the classes, such as healthy and diseased parts of the maize leaf. This makes PSPNet effective for segmenting leaf diseases, which aids agricultural disease diagnosis and management. Figure 5 shows the architecture of PSPNet.
Figure 5. Architecture of PSPNet.
The PSPNet architecture can be summarised as follows, with a focus on the pyramid pooling module:
The input image 'I' is passed through the PSPNet model 'PN' to obtain the output of the pyramid pooling module, 'PM'.
The pyramid pooling module 'PM' gathers context information at several scales by operating on the output of the preceding layers. PSPNet typically divides the feature map into regions at various scales (for example, 1 × 1, 2 × 2, 3 × 3, and 6 × 6). Average pooling is applied to each region to generate a context vector. The pooled maps are then upsampled to the resolution of the input feature map and concatenated with it, so that the resulting feature map carries context information from all scales.
The concatenated feature map from the pyramid pooling module is then passed through further convolutional layers and a softmax activation to obtain the final semantic segmentation map.
The result, 'PD', is the output tensor giving, for each pixel, the probability distribution over the semantic classes.
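The steps above, from pyramid pooling to the per-pixel probability map 'PD', can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: the function names, the toy feature map, and the random weights are hypothetical; nearest-neighbour upsampling stands in for the bilinear interpolation normally used in PSPNet; and the per-branch channel-reduction convolutions are omitted for brevity.

```python
import numpy as np

def adaptive_avg_pool(feat, bins):
    """Average-pool a (C, H, W) feature map into a (C, bins, bins) grid."""
    c, h, w = feat.shape
    out = np.zeros((c, bins, bins))
    for i in range(bins):
        r0, r1 = (i * h) // bins, ((i + 1) * h + bins - 1) // bins
        for j in range(bins):
            c0, c1 = (j * w) // bins, ((j + 1) * w + bins - 1) // bins
            out[:, i, j] = feat[:, r0:r1, c0:c1].mean(axis=(1, 2))
    return out

def upsample_nearest(pooled, h, w):
    """Upsample a (C, b, b) grid to (C, H, W) by nearest neighbour (assumption)."""
    _, bh, bw = pooled.shape
    rows = (np.arange(h) * bh) // h
    cols = (np.arange(w) * bw) // w
    return pooled[:, rows[:, None], cols[None, :]]

def pyramid_pooling(feat, bin_sizes=(1, 2, 3, 6)):
    """Concatenate the input features with context pooled at each scale."""
    _, h, w = feat.shape
    branches = [upsample_nearest(adaptive_avg_pool(feat, b), h, w)
                for b in bin_sizes]
    return np.concatenate([feat] + branches, axis=0)

def segmentation_probabilities(pm, weights, bias):
    """1 x 1 convolution over channels followed by a per-pixel softmax."""
    logits = np.einsum("kc,chw->khw", weights, pm) + bias[:, None, None]
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=0, keepdims=True)

# Toy example: 4 feature channels on a 12 x 12 grid, 2 classes
# (e.g. healthy vs. diseased); all values are random placeholders.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 12, 12))
pm = pyramid_pooling(features)                 # 4 + 4 * 4 = 20 channels
weights = rng.normal(size=(2, pm.shape[0]))    # hypothetical classifier weights
bias = np.zeros(2)
pd_map = segmentation_probabilities(pm, weights, bias)
label_map = pd_map.argmax(axis=0)              # per-pixel class label
```

Each pixel of `pd_map` holds a probability for every class, and taking the arg-max over the class axis gives the segmented label image.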
ResNet50, short for Residual Network with 50 layers, is a deep convolutional neural network architecture that has considerably advanced computer vision and image recognition. Introduced in 2015 by Kaiming He et al. 46, ResNet50 addresses the vanishing gradient problem, a long-standing challenge in training deep neural networks, through residual (skip) connections. The pair of operations "7 × 7 conv 64, stride 2" followed by "3 × 3 max-pooling, stride 2" at the beginning of the ResNet50 architecture is the first key step in image feature extraction. The operation "7 × 7 conv 64, stride 2" is a 2D convolutional layer with a 7 × 7 kernel, 64 filters, and a stride of 2. This first convolutional layer serves a dual purpose: it applies filters to the input image to capture important visual elements, and it reduces the spatial dimensions of the feature maps. The "3 × 3 max-pooling, stride 2" operation then refines the extracted features further. It down-samples the feature maps using a 3 × 3 pooling window with a stride of 2, retaining the most prominent features while discarding less relevant information. This gradual reduction in spatial dimensions is an important part of the ResNet50 architecture, as it allows the network to learn increasingly abstract and sophisticated properties in succeeding layers. Figure 6 shows the layer architecture of ResNet50 and Table 1 shows the layer parameters of ResNet50.
Figure 6. Layer architecture of ResNet50.
Table 1. Layer parameters of ResNet50.
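The reduction in spatial dimensions produced by the "7 × 7 conv 64, stride 2" and "3 × 3 max-pooling, stride 2" operations can be checked with the standard output-size formula for convolution and pooling layers. The short sketch below assumes a 224 × 224 input and paddings of 3 and 1 respectively, as in the commonly used ResNet configuration; these values are assumptions and are not stated in the text above.

```python
def output_size(size, kernel, stride, padding):
    """Spatial size after a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

input_size = 224  # assumed input resolution

# "7 x 7 conv 64, stride 2" with an assumed padding of 3
after_conv = output_size(input_size, kernel=7, stride=2, padding=3)

# "3 x 3 max-pooling, stride 2" with an assumed padding of 1
after_pool = output_size(after_conv, kernel=3, stride=2, padding=1)

print(after_conv, after_pool)  # 112 56
```

Under these assumptions the two operations together reduce each spatial dimension by a factor of four before the residual stages begin.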
Here's a simplified representation of the ResNet-50 feature extraction process:
Deviations:
The feature extraction procedure is illustrated below:
Fuzzy Support Vector Machines (Fuzzy SVM) are employed to determine the severity of maize leaf diseases, especially when disease severity labels are not strictly binary but span a range of severity levels. Traditional SVMs are poorly suited to nuanced and imprecise severity assessments. Fuzzy SVM, on the other hand, adds fuzzy logic to the SVM framework, allowing uncertainty in the severity labels to be represented. This is useful in the context of maize leaf diseases, where disease severity may range from mild to severe across cases. Fuzzy SVM assigns membership values for each severity level, capturing the degree to which a sample belongs to the various classes or severity categories.
Fuzzy SVM uses fuzzy membership functions to indicate the degree to which data points belong to distinct classes 47. The Gaussian membership function, which is frequently used in Fuzzy SVM, assigns each data point a membership value for each class and is given by Eq. (8):
In Eq. (8), wij(x) denotes the degree to which data point x belongs to class i. It is computed from the Euclidean distance between x and the mean of the Gaussian function for class i, scaled by the spread parameter σ.
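A Gaussian membership function of this kind can be sketched as follows. The sketch uses the common form exp(−‖x − mean‖² / (2 · spread²)), which is an assumption and may differ in detail from Eq. (8); the function name, the class means, and the spread value are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_membership(x, class_means, sigma):
    """Membership of point x in each class, one value per row of class_means.

    Assumed form: exp(-||x - mean_i||^2 / (2 * sigma^2)).
    """
    x = np.asarray(x, dtype=float)
    class_means = np.asarray(class_means, dtype=float)
    sq_dist = np.sum((class_means - x) ** 2, axis=1)  # squared Euclidean distance
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Two hypothetical class means in a 2-D feature space.
means = np.array([[0.0, 0.0], [3.0, 4.0]])
memberships = gaussian_membership([0.0, 0.0], means, sigma=5.0)
```

A point lying exactly on a class mean receives membership 1 for that class, and the membership decays smoothly towards 0 as the distance grows relative to the spread.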
Fuzzy SVM seeks a hyperplane that maximizes the margin between classes while taking into account the fuzzy memberships and slack variables (ξij) through a modified objective function.
In this modified objective function, w denotes the weight vector of the hyperplane and c the bias term. The fuzzy slack variables (ξij) allow certain data points to fall inside the margin or even be misclassified, reflecting the fuzzy character of the data.
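For orientation, a widely used form of the fuzzy SVM objective for the two-class case is written out below. It is given only as a hedged illustration and may differ from the objective function used in this study: the single sample index j, the membership weight s_j, the feature map φ, and the penalty constant λ are notational assumptions, while w, c, and ξ follow the description above.

```latex
\begin{aligned}
\min_{w,\,c,\,\xi}\quad & \frac{1}{2}\lVert w \rVert^{2} + \lambda \sum_{j=1}^{N} s_j\,\xi_j \\
\text{subject to}\quad & y_j\left(w^{\top}\phi(x_j) + c\right) \ge 1 - \xi_j, \\
& \xi_j \ge 0, \qquad j = 1,\dots,N .
\end{aligned}
```

In this form a sample with a small membership s_j contributes little to the penalty term, so violating its margin constraint costs less than for a sample with membership close to 1.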
The decision function of Fuzzy SVM, given in Eq. (10), uses the fuzzy memberships and the dual variables (α) obtained during optimisation to classify new data points.
Here, α denotes the dual variables determined during optimisation, and y denotes the class labels. The decision function computes a weighted sum of fuzzy memberships for each class. The dual variables (α) indicate the relevance of each data item and its membership in class i. The resulting value is used to classify new data items based on their fuzzy memberships.
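As a point of reference, the standard kernel SVM decision function, a weighted sum over training points with weights given by the dual variables and class labels, can be sketched as below. This is a generic two-class sketch under stated assumptions and not necessarily the form of Eq. (10): the RBF kernel, the function names, and all numerical values are hypothetical, and the fuzzy memberships are assumed to act during training (by bounding the dual variables) rather than to appear explicitly here.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Radial basis function kernel between two points (assumed kernel choice)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def decision_value(x, support_vectors, alphas, labels, bias, gamma=0.5):
    """Sum over support vectors of alpha_j * y_j * K(x_j, x), plus the bias."""
    total = sum(a * y * rbf_kernel(sv, x, gamma)
                for sv, a, y in zip(support_vectors, alphas, labels))
    return total + bias

# Hypothetical support vectors, dual variables, and labels.
svs = np.array([[0.0, 0.0], [2.0, 2.0]])
alphas = np.array([1.0, 1.0])
labels = np.array([-1, 1])

score = decision_value(np.array([2.0, 2.0]), svs, alphas, labels, bias=0.0)
predicted = 1 if score >= 0 else -1
```

The sign of the score gives the predicted class; for more than two severity levels, one such function per class can be evaluated and the highest score selected.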