Article

Optimizing a Multi-Layer Perceptron Based on an Improved Gray Wolf Algorithm to Identify Plant Diseases

1 College of Information Technology, Jilin Agricultural University, Changchun 130118, China
2 College of Computer Science and Technology, Jilin University, Changchun 130118, China
3 College of Foreign Languages, Jilin Agricultural University, Changchun 130118, China
4 Center for Artificial Intelligence, Jilin University of Finance and Economics, Changchun 130118, China
* Author to whom correspondence should be addressed.
Qiaoyun Tian has made a major contribution to this paper.
Mathematics 2023, 11(15), 3312; https://doi.org/10.3390/math11153312
Submission received: 15 June 2023 / Revised: 7 July 2023 / Accepted: 12 July 2023 / Published: 27 July 2023
(This article belongs to the Special Issue Evolutionary Computation 2022)

Abstract: Metaheuristic optimization algorithms play a crucial role in optimization problems. However, traditional identification methods have the following problems: (1) difficulties in nonlinear data processing; (2) high error rates caused by local stagnation; and (3) low classification rates resulting from premature convergence. This paper proposes a variant of the gray wolf optimization algorithm (GWO) with chaotic disturbance, candidate migration, and attacking mechanisms, named the enhanced gray wolf optimizer (EGWO), to address premature convergence and local stagnation. The performance of the EGWO was tested on the IEEE CEC 2014 benchmark functions, and its results were compared with those of three GWO variants, five traditional and popular algorithms, and six recent algorithms. In addition, the EGWO was used to optimize the weights and biases of a multi-layer perceptron (MLP), yielding an EGWO-MLP disease identification model; the EGWO-MLP was verified on UCI datasets, including the Tic-Tac-Toe, Heart, XOR, and Balloon datasets. The experimental results demonstrate that the proposed EGWO-MLP model can effectively avoid local optimization problems and premature convergence and provides a quasi-optimal solution for the optimization problem.

1. Introduction

The agrochemical network reported that an informal meeting of EU agriculture ministers was held in Prague. The conference’s theme was food security, the role of European agriculture, and food in global sustainable food production. The conflict between Russia and Ukraine, the lingering effects of the COVID-19 pandemic, and intensifying climate change are having a major impact on global food security and prices. As the world’s population grows, more food must be produced, sustainable agricultural production must increase, and food waste must be reduced.
Farmers face crop disease management and prevention issues [1]. Due to the diversity and complexity of soybean diseases, they are more likely to appear over large areas under certain conditions, resulting in escalating soybean yield reductions. The impact of soybean disease increases with scale, but identifying the disease and evaluating the crop’s final yield through improved and enhanced disease models remain a challenge [2]. During the growth of soybean, a variety of leaf diseases often occur, affecting the yield and quality of the soybean; these are also major factors in destroying crop health and causing genetic mutations. Cell mutations or tissue damage can lead to reduced yields and even crop extinction [3]. Agricultural diseases can seriously affect crop growth and threaten food security. Timely spraying of pesticides can control the spread of diseases and is one of the main measures to reduce losses [4]. Soybean acreage and planting methods are constantly changing, and conservation tillage practices such as intercropping and straw return are increasing, which makes it more difficult to predict and control diseases [5]. Rapid and accurate diagnosis of crop diseases will enable measures to be taken to improve their overall management and the effective prevention and control of the diseases [6]. Soybean leaf diseases are shown in Figure 1.
The gradual emergence of food crop diseases has attracted significant attention from various countries. Therefore, conducting in-depth research on the identification and effective and accurate control of crop diseases is of great significance. Learning-based image processing techniques are often used for crop disease diagnosis and recognition [7]. Traditional image recognition technology based on manually collecting image features will affect the model’s overall classification and recognition performance [8]. Agricultural intelligent detection based on the Internet of Things (IoT) and artificial intelligence (AI) is aimed at monitoring and detecting diseases [9].
With the development of swarm intelligence, artificial intelligence, and intelligent agriculture, scholars have researched crop disease identification, crop image processing based on computer technology, and crop disease identification technology [10]. Pests and pathogens enter crops and produce externally visible traits. Deep convolutional transfer learning with neural networks has been used to identify foliage diseases of crops [11]. Semantic segmentation using pictures of dead plant leaves has been applied to identify disease [12]. Image processing strategies have recently been used widely and increasingly in agriculture due to their excellent characteristics, including accuracy, speed, and cost-effectiveness [13].
Artificial neural networks (ANNs), also called neural networks (NNs) or connectivity models, are algorithmic mathematical models with distributed parallel information processing that mimic the behavioral characteristics of animal neural networks. Neural networks can have multiple layers or a single layer. Each layer contains several neurons connected by directed arcs with variable weights. The network is trained by iterative learning of known information, gradually adjusting the connection weights of the neurons to process information and simulate input–output relationships. NNs are commonly used in machine learning [14]. For example, NNs can be used for image identification [15], speech identification [16], and so on, and can be extended to other fields. Computer vision and deep learning advances can predict impending crop disease [17]. Examples of common NNs are backpropagation (BP) networks [18] and convolutional neural networks (CNNs) [19,20]. The most classic neural network is the multi-layer perceptron (MLP) [21]. The advantages of the MLP can be summarized as follows: (1) high parallel processing; (2) a highly nonlinear global effect; (3) good fault tolerance; (4) an associative memory function; (5) strong adaptive and self-learning capabilities.
There has been a tendency to use the MLP to build identification models, such as remote sensing spectral imaging based on an MLP-CNN classifier [22]. Swarm intelligence optimization algorithms have been applied to optimize the weights and biases of the MLP, such as the PSO-based backpropagation learning MLP [23], the application layer attack detection algorithm based on MLP-GA [24], and neural-based electric load forecasting using hybrid feature selection of GA and ACO [25].
The swarm intelligence optimization algorithm is a new computing intelligence technology, shown in Figure 2, which has attracted increasing attention from scholars. It originates from the simulation of biological evolution processes or behaviors in nature. The objective function measures an individual’s adaptability to the environment. Based on survival of the fittest or individual foraging behavior, it finds the best agents and replaces poor feasible solutions with better feasible solutions in the iterative optimization process. Forming a search algorithm with the characteristics of “generation + testing”, it is a method of solving optimization problems based on adaptive artificial intelligence technology.
The gray wolf optimizer (GWO) [26] is an efficient optimization algorithm that imitates the wolf hierarchy and has a strong ability to approximate the global optimum. In the GWO, gray wolves are generally divided into four levels, and the α wolf, β wolf, δ wolf, and ω wolf represent the gray wolves of the first, second, third, and fourth levels, respectively. It has been successfully applied to solve feature subset selection [27], the optimization of MLP weights and biases [28], etc. Scholars have been studying it and have proposed several variants, such as the MGWO [29], a binary GWO with the integration of DE and GWO [30], and a hybrid version of GWO and PSO [31]. However, the No Free Lunch (NFL) theorem states that no single heuristic algorithm can solve all optimization problems in different scenarios [32], which means newly proposed hybrid optimization algorithms can only solve some problems. Some heuristics can enhance their ability to escape local optima, but premature convergence still occurs [33]. Thus, although some of the above variants have been proposed, there is room for improvement. In this regard, the motivations of the present study are as follows.

1.1. Motivations behind the Present Work

Although the GWO algorithm has certain advantages in some aspects, it also has some disadvantages:
1. Slow convergence speed;
2. Affected by the initial parameters;
3. Poor performance on high-dimensional problems.
Therefore, the GWO should be improved so that it approaches the global optimal solution and alleviates local stagnation. Specifically:
  • The performance of the GWO should be ensured to become infinitely close to the global optimal solution.
  • Local stagnation should be relieved.
  • The possible defects of MLP include overfitting, difficulty in determining the optimal structure, long training times, and the ease of falling into a locally optimal solution. The improved GWO-MLP alleviates these problems and increases the optimization capability of the MLP to improve the classification rate and alleviate local stagnation.

1.2. Contribution of This Study

In order to further improve the performance of the GWO algorithm and make up for its shortcomings, the weights and biases of the MLP are optimized by the enhanced GWO. Combining the GWO algorithm with the MLP brings the advantages of a non-linear modeling ability, a better generalization ability, automatic learning of feature representations, and a faster optimization speed. The following is a summary of the specific contributions:
  • EGWO with the non-linear change of parameter a contributes to the balance between the exploration and exploitation capability.
  • The introduction of the chaotic disturbance mechanism is conducive to search diversity.
  • The candidate migration mechanism ensures the accuracy of the global optimal solution to strengthen the global convergence ability.
  • The attacking mechanism guarantees a trade-off between the exploration and exploitation capabilities.
  • The EGWO-MLP model is built to identify crop disease.
The remainder of this paper is organized as follows. Section 2 presents previous studies about the GWO, the MLP, and their practical applications. The MLP, the GWO, and the proposed enhanced gray wolf optimizer (EGWO) are introduced in Section 3. Section 4 explains the EGWO-MLP model. Section 5 presents the experiments and datasets. Section 6 analyzes and discusses the experimental results. The soybean identification model is presented in Section 7. In Section 8, the conclusion of the paper and prospects are presented. The overall structure of the whole study is shown in Figure 3.

2. Literature Review

2.1. Meta-Heuristic Optimization Algorithms (MOAs)

Meta-heuristic optimization algorithms (MOAs) are a class of universal algorithms that can solve complex optimization problems. These algorithms do not rely on knowledge of specific problem domains but instead use highly exploratory search strategies to find global or near-optimal solutions in the solution space [34]. Although meta-heuristic algorithms cannot guarantee finding the global optimal solution, they have been proven effective in achieving excellent results in many practical applications. In addition, compared with traditional deterministic algorithms, meta-heuristic algorithms have the advantages of good parallel performance, strong global search ability, and adaptability to the problem structure. Computational techniques that derive inspiration from physical or biological phenomena can solve optimization problems. These can be divided into four categories: algorithms (a) based on physics, (b) based on evolution, (c) based on population, and (d) based on humans.
(1) Evolutionary computation class algorithms
This idea is mainly used to optimize the search space continuously by simulating biological evolution and genetic mechanisms [35,36]. Standard evolutionary computation algorithms include genetic algorithms, differential evolution algorithms, etc.
(2) Swarm intelligence optimization algorithm
This algorithm mainly simulates the behavior of certain natural groups, such as ant colonies, bird colonies, fish colonies, etc., treating the search space as an “ecosystem” to achieve a global optimal search through collaborative action. Common swarm intelligence algorithms include the Ant Algorithm (AA) [37], particle swarm optimization (PSO) [38], the Artificial Fish Swarm Algorithm (AFSA) [39], etc.
(3) Physics-based optimization algorithm
Physics-based meta-heuristic optimization algorithms are a class of heuristic algorithms inspired by physical phenomena and principles [40]. These algorithms simulate physical processes or physical behaviors in nature to solve optimization problems. Common physics-based meta-heuristic optimization algorithms are as follows: Simulated Annealing (SA) [41], the Gravitational Search Algorithm (GSA) [42], the Atomic Search Algorithm (ASO) [43], Big-Bang Big-Crunch (BBBC) [44], etc. These physics-based meta-heuristic optimization algorithms are flexible, easy to implement, and applicable to various optimization problems. They can be widely used in continuous, discrete, and combinatorial optimization, etc., with global search and fast convergence abilities to some degree.
(4) Human-based optimization algorithms
Human-based meta-heuristic optimization algorithms are a class of heuristic algorithms that are inspired by the way humans think and behave. These algorithms attempt to solve optimization problems by simulating human cognition, learning, and decision-making processes. Some of the human-based techniques include Forensic-Based Investigation Optimization (FBIO) [45], Political Optimizer (PO) [46], and Heap-Based Optimizer (HBO) [47]. These human-based meta-heuristic optimization algorithms are inspired by human intelligence and behavior with global search and fast convergence abilities to some degree. They can be widely used in continuous, discrete, and combinatorial optimization, and have achieved good results in solving complex problems.

2.2. Improved GWO and Its Application

The GWO algorithm is a meta-heuristic optimization algorithm based on group behavior inspired by the social behavior of gray wolves. This algorithm has been widely used and researched to solve various optimization problems. Since the GWO algorithm was proposed, it has been used in many fields and achieved remarkable results. The GWO algorithm is simple and easy to implement and adjust. The algorithm can find the global optimal solution in the search space by combining random search and the social behavior strategy. The GWO algorithm has been widely used in continuous, discrete, and combinatorial optimization problems. Researchers have made many improvements to and extensions of the GWO algorithm to improve its performance and adaptability. For example, introducing methods such as adaptive parameter control, chaotic search strategies, local search mechanisms, and multi-group structures enhances the convergence and searchability of the algorithm. The GWO algorithm has become an effective tool for solving optimization problems and has received extensive attention in both theoretical research and practical applications.
  • Mechanism Innovation
To ensure a precise approximation to the global optimum, an algorithm named mGWO has been proposed, in which the exploration and exploitation capabilities are balanced by adjusting the operator to improve the search accuracy for the global optimal solution [29]. Furthermore, to solve the premature convergence and local stagnation problems, a gray wolf optimizer (DSGWO) based on diversity-enhanced strategies, such as group competition mechanisms and exploration–exploitation balance mechanisms, was proposed to improve the performance of the GWO [48]. A collaboration-based hybrid optimizer called chHGWOSCA was introduced to balance the exploration and exploitation capabilities; this optimizer combines the gray wolf optimization algorithm and the sine–cosine algorithm (SCA) and improves the parameter a to enhance global convergence [49]. A hybrid GWO–sine cosine algorithm (GWOSC) has been proposed [50], in which the GWO and the SCA are responsible for different tasks in different update stages to balance the exploration and exploitation capabilities. Inspired by the gaze-cue behavior of wolves, two novel search strategies have been developed, including Neighbor Gaze-Cue Learning (NGCL) and Random Gaze-Cue Learning (RGCL) [51]. In addition, a weighted distance gray wolf optimizer (wdGWO) has been proposed, where the weighted sum of the best positions is used instead of just simple positions [52]. An improved algorithm called the exploration-enhanced GWO (EEGWO), which improved the exploration capabilities, was developed [53]; in addition, a nonlinear control parameter strategy is used to control the balance between the exploration and exploitation capabilities.
  • Practical Application
To optimize Radio Resource Management (RRM), a fractional GWO-CS optimization model integrating the GWO with the cuckoo search (CS) algorithm has been proposed, which optimizes parameters like the power spectral density (PSD), the transmission power, and the sensing bandwidth (SB) [54]. The GWO has been combined with a gated recurrent unit (GRU) neural network to compensate for the reduced calibration time and ensure the fibre optic gyroscope’s (FOG) calibration accuracy [55]. The GWO algorithm combined with the Integer Wavelet Transform (IWT) reduces information loss in image steganography [56]; the inverse IWT and unsigncryption algorithms are then utilized to extract the secret image from the stego image. A functional composition integration approach has been adopted to develop a hybrid meta-heuristic algorithm called HGWO-PSO [57]. HGWO-PSO combines the advantages of the GWO and particle swarm optimization (PSO) and considers Clerc’s parameter setting to improve decision making in the oil and gas industry. A hybrid GWO and differential evolution algorithm (HGWODE) has been proposed by combining the GWO and DE algorithms to balance exploration and exploitation [58].
  • Feature Selection
A modified gray wolf optimizer has been proposed that introduces the ReliefF algorithm and Copula entropy in the initialization process to improve the quality of the initial population [59]. In addition, two new search strategies have been adopted into the GWO to make the search more flexible and avoid local stagnation [58]. A feature selection model for network intrusion detection systems (NIDS) has been proposed based on particle swarm optimization (PSO), the GWO, firefly optimization (FFA), and the genetic algorithm (GA) [60]; it aims to improve the performance of NIDS and shows promising results in terms of the false positive rate (FPR). A binary version of a hybrid two-stage multi-objective FS method based on PSO and GWO has also been proposed, in which the first goal is to minimize the classification error rate and the second is to reduce the number of selected features; the proposed FS method performs more efficiently and effectively than other meta-heuristic, statistical, and multi-objective FS methods [61]. A multi-strategy ensemble GWO (MEGWO) has been proposed, which incorporates three search strategies to update the solutions [62]. A binary hybrid of the GWO and Harris Hawks Optimization (HHO) has been proposed to form a memetic approach called HBGWOHHO, in which the continuous search space is transformed into a binary one based on the sigmoid function to perform feature selection [63]. A binary method has been developed into a two-phase multi-objective FS approach, based on PSO and GWO, which is applied to feature selection [64].
  • Optimization of the Artificial Neural Network (ANN)
The GWO optimizes the weights of the networks to reduce the error. A GWO-ANN model has been developed to improve accuracy and is compared with PSO, multiple linear regression (MLR), and nonlinear regression (NLR) models [65]. Two machine learning techniques have been adopted, i.e., ANN and GWO, to predict the level of road crash severity [66]. The proposed approach is a novel hybrid machine learning model that combines an ANN and the augmented gray wolf optimizer (AGWO); based on the experimental findings, the suggested ANN-AGWO can be utilized as a high-performance tool. The fuzzy C-means clustering algorithm (FCM) has been used to cluster historical electricity consumption data, and the FCM-GWO-BP neural network model has been proposed to predict energy consumption [67]. An improved GWO based on Levy flight has been proposed to help the GWO jump out of local stagnation; it can be applied to train backpropagation (BP) networks [68].
  • Algorithm Optimization of the MLP
An MLP has been used with a genetic algorithm (MLP-GA) to estimate the detection efficiency using metrics [24]. An electrical conductivity (EC) model was constructed based on the hybrid machine learning model MLP-GWO; the hybrid MLP-GWO model has potential implications in precision agriculture [69]. GWO-MLP, PSO-MLP, and SSA-MLP have been proposed and trained on different objective functions [70]. Two classes of algorithms, including bio-inspired and gradient-based algorithms, have been adopted to train the MLP for pattern classification [71]. An artificial immune network (opt-aiNet), PSO, and an evolutionary algorithm (EA) were used to train MLP networks; in addition, standard backpropagation with momentum (BPM), a quasi-Newton method (DFP), and the modified scaled-conjugate gradient (SCGM) were used to train MLP networks [72]. The PSO algorithm was adopted to optimize a feedforward ANN designed to predict the maximum power point of a photovoltaic array, and feedforward neural networks (FNNs) have been trained with the GSA and PSO to alleviate local stagnation and accelerate convergence [73].

3. Method

3.1. Multi-Layer Perception

The multi-layer perceptron (MLP) is also known as an ANN. It is a feed-forward structured artificial neural network that maps a set of input vectors to a set of output vectors, as shown in Figure 4. In addition to the input and output layers, the MLP has one or more hidden layers in the middle. The MLP is an extension of the perceptron and overcomes the weakness that the perceptron cannot recognize linearly inseparable data. The simplest MLP has only one hidden layer, giving a three-layer structure. The MLP is fully connected across the input, hidden, and output layers; a hidden layer can learn an arbitrary non-linear function of the input (with an infinite number of hidden nodes). MLP networks consist of multiple layers of neurons related to each other through directed connections, forming a directed graph-like structure. Each node fully connects with the next layer of nodes. Each node, except for the input nodes, is a neuron with a non-linear activation function. The MLP network is trained to improve its performance in supervised learning using the backpropagation algorithm.
An MLP is composed of multiple layers of directionally connected neurons [74]; it has nonlinear mapping, high parallelism, noise resistance, fault tolerance, and high generalization abilities [75], and it has been widely used in many practical problems, such as nonlinear discriminants. If an MLP is used for regression, it can approximate a nonlinear input function; furthermore, any function with continuous inputs and outputs can be approximated by an MLP. Novel optimization algorithms are suitable for training neural networks; such algorithms are based on sequential operator splitting techniques for specific related dynamical systems [76]. MLP training aims to find an optimal set of weights and bias parameters that minimize the mean squared error (MSE) value, which is essential for the optimization process [77]. In other words, an efficient MLP can be constructed to minimize a given error criterion by continuously adjusting and updating the weight and bias parameters.
The layer relationship of the MLP network can be expressed as $i \rightarrow j \rightarrow k$, where i acts as a subscript for the neurons of the upper (input) layer, j acts as a subscript for the neurons of the current (hidden) layer, and k serves as a subscript for the neurons of the next (output) layer.
The weighted sum $h_j$ can be computed by Equation (1).
$$ h_j = \sum_{i=0}^{m} w_{ij}\, x_i \quad (1) $$
where $w_{ij}$ represents the weight from each neuron in the previous layer to the current neuron, and $w_{jk}$ represents the weight between the current neuron and the next layer of neurons, that is, the weight to neuron k. $h_j$ represents the sum of all weighted inputs for the current node.
The output value of a hidden layer neuron is computed by Equation (2).
$$ a_j = g(h_j) = g\left( \sum_{i=0}^{M} w_{ij}\, x_{ij} \right) \quad (2) $$
where $a_j$ represents the output value of a hidden layer neuron, $g(h_j)$ represents the activation function, w is the weight, and x is the input. $a_j = x_{jk}$, i.e., the output value of the current neuron is equal to the input value of the next neuron. The output-layer value can be computed by Equation (3).
$$ y = a_k = g(h_k) = g\left( \sum_{j=0}^{M} w_{jk}\, x_{jk} \right) \quad (3) $$
where y denotes the value of the output layer, which is the final result, and $h_k$ represents the sum of the weighted inputs of the neurons in the output layer.
Each layer has an activation function. The sigmoid function is expressed by Equation (4).
$$ g(h) = \sigma(h) = \frac{1}{1 + e^{-h}} \quad (4) $$
where the derivative of the sigmoid function can be computed by Equation (5).
$$ \sigma'(x) = \sigma(x)\left[ 1 - \sigma(x) \right] \quad (5) $$
To update the output layer weights $w_{jk}$, the gradient descent method can be applied to the loss function, which can be expressed by Equation (6); $w_{jk}$ can then be obtained by Equation (7).
$$ w_{jk} \leftarrow w_{jk} - \eta \frac{\partial E}{\partial w_{jk}} \quad (6) $$
$$ w_{jk} = w_{jk} - \eta\, \sigma_o(k)\, a_i \quad (7) $$
where $a_i$ is the output value of the previous layer, that is, the input value $x_i$ of the output layer. The hidden layer error term can be updated by Equation (8).
$$ \sigma_h(j) = g'(h_j) \left( \sum_{k=1}^{N} \sigma_o(k)\, w_{jk} \right) \quad (8) $$
The hidden layer weights $v_j$ ($w_{ij}$) can be updated by Equation (9).
$$ v_j = v_j - \eta\, a_j (1 - a_j) \left( \sum_{k=1}^{N} \sigma_o(k)\, w_{jk} \right) a_i \quad (9) $$
where $a_j$ is the output value of the current neuron, $a_j = g(h_j)$, and $a_i$ is the input value of the neuron in the current layer (the output value of the previous layer). When there are multiple hidden layers, the weights can be computed by Equation (10).
$$ v_j = v_j - \eta\, a_j (1 - a_j) \left( \sum_{k=1}^{N} \sigma_h(k)\, w_{jk} \right) a_i \quad (10) $$
The final output can be computed by Equation (11).
$$ p = \sum_{i} a_i\, w_{ij} = h_j \quad (11) $$
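As an illustration only (not the authors' MATLAB implementation), the following minimal Python sketch evaluates the forward pass of a one-hidden-layer MLP with sigmoid activations corresponding to Equations (1)–(4); the variable names (W_in, W_out, b_hidden, b_out) are assumed for this sketch, and the biases are written as explicit vectors instead of being folded into the i = 0 term of the sums.

```python
import numpy as np

def sigmoid(h):
    # Equation (4): logistic activation
    return 1.0 / (1.0 + np.exp(-h))

def mlp_forward(x, W_in, b_hidden, W_out, b_out):
    """Forward pass of a one-hidden-layer MLP.
    x:      (n_inputs,) input vector
    W_in:   (n_inputs, n_hidden) weights w_ij from the input to the hidden layer
    W_out:  (n_hidden, n_outputs) weights w_jk from the hidden to the output layer
    """
    h_j = x @ W_in + b_hidden    # Equation (1): weighted sums of the hidden layer
    a_j = sigmoid(h_j)           # Equation (2): hidden-layer outputs
    h_k = a_j @ W_out + b_out    # weighted sums of the output layer
    return sigmoid(h_k)          # Equation (3): output-layer values y
```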

3.2. Gray Wolf Optimizer

3.2.1. Principle of Motion

The gray wolf optimizer (GWO) is a swarm-based algorithm that simulates the behavior and leadership roles in the social structure of a pack of wolves and is inspired by the social behavior of gray wolves, such as the hierarchy and hunting mechanisms [26]. There are four types of wolves in a wolf pack: the α wolf, which makes every decision and is responsible for the survival of all members of the pack; β wolves, whose social status in the pack is just below the α wolf; δ wolves, which are responsible for caring for the pups or the pack; and ω wolves, with the lowest social status in the pack. Gray wolves encircling their prey and marching during the hunting process are modeled using the above relationships. The basic GWO includes the following three main processes.
  • To track and approach prey.
  • To harass, chase, and surround prey until the prey stops moving.
  • To attack the prey.
According to the division of the above levels, α wolves have absolute control over β , δ , and ω wolves; β wolves have absolute control over δ and ω wolves. δ wolves have absolute control over ω wolves. They play a key role in the main hunting process. Many ω wolves are usually in the best position to attack their prey. The whole optimization process can be divided into two stages:
The first stage is to surround the prey, which is modeled by the encircling mechanism in Equations (12) and (13).
$$ D = \left| C \times X_p(t) - X(t) \right| \quad (12) $$
$$ X(t+1) = X_p(t) - A \times D \quad (13) $$
where t is the current number of iterations and $X_p(t)$ is the position of the prey (equivalent to the α, β, δ, and ω wolves). $r_1$ and $r_2$ are random values in the range [0, 1]. The coefficient vectors A and C are computed by Equations (14) and (15).
$$ A = 2a \times r_1 - a \quad (14) $$
$$ C = 2 \times r_2 \quad (15) $$
where A and C are coefficient vectors and X represents the position vector of the wolf. A takes random values in the range [−2a, 2a]. In each iteration, $r_1$ and $r_2$ are randomly selected in the range of 0 to 1. The component a decreases linearly from 2 to 0 according to Equation (16).
$$ a = 2 - l \times \frac{2}{MaxIter} \quad (16) $$
where l is the current iteration and MaxIter is the max iterations.
In the search process, the position migration of the α , β , δ wolves is calculated according to Equations (17)–(19).
$$ D_\alpha = \left| C_1 \times X_\alpha - X \right| \quad (17) $$
$$ D_\beta = \left| C_2 \times X_\beta - X \right| \quad (18) $$
$$ D_\delta = \left| C_3 \times X_\delta - X \right| \quad (19) $$
The second stage: The α wolf dominates the whole process, while the β and δ wolves are also involved. When hunting, they lead ω wolves to update their position, and other wolves move randomly when looking for prey. Equations (20)–(24) measure the location of the prey and search around the prey until they finally find it. During this process, they always maintain a high level of coordination and cooperation to ensure the successful capture of the prey.
$$ D_\delta = \left| C_\delta \times X_\delta - X_i \right| \quad (20) $$
$$ X_1 = X_\alpha - A_\alpha \times D_\alpha \quad (21) $$
$$ X_2 = X_\beta - A_\beta \times D_\beta \quad (22) $$
$$ X_3 = X_\delta - A_\delta \times D_\delta \quad (23) $$
$$ X(t+1) = \frac{X_1 + X_2 + X_3}{3} \quad (24) $$
The final position is anywhere inside the circle. The primary goal of the ω wolf is to update their position according to α , β , and δ wolves.
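For illustration, the following Python sketch performs one GWO position update following Equations (14), (15), and (17)–(24); it assumes the population is stored as a NumPy array and the leader positions have already been selected, and the function name gwo_update is ours, not from the paper.

```python
import numpy as np

def gwo_update(positions, alpha, beta, delta, a):
    """One GWO position update.
    positions: (N, dim) wolf positions; alpha, beta, delta: (dim,) leader positions;
    a: scalar control parameter decreasing from 2 to 0 (Equation (16))."""
    new_positions = np.empty_like(positions)
    for i, X in enumerate(positions):
        candidates = []
        for leader in (alpha, beta, delta):
            r1, r2 = np.random.rand(X.size), np.random.rand(X.size)
            A = 2 * a * r1 - a                     # Equation (14)
            C = 2 * r2                             # Equation (15)
            D = np.abs(C * leader - X)             # Equations (17)-(20)
            candidates.append(leader - A * D)      # Equations (21)-(23)
        new_positions[i] = np.mean(candidates, axis=0)   # Equation (24)
    return new_positions
```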

3.2.2. Insufficiency of the Algorithm

The GWO differs from other optimization algorithms in that it draws inspiration from the social and predation behaviors of gray wolves and simulates the cooperation and competition mechanisms of gray wolf groups. It employs specific search strategies for information exchange and knowledge sharing through direct communication. In contrast, other optimization algorithms may have different sources of inspiration, search strategies, and communication methods. In addition, the parameter settings of the GWO are relatively simple and one only needs to determine the initial population size and the upper and lower limits of the search range. However, the GWO also has some shortcomings. First of all, it is sensitive to the selection of the initial solution, and the quality of the initial solution may affect the final optimization performance of the algorithm. Second, the convergence speed of GWO may be fast in some cases, but it may also be unstable, leading to a locally optimal solution. In addition, for high-dimensional problems, GWO faces challenges because the gray wolf behavior simulation may need to be adapted to searches in high-dimensional spaces. Although the GWO has unique characteristics and advantages, its shortcomings must also be considered. In practical applications, selecting a suitable optimization algorithm according to specific problems and requirements is necessary.

3.3. The Proposed Enhanced Gray Wolf Optimizer Algorithm (EGWO)

Although the GWO takes advantage of its parameters to strike a balance between exploration and exploitation, it still suffers from suboptimal solution stagnation and premature convergence, leading to slow convergence. In the GWO algorithm, the α, β, and δ wolves guide the ω wolves to attack the prey, where it is assumed that the α wolf is in the best position relative to the prey. During the search, the α, β, and δ wolves are selected from the population, while the remaining wolves are treated as ω wolves and are relocated to improve algorithm performance. However, this mechanism has some defects and quickly leads to premature convergence and local stagnation of the algorithm. This paper proposes an improved version of the GWO, named the enhanced gray wolf optimizer (EGWO), to solve the above problems.
First, parameter a is changed from a linear to a non-linear schedule to achieve a balance between exploration and exploitation. Parameter a is calculated by Equation (25).
$$ a = 2 \times e^{-\frac{2t}{maxiter}} \quad (25) $$
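For illustration, the following Python sketch contrasts the original linear schedule of a (Equation (16)) with the non-linear schedule of Equation (25); the negative exponent is an assumption made so that a decays from 2 toward 0 over the iterations, consistent with the GWO convention, and the function names are ours.

```python
import numpy as np

def a_linear(t, max_iter):
    # Equation (16): linear decrease of a from 2 to 0
    return 2 - t * (2 / max_iter)

def a_nonlinear(t, max_iter):
    # Equation (25): non-linear (exponential) decrease of a; the negative exponent
    # is assumed so that a still decays from 2 toward 0 as t grows
    return 2 * np.exp(-2 * t / max_iter)
```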

3.3.1. Chaotic Disturbance

Chaotic disturbance can effectively ensure the global diversity of the algorithm and enhance its exploration ability. An iterative chaotic map is added to the algorithm. Iterative mapping refers to approaching a certain target by repeatedly applying the mapping function until certain conditions are met or a certain convergence is achieved. By introducing iterative mapping, adaptability and optimization capability can be added to the algorithm to handle complex problems better. According to Figure 5, the chaotic value changes irregularly as the number of iterations increases. This change can effectively update the size and direction of the wolves' steps, avoiding falling into local optima during migration. The chaotic coefficient is calculated using Equations (26) and (27).
$$ k(t+1) = \sin\left( \frac{0.7 \times \pi}{k(t)} \right) \quad (26) $$
$$ G(t) = \frac{\left( k(t) + 1 \right) \times 100}{2} \quad (27) $$
where $k(t)$ is the chaotic parameter at the t-th iteration and $G(t)$ is the chaos mapping parameter.
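A minimal Python sketch of Equations (26) and (27) follows; writing the map as sin(0.7π/k(t)) follows the standard form of the iterative chaotic map and is an assumption, as is the initial value used in the short usage loop.

```python
import numpy as np

def iterative_map(k_t, b=0.7):
    # Equation (26): iterative chaotic map; values stay in [-1, 1]
    return np.sin(b * np.pi / k_t)

def chaos_parameter(k_t):
    # Equation (27): rescale the chaotic value from [-1, 1] to [0, 100]
    return (k_t + 1) * 100 / 2

# usage: iterate the map to generate an irregular perturbation sequence
k = 0.7  # assumed initial value
for t in range(5):
    k = iterative_map(k)
    print(t, chaos_parameter(k))
```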

3.3.2. Candidate Migration Mechanism

In the GWO algorithm, there are situations where the optimal solution is lost and the accuracy of the optimal solution is not guaranteed. A candidate mechanism, computed by Equations (28) and (29), is therefore introduced. In this mechanism, the center point of the three leading wolves is added to construct a candidate pool consisting of the three wolves and their center position. This mechanism ensures that the optimal solution is not lost while avoiding local stagnation during the update process.
$$ Cand = \frac{X_\alpha + X_\beta + X_\delta}{3} \quad (28) $$
$$ CandPool = \left[ X_\alpha, X_\beta, X_\delta, Cand \right] \quad (29) $$
where $Cand$ is the center position of the three wolves and $CandPool$ is the candidate pool containing the α, β, and δ wolves and the Cand wolf. The α, β, and δ wolves play a crucial role in the wolf pack migration. Equations (30)–(35) update the position migration.
$$ D_\alpha = \left| a \times X_\alpha - X_i \right| \times \frac{G(t) \times (rand - 0.5)}{10} \quad (30) $$
$$ X_1 = CandPool_{r,j} - A_1 \times D_\alpha \quad (31) $$
$$ D_\beta = \left| a \times X_\beta - X_i \right| \times \frac{G(t) \times (rand - 0.5)}{10} \quad (32) $$
$$ X_2 = CandPool_{r,j} - A_1 \times D_\beta \quad (33) $$
$$ D_\delta = \left| a \times X_\delta - X_i \right| \times \frac{G(t) \times (rand - 0.5)}{10} \quad (34) $$
$$ X_3 = CandPool_{r,j} - A_1 \times D_\delta \quad (35) $$
where $CandPool_{r,j}$ is a randomly selected wolf from the candidate pool and $G(t)$ is the mapped parameter based on the chaotic map. The direction parameter $G(t) \times (rand - 0.5)/10$ determines the update direction and step length.
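The following Python sketch illustrates the candidate migration mechanism of Equations (28)–(35) for a single wolf; the coefficient vector A1 is assumed to follow Equation (14), and the per-dimension randomness and function name are our assumptions for the sketch.

```python
import numpy as np

def candidate_migration(X_i, alpha, beta, delta, a, G_t):
    """Candidate-pool position update for one wolf (Equations (28)-(35))."""
    cand = (alpha + beta + delta) / 3                  # Equation (28): centre of the leaders
    pool = np.stack([alpha, beta, delta, cand])        # Equation (29): candidate pool
    step = G_t * (np.random.rand() - 0.5) / 10         # chaotic direction/step factor
    X_new = []
    for leader in (alpha, beta, delta):
        D = np.abs(a * leader - X_i) * step            # Equations (30), (32), (34)
        A1 = 2 * a * np.random.rand(X_i.size) - a      # coefficient vector as in Equation (14)
        member = pool[np.random.randint(len(pool))]    # random wolf from the candidate pool
        X_new.append(member - A1 * D)                  # Equations (31), (33), (35)
    return X_new                                       # X1, X2, X3
```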

3.3.3. Attacking Mechanism

As the iterations proceed, the attacking mechanism changes accordingly. Different attacking methods can enhance the diversity of the wolf pack, and the two attacking methods together can effectively improve the algorithm’s exploration ability. When the update phase is in the first half of the run (t < 1/2 × $maxiteration$), Equation (36) is selected as the attacking method; when the update phase is in the second half (t ≥ 1/2 × $maxiteration$), the attacking method is computed by Equation (37).
$$ X_{i,j} = W_1 \times X_1 + W_2 \times X_2 + W_3 \times X_3 \quad (36) $$
$$ X_{i,j} = \frac{X_1 + X_2 + X_3}{3} \quad (37) $$
where F is the total fitness of the three wolves given by Equation (38), $W_1$, $W_2$, and $W_3$ are the weights computed by Equation (39), and $X_{i,j}$ is the attacking position.
$$ F = F_\alpha + F_\beta + F_\delta \quad (38) $$
$$ W_1 = \frac{F_\alpha}{F}; \quad W_2 = \frac{F_\beta}{F}; \quad W_3 = \frac{F_\delta}{F} \quad (39) $$
where $F_\alpha$, $F_\beta$, and $F_\delta$ are the fitness values of the α, β, and δ wolves.
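As a sketch of Equations (36)–(39), the following Python function switches between the fitness-weighted combination and the simple average depending on the current iteration; the function name and argument layout are illustrative only.

```python
def attack_position(X1, X2, X3, F_alpha, F_beta, F_delta, t, max_iter):
    """Attacking mechanism: fitness-weighted combination in the first half of the
    run (Equations (36), (38), (39)), simple averaging in the second half (Equation (37))."""
    if t < max_iter / 2:
        F = F_alpha + F_beta + F_delta                        # Equation (38)
        W1, W2, W3 = F_alpha / F, F_beta / F, F_delta / F     # Equation (39)
        return W1 * X1 + W2 * X2 + W3 * X3                    # Equation (36)
    return (X1 + X2 + X3) / 3                                 # Equation (37)
```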
Based on the above mechanisms, we propose a new algorithm named the EGWO. In the original GWO algorithm, the randomly initialized pack is guided by the α wolf, which can converge toward the prey but also tends to fall into locally optimal solutions. By introducing the new mechanisms, the global and local search abilities of the wolves are both strengthened. To summarize, the advantages of the mechanisms can be expressed as follows:
Firstly, the non-linear change in parameter a can balance the exploration and exploitation capability.
Secondly, the chaotic disturbance mechanism can manage the step direction and length, which contributes to search diversity.
Thirdly, the candidate migration mechanism updates the position of three wolves to promote the search to jump out of local stagnation, ensure the accuracy of the global optimal solution, and strengthen the global convergence ability.
Fourth, introducing an attacking mechanism can effectively strengthen the exploration capability while ensuring a balance with the exploitation capability.

3.4. Computational Complexity of EGWO Algorithm

The computational complexity of the EGWO algorithm is described through two aspects: time complexity and space complexity. The above aspects are important factors in evaluating the performance of an algorithm.
(1) Time complexity
The number of search agents (N), the number of iterations (t), and the cost of function evaluation (c) are the important factors affecting the time complexity, and their effects must be considered together to obtain an accurate evaluation. As shown in Equations (40) and (41), the time complexity of the EGWO is equal to that of the GWO.
$$ O(GWO) = O(t \times N \times c) \quad (40) $$
$$ O(EGWO) = O(t \times N \times c) \quad (41) $$
(2) Space complexity
For space complexity, only the initial stage, i.e., the entire search space, is considered. Then, the space complexity of the EGWO is O ( n ) .

4. Combining EGWO with Multi-Layer Perceptron (EGWO-MLP)

4.1. EGWO-MLP Optimization Model

The training process is divided into four steps: preprocessing, learning, evaluation, and prediction. The first step is preprocessing: the data are preprocessed for better use in subsequent modeling and analysis and then divided into training data and testing data. The second step is the learning process, in which the optimization algorithm continuously optimizes the MLP to avoid local stagnation during modeling. The third step is evaluating the obtained model using evaluation criteria such as the MSE. The final step is to predict and report the final experimental results. The identification process can be seen in Figure 6.
The purpose of training a network via the proposed EGWO algorithm is to determine the weights and biases. The obtained weights and biases are then used to compute the network’s expected output value when different inputs are presented to the network. The EGWO-MLP identification model aims to identify plant disease types; its hidden layer structure and dynamic weight parameter adjustment make it more accurate in identifying the disease types. The EGWO-MLP identification model structure consists of input, hidden, and output layers. The process includes normalization processing, determination of the inputs, outputs, and hidden units, training parameter settings, network model creation, activation function calls, etc. The output is the identification result. If the output of the test samples satisfies the training samples’ expectation, the learning ends. The overall process is depicted in Figure 7 and can be summarized in five steps.
Step 1: Initialization stage of wolf pack. N wolves can be generated by the proposed EGWO algorithm.
Step 2: Weights and biases mapping phase. The solution (position) of the generated wolves via the EGWO algorithm is allocated as weights and biases.
Step 3: Update location phase. α , β , and δ wolves are computed by Equations (30)–(35).
Step 4: Evaluation phase. Evaluate the performance of EGWO algorithm in training the MLP network using MSE standards.
Step 5: Iterative update phase. The EGWO algorithm continuously updates the weight and bias terms of the MLP network until the termination condition is reached.

4.2. Encoding Mechanism

The weights and biases of the MLP are obtained from the best position (the α wolf) found by the proposed EGWO algorithm. This parameter (the α wolf) trains the MLP network through continuous updating and iterating. The position can be mapped as $\theta = [I_w, h_w, h_b, O_b]$, where $I_w$ denotes the weights of the input nodes and $h_w$ the weights of the hidden nodes; $h_b$ represents the biases of the hidden layer and $O_b$ the biases of the output layer. The encoding of a wolf as weights and biases is shown in Figure 8.
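As an illustration of this encoding, the following Python sketch splits one position vector into the parameters of a one-hidden-layer MLP; the ordering follows θ = [I_w, h_w, h_b, O_b], while the single-hidden-layer assumption and the helper name decode_position are ours.

```python
import numpy as np

def decode_position(theta, n_in, n_hidden, n_out):
    """Split one wolf position theta = [I_w, h_w, h_b, O_b] into the weight matrices
    and bias vectors of a one-hidden-layer MLP."""
    theta = np.asarray(theta)
    p = 0
    I_w = theta[p:p + n_in * n_hidden].reshape(n_in, n_hidden); p += n_in * n_hidden
    h_w = theta[p:p + n_hidden * n_out].reshape(n_hidden, n_out); p += n_hidden * n_out
    h_b = theta[p:p + n_hidden]; p += n_hidden
    O_b = theta[p:p + n_out]
    return I_w, h_w, h_b, O_b
```

The total dimension of each wolf is therefore n_in × n_hidden + n_hidden × n_out + n_hidden + n_out, which fixes the search-space size of the EGWO when training the MLP.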

4.3. Evaluation Criteria

The mean square error (MSE) is the loss function commonly used in regression tasks. The statistical parameter is the value of the sum of squared errors at the corresponding points of the predicted data and the original data. The difference between the actual and expected values is the criterion for evaluating the training algorithm. The MSE is widely used in tasks such as linear and multiple linear regression. In deep learning, the MSE is also used to measure the performance of neural networks in regression tasks and as a loss function for optimization. When using the MSE as a loss function for optimization, optimization algorithms such as gradient descent are usually used to minimize the value of the MSE and thus improve the model’s performance. The smaller the difference, the better the algorithm is trained and the closer the expected value is to the actual value. The MSE is a standard metric for evaluating MLPs; it is widely used and is calculated by Equation (42).
$$ MSE = \sum_{i=1}^{m} \left( o_i^k - d_i^k \right)^2 \quad (42) $$
where m is the number of outputs, and $d_i^k$ and $o_i^k$ are the expected and actual outputs for the i-th input using the k-th training sample. The MLP performance is evaluated by the average $\overline{MSE}$ over all s training samples so that it is effective on the entire training set. $\overline{MSE}$ is calculated by Equation (43).
$$ \overline{MSE} = \frac{\sum_{k=1}^{s} \sum_{i=1}^{m} \left( o_i^k - d_i^k \right)^2}{s} \quad (43) $$
Training an MLP model requires considering the effects of several variables and functions; in the EGWO algorithm, the objective function is defined by Equation (44). This involves the number of training samples s and requires consideration of several factors, including the network structure and parameter tuning. In order to obtain better training results, these factors need to be reasonably adjusted and optimized to minimize the $\overline{MSE}$ and improve the generalization ability of the model.
$$ \text{minimize:} \quad F(V) = \overline{MSE} \quad (44) $$
In addition to using the MSE to measure model performance, the classification accuracy can also be used. For categorical datasets, during model training, the samples are classified according to their categories and the accuracy is computed by Equation (45). The advantage of the accuracy metric is that it provides a more intuitive picture of the model’s classification ability and helps us better understand its performance and optimization direction. Therefore, when training and optimizing the model, both metrics need to be considered together for a more comprehensive evaluation.
$$ \text{Accuracy rate} = \frac{\text{Number of correctly classified objects}}{\text{Number of objects in the dataset}} \quad (45) $$
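A minimal Python sketch of the two evaluation criteria in Equations (42), (43), and (45) follows; the function names are illustrative, and the inputs are assumed to be arrays of shape (samples, outputs) for the MSE and 1-D label arrays for the accuracy.

```python
import numpy as np

def mean_mse(actual, expected):
    # Equations (42)-(43): squared error summed over the m outputs,
    # averaged over the s training samples
    actual, expected = np.asarray(actual), np.asarray(expected)
    return np.mean(np.sum((actual - expected) ** 2, axis=1))

def accuracy_rate(predicted_labels, true_labels):
    # Equation (45): fraction of correctly classified objects
    return np.mean(np.asarray(predicted_labels) == np.asarray(true_labels))
```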

4.4. Selection of Activation Function

This paper selects the Sigmoid function as the activation function for training the MLP. It has an exponential function shape, is closest to biological neurons in a physical sense, and is a standard S-type function in biology, also known as an S-type growth curve, shown in Figure 9. In addition, its output in (0, 1) can be expressed as a probability or used for normalization of the input. The Sigmoid function is commonly used in machine learning [21] and is the most widely used type of activation function; it is expressed in Equation (46).
$$ Sigmoid(z) = \frac{1}{1 + e^{-z}} \quad (46) $$
The reasons why the Sigmoid function is widely used are summarized as follows:
  • Its derivative reduces decay and dilution errors, signal problems, oscillation problems, and asymmetrical input problems. When this function is used to solve problems, it can be used for category classification and is suitable for prediction [78].
  • The segmented linear recursive approximation method calculates the Sigmoid function and its derivatives in artificial neurons. This method helps the neuron to estimate the Sigmoid function and its derivatives more accurately during the learning process so that the neuron can better process the input data and output the correct results [79].

5. Experimental Preparation

5.1. Experimental Setting

  • Experimental environment. The experiment codes are executed in Matlab R2015b under the Windows 10 operating system; all simulations were run on a computer with an Intel(R) Core(TM) i5-9300 CPU @ 2.40 GHz and 8 GB of memory. Thirty independent runs are performed for each case to assess the predictive performance. The population size is set to 30, and the maximum number of iterations is 500 for the IEEE CEC 2014 benchmark functions and 100 for the UCI datasets used to verify the EGWO and EGWO-MLP.
  • Data processing. In order to eliminate the dimensional impact between indicators, the data are standardized to achieve comparability between data indicators. After the original data are standardized, all indicators are on the same order of magnitude so that comprehensive processing can occur. The experiment in this paper scales the data to the range [0, 1] using Min-Max normalization, which is computed by Equation (47) (a minimal sketch is given after this list).
    $$ x' = \frac{x - x_{min}}{x_{max} - x_{min}} \quad (47) $$
    where $x_{max}$ and $x_{min}$ are the maximum and minimum of the current data values, respectively, and $x'$ is the standardized value.
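As referenced above, the following Python sketch applies Equation (47) column-wise to a small assumed dataset; a guard against constant columns (where the denominator would be zero) is omitted for brevity.

```python
import numpy as np

def min_max_normalize(X):
    # Equation (47): rescale each feature column to [0, 1]
    X = np.asarray(X, dtype=float)
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

# example: three samples with two features; each column is rescaled to [0, 1]
print(min_max_normalize([[2.0, 10.0], [4.0, 20.0], [6.0, 30.0]]))
```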

5.2. Comparison Algorithm Selection

The EGWO algorithm is compared with other MLP algorithms, and the algorithm’s ability for disease identification is verified. All parameters of the comparative algorithms are set as shown in Table 1.

5.2.1. GWO Variant

The variant algorithm refers to making improvements or modifications based on the original algorithm to achieve a better performance. By comparing different variant algorithms, we can obtain more choice and diversity to find the most suitable algorithm for a specific problem. Therefore, GWO variants are selected for performance comparison, and their performance can be evaluated more comprehensively by comparing different variant algorithms.
  • Improved gray wolf optimization (IGWO) introduces an adaptive weight mechanism, which can dynamically adjust an individual’s weight according to its fitness value so that individuals with a higher fitness have more significant influence [80]. Through this mechanism, the IGWO algorithm can more effectively explore the search space and speed up the convergence. It has many applications for solving complex optimization problems, parameter optimization, feature selection, etc.
  • Greedy non-hierarchical gray wolf optimizer (G-NHGWO) introduces a greedy strategy to increase the locality of the search [81]. In addition, the method also adopts a non-hierarchical optimization strategy, which avoids the use of fixed weight factors. G-NHGWO can search for the best solution more efficiently in practical problems and provide more accurate and reliable optimization results.
  • Weighted distance gray wolf optimizer (WdGWO) introduces the concept of a weighted distance, which measures how close an individual wolf is to the current global best solution [52]. The WdGWO algorithm exploits the notion of social hierarchy among gray wolves to guide the search for promising regions of the solution space.

5.2.2. Traditional Algorithms

Traditional optimization algorithms have been widely used in research and have good performance and effects in many problem areas. Traditional optimization algorithms have a high interpretability and reliability.
  • Particle swarm optimization (PSO) has a strong ability to explore the solution set space for non-convex optimization problems. It is relatively simple, and the calculation process is separated from the problem model. As a population-based meta-heuristic algorithm, PSO is applicable to distributed computing and can effectively improve the computing power. Its speed (update) mechanism, inertia, and other factors can be well optimized for parameter optimization in ANNs [38].
  • Differential evolution (DE) is a heuristic random search algorithm based on population differences. The differential evolution algorithm has the advantages of simple principles, few controlled parameters, and strong robustness [82].
  • The Bat Algorithm (BA) is an optimization algorithm for simulating bat swarm behavior, which has multiple advantages such as parallelism, an adaptive search strategy, diversity maintenance, a relatively simple implementation, powerful global search capability, and adaptability. Its adaptability can adjust the search strategy according to the characteristics of the problem and improve the robustness and global search ability. Randomness and exploration operations are introduced to maintain population diversity, avoid falling into local optimal solutions, and provide more comprehensive search space coverage [83].
  • The Tree-seed Algorithm (TSA) has a simpler structure, a higher search accuracy, and a stronger robustness than some traditional intelligent optimization algorithms [84].
  • The Sine-Cosine optimization algorithm (SCA) is a random optimization algorithm that is highly flexible, simple in principle, and easy to implement. It can be easily applied to optimization problems in different fields [85].
  • The Butterfly Optimization Algorithm (BOA) solves global optimization problems by mimicking the food searching and mating behavior of butterflies [86]. The design framework of the algorithm is mainly based on the foraging strategy of butterflies looking for nectar or mating partners, in which butterflies use their sense of smell to determine the target location. The BOA algorithm draws on this foraging behavior and combines the characteristics of optimization algorithms to provide an effective solution for complex global optimization problems.

5.2.3. Recent Algorithms

Over time, new algorithms emerge, often with more advanced techniques and better performance. Using recently emerged algorithms as benchmarks for comparative experiments can provide more accurate and unbiased evaluation criteria. By comparing with the latest algorithm, the advantages and disadvantages of the proposed algorithm can be evaluated more objectively.
  • The Spider Wasp Optimizer (SWO) is inspired by the hunting, nesting, and mating behavior of female spider wasps in nature [87]. Furthermore, it shows promising results in solving various optimization problems with different exploration and development requirements through various unique update strategies.
  • The Zebra Optimization Algorithm (ZOA) is a heuristic optimization algorithm that simulates the behavior of zebra swarms and is used to solve optimization problems [88]. The ZOA algorithm searches for the optimal solution in the solution space by imitating the foraging and migration strategies of the zebra population. The core idea of the ZOA is to regard the candidate solution of the problem as an individual zebra and search by simulating the foraging and migration behavior of zebra.
  • The Reptile Search Algorithm (RSA) is inspired by the hunting behavior of crocodiles [89]. The implementation of the algorithm includes two key steps: encirclement and hunting. This makes the RSA algorithm adaptable to different optimization problems and have better exploration and exploitation capabilities.
  • The Brown-bear Optimization Algorithm (BOA) is inspired by the communication patterns of pedal scent marking and sniffing behavior of brown bears, and utilizes the communication behavior characteristics of brown bears and provides an effective solution by simulating their strategies in finding food and marking territory [90].
  • The Osprey Optimization Algorithm (OOA) mimics the behavior of ospreys in nature and is mainly inspired by an osprey’s strategy when fishing at sea [91]. Ospreys detect the location of their prey, hunt it down, and bring it to a suitable place to enjoy it. OOA algorithms can efficiently solve various optimization problems and balance exploration and exploitation during the search process.
  • The Cheetah Optimizer (CO) is proposed by simulating cheetahs’ hunting behavior and related strategies. The cheetah optimizer can effectively solve various optimization problems and adapt to complex environments [92].

5.3. Standard Test Set

5.3.1. IEEE CEC 2014 Benchmark Functions

The IEEE CEC 2014 benchmark functions are a benchmark test set for evaluating the performance of optimization algorithms [93] provided by the 2014 IEEE Congress on Evolutionary Computation (CEC) competition. The set contains a total of 30 standard continuous optimization problems, shown in Table 2, Table 3, Table 4 and Table 5, covering many different types of functions. The IEEE CEC 2014 benchmark functions are designed to evaluate optimization algorithms’ global search abilities, convergence speeds, accuracies, and robustness. They are used to compare the performance of different algorithms and to guide their improvement and optimization. The IEEE CEC 2014 benchmark functions have become one of the standard benchmarks for evaluating the performance of optimization algorithms, providing a fair and reliable platform for researchers to compare and verify the capabilities and effects of optimization algorithms.

5.3.2. University of California, Irvine Dataset (UCI Dataset)

• XOR dataset
The XOR dataset is a classic binary classification problem widely used in machine learning and neural networks [21]. This dataset contains three input features, eight training and testing samples, and one output. The simplicity and nonlinear separability of the XOR dataset make it a standard benchmark for evaluating and validating classification algorithms. It can help researchers and practitioners test and compare the classification capabilities of different algorithms, especially models such as deep learning and neural networks, when dealing with nonlinear problems.
• Balloon dataset
The Balloon dataset is a small dataset for classification problems and has many applications in machine learning and data mining. This dataset contains four features, 16 training and testing samples, and covers two categories [21]. Using the Balloon dataset, researchers can build and test the performance of various classification algorithms, including decision trees, support vector machines, neural networks, and more. The small size of this dataset makes it useful for quickly validating algorithms and debugging classification models. At the same time, the Balloon dataset can also help beginners understand and master fundamental classification problems and algorithms.
• Tic-Tac-Toe dataset
The Tic-Tac-Toe dataset is a commonly used classification dataset for solving the classification problems of the game of tic-tac-toe [21]. It contains nine features and two categories, where the number of training samples is 637 and the number of testing samples is 300. Using the Tic-Tac-Toe dataset, researchers can build and evaluate the performance of various classification algorithms, such as decision trees, logistic regression, support vector machines, etc. This dataset is moderate in size and can be used to quickly verify the accuracy and reliability of the algorithm and provide players with a reference for the next best move position. At the same time, the Tic-Tac-Toe dataset can also be used for teaching purposes to help beginners understand and master the basic concepts and methods of classification problems.
• Heart dataset
The Heart dataset is a commonly used medical dataset for predicting whether a patient has heart disease [21]. This dataset contains 22 features and two categories; the number of training samples is 80 and the number of test samples is 187. Using the Heart dataset, researchers can better understand and predict the occurrence and risk factors of heart disease and provide clinicians with a basis for auxiliary decision making. In addition, the Heart dataset can be used for teaching purposes to help students learn the basic concepts and techniques of classification on medical data.
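To make the XOR task above concrete, the sketch below builds the three-input XOR dataset (eight samples, label equal to the parity of the three bits) and fits a small scikit-learn MLP to it; the network size and training settings are illustrative assumptions, since the paper trains its MLPs with metaheuristics rather than gradient descent.

```python
import numpy as np
from itertools import product
from sklearn.neural_network import MLPClassifier

# Three-input XOR: all 8 binary input combinations, labeled by the parity (XOR) of the bits.
X = np.array(list(product([0, 1], repeat=3)), dtype=float)
y = X.astype(int).sum(axis=1) % 2

# Illustrative gradient-trained MLP with one hidden layer of 2*3+1 = 7 nodes.
clf = MLPClassifier(hidden_layer_sizes=(7,), max_iter=5000, random_state=0)
clf.fit(X, y)
print("Classification rate: %.1f%%" % (100 * clf.score(X, y)))
```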

6. Analysis and Discussion of Experimental Results

6.1. Analysis and Discussion of Results on IEEE CEC 2014 Benchmark Functions

• Comparison of EGWO with GWO variants
In order to highlight the advantages of the EGWO optimization algorithm, this part selects variants of the GWO algorithm, namely GNHGWO, IGWO, and WdGWO, for comparison. The experimental results in Table 6 show that EGWO and GWO obtain the same average ranking and total ranking and still hold a clear advantage over the other three algorithms. Although EGWO and GWO share the same ranking, Figure 10 shows that EGWO converges faster than the GWO algorithm. Meanwhile, the Friedman ANOVA test and the Wilcoxon rank sum test were applied, and the results in Table 7 demonstrate that EGWO is significantly different from the GNHGWO, IGWO, and WdGWO variants.
• Comparison of the EGWO with traditional algorithms
To examine the applicability and interpretability of the EGWO algorithm, this part selects traditional and popular algorithms for comparison, namely the GA, BOA, SCA, TSA, and JAYA algorithms. The results in Table 8 show that EGWO achieves the first average ranking and overall ranking. To ensure the reliability of the experimental results, the Friedman ANOVA test and the Wilcoxon rank sum test in Table 9 verify that EGWO can be clearly distinguished from the other algorithms. In addition, the convergence curves of EGWO and the GA, BOA, SCA, TSA, and JAYA algorithms, shown in Figure 11, indicate that the EGWO algorithm converges quickly and avoids local stagnation.
• Comparison of the EGWO with recent algorithms
To confirm the novelty and performance superiority of the proposed EGWO algorithm, algorithms proposed in the past three years were selected for comparison, namely the ZOA, RSA, SWO, BOA, CO, and OOA algorithms. Table 10 shows that the EGWO algorithm achieves the best experimental results and ranks first. The Friedman ANOVA test and Wilcoxon rank sum test in Table 11 also show that the EGWO algorithm is superior to the six recently proposed algorithms. Figure 12 shows that EGWO converges fastest and smoothly approaches the global optimal solution.
Across these three groups of experiments, comparing EGWO with GWO variants, traditional and popular algorithms, and algorithms proposed in recent years, the results show that the proposed algorithm attains the best global optimal solutions, and the statistical tests verify its performance and efficiency (a brief sketch of these tests is given below). The introduced mechanisms prevent the loss of the optimal solution when the wolves migrate and at the same time help the algorithm escape local optima. The chaotic perturbation mechanism preserves search diversity during the migration of the three leading wolves and improves exploration, while the candidate solution mechanism maintains exploitation, so that exploration and exploitation remain balanced. Acting together, these mechanisms strengthen the EGWO algorithm's ability to find the global optimal solution on unimodal, multimodal, and composite function problems.
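For reference, the following sketch shows how the Wilcoxon rank sum test and the Friedman test used in Tables 7, 9, and 11 can be computed with SciPy; the fitness arrays are random placeholders, not the values reported in the paper.

```python
import numpy as np
from scipy.stats import ranksums, friedmanchisquare

rng = np.random.default_rng(1)
# Placeholder mean best-fitness values over the 30 benchmark functions for three algorithms.
egwo = rng.normal(1.0, 0.1, 30)
gwo = rng.normal(1.1, 0.1, 30)
igwo = rng.normal(1.3, 0.1, 30)

# Wilcoxon rank sum test: pairwise comparison of EGWO against one competitor.
stat, p = ranksums(egwo, igwo)
print("Wilcoxon rank sum p-value:", p, "| significant at alpha = 0.05:", p < 0.05)

# Friedman test: ranking-based comparison across all algorithms simultaneously.
chi_sq, p_friedman = friedmanchisquare(egwo, gwo, igwo)
print("Friedman chi-square:", chi_sq, "| p-value:", p_friedman)
```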

6.2. Analysis and Discussion of Results on UCI Dataset

The Tic-Tac-Toe dataset includes nine input features and two categories. According to the results in Table 12, the EGWO-MLP algorithm achieves the highest classification accuracy on this dataset and ranks first in terms of the mean square error (MSE) and standard deviation (Std.). The Heart dataset includes 22 features and two categories; on this dataset GWO-MLP performs best, followed by EGWO-MLP, which still outperforms the remaining comparison algorithms. The XOR dataset contains three input features and one output; the MSE and Rate values in Table 12 show that the EGWO algorithm achieves the highest classification accuracy and the smallest MSE on it. On the Balloon dataset, which contains four input features and two categories, Table 12 shows that the classification rate of the EGWO algorithm is similar to those of GWO and DE and higher than those of the other algorithms, and EGWO also behaves stably in terms of the average MSE and Std. Training on these four datasets shows that the EGWO algorithm has clear advantages in training multi-layer perceptrons and can find the global optimal solution more stably. The introduced mechanisms help EGWO avoid local optima and enhance its exploration ability while preserving its exploitation ability.
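For clarity, the sketch below shows one way the two reported metrics, the classification rate and the MSE, can be computed from a trained network's outputs; the arrays are placeholders and the exact error definition used in the paper may differ slightly.

```python
import numpy as np

def classification_rate(y_true, y_pred):
    """Percentage of correctly classified samples."""
    return 100.0 * np.mean(np.asarray(y_true) == np.asarray(y_pred))

def mse(y_true, y_out):
    """Mean squared error between target labels and network outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_out = np.asarray(y_out, dtype=float)
    return float(np.mean((y_true - y_out) ** 2))

# Placeholder targets, hard predictions, and raw network outputs.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0]
y_out = [0.9, 0.2, 0.8, 0.4, 0.1]
print("Rate: %.1f%%, MSE: %.3f" % (classification_rate(y_true, y_pred), mse(y_true, y_out)))
```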

6.3. Advantages and Disadvantages

Through the two different kinds of tests described above, the exploration and exploitation capabilities of the EGWO algorithm were verified, and its advantages can be summarized as follows:
  • The EGWO algorithm can quickly find the global optimal solution for simple unimodal function problems while ensuring the accuracy of that solution.
  • The EGWO-MLP model has apparent advantages in solving multi-classification problems, such as fast convergence and strong stability, ensuring a high classification rate.
Although EGWO has the above advantages, it still has certain shortcomings:
  • On composite function problems, local stagnation can occur when searching for the global optimal solution.
  • The performance of the EGWO-MLP model on single classification problems is not particularly strong.

7. EGWO-MLP Identification Model

This study investigates the optimization performance of the EGWO algorithm for agricultural disease identification. EGWO trains the MLP by continuously updating and optimizing its weights and biases to build the disease identification model. In this way, the performance of the MLP is improved, the classification rate increases, and the error rate decreases.

7.1. Soybean (Large) Dataset

This paper uses the soybean (Large) dataset from the University of California, Irvine (UCI) Machine Learning Repository, a commonly used standard test database. The soybean (Large) dataset currently contains 599 instances, and this number is still increasing. The dataset, with the 35 attributes shown in Table 13, was used to judge the crop's disease type and to test the model's accuracy; the data source is given in the footer of Table 13. Furthermore, the data were divided into training and test sets at a ratio of 70% to 30%.
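A minimal sketch of this preprocessing step is given below, assuming a local CSV copy of the UCI soybean (Large) data whose first column is the class label (the file name and exact layout are assumptions) and assuming the 70% share is used for training:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder

# Assumed local copy of the UCI soybean (Large) data; attribute meanings follow Table 13.
df = pd.read_csv("soybean-large.csv", header=None)
y = df.iloc[:, 0]                       # disease class label (19 categories)
X = df.iloc[:, 1:].astype(str)          # 35 categorical attributes, kept as strings

# Encode the categorical attributes as integers so they can feed a multi-layer perceptron.
X_enc = OrdinalEncoder().fit_transform(X)

# 70%/30% train/test split, as described above.
X_train, X_test, y_train, y_test = train_test_split(X_enc, y, test_size=0.3, random_state=0)
print(X_train.shape, X_test.shape)
```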

7.2. Identification Model (EGWO-MLP) Parameter Setting

The EGWO-MLP identification model consists of an input layer, a hidden layer, and an output layer. The input layer of EGWO-MLP takes the 35 attributes of the UCI soybean (Large) dataset, including the date and germination, as input nodes. The number of hidden nodes is set to (2 × number of inputs + 1); with the 35 soybean influencing factors as inputs, the model therefore has one hidden layer with 71 hidden nodes. Finally, 19 nodes are used as the output of the model.
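Under these settings, each EGWO search agent is a flat vector containing all MLP weights and biases. The sketch below illustrates this encoding and a possible MSE fitness function; the layer sizes follow the description above, while the sigmoid forward pass and one-hot targets are assumptions about the usual MLP-training formulation rather than the authors' exact implementation.

```python
import numpy as np

n_in, n_hidden, n_out = 35, 2 * 35 + 1, 19    # 35 inputs, 71 hidden nodes, 19 outputs
dim = (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)
print("Search-space dimension per wolf:", dim)  # 3924 weights and biases

def decode(agent):
    """Split a flat EGWO position vector into the MLP weight matrices and bias vectors."""
    i = 0
    W1 = agent[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = agent[i:i + n_hidden]; i += n_hidden
    W2 = agent[i:i + n_hidden * n_out].reshape(n_hidden, n_out); i += n_hidden * n_out
    b2 = agent[i:i + n_out]
    return W1, b1, W2, b2

def fitness(agent, X, Y_onehot):
    """MSE between the MLP outputs and one-hot disease targets (lower is better)."""
    W1, b1, W2, b2 = decode(agent)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    H = sigmoid(X @ W1 + b1)        # hidden-layer activations
    O = sigmoid(H @ W2 + b2)        # output-layer activations
    return float(np.mean((O - Y_onehot) ** 2))
```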

7.3. Experimental Analysis

To verify the accuracy of the proposed EGWO-MLP model in identifying soybean diseases, several comparison models were built, namely PSOGWO-MLP, DE-MLP, TSA-MLP, PSO-MLP, BA-MLP, GA-MLP, and SCA-MLP. The final classification rate and error rate are the criteria used to evaluate the performance of the EGWO-MLP model.
Table 14 shows that the final classification rate of EGWO-MLP is the highest, so the EGWO-MLP model identifies diseases more accurately than the other methods; this demonstrates that the EGWO algorithm has a strong exploration ability and is able to escape local optima. Regarding the average mean square error, the value obtained by EGWO-MLP ranks first, which shows that the EGWO algorithm has a robust search ability. The Std. value of EGWO is also the smallest, so the EGWO-MLP model is the most stable and can be effectively applied to the disease identification problem. In conclusion, the EGWO algorithm enhances the classification rate and performance of the soybean disease model: during the training of the multi-layer perceptron, EGWO maintains a strong exploration ability, avoids local stagnation, and effectively updates the weights and biases of the MLP to improve the classification rate.
As shown in Figure 13, the EGWO-MLP model has clear advantages in disease identification and is superior to PSOGWO-MLP, DE-MLP, TSA-MLP, PSO-MLP, BA-MLP, GA-MLP, and SCA-MLP. The EGWO-MLP model obtains a high classification rate and strong robustness, and its error rate shows that EGWO has a robust global search ability, effectively balances exploration and exploitation, avoids local stagnation, and improves the convergence speed.

8. Conclusions

It is difficult for existing models to accurately predict actual crop diseases. Therefore, this paper proposes a disease recognition model based on a metaheuristic algorithm and an MLP. GWO is a swarm intelligence optimization algorithm widely used in many fields, but some shortcomings remain. We proposed the EGWO algorithm based on chaotic disturbance, candidate migration, and attacking mechanisms. In addition, EGWO is used to optimize the weights and biases of the MLP to construct the EGWO-MLP model.
To test the effectiveness of the proposed EGWO algorithm, it was compared on the IEEE CEC 2014 benchmark functions with variants of the GWO algorithm, with traditional and popular algorithms, and with algorithms proposed in recent years. The experimental results show that the EGWO algorithm has clear advantages and good stability. In addition, the proposed EGWO-MLP model was verified on four standard classification datasets (the XOR, Balloon, Heart, and Tic-Tac-Toe datasets). The experimental results show that the performance of EGWO-MLP is better than that of the other algorithms, including GWO, DE, TSA, PSO, BA, GA, and SCA. Two statistical tests, the Wilcoxon rank sum test and the Friedman ANOVA test, confirmed at the 95% and 90% confidence levels that the EGWO algorithm has apparent advantages over the other algorithms.
Meanwhile, the soybean (Large) dataset was selected to verify the effectiveness of disease identification. Compared with PSOGWO-MLP, DE-MLP, TSA-MLP, PSO-MLP, BA-MLP, GA-MLP, and SCA-MLP, EGWO-MLP achieves higher accuracy and can effectively support the management and control of crop diseases when they occur.
However, some limitations affect the prediction accuracy, such as the limited number of instances and attributes in the referenced UCI data. In future work, a convolutional neural network (CNN) could be chosen to predict crop diseases based on such datasets. At the same time, more effective swarm intelligence techniques can be selected, and the neural network structure and parameters can be further optimized and adjusted to improve the recognition accuracy.

Author Contributions

Methodology, Q.T. and J.J.; Software, Q.T. and X.M.; Validation, Q.T.; Writing—review & editing, Q.T. and X.M.; Supervision, C.B., H.C., H.W. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ratnadass, A.; Fernandes, P.; Avelino, J.; Habib, R. Plant species diversity for sustainable management of crop pests and diseases in agroecosystems: A review. Agron. Sustain. Dev. 2012, 32, 142–149. [Google Scholar] [CrossRef] [Green Version]
  2. Donatelli, M.; Magarey, R.D.; Bregaglio, S.; Willocquet, L.; Whish, J.P.; Savary, S. Modelling the impacts of pests and diseases on agricultural systems. Agric. Syst. 2017, 155, 213–224. [Google Scholar] [CrossRef]
  3. Wrather, J.; Anderson, T.; Arsyad, D.; Tan, Y.; Ploper, L.D.; Porta-Puglia, A.; Ram, H.; Yorinori, J. Soybean disease loss estimates for the top ten soybean-producing countries in 1998. Can. J. Plant Pathol. 2001, 23, 115–121. [Google Scholar] [CrossRef]
  4. Qin, W.; Xue, X.; Zhang, S.; Gu, W.; Wang, B. Droplet deposition and efficiency of fungicides sprayed with small UAV against wheat powdery mildew. Int. J. Agric. Biol. Eng. 2018, 11, 27–32. [Google Scholar] [CrossRef] [Green Version]
  5. Ficke, A.; Cowger, C.; Bergstrom, G.; Brodal, G. Understanding yield loss and pathogen biology to improve disease management: Septoria nodorum blotch—A case study in wheat. Plant Dis. 2018, 102, 696–707. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Kulkarni, O. Crop disease detection using deep learning. In Proceedings of the 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA 2018), Pune, India, 16–18 August 2018. [Google Scholar]
  7. Park, H.; JeeSook, E.; Kim, S.-H. Crops disease diagnosing using image-based deep learning mechanism. In Proceedings of the 2018 International Conference on Computing and Network Communications (CoCoNet 2018), Astana, Kazakhstan, 15–17 August 2018. [Google Scholar]
  8. Xiong, Y.; Liang, L.; Wang, L.; She, J.; Wu, M. Identification of cash crop diseases using automatic image segmentation algorithm and deep learning with expanded dataset. Comput. Electron. Agric. 2020, 177, 105712. [Google Scholar] [CrossRef]
  9. Devi, N.; Sarma, K.K.; Laskar, S. Design of an intelligent bean cultivation approach using computer vision, IoT and spatio-temporal deep learning structures. Ecol. Inform. 2023, 75, 102044. [Google Scholar] [CrossRef]
  10. Barbedo, J.G.A. Plant disease identification from individual lesions and spots using deep learning. Biosyst. Eng. 2018, 180, 96–107. [Google Scholar] [CrossRef]
  11. Chen, J.; Chen, J.; Zhang, D.; Sun, Y.; Nanehkaran, Y.A. Using deep transfer learning for image-based plant disease identification. Comput. Electron. Agric. 2020, 173, 105393. [Google Scholar] [CrossRef]
  12. Goncalves, J.P.; Pinto, F.A.; Queiroz, D.M.; Villar, F.M.; Barbedo, J.G.; Del Ponte, E.M. Deep learning architectures for semantic segmentation and automatic estimation of severity of foliar symptoms caused by diseases or pests. Biosyst. Eng. 2021, 210, 129–142. [Google Scholar] [CrossRef]
  13. Keyvanpour, M.R.; Shirzad, M.B. Machine learning techniques for agricultural image recognition. In Application of Machine Learning in Agriculture; Elsevier: Amsterdam, The Netherlands, 2022; pp. 283–305. [Google Scholar]
  14. Camero, A.; Toutouh, J.; Alba, E. Random error sampling-based recurrent neural network architecture optimization. Eng. Appl. Artif. Intell. 2020, 96, 103946. [Google Scholar] [CrossRef]
  15. Maurya, L.S.; Hussain, M.S.; Singh, S. Machine learning classification models for student placement prediction based on skills. Int. J. Artif. Intell. Soft Comput. 2022, 7, 194–207. [Google Scholar] [CrossRef]
  16. Wang, G.; Sim, K.C. Sequential classification criteria for NNs in automatic speech recognition. In Proceedings of the Twelfth Annual Conference of the International Speech Communication Association, Florence, Italy, 27–31 August 2021. [Google Scholar]
  17. Baser, P.; Saini, J.R.; Kotecha, K. Tomconv: An improved cnn model for diagnosis of diseases in tomato plant leaves. Procedia Comput. Sci. 2023, 218, 1825–1833. [Google Scholar] [CrossRef]
  18. Hush, D.R.; Horne, B.G. Progress in supervised neural networks. IEEE Signal Process. Mag. 1993, 10, 8–39. [Google Scholar] [CrossRef]
  19. Monti, F.; Boscaini, D.; Masci, J.; Rodola, E.; Svoboda, J.; Bronstein, M.M. Geometric deep learning on graphs and manifolds using mixture model CNNs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 22–25 July 2017; pp. 5115–5124. [Google Scholar]
  20. Srivastava, A.; Singh, A.; Tiwari, A.K. An efficient hybrid approach for the prediction of epilepsy using CNN with LSTM. Int. J. Artif. Intell. Soft Comput. 2022, 7, 179–193. [Google Scholar] [CrossRef]
  21. Mirjalili, S. How effective is the grey wolf optimizer in training Multi-Layer Perceptrons. Appl. Intell. 2015, 43, 150–161. [Google Scholar] [CrossRef]
  22. Zhang, C.; Pan, X.; Li, H.; Gardiner, A.; Sargent, I.; Hare, J.; Atkinson, P.M. A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification. ISPRS J. Photogramm. Remote Sens. 2018, 140, 133–144. [Google Scholar] [CrossRef] [Green Version]
  23. Das, H.; Jena, A.K.; Nayak, J.; Naik, B.; Behera, H. A novel PSO based Back Propagation learning-MLP (PSO-BP-MLP) for Classification. In Computational Intelligence in Data Mining-Volume 2, Proceedings of the International Conference on CIDM, 20–21 December 2014; Springer: Berlin/Heidelberg, Germany, 2015; pp. 461–471. [Google Scholar]
  24. Singh, K.J.; De, T. MLP-GA based algorithm to detect application layer DDoS attack. J. Inf. Secur. Appl. 2017, 36, 145–153. [Google Scholar] [CrossRef]
  25. Sheikhan, M.; Mohammadi, N. Neural-based electricity load forecasting using hybrid of GA and ACO for feature selection. Neural Comput. Appl. 2012, 21, 1961–1970. [Google Scholar] [CrossRef]
  26. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Let a biogeography-based optimizer train your Multi-Layer Perceptron. Inf. Sci. 2014, 269, 188–209. [Google Scholar] [CrossRef]
  27. Emary, E.; Zawbaa, H.M.; Grosan, C.; Hassenian, A.E. Feature subset selection approach by Gray-Wolf Optimization. In Proceedings of the Afro-European Conference for Industrial Advancement, Ababa, Ethiopia, 17–19 November 2015; pp. 1–13. [Google Scholar]
  28. Meng, X.; Jiang, J.; Wang, H. AGWO:Advanced GWO in multi-layer perception optimization. Expert Syst. Appl. 2021, 173, 114676. [Google Scholar] [CrossRef]
  29. Mittal, N.; Singh, U.; Sohi, B.S. Modified grey wolf optimizer for global engineering optimization. Appl. Comput. Intell. Soft Comput. 2016, 2016, 7950348. [Google Scholar] [CrossRef] [Green Version]
  30. Zhu, A.; Xu, C.; Li, Z.; Wu, J.; Liu, Z. Hybridizing grey wolf optimization with differential evolution for global optimization and test scheduling for 3d stacked SoC. J. Syst. Eng. Electron. 2015, 26, 317–328. [Google Scholar] [CrossRef]
  31. Kamboj, V.K. A novel hybrid PSO–GWO approach for unit commitment problem. Neural Comput. Appl. 2016, 27, 1643–1655. [Google Scholar] [CrossRef]
  32. Gómez, D.; Rojas, A. An empirical overview of the No Free Lunch Theorem and its effect on real-world machine learning classification. Neural Comput. 2016, 28, 216–228. [Google Scholar] [CrossRef] [Green Version]
  33. Shadkam, E.; Bijari, M. A novel improved cuckoo optimisation algorithm for engineering optimisation. Int. J. Artif. Intell. Soft Comput. 2020, 7, 164–177. [Google Scholar] [CrossRef]
  34. Mostafa Bozorgi, S.; Yazdani, S. IWOA: An improved whale optimization algorithm for optimization problems. J. Comput. Des. Eng. 2019, 6, 243–259. [Google Scholar]
  35. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S.; Faris, H. MTDE: An effective multi-trial vector-based differential evolution algorithm and its applications for engineering design problems. Appl. Soft Comput. 2020, 97, 106761. [Google Scholar] [CrossRef]
  36. Nadimi-Shahraki, M.H.; Taghian, S.; Zamani, H.; Mirjalili, S.; Elaziz, M.A. MMKE: Multi-trial vector-based monkey king evolution algorithm and its applications for engineering optimization problems. PLoS ONE 2023, 18, e0280006. [Google Scholar] [CrossRef] [PubMed]
  37. Mullen, R.J.; Monekosso, D.; Barman, S.; Remagnino, P. A review of ant algorithms. Expert Syst. Appl. 2009, 36, 9608–9617. [Google Scholar] [CrossRef]
  38. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
  39. Pourpanah, F.; Wang, R.; Lim, C.P.; Wang, X.Z.; Yazdani, D. A review of artificial fish swarm algorithms: Recent advances and applications. Artif. Intell. Rev. 2023, 56, 1867–1903. [Google Scholar] [CrossRef]
  40. Nadimi-Shahraki, M.H.; Moeini, E.; Taghian, S.; Mirjalili, S. DMFO-CD: A discrete moth-flame optimization algorithm for community detection. Algorithms 2021, 14, 314. [Google Scholar] [CrossRef]
  41. Rutenbar, R.A. Simulated annealing algorithms: An overview. IEEE Circuits Devices Mag. 2023, 5, 1867–1903. [Google Scholar] [CrossRef]
  42. Rashedi, E.; Nezamabadi-Pour, H.; Saryazdi, S. GSA: A gravitational search algorithm. Inf. Sci. 2009, 179, 2232–2248. [Google Scholar] [CrossRef]
  43. Li, L.L.; Chang, Y.B.; Tseng, M.L.; Liu, J.Q.; Lim, M.K. Wind power prediction using a novel model on wavelet decomposition-support vector machines-improved atomic search algorithm. J. Clean. Prod. 2020, 270, 121817. [Google Scholar] [CrossRef]
  44. Erol, O.K.; Eksin, I. A new optimization method: Big bang–big crunch. Adv. Eng. Softw. 2006, 37, 106–111. [Google Scholar] [CrossRef]
  45. Shaheen, A.M.; Ginidi, A.R.; El-Sehiemy, R.A.; Ghoneim, S.S. A forensic-based investigation algorithm for parameter extraction of solar cell models. IEEE Access 2020, 9, 1–20. [Google Scholar] [CrossRef]
  46. Askari, Q.; Younas, I.; Saeed, M. Political Optimizer: A novel socio-inspired meta-heuristic for global optimization. Knowl.-Based Syst. 2020, 159, 105709. [Google Scholar] [CrossRef]
  47. Askari, Q.; Saeed, M.; Younas, I. Heap-based optimizer inspired by corporate rank hierarchy for global optimization. Expert Syst. Appl. 2020, 161, 113702. [Google Scholar] [CrossRef]
  48. Jiang, J.; Zhao, Z.; Liu, Y.; Li, W.; Wang, H. Dsgwo: An improved grey wolf optimizer with diversity enhanced strategy based on group-stage competition and balance mechanisms. Knowl.-Based Syst. 2022, 250, 109100. [Google Scholar] [CrossRef]
  49. Duan, Y.; Yu, X. A collaboration-based hybrid GWO-SCA optimizer for engineering optimization problems. Expert Syst. Appl. 2023, 213, 119017. [Google Scholar] [CrossRef]
  50. Singh, N.; Singh, S. A novel hybrid GWO-SCA approach for optimization problems. Eng. Sci. Technol. 2017, 20, 1586–1601. [Google Scholar] [CrossRef]
  51. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S.; Zamani, H.; Bahreininejad, A. GGWO: Gaze cues learning-based grey wolf optimizer and its applications for solving engineering problems. J. Comput. Sci. 2022, 61, 101636. [Google Scholar] [CrossRef]
  52. Malik, M.R.S.; Mohideen, E.R.; Ali, L. Weighted distance grey wolf optimizer for global optimization problems. In Proceedings of the 2015 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC 2015), Madurai, India, 10–12 December 2015. [Google Scholar]
  53. Long, W.; Jiao, J.; Liang, X.; Tang, M. An exploration-enhanced grey wolf optimizer to solve high-dimensional numerical optimization. Eng. Appl. Artif. Intell. 2018, 68, 63–80. [Google Scholar] [CrossRef]
  54. Kannan, K.; Yamini, B.; Fernandez, F.M.H.; Priyadarsini, P.U. A novel method for spectrum sensing in cognitive radio networks using fractional GWO-CS optimization. Ad Hoc Netw. 2023, 103135. [Google Scholar] [CrossRef]
  55. Wang, F.; Zhao, S.; Wang, L.; Zhou, Y.; Huang, T.; Shu, X. Study on FOG scale factor error calibration in start-up stage based on GWO-GRU. Measurement 2023, 206, 112214. [Google Scholar] [CrossRef]
  56. Lim, S.-J. Hybrid image embedding technique using Steganographic Signcryption and IWT-GWO methods. Microprocess. Microsyst. 2022, 95, 104688. [Google Scholar]
  57. Ocran, D.; Ikiensikimama, S.S.; Broni-Bediako, E. A compositional function hybridization of PSO and GWO for solving well placement optimization problem. Pet. Res. 2022, 7, 401–408. [Google Scholar]
  58. Yu, X.; Jiang, N.; Wang, X.; Li, M. A hybrid algorithm based on grey wolf optimizer and differential evolution for UAV path planning. Expert Syst. Appl. 2014, 215, 119327. [Google Scholar] [CrossRef]
  59. Pan, H.; Chen, S.; Xiong, H. A high-dimensional feature selection method based on modified Gray Wolf optimization. Appl. Soft Comput. 2023, 135, 110031. [Google Scholar] [CrossRef]
  60. Almomani, O. A feature selection model for network intrusion detection system based on pso, gwo, ffa and ga algorithms. Symmetry 2020, 12, 1046. [Google Scholar] [CrossRef]
  61. Dhal, P.; Azad, C. A multi-objective feature selection method using Newton’s law based PSO with GWO. Appl. Soft Comput. 2021, 107, 107394. [Google Scholar] [CrossRef]
  62. Tu, Q.; Chen, X.; Liu, X. Multi-strategy ensemble grey wolf optimizer and its application to feature selection. Appl. Soft Comput. 2019, 76, 16–30. [Google Scholar] [CrossRef]
  63. Al-Wajih, A.R.; Abdulkadir, S.J.; Aziz, N.; Al-Tashi, Q.; Talpur, N. Hybrid binary grey wolf with Harris hawks optimizer for feature selection. IEEE Access 2021, 9, 31662–31677. [Google Scholar] [CrossRef]
  64. Abdollahzadeh, B.; Gharehchopogh, F.S. A multi-objective optimization algorithm for feature selection problems. Eng. Comput. 2022, 38, 1845–1863. [Google Scholar] [CrossRef]
  65. Nikoo, M.; Malekabadi, R.A.; Hafeez, G. Estimating the mechanical properties of Heat-Treated woods using Optimization algorithms-based ANN. Measurement 2023, 207, 112354. [Google Scholar] [CrossRef]
  66. Astarita, V.; Haghshenas, S.S.; Guido, G.; Vitale, A. Developing new hybrid grey wolf optimization-based artificial neural network for predicting road crash severity. Transp. Eng. 2023, 12, 100164. [Google Scholar] [CrossRef]
  67. Tian, Y.; Yu, J.; Zhao, A. Predictive model of energy consumption for office building by using improved GWO-BP. Energy Rep. 2020, 6, 620–627. [Google Scholar] [CrossRef]
  68. Amirsadri, S.; Mousavirad, S.J.; Ebrahimpour-Komleh, H. A levy flightbased grey wolf optimizer combined with back-propagation algorithm for neural network training. Neural Comput. Appl. 2018, 30, 3707–3720. [Google Scholar] [CrossRef]
  69. Mosavi, A.; Samadianfard, S.; Darbandi, S.; Nabipour, N.; Qasem, S.N.; Salwana, E.; Band, S.S. Predicting soil electrical conductivity using multilayer perceptron integrated with grey wolf optimizer. J. Geochem. Explor. 2021, 220, 106639. [Google Scholar] [CrossRef]
  70. Al-Badarneh, I.; Habib, M.; Aljarah, I.; Faris, H. Neuro-evolutionary models for imbalanced classification problems. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 2787–2797. [Google Scholar] [CrossRef]
  71. Pasti, R.; de Castro, L.N. Bio-inspired and gradient-based algorithms to train MLPs: The influence of diversity. Inf. Sci. 2009, 179, 1441–1453. [Google Scholar] [CrossRef]
  72. Al-Majidi, S.D.; Abbod, M.F.; Al-Raweshidy, H.S. A particle swarm optimisation-trained feedforward neural network for predicting the maximum power point of a photovoltaic array. Eng. Appl. Artif. Intell. 2020, 92, 103688. [Google Scholar] [CrossRef]
  73. Mirjalili, S.; Hashim, S.Z.M.; Sardroudi, H.M. Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm. Inf. Sci. 2014, 218, 11125–11137. [Google Scholar] [CrossRef]
  74. Heidari, A.A.; Faris, H.; Aljarah, I.; Mirjalili, S. An efficient hybrid multilayer perceptron neural network with grasshopper optimization. Soft Comput. 2019, 23, 7941–7958. [Google Scholar] [CrossRef]
  75. Azzini, A.; Tettamanzi, A.G. Evolutionary ANNs: A state of the art survey. Intell. Artif. 2011, 25, 19–35. [Google Scholar] [CrossRef]
  76. Alecsa, C.D.; Pinţa, T.; Boros, I. New optimization algorithms for neural network training using operator splitting techniques. Neural Netw. 2020, 126, 178–190. [Google Scholar] [CrossRef] [Green Version]
  77. Ridge, B.; Gams, A.; Morimoto, J.; Ude, A. Training of deep neural networks for the generation of dynamic movement primitives. Neural Netw. 2020, 127, 121–131. [Google Scholar]
  78. Zhang, R.; Wang, Q.; Yang, Q.; Wei, W. Temporal link prediction via adjusted sigmoid function and 2-simplex structure. Sci. Rep. 2022, 12, 16585. [Google Scholar] [CrossRef] [PubMed]
  79. Basterretxea, K.; Tarela, J.M.; del Campo, I. Approximation of sigmoid function and the derivative for hardware implementation of artificial neurons. IEE Proc.-Circuits Devices Syst. 2004, 151, 18–24. [Google Scholar] [CrossRef]
  80. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S. An improved grey wolf optimizer for solving engineering problems. Expert Syst. Appl. 2021, 166, 113917. [Google Scholar] [CrossRef]
  81. Akbari, E.; Rahimnejad, A.; Gadsden, S.A. A greedy non-hierarchical grey wolf optimizer for real-world optimization. Electron. Lett. 2021, 57, 499–501. [Google Scholar] [CrossRef]
  82. Storn, R.; Price, K. Differential Evolution—A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J. Glob. Optim. 1997, 11, 341–359. [Google Scholar] [CrossRef]
  83. Yang, X.-S.; He, X. Bat algorithm: Literature review and applications. Int. J. Bio-Inspired Comput. 2013, 5, 141–149. [Google Scholar] [CrossRef] [Green Version]
  84. Kiran, M.S. TSA: Tree-seed algorithm for continuous optimization. Expert Syst. Appl. 2015, 42, 6686–6698. [Google Scholar] [CrossRef]
  85. Mirjalili, S. SCA: A Sine Cosine algorithm for solving optimization problems. Knowl.-Based Syst. 2016, 96, 120–133. [Google Scholar] [CrossRef]
  86. Arora, S.; Singh, S. Butterfly optimization algorithm: A novel approach for global optimization. Soft Comput. 2019, 23, 715–734. [Google Scholar] [CrossRef]
  87. Abdel-Basset, M.; Mohamed, R.; Jameel, M.; Abouhawwash, M. Spider wasp optimizer: A novel meta-heuristic optimization algorithm. Artif. Intell. Rev. 2023, 1–64. [Google Scholar] [CrossRef]
  88. Trojovská, E.; Dehghani, M.; Trojovský, P. Zebra optimization algorithm: A new bio-inspired optimization algorithm for solving optimization algorithm. IEEE Access 2022, 10, 49445–49473. [Google Scholar] [CrossRef]
  89. Abualigah, L.; Abd Elaziz, M.; Sumari, P.; Geem, Z.W.; Gandomi, A.H. Reptile Search Algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Syst. Appl. 2022, 191, 116158. [Google Scholar] [CrossRef]
  90. Prakash, T.; Singh, P.P.; Singh, V.P.; Singh, S.N. A novel Brown-bear optimization algorithm for solving economic dispatch problem. In Advanced Control & Optimization Paradigms for Energy System Operation and Management; River Publishers: New York, NY, USA, 2023; pp. 137–164. [Google Scholar]
  91. Dehghani, M.; Trojovský, P. Osprey optimization algorithm: A new bio-inspired metaheuristic algorithm for solving engineering optimization problems. Front. Mech. Eng. 2023, 8, 1126450. [Google Scholar] [CrossRef]
  92. Akbari, M.A.; Zare, M.; Azizipanah-Abarghooee, R.; Mirjalili, S.; Deriche, M. The cheetah optimizer: A nature-inspired metaheuristic algorithm for large-scale optimization problems. Sci. Rep. 2022, 12, 10953. [Google Scholar] [CrossRef] [PubMed]
  93. Liang, J.J.; Qu, B.Y.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for the CEC 2014 Special Session and Competition on Single Objective Real-Parameter Numerical Optimization; Technical Report; Computational Intelligence Laboratory, Zhengzhou University: Zhengzhou, China; Nanyang Technological University: Singapore, 2013; Volume 635. [Google Scholar]
Figure 1. Soybean diseases.
Figure 2. Metaheuristic Optimization Algorithms (MOAs).
Figure 3. The overall structure of the whole study.
Figure 4. Architecture of a multi-Layer perceptron.
Figure 5. The chaotic map update with iteration number.
Figure 6. Training the whole identification model process.
Figure 7. The process of the EGWO to train MLP.
Figure 8. Mapping an EGWO solution to the MLP network.
Figure 9. Sigmoid function S-shaped growth curve image.
Figure 10. Convergence curve of EGWO and its variant algorithms.
Figure 11. Convergence curve of EGWO and other traditional algorithms.
Figure 12. Convergence curve of EGWO and other recent algorithms.
Figure 13. Box diagram of comparative experimental results.
Table 1. The initial parameter settings for the corresponding algorithms.
Algorithm | Parameter | Value
GWO variants
WdGWO | Null | Null
IGWO | Null | Null
GNHGWO | Null | Null
Traditional and popular algorithms
PSO | Coefficient of the cognitive component | 2
PSO | Coefficient of the social component | 2
DE | Scale factor primary (F) | 0.6
DE | Crossover rate (Cr) | 0.8
BA | Loudness (A) | 0.5
BA | Pulse rate (a) | 0.5
BA | Frequency minimum | 0
BA | Frequency maximum | 2
TSA | ST | 0.1
TSA | The number of seeds (ns) | [0.1 × N, 0.25 × N]
SCA | a | 2
SCA | r1 | Linearly decreased from a to 0
BOA | p | 0.8
BOA | Power exponent | 0.1
BOA | Sensory modality | 0.01
JAYA | Null | Null
Recent algorithms
SWO | TR | 0.3
SWO | Cr | 0.02
SWO | Minimum population size | 20
ZOA | Null | Null
COA | Null | Null
BOA | Null | Null
OOA | Null | Null
CO | Search agents in a group | 2
Table 2. The unimodal benchmark functions of IEEE CEC 2014 benchmark functions.
Name of the Functions | Expression
Rotated High Conditioned Elliptic Function | $F_1(x) = f_1(M(x - o_1)) + 100$
Rotated Bent Cigar Function | $F_2(x) = f_2(M(x - o_2)) + 200$
Rotated Discus Function | $F_3(x) = f_3(M(x - o_3)) + 300$
Table 3. The multimodal benchmark functions of IEEE CEC 2014 benchmark functions.
Name of the Functions | Expression
Shifted and Rotated Rosenbrock's Function | $F_4(x) = f_4(M(\frac{2.048(x - o_4)}{100}) + 1) + 400$
Shifted and Rotated Ackley's Function | $F_5(x) = f_5(M(x - o_5)) + 500$
Shifted and Rotated Weierstrass Function | $F_6(x) = f_6(M(\frac{0.5(x - o_6)}{100})) + 600$
Shifted and Rotated Griewank's Function | $F_7(x) = f_7(M(\frac{600(x - o_7)}{100})) + 700$
Shifted Rastrigin's Function | $F_8(x) = f_8(M(\frac{5.12(x - o_8)}{100})) + 800$
Shifted and Rotated Rastrigin's Function | $F_9(x) = f_8(M(\frac{5.12(x - o_9)}{100})) + 900$
Shifted Schwefel's Function | $F_{10}(x) = f_9(M(\frac{1000(x - o_{10})}{100})) + 1000$
Shifted and Rotated Schwefel's Function | $F_{11}(x) = f_9(M(\frac{1000(x - o_{11})}{100})) + 1100$
Shifted and Rotated Katsuura Function | $F_{12}(x) = f_{10}(M(\frac{5(x - o_{12})}{100})) + 1200$
Shifted and Rotated HappyCat Function | $F_{13}(x) = f_{11}(M(\frac{5(x - o_{13})}{100})) + 1300$
Shifted and Rotated HGBat Function | $F_{14}(x) = f_{12}(M(\frac{5(x - o_{14})}{100})) + 1400$
Shifted and Rotated Expanded Griewank's plus Rosenbrock's Function | $F_{15}(x) = f_{13}(M(\frac{5(x - o_{15})}{100}) + 1) + 1500$
Shifted and Rotated Expanded Scaffer's F6 Function | $F_{16}(x) = f_{14}(M(x - o_{16}) + 1) + 1600$
Table 4. The hybrid benchmark functions of IEEE CEC 2014 benchmark functions.
Name of the Functions | Expression
$F_{17}(x) = f_9(M_1 z_1) + f_8(M_2 z_2) + f_3(M_3 z_3) + 1700$, $p = [0.3, 0.3, 0.4]$
$F_{18}(x) = f_2(M_1 z_1) + f_8(M_2 z_2) + f_3(M_3 z_3) + 1800$, $p = [0.3, 0.3, 0.4]$
$F_{19}(x) = f_7(M_1 z_1) + f_8(M_2 z_2) + f_3(M_3 z_3) + f_8(M_4 z_4) + 1900$, $p = [0.2, 0.2, 0.3, 0.3]$
$F_{20}(x) = f_{12}(M_1 z_1) + f_3(M_2 z_2) + f_{13}(M_3 z_3) + f_8(M_4 z_4) + 2000$, $p = [0.2, 0.2, 0.3, 0.3]$
$F_{21}(x) = f_{14}(M_1 z_1) + f_{12}(M_2 z_2) + f_4(M_3 z_3) + f_9(M_4 z_4) + f_1(M_5 z_5) + 2100$, $p = [0.1, 0.2, 0.2, 0.2, 0.3]$
$F_{22}(x) = f_{10}(M_1 z_1) + f_{11}(M_2 z_2) + f_{13}(M_3 z_3) + f_9(M_4 z_4) + f_5(M_5 z_5) + 2200$, $p = [0.1, 0.2, 0.2, 0.2, 0.3]$
Notes:
$z_1 = [y_{S_1}, y_{S_2}, \ldots, y_{S_{n_1}}]$
$z_2 = [y_{S_{n_1+1}}, y_{S_{n_1+2}}, \ldots, y_{S_{n_1+n_2}}]$
$z_N = [y_{S_{\sum_{i=1}^{N-1} n_i + 1}}, y_{S_{\sum_{i=1}^{N-1} n_i + 2}}, \ldots, y_{S_D}]$
$y = x - o_i$, $S = \mathrm{randperm}(1{:}D)$, $p_i$ is the percentage of $g_i(x)$
$n_1 = \lceil p_1 D \rceil$, $n_2 = \lceil p_2 D \rceil$, …, $n_{N-1} = \lceil p_{N-1} D \rceil$, $n_N = D - \sum_{i=1}^{N-1} n_i$
Table 5. The composite benchmark functions of IEEE CEC 2014 benchmark functions.
Name of the Functions | Expression
$F_{23}(x) = w_1 F_4(x) + w_2 [10^{-6} F_1(x) + 100] + w_3 [10^{-26} F_2(x) + 200] + w_4 [10^{-6} F_3(x) + 300] + w_5 [10^{-6} F_1(x) + 400] + 2300$, $\sigma = [10, 20, 30, 40, 50]$
$F_{24}(x) = w_1 F_{10}(x) + w_2 [F_9(x) + 100] + w_3 [F_{14}(x) + 200] + 2400$, $\sigma = [20, 20, 20]$
$F_{25}(x) = 0.25 w_1 F_{11}(x) + w_2 [F_9(x) + 100] + w_3 [10^{-7} F_1(x) + 200] + 2500$, $\sigma = [10, 30, 50]$
$F_{26}(x) = 0.25 w_1 F_{11}(x) + w_2 [F_{13}(x) + 100] + w_3 [10^{-7} F_1(x) + 200] + w_4 [2.5 F_6(x) + 300] + w_5 [10^{-6} F_{13}(x) + 400] + 2600$, $\sigma = [10, 10, 10, 10, 10]$
$F_{27}(x) = 10 w_1 F_{14}(x) + w_2 [10 F_9(x) + 100] + w_3 [2.5 F_{11}(x) + 200] + w_4 [25 F_{16}(x) + 300] + w_5 [10^{-6} F_1(x) + 400] + 2700$, $\sigma = [10, 10, 10, 20, 20]$
$F_{28}(x) = 2.5 w_1 F_{15}(x) + w_2 [10 F_9(x) + 100] + w_3 [2.5 F_{11}(x) + 200] + w_4 [5 \times 10^{-4} F_{16}(x) + 300] + w_5 [10^{-6} F_1(x) + 400] + 2800$, $\sigma = [10, 20, 30, 40, 50]$
$F_{29}(x) = w_1 F_{17}(x) + w_2 [F_{18}(x) + 100] + w_3 [F_{19}(x) + 200] + 2900$, $\sigma = [10, 30, 50]$
$F_{30}(x) = w_1 F_{20}(x) + w_2 [F_{21}(x) + 100] + w_3 [F_{22}(x) + 200] + 3000$, $\sigma = [10, 30, 50]$
Notes:
$w_i = \frac{1}{\sqrt{\sum_{j=1}^{D}(x_j - o_{ij})^2}} \exp\!\left(-\frac{\sum_{j=1}^{D}(x_j - o_{ij})^2}{2 D \sigma_i^2}\right)$
Table 6. Results of the EGWO and GWO variants.
Mean (EGWO, GWO, GNHGWO, IGWO, WdGWO) | Rank (EGWO, GWO, GNHGWO, IGWO, WdGWO)
8.851 × 10 8 8.567 × 10 8 2.317 × 10 10 2.441 × 10 10 4.103 × 10 9 21534
7.809 × 10 10 9.219 × 10 10 6.318 × 10 11 6.867 × 10 11 2.144 × 10 11 12534
3.033 × 10 5 3.283 × 10 5 1.748 × 10 6 7.235 × 10 6 8.165 × 10 8 12534
9.873 × 10 3 1.114 × 10 4 2.906 × 10 5 3.237 × 10 5 3.796 × 10 4 12534
5.214 × 10 2 5.214 × 10 2 5.215 × 10 2 5.215 × 10 2 5.214 × 10 2 51234
7.269 × 10 2 7.040 × 10 2 7.778 × 10 2 7.793 × 10 2 7.589 × 10 2 21534
1.482 × 10 3 1.625 × 10 3 6.385 × 10 3 6.697 × 10 3 2.467 × 10 3 12534
1.728 × 10 3 1.560 × 10 3 2.858 × 10 3 2.917 × 10 3 2.142 × 10 3 21534
1.867 × 10 3 1.697 × 10 3 3.398 × 10 3 3.489 × 10 3 2.501 × 10 3 21534
2.176 × 10 4 2.010 × 10 4 3.641 × 10 4 3.647 × 10 4 3.207 × 10 4 21534
2.433 × 10 4 2.475 × 10 4 3.663 × 10 4 3.669 × 10 4 3.398 × 10 4 12534
1.205 × 10 3 1.205 × 10 3 1.207 × 10 3 1.207 × 10 3 1.205 × 10 3 12534
1.305 × 10 3 1.305 × 10 3 1.314 × 10 3 1.315 × 10 3 1.307 × 10 3 12534
1.603 × 10 3 1.650 × 10 3 3.114 × 10 3 3.259 × 10 3 1.936 × 10 3 12534
2.360 × 10 5 2.605 × 10 5 6.914 × 10 8 9.421 × 10 8 3.300 × 10 7 12534
1.647 × 10 3 1.647 × 10 3 1.649 × 10 3 1.649 × 10 3 1.648 × 10 3 12534
1.168 × 10 8 1.056 × 10 8 3.910 × 10 9 4.650 × 10 9 3.916 × 10 8 21534
2.473 × 10 9 2.968 × 10 9 8.400 × 10 10 9.090 × 10 10 2.449 × 10 9 51234
2.691 × 10 3 2.627 × 10 3 2.520 × 10 4 2.911 × 10 4 2.599 × 10 3 52134
2.477 × 10 5 2.972 × 10 5 5.578 × 10 7 1.200 × 10 8 1.777 × 10 6 12534
4.545 × 10 7 4.609 × 10 7 1.658 × 10 9 2.561 × 10 9 1.442 × 10 8 12534
5.753 × 10 3 5.904 × 10 3 2.631 × 10 6 3.533 × 10 6 8.447 × 10 3 12534
3.275 × 10 3 3.124 × 10 3 1.153 × 10 4 1.349 × 10 4 3.586 × 10 3 21534
2.927 × 10 3 2.601 × 10 3 4.412 × 10 3 4.524 × 10 3 3.547 × 10 3 21534
2.898 × 10 3 2.751 × 10 3 4.319 × 10 3 4.495 × 10 3 3.239 × 10 3 21534
2.816 × 10 3 2.811 × 10 3 4.263 × 10 3 4.593 × 10 3 2.843 × 10 3 21534
6.265 × 10 3 5.845 × 10 3 1.025 × 10 4 1.144 × 10 4 6.932 × 10 3 21534
1.633 × 10 4 4.930 × 10 3 4.357 × 10 4 4.455 × 10 4 1.121 × 10 4 25134
2.203 × 10 8 3.142 × 10 3 9.100 × 10 9 9.814 × 10 9 1.954 × 10 8 25134
9.028 × 10 6 6.053 × 10 3 5.515 × 10 8 7.087 × 10 8 4.797 × 10 6 25134
Average Ranking1.871.874.2734
Total Ranking11423
Table 7. Statistical analysis results on EGWO and GWO variants.
Friedman ANOVA Test | Wilcoxon Rank Sum Test
EGWO vs. | SS | df | MS | Chi-sq | p | p | α = 0.05 | α = 0.1
GWO | 4374 | 29 | 150.828 | 56.44 | 0.0017 | 0.797098 | No | No
GNHGWO | 4404 | 29 | 151.862 | 56.83 | 0.0015 | 1.73 × 10⁻⁶ | Yes | Yes
IGWO | 4409 | 29 | 152.034 | 56.89 | 0.0015 | 1.73 × 10⁻⁶ | Yes | Yes
WdGWO | 4472 | 29 | 154.207 | 57.7 | 0.0012 | 0.00532 | Yes | Yes
Table 8. Results of the EGWO and traditional algorithms.
Mean (EGWO, GA, BOA, SCA, TSA, JAYA) | Ranking (EGWO, GA, BOA, SCA, TSA, JAYA)
8.851 × 10 8 1.678 × 10 10 9.329 × 10 9 5.084 × 10 9 2.340 × 10 9 1.352 × 10 10 154362
7.809 × 10 10 3.997 × 10 11 3.007 × 10 11 2.301 × 10 11 5.258 × 10 10 4.587 × 10 11 514326
3.033 × 10 5 2.367 × 10 7 3.262 × 10 5 3.991 × 10 5 3.462 × 10 5 9.263 × 10 5 135462
9.873 × 10 3 1.724 × 10 5 1.039 × 10 5 5.127 × 10 4 8.750 × 10 3 1.496 × 10 5 514362
5.214 × 10 2 5.216 × 10 2 5.214 × 10 2 5.214 × 10 2 5.214 × 10 2 5.214 × 10 2 541632
7.269 × 10 2 7.763 × 10 2 7.559 × 10 2 7.607 × 10 2 7.436 × 10 2 7.666 × 10 2 153462
1.482 × 10 3 4.632 × 10 3 3.807 × 10 3 2.992 × 10 3 1.182 × 10 3 4.417 × 10 3 514362
1.728 × 10 3 2.498 × 10 3 2.165 × 10 3 2.171 × 10 3 1.886 × 10 3 2.555 × 10 3 153426
1.867 × 10 3 2.761 × 10 3 2.405 × 10 3 2.415 × 10 3 2.094 × 10 3 3.048 × 10 3 153426
2.176 × 10 4 3.669 × 10 3 3.346 × 10 3 3.196 × 10 3 2.989 × 10 3 3.058 × 10 4 156432
2.433 × 10 3 3.637 × 10 3 3.315 × 10 3 3.348 × 10 3 3.235 × 10 3 3.391 × 10 3 153462
1.205 × 10 3 1.207 × 10 3 1.205 × 10 3 1.205 × 10 3 1.205 × 10 3 1.205 × 10 3 514632
1.305 × 10 3 1.311 × 10 3 1.310 × 10 3 1.308 × 10 3 1.303 × 10 3 1.310 × 10 3 514362
1.603 × 10 3 2.574 × 10 3 2.335 × 10 3 2.041 × 10 3 1.534 × 10 3 2.327 × 10 3 514632
2.360 × 10 5 1.378 × 10 8 2.682 × 10 7 1.075 × 10 7 6.103 × 10 5 2.256 × 10 5 154326
1.647 × 10 3 1.649 × 10 3 1.647 × 10 3 1.648 × 10 3 1.647 × 10 3 1.648 × 10 3 153462
1.168 × 10 8 3.530 × 10 9 1.830 × 10 9 6.445 × 10 8 2.170 × 10 8 1.150 × 10 9 154632
2.473 × 10 9 6.333 × 10 10 4.128 × 10 10 1.405 × 10 10 3.24 × 10 3 1.916 × 10 10 514632
2.691 × 10 3 1.966 × 10 4 1.238 × 10 4 4.594 × 10 3 2.154 × 10 3 8.061 × 10 3 514632
2.477 × 10 5 7.514 × 10 7 1.450 × 10 6 8.578 × 10 5 2.741 × 10 5 6.518 × 10 6 154362
4.545 × 10 7 1.657 × 10 9 5.645 × 10 8 2.578 × 10 8 9.170 × 10 7 4.268 × 10 8 154632
5.753 × 10 3 1.790 × 10 6 4.034 × 10 5 9.476 × 10 3 7.145 × 10 3 3.350 × 10 4 154632
3.275 × 10 3 6.788 × 10 3 2.500 × 10 3 4.224 × 10 3 2.758 × 10 3 5.710 × 10 3 351462
2.927 × 10 3 3.816 × 10 3 2.600 × 10 3 3.166 × 10 3 2.994 × 10 3 3.997 × 10 3 315426
2.898 × 10 3 3.288 × 10 3 2.700 × 10 3 3.093 × 10 3 3.052 × 10 3 3.566 × 10 3 315426
2.816 × 10 3 3.161 × 10 3 2.800 × 10 3 3.053 × 10 3 2.993 × 10 3 3.076 × 10 3 315462
6.265 × 10 3 1.333 × 10 4 8.104 × 10 3 7.421 × 10 3 6.397 × 10 3 8.443 × 10 3 154362
1.633 × 10 4 4.926 × 10 4 2.393 × 10 4 2.564 × 10 4 2.103 × 10 4 2.277 × 10 4 156342
2.203 × 10 8 1.092 × 10 10 3.100 × 10 3 1.837 × 10 9 2.631 × 10 7 1.966 × 10 9 351462
9.028 × 10 6 8.873 × 10 8 3.200 × 10 3 6.262 × 10 7 2.547 × 10 6 1.517 × 10 8 351462
Average Ranking2.603.433.704.234.232.80
Total Ranking134552
Table 9. Statistical analysis results of EGWO and traditional algorithms.
Friedman ANOVA Test | Wilcoxon Rank Sum Test
EGWO vs. | SS | df | MS | Chi-sq | p | p | α = 0.05 | α = 0.1
GA | 4416 | 29 | 152.276 | 56.98 | 0.0014 | 1.73 × 10⁻⁶ | Yes | Yes
BOA | 4197 | 29 | 144.724 | 54.15 | 0.0031 | 0.002279 | Yes | Yes
SCA | 4463 | 29 | 153.897 | 57.59 | 0.0012 | 3.79 × 10⁻⁶ | Yes | Yes
TSA | 4373 | 29 | 150.793 | 56.43 | 0.0017 | 0.336552 | No | No
JAYA | 4432 | 29 | 152.828 | 57.19 | 0.0014 | 3.79 × 10⁻⁶ | Yes | Yes
Table 10. Results of the EGWO and recent algorithms.
Mean (EGWO, ZOA, RSA, SWO, BOA, CO, OOA) | Rank (EGWO, ZOA, RSA, SWO, BOA, CO, OOA)
8.851 × 10 8 9.924 × 10 8 7.443 × 10 9 6.266 × 10 9 4.213 × 10 9 8.570 × 10 9 1.055 × 10 10 1254367
7.809 × 10 10 1.507 × 10 11 2.803 × 10 11 2.912 × 10 11 2.288 × 10 11 3.154 × 10 11 3.064 × 10 11 1253476
3.033 × 10 5 2.516 × 10 5 3.070 × 10 5 4.625 × 10 5 3.044 × 10 5 7.333 × 10 5 3.361 × 10 5 2153746
9.873 × 10 3 2.142 × 10 4 8.052 × 10 4 8.772 × 10 4 5.441 × 10 4 8.940 × 10 4 1.061 × 10 5 1253467
5.214 × 10 2 5.212 × 10 2 5.214 × 10 2 5.215 × 10 2 5.213 × 10 2 5.215 × 10 2 5.214 × 10 2 2513746
7.269 × 10 2 7.410 × 10 2 7.574 × 10 2 7.623 × 10 2 7.441 × 10 2 7.657 × 10 2 7.614 × 10 2 1253746
1.482 × 10 3 2.212 × 10 3 3.554 × 10 3 3.591 × 10 3 3.108 × 10 3 3.671 × 10 3 3.815 × 10 3 1253467
1.728 × 10 3 1.653 × 10 3 2.254 × 10 3 2.259 × 10 3 2.046 × 10 3 2.308 × 10 3 2.160 × 10 3 2157346
1.867 × 10 3 1.854 × 10 3 2.372 × 10 3 2.478 × 10 3 2.259 × 10 3 2.725 × 10 3 2.342 × 10 3 2157346
2.176 × 10 4 2.127 × 10 4 3.095 × 10 4 3.247 × 10 4 2.894 × 10 4 3.137 × 10 4 3.068 × 10 4 2157364
2.433 × 10 4 2.289 × 10 4 3.154 × 10 4 3.439 × 10 4 2.874 × 10 4 3.461 × 10 4 3.226 × 10 4 2153746
1.205 × 10 3 1.203 × 10 3 1.205 × 10 3 1.206 × 10 3 1.205 × 10 3 1.206 × 10 3 1.204 × 10 3 2713546
1.305 × 10 3 1.306 × 10 3 1.309 × 10 3 1.309 × 10 3 1.308 × 10 3 1.309 × 10 3 1.310 × 10 3 1253467
1.603 × 10 3 1.833 × 10 3 2.236 × 10 3 2.257 × 10 3 2.053 × 10 3 2.285 × 10 3 2.325 × 10 3 1253467
2.360 × 10 3 1.138 × 10 3 1.439 × 10 3 2.538 × 10 3 7.816 × 10 3 8.170 × 10 3 2.492 × 10 3 1253746
1.647 × 10 3 1.645 × 10 3 1.647 × 10 3 1.648 × 10 3 1.647 × 10 3 1.648 × 10 3 1.647 × 10 3 2135746
1.168 × 10 8 1.742 × 10 8 1.208 × 10 9 1.069 × 10 9 5.084 × 10 8 1.156 × 10 8 1.970 × 10 8 1254637
2.473 × 10 9 7.379 × 10 10 3.391 × 10 10 3.017 × 10 10 2.147 × 10 10 2.261 × 10 10 4.214 × 10 10 1256437
2.691 × 10 3 3.117 × 10 3 8.969 × 10 3 7.495 × 10 3 5.115 × 10 3 6.173 × 10 3 1.135 × 10 4 1256437
2.477 × 10 5 2.264 × 10 5 8.816 × 10 5 2.414 × 10 6 6.972 × 10 5 9.080 × 10 6 1.492 × 10 6 2153746
4.545 × 10 7 6.209 × 10 7 3.956 × 10 8 3.389 × 10 8 1.526 × 10 8 5.455 × 10 8 3.935 × 10 8 1254736
5.753 × 10 3 7.559 × 10 3 1.104 × 10 5 5.840 × 10 4 1.469 × 10 4 3.293 × 10 4 1.850 × 10 5 1256437
3.275 × 10 3 2.500 × 10 3 2.500 × 10 3 3.166 × 10 3 2.500 × 10 3 5.278 × 10 3 2.500 × 10 3 2357416
2.927 × 10 3 2.600 × 10 3 2.600 × 10 3 2.744 × 10 3 2.600 × 10 3 3.579 × 10 3 2.600 × 10 3 2357416
2.898 × 10 3 2.700 × 10 3 2.700 × 10 3 2.751 × 10 3 2.700 × 10 3 3.501 × 10 3 2.700 × 10 3 2357416
2.816 × 10 3 2.800 × 10 3 2.800 × 10 3 2.810 × 10 3 2.797 × 10 3 3.407 × 10 3 2.800 × 10 3 5237416
6.265 × 10 3 7.654 × 10 3 7.556 × 10 3 7.935 × 10 3 3.193 × 10 3 7.507 × 10 3 2.900 × 10 3 7516324
1.633 × 10 4 2.883 × 10 4 2.382 × 10 4 3.364 × 10 4 4.169 × 10 3 3.102 × 10 4 3.000 × 10 3 7513264
2.203 × 10 8 2.710 × 10 8 1.935 × 10 7 1.631 × 10 9 2.142 × 10 8 2.929 × 10 9 3.100 × 10 3 7351246
9.028 × 10 6 2.292 × 10 7 4.422 × 10 7 1.353 × 10 8 3.200 × 10 3 9.792 × 10 7 3.200 × 10 3 5712364
Averagr Ranking2.272.534.204.404.5746.03
Total Ranking1245637
Table 11. Statistical analysis results on EGWO and recent algorithms.
Friedman ANOVA Test | Wilcoxon Rank Sum Test
EGWO vs. | SS | df | MS | Chi-sq | p | p | α = 0.05 | α = 0.1
ZOA | 4454 | 29 | 153.586 | 57.47 | 0.0013 | 0.047156 | Yes | Yes
RSA | 4397 | 29 | 151.621 | 56.74 | 0.0015 | 0.000413 | Yes | Yes
SWO | 4416 | 29 | 152.276 | 56.98 | 0.0014 | 2.84 × 10⁻⁵ | Yes | Yes
BOA | 4351 | 29 | 150.034 | 56.14 | 0.0018 | 0.022778 | Yes | Yes
CO | 4435 | 29 | 152.931 | 57.23 | 0.0013 | 1.73 × 10⁻⁶ | Yes | Yes
OOA | 4208 | 29 | 145.103 | 54.3 | 0.003 | 0.015788 | Yes | Yes
Table 12. Training results of EGWO-MLP and other algorithms on the UCI dataset.
Tic-Tac-Toe Dataset
 | EGWO-MLP | GWO-MLP | DE-MLP | TSA-MLP | PSO-MLP | BA-MLP | GA-MLP | SCA-MLP
Rate | 97.643% | 93.790% | 94.091% | 97.310% | 87.830% | 94.704% | 64.725% | 93.531%
MSE | 0.005 | 0.001 | 0.013 | 0.013 | 0.017 | 0.017 | 0.028 | 0.015
Std. | 2.551 | 8.245 | 8.792 | 5.404 | 16.561 | 9.703 | 33.350 | 11.274
Heart Dataset
 | EGWO-MLP | GWO-MLP | DE-MLP | TSA-MLP | PSO-MLP | BA-MLP | GA-MLP | SCA-MLP
Rate | 85.292% | 89.042% | 79.167% | 71.417% | 59.833% | 57.000% | 44.750% | 70.042%
MSE | 0.103 | 0.076 | 0.157 | 0.180 | 0.272 | 0.286 | 0.323 | 0.206
Std. | 33.188 | 3.074 | 3.586 | 3.796 | 9.042 | 9.311 | 8.423 | 3.543
XOR Dataset
 | EGWO-MLP | GWO-MLP | DE-MLP | TSA-MLP | PSO-MLP | BA-MLP | GA-MLP | SCA-MLP
Rate | 95.417% | 93.750% | 52.500% | 35.417% | 31.667% | 88.333% | 31.667% | 47.500%
MSE | 0.005 | 0.010 | 0.045 | 0.065 | 0.191 | 0.011 | 0.191 | 0.090
Std. | 8.980 | 12.607 | 18.971 | 17.084 | 16.973 | 24.330 | 16.973 | 15.186
Balloon Dataset
 | EGWO-MLP | GWO-MLP | DE-MLP | TSA-MLP | PSO-MLP | BA-MLP | GA-MLP | SCA-MLP
Rate | 100% | 100% | 100% | 44.333% | 43.333% | 59.000% | 41.167% | 97.333%
MSE | 3.880 × 10⁻¹⁰ | 6.277 × 10⁻⁹ | 1.07 × 10⁻⁶ | 0.184 | 0.197 | 0.119 | 0.210 | 0.001
Std. | 0 | 0 | 0 | 12.087 | 13.792 | 16.578 | 14.779 | 7.397
Table 13. The attribution of the soybean (Large) dataset.
No. | Attribution | Means
1 | date | April, May, June, July, August, September, October, unknown.
2 | plant-stand | normal, lt-normal, unknown.
3 | precip | lt-norm, norm, gt-norm, unknown.
4 | temp | lt-norm, norm, gt-norm, unknown.
5 | hail | yes, no, unknown.
6 | crop-hist | diff-lst-year, same-lst-yr, same-lst-two-yrs, same-lst-sev-yrs, unknown.
7 | area-damaged | scattered, low-areas, upper-areas, whole-field, unknown.
8 | severity | minor, pot-severe, severe, unknown.
9 | seed-tmt | none, fungicide, other, unknown.
10 | germination | 90–100%, 80–89%, lt–80%, unknown.
11 | plant-growth | norm, abnorm, unknown.
12 | leaves | norm, abnorm.
13 | leafspots-halo | absent, yellow-halos, no-yellow-halos, unknown.
14 | leafspots-marg | w-s-marg, no-w-s-marg, dna, unknown.
15 | leafspot-size | lt-1/8, gt-1/8, dna, unknown.
16 | leaf-shread | absent, present, unknown.
17 | leaf-malf | absent, present, unknown.
18 | leaf-mild | absent, upper-surf, lower-surf, unknown.
19 | stem | norm, abnorm, unknown.
20 | lodging | yes, no, unknown.
21 | stem-cankers | absent, below-soil, above-soil, above-sec-nde, unknown.
22 | canker-lesion | dna, brown, dk-brown-blk, tan, unknown.
23 | fruiting-bodies | absent, present, unknown.
24 | external decay | absent, firm-and-dry, watery, unknown.
25 | mycelium | absent, present, unknown.
26 | int-discolor | none, brown, black, unknown.
27 | sclerotia | absent, present, unknown.
28 | fruit-pods | norm, diseased, few-present, dna, unknown.
29 | fruit spots | absent, colored, brown-w/blk-specks, distort, dna, unknown.
30 | seed | norm, abnorm, unknown.
31 | mold-growth | absent, present, unknown.
32 | seed-discolor | absent, present, unknown.
33 | seed-size | norm, lt-norm, unknown.
34 | shriveling | absent, present, unknown.
35 | roots | norm, rotted, galls-cysts, unknown.
Table 14. Results of soybean disease identification compared with other algorithms.
 | EGWO-MLP | PSOGWO-MLP | DE-MLP | TSA-MLP | PSO-MLP | BA-MLP | GA-MLP | SCA-MLP
Rate | 98.763% | 77.204% | 68.548% | 91.505% | 51.935% | 22.957% | 39.677% | 64.677%
Std. | 3.108 | 15.818 | 21.933 | 10.020 | 27.993 | 30.217 | 42.994 | 23.717
MSE | 12.627 | 80.225 | 100.142 | 36.019 | 118.036 | 104.711 | 40.241 | 125.489
Std. | 2.397 | 38.901 | 13.384 | 3.173 | 38.263 | 41.688 | 20.044 | 22.566