Comparative Analysis of Deep Learning Models for Olive Detection on
the Branch
ERHAN KAHYA1, YASIN ASLAN2
1Department of Electronic and Automation
Tekirdağ Namık Kemal University
Tekirdağ,
TURKEY
2Freelance Senior Software Developer
Tekirdağ,
TURKEY
Abstract: - The future of deep learning integration in agriculture holds great potential for advancing sustainable
agricultural practices, precision agriculture and improved decision-making. With the rapid development of
image processing and artificial intelligence technologies in recent years, deep learning has begun to play a
major role in identifying agricultural pests and optimizing agricultural product marketing. However, there are
challenges related to data quality, model scalability, and geographical limitations for widespread adoption of
deep learning in agriculture. This study on olive was conducted to improve the quality of the data set and to
ensure more reliable training of object detection models. According to the result of the training process of
YOLOv7 used in the study, it was concluded that it was characterized by decreasing loss values and showed an
increase in the model's ability to detect objects correctly. It was observed that the other model, YOLOv8l, had a
more effective learning capacity and a tendency to learn faster. The performance of both models was evaluated
with various metrics, and it was determined that YOLOv8l had higher Precision, Recall, and mAP values. It
was emphasized that YOLOv8l showed high performance even in low epoch numbers and can be preferred
especially in cases where time and computational resources were limited. It was determined that YOLOv7
made detections in a wide confidence range, but had difficulty in detections with low confidence scores. It was
observed that YOLOv8l made more stable and reliable detections with higher confidence scores. The metric
data of the "YOLOv8l" model were found to be higher compared to the other models. The F1 score of the YOLOv8l model was 92.337%, its precision 96.568%, its recall 88.462%, and its mAP@0.5:0.95 value of 94.608% was the highest. This research on deep learning-based object detection models indicated that YOLOv8l showed
superior performance compared to YOLOv7 and was a more reliable option for agricultural applications.
Key-Words: - Deep learning, YOLOv8, YOLOv7, identification, analysis, classification.
Received: August 21, 2023. Revised: November 15, 2023. Accepted: December 11, 2023. Published: December 31, 2023.
1 Introduction
Deep learning has had a major impact on the
agricultural sector and has started to be used in
many areas. In agricultural applications, deep
learning is used in crop management, water
management, soil management, livestock
management, and classification. Deep learning is an
artificial intelligence method that uses multi-layered
artificial neural networks and can be used in areas
such as soil analysis, disease, and pest detection in
agriculture. Deep learning is used to increase
agricultural productivity and improve production. In
particular, the ability of deep learning to deal with
large data sets shows that it is highly effective in
agricultural applications, [1]. Deep learning is used
in a wide variety of areas in agricultural
applications. The successes achieved with deep
learning methods in the agricultural image
recognition competition have revealed the potential
in this field, [2]. Deep learning is also used in
processing agricultural sensor data and forecasting
agricultural production, [3]. Deep learning plays an
important role in issues such as sustainability,
productivity, reduction of input costs, food safety,
and environmental sustainability in agricultural
applications, [4]. The use of these technologies in
agricultural applications enables the development of
smarter and more sustainable agricultural practices, [5].
Deep learning is used in areas such as agricultural
data analytics, disease diagnosis, and production
forecasting to increase the automation and
efficiency of agricultural processes, [6]. Deep
learning models are also used for decision support
systems in agricultural production, [7]. In this
context, it is seen that deep learning techniques are
used in a wide range of agricultural applications and
provide a significant transformation in the
agricultural sector. The use of these techniques
contributes to making agricultural processes more
efficient, sustainable, and smart.
2 Material and Method
2.1 Material
Olive is the fruit of the olive tree and is usually used
for olive oil production or direct consumption.
There are two main types. The first is table olives,
which are used for direct consumption and are
usually pickled in water or stored in oil. The other is
smaller, more bitter olives that are generally pressed for olive oil production. Olive trees generally grow
in regions specific to the Mediterranean climate, and
the harvest time of this fruit varies depending on the
geographical location and olive type. Olives
generally ripen in autumn and are harvested during
this period. The olive harvesting process is a very
important stage in olive oil production as it
significantly affects the quality of the final product.
Olive harvesting can be performed using manual or
mechanical methods, and each approach has specific
effects. Manual harvesting involves manually
picking olives from trees or collecting olives that
have fallen to the ground and is usually done by
farm workers, [8]. On the other hand, mechanical
harvesting uses technological methods such as
shaking trees to dislodge olives. This method may
be more efficient, but can also cause potential
damage to the fruit, [9]. The labor aspect of olive
harvesting is also an important issue. This is
because hand picking of olives involves the
participation of farm laborers, [10]. This emphasizes
the social and occupational health dimensions
associated with olive harvesting and emphasizes the
need for fair labor practices and worker welfare in
agricultural settings.
2.2 Method
While preparing the data set of the olive fruit, which
was targeted for object detection and analysis within
the scope of the study, 140 images taken from the
producer olive field in Tekirdağ Naip Village and
various internet sources were used. The parts
containing each olive image within the images were
marked with a bounding box area. Sample images
taken from the producer's field are shown in Figure
1 (a-b).
(a)
(b)
Fig. 1: a-b: Examples of training images
2.2.1 Labeling
For an object detection model to be trained on a dataset, the objects to be detected must first be labeled/marked in that dataset. For this reason, the parts containing an olive in each of the 140 images were marked with a bounding box and assigned to the "olive" object class. There are
many programs, websites, and tools available in the
open-source communities for image labeling.
RoboFlow is a versatile platform that offers a range
of functions for robotics and artificial intelligence
applications. It enables the development and
training of machine learning models by providing
users with access to public datasets and the ability to upload private data, [1]. The platform also supports
pre-processing of images and their labels, as well as
the creation of bounding boxes for tasks such as
object detection and classification, [10], [11].
RoboFlow enables end users to control robots by
developing flowchart-like programs consisting of
visual markers and making them accessible to
people without extensive programming expertise,
[12]. Roboflow offers a flow-based visual
programming language specifically designed for
engineering robot motion and manipulation tasks by
meeting the needs of both inexperienced and
experienced programmers, [13]. This marking and
labeling process is easily done through the graphical
user interface of the site. The Label screen is shown
in Figure 2 (a-b).
(a)
(b)
Fig. 2: a-b: Label display
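Although labeling was carried out through RoboFlow's interface, the exported annotations that the training scripts below consume follow the plain-text YOLO format: one .txt file per image, one line per bounding box containing the class index and normalized center coordinates and box size, together with a data.yaml file describing the dataset. The following minimal sketch illustrates that layout; the file names, paths, and coordinate values are hypothetical.

from pathlib import Path

# One label file per image; each line: class x_center y_center width height,
# with all coordinates normalized to [0, 1]. Class 0 is the single "olive" class.
Path("labels").mkdir(exist_ok=True)
Path("labels/olive_001.txt").write_text("0 0.512 0.430 0.082 0.095\n")  # hypothetical box

# Minimal data.yaml of the kind referenced by the training commands below (placeholder paths)
Path("data.yaml").write_text(
    "train: ../train/images\n"
    "val: ../valid/images\n"
    "nc: 1\n"
    "names: ['olive']\n"
)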
2.2.2 Training Model Selection
In this study, the YOLOv7 and YOLOv8 families, open-source members of the CNN-based YOLO model family, were preferred. The YOLOv7 architecture,
an extension of YOLOv4, has been developed with
several architectural reforms, including the
Extended Efficient Layer Aggregation Network (E-
ELAN) and trainable bag-of-freebies, [14]. YOLOv7 is
the newest variant of the YOLO object detector
family, known for its high accuracy and fast
implementation, [15]. The YOLOv7-Pose model,
based on YOLO-Pose and YOLOv7, uses E-LAN
and E-ELAN architectures for high accuracy and speed, [16]. The YOLOv7 architecture includes a
computation block called E-ELAN, which allows
better learning within the framework, [17].
YOLOv8, the latest version of the YOLO series,
has attracted attention with its advances in real-time
object detection. It is designed to provide high
accuracy while maintaining real-time performance
by combining the strengths of various real-time
object detectors, [18]. The YOLOv8 architecture
offers five different size options that meet various
application needs, from nano-size to extra-large
size, [19]. Building on the strengths of its previous
versions, YOLOv8 has emerged as a faster and
more capable option and positions itself as a
superior alternative to existing models, [20]. The
inclusion of object detection skills in the YOLOv8
framework increases its potential for robotic
cognitive architectures by contributing to perceptual
fixation capabilities, [21]. The YOLOv7 and YOLOv8 versions used in the study each contain several model variants. YOLOv7 and the YOLOv8n/s/m/l (nano, small, medium, and large) models were preferred for the deep learning training.
2.2.3 Initiation of Training
To start the training of the model that will perform
olive detection, the directory of the YOLOv7/v8 model on the computer was opened. In the next step,
a Python editor was used to control and run the
train.py program located in the main directory that
manages the YOLOv7/v8 training. This Python
program can be customized with various parameters
and adapted specifically for olive detection training.
Within the project, the parameters and settings in the code below were used for the olive fruit.
YOLOv7
!python train.py --device 0 --batch 20 --epochs 100 --data {dataset.location}/data.yaml --weights 'yolov7_training.pt'
--device: This parameter indicates the device on
which the training will be performed. The value "0"
is generally used for the first graphics processing unit (GPU).
--batch: The number of samples processed by the graphics card at a time while training the model.
--epochs: The number of times all training data is shown to the network and the weights are updated while training the model.
--data: It is a configuration file that defines the
training data of the model. This file contains the
data path, class labels, and other data characteristics.
--weights: Indicates the initial weights to be used
during training of the model. This is important in
transfer learning situations. The file
'yolov7_training.pt' contains the pre-trained weights.
If this parameter is not indicated, the model usually
starts training with random weights.
YOLOV8
!yolo task=detect mode=train model=yolov8n.pt data={dataset.location}/data.yaml epochs=50 imgsz=640 plots=True
!yolo task=detect mode=train model=yolov8s.pt data={dataset.location}/data.yaml epochs=50 imgsz=640 plots=True
!yolo task=detect mode=train model=yolov8m.pt data={dataset.location}/data.yaml epochs=50 imgsz=640 plots=True
!yolo task=detect mode=train model=yolov8l.pt data={dataset.location}/data.yaml epochs=50 imgsz=640 plots=True
--task: Determines which task the training will
perform. “detect” refers to the object detection task.
--mode: Indicates that the model will be run in training mode. YOLO can be run in different modes; for example, "train" is used for training and "val" for validation.
--model: Indicates the path and name of the model
to be used during training.
--imgsz: The pixel size to which the training images are resized by the YOLOv8 model.
--plots: Refers to graph drawings created during
training. These graphs are used to monitor the
performance of the model.
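For reference, the same YOLOv8 trainings can also be launched from a Python script through the Ultralytics API rather than the shell; the following is a minimal sketch, assuming the ultralytics package is installed and that data.yaml points to the labeled olive dataset.

from ultralytics import YOLO

# Load COCO-pretrained weights for transfer learning (here the large variant)
model = YOLO("yolov8l.pt")

# Mirrors the CLI call: task=detect mode=train epochs=50 imgsz=640 plots=True
model.train(data="data.yaml", epochs=50, imgsz=640, plots=True)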
As a result of running these code lines correctly, the training process of the model started. The program first checks the YOLOv7/v8 files for any updates. Then, the training process is carried out for the determined number of cycles (epochs). The results of the YOLOv7 and YOLOv8l algorithms according to the error matrix metrics are shown in Figure 3 through Figure 8.
(Image 1, Size: 640x640, Batch: 20, Epoch: 100, Algorithm: YOLOv7)
(Image 2, Size: 640x640, Epoch: 50, Algorithm:
YOLOv8l)
Fig. 3: F1 Score display
According to the analysis of the graphs in Figure 3, the F1 Score results are as follows. The F1-Confidence curve of the YOLOv7 algorithm at 640x640 resolution and after 100 epochs of training is shown in Image 1. From the
graph, it was seen that the model achieved a high F1
score for the 'olive' class. The maximum F1 score
(about 0.93) was obtained at a confidence threshold
of 0.498. The curve rose at lower confidence thresholds and, after reaching the optimum point, decreased with increasing confidence threshold.
This indicated that the model provided a good
balance between the accuracy and scope of the
detections within a given confidence interval.
However, it was understood that after the
confidence threshold of 0.6, the F1 score decreased
rapidly. It was understood that high threshold values increased false negatives and reduced the sensitivity of the model.
The F1-Confidence curve of the YOLOv8l
algorithm obtained at 640x640 resolution and after
50 epochs of training is shown in Image 2. The
model achieved an almost constant high F1 score
(approximately 0.92) for the 'olive' class and this
score reached its maximum at a confidence threshold
of 0.466. The curve is almost a straight line over a
wide confidence interval. This indicated that the
model provided high accuracy and sensitivity even
over a wide confidence interval. The curve decreased sharply at a confidence threshold of around 0.8; from this point onwards, the model's false-negative rate increased significantly and its sensitivity decreased.
In both graphs, the F1 scores of all classes were
also indicated to evaluate the overall performance of
the model. For YOLOv7, the F1 score of all classes
was 0.93, while it was shown as 0.92 for YOLOv8l.
These values indicated that both models showed
high performance and gave good results over a wide
threshold of confidence. For both models, it was
understood that the best performance was achieved
at medium confidence thresholds. It was observed
that higher or lower threshold values negatively
affected performance.
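The threshold behavior described above can be made concrete with a small sketch: given detections paired with confidence scores and a flag for whether each matches a ground-truth olive, precision, recall, and F1 can be recomputed at any confidence threshold. The detection values below are hypothetical and serve only to illustrate the trade-off.

def scores_at_threshold(detections, total_gt, threshold):
    """detections: list of (confidence, is_true_positive) pairs."""
    kept = [d for d in detections if d[0] >= threshold]
    tp = sum(1 for _, correct in kept if correct)
    fp = len(kept) - tp
    fn = total_gt - tp
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / (tp + fn) if total_gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical detections: raising the threshold trades recall for precision
dets = [(0.95, True), (0.90, True), (0.60, True), (0.45, False), (0.30, True)]
for t in (0.3, 0.5, 0.7):
    print(t, scores_at_threshold(dets, total_gt=4, threshold=t))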
According to the analysis of the graphs in Figure 4, the Precision results are as follows. The precision-confidence curve of the
YOLOv7 algorithm for the 'olive' class is shown in
Image 1. The curve observed in the image started
with low accuracy at low confidence thresholds and
increased as the confidence threshold increased.
This indicated that the model showed improvement
in the accuracy of detections within a certain
confidence interval. The accuracy reached a
maximum (1.00) at a confidence threshold value of
0.641, indicating that all of the model's detections at
this threshold were correct. However, accuracy
decreased rapidly as the confidence threshold
increased further. This suggests that at very high
confidence thresholds the model increased false
negatives and therefore made fewer true positive
detections.
(Image 1, Size: 640x640, Batch: 20, Epoch: 100, Algorithm: YOLOv7)
(Image 2, Size: 640x640, Epoch: 50, Algorithm:
YOLOv8l)
Fig. 4: Precision display
The precision-confidence curve for the 'olive' class
of the YOLOv8l algorithm is shown in Image 2. The
curve rose rapidly at low confidence thresholds and
reached maximum accuracy (1.00) at a confidence
threshold of 0.511. It was observed that the curve
almost flattened towards the high confidence
thresholds and remained at this level. This showed
that the YOLOv8l algorithm achieved high
accuracy, especially at intermediate confidence
thresholds, and that the model made true positive
detections even above this threshold, keeping the
number of false positives minimum.
The maximum accuracy values for all classes
for both models are indicated in the graph and are
shown as 1.00 in both cases. This showed that the
model reached the maximum accuracy value when
tested on all classes.
(Image 1, Size: 640x640, Batch: 20, Epoch: 100, Algorithm: YOLOv7)
(Image 2, Size: 640x640, Epoch: 50, Algorithm:
YOLOv8l)
Fig. 5: Recall display
According to the analysis of the graphs in
Figure 5, the Recall result analysis is as follows.
The sensitivity-confidence curve of the YOLOv7
algorithm for the 'olive' class is shown in Image 1.
The sensitivity started at 1.00 (perfect sensitivity) at
low confidence thresholds and decreased as the
confidence threshold increased. This indicated that
the model was able to detect almost all true
positives at low confidence thresholds. However, it
was seen that as the confidence threshold increased,
the model missed some true positive detections and
the sensitivity decreased. In the graph, the curve decreased sharply at a confidence threshold of around 0.8. It was understood that higher threshold values
significantly reduced the sensitivity by increasing
the false negative rate.
The sensitivity-confidence curve of the
YOLOv8l algorithm for the 'olive' class is shown in
Image 2. Similarly, the model showed high
sensitivity at low confidence thresholds. The
sensitivity started to decrease rapidly at a relatively
low confidence threshold (around 0.2) and
continued to decrease as the confidence threshold
increased. In this model, sensitivity decreased
significantly at high confidence thresholds. This
indicated that the model had a high false negative
rate at high confidence thresholds.
In both graphs, the sensitivity scores of all
classes were indicated to evaluate the overall
performance of the model, and in both cases,
excellent sensitivity (1.00) was achieved at the
lowest confidence threshold. This indicated that
both models were able to detect all true positives at
low confidence thresholds.
(Image 1, Size: 640x640, Batch: 20, Epoch: 100, Algorithm: YOLOv7)
(Image 2, Size: 640x640, Epoch: 50, Algorithm:
YOLOv8l)
Fig. 6: Precision-Recall display
According to the analysis of the graphs in Figure 6, the Precision-Recall results are as follows. In the Precision-Recall curve showing the
performance of the YOLOv7 algorithm in Image 1,
it was seen that a very high precision value was
achieved for the 'olive' class. The graph shows that
the model maintained high precision over a wide range of recall. The mAP@0.5 value of the
model for the 'olive' class was found as 0.947,
indicating that the model detected objects with high
precision.
In the Image 2 graph, it was seen that in the
Precision-Recall curve of the YOLOv8l algorithm,
almost perfect accuracy was obtained for the 'olive'
class. The curve in the graph showed that the model
detected true positives with very high accuracy and
the value of mAP@0.5 was 0.981. It was understood
that the classification performance of the model was
outstanding and that it detected very few false
positives.
In both graphs, it was seen that there was a
sharp decrease in accuracy after a certain sensitivity
value. This meant that the model detected all
positive examples, but this caused some false
positives. The graphs showed the overall
performance of all classes while maintaining high
accuracy and sensitivity values for the 'olive' class.
The high mAP@0.5 values of both models indicated
that they performed excellently in object detection.
The graphs showed the overall performance of the
model, as well as how balanced its performance was
for different classes. While both algorithms offered
high accuracy and sensitivity, it was remarkable that
YOLOv8l achieved a very high value of mAP@0.5
even at a lower epoch number. This indicated that
YOLOv8l may be more effective than YOLOv7 in
certain situations.
(Image 1, Size: 640x640, Batch: 20, Epoch: 100, Algorithm: YOLOv7)
(Image 2, Size: 640x640, Epoch: 50, Algorithm:
YOLOv8l)
Fig. 7: Confusion Matrix display
According to the analysis of the graphs in Figure 7, the Confusion Matrix results are as follows. Image 1 showed that the YOLOv7 model
performed almost perfectly for the "olive" class. The
matrix showed a high TP ratio (1.0) for the "olive"
class and a high TN ratio for the "background"
class. This meant that the model detected the "olive"
class almost without error and made almost no false
positive predictions.
In Image 2, the confusion matrix of the
YOLOv8l model again showed a high TP rate for
the "olive" class (0.89), but there was an FP ratio
(0.08) indicating that this model also misclassified
some of the "background" class as "olive". This
meant that although the YOLOv8l model ran with a slightly lower accuracy rate, it still exhibited quite high accuracy.
Both matrices showed the high sensitivity of the
models in detecting the "olive" class and the overall
low false positive rate. However, while the
YOLOv7 model almost completely avoided false
positives in the "background" class, the YOLOv8l
model was somewhat weak in this regard. This
indicated that YOLOv7 was able to discriminate
better than YOLOv8l in this particular case.
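The normalized matrices in Figure 7 can be reproduced from raw counts; for a single-class detector, the "background" entries come from unmatched predictions (false alarms) and missed objects. A minimal sketch, with hypothetical counts chosen to resemble Image 2:

import numpy as np

# Raw counts for a one-class detector; rows = predicted class,
# columns = true class, in the order [olive, background]
tp, fp, fn = 89, 8, 11  # matched boxes, false alarms, missed olives (hypothetical)
matrix = np.array([[tp, fp],
                   [fn, 0]], dtype=float)

# Column-normalize so each true class sums to 1, as in the Figure 7 plots
col_sums = matrix.sum(axis=0, keepdims=True)
normalized = matrix / np.maximum(col_sums, 1e-9)
print(normalized)  # [[0.89 1.  ]
                   #  [0.11 0.  ]]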
(Image 1, Size: 640x640, Batch: 20, Epoch: 100,
Algorithm: YOLOv7)
(Image 2, Size: 640x640, Epoch: 50, Algorithm:
YOLOv8l)
Fig. 8: Loss Function display
According to the analysis of the graphs in Figure 8, the Loss Function results are as follows. The graphs showed how various loss
functions and performance metrics changed during
the training process of the YOLOv7 and YOLOv8l
algorithms. Such graphs are used to understand the
learning process and performance of the model on
training and validation sets.
Image 1 Analysis (YOLOv7)
This series of graphs shows the training process of
the YOLOv7 model trained at 640x640 resolution,
with a batch size of 20 and 100 epochs.
Box Loss: It measures how closely the bounding boxes predicted by the model match the real boxes. A downward trend indicates
that the model starts to predict bounding
boxes more accurately over time.
Objectness Loss: It measures how
accurately the model predicts the presence
of an object. A decrease indicates that the
model better discriminates object presence
over time.
Classification Loss: It measures the
classification accuracy of the model. A
decrease indicates that the model classifies
classes more accurately over time.
Precision and Recall: It shows the accuracy
(how accurate positives are detected) and
sensitivity (how many of all positives are
detected) values of the model. An increase
in both values indicates an improvement in
the model’s performance.
mAP@0.5 and mAP@0.5:0.95: It measures
the average accuracy rates of the model. In
general, an increasing trend indicates that
the model can generalize well over all
classes.
Image 2 Analysis (YOLOv8l)
The second series of graphs shows the training
process of the YOLOv8l model trained at 640x640
resolution and 50 epochs.
Train Box/Class/DFL Loss: Bounding box, classification, and distribution focal loss (DFL) on the training set of the model.
Downward trends indicate that the model
shows improvement during the training
process.
Val Box/Class/DFL Loss: The losses of the
model on the validation set. Decreases
indicate that the model also shows
improvement on the validation set and is not overfitting.
Precision/Recall: Precision and recall of the model on the training set. Stable and
high values indicate the reliability of the
model.
Metrics/mAP50(B): Average accuracy rates
of the model. High mAP values indicate that
the model makes consistent detections over
a wide range.
In both images, it was seen that the performance
and loss functions of the model in the training
process tended to decrease. It was understood that
YOLOv7 had a more stable training process with a
longer training time (100 epochs), while YOLOv8l
showed a high performance in a shorter time (50
epochs). It was observed that both models had high
mAP values and generally showed high accuracy
and sensitivity.
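The curves summarized above can also be regenerated from the artifacts a run leaves behind: Ultralytics YOLOv8 writes a per-epoch results.csv into the run directory during training. The reading sketch below assumes the default output layout; the run path is a placeholder and the exact column names can vary between package versions.

import csv

path = "runs/detect/train/results.csv"  # default YOLOv8 run directory (placeholder)

with open(path, newline="") as f:
    rows = list(csv.DictReader(f))

# Some versions pad the CSV headers with spaces, so strip keys before use
rows = [{k.strip(): v.strip() for k, v in r.items()} for r in rows]
for r in rows[-3:]:  # last few epochs: the loss columns should be trending down
    print(r["epoch"], r["train/box_loss"], r["val/box_loss"], r["metrics/mAP50(B)"])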
3 Research Results
The training process of YOLOv7 was characterized
by generally decreasing loss values. The decrease in
Box, Objectness, and Classification Loss values
indicated an increase in the model's ability to
correctly detect objects within bounding boxes, and
correctly identify and classify object presence. The
increase in the Precision and Recall metrics
indicated that the accuracy and sensitivity of the
model's detections increased. The metrics mAP@0.5
and mAP@0.5:0.95 indicated the reliability and
interclass consistency of the model with high values.
The loss graphs for YOLOv8l showed a similar
trend of improvement, but higher mAP values were
obtained at a shorter number of epochs. This
indicated that the learning capacity and speed of
YOLOv8l was more efficient than YOLOv7.
Precision and Recall values showed that the model
was highly capable of accurately detecting and
classifying objects.
The confusion matrices showed that YOLOv7 achieved an almost perfect classification performance, while YOLOv8l ran with a certain
margin of error. The Precision-Recall curves
showed that both models had high Precision and
Recall values, but the YOLOv8l performed nearly
perfectly even at a low epoch count.
Both models exhibited high performance, but it
was remarkable that YOLOv8l had a tendency to
learn faster and achieved high mAP values with
fewer epochs. This indicated that YOLOv8l may be
preferable, especially when time and computational
resources are limited. Despite the longer training
time, YOLOv7 showed a more stable performance
during the learning process. Both models provided
high accuracy and precision, but YOLOv8l provided
a slight advantage in classification.
Extensive analysis of training processes and
performance metrics showed that both models were
suitable for object detection. However, the
performance of YOLOv8l showed that it was more
efficient, especially in terms of training time and
computational resources.
Figure 9 shows the test result screen, Figure 10 shows the "Validation Batch" prediction markings resulting from the training of the YOLOv7 model, and Figure 11 shows those of the YOLOv8l model.
(Image 1, Size: 640x640, Batch: 20, Epoch: 100,
Algorithm: YOLOv7)
(Image 2, Size: 640x640, Epoch: 50, Algorithm: YOLOv8l)
Fig. 9: Test result screen
Fig. 10: Size: 640x640, Batch: 20, Epoch: 100, Algorithm: YOLOv7; "Validation Batch" prediction markings resulting from the training of the YOLOv7 model
Fig. 11: Size: 640x640, Epoch: 50, Algorithm: YOLOv8l; "Validation Batch" prediction markings resulting from the training of the YOLOv8l model
The training of YOLOv7 and YOLOv8l models
with different epoch numbers affects the learning
dynamics. YOLOv7 was trained for 100 epochs and
YOLOv8l was trained for 50 epochs. During the
training process, it was observed that the loss values
in the YOLOv7 model showed a slower decrease,
whereas the YOLOv8l model showed a faster and
more stable decrease trend. This indicates that the
learning capacity and speed of YOLOv8l is more
effective than YOLOv7. Furthermore, when the
performances of both models on the validation data
are analyzed, it is concluded that YOLOv8l shows
high accuracy and sensitivity despite the lower
number of epochs, while YOLOv7 is not as
effective despite the longer training time.
The hyperparameters (e.g., learning rate, momentum, weight decay) specified in the models' configuration files significantly affect the performance of each model.
The different sets of hyperparameters used in the
YOLOv7 and YOLOv8l models directly affected
the training processes and results of the models. For
example, the higher learning rate applied in the
YOLOv8l model enabled the model to learn faster
and generalize more effectively. In contrast, the
lower learning rate in the YOLOv7 model caused
the model to learn more slowly but more carefully.
These differences were reflected in the object
detection performances as well as the training
processes of the models.
Analyzing the performance of both models
under different lighting conditions and background
scenarios is an important factor in evaluating the
applicability of the models in real-world
applications. In the tests, the YOLOv8l model gave
better results in low-light conditions and images
with complex backgrounds. The YOLOv7 model,
on the other hand, performed better under standard
lighting and less complex backgrounds. This shows
that YOLOv8l is capable of better generalization in
challenging conditions.
The data augmentation techniques used affected
the performance of both models. Traditional data
augmentation techniques (e.g. rotation, translation)
applied in the YOLOv7 model increased the
generalization capability of the model. In the
YOLOv8l model, more advanced data augmentation
techniques (e.g. color space transformations,
geometric transformations) were used, which made
the model more robust to more diverse and
challenging data sets.
The comparison between the different sizes of
the YOLOv8 series (nano, small, medium, large)
shows that each model size offers different
advantages in terms of performance and
computational costs. In particular, the YOLOv8l
model, while offering high accuracy and mAP
values, requires more computational resources. On
the other hand, smaller models such as YOLOv8n
provide adequate performance with fewer
computational resources. This emphasizes the
importance of choosing the model size according to
different application requirements.
When the performance of both models is
evaluated in real-time applications, it is observed
that YOLOv8l is superior in terms of processing
speed and detection accuracy. In particular, the
YOLOv8l model offers faster detection times and
high accuracy rates, while the YOLOv7 model can
make stable detections despite its slower processing
time. These results show that model selection in
real-time applications depends not only on accuracy
rates but also on processing speed. YOLOv8l stands
out as a suitable option for applications requiring
high speed and accuracy, while YOLOv7 can be
preferred in less urgent situations and in scenarios
where stability is more important. The YOLOv7
model detected olives after 640x640 resolution, 20
batch sizes, and 100 epochs of training. The images
included situations with different densities of olives
in various environments. The model detected olives
at a wide confidence interval. In some cases, correct
detections were made with very high confidence
scores (0.9 and above), while lower confidence
scores (e.g. 0.3) were used in some detections.
Detections with low confidence scores can often
indicate false positives or detections made under
challenging conditions. This is important in
evaluating the model's balance of precision and
sensitivity, as well as its ability to generalize
different image conditions.
The YOLOv8l model was tested on similar
images at 640x640 resolution and after 50 epochs of
training. High confidence scores (0.9 and above)
were seen in most of the detections which indicated
that the model was able to make fairly accurate and
precise detections. However, the small number of
detected objects may mean that the model misses
true positives in some cases. This tests the
sensitivity of the model, especially in challenging
scenarios such as difficult lighting conditions or
overlapping objects.
When the outputs of both models were
compared, it was seen that YOLOv8l tended to
make detections with higher confidence scores. This
indicated that the model followed a more stable and
reliable learning path during training and showed a
better generalization performance. YOLOv7, on the
other hand, made detections at a wider confidence
interval, indicating that the model may make more
false positive detections in certain scenarios.
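The confidence behavior discussed above can be probed directly by running the trained weights on new images at different confidence thresholds. The sketch below uses the Ultralytics API; the weight and image paths are placeholders.

from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # trained YOLOv8l weights (placeholder path)

# conf sets the minimum confidence score a box needs to be reported
results = model.predict(source="test_images/", imgsz=640, conf=0.25)

for r in results:
    for box, score in zip(r.boxes.xyxy, r.boxes.conf):
        print([round(float(v), 1) for v in box], round(float(score), 3))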
4 Discussion and Conclusion
The integration of deep learning techniques into
agriculture has attracted great attention in recent
years. Deep learning has been successfully applied
in various areas of agriculture, including smart
agriculture, precision agriculture, and agricultural
data analysis, [22]. In their study on image
description and deep learning in agriculture, [23]
indicated that deep learning made significant
progress in the detection of crop diseases.
Furthermore, [24] highlighted the challenges and
opportunities of applying machine and deep learning
technologies in agriculture by providing insights
regarding the specific challenges encountered in
Nigerian agriculture. The future of deep learning in
agriculture has great potential for advances in
sustainable agricultural practices, precision
agriculture, and improved decision-making. [25]
reflected the rapid development of image processing
and artificial intelligence technologies in recent
years and emphasized the critical role of deep
learning in identifying corn pests. Furthermore, the
potential of deep learning to optimize e-commerce
marketing of agricultural products was highlighted
by [26], pointing out different applications of deep
learning in various agricultural fields. [27],
identified diseases and pests in tomatoes by
applying deep learning methods to identify and
manage crop diseases in agricultural harvesting.
[28], used image segmentation of fruit
characteristics for harvest in their study using deep
learning methods in fruit recognition and evaluation
for agricultural harvesting. [29], used the YOLOv4
deep learning model to locate apples for robotic
apple harvesting. They found that the robotic
harvest detection success rate of the study was
92.9%. [30], detected bunches of grapes on the vine
using the YOLOv4 deep learning model. [31],
performed detection and classification of bell pepper
leaf diseases using artificial neural networks. With
the model they created, they found that the pepper
plant leaf was healthy or bacterial with 99.99%
predictive accuracy. [32], used deep learning
methods for robotic tomato harvesting in their study.
In addition, they determined the cut-off point for the
robotic harvesting of tomatoes by image analysis
method.
[33], conducted a prediction study by planting
seeds of 692 native genotypes in their study entitled
Fruit Yield Prediction in Pepper Using an Artificial
Neural Network. The integration of deep learning
into agriculture offers numerous opportunities for
the development of agricultural practices, improving
crop management, and advancing sustainable
agriculture. However, addressing challenges related
to data quality, model scalability, and geographic
limitations is crucial for the widespread adoption of
deep learning in agriculture. The information
provided by the reviewed academic papers provides
valuable perspectives on the current state and future
potential of deep learning in agriculture. The results
of our study are in parallel with similar research we
have stated. YOLOv7 and YOLOv8 algorithms
were used in the training of the network. A
comparison of the test performance evaluation
results of these algorithms is given in Table 1.
In Table 1, the object detection performances of YOLOv7 and of the different variations of the YOLOv8 model were compared. While the performance of
YOLOv7 after 99 epochs was notable for a
significantly lower F1 score (28.842%), each
version of the YOLOv8 series, especially YOLOv8l
(92.337% F1 score), performed much higher. These
performance measurements are critical in evaluating
the efficiency and reliability of each model in the
object detection task. The low F1 score of the
YOLOv7 model indicated that the model was not
optimal in terms of accuracy and scope of its
detections. In particular, a low Recall rate (19.56%)
indicated that the model missed a large proportion
of the objects that were actually positive. On the
other hand, YOLOv8l's high Precision (96.568%)
and very high Recall (88.462%) rates indicated that
the model showed superior performance in terms of
both accuracy and scope. YOLOv8l's
mAP@0.5:0.95 metric (94.608%) showed that the
model successfully detected even objects with high
levels of difficulty and correctly classified a wide
range of objects.
Table 1. Comparison of Model Performance Values

Model     Epoch   F1 Score   Recall     mAP@0.5:0.95
YOLOv7    99      28.842%    19.56%     62.37%
YOLOv8n   49      95.293%    94.512%    87.223%
YOLOv8s   49      94.469%    100.000%   90.491%
YOLOv8m   49      91.772%    92.308%    90.222%
YOLOv8l   49      92.337%    88.462%    94.608%

Note: The F1 Score values in the table were calculated according to the formula F1 = 2 × (Precision × Recall) / (Precision + Recall).
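As a quick consistency check, the YOLOv8l F1 score in Table 1 can be reproduced from the reported precision (96.568%) and recall (88.462%) using the formula in the note:

precision, recall = 0.96568, 0.88462  # YOLOv8l values from Table 1

f1 = 2 * (precision * recall) / (precision + recall)
print(f"{f1:.3%}")  # 92.337%, matching the F1 score reported in Table 1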
As a result of this comparative analysis, it was
concluded that YOLOv8l had a significant
advantage in object detection performance over
YOLOv7. The high F1 score and mAP@0.5:0.95
values of YOLOv8l model showed that it was a
superior option as an advanced algorithm for object
detection, especially in various and challenging
detection scenarios. The fact that YOLOv7 showed
low performance despite a higher number of epochs perhaps reflected the difficulties or configuration
deficiencies that the model encountered during the
training process. The performance of YOLOv8l
indicated that this model should be preferred in real-
world applications as well as object detection tasks.
In our study, a comprehensive methodology and
analysis approach was adopted in the development and
evaluation of deep learning-based object detection
models. According to the research results, it is
understood that the advantages provided by the
YOLOv8l model will provide a basis for further
optimization and extension of the models in future
research and applications.
References:
[1] Kamaruzaman, A. S. F., Ani, A. I. C., Farid,
M., Bakar, S. J. A., Maruzuki, M. I. F.,
Setumin, S., & Hadi, M. S. (2023). Systematic
literature review: application of deep learning
processing technique for fig fruit detection
and counting. Bulletin of Electrical
Engineering and Informatics, 12(2), 1078-
1091.
[2] Yang, J., Guo, X., Li, Y., Marinello, F.,
Ercisli, S., & Zhang, Z. (2022). A survey of
few-shot learning in smart agriculture:
developments, applications, and challenges.
Plant Methods, 18(1).
[3] Wang, D., Cao, W., Zhang, F., Li, Z., Xu, S.,
& Wang, X. (2022). A review of deep
learning in multiscale agricultural sensing.
Remote Sensing, 14(3), 559.
[4] Ryo, M., Schiller, J., Stiller, S., Palacio, J. C.
R., Mengsuwan, K., Safonova, A., & Wei, Y.
(2022). Deep learning for sustainable
agriculture needs ecology and human
involvement. Journal of Sustainable
Agriculture and Environment, 2(1), 40-44.
[5] Jin, X., Zhu, X., Ji, J., Ma, Y., Xie, X., &
Zhao, B. (2023). Design and research an
online diagnosis platform for tomato seedling
facilities production diseases, Research
Square, p.1-23,
https://doi.org/10.21203/rs.3.rs-3121099/v1.
[6] Derisma, D., Rokhman, N., & Usuman, I.
(2022). Systematic review of the early
detection and classification of plant diseases
using deep learning. IOP Conference Series:
Earth and Environmental Science, 1097(1),
012042. IOP Publishing.
[7] Khairi, D. u., Ahsan, K., Badshah, G., Ali, S.
Z., Raza, S. A., Alqahtani, O., & Shiraz, M.
(2023). Comparison analysis of machine
learning classification on apple fruit quality,
Research Square, p.1-24.
https://doi.org/10.21203/rs.3.rs-3025343/v1.
[8] Habib, R. R. and Fathallah, F. A. (2012).
Migrant women farm workers in the
occupational health literature. Work, 41, 4356-
4362. IOS Press.
[9] Berg, H., Maneas, G., & Engstrom, A. (2018).
A comparison between organic and
conventional olive farming in Messenia,
Greece. Horticulturae, 4(3), 15.
[10] Tariq, S., Hakim, A., Siddiqi, A. A., & Owais,
M. (2022). An image dataset of fruitfly
species (bactrocera zonata and bactrocera
dorsalis) and automated species classification
through object detection. Data in Brief, 43,
108366.
[11] Aman Jain, Jatin Gupta, Somya Khandelwal,
Surinder Kaur. (2021). Vehicle License Plate
Recognition. Fusion: Practice and Applications (FPA), 4(1), 15-21.
[12] Akiki, P. A., Akiki, P. A., Bandara, A. K., &
Yu, Y. (2020). Eud-mars: end-user
development of model-driven adaptive
robotics software systems. Science of
Computer Programming, 200, 102534.
https://doi.org/10.1016/j.scico.2020.102534.
[13] Dave, K. (2018). Comparison of flow-based
versus block-based programming for naive
programmers. Toronto Metropolitan
University, Canada, Thesis.
https://doi.org/10.32920/ryerson.14652246.v2
[14] Delgado, G., Cortés, A., García, S., Loyo, E.,
Berasategi, M., & Aranjuelo, N. (2023).
Methodology for generating synthetic labeled
datasets for visual container inspection.
Transportation Research Part E: Logistics
and Transportation Review, 175, 103174.
[15] Holla, B. A., Pai, M. M. M., Verma, U., &
Pai, R. M. (2023). Enhanced vehicle re-
identification for smart city applications using
zone specific surveillance. IEEE Access, 11,
29234-29249.
[16] Dao, N. D., Le, T. V., Tran, H. T. M.,
Nguyen, Y. T. H., & Duy, T. D. (2022). The
combination of face identification and action
recognition for fall detection. Journal of
Science and Technology Issue on Information
and Communications Technology, 37-44.
[17] Ghourabi, M., Mourad-Chehade, F., &
Chkeir, A. (2023). Eye recognition by yolo for
inner canthus temperature detection in the
elderly using a transfer learning approach.
Sensors, 23(4), 1851.
[18] Lou, H., Duan, X., Guo, J., Liu, H., Bi, L., &
Chen, H. (2023). Dc-yolov8: small size object
detection algorithm based on camera sensor.
Electronics, 12(10), 2323.
https://doi.org/10.3390/electronics12102323.
[19] Dehaerne, E., Dey, B., Esfandiar, H.,
Verstraete, L., Suh, H. S., Halder, S., & De
Gendt, S. (2023). Yolov8 for defect inspection
of hexagonal directed self-assembly patterns:
a data-centric approach. 38th European Mask
and Lithography Conference (EMLC
2023), Dresden, Germany.
[20] Asante, I., Tsun, M. T. K., Jo, H. S., &
McCarthy, C. (2023). Segmentation-based
angular position estimation algorithm for
dynamic path planning by a person-following
robot. IEEE Access, 11, 41034-41053.
[21] González-Santamarta, M. Á., Rodríguez-Lera,
F. J., & Olivera, V. M. (2023). SAILOR:
perceptual anchoring for robotic cognitive
architectures, arXiv preprint arXiv:2303.08204.
https://doi.org/10.48550/arXiv.2303.08204.
[22] Kumar, R., Kumar, P., Aljuhani, A., Islam, A.
K. M. N., Jolfaei, A., & Garg, S. (2023). Deep
learning and smart contract-assisted secure
data sharing for iot-based intelligent
agriculture. IEEE Intelligent Systems, 38(4),
42-51.
[23] Mamat, N., Othman, M. F., Abdulghafor, R.,
Belhaouari, S. B., Normahira, M., & Hussein,
S. F. M. (2022). Advanced technology in
agriculture industry by implementing image
annotation technique and deep learning
approach: a review. Agriculture, 12(7), 1033.
[24] Umar, M.A., Sani, B.M., & Suleiman, U.
(2022). An Overview of Machine and Deep
Learning Technologies Application in
Agriculture: Opportunities and Challenges in
Nigeria. SLU Journal of Science and
Technology,4(12).
[25] Hua, L., Zhang, Y., Zhang, Y., & Zhang, D.
(2023). Maize pests identification based on
improved yolov4-tiny. Fifth International
Conference on Computer Information Science
and Artificial Intelligence (CISAI
2022), Chongqing, China.
[26] Hui, Y., Zheng, Z., & Sun, C. (2022). E-
commerce marketing optimization of
agricultural products based on deep learning
and data mining. Computational Intelligence
and Neuroscience, 2022, 1-11.
[27] Li, J. and Wang, X. (2020). Tomato diseases
and pests detection based on improved yolo
v3 convolutional neural network. Frontiers in
Plant Science, 11.
[28] Ni, X., Li, C., Jiang, H., & Takeda, F. (2020).
Deep learning image segmentation and
extraction of blueberry fruit traits associated
with harvestability and yield. Horticulture
Research, 7(1).
[29] Zhang, L., Jia, J., Gui, G., Hao, X., Gao, W., & Wang, M. (2018). Deep learning based improved classification system for designing tomato harvesting robot. IEEE Access, 6, 67940-67950.
[30] Guo, C., Zheng, S., Cheng, G., Zhang, Y.,
Ding, J. (2023). An improved yolo v4 used for
grape detection in unstructured environment.
Frontiers in Plant Science, 14.
[31] Mustafa, H., Umer, M., Hafeez, U., Hameed,
A., Sohaib, A., Ullah, S., Madni, H. A.
(2022). “Pepper Bell Leaf Disease Detection
and Classification Using Optimized
Convolutional Neural Network”, Multimedia
Tools and Applications, 82(8), 12065–12080.
[32] Bai, Y., Mao, S., Zhou, J., & Zhang, B. (2023). "Clustered Tomato Detection and Picking Point Location Using Machine Learning-Aided Image Analysis for Automatic Robotic Harvesting", Precision Agriculture, 24(2), 727-743.
[33] Gholipoor, M., & Nadali, F. (2019). Fruit
yield prediction of pepper using artificial
neural network. Scientia Horticulturae, 250,
249–253.
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The authors contributed equally to the present research, at all stages from the formulation of the problem to the final findings and solution.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
The authors received no financial support for the research, authorship, and/or publication of this article.
Conflicts of Interest
On behalf of all authors, the corresponding author
states that there is no conflict of interest.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US