Application of Linear Discriminant Analysis and k-Nearest Neighbors
Techniques to Recommendation Systems
JAVIER BILBAO, IMANOL BILBAO
Applied Mathematics Department,
University of the Basque Country (UPV/EHU),
Bilbao School of Engineering, Pl. Ing. Torres Quevedo, 1, 48013, Bilbao,
SPAIN
Abstract: - Among the different techniques of Machine Learning, we have selected several, such as
SVM, CART, MLP, and kNN, to predict the score of a particular wine and give a recommendation to a user. In
this paper, we present the results of the LDA and kNN techniques, applied to data on Rioja red wines,
specifically wines with the Rioja Qualified Denomination of Origin. Principal Component Analysis was used
beforehand to create a new and smaller set of data, with fewer characteristics, so that these data are easier to
manage, contrast, and interpret. From the results of both classifiers, LDA and kNN, we can conclude that
they can be useful in a recommendation system.
Key-Words: - Machine Learning, recommendation systems, LDA, kNN, Principal Components Analysis,
classification regions.
Received: May 29, 2023. Revised: January 2, 2024. Accepted: January 24, 2024. Published: March 4, 2024.
1 Introduction
Currently, Machine Learning techniques are varied
and are applied to different fields of science. In
addition, they can also be applied to industry. One
of those possible applications is the wine industry
and the field of enology. The production and
consumption of wine in the world is currently of
great importance in certain countries, such as Spain,
Italy, Greece, Chile, France, etc., [1], [2], [3], [4].
Wine has a tradition in society that goes back
thousands of years, integrating itself into the culture
of different societies and forming part of everyday
life in different classes at different levels, [5], [6].
However, it is still the wine experts who generally
mark the quality of wines, also based on laboratory
analysis assessments by authorities and wine
producers, [7], [8].
The personal experience of tasting a wine depends on each person. Even with the same wine,
vintage, and production, and even the same bottle, each person can perceive different nuances and
classify the same wine in different ways. Therefore, it is interesting to be able to predict the rating
that each individual person would give to a wine.
The mathematical models that can be used to make predictions for each person are very diverse. But
if we also want the prediction to be made automatically, statistical techniques based on Machine
Learning (ML) are well suited and are postulated as appropriate tools for, among other things,
generating personalized predictive models automatically, [9], [10]. These techniques allow a
prediction to be made from a database of previously collected data and, in our case, they focus on
the evaluation of wines. In this article, we focus on the prediction of personal evaluation.
By applying ML techniques, it is possible to
predict what would be the rating of a wine given by
a certain person, as long as the data related to that
specific wine is provided (wine characteristics) and
there is also a background of the person's tastes.
Machine Learning techniques are usually divided, according to the type of learning, into two main
groups: unsupervised and supervised, [11].
For unsupervised learning, only the
characteristics that identify the product to be
compared, which in our case is wine, would be
necessary. The score given to the wine in previous
tests is not necessary. Although the effectiveness of these unsupervised techniques may normally be
lower than that of supervised learning techniques, the contribution they make to the study can be
very useful in reducing the number of features that are used. Some of these techniques are Linear
Discriminant Analysis (LDA) and Principal Component Analysis (PCA), [12].
In contrast, in supervised learning, it is necessary to include the results of previous evaluations,
already known, so that the methods can learn from them and train with them. Some of the techniques of this type
of Machine Learning are support vector machines
(SVM) [13], [14], classification and regression trees
(CART) [15], [16], k-nearest neighbor (kNN) [17],
multilayer perceptron (MLP) [18], [19], Naïve
Bayes classifiers (NBC) [20], [21], linear regression
(LR) [22] and logistic regression [23].
They are different techniques that can be more or
less interesting depending on the problem to be
analyzed, its characteristics, the available data, etc.
However, sometimes the analysis can obtain better
results if several of these techniques are combined,
[24], [25].
For the study of wines, one of the characteristics
that is usually chosen to try to classify this product
is the concentration of anthocyanins, [26], [27].
This article aims to explain the applicability of
different Machine Learning techniques to make
meaningful recommendations, individually for each
person, referring to Rioja red wines, specifically
with Rioja Qualified Denomination of Origin
(DOC).
2 Data Sets
As a prior step to the study of Machine Learning
techniques, different sets of wine characteristics
were obtained. These characteristics were the
following:
- the characteristics obtained in the analysis of the wines, which originally totaled 62;
- 21 components derived from anthocyanins;
- the PCA components that express 99%, 95%, and 90% of the variability in the data (16 components in the PCA90 set, 23 components in the PCA95 set, and 37 in PCA99);
- the class-independent Fisher discriminant, with 3 components for each taster;
- the class-dependent Fisher discriminant, with 12 components per taster; and, finally,
- three sets of selected characteristics: 19 characteristics selected from the data of the first taster and 36 characteristics selected from the data of the second taster, these two subsets forming two LDA selections, and a third set, the QDA selection, with a total of 21 features.
3 Principal Component Analysis
Principal Components Analysis (PCA) is a Machine Learning technique that fits within the category of
unsupervised learning. Based on the characteristics of a data set, this technique allows the creation of
a new set of characteristics, smaller in number and, therefore, easier to manage, contrast, and
interpret. To achieve this objective, a linear transformation is applied first, and then a smaller number
of the resulting characteristics is selected, without losing important information for the study, [28], [29].
Focusing briefly on the mathematics behind this
method, the final objective of PCA is to find an
orthogonal matrix that allows a change of
characteristics, from the original ones to a new set,
in such a way that the characteristics of the new set
are not correlated with each other and all of this in
order of decreasing variance. This means that this
set of new characteristics will have a diagonal
covariance matrix and the elements of its main
diagonal will be ordered from largest to smallest.
The variance captured by each component is given by the eigenvalue associated with the i-th
eigenvector. In this way, the average error
committed when approximating the original data
with the new set will coincide with the sum of the
eigenvalues of the components not selected in the
study.
Let X be a random vector of r variables (r-dimensional), each with n observations, which can
be expressed as deviations from the mean or standardized:

X = (X_1, X_2, X_3, \ldots, X_r)   (1)
The steps to apply the algorithm are as follows:
- In the data, taking the characteristics one by one, the arithmetic mean of each is calculated separately, as well as a measure of its variance, such as the standard deviation. Subsequently, each characteristic is normalized.
- Then, using the normalized data, the covariance matrix is calculated:

S = \frac{1}{r} X_{norm}^{T} X_{norm}   (2)

- Then, the eigenvalues of the matrix S and the associated eigenvectors are obtained. The most widely used algorithm for this step is the singular value decomposition (SVD).
- Next, the n eigenvectors of S associated with the largest eigenvalues are taken. Thus, the matrix U_{reduced} is generated.
- Finally, the data in the new vector space are obtained. These are the reduced (transformed) data produced by this technique. For the training data set, we will therefore have X_{reduced} = X_{norm} U_{reduced}.
In this way, the original data are compressed and, in addition, their graphical representation is
usually possible thanks to the reduced number of dimensions.
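As an illustrative sketch only (the calculations in the study were carried out in Matlab R2019b), the steps above can be written in Python with NumPy; the function name pca_reduce, the random example data, and the use of an eigendecomposition instead of an SVD routine are assumptions made for illustration.

import numpy as np

def pca_reduce(X, n_keep):
    # Normalize each characteristic: zero mean and unit standard deviation.
    X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
    # Covariance matrix of the normalized data.
    S = np.cov(X_norm, rowvar=False)
    # Eigenvalues and eigenvectors of the symmetric matrix S.
    eigvals, eigvecs = np.linalg.eigh(S)
    # Order the components by decreasing variance (eigenvalue).
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the eigenvectors associated with the largest eigenvalues.
    U_reduced = eigvecs[:, :n_keep]
    # Project the normalized data onto the new, smaller vector space.
    X_reduced = X_norm @ U_reduced
    # The average approximation error is the sum of the discarded eigenvalues.
    discarded_error = eigvals[n_keep:].sum()
    return X_reduced, U_reduced, discarded_error

# Example usage with random data standing in for the 62 original characteristics.
X = np.random.rand(64, 62)
X_red, U_red, err = pca_reduce(X, n_keep=16)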
4 The Classifiers
The Machine Learning techniques used were the
following: QDA, LDA, NBC (Naïve Bayes), CART,
kNN, Multi-Layer Perceptron (MLP) and
Probabilistic Neural Networks (PNN). Using these
techniques, different families of classifiers were
generated. Matlab R2019b was used to carry out the
calculations and the study.
The mission of a classifier is to correctly assign data, represented by a vector of d characteristics, to
one of c different categories, which have been previously defined. We will use x = (x_1, x_2, \ldots, x_d)
to designate the data and C_1, C_2, \ldots, C_c to designate the categories or classes. Most Machine
Learning techniques search for and assign the category to which the lowest risk is associated if an
error occurs. That is, for a certain data point x, the techniques, in general, first calculate the risk or
consequences of making a wrong decision by incorrectly selecting the category C_i using the expression:

R(C_i \mid x) = \sum_{j=1}^{c} L(C_i, C_j) \, P(C_j \mid x) \propto \sum_{j=1}^{c} L(C_i, C_j) \, P(C_j) \, P(x \mid C_j)   (3)

where L(C_i, C_j) represents the losses if it is decided to classify an element as C_i when in reality it
belongs to class C_j. After that, the techniques choose the C_i that minimizes R(C_i \mid x).
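For illustration, the minimum-risk decision of Eq. (3) can be sketched in Python; the posterior probabilities below are placeholder values, and the loss matrix used for concreteness is the one proposed later in Table 1.

import numpy as np

# Loss matrix L[i, j]: loss of classifying as category i when the real category is j
# (values taken from the loss matrix proposed in Table 1).
L = np.array([[0.0, 2.0, 3.0, 4.0],
              [1.0, 0.0, 1.0, 3.0],
              [1.5, 1.0, 0.0, 2.0],
              [3.5, 2.5, 1.5, 0.0]])

# Placeholder posterior probabilities P(C_j | x) for one sample x.
posteriors = np.array([0.05, 0.15, 0.50, 0.30])

# Conditional risk R(C_i | x) = sum_j L(C_i, C_j) * P(C_j | x), as in Eq. (3).
risks = L @ posteriors

# The chosen category is the one with minimum risk.
decision = int(np.argmin(risks))
print(risks, decision)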
Different expressions can be used for the loss function, depending on how the consequences of a
bad recommendation should be reflected. Normally, the standard loss is used, but two cases tend to
receive more attention: the first is when a sample that has the worst possible classification is
classified as a positive sample; the second is when an ordinary (or even negative) sample is
misclassified as positive.
If what happens is that a sample is classified as
the best of all the samples, when the truth is that it
does not belong to that optimal class, the result
would be recommendations that would be
misleading and that would cause the users to
withdraw their trust in the recommendation system.
This may lead to discontinuation of the
recommender system or even to penalties.
Generally, some degree of distrust is generated,
which is directly proportional to the distance
between the categories, but that degree of distrust is
not always the same. Fundamentally, the user
usually shows more interest in positive
recommendations than in negative
recommendations. We propose for this case the loss
matrix shown in Table 1, where the classification
mistakes are taken into account.
In our case, and because the number of samples
was not large, the classifiers were validated using
Leave One Out (LOO).
Table 1. Loss matrix in which the consequences of the classification mistakes are taken into account.

Classified \ Real | Bad | Medium | Good | Excellent
Bad               | 0   | 2      | 3    | 4
Medium            | 1   | 0      | 1    | 3
Good              | 1.5 | 1      | 0    | 2
Excellent         | 3.5 | 2.5    | 1.5  | 0
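A minimal sketch of the Leave One Out validation mentioned above, written in Python with scikit-learn, could look as follows; the random data and the use of LinearDiscriminantAnalysis as the classifier are assumptions for illustration, not the study's exact pipeline.

import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Placeholder data: 64 samples, 3 features, 4 quality categories (0..3).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = rng.integers(0, 4, size=64)

loo = LeaveOneOut()
predictions = np.empty_like(y)

# Each sample is left out once, the classifier is trained on the remaining
# samples, and the held-out sample is then predicted.
for train_idx, test_idx in loo.split(X):
    clf = LinearDiscriminantAnalysis()
    clf.fit(X[train_idx], y[train_idx])
    predictions[test_idx] = clf.predict(X[test_idx])

accuracy = (predictions == y).mean()
print(f"LOO accuracy: {accuracy:.2%}")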
5 Linear Discriminant Analysis
The Linear Discriminant Analysis (LDA) technique is usually used to classify each of the samples
under the assumption that the probability distribution P(x|C_j) is a multivariate Gaussian with mean
m_j, and that the covariance matrix is the same for all distributions of the different classes. This
covariance matrix is represented as:

\tilde{S}_{intra} = U^{T} S_{intra} U   (4)

with U taken from the class-independent Fisher discriminant algorithm.
The decision is made according to the so-called Fisher discriminant functions:

g_j(x) = \log P(C_j) - \frac{1}{2} (x - m_j)^{T} S_{intra}^{-1} (x - m_j) - \frac{1}{2} \log \left| S_{intra} \right| - \frac{n}{2} \log 2\pi   (5)
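As a sketch under simple assumptions (a pooled within-class covariance matrix and the variable names used below), the discriminant functions of Eq. (5) can be evaluated directly in Python:

import numpy as np

def lda_discriminants(x, means, priors, S_intra):
    # Evaluate Eq. (5) for a sample x: one discriminant value per class.
    # means: (c, n) class centroids, priors: (c,) values of P(C_j),
    # S_intra: (n, n) covariance matrix shared by all classes.
    n = S_intra.shape[0]
    S_inv = np.linalg.inv(S_intra)
    _, logdet = np.linalg.slogdet(S_intra)
    scores = []
    for m_j, p_j in zip(means, priors):
        diff = x - m_j
        g_j = (np.log(p_j)
               - 0.5 * diff @ S_inv @ diff      # Mahalanobis distance term
               - 0.5 * logdet
               - 0.5 * n * np.log(2 * np.pi))
        scores.append(g_j)
    return np.array(scores)

# With standard losses, the predicted class is the one with the largest
# discriminant value; with a non-standard loss matrix, the risks of Eq. (3)
# are computed from the class posteriors before deciding.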
With standard losses, the decision boundaries obtained using this technique are the hyperplanes
equidistant from the centroids of the different classes (Figure 1).
Fig. 1: Decision regions of the not-validated LDA
classifier in the first two independent Fisher
principal components for the data of the first taster
In our case, when we use the proposed loss matrix, we try to make the predictions classified as
positive or negative more reliable by decreasing the regions assigned to these classes (Figure 2).
This way of operating can be interpreted as increasing the safety margin at the boundaries of those
particular regions.
Fig. 2: Decision regions of the not-validated LDA
classifier in the first two independent Fisher
principal components for the data of the first taster
with a non-standard loss matrix
The classifiers validated through LOO that use
the first three principal components of the data of
the first taster obtain an overall accuracy of 90.62%.
All samples in the Truly Bad category have been
correctly classified. However, this technique has had
some problems with the samples rated with the
highest score by the taster. If we take the 20 samples that obtained the highest score, 1 of them was
classified as Good and 2 as Medium; this implies that 15% of the samples that should be
recommended as a priority to a potential user would be lost and would be classified with a lower score.
However, the biggest problem that has been found is
that this technique recommends three samples as
having a maximum score when their true
classification is simply Medium. Something similar
happens with one of the samples in the Good
category, which is also wrongly classified among
the best. This can cause great disappointment to the
user of the system, since 19% of the samples that
the classifier issues as positive recommendations
would not be positive (Figure 3).
Fig. 3: Confusion matrix of the LDA classifier on the first 3 class-independent Fisher principal
components, validated by LOO, for the first taster data with the standard loss matrix
With the data from the second taster, the
accuracy of the validated classifier is even better, at
95.31%. These good results with these classifiers
may be due to the high separability of the classes.
On the other hand, we have verified that when
using PCA to eliminate noise in the data before the
Fisher method, no advantage is obtained.
6 k-Nearest Neighbors
The k-Nearest Neighbors (kNN) algorithm is a
simple algorithm that falls under the category of
supervised learning. It is often used for
classification, [30], [31]. Using this algorithm, the
decision regions that separate each class can be
constructed, without first needing to estimate the
density function.
To build these decision or classification regions,
all the training cases provided are saved by the
algorithm to later compare the distances between the
new sample that we want to classify and every one
of the training cases. Thus, the algorithm obtains an
ordering with the k nearest neighbors. The category
assigned to the new input sample is that which is in
the majority among those k nearest neighbors.
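A minimal kNN classification of this kind can be sketched with scikit-learn; the placeholder data, the choice of k = 4, and the Manhattan metric are illustrative assumptions only.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Placeholder training data standing in for the wine features and their ratings.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(60, 2))
y_train = rng.integers(0, 4, size=60)

# All training cases are stored; a new sample is compared against each of them.
knn = KNeighborsClassifier(n_neighbors=4, metric="manhattan")
knn.fit(X_train, y_train)

# The category of the new sample is the majority class among its k nearest neighbors.
x_new = rng.normal(size=(1, 2))
print(knn.predict(x_new))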
In reference [32], several ideas to build a kNN classifier that supports a non-standard loss matrix are
proposed. If we estimate the probability P(C_j|x) using the number of nearest neighbors in each
category, we can then calculate the risk to finally make our choice. In this case, the relative
frequency of occurrence of each category in the subset of the selected k nearest neighbors is taken
as an estimate of P(C_j|x).
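Following the idea attributed to [32], a sketch of this cost-sensitive kNN decision in Python could look as follows; the loss matrix is the one of Table 1, while the data, the function name, and the parameter values are illustrative assumptions.

import numpy as np
from sklearn.neighbors import NearestNeighbors

# Loss matrix of Table 1: rows are the decided category, columns the real one.
L = np.array([[0.0, 2.0, 3.0, 4.0],
              [1.0, 0.0, 1.0, 3.0],
              [1.5, 1.0, 0.0, 2.0],
              [3.5, 2.5, 1.5, 0.0]])

def cost_sensitive_knn_predict(x_new, X_train, y_train, k=4, metric="euclidean"):
    # Find the k nearest training samples to x_new.
    nn = NearestNeighbors(n_neighbors=k, metric=metric).fit(X_train)
    _, idx = nn.kneighbors(x_new.reshape(1, -1))
    neighbor_labels = y_train[idx[0]]
    # Estimate P(C_j | x) as the relative frequency of each category
    # among the k nearest neighbors.
    posteriors = np.bincount(neighbor_labels, minlength=L.shape[1]) / k
    # Conditional risk of each decision, Eq. (3), and the minimum-risk choice.
    risks = L @ posteriors
    return int(np.argmin(risks))

# Placeholder data: 60 samples with 2 features, rated in 4 categories (0..3).
rng = np.random.default_rng(2)
X_train = rng.normal(size=(60, 2))
y_train = rng.integers(0, 4, size=60)
print(cost_sensitive_knn_predict(rng.normal(size=2), X_train, y_train))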
The number of neighbors to choose and the metric used to measure distances are the parameters of
our design. We preselect the distance metric, which can be Euclidean or another, such as the
Manhattan distance, and we build two different sets of classifiers. Once we preselect the metric, we
reserve a sample to test the classifiers using LOO. To choose the most favorable option, from the
samples that remain unused, we reserve one to validate the classifier and to select the number of
neighbors. The rest of the samples are used as training samples.
Following this procedure, in the case of the first
data set, we obtain 96 different classifiers with 95
validated variations for each one according to the
number of neighbors. Using each of these
variations, we can construct the confusion matrix
and calculate the total losses by applying the loss
matrix over the confusion matrix. Finally, we
proceed to calculate the average of the costs among
the 95 experiments carried out, choosing the optimal
number of neighbors (Figure 4).
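The total loss of each validated variation can be obtained, for instance, by weighting every cell of its confusion matrix by the corresponding entry of the loss matrix and summing; the confusion matrix below is an invented example, not one of the paper's results.

import numpy as np

# Loss matrix of Table 1 (rows: classified as, columns: real category).
L = np.array([[0.0, 2.0, 3.0, 4.0],
              [1.0, 0.0, 1.0, 3.0],
              [1.5, 1.0, 0.0, 2.0],
              [3.5, 2.5, 1.5, 0.0]])

# Invented confusion matrix C[i, j]: number of samples of real category j
# that a validated variation classified as category i.
C = np.array([[10,  1,  0, 0],
              [ 2, 15,  3, 1],
              [ 0,  2, 18, 2],
              [ 0,  0,  1, 8]])

# Total loss of this variation; averaging this value over the validated
# experiments gives the criterion used to choose the optimal number of neighbors.
total_loss = float((L * C).sum())
print(total_loss)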
We want to highlight here that the samples that
had been reserved to test the final classifier have not
been used to obtain this value.
The decision regions (or, in general,
classification regions) are different depending on
whether Euclidean distance classifiers or Manhattan
distance classifiers are used.
Figure 5 shows the different classification
regions in the case of the Manhattan distances for a
classifier with the loss matrix of Table 1 on the first
data set built using only the first two Fisher
principal components. As can be seen, such a distance metric tends to form separation regions
parallel to the axes along which distances are measured.
This vision of the problem shows us that, in this
case, the classifiers with the proposed losses also
tend to increase the regions of the intermediate
categories, such as Medium and Good, reducing at
the extremes, that is, in the classifications called
Bad and Excellent.
This acts again as an increase in the guard zone
when making significant predictions, trying to
discard doubtful cases from the Excellent and Bad
categories and ensuring in some way that the
resulting predictions of those categories are more
reliable. This increase translates into a greater
extension of the regions associated with the
intermediate categories.
Fig. 4: Classification regions using the first two class-independent Fisher principal components for
k=4 with Euclidean distance and standard losses
Fig. 5: Classification regions using the first two
class-independent Fisher principal components for
k=4 along with the Manhattan distance with
proposed losses
Table 2 shows the results obtained after applying the kNN technique to the second data set. These
results can be considered excellent, since the classifiers that use the features in the class-independent
Fisher components reach more than 95% accuracy. The use of different distances or metrics does not
seem too relevant, since the results are very similar, although one of them can provide slight
improvements in the classifiers.
The differences between using one metric or
another are smaller than the margin of error that
arises when validating the data.
Figure 6 and Figure 7 show the confusion
matrices and it can be seen that they coincide for
both types of metrics when the number of neighbors
is optimal.
We also want to highlight that the proposed loss
classifier provides correct recommendations in
100% of the cases, losing only one case that was
originally Excellent in the process.
Fig. 6: Confusion matrix of the kNN classifier
validated using LOO on the second data set, without
losses
Fig. 7: Confusion matrix of the kNN classifier
validated using LOO on the second data set, with
losses
If we consider the first data set, the application of the kNN technique obtains lower results, with a
global accuracy slightly below 89%.
7 Conclusion
This article presents a comparison of various
Machine Learning techniques applied to the
classification of red wines from Rioja.
The novelty focuses on the applicability and also
on the results of the PCA, LDA, and kNN
techniques, comparing the results obtained with
each of these techniques on the same data.
The scores of the wines have been grouped into
four different categories. This has made it easier for
the samples to be classified according to the
opinions of the different tasters.
Furthermore, it has been demonstrated that four
factors are sufficient to characterize the wines, in
this case, red wines from Rioja, to create a
recommendation system. These factors have been:
anthocyanin derivatives, alcoholic content, tannins,
and anthocyanins.
If we take into account the representation spaces
of the samples, there is no significant advantage in
the classifiers if we apply the PCA technique to the
original data. Furthermore, it does not improve the
classification results if the PCA technique is applied
before the Fisher decomposition.
More reliable meaningful predictions can be
made if we use the proposed loss matrix for
classifier generation. In this way, greater accuracy is
obtained per significant category. Unfortunately, it
is necessary to reduce the total number of
recommended wines to achieve this greater
precision, that is, some possible recommendations
must be lost.
When we apply the proposed loss matrix, the
positive and negative predictions are ensured, and at
the same time the regions assigned to these classes
are decreased.
Both classifiers, LDA and kNN, can be useful in
the recommendation system. On the one hand, the
kNN classifier without losses in the LDA
components with 4 neighbors and Euclidean
distance offers the second best classification rate for
wines categorized as Excellent (88.2%). In this
process, only 20% and 25% of the original samples
are lost. On the other hand, the LDA classifier with
standard losses offers an intermediate level: its
success rates are 81% in the Excellent category and
losses of 20% and 15% of the original samples.
Table 2. Design parameters and general results of the kNN classifiers taking into account the second data set

Features                | Distance  | Standard losses: optimum k | Standard losses: accuracy (%) | Proposed losses: optimum k | Proposed losses: accuracy (%)
Originals               | Euclidean | 18 | 35.40 | 13 | 40.63
Originals               | Manhattan | 13 | 37.50 | 13 | 37.50
Anthocyanin derivatives | Euclidean |  4 | 43.75 | 15 | 39.06
Anthocyanin derivatives | Manhattan | 15 | 45.31 | 15 | 40.63
LDA2 Selection          | Euclidean |  3 | 43.75 |  6 | 37.50
LDA2 Selection          | Manhattan |  2 | 48.44 |  2 | 48.44
QDA Selection           | Euclidean | 11 | 43.75 | 18 | 37.50
QDA Selection           | Manhattan | 11 | 42.19 | 11 | 40.63
PCA 99%                 | Euclidean |  2 | 43.75 | 16 | 37.50
PCA 99%                 | Manhattan |  1 | 45.31 |  1 | 45.31
PCA 95%                 | Euclidean |  2 | 43.75 |  2 | 43.75
PCA 95%                 | Manhattan |  2 | 45.31 |  2 | 45.31
PCA 90%                 | Euclidean |  2 | 45.31 |  2 | 45.31
PCA 90%                 | Manhattan |  2 | 46.88 |  2 | 46.88
Fisher's independent    | Euclidean |  5 | 96.88 |  9 | 96.88
Fisher's independent    | Manhattan |  9 | 96.88 | 11 | 96.88
Fisher's dependent      | Euclidean |  2 | 46.88 |  9 | 43.75
Fisher's dependent      | Manhattan |  9 | 60.94 |  8 | 59.38
Acknowledgement:
We would like to thank the work and generosity of
the members of the QAProdNat research group,
especially Dr. Noelia Prieto Perea and Dr. Luis
Ángel Berrueta Simal, whose previous work served
as an invaluable basis for this research, and Dr.
Oihane Elena Albóniga Díez, the catalyst of this
relationship.
References:
[1] G. Vazquez Vicente, V. Martin Barroso, F. J.
Blanco Jimenez, Sustainable tourism,
economic growth and employment—The case
of the wine routes of Spain, Sustainability,
vol. 13, no 13, 2021, pp. 7164.
[2] B. Marco-Lajara, P. Seva-Larrosa, J.
Martínez-Falcó, F. García-Lillo, Wine clusters
and Protected Designations of Origin (PDOs)
in Spain: an exploratory analysis, Journal of
Wine Research, vol. 33, no 3, 2022, pp. 146-
167.
[3] J. P. Torres, J. I. Barrera, M. Kunc, S.
Charters, The dynamics of wine tourism
adoption in Chile, Journal of Business
Research, vol. 127, 2021, pp. 474-485.
[4] C. Yang, C. Menz, H. Fraga, S. Costafreda-
Aumedes, L. Leolini, M. C. Ramos, D.
Molitor, C. van Leeuwen, J. A. Santos,
Assessing the grapevine crop water stress
indicator over the flowering-veraison phase
and the potential yield lose rate in important
European wine regions, Agricultural Water
Management, vol. 261, 2022, pp. 107349.
https://doi.org/10.1016/j.agwat.2021.107349.
[5] J. A. Santos, H. Fraga, A. C. Malheiro, J.
Moutinho-Pereira, L. T. Dinis, C. Correia, M.
Moriondo, L. Leolini, C. Dibari, S.
Costafreda-Aumedes, T. Kartschall, C. Menz,
D. Molitor, J. Junk, M. Beyer, H. R. Schultz,
A review of the potential climate change
impacts and adaptation options for European
viticulture, Applied Sciences, vol. 10, no 9,
2020, pp. 3092.
https://doi.org/10.3390/app10093092.
[6] E. Pijet-Migoń, P. Migoń, Linking wine
culture and geoheritage—Missing
opportunities at European UNESCO World
Heritage sites and in UNESCO Global
Geoparks? A survey of web-based resources,
Geoheritage, vol. 13, no 3, 2021, pp. 71.
[7] V. Santos, P. Ramos, N. Almeida, E. Santos-
Pavón, Developing a wine experience scale: a
new strategy to measure holistic behaviour of
wine tourists, Sustainability, vol. 12, no 19,
2020, pp. 8055.
[8] I. Dos Santos, G. Bosman, J. L. Aleixandre-
Tudo, W. du Toit, Direct quantification of red
wine phenolics using fluorescence
spectroscopy with chemometrics, Talanta,
vol. 236, 2022, pp. 122857.
[9] M. Torrisi, G. Pollastri, Q. Le, Deep learning
methods in protein structure prediction,
Computational and Structural Biotechnology
Journal, vol. 18, 2020, pp. 1301-1310.
[10] F. Huang, Z. Cao, J. Guo, S. H. Jiang, S. Li,
Z. Guo, Comparisons of heuristic, general
statistical and machine learning models for
landslide susceptibility prediction and
mapping, Catena, vol. 191, 2020, pp. 104580.
https://doi.org/10.1016/j.catena.2020.104580.
[11] M. Alloghani, D. Al-Jumeily, J. Mustafina, A.
Hussain, A. J. Aljaaf, A systematic review on
supervised and unsupervised machine learning
algorithms for data science, Supervised and
unsupervised learning for data science, 2020,
pp. 3-21. https://doi.org/10.1007/978-3-030-
22475-2_1.
[12] D. K. Choubey, M. Kumar, V. Shukla, S.
Tripathi, V. K. Dhandhania, Comparative
analysis of classification methods with PCA
and LDA for diabetes, Current diabetes
reviews, vol. 16, no 8, 2020, pp. 833-850.
https://doi.org/10.2174/157339981666620012
3124008.
[13] B. E. Boser, I. M. Guyon and V. N. Vapnik, A
training algorithm for optimal margin
classifiers, Proceedings of the fifth annual
workshop on Computational learning theory -
COLT 92, 1992.
[14] C. Cortes and V. Vapnik, Support-vector
networks, Machine Learning, vol. 20, 1995,
pp. 273-297.
[15] L. Breiman, J. Friedman, C. J. Stone and R.
A. Olshen, Classification and Regression
Trees, Taylor & Francis, 1984.
[16] L. Breiman, Random forests, Machine
learning, vol. 45, 2001, pp. 5-32.
[17] T. Cover and P. Hart, Nearest neighbor
pattern classification, IEEE Transactions on
Information Theory, vol. 13, 1967, pp. 21-27.
[18] D. E. Rumelhart, G. E. Hinton and R. J.
Williams, Learning internal representations
by error propagation, 1985.
[19] B. Widrow and M. A. Lehr, 30 years of
adaptive neural networks: perceptron,
madaline, and backpropagation, Proceedings
of the IEEE, vol. 78, 1990, pp. 1415-1442.
[20] R. O. Duda, P. E. Hart and D. G. Stork,
Pattern Classification, Wiley John & Sons,
2000.
[21] P. Langley, W. Iba, and K. Thompson, An
analysis of Bayesian classifiers, Proceedings
of the Tenth National Conference on Artificial
Intelligence, 1992, pp. 223–228.
[22] D. Maulud, A. M. Abdulazeez, A review on
linear regression comprehensive in machine
learning, Journal of Applied Science and
Technology Trends, vol. 1, no 4, 2020, pp.
140-147. https://doi.org/10.38094/jastt1457.
[23] D. W. Hosmer Jr, S. Lemeshow and R. X.
Sturdivant, Applied logistic regression, John
Wiley & Sons, 2013.
[24] C. El-Hajj, P. A. Kyriacou, A review of
machine learning techniques in
photoplethysmography for the non-invasive
cuff-less measurement of blood pressure,
Biomedical Signal Processing and Control,
vol. 58, 2020, pp. 101870.
[25] M. Elbadawi, S. Gaisford, A. W. Basit,
Advanced machine-learning techniques in
drug discovery, Drug Discovery Today, vol.
26, no 3, 2021, pp. 769-777.
https://doi.org/10.1016/j.drudis.2020.12.003
[26] Y. Ju, L. Yang, X. Yue, Y. Li, R. He, S.
Deng, X. Yang, Y. Fang, Anthocyanin
profiles and color properties of red wines
made from Vitis davidii and Vitis vinifera
grapes, Food Science and Human Wellness,
vol. 10, no 3, 2021, pp. 335-344.
https://doi.org/10.1016/j.fshw.2021.02.025.
[27] A. B. Bautista-Ortín, J. I. Fernández-
Fernández, J. M. López-Roca, E. Gómez-
Plaza, The effects of enological practices in
anthocyanins, phenolic compounds and wine
colour and their dependence on grape
characteristics, Journal of Food Composition
and Analysis, vol. 20, no 7, 2007, pp. 546-
552.
[28] I. Bilbao, J. Bilbao, C. Feniser, A. Borsa,
Practical data mining applied in steel coils
manufacturing, Acta Technica Napocensis-
Series: Applied Mathematics, Mechanics, and
Engineering, vol. 63, no 3, 2020.
[29] Y. M. Sebzalli, X. Z. Wang, Knowledge
discovery from process operational data using
PCA and fuzzy clustering, Engineering
Applications of Artificial Intelligence, vol. 14,
2001. https://doi.org/10.1016/S0952-
1976(01)00032-X.
[30] I. Revilla, S. Pérez-Magariño, M. L.
González-SanJosé and S. Beltrán,
Identification of anthocyanin derivatives in
grape skin extracts and red wines by liquid
chromatography with diode array and mass
spectrometric detection, Journal of
Chromatography A, vol. 847, 1999, pp. 83-90.
https://doi.org/10.1016/S0021-
9673(99)00256-3.
[31] N. Katanić, K. Fertalj, Improving Physical
Security with Machine Learning and Sensor-
Based Human Activity Recognition, WSEAS
Transactions on Information Science and
Applications, vol. 14, pp. 1-9, 2017.
[32] Z. Qin, A. T. Wang, C. Zhang, S. Zhang,
Cost-Sensitive Classification with k-Nearest
Neighbors, Knowledge Science, Engineering
and Management, Springer, Berlin,
Heidelberg, 2013.
https://doi.org/10.1007/978-3-642-39787-
5_10.
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
Conceptualization, J.B. and I.B.; methodology, J.B.
and I.B.; software, I.B.; validation, J.B. and I.B.;
formal analysis, J.B. and I.B.; investigation, J.B. and
I.B.; resources, J.B. and I.B.; data curation, I.B.;
writing—original draft preparation, J.B. and I.B.;
writing—review and editing, J.B. and I.B.;
visualization, J.B.; supervision, J.B.; project
administration, J.B. All authors have read and
agreed to the published version of the manuscript.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors have no conflicts of interest to declare.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en
_US