A Movie Recommendation System Design Using Association Rules

Mining and Classification Techniques

ZAKARIA SULIMAN ZUBI1, ALI A. ELROWAYATI2, IBRAHIM SAAD ABU FANAS3

1Department of Computer Science, Faculty of Science, Sirte University, LIBYA

2 Department of Electronic, College of Industrial Technology, Misurata, LIBYA

3Department of Information Technology, Libyan Academy, Misurata, LIBYA

Abstract: - The importance of recommendation systems is increasing day by day because of the massive amount of data and the information overload arising from the internet. Such data can be collected into datasets, which can then be processed and analysed with data mining methods. In this paper, an efficient hybrid movie recommender system is designed using the association rule mining technique and the K-nearest neighbours (KNN) algorithm as a classification method. The KNN subsystem creates the first candidate list from the MovieLens dataset, which was retrieved from the source of the NetFlix network. The Apriori subsystem analyses the same dataset and creates the second list. Finally, the proposed system creates a final recommendation list by matching the two lists. The results show that the proposed system performs better than the existing systems in terms of the important degree, which in turn gives a better accuracy rate than the existing techniques.

Key-Words: -Recommendation engine, Association Rules Mining, Collaborative Filtering, Apriori algorithm,

Classification.

Received: August 19, 2021. Revised: April 15, 2022. Accepted: May 12, 2022. Published: June 6, 2022.

1 Introduction

Datasets are an important asset these days for many applications worldwide, in scientific, industrial and commercial enterprises alike, where databases store extremely large amounts of data. This tremendous amount of data is of little use, however, until it is analyzed to uncover the hidden information that helps decision makers reach important decisions. A powerful approach for analyzing such data is data mining. Data mining is a method of analyzing data and extracting rules from it, and it also specializes in identifying elements that are related to each other. Association rule mining is one of the popular data mining techniques; it focuses on finding frequent patterns, correlations, associations, or causal structures in various types of data sets.

Movie recommendation systems provide a mechanism that helps group users with similar interests. The authors of [1] used a new approach that solves the sparsity problem to a great extent. The authors of [2] built a recommendation engine in R that analyzes rating data sets collected from Twitter to recommend movies to a specific user. Golbeck and Hendler [3] proposed FilmTrust, a website that integrates Semantic Web-based social networks, augmented with trust, to create predictive movie recommendations. It applies collaborative filtering, generating recommendations that suggest how much a given user may be interested in a movie that the user has already found. The authors of [4][5][6] built recommender systems using the K-means clustering and K-nearest neighbor (KNN) algorithms. Recently, the authors of [7] presented a complete survey of recommendation systems that gives researchers in the domain a platform and a collective discussion of various techniques.

In this paper, we use the Apriori algorithm, an influential algorithm for mining association rules. Association rule mining plays an essential role in rule-based recommendation systems. However, the classic Apriori algorithm has disadvantages as well as advantages. The main downside is that the degree of importance is not measured by the minimum support and confidence. Furthermore, the

WSEAS TRANSACTIONS on COMPUTERS

DOI: 10.37394/23205.2022.21.24

Zakaria Suliman Zubi, Ali A. Elrowayati,

Ibrahim Saad Abu Fanas

E-ISSN: 2224-2872

189

Volume 21, 2022

Apriori algorithm deals only with single-dimensional Boolean association rules [14]. The NetFlix database, however, contains many attributes and therefore calls for multi-dimensional association rules rather than single-dimensional Boolean ones.

Therefore, this paper proposes a solution to these problems by combining the KNN classification algorithm with the Apriori algorithm to increase the accuracy of the recommender system. In addition, it increases the efficiency of the Apriori algorithm in two stages. First, the contents of the subsets are sorted. Second, the ineffective elements, whose presence decreases the efficiency of the system, are removed.

2 Recommender Systems

Recommender systems help users find items that match their preferences. They produce individualized recommendations as output, or guide the user in a personalized way to interesting or useful items within a large space of options [12].

To produce recommendations, these systems need background data, input data and an algorithm. Background data is the information that the system has before it produces any recommendation. Input data is the information that the user communicates to the system in order to obtain recommendations. The algorithm combines the input data and the background data to produce a recommendation. Based on these three elements, Burke [12] distinguished five different recommendation methods:

(1) A collaborative recommender system

collects ratings of items, recognizes

similarities between users based on their

ratings, and produces new recommendations

based on inter-user comparisons.

(2) Content-based recommender systems produce recommendations based on the features associated with an item: they learn a profile of the user's interests from the features present in items that the user has rated before.

(3) A demographic recommender system categorizes users based on personal attributes and finds interesting items based on demographic classes.

(4) Utility-based systems evaluate the match between a user's need and the set of options available: they recommend items based on a computation of the utility of each item for the user.

(5) Knowledge-based recommenders also make

such evaluations, but they have knowledge

about how a particular item meets a

particular user’s need.

Figure 1: The main types of recommendation

system

Hybrid recommender systems combine two or more recommendation methods into one recommender system for better performance. Burke [12] identified the following combination methods:

(1) A weighted hybrid recommender system

calculates the score of a recommended item

from the results of the recommendation

methods that the system uses.

(2) A switching hybrid recommender system uses some criterion to switch between the recommendation methods used in the system.

(3) In a mixed hybrid recommender,

recommendations from the different

recommendation methods are presented

together.

(4) Hybrid recommender systems based on feature combination combine the features of the different recommendation methods in the system and use these features in a single

recommendation algorithm to produce

recommendations.

(5) In a cascade hybrid recommender system,

one recommendation method is used first to

produce a ranking of recommended items

and a second recommendation method

refines this ranking of items.

(6) A hybrid recommender based on feature

augmentation method uses the output of one

recommendation method as input for

another recommendation method used in the

recommender system.

(7) Meta-level hybrid recommenders use the model learned by one recommendation method as input to another recommendation method.

The hybrid movie recommenders proposed in [13] also combined the content-based method with collaborative filtering to achieve higher accuracy. Both methods were based on a naïve Bayesian classifier, and the evaluation of the recommenders combined movie data from IMDb with rating data from Netflix. Symeonidis et al. [13] constructed a feature-weighted user profile to disclose the duality between users and features. Their approach consisted of four steps:

(1) Constructing a content-based user profile

from both collaborative and content

features;

(2) Quantifying the effect of each feature inside the user’s profile and among the users;

(3) Creating the user’s neighborhood by calculating the similarity between each pair of users to provide recommendations;

(4) Providing a Top-N recommendation list for each test user based on the most frequent feature in the user’s neighborhood.

The experiments were performed with the IMDb and MovieLens data sets.

3 Association Rule Mining

In general, association rule mining is the process of finding association rules. An association rule is an expression of the form X → Y, which can be read as “IF X THEN Y”, where X and Y are sets of items in the database. Each rule has associated measures of worthiness, namely support s and confidence c. The support (s) and confidence (c) are calculated as in equations (1), (2) and (3):

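$$\mathrm{Support}(X) = \frac{\text{number of transactions containing } X}{\text{total number of transactions}} \qquad (1)$$

$$\mathrm{Support}(X \rightarrow Y) = \mathrm{Support}(X \cup Y) \qquad (2)$$

$$\mathrm{Confidence}(X \rightarrow Y) = \frac{\mathrm{Support}(X \cup Y)}{\mathrm{Support}(X)} \qquad (3)$$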

For example, suppose that we would like to determine which items are frequently purchased together within the same transactions in a computer firm, and suppose that we have found the following rule:

Contains (T, "computer") → Contains (T, "software")
[Support = 1%; confidence = 50%]

This rule is interpreted as follows: 50% of the transactions T that contain a computer also contain software, and 1% of all transactions T contain both of these items. In our work, association rule mining plays an essential role in the proposed rule-based recommendation system. The association rules are generated by a well-known association rule mining algorithm called the Apriori algorithm.
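As a minimal sketch, the support and confidence measures described above can be computed directly from a list of transactions. The function names and the toy transactions below are illustrative assumptions chosen to reproduce the 1% and 50% figures of the example; they are not taken from the system implemented in this paper.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Support(X u Y) / Support(X) for the rule X -> Y."""
    support_x = support(x, transactions)
    if support_x == 0:
        raise ValueError("the antecedent X never occurs in the transactions")
    return support(frozenset(x) | frozenset(y), transactions) / support_x


# Hypothetical toy data: 200 transactions, 4 contain a computer,
# and 2 of those 4 also contain software.
transactions = (
    [frozenset({"computer", "software"})] * 2
    + [frozenset({"computer"})] * 2
    + [frozenset({"printer"})] * 196
)

rule_support = support({"computer", "software"}, transactions)          # 2 / 200
rule_confidence = confidence({"computer"}, {"software"}, transactions)  # 2 / 4
```

With these toy transactions the rule has a support of 1% and a confidence of 50%, matching the interpretation given in the text.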

4 Apriori Algorithms

The Apriori algorithm was proposed by Agrawal and Srikant in 1994. Apriori [12] is an algorithm for frequent item set mining and association rule learning that is intended to operate on transactional databases. It proceeds by identifying the frequent individual items in the database and extending them to larger and larger item sets as long as those item sets appear sufficiently often in the database. The frequent item sets determined by Apriori can be used to derive association rules that highlight general trends in the database; this has applications in domains such as market basket analysis.

The algorithm is also known as a classic algorithm for learning association rules. It is applied to a database of transactions (e.g. collections of items purchased by customers). It is also easy to execute and very

simple. It is used to mine all frequent item sets in a database. The algorithm makes multiple passes over the database to find frequent item sets, where the frequent k-item sets are used to generate the candidate (k+1)-item sets. A k-item set is frequent only if its support is greater than or equal to the minimum support threshold; item sets whose support has not yet been checked are called candidate item sets. In our proposed work, we use the Apriori algorithm to generate association rules. It first finds the frequent 1-item sets, which contain only one item, by counting each item in the MovieLens dataset. The frequent 1-item sets are used to find the frequent 2-item sets, which in turn are used to find the frequent 3-item sets, and so on until no more frequent k-item sets can be found. If an item set is not frequent, any larger superset of it is also not frequent, and this property is used to prune the search space in the MovieLens dataset. Figure 2 illustrates the flowchart of the Apriori algorithm [10], [11].

Figure 2: Flowchart of Apriori algorithm
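The level-wise search described above can be sketched as follows. This is a simplified illustration of the classic Apriori procedure, not the C# implementation used in this work; the function name `apriori`, the use of an absolute support count, and the toy transactions are assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def apriori(transactions: List[FrozenSet[int]],
            min_support: int) -> Dict[FrozenSet[int], int]:
    """Return every frequent item set with its absolute support count."""
    # Pass 1: count single items and keep the frequent 1-item sets.
    counts: Dict[FrozenSet[int], int] = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unite pairs of frequent (k-1)-item sets into k-item sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: a candidate survives only if all its (k-1)-subsets
        # are frequent (a superset of an infrequent set cannot be frequent).
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one scan of the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy data: each transaction is the set of movie IDs of one user.
baskets = [
    frozenset({50, 172, 181}),
    frozenset({50, 172}),
    frozenset({50, 181}),
    frozenset({172, 181, 222}),
    frozenset({50, 172, 181}),
]
frequent_sets = apriori(baskets, min_support=3)
```

In this toy run the three single movies and the three pairs are frequent, while the triple {50, 172, 181} appears in only two transactions and is discarded, which ends the search.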

5 K-Nearest Neighbours Algorithm

The k-nearest neighbours (KNN) algorithm, also

known as KNN or k-NN, is a non-parametric,

supervised learning classifier, which uses proximity

to make classifications or predictions about the

grouping of an individual data point. While it can be

used for either regression or classification problems,

it is typically used as a classification algorithm,

working off the assumption that similar points can

be found near one another.

For classification problems, a class label is assigned on the basis of a majority vote; that is, the label that is most frequently represented around a given data point is used. While this is technically considered “plurality voting”, the term “majority vote” is more commonly used in the literature. The distinction between these terms is that “majority voting” technically requires a majority of greater than 50%, which primarily works when there are only two categories. When there are multiple classes, for example four categories, 50% of the vote is not necessarily needed to decide on a class; a class label could be assigned with a vote of greater than 25%.
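The difference between plurality and strict majority voting can be illustrated with a short sketch. The function name and the neighbour labels below are hypothetical and serve only to show a winning label that holds less than 50% of the vote.

```python
from collections import Counter


def plurality_label(neighbour_labels):
    """Return the most common label and its share of the vote."""
    counts = Counter(neighbour_labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(neighbour_labels)


# Hypothetical labels of the 7 nearest neighbours, drawn from four classes.
labels = ["Action", "Action", "Action", "Drama", "Drama", "Comedy", "Sci-Fi"]
winner, share = plurality_label(labels)
```

Here the winning class receives 3 of 7 votes, which is more than 25% but less than 50%, so it wins by plurality rather than by a strict majority.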

5.3 Compute KNN Using Distance Metrics

The main goal of the k-nearest neighbour (KNN) algorithm is to identify the nearest neighbours of a given query point, so that we can assign a class label to that point. In order to do this, KNN has a few requirements, which are as follows:

1. Determine your distance metrics

In order to determine which data points are closest to a given query point, the distance between the query point and the other data points needs to be calculated. These distance metrics help to form decision boundaries, which partition query points into different regions. Decision boundaries are commonly visualized with Voronoi diagrams.

2. Euclidean distance

This is the most commonly used distance measure, and it is limited to real-valued vectors. Using formula (4) below, it measures the straight-line distance between the query point x and the other point y being measured, where n is the number of dimensions:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (y_i - x_i)^2} \qquad (4)$$

3. Defining k

The k value in the k-NN algorithm defines how

many neighbours will be checked to determine the

classification of a specific query point. For example,

if k=1, the instance will be assigned to the same

class as its single nearest neighbour. Defining k can be a balancing act, as different values can lead to overfitting or underfitting. Lower values of k can have high variance but low bias, and larger values of k may lead to high bias and lower variance.

The choice of k will largely depend on the input

data as data with more outliers or noise will likely

perform better with higher values of k. Overall, it is

recommended to have an odd number for k to avoid

ties in classification, and cross-validation tactics can

help you choose the optimal k for your dataset.
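The effect of k can be seen in a small sketch of a KNN classifier that uses the Euclidean distance. The helper names, the sample points and their labels are illustrative assumptions; the point (2.0, 2.0) plays the role of a noisy sample that sits closer to class A than to the rest of class B.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Straight-line distance between two real-valued vectors."""
    return math.sqrt(sum((yi - xi) ** 2 for xi, yi in zip(x, y)))


def knn_classify(query, points, labels, k):
    """Label `query` by a plurality vote among its k nearest points."""
    order = sorted(range(len(points)), key=lambda i: euclidean(query, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data in two dimensions.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.0), (4.2, 3.9), (2.0, 2.0)]
labels = ["A", "A", "A", "B", "B", "B"]
query = (2.1, 2.1)

label_k1 = knn_classify(query, points, labels, k=1)  # follows the single noisy point
label_k5 = knn_classify(query, points, labels, k=5)  # smoothed by more neighbours
```

With k = 1 the query takes the label of the single noisy neighbour, whereas with k = 5 the vote is 3 to 2 in favour of class A, which illustrates why noisy data tends to benefit from larger, odd values of k.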

6 The Proposed Movie

Recommendation System

This section explains the parts of the proposed system, shown in Figure 3. The system combines two different techniques: collaborative filtering and association rules. The collaborative filter is based on calculating the similarity between films from the characteristics of the movie type and the average rating. Meanwhile, the association rules are mined with the Apriori algorithm according to the chosen support and confidence.

Figure 3: Proposed Movie Recommendation System

Dataset

In this paper, we used the Netflix Movies dataset, which contains data on the users who watched movies and detailed movie data, together with 100,000 viewer records from 943 users covering 1682 movies.

Preprocessing Data

To increase the efficiency of the Apriori algorithm, a preprocessing stage is applied to the dataset in two sub-stages. First, the contents of the subsets are sorted. Second, the ineffective elements, whose presence decreases the efficiency of the system, are removed.

In the sorting sub-stage, the data are sorted in ascending order and grouped by user. In the redundancy-removal sub-stage, the ineffective elements are removed for each user. The classic Apriori algorithm is slow because it scans every element in the whole dataset on each pass, and unsorted elements cost additional time and effort in the implementation [14]. Therefore, the proposed system sorts the elements and removes those that do not affect the results.
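The two preprocessing sub-stages can be sketched as follows. The paper does not define precisely which elements count as ineffective, so this sketch assumes two cases: duplicate entries for the same user, and movies that appear in fewer user lists than the minimum support and therefore cannot belong to any frequent item set. The function name and the toy records are likewise hypothetical.

```python
from collections import Counter, defaultdict


def preprocess(ratings, min_support):
    """Sort (user_id, movie_id) records, group them by user, drop ineffective movies.

    Returns a dict mapping each user ID to a sorted list of movie IDs.
    """
    # Sorting sub-stage: ascending by user, then by movie, grouped per user.
    grouped = defaultdict(set)
    for user_id, movie_id in sorted(ratings):
        grouped[user_id].add(movie_id)  # a set drops duplicate records

    # Count in how many user lists each movie appears.
    movie_counts = Counter(m for movies in grouped.values() for m in movies)

    # Removal sub-stage (assumption): drop movies below the minimum support.
    return {
        user: sorted(m for m in movies if movie_counts[m] >= min_support)
        for user, movies in grouped.items()
    }


# Hypothetical unsorted records, including one duplicate and one rare movie.
raw = [(2, 50), (1, 181), (1, 50), (3, 50), (2, 181), (3, 1311), (1, 50)]
transactions = preprocess(raw, min_support=2)
```

After this step each user has one sorted, duplicate-free list, and the rare movie 1311 no longer takes part in the scans of the Apriori algorithm.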

Input Movie ID

In this stage, the user selects a movie ID. The system then calculates the similarity between the selected movie and the rest of the movies, both in the part related to the association rules and in the part about

calculating similarity in the recommendation

system.

The collaborative filter subsystem consists of four components as follows:

First, input the number of movies N in the candidate list:

At this stage, we input the number of movies N that the system will propose after applying the recommendation system.

Second, calculate Average Movie Rating:

In the recommendation system, the average rating of each movie is calculated from the ratings given by other users, and the result is stored in the dataset. The data are grouped by movie ID to compute the total number of ratings (each movie's popularity) and the average rating for every movie.

In addition, we determine a list of users similar to a user U in order to calculate the rating R that U would give to a certain item I. As with similarity, this can be done in multiple ways.

We can predict that a user’s rating R for an item I will be close to the average of the ratings given to I by the top 5 or top 10 users most similar to U. The average rating given by n users is:

$$R_U = \frac{1}{n} \sum_{u=1}^{n} R_u \qquad (5)$$

Equation (5) shows that the average rating given by the n similar users is equal to the sum of the ratings given by them divided by the number of similar users, which is n. There will be situations where the n similar users that were found are not equally similar to the target user U. The top 3 of them might be very similar, and the rest might not be as similar to U as the top 3. In that case, we could consider an approach where the rating of the most similar user matters more than that of the second most similar user, and so on. The weighted average can help us achieve that [12].
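The plain average of equation (5) and a weighted alternative can be compared in a short sketch. The weighted form used here, the sum of rating times similarity divided by the sum of similarities, is one common formulation and is an assumption, since the paper does not state its exact formula; the ratings and similarity scores are made-up values.

```python
def average_rating(ratings):
    """Equation (5): sum of the n ratings divided by n."""
    return sum(ratings) / len(ratings)


def weighted_average_rating(ratings, similarities):
    """Average in which each rating is weighted by the rater's similarity to U."""
    total_weight = sum(similarities)
    if total_weight == 0:
        raise ValueError("at least one similarity must be non-zero")
    return sum(r * s for r, s in zip(ratings, similarities)) / total_weight


# Hypothetical ratings of item I by the 3 users most similar to U,
# with hypothetical similarity scores (higher means more similar).
neighbour_ratings = [5.0, 4.0, 2.0]
neighbour_similarities = [0.9, 0.6, 0.1]

plain = average_rating(neighbour_ratings)
weighted = weighted_average_rating(neighbour_ratings, neighbour_similarities)
```

Because the least similar user gave the lowest rating, the weighted prediction (about 4.44) is higher than the plain average (about 3.67), so the opinion of the most similar users dominates.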

Third, Apply Recommendation System

Based on Euclidean Distance

A recommendation system based on the Euclidean distance is applied to calculate the similarity between movies related to the user's preferences, using the characteristics of the movie types (Action, Documentary, Romance, etc.). The Euclidean distance is the familiar distance measure used in 2-dimensional and 3-dimensional geometry. The Euclidean distance r_2(x, y) between two 2-dimensional vectors x = (x_1, x_2)^T and y = (y_1, y_2)^T is given by the following equation:

$$r_2(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} \qquad (6)$$

We define a function that computes the "distance" between two movies based on how similar their genres are and how similar their popularity is. To check that it works, the distance between movie IDs is computed in the next step.

Fourth, generate the N1 recommendation movie list

The recommendation system determines the

movies that are most similar to the user's request

and puts them in a list called N1.
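A possible shape for the movie distance function and the construction of the N1 list is sketched below. It assumes that each movie is described by binary genre flags and a popularity value normalised to the range 0 to 1, and that both are combined in a single Euclidean distance; the paper does not specify these details. The movie IDs are taken from the dataset, but the genre flags and popularity values attached to them are invented for illustration.

```python
import math


def movie_distance(a, b):
    """Euclidean distance over genre flags plus normalised popularity."""
    genre_part = sum((ga - gb) ** 2 for ga, gb in zip(a["genres"], b["genres"]))
    popularity_part = (a["popularity"] - b["popularity"]) ** 2
    return math.sqrt(genre_part + popularity_part)


def nearest_movies(movie_id, movies, n):
    """Return the IDs of the n movies closest to `movie_id` (the list N1)."""
    target = movies[movie_id]
    others = [mid for mid in movies if mid != movie_id]
    others.sort(key=lambda mid: movie_distance(target, movies[mid]))
    return others[:n]


# Hypothetical feature table; genres = (Action, Sci-Fi, Romance) flags.
movies = {
    50:  {"genres": (1, 1, 0), "popularity": 1.00},
    181: {"genres": (1, 1, 0), "popularity": 0.87},
    172: {"genres": (1, 1, 0), "popularity": 0.63},
    222: {"genres": (1, 1, 0), "popularity": 0.40},
    258: {"genres": (0, 1, 0), "popularity": 0.87},
}

n1 = nearest_movies(50, movies, n=3)
```

With these made-up features, the three nearest movies to ID 50 share all genre flags with it and are ordered by how close their popularity is, while movie 258 is pushed out of the list by its differing genre flag.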

In the second part, the association rules subsystem consists of four components as follows:

Part one; select support and confidence terms:

Selecting these values is a prerequisite of the Apriori algorithm.

Part two; apply Apriori to generate association rules using min support and confidence

The Apriori algorithm is used to create association rules between movies, according to the support and confidence specified by the user.

Part three; generate the N2 recommendation movie list

The list of movies to appear, based on the association rules, is defined and called the N2 list.

Part four; create final recommendation list

The two movie lists, N1 from the collaborative recommender subsystem and N2 from the Apriori algorithm, are matched to create the final recommendation list, which is the output of the proposed recommender system.
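One way to read the matching step is as an intersection of the two lists that keeps the order of N1. The sketch below follows that reading, which is an assumption; the function name and the two candidate lists of movie IDs are hypothetical.

```python
def match_lists(n1, n2):
    """Keep the movies of N1 that also appear in N2, preserving N1's order."""
    allowed = set(n2)
    return [movie for movie in n1 if movie in allowed]


# Hypothetical candidate lists of movie IDs.
n1 = [181, 172, 121, 222, 174]  # from the collaborative (KNN) subsystem
n2 = [172, 174, 181, 228]       # from the association rules (Apriori) subsystem

final_list = match_lists(n1, n2)
```

Keeping the order of N1 means the final list stays ranked by similarity to the selected movie, while N2 acts as a filter.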

7 Implementation

The proposed system was implemented using the

C# programming language, to demonstrate the

Apriori algorithm, illustrated in Figure (5), where

the user can specify the required support and

confidence, and the system finds association rules

between the elements in the Netflix dataset.

Figure 5: Apriori algorithm implementation.

Support and confidence values can be modified as desired, and the system then finds association rules between items; this forms the part of the recommendation system that is based on association rules using the Apriori algorithm. The Python language was used to implement the part of the recommendation program based on the KNN algorithm with the Euclidean distance measure, since Python offers many libraries for machine learning, which greatly helped in completing the recommendation system and achieving promising recommendation results.

8 Results and Discussion

This section presents an experimental study of our proposed system. It reports the experimental results and summarizes our observations. The performance of the system is evaluated on the dataset in terms of the important degree.

Several cases were observed to validate our system. First, we applied the Apriori algorithm for several different users on the Netflix dataset with a minimum support of 50 and a confidence value of 60%. This produced a list of the top-ranked movies that users interacted with in the Netflix dataset, as shown in Table 1. Table 1 shows the movies most frequently watched by users, and it demonstrates the ability of the Apriori algorithm to extract the movies that occur frequently in the dataset. However, this list is small: it contains only up to 21 of the movies in the dataset. Thus, we obtained the nearest-neighbour movies to support this list.

Movie ID | Movie Title
         | Toy Story (1995)
         | Twelve Monkeys (1995)
         | Mr. Holland's Opus (1995)
50       | Star Wars (1977)
         | Pulp Fiction (1994)
         | Shawshank Redemption
         | Blade Runner (1982)
         | Terminator 2: Judgment Day (1991)
         | Silence of the Lambs
121      | Independence Day (ID4) (1996)
172      | Empire Strikes Back
173      | Princess Bride
174      | Raiders of the Lost Ark (1981)
181      | Return of the Jedi (1983)
222      | Star Trek: First Contact (1996)
227      | Star Trek VI: The Undiscovered Country (1991)
228      | Star Trek: The Wrath of Khan (1982)
229      | Star Trek III: The Search for Spock (1984)
230      | Star Trek IV: The Voyage Home (1986)
258      | Contact (1997)

Table 1: The top-ranked movies found by the Apriori algorithm

Second, we applied the KNN algorithm with k = 15. Taking the movie entitled "Star Wars (1977)" (movie ID = 50) as an example, the system returns the list of recommended movies shown in Table 2.

Movie Title | Rating
Return of the Jedi (1983) | 4.0
Empire Strikes Back, The (1980) | 4.2
Starship Troopers (1997) | 3.2
Independence Day (ID4) (1996) | 3.4
African Queen, The (1951) | 4.1
Star Trek: First Contact (1996) | 3.6
Jurassic Park (1993) | 3.7
Star Trek: The Wrath of Khan (1982) | 3.8
Raiders of the Lost Ark (1981) | 4.2
Star Trek IV: The Voyage Home (1986) | 3.4
Star Trek III: The Search for Spock (1984) | 3.1
Star Trek VI: The Undiscovered Country (1991) | 3.2
Indiana Jones and the Last Crusade (1989) | 3.9
English Patient, The (1996) | 3.6
Princess Bride, The (1987) | 4.1

Table 2: The list of movies recommended by KNN for Star Wars (1977)

On the other hand, when we use the Apriori algorithm and choose movie ID = 50, entitled "Star Wars (1977)", we find the 10 most related movies based on the Apriori algorithm. Table 3 shows the movies matched by both the Apriori algorithm and KNN.

Movie Title | Rating
Return of the Jedi (1983) | 4.0
Empire Strikes Back, The (1980) | 4.2
Independence Day (ID4) (1996) | 3.4
Star Trek: First Contact (1996) | 3.6
Star Trek: The Wrath of Khan (1982) | 3.8
Raiders of the Lost Ark (1981) | 4.2
Star Trek IV: The Voyage Home (1986) | 3.4
Star Trek III: The Search for Spock (1984) | 3.1
Star Trek VI: The Undiscovered Country (1991) | 3.2
Princess Bride, The (1987) | 4.1

Table 3: The matched movies list for the movie entitled "Star Wars (1977)" using the Apriori algorithm and KNN

The matching ratio was (10/15) = 0.667 when comparing the Apriori list with the KNN list. This result is considered an excellent match, or a high important degree, as shown in Figure 6.

Figure 6: Top-ranked movies list related to Star Wars (1977) using the Apriori algorithm.
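The matching ratio reported above can be recomputed from the titles listed in Table 2 and Table 3. The sketch below assumes that the ratio is the number of matched movies divided by the length of the KNN list, as the figure 10/15 suggests; the function name is illustrative.

```python
# Titles of the 15 KNN recommendations for "Star Wars (1977)" (Table 2).
knn_list = [
    "Return of the Jedi (1983)",
    "Empire Strikes Back, The (1980)",
    "Starship Troopers (1997)",
    "Independence Day (ID4) (1996)",
    "African Queen, The (1951)",
    "Star Trek: First Contact (1996)",
    "Jurassic Park (1993)",
    "Star Trek: The Wrath of Khan (1982)",
    "Raiders of the Lost Ark (1981)",
    "Star Trek IV: The Voyage Home (1986)",
    "Star Trek III: The Search for Spock (1984)",
    "Star Trek VI: The Undiscovered Country (1991)",
    "Indiana Jones and the Last Crusade (1989)",
    "English Patient, The (1996)",
    "Princess Bride, The (1987)",
]

# Titles matched by both the Apriori algorithm and KNN (Table 3).
matched_list = [
    "Return of the Jedi (1983)",
    "Empire Strikes Back, The (1980)",
    "Independence Day (ID4) (1996)",
    "Star Trek: First Contact (1996)",
    "Star Trek: The Wrath of Khan (1982)",
    "Raiders of the Lost Ark (1981)",
    "Star Trek IV: The Voyage Home (1986)",
    "Star Trek III: The Search for Spock (1984)",
    "Star Trek VI: The Undiscovered Country (1991)",
    "Princess Bride, The (1987)",
]


def matching_ratio(knn_titles, matched_titles):
    """Share of the KNN list that also appears in the matched list."""
    shared = set(knn_titles) & set(matched_titles)
    return len(shared) / len(knn_titles)


ratio = matching_ratio(knn_list, matched_list)
```

The same function applies to the other examples in this section once their KNN and matched lists are supplied in the same way.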

Moreover, taking movie ID = 222, entitled "Star Trek: First Contact (1996)", as an example, the KNN list recommended by our proposed system is shown in Table 4.

Movie Title | Rating
Jurassic Park (1993) | 3.7
Star Trek: The Wrath of Khan (1982) | 3.8
Star Trek IV: The Voyage Home (1986) | 3.4
Star Trek III: The Search for Spock (1984) | 3.1
Star Trek VI: The Undiscovered Country (1991) | 3.2
Stargate (1994) | 3.1
Star Trek: The Motion Picture (1979) | 3.0
Star Trek: Generations (1994) | 3.3
Star Trek V: The Final Frontier (1989) | 2.3
Judge Dredd (1995) | 2.8
Time Tracers (1995) | 1.5
Indiana Jones and the Last Crusade (1989) | 3.9
Raiders of the Lost Ark (1981) | 4.2
Men in Black (1997) | 3.7
Starship Troopers (1997) | 3.2

Table 4: The list of movies recommended by KNN for Star Trek: First Contact (1996)

When we compare the KNN list for movie ID = 222 with the top-ranked list from the Apriori algorithm, we find only 5 matching movies related to the movie entitled "Star Trek: First Contact (1996)". Figure 7 shows the top-ranked movies for this movie, and the matched list is shown in Table 5.

Movie Title | Rating
Star Trek: The Wrath of Khan (1982) | 3.8
Star Trek IV: The Voyage Home (1986) | 3.4
Star Trek III: The Search for Spock (1984) | 3.1
Star Trek VI: The Undiscovered Country (1991) | 3.2
Raiders of the Lost Ark (1981) | 4.2

Table 5: The matched movies list for Star Trek: First Contact (1996)

Figure 7: The top-ranked movies list related to the movie entitled "Star Trek: First Contact (1996)"

This matching ratio was (5/15) = 0.333, which is considered a good match between the two lists. Next, we applied KNN to movie ID = 258, entitled "Contact (1997)"; the resulting list of movies is shown in Table 6.

Movie Title | Rating
Twelve Monkeys (1995) | 3.7
Day the Earth Stood Still, The (1951) | 3.9
Until the End of the World (Bis ans Ende der Welt) (1991) | 2.8
Dead Man Walking (1995) | 3.8
Mr. Holland's Opus (1995) | 3.7
Shawshank Redemption, The (1994) | 4.4
One Flew Over the Cuckoo's Nest (1975) | 4.2
Dead Poets Society (1989) | 3.9
Trainspotting (1996) | 3.8
Time to Kill, A (1996) | 3.6
It's a Wonderful Life (1946) | 4.1
Clockwork Orange, A (1971) | 3.9
To Kill a Mockingbird (1962) | 4.2
People vs. Larry Flynt, The (1996) | 3.5
Field of Dreams (1989) | 3.6

Table 6: The list of movies recommended by KNN for the movie "Contact (1997)"

When we compare the list inferred by the KNN algorithm for movie ID = 258, entitled "Contact (1997)", with the Apriori list, we find only 2 movies related to that movie. This shows that the comparison was carried out accurately. Those movies are shown in Table 7, and the important degree ratio is illustrated in Figure 8.

Movie Title | Rating
Twelve Monkeys (1995) | 3.7
Shawshank Redemption, The (1994) | 4.4

Table 7: The matched movies list for the movie entitled "Contact (1997)"

Figure 8: The top-ranked movies list related to the movie entitled "Contact (1997)"

Here the matching ratio was (2/15) = 0.133, which is considered a poor match between the two lists.

The number of top-ranked movies in the Apriori list is therefore very small, which is a disadvantage of using the Apriori algorithm alone in terms of the important degree. We suggest supporting the top-ranked Apriori list by adding the related movies found in the KNN list; in this way the user receives more recommended movies based on the first movie chosen.

Thus we can conclude that a new movie with few views cannot meet the minimum support of the Apriori algorithm. Therefore, the Apriori list can be supported by the nearest-neighbour movies extracted by the KNN technique. For example, the movie entitled "Contact (1997)" has the smallest number of related movies in the Apriori list, as shown in Table 7. We can therefore support this list by adding the nearest-neighbour items from the KNN list in Table 6.
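The suggested support of a short Apriori list by KNN neighbours can be sketched as a simple back-fill. The function name and the target size of 5 are assumptions made for illustration; the titles are the two matched movies of Table 7 and the first five KNN recommendations of Table 6.

```python
def support_with_knn(apriori_list, knn_list, target_size):
    """Extend the Apriori list with KNN neighbours until it has target_size items."""
    combined = list(apriori_list)
    seen = set(combined)
    for movie in knn_list:
        if len(combined) >= target_size:
            break
        if movie not in seen:  # skip movies already in the list
            combined.append(movie)
            seen.add(movie)
    return combined


# Matched movies for "Contact (1997)" (Table 7).
apriori_list = [
    "Twelve Monkeys (1995)",
    "Shawshank Redemption, The (1994)",
]

# First five KNN recommendations for "Contact (1997)" (Table 6).
knn_list = [
    "Twelve Monkeys (1995)",
    "Day the Earth Stood Still, The (1951)",
    "Until the End of the World (Bis ans Ende der Welt) (1991)",
    "Dead Man Walking (1995)",
    "Mr. Holland's Opus (1995)",
]

extended = support_with_knn(apriori_list, knn_list, target_size=5)
```

The Apriori matches stay at the top of the list, and the nearest neighbours fill the remaining places in their KNN order.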

9 CONCLUSIONS

In this paper, an efficient hybrid movie recommender system has been designed using the association rule mining technique and the collaborative filtering technique. The data were taken from the MovieLens dataset granted from Netflix, and the system was implemented in the Python and C# programming languages. The proposed recommendation system applies the KNN algorithm as a classification method and the Apriori algorithm as an association rule mining method. Applying both techniques gives more realistic movie lists for the user to choose from. The results were evaluated in terms of the important degree. The proposed system improves the important degree and gives better accuracy than the existing techniques. Together, KNN and the Apriori algorithm improved the lists of recommended movies so that they are closer to the users' liking, depending on which movie the user selects first. In the future, the proposed system can be further improved using larger datasets. In addition, deep learning techniques are a possible direction for improvement and may enhance the efficiency of the movie recommendation system; in that case the model can be trained and tuned for more situations.

ACKNOWLEDGEMENTS

The authors would like to thank the Department of Computer Science at the Faculty of Science, Sirte

University, Libya; the College of Industrial Technology, Misurata, Libya; and the Department of Information Technology, Libyan Academy, Libya. Furthermore, the authors thank the Ministry of Higher Education, Libya, for its partial financial support.

Contribution of individual authors to the creation of a scientific article (ghostwriting policy)

Zakaria Suliman Zubi carried out the optimization as well as the statistics of the article.

Ali A. Elrowayati carried out the evaluation of the system performance and prepared the statistics of the article results.

Ibrahim Saad Abu Fanas conceived the idea and implemented the algorithms' code in the Python and C# programming languages.

Sources of funding for research presented in a scientific article or scientific article itself

The research work was partially supported by the Ministry of Higher Education, Libya.

Creative Commons Attribution License 4.0 (Attribution 4.0 International, CC BY 4.0)

This article is published under the terms of the Creative Commons Attribution License 4.0

https://creativecommons.org/licenses/by/4.0/deed.en_US
