A Comparative Study of Statistical and Deep Learning Model-Based
Weather Prediction in Albania
MALVINA XHABAFTI, BLERINA VIKA, VALENTINA SINAJ
Department of Statistics and Applied Informatics,
Faculty of Economy, University of Tirana,
Road “Arben Broci”, Tirana, 1001
ALBANIA
Abstract: - Rainfall is one of the most important climate variables and significantly affects sectors such as
agriculture, energy, and industry. Agriculture is one of the sectors most sensitive to climate change because
rainfall directly affects crop production. Forecasting rainfall would therefore help farmers cope with the
increasing occurrence of extreme weather events, plan their farming activities, and reduce costs. In addition,
the circular economy (CE) offers a strategy to support sustainable and regenerative agriculture through the
sustainable management of water resources. This paper aims to determine the best method for forecasting
rainfall, which, because of its unpredictable patterns, is today a major problem for agriculture in Albania.
In this study, rainfall models were developed based on statistical methods, Auto-Regressive Integrated Moving
Average (ARIMA) and Error, Trend & Seasonal (ETS), and on deep learning models, Long Short-Term Memory
Network (LSTM) and Deep Feed-Forward Neural Network (DFNN). The study area is Albania, with data covering
January 1901 to December 2022. The forecast period is January 2023 - December 2024. The performance of each
model was evaluated using Root Mean Square Error (RMSE), and training and validation loss curves were
compared to detect and avoid overfitting in the training phase. The results showed that, between ARIMA and
ETS, ETS has the lower prediction error, while between LSTM and DFNN, DFNN has the lower RMSE and the better
training and validation loss curves. In the final comparison, ETS achieved a lower RMSE than the DFNN model.
ETS was therefore the best model and provided the highest accuracy in precipitation forecasting.
Key-Words: - Rainfalls, Agriculture, ARIMA, ETS, LSTM, DFNN, RMSE.
Received: June 14, 2022. Revised: September 17, 2023. Accepted: October 19, 2023. Available online: November 22, 2023.
1 Introduction
Climate change is an issue that is ranked as a
leading global matter, [1]. It is a very fundamental
factor that affects different sectors of the economy
of a country but in our study, we are going to focus
on the sector of agriculture. Agriculture is one of the
most important sectors of the Albanian economy,
and being so, it is crucial to determine the factors
that directly affect its sustainability. Agriculture
uses more water than any other economic sector
globally, and by adopting a Circular Economy (CE),
agriculture can shift from a linear approach that
turns natural resources into products and waste to
sustainable practices that minimize resource
consumption and waste accumulation, [2]. The
authors of [1] also concluded that addressing the
climate crisis requires a shift to a circular production
and consumption model, and that CE is gaining
worldwide attention as a sustainable alternative for
the future. The practice of
predicting weather conditions serves as a way for
agriculture to adapt to the effects of climate change
and also optimizes agricultural production by
promoting sustainable development while reducing
loss and improving economic results, [3]. To
mitigate the negative impacts of climate change on
agriculture, it is critical to consider both current and
future climate changes while planning strategies for
agricultural production, [3]. In this way, farmers can
make sound and profitable choices, and, by
utilizing science and technology, weather
forecasting can predict upcoming
atmospheric conditions at a specific location
according to their needs, [4].
Of all weather events, rainfall is one of the
crucial meteorological variables that plays an
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.15
Malvina Xhabafti, Blerina Vika, Valentina Sinaj
E-ISSN: 2415-1521
151
Volume 12, 2024
important role in the existence of human beings.
Rainfall is thus an important phenomenon that
affects daily life in various ways, including water
consumption, agriculture, pollution, etc., and its
prediction is of great interest, [5]. Besides being
vital to the economy, it can seriously damage
infrastructure and crops through floods, [6]. Given
the importance of this topic, we forecast rainfall
using machine learning algorithms because they are
appropriate options for modeling and forecasting
meteorological events, [5]. This study aims to build
a univariate rainfall time series model for Albania
for the period January 1901 - December 2022, based
firstly on a statistical approach, ARIMA and ETS,
and then on deep learning algorithms, LSTM and
DFNN, for various numbers of lags and horizons.
The article is organized as follows. Section 2
reviews the literature on the impact of weather
(rainfall) forecasting on agriculture and water
management, on the ongoing transition of
agriculture towards a circular economy, and on
related work involving the statistical and deep
learning models used here to forecast the rainfall
time series. Section 3 presents the study area, the
dataset and its main characteristics, the methods and
algorithms used, and the experimental design. In
Section 4 the results are discussed. Finally, Section
5 presents the conclusions and some lines of future
work. The motivation for this article came from the
fact that in Albania, rainfall is a genuine problem
that causes irreparable damage to crops, among
other consequences. By applying the
aforementioned methods, we aim to contribute to
the Albanian literature in this field, since there are
few or almost no studies dealing with this natural
phenomenon using this type of approach.
2 Literature Review & Related Work
Nowadays, climate change is one of the most
pressing worldwide issues, especially with irregular
and unpredictable rainfall patterns, extended periods
without rain, floods, and other associated
phenomena. It has emerged as the primary catalyst
for issues experienced across various sectors.
Among the most affected ones, agriculture exhibits
a significant dependence on climatic factors and is
predicted to be the sector most influenced by
climate change, [7]. Agricultural drainage, rain and
storm runoff, etc. are a consequence of climate
change, which is making it difficult to access water
for irrigation, negatively affecting agriculture, [8].
Furthermore, precise forecasting of atmospheric
conditions, to avoid or reduce the impact
of a disaster, [9], has become a necessity. Efficient
weather forecasting has the potential to enhance the
agricultural sector's resilience against natural
disasters, minimize damages, steer production, and
enable strategic arrangements for consistent and
increased productivity, [3].
Because agricultural operations are mostly
controlled by rainfall allocations, [7], we focus on
forecasting rainfall. Its significance extends far
beyond its role in agriculture, as it plays a crucial
role in preserving the ecological balance and has
widespread positive implications for the entire
ecosystem, whether directly or indirectly, [10].
Recent theoretical developments have revealed that
rainfall has a direct impact on the sustainable
development of various economic sectors, including
agriculture, and also plays a
significant role in the circular economy, [11]. On the
other hand, a circular economy has the potential to
mitigate water scarcity issues by directing attention
to water resources used in agriculture and
implementing strategies to reduce consumption and
increase water reuse, [2]. Its implementation can
ensure the conservation of resources, stimulate
agricultural productivity and boost the economy,
[12]. To adopt this approach, appropriate
infrastructure is needed so that rainwater can be
collected for further use in agriculture. The authors
of [2] stated that a comprehensive strategy for
managing water resources should be both
interconnected and circular, taking into account
other systems and factors. They also stated that it
is crucial to educate and inform the agricultural
industry and its employees about CE's principles
and benefits, and that its execution needs to be
carried out at the local and even the global level, [2].
Over the past few years, there has been a surge in
the utilization of machine learning algorithms for
simulating atmospheric phenomena due to their
ability to handle large amounts of data, provide a
clear representation of the modeled phenomenon,
and detect patterns or correlations in the data that
aren't readily visible, [5]. The authors of [7] also
stated that Machine Learning can efficiently handle
the difficulties associated with analyzing extensive
amounts of non-linear meteorological data and can
yield considerable benefits in the field of weather
prediction, including better accuracy and faster
results, among other benefits.
Some works have used different machine
learning techniques, mostly deep learning
algorithms, to forecast rainfalls. More specifically,
for our main focus, these are the following studies.
In, [13], a set of methods of different data mining
fields are described and compared to conclude
which methods are the best to predict weather
extreme conditions. In, [14], the air quality model
based on the LSTM and ARIMA was developed for
Malaysia on monthly data from the period of 2017–
2019 using air quality data. In, [5], a set of machine
learning methods was used to make predictions
about the phenomenon of rain in the main cities of
Australia in the last 10 years. In, [15], ARIMA and
Artificial Neural Networks (ANN) were used to
forecast monthly rainfall data from year 1901 to
2015 for different regions across the country of
India. The study in [7] shows the importance of
rainfall in agricultural production in Africa and
develops Machine Learning-based models adapted to
the context of daily weather forecasting in Senegal. In,
[16], a method utilizing an LSTM-based prediction
model with a seq2seq structure was introduced to
estimate hourly rainfall runoff. The model was
tested on two watersheds located in Iowa, to predict
24-hour periods of hourly runoff. In, [17], the
LSTM technique was applied utilizing the
meteorological components of Eastern China's
historical information, specifically for the next
twelve hours following a specified time, to improve
the accuracy of rainfall predictions. In, [18], simple
models for estimating rainfall were developed using
traditional Machine Learning algorithms and Deep
Learning structures, in conjunction with climate
data from five major cities in the United Kingdom
covering the period of 2000 to 2020. In, [9], a
discharge prediction model for flash flood
forecasting in hilly areas has been developed by
using LSTM networks. From 1984
to 2012, the recorded rainfall and discharge
data have been transformed into hourly time series
data. In, [19], two hybrid models utilizing LSTM
networks to accurately predict monthly streamflow
and rainfall patterns were developed. This study
utilized monthly data on streamflow volume
obtained from Cuntan, Hankou, Jinan, and
Wenjiang (Chengdu). In, [6], a rainfall prediction
model with 6 parameters was developed by using
artificial intelligence and LSTM techniques, for
Dhaka city from 2000 to 2014. This study focuses
on developing a system for removing flood damage
effects and improving agriculture using the latest
approach of deep learning in time series forecast
analysis. Lastly, in, [10], the precipitation records
from 2021 and 2022 in Tamil Nadu's Vellore area
were predicted using various machine learning
techniques like ARIMA and ETS, followed by
LSTM on the time-series data. In this research, two
forms of machine learning and deep learning
algorithms are applied to analyze a rainfall dataset
to determine which approach is the most efficient in
forecasting precipitation. When making this
decision, one of the factors that is taken into account
is their scores in terms of performance and
accuracy.
3 Tools and Methodology
3.1 Study Area and Data Analysis
3.1.1 Study Area
Albania is characterized by a subtropical
Mediterranean climate. The geographical features of
the country are mainly defined by its mountainous
terrain, rolling hills, and coastline, which contribute
to the formation of an elaborate system of rivers and
lakes due to its distinct geological and
climatological attributes. The topography of the
country predominantly comprises mountainous
terrain, which is marked by the presence of copious
water resources, a varied range of flora and fauna,
and an extensive shoreline that stretches along the
Adriatic and Ionian Seas. The northern, western, and
southwestern regions of Albania are characterized
by considerably higher precipitation levels
compared to other areas within the country. The
yearly mean precipitation measures 1,430 mm;
however, notable distinctions exist in both the
seasonal and spatial distribution patterns, whereby
the majority of rainfall occurs during the winter
season, [20]. The Albanian Alps exhibit the highest
levels of humidity among various regions. In
November, the precipitation levels reach a climax,
whereas the period from July to August marks a
nadir in this regard, with notably lower amounts,
[21]. This pattern of seasonal variation indicates a
distinct annual cycle in terms of precipitation.
Agriculture is one of the main sources of food and
economic development. It employs over
50 percent of the population and accounts for about
19 percent of the gross domestic product, [21].
The agricultural sector relies heavily on
rainfed cultivation, which is contingent upon the
cyclical precipitation patterns of specific seasons.
3.1.2 Data Analysis
In this paper, we analyze the historical data on the
monthly rainfall (mm) for the period from January
1901 to December 2022 (Figure 1). The data were
collected from the Climate Change Knowledge
Portal, World Bank, [20]. Firstly, we model rainfall
time series using ARIMA and ETS methods. The
rainfall time series was modeled and forecasted
using the Box-Jenkins approach for the ARIMA
method, [22], and the Exponential Smoothing
State-Space model for ETS, [23]. The precipitation
variable has 1464 monthly observations, and the
experiments were carried out using time-series
libraries in the R software.
Fig. 1: Observed average monthly precipitation of
Albania 1901-2021
To forecast rainfall using deep learning models,
we divide the data points into three sets. The
dataset consists of a total of 1464 data points: the
training set (from January 1901 to December 2016),
the validation set (from January 2017 to December
2019), and the test set (from January 2020 to
December 2022). The deep learning models used
for forecasting are LSTM and DFNN, and the steps
followed to put them into practice are explained
below:
Selection of input and output data
As previously mentioned, the months of October
- March are identified as the rainfall season in
Albania. Thus, the present study explores the data
for these 6 months and for all 12 months from 1901
to 2022. There are 732 entries for the 6-month step
and 1464 entries for the 12-month step in the input
and output files. The input parameters are the
average rainfall for the 6 and 12 months of the 122
years from 1901 to 2022. The output parameter is
the average monthly rainfall for the years 2020-2022.
The data used for these models are obtained from
the World Bank website.
Split data: training & testing Initialize model
parameters
The experiments on deep learning models are
implemented in Python. The models are trained with
a subset of the input data, the training data set. The
model types we use, as previously mentioned, are
LSTM and DFNN, each composed of six layers: one
input, one output, and four hidden layers. In the
training phase, the algorithm uses only about 95
percent of the input data: of the 1464 examples,
1392 are used for training, and we keep 36 samples
for validation and 36 samples for testing. The split
follows the sequential nature of the data rather than
random sampling: the first 1392 observations
(January 1901 to December 2016) are used for
training, and the remaining data are equally divided
and used respectively as validation and testing data
sets.
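The sequential split described above can be illustrated with a minimal sketch. It assumes the 1464 monthly values are held in a time-ordered Python list; the function name `chronological_split` and the placeholder data are illustrative and are not taken from the authors' code.

```python
# Chronological train/validation/test split for a monthly series.
# Assumption: the series is already sorted in time (Jan 1901 ... Dec 2022).

def chronological_split(series, n_val=36, n_test=36):
    """Split a time-ordered sequence into three blocks without shuffling."""
    n_train = len(series) - n_val - n_test
    if n_train <= 0:
        raise ValueError("series too short for the requested split")
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test


# Placeholder data: month indices stand in for the rainfall values.
months = list(range(122 * 12))            # 1464 observations
train, val, test = chronological_split(months)
train_share = len(train) / len(months)    # fraction of data used for training
```

Keeping the blocks in time order means the validation and test sets always lie after the training data, which is what an out-of-sample forecast evaluation requires.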
Train and Test Model
The training set is used to train the model so that it
learns and reduces the error value. During the
training phase, we use the validation set to control
the learning process and avoid overfitting. Finally,
after the model is trained, the third set of data is
given to the model and we estimate the out-of-
sample prediction performance. With the results
obtained in this phase, we select the best
deep-learning model. The LSTM and DFNN set aside
about 5% of the input data for testing and
validation: out of 1464 samples, 36 are used for
testing and another 36 are used for validation.
Model evaluation and comparison
In this step, with the model trained and tested, we
generate a graph that compares the actual and
predicted output and gives a clear view of how
accurate the model is.
3.2 Methodology
As previously mentioned, we selected four models
for forecasting the univariate series: two statistical
models, ARIMA and ETS, and two deep learning
algorithms, LSTM and DFNN. Our main interest is
to study the predictive performance of the deep
learning models for various numbers of lags and
horizons, and then to compare them with the
statistical approaches. The number of lagged
observations used as node values for the input layer
of each model is 12, 18, or 24, and the predictive
horizon is 6 or 12 months ahead.
The predictive DL models and the statistical
methods were evaluated based on their performance
on the test dataset using RMSE. Applying the same
RMSE criterion to all forecasts lets us compare the
fluctuations in their errors. Below we
give a brief introduction to ARIMA, ETS, LSTM,
and DFNN.
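To make the lag and horizon setup concrete, the following sketch shows one common way to turn a univariate series into supervised input and target pairs, together with an RMSE function. The helper names `make_windows` and `rmse` are hypothetical, and the exact windowing used by the authors is not stated in the paper, so this is an assumption.

```python
import math


def make_windows(series, n_lags, horizon):
    """Build (input, target) pairs: n_lags past values -> next `horizon` values."""
    pairs = []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        x = series[start:start + n_lags]
        y = series[start + n_lags:start + n_lags + horizon]
        pairs.append((x, y))
    return pairs


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy series: 40 consecutive values, 12 lags, 6-step horizon.
toy = list(range(40))
pairs = make_windows(toy, n_lags=12, horizon=6)
```

With 12 lags and a horizon of 6, each input vector has 12 entries and each target vector has 6, which matches the input and output layer sizes of the 12-...-6 configurations.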
[Fig. 1 plot axes: y-axis Rainfall (mm), ticks 0 to 400; x-axis date, 1/1/1901 to 1/1/2021]
3.2.1 Autoregressive Integrated Moving Average
(ARIMA)
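The Box-Jenkins approach was used to construct
the ARIMA model. The model is denoted as
ARIMA (p, d, q), where p is the autoregressive
order (the number of autoregressive parameters), d
is the degree of differencing, and q is the order of
the moving average, [22], [24]. In backshift
notation, the model can be written as:
$$\phi_p(B)\,(1-B)^{d}\,y_t = c + \theta_q(B)\,\varepsilon_t \qquad (1)$$
Where, $y_t$ is the observed series at time $t$, $B$
is the backshift operator ($B y_t = y_{t-1}$),
$\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}$ is
the autoregressive polynomial,
$\theta_q(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}$
is the moving average polynomial, $c$ is a constant,
and $\varepsilon_t$ is a white-noise error term.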
If the data exhibit seasonal patterns, the
corresponding models are referred to as
Seasonal Autoregressive Integrated Moving
Average (SARIMA) models, denoted (p, d, q)(P, D, Q), [22].
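For reference, the seasonal model can be written in the standard multiplicative backshift form. The symbols below are the conventional textbook ones and are an assumption about notation, since the paper does not print this equation: $B$ is the backshift operator ($B y_t = y_{t-1}$), $s$ is the seasonal period, $\phi_p$ and $\theta_q$ are the non-seasonal autoregressive and moving average polynomials, $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts, $c$ is a constant, and $\varepsilon_t$ is white noise.

```latex
\Phi_P(B^{s})\,\phi_p(B)\,(1-B)^{d}\,(1-B^{s})^{D}\,y_t
  = c + \Theta_Q(B^{s})\,\theta_q(B)\,\varepsilon_t
% Example for monthly data (s = 12) with (p,d,q)(P,D,Q) = (1,0,1)(1,0,0):
% (1 - \Phi_1 B^{12})(1 - \phi_1 B)\,y_t = c + (1 + \theta_1 B)\,\varepsilon_t
```

For monthly rainfall, $s = 12$, so the seasonal terms relate each month to the same month in previous years.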
3.2.2 Error, Trend, Seasonal (ETS)
The ETS model decomposes a given series into four
distinct components, namely the level component,
the trend component (T), the seasonal component
(S), and an error term (E), [23].
The forecast derived by ETS is generated by
computing a weighted average across all the
observations in the time series. The weights
decrease exponentially as the observations get
older, as opposed to the fixed weights used in basic
moving average approaches, [23]. The weights
depend on a constant parameter, commonly referred
to as the smoothing parameter. Exponential
smoothing models thus give the most weight to the
most recent observation, with the weights
decreasing for older observations, [25].
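The exponentially decaying weights can be seen in simple exponential smoothing, the most basic member of the family (level only, no trend or seasonal component). This is a simplified sketch for illustration, not the full ETS state-space model fitted in the study; the function names and the value of `alpha` are assumptions.

```python
def exponential_weights(alpha, n):
    """Weight alpha * (1 - alpha)**k given to the k-th most recent observation."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


def simple_exponential_smoothing(series, alpha):
    """Return the final smoothed level, used as the one-step-ahead forecast."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]                      # initialise the level with the first value
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


rain = [100.0, 80.0, 120.0]                # toy monthly rainfall values in mm
forecast = simple_exponential_smoothing(rain, alpha=0.5)
weights = exponential_weights(0.5, 3)      # most recent observation first
```

With `alpha = 0.5` the most recent value receives weight 0.5, the one before it 0.25, and so on, so older observations fade out geometrically.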
3.2.3 Long short-term Memory (LSTM)
LSTM is a leading deep learning architecture for
forecasting tasks on long-range time-series data: it
incorporates memory structures to retain
information over long spans and offers solutions for
nonlinear time series, [26]. LSTM was created to
overcome traditional neural networks' inability to
link past data with the present in tasks with
long-term dependencies, [16]. The LSTM deviates
from the conventional neural network structure by
incorporating a special type of unit called the
"memory cell". By utilizing a hidden layer
framework, the LSTM network can retain data for
an indefinite period, resulting in a more accurate
time-based model, [17]. The memory cell comprises
a recurrent (loop) connection and three gate
structures, namely the input gate, output gate, and
forget gate. The fundamental concept involves
controlling the gates with a non-linear function to
protect and regulate the memory unit's state, and to
manage whether the flow of information is
amplified or reduced, [17]. LSTM is one of the most
successful techniques for addressing the vanishing
gradient effect, which it mitigates by implementing
three gates along with the hidden state, [18].
The LSTM neural network algorithm exhibits better
predictive performance than conventional neural
networks when utilized to estimate the water depth
in agricultural regions, [9].
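The role of the three gates can be sketched with a single-unit LSTM cell that has scalar input and state. This follows the standard LSTM update equations; the parameter layout (`params` mapping a gate name to an input weight, a recurrent weight, and a bias) and all numeric values are illustrative assumptions, not the networks trained in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with scalar input and state."""

    def pre_activation(name):
        w, u, b = params[name]             # input weight, recurrent weight, bias
        return w * x + u * h_prev + b

    f = sigmoid(pre_activation("forget"))        # share of old cell state to keep
    i = sigmoid(pre_activation("input"))         # share of new candidate to write
    o = sigmoid(pre_activation("output"))        # share of cell state to expose
    g = math.tanh(pre_activation("candidate"))   # candidate cell content
    c = f * c_prev + i * g                       # updated cell (memory) state
    h = o * math.tanh(c)                         # updated hidden state
    return h, c


# Illustrative parameters shared by all gates, run over a short toy sequence.
params = {name: (0.5, 0.1, 0.0)
          for name in ("forget", "input", "output", "candidate")}
h, c = 0.0, 0.0
for x in [0.2, -0.1, 0.4]:
    h, c = lstm_step(x, h, c, params)
```

When the forget gate is close to 1 and the input gate close to 0, the cell state is carried forward almost unchanged, which is how the cell retains information over many time steps.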
3.2.4 Deep Feed Forward Network (DFNN)
DFNN is a type of multi-layer perceptron (MLP),
consisting of a sequence of layers, where
information is directed from the input layer to the
output layer. The architecture of a DFNN model
consists of input, hidden, and output layers. The
input layer stores the input vector of a sample for
each iteration, while the output layer is the final
stage of the DFNN model, where the number of
nodes is set according to the dimension of the
output data, [27]. The learning process for DFNN is
supervised learning, in which the weights are
adjusted during the training phase to reduce the
difference between the real output value and the
model output. The authors of [28] stated that “the
main purpose of DFNN is to learn an abstract
representation of data in a hierarchy by passing the
data through multiple transformation layers where
each layer has several interconnected processing
units”.
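A forward pass through such a network can be sketched in a few lines. The layer widths below are a scaled-down stand-in for the architectures used later in the paper, and the ReLU hidden activation, linear output layer, and random initialisation are assumptions, since the paper does not specify them.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer; weights[j] holds the incoming weights of unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """ReLU on hidden layers, linear output layer (regression setting)."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return dense(x, weights, biases)


rng = random.Random(0)
sizes = [12, 16, 8, 8, 8, 6]   # input, four hidden layers, output (scaled down)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
lagged_input = [float(i) for i in range(12)]
prediction = forward(lagged_input, layers)
```

Information flows strictly from the input layer to the output layer, and the output vector has one entry per forecast step.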
4 Results and Discussions
The analysis of this study is divided into three parts.
The first part is the comparison between the
ARIMA model and ETS. The second part is the
comparison between LSTM and DFNN. The third
part has to do with the comparison of the two best
methods that come out of the first and second parts
of the above analysis.
4.1 Comparison between ARIMA and ETS
Model
To assess the potential of ARIMA and ETS
models for fitting and forecasting rainfall, the
ARIMA model is compared to the ETS model. In
this study, it is found that ARIMA(5,0,1)(2,0,0)[12],
a seasonal model with a period of 12 months, is
suitable to fit the data. The RMSE for the
differences between the predicted values from the
ARIMA model and the actual values of rainfall for
the test data is calculated. The value of RMSE is
38.06956. The ETS model was built and the RMSE
between actual values for variable rainfall in test
data and forecast values found from the ETS model
was calculated. RMSE in this case is 33.89779. The
RMSE and other performance metric values for both
the ARIMA and ETS models are shown in Table 1.
Table 1. Root mean square error (RMSE) and other
performance metrics for the ARIMA and ETS models
Metric    ARIMA model
RMSE      38.06956
MAPE      0.8115115
MPE       72.6696
MASE      -0.001548048
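For reference, the metrics named in Table 1 are usually defined as follows. These are the standard textbook definitions and are given here as an assumption about what the software reports: $y_t$ is the actual value, $\hat{y}_t$ the forecast, $n$ the number of test observations, $T$ the length of the training series, and $m$ the seasonal period used for scaling ($m = 12$ for monthly data, or $m = 1$ for a non-seasonal naive benchmark).

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t-\hat{y}_t\right)^{2}}
\qquad
\mathrm{MPE} = \frac{100}{n}\sum_{t=1}^{n}\frac{y_t-\hat{y}_t}{y_t}
\qquad
\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t-\hat{y}_t}{y_t}\right|
\qquad
\mathrm{MASE} = \frac{\frac{1}{n}\sum_{t=1}^{n}\left|y_t-\hat{y}_t\right|}
                     {\frac{1}{T-m}\sum_{t=m+1}^{T}\left|y_t-y_{t-m}\right|}
```

Under these definitions MAPE and MASE are non-negative, which is worth keeping in mind when reading the values in Table 1.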
A comparison between the ARIMA model and
the ETS model for rainfall was made. The RMSE
value for the ARIMA model (38.06956) is higher
than that of the ETS model (33.89779). The other
metrics shown in Table 1 are also larger for the
ARIMA model than for the ETS model, indicating a
larger prediction error. Therefore, it can be
concluded that the ETS model predicts rainfall
better than the ARIMA model.
4.2 Comparison between LSTM and DFNN
Model
In this part, we determine which of the two deep
learning models, LSTM or DFNN, is best for
forecasting rainfall. We divided the data points into
three sets, the training set (from January 1901 to
December 2016), the validation set (from January
2017 to December 2019), and the test set (from
January 2020 to December 2022), and followed the
steps described in Section 3.1.2 “Data Analysis”.
We used 2 horizons, 6 and 12 months, and 12, 18,
and 24 lags for each horizon. As a result, we
evaluated 6 cases for each model:
– (LSTM and DFNN) 12-1024-512-256-256-6
– (LSTM and DFNN) 18-1024-512-256-256-12
– (LSTM and DFNN) 24-1024-512-256-256-6
– (LSTM and DFNN) 12-1024-512-256-256-12
– (LSTM and DFNN) 18-1024-512-256-256-6
– (LSTM and DFNN) 24-1024-512-256-256-12
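The six cases listed above follow from combining three lag values with two horizons around a fixed stack of hidden layers. The sketch below enumerates them and, as an illustration, counts trainable parameters under the assumption of plain fully connected layers with one bias per unit. That count applies only to a dense DFNN, not to LSTM layers, and the helper names are hypothetical.

```python
from itertools import product

HIDDEN = [1024, 512, 256, 256]   # hidden layer widths shared by all cases


def layer_sizes(n_lags, horizon):
    """Input width = number of lags, output width = forecast horizon."""
    return [n_lags, *HIDDEN, horizon]


def label(sizes):
    return "-".join(str(s) for s in sizes)


def dense_parameter_count(sizes):
    """Weights plus one bias per unit, assuming fully connected layers."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


configs = {}
for n_lags, horizon in product([12, 18, 24], [6, 12]):
    sizes = layer_sizes(n_lags, horizon)
    configs[label(sizes)] = dense_parameter_count(sizes)
```

Under these assumptions the hidden stack dominates the parameter count, so changing the number of lags or the horizon alters the model size only slightly.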
After evaluating the results for the LSTM
algorithm, the best LSTM model is LSTM 12-1024-
512-256-256-6 based on the average RMSE value,
55.234002 shown in Figure 2, for lag 12 and
horizon 6.
As for the DFNN algorithm, DFNN 24-1024-
512-256-256-6 is better than other cases based on
the average RMSE value, of 53.723089. Depending
on the horizon of each model, for horizon 6 the best
model is DFNN 24-1024-512-256-256-6 and for
horizon 12 the best model is DFNN 24-1024-512-
256-256-12. These results are also illustrated in
Figure 2 and Figure 3. Based on the value of RMSE,
the best deep learning model between LSTM and
DFNN for forecasting rainfall in Albania is DFNN
with horizon 6 and 24 lags.
Fig. 2: DFNN and LSTM model ranking for
forecasting rainfalls based on the lowest RMSE at
h=6
Fig. 3: DFNN and LSTM model ranking for
forecasting rainfalls based on the lowest RMSE at
h=12
In addition, we evaluated the models by their
training and validation loss curves, shown in
Figure 4 and Figure 5, to identify the
configurations with the best curves. For the LSTM
model, the best training and validation loss curves
are those of LSTM 12-1024-512-256-256-6, as
shown in Figure 4, and for the DFNN model those
of DFNN 24-1024-512-256-256-6, as shown in
Figure 5.
Fig. 4: Training and validation loss for LSTM
Fig. 5: Training and validation loss for DFNN
In conclusion, we evaluated both algorithms
based on the RMSE metric as well as on their
training and validation loss curves, and the best
model in both comparisons is DFNN
24-1024-512-256-256-6.
4.3 Comparison between ETS and DFNN
Model
Based on the results obtained in the first two parts
above, where we compared the ARIMA model with
the ETS model and the LSTM model with DFNN,
we concluded that the two best models for
forecasting based on the RMSE value are the ETS
and DFNN models. In this section, we make the
final comparison between a statistical model, ETS,
and a deep learning model, DFNN, for the average
rainfall variable. We draw this conclusion based on
the performance metrics used so far. The RMSE
value is 33.89779 for the ETS model and 53.723089
for the DFNN model, so the best forecasting model
based on the RMSE performance metric is the ETS
model. In Figure 6 and Figure 7, where the forecast
graphs for both models are presented, the ETS
model is closer to the real values of the average
rainfall time series. This forecast is made for the
time interval January 2023 - December 2024. In
Figure 7, the values start from January 2010 to give
a clearer view of the forecast for the years
2023-2024.
Fig. 6: Plot for test data and predicted values based
on the ETS model.
Fig. 7: Plot for test data and predicted values based
on DFNN model from 2010
So, using RMSE as the main metric for
comparison, we identified that the best forecasting
method is ETS. Not only from the metric used but
also from the graphic representation of the actual
and predicted values above, we see a more accurate
forecast for the ETS model.
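The comparison across all four models can be summarised with a short calculation that uses only the RMSE values reported in Section 4. The variable names are illustrative, and the deep learning figures are the average RMSE values quoted in the text.

```python
# RMSE values as reported in Section 4 (lower is better).
reported_rmse = {
    "ARIMA": 38.06956,
    "ETS": 33.89779,
    "LSTM": 55.234002,   # best LSTM case: 12 lags, horizon 6
    "DFNN": 53.723089,   # best DFNN case: 24 lags, horizon 6
}

ranking = sorted(reported_rmse, key=reported_rmse.get)
best = ranking[0]

# Relative reduction in RMSE of ETS compared with the best deep learning model.
gap_vs_dfnn = ((reported_rmse["DFNN"] - reported_rmse["ETS"])
               / reported_rmse["DFNN"])
```

On these figures ETS ranks first, ahead of ARIMA, DFNN, and LSTM, and its RMSE is roughly 37% lower than that of the best DFNN configuration.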
5 Conclusion
Rainfall is one of the main climatic elements with
a fundamental impact on several different sectors,
but in our study, we focused on its effect on
agriculture. Since this industry relies on water for
its development and maintenance as a whole, the
provision of this element becomes a necessity.
Also taking into consideration the
developments and changes in the world economy,
where today the focus is moving towards the
circular economy, this approach also takes into
account water as an element and its use in different
industries, but the emphasis in this study is still on
agriculture. From the overall information and
knowledge, we gained from the literature review,
the main idea we formed is that rainfall forecasting
plays a key role in maintaining the ecological
balance as well as in the sustainable development of
various economic sectors, including agriculture, and
also plays an important role in the circular economy.
Beyond documenting the importance of rainfall in
these areas, this study aimed to identify the
forecasting method with the minimum error. We
forecast rainfall with ARIMA and ETS as statistical
methods and with LSTM and DFNN as deep learning
models. The study area is Albania, for the period
January 1901-December 2022, chosen because of the
direct and crucial impact that rainfall has on its
crops. For each model, the corresponding tests were
carried out, and RMSE was the metric used to compare
the forecasting techniques. For the deep learning
models, we also examined the training and validation
loss curves of each model and found that the best
deep learning model was the DFNN with horizon 6 and
lags 24. After comparing the RMSE values and the
agreement between actual and predicted data across
all methods, ETS gave the best results. In other
words, the statistical methods achieved a lower
forecast error than the deep learning models on the
World Bank meteorological dataset used here. In
future work, the comparison could be widened by
adding newer algorithms or by combining several
algorithms in a hybrid method, to see how this would
affect the conclusions of this paper. Including the
other variables that affect rainfall could also give
the study a different direction.
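To make the "lags 24, horizon 6" setting and the RMSE comparison concrete, the following is a minimal sketch rather than the code used in this study. It assumes a plain sliding-window framing in which each input holds 24 consecutive monthly values and each target holds the 6 values that follow. The function names `make_windows` and `rmse`, the synthetic series, and the seasonal-naive baseline are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def make_windows(series, lags=24, horizon=6):
    """Slice a 1-D series into (input, target) pairs.

    Each input row holds `lags` consecutive observations and each target
    row holds the `horizon` observations that immediately follow them.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested lags and horizon")
    X = np.stack([series[i:i + lags] for i in range(n_samples)])
    Y = np.stack([series[i + lags:i + lags + horizon] for i in range(n_samples)])
    return X, Y


def rmse(actual, predicted):
    """Root mean square error between two arrays of the same shape."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Synthetic stand-in for monthly rainfall (assumption, not the World Bank data):
# 122 years x 12 months = 1464 values, January 1901 to December 2022.
rng = np.random.default_rng(0)
months = np.arange(1464)
rainfall = 90 + 40 * np.cos(2 * np.pi * months / 12) + rng.normal(0, 10, size=1464)

X, Y = make_windows(rainfall, lags=24, horizon=6)
print(X.shape, Y.shape)  # (1435, 24) (1435, 6)

# Hypothetical baseline: predict each target month with the value observed
# 12 months earlier, which sits in columns 12..17 of the 24-month input.
seasonal_naive = X[:, -12:-6]
print("seasonal-naive RMSE:", rmse(Y, seasonal_naive))
```

Any fitted model, statistical or neural, can be scored the same way by passing its predictions for `Y` to `rmse`, so that all methods are compared on identical targets.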
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.15
Malvina Xhabafti, Blerina Vika, Valentina Sinaj
E-ISSN: 2415-1521
Volume 12, 2024
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The authors equally contributed to the present
research, at all stages from the formulation of the
problem to the final findings and solution.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors have no conflicts of interest to declare
that are relevant to the content of this article.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US