to the WDCNN backbone along with a proposed few-shot
learning algorithm. We demonstrate that the model developed
in this paper outperforms existing models in fault diagnosis
performance when only a few samples are available.
This paper is organized as follows. Section 2 describes
few-shot learning, Siamese networks, and LSTM. Section 3
describes the main idea of this paper, few-shot learning based
fault diagnosis. Section 4 describes the experimental procedure,
dataset construction, and experimental results. Finally,
Section 5 presents the conclusion and future work.
2. Related Work

2.1 Few Shot Learning

Few-shot learning was first addressed in the 1980s [9].
Deep learning has been very successful in many fields, but
model performance suffers when the data set is small. Few-shot
learning was proposed to solve this problem: it is a machine
learning paradigm that learns from only a few labeled examples
and thus alleviates the burden of collecting large-scale
supervised data.
Recently, few-shot learning has made great progress in
solving the data shortage problem. Few-shot learning differs
from a typical CNN model. In general deep learning, the data
are split into a training set and a test set, whereas in few-shot
learning they are split into a training set, a support set, and a
query set, and the model performs N-way K-shot classification.
N is the number of classes and K is the number of support
samples per class; task difficulty grows with N and shrinks
with K, so the larger N is, the harder the problem, and the
larger K is, the easier the problem [10], [11], [12].
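To make the N-way K-shot protocol concrete, the following minimal sketch (our illustration, not code from this paper; the name sample_episode is hypothetical) shows how one episode could be drawn from a labeled dataset:

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way=5, k_shot=1, n_query=5):
    """Sample one N-way K-shot episode (support and query sets).

    dataset: list of (signal, label) pairs.
    Returns (support, query), each a list of (signal, label) pairs.
    """
    by_class = defaultdict(list)
    for signal, label in dataset:
        by_class[label].append(signal)

    # Choose N classes, then K support and n_query query samples per class.
    classes = random.sample(list(by_class), n_way)
    support, query = [], []
    for label in classes:
        samples = random.sample(by_class[label], k_shot + n_query)
        support += [(s, label) for s in samples[:k_shot]]
        query += [(s, label) for s in samples[k_shot:]]
    return support, query
```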
2.2 Siamese Network

Siamese networks were first introduced by Bromley and
LeCun in the early 1990s to solve signature verification as
an image matching problem [9]. Fig. 1 shows the Siamese
network. A Siamese network is a type of neural network
architecture used to learn the similarity or dissimilarity between
two inputs. The network consists of two identical
subnetworks that share the same weights and architecture,
hence the name "Siamese". The two inputs are passed through
the subnetworks, and a similarity score is calculated by
comparing their output representations [13], [14].
The subnetworks can be any type of neural network architecture,
such as convolutional neural networks (CNNs), recurrent
neural networks (RNNs), or a combination of several types of
networks. They are trained to learn representations of the
inputs such that similar inputs have similar representations and
dissimilar inputs have different representations.
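As an illustration only (a minimal PyTorch sketch; the encoder, its sizes, and the name SiameseNet are our assumptions, not the architecture proposed in this paper), the weight sharing can be realized as a single encoder applied to both inputs, with a small head turning the embedding difference into a similarity score:

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    """Two weight-sharing branches realized as one encoder used twice."""

    def __init__(self, embed_dim=64):
        super().__init__()
        # A small 1D-CNN encoder; any subnetwork (CNN, RNN, ...) fits here.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(16, embed_dim),
        )
        # Head mapping the element-wise embedding distance to a score.
        self.head = nn.Linear(embed_dim, 1)

    def forward(self, x1, x2):
        z1, z2 = self.encoder(x1), self.encoder(x2)  # shared weights
        # Similarity in (0, 1) from the absolute embedding difference.
        return torch.sigmoid(self.head(torch.abs(z1 - z2))).squeeze(-1)
```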
Siamese networks are commonly used for tasks such as image
or text similarity, one-shot learning, and few-shot learning.
They are typically trained with a pairwise or triplet loss
function, which teaches the network to produce similar outputs
(e.g., 1 for a positive pair) for similar inputs and different
outputs (e.g., 0 for a negative pair) for dissimilar inputs.
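A minimal sketch of such a pairwise objective (our illustration, reusing the hypothetical SiameseNet above; binary cross-entropy on the similarity score, not the loss specified in this paper):

```python
import torch
import torch.nn.functional as F

def pair_loss(model, x1, x2, same_class):
    """Binary cross-entropy on the pairwise similarity score.

    same_class: float tensor of 1s (positive pairs) and 0s (negative pairs).
    """
    score = model(x1, x2)  # similarity in (0, 1)
    return F.binary_cross_entropy(score, same_class)
```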
Fig. 1. Siamese Network Structure
2.3 LSTM

LSTM stands for Long Short-Term Memory and is a type
of recurrent neural network (RNN) proposed by Hochreiter
and Schmidhuber. LSTMs are models that can remember
long-term information, solving one of the limitations of RNNs,
the vanishing gradient problem. They are widely used in fields
such as natural language processing and speech recognition
[15]. Fig. 2 shows the LSTM structure. An LSTM consists of
three gates and a memory cell: the forget, input, and output
gates, and the cell state, which is updated at each step by
combining the previous cell state with new information. The
roles of each component are as follows.
1. Forget Gate: Determines which information in the
previous memory cell is discarded.
2. Input Gate: Adds or modifies information from the
current input value.
3. Output Gate: Determines which information is used to
generate the final output value.
4. Cell State: The memory cell of the LSTM, which stores
and transmits information over the long term.
Fig. 2. LSTM Structure.
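In the standard formulation (the usual LSTM equations, stated here for completeness rather than taken from this paper), the gates and cell-state update at time step $t$ are

\begin{align*}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o), \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh(c_t),
\end{align*}

where $\sigma$ is the logistic sigmoid, $\odot$ denotes element-wise multiplication, $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, $c_t$ is the cell state, and $h_t$ is the hidden state.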
3. WDCNN-LSTM Based Bearing Fault Diagnosis

Our proposed method is a Siamese-network few-shot learning
classification method based on the WDCNN+LSTM model.
Fig. 3 shows the system structure. It consists of a data
preparation step (first), the training and testing process with
few-shot learning (second), and the Siamese network structure
based on the WDCNN+LSTM model (third).
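As a hedged sketch of such a backbone (our illustration only; the layer sizes and the name WDCNNLSTMEncoder are assumptions, not the exact architecture of this paper), a WDCNN-style wide-first-kernel 1D CNN can feed its feature sequence into an LSTM, whose final hidden state serves as the embedding compared by the Siamese network:

```python
import torch
import torch.nn as nn

class WDCNNLSTMEncoder(nn.Module):
    """Sketch of a WDCNN+LSTM embedding backbone (illustrative sizes)."""

    def __init__(self, embed_dim=64):
        super().__init__()
        # WDCNN style: a wide first convolution kernel, then narrow ones.
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=16, padding=24),
            nn.BatchNorm1d(16), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm1d(32), nn.ReLU(), nn.MaxPool1d(2),
        )
        # LSTM over the CNN feature sequence.
        self.lstm = nn.LSTM(input_size=32, hidden_size=embed_dim,
                            batch_first=True)

    def forward(self, x):              # x: (batch, 1, signal_length)
        feats = self.cnn(x)            # (batch, 32, seq_len)
        feats = feats.transpose(1, 2)  # (batch, seq_len, 32)
        _, (h_n, _) = self.lstm(feats)
        return h_n[-1]                 # (batch, embed_dim) embedding
```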
To verify the model performance, 12k drive end