Single-Channel Blind Separation Using Adaptive Mode Separation-

Based Wavelet Transform and ICA Single-Channel Separation of the

MINA KEMIHA1, ABDELLAH KACHA2

1Automatic Department. 2Electronic Department

Mohammed Seddik BenYahia University

BP 98 Ouled Aissa, 18000

JIJEL, ALGERIA

Abstract: - In this paper, a new method to solve the single-channel blind source separation (SCBSS) problem is proposed. The method combines the Adaptive Mode Separation-Based Wavelet Transform (AMSWT) with ICA-based single-channel separation. First, the amplitude spectrum of the instantaneous mixture signal is obtained via the Fourier transform. Then, the AMSWT is introduced to adaptively extract spectral intrinsic components (SIC), and the variational scaling and wavelet functions are applied to every mode to obtain the time-frequency distribution of the mixed signal. The ICA-based single-channel separation is then applied to the spectral rows corresponding to different time intervals. Finally, the resulting components are grouped using the β-distance of Gaussian distribution. Separation quality is measured objectively with the scale-invariant (SI) parameters and compared with existing methods for the SCBSS problem. Experimental results show that the proposed method achieves better separation performance than the existing methods and is a powerful way to solve the SCBSS problem.

Key-Words: - Single-channel blind source separation, Adaptive Mode Separation-Based Wavelet Transform, Spectral decomposition-based method, β-distance of Gaussian distribution

Received: April 25, 2021. Revised: March 16, 2022. Accepted: April 17, 2022. Published: May 18, 2022.

1 Introduction

Blind signal separation (BSS) consists in separating source signals from mixed signals without any prior information. BSS has a wide range of applications, such as medical imaging and engineering [1-4], image processing and speech recognition [5, 6], speech signal processing [7, 8], communication systems [9], astrophysics [10], automatic transcription or speech and musical instrument identification [11], and mechanical fault detection [12, 13].

In the literature, many approaches have been proposed to solve the BSS problem. The most popular is independent component analysis (ICA). In [14], an algorithm based on phase space reconstruction was proposed. In [15], an algorithm composed of a pseudo-multiple-input multiple-output observation structure and ICA was proposed. In [16], an improved empirical mode decomposition method for blind separation of single-channel vibration signal mixtures was proposed. ICA is characterized by its simplicity and the quality of its results. The technique is based on a linear transformation that finds components in multidimensional mixed data. ICA relies on the hypothesis that the source signals are statistically independent; the components it finds are statistically independent too.

An overview of single-channel source separation methods is presented in [17]. Methods based on a spectral representation of the observed signal are usually known as spectral decomposition-based methods and have been introduced by many authors. In [18], nonnegative matrix factorization (NMF) was applied to the Short Time Fourier Transform (STFT) representation of a single-channel observed signal, but the method requires additional training data. In [19], wavelet transforms and a combination of empirical mode decomposition (EMD) and ICA were proposed, but wavelet transforms require predefined basis functions to represent a signal, while EMD and its improved variants are empirical and lack a complete mathematical foundation [20]. In [21], the Bark scale aligned wavelet packet decomposition was introduced; after the Fourier transform, a Gaussian mixture model (GMM) was used in the separation step. In [22], combinations of various single-channel separation methods, including spectral decomposition-based techniques and model-based methods, were examined.

In [23], a new Adaptive Mode Separation-Based Wavelet Transform (AMSWT) has been

WSEAS TRANSACTIONS on SIGNAL PROCESSING

DOI: 10.37394/232014.2022.18.11

Mina Kemiha, Abdellah Kacha

E-ISSN: 2224-3488

77

Volume 18, 2022

proposed for seismic time-frequency analysis. This time-frequency analysis approach is inspired by the adaptive wavelet bank configuration of the empirical wavelet transform (EWT) [24-26] and by the spectral mode separation idea of variational mode decomposition (VMD) [27]. The AMSWT adaptively extracts spectral intrinsic components by solving a recursive optimization problem. The limited support of every spectral mode is used to obtain the spectral boundaries, and the wavelet bank is then built on these boundaries to highlight the spectral information. The AMSWT is a fully adaptive approach that requires no prior information.

In [28], a new method to solve the SCBSS problem is proposed. The method is applied to the time-frequency representation of a single-channel observed signal. ICA-based single-channel separation is applied to the spectral rows corresponding to different time intervals. The β-distance of Gaussian distribution is used to measure the distance between the time-frequency domain components of the mixed signal obtained by ICA, and these components are finally grouped. The grouping of the components amounts to solving an optimization problem that maximizes the negentropy of the reconstructed constituent signals.

In this paper, a new method is proposed to solve the SCBSS problem. The method combines the AMSWT [23] with the ICA-based single-channel separation method [28]. The time-frequency representation of a signal is considered as a multichannel observed signal that can be separated by ICA. After separation, the statistically independent time-frequency components are grouped using the β-distance of Gaussian distribution.

The performance of the proposed method is tested on real speech sounds chosen from available databases and compared with the results obtained via EMD-based single-channel separation, the wavelet-based single-channel separation introduced in [19], and the single-channel separation of audio signals based on variational mode decomposition (VMD). The quality of the separation results is evaluated using scale-invariant (SI) parameters such as SI-SDR, SI-SAR and SI-SIR, which are particularly recommended for single-channel separation evaluation [29, 30].

The remainder of the paper is organized as follows: the second section gives the SCBSS problem formulation; the third section introduces the adaptive mode separation-based wavelet transform; the fourth section presents the ICA-based single-channel separation method; the fifth section describes the grouping process; the sixth section presents the main steps of the proposed algorithm, its application in the simulation experiments and the comparison with other algorithms; finally, conclusions are given in the seventh section.

2 SCBSS Problem Formulation

A general BSS problem can be mathematically defined as follows. Let $s(t) = [s_1(t), \ldots, s_N(t)]^T$ be a vector of $N$ independent sources at the discrete time instant $t$. The vector $x(t) = [x_1(t), \ldots, x_M(t)]^T$ of the $M$ observed instantaneous mixtures is modeled as follows:

$$x(t) = A\, s(t) \qquad (1)$$

where $A$ is the $(M \times N)$ mixing matrix.

In the literature, BSS problems are mainly classified as linear or nonlinear, instantaneous or convolutive, and overdetermined or underdetermined. For the last classification, when the number of observed signals $M$ is larger than the number of independent sources $N$, the problem is an overdetermined BSS. On the other hand, when the number of observed signals $M$ is smaller than the number of independent sources $N$, the problem is an underdetermined BSS.

In many practical applications, only a one-channel recording is available. This special case of the instantaneous underdetermined source separation problem, termed single-channel source separation (SCSS), is discussed in many papers. For this special case, the conventional source separation methods are not suitable.

In SCSS, the problem can be treated as one observation formed by the instantaneous mixture of several unknown sources:

$$x(t) = \sum_{i=1}^{N} a_i\, s_i(t) \qquad (2)$$

where $i = 1, \ldots, N$ indexes the $N$ sources, $a_i$ are the mixing coefficients, and the goal is to estimate the sources $s_i(t)$ when only the observation signal $x(t)$ is available. In the frequency domain, by applying the short time Fourier transform (STFT), the mixture defined in equation (2) becomes:

$$X(f) = \sum_{i=1}^{N} a_i\, S_i(f) \qquad (3)$$

where $f$ denotes the frequency, $X(f)$ is the Fourier transform of the mixture signal $x(t)$, and $S(f)$ is an $(N \times 1)$ vector whose elements $S_i(f)$ are the Fourier transforms of the source signals $s_i(t)$. Since the separation of the signal is performed


frame by frame, the mixing model of each frame can be written as:

$$X(m, f) = A\, S(m, f) \qquad (4)$$

where $m$ denotes the frame index and $A$ is the mixing matrix, here of size $(1 \times N)$.
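The instantaneous single-channel mixing model of this section can be illustrated with a small numerical sketch. The toy sinusoidal sources, the coefficient values and the helper names `dft` and `mix` are assumptions made only for illustration; the sketch simply checks that, because the mixture is instantaneous and linear, the spectrum of the mixture equals the same weighted sum of the source spectra.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def mix(sources, coeffs):
    """Instantaneous single-channel mixture x(t) = sum_i a_i * s_i(t)."""
    length = len(sources[0])
    return [sum(a * s[t] for a, s in zip(coeffs, sources)) for t in range(length)]

# Hypothetical toy sources and mixing coefficients (not taken from the paper).
n = 64
s1 = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
s2 = [math.sin(2 * math.pi * 7 * t / n) for t in range(n)]
s3 = [math.cos(2 * math.pi * 11 * t / n) for t in range(n)]
a = [0.5, 0.3, 0.2]

x = mix([s1, s2, s3], a)          # single observed channel
X = dft(x)                        # spectrum of the mixture
S = [dft(s) for s in (s1, s2, s3)]

# Weighted sum of the source spectra, bin by bin.
X_from_sources = [sum(a_i * S_i[k] for a_i, S_i in zip(a, S)) for k in range(n)]
max_err = max(abs(u - v) for u, v in zip(X, X_from_sources))
```

The same linearity holds frame by frame for the STFT, which is why the per-frame model keeps the form of the time-domain mixture.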

In the original EMD description [31], a mode is defined as a signal whose numbers of local extrema and zero-crossings differ at most by one. In more recent related works, the definition is replaced by that of the so-called Intrinsic Mode Functions (IMF), based on modulation criteria [31, 20].

3 Adaptive Mode Separation-Based

Wavelet Transform

The wavelet $\psi(t)$ is a zero-mean function localized jointly in time and frequency. The family of wavelets $\psi_{a,b}(t)$ generated from the mother wavelet is defined as follows:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t - b}{a}\right) \qquad (5)$$

where $b$ and $a$ denote the translation and dilation parameters, respectively.

The wavelet transform is the inner product between the family of wavelets $\psi_{a,b}(t)$ and the signal $s(t)$:

$$W_s(a, b) = \langle s(t), \psi_{a,b}(t) \rangle = \int s(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \qquad (6)$$

The AMSWT performs the time-frequency analysis by applying variational scaling and wavelet functions to every mode. The method relies on the ADMM [31] solver to separate the spectral modes and then defines a bank of variational scaling functions and wavelets based on the established spectral boundaries.

The approximation coefficients are obtained by the inner product of the analyzed signal $s$ with the variational scaling function, and the detail coefficients by the inner product of $s$ with the variational wavelets:

$$W_s(0, t) = \langle s, \phi_1 \rangle = \int s(\tau)\, \overline{\phi_1(\tau - t)}\, d\tau \qquad (7)$$

and

$$W_s(n, t) = \langle s, \psi_n \rangle = \int s(\tau)\, \overline{\psi_n(\tau - t)}\, d\tau \qquad (8)$$

In [23], the intrinsic modes $u_k(t)$ are assumed to have distinguishable features in the frequency domain under the amplitude-modulated frequency-modulated (AM-FM) assumption. Using the alternate direction method of multipliers (ADMM) solver, the spectral modes are obtained adaptively, in the same way as intrinsic mode functions (IMF) are obtained, by estimating compact modes:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = s(t) \qquad (9)$$

where $s(t)$ is the signal to be decomposed, under the constraint that the sum over all modes should equal the input signal. $\delta(\cdot)$ is a Dirac impulse, and $\left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t)$ combines the data with its Hilbert transform. $u_k$, $\omega_k$ and $K$ denote the modes, their central frequencies and the number of modes, respectively. The number of spectral segmentation boundaries can be determined empirically using the following equation:

$$N = \min \left\{ n \in \mathbb{Z}^{+} \mid n \geq 2\, \alpha_{\mathrm{DFA}} \ln L \right\} \qquad (10)$$

where $L$ is the signal length and $\alpha_{\mathrm{DFA}}$ is the scaling exponent determined by the detrended fluctuation analysis (DFA) [32].
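As a small worked example of the empirical rule of equation (10), the sketch below reads it as "the smallest positive integer n such that n is at least twice the DFA scaling exponent times the natural logarithm of the signal length". The function name `boundary_count` and the example values are hypothetical, and the DFA exponent is assumed to have been estimated beforehand.

```python
import math

def boundary_count(signal_length, dfa_exponent):
    """Smallest positive integer n with n >= 2 * dfa_exponent * ln(signal_length)."""
    if signal_length < 1 or dfa_exponent <= 0:
        raise ValueError("signal_length must be >= 1 and dfa_exponent > 0")
    # ceil gives the smallest integer above the bound; max keeps n positive.
    return max(1, math.ceil(2.0 * dfa_exponent * math.log(signal_length)))

# Hypothetical example: a 1000-sample frame with a DFA exponent of 1.0.
n_boundaries = boundary_count(1000, 1.0)
```

With these assumed values the bound is about 13.8, so the rule selects 14 boundaries; a smoother signal (smaller exponent) or a shorter frame yields fewer.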

According to [23], the constrained problem is rendered unconstrained by using a quadratic penalty term and the Lagrangian multiplier $\lambda$:

$$\mathcal{L}(\{u_k\}, \{\omega_k\}, \lambda) = \alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2 + \left\langle \lambda(t),\, s(t) - \sum_{k} u_k(t) \right\rangle + \left\| s(t) - \sum_{k} u_k(t) \right\|_2^2 \qquad (11)$$

The mode $u_k$ is therefore determined recursively as

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{s}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2}}{1 + 2 \alpha (\omega - \omega_k)^2} \qquad (12)$$

where $\hat{s}(\omega)$, $\hat{u}_k(\omega)$ and $\hat{\lambda}(\omega)$ denote the Fourier transforms of the input signal $s(t)$, the mode function $u_k(t)$ and $\lambda(t)$, respectively, and $\alpha$ denotes the balancing parameter of the data-fidelity constraint.

The center frequencies $\omega_k$ are updated as the center of gravity of the corresponding mode's power spectrum using the following equation:

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega\, |\hat{u}_k(\omega)|^2\, d\omega}{\int_0^{\infty} |\hat{u}_k(\omega)|^2\, d\omega} \qquad (13)$$
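The two recursive updates can be sketched on a discrete frequency grid as follows. This is only an illustration under stated assumptions: the mode update is read as a Wiener-like filter centered on the current center frequency, the integrals of equation (13) are replaced by sums over a uniform grid of non-negative frequencies, and the toy two-peak spectrum, the value of the balancing parameter and the function names are hypothetical.

```python
def update_mode(s_hat, other_modes_hat, lam_hat, freqs, omega_k, alpha):
    """One Wiener-like update of a mode spectrum around omega_k."""
    updated = []
    for idx, w in enumerate(freqs):
        # Residual spectrum: input minus the other modes plus half the multiplier.
        residual = s_hat[idx] - sum(m[idx] for m in other_modes_hat) + lam_hat[idx] / 2.0
        updated.append(residual / (1.0 + 2.0 * alpha * (w - omega_k) ** 2))
    return updated

def center_frequency(u_hat, freqs):
    """Center of gravity of the mode power spectrum on a uniform grid."""
    power = [abs(u) ** 2 for u in u_hat]
    total = sum(power)
    if total == 0.0:
        raise ValueError("mode has zero energy")
    return sum(w * p for w, p in zip(freqs, power)) / total

# Hypothetical toy spectrum on non-negative normalized frequencies: two narrow peaks.
freqs = [i * 0.01 for i in range(51)]   # 0.00 ... 0.50
s_hat = [0.0] * 51
s_hat[10] = 1.0                          # peak at 0.10
s_hat[40] = 1.0                          # peak at 0.40
lam_hat = [0.0] * 51                     # multiplier initialised to zero

omega = 0.15                             # initial guess near the first peak
for _ in range(20):
    u_hat = update_mode(s_hat, [], lam_hat, freqs, omega, alpha=2000.0)
    omega = center_frequency(u_hat, freqs)
```

Starting near the first peak, the center frequency settles very close to 0.10, because the filter strongly attenuates the distant peak before the center of gravity is recomputed.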

Therefore, instead of using a predefined wavelet bank, adaptive wavelet banks are built from the spectral modes, whose associated center frequencies represent the intrinsic components.


In [23], the authors define the boundary between adjacent modes using the mode bandwidths and central frequencies, whereas some authors in the literature simply use the average of the two central frequencies as the spectral boundary, which does not take the spectral distribution into account.

Consider the $n$th mode with mean frequency $\omega_n$ and spectral bandwidth $B_n$. The boundary $\Omega_n$ between the $n$th and the $(n+1)$th mode is given by the following equation:

$$\Omega_n = \frac{\omega_n + B_n + \omega_{n+1} - B_{n+1}}{2} \qquad (14)$$

with $\Omega_0 = 0$ and $\Omega_N = \pi$.

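The variational scaling functions and wavelets are built on the spectral boundaries $\Omega_n$ following the idea used in the construction of both Littlewood–Paley and Meyer's wavelets [33]. $\hat{\phi}_n(\omega)$ and $\hat{\psi}_n(\omega)$ are respectively defined by the following equations, where $\gamma$ is the parameter that ensures no overlap between two consecutive transitions:

$$\hat{\phi}_n(\omega) = \begin{cases} 1, & |\omega| \leq (1 - \gamma)\Omega_n \\ \cos\left[ \frac{\pi}{2} \beta(\gamma, \Omega_n) \right], & (1 - \gamma)\Omega_n \leq |\omega| \leq (1 + \gamma)\Omega_n \\ 0, & \text{otherwise} \end{cases} \qquad (15)$$

and

$$\hat{\psi}_n(\omega) = \begin{cases} 1, & (1 + \gamma)\Omega_n \leq |\omega| \leq (1 - \gamma)\Omega_{n+1} \\ \cos\left[ \frac{\pi}{2} \beta(\gamma, \Omega_{n+1}) \right], & (1 - \gamma)\Omega_{n+1} \leq |\omega| \leq (1 + \gamma)\Omega_{n+1} \\ \sin\left[ \frac{\pi}{2} \beta(\gamma, \Omega_n) \right], & (1 - \gamma)\Omega_n \leq |\omega| \leq (1 + \gamma)\Omega_n \\ 0, & \text{otherwise} \end{cases} \qquad (16)$$

where $\beta(\gamma, \Omega_n) = \beta\left( \frac{1}{2 \gamma \Omega_n} \left[ |\omega| - (1 - \gamma)\Omega_n \right] \right)$ and $\beta(x)$ is an arbitrary function satisfying:

$$\beta(x) = \begin{cases} 0, & x \leq 0 \\ 1, & x \geq 1 \end{cases} \qquad \text{and} \qquad \beta(x) + \beta(1 - x) = 1, \quad 0 < x < 1 \qquad (17)$$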
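A minimal sketch of these Meyer-type filters is given below. The polynomial used for the transition function is the choice commonly made for Meyer and empirical wavelets and is an assumption here, since the text only requires the complementarity property; the boundary values, the value of gamma and the function names are likewise hypothetical. The sketch checks that, across a transition band, the squared scaling function and the squared wavelet sum to one.

```python
import math

def beta(x):
    """Transition function with beta(x) + beta(1 - x) = 1 on (0, 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # Assumed polynomial choice; any function with the same property would do.
    return x ** 4 * (35.0 - 84.0 * x + 70.0 * x ** 2 - 20.0 * x ** 3)

def transition(w, boundary, gamma):
    """Argument passed to beta inside the transition band around one boundary."""
    return (abs(w) - (1.0 - gamma) * boundary) / (2.0 * gamma * boundary)

def scaling_hat(w, boundary, gamma):
    """Low-pass scaling function in the Fourier domain."""
    if abs(w) <= (1.0 - gamma) * boundary:
        return 1.0
    if abs(w) <= (1.0 + gamma) * boundary:
        return math.cos(0.5 * math.pi * beta(transition(w, boundary, gamma)))
    return 0.0

def wavelet_hat(w, lower, upper, gamma):
    """Band-pass wavelet between two consecutive boundaries."""
    aw = abs(w)
    if (1.0 + gamma) * lower <= aw <= (1.0 - gamma) * upper:
        return 1.0
    if (1.0 - gamma) * upper <= aw <= (1.0 + gamma) * upper:
        return math.cos(0.5 * math.pi * beta(transition(w, upper, gamma)))
    if (1.0 - gamma) * lower <= aw <= (1.0 + gamma) * lower:
        return math.sin(0.5 * math.pi * beta(transition(w, lower, gamma)))
    return 0.0

# Hypothetical boundaries and gamma, chosen so that transition bands do not overlap.
lower, upper, gamma = 1.0, 2.0, 0.2
energy = [scaling_hat(w, lower, gamma) ** 2 + wavelet_hat(w, lower, upper, gamma) ** 2
          for w in (0.5, 0.9, 1.0, 1.1, 1.5)]
```

The cosine and sine branches share the same argument around a boundary, which is what makes the filter bank energy-preserving there.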

4 ICA-Based Single Channel

Separation Method

In [28], the authors propose a new method to solve the SCBSS problem. The method is applied to the time-frequency representation of a single-channel observed signal. Since the time-frequency representation is a non-linear transformation, the use of non-linear ICA would seem appropriate; however, as mentioned in [28, 34], under certain conditions a nonlinear BSS problem can be solved using linear ICA.

Let $x(t)$ denote the signal in the time domain. Using the Short Time Fourier Transform (STFT), the signal is transformed to the frequency domain frame by frame, where $M$ is the number of STFT time frames. The STFT yields the $(M \times F)$ complex matrix of the time-frequency representation, TFDmix, where $F$ is the number of frequency bins; its $M$ rows are the instantaneous signal spectra.

Let $z_i$, $i = 1, \ldots, N$, be the spectral components obtained via the time-frequency representation of a single-channel signal. The obtained $z_i$ are statistically independent. In this step, the rows of the TFDmix matrix are treated as individual channels of a multichannel signal, and ICA is applied to this multichannel signal.

The $i$th row of $Z$, denoted $z_i$, corresponds to the $i$th time-frequency component of the mixed one-channel signal. The relation between $Z = [z_i]$, $i = 1, \ldots, N$, and TFDmix is given by the following equation:

$$\mathrm{TFD}_{mix} = A \cdot Z = \sum_{i=1}^{N} a_i z_i = \sum_{i=1}^{N} \mathrm{TFD}_i \qquad (18)$$

where $A$ is the $(M \times N)$ mixing matrix with elements $a_{ij}$, and $a_i$ is the $i$th column of $A$. The $z_i$ are the spectral bases. The columns of $A$, which describe the time variation of the $z_i$, are called time bases and denoted by $a_i$. The matrix $\mathrm{TFD}_i$, the product of the time basis $a_i$ and the spectral basis $z_i$, is called the $i$th time-frequency component.

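The $\mathrm{TFD}_i$ bases are grouped into subgroups by grouping the time bases $a_i$ and the frequency bases $z_i$, as in the following equation:

$$\mathrm{TFD}_{mix} = \sum_{i=1}^{N} \mathrm{TFD}_i = \sum_{i \in G_1} \mathrm{TFD}_i + \sum_{i \in G_2} \mathrm{TFD}_i + \cdots + \sum_{i \in G_J} \mathrm{TFD}_i \qquad (19)$$

where $G_1, \ldots, G_J$ are $J$ index sets obtained by grouping the $\mathrm{TFD}_i$ bases.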
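The decomposition into time-frequency components and their regrouping can be checked on a toy example. In this sketch each component is taken to be the outer product of one column of the mixing matrix (time basis) and one row of the component matrix (spectral basis); the small matrices, the index sets and the helper names are hypothetical and real-valued for simplicity, and no ICA is actually run.

```python
def outer(col, row):
    """Outer product of a time basis (column) and a spectral basis (row)."""
    return [[c * r for r in row] for c in col]

def add(m1, m2):
    """Element-wise sum of two equally sized matrices."""
    return [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def group_sum(components, index_set):
    """Sum of the time-frequency components whose indices lie in one group."""
    rows, cols = len(components[0]), len(components[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for i in index_set:
        total = add(total, components[i])
    return total

# Hypothetical toy sizes: 3 frames, 3 components, 4 frequency bins.
A = [[1.0, 0.5, 0.0],
     [0.2, 1.0, 0.3],
     [0.0, 0.4, 1.0]]
Z = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 1.0, 0.0, 1.0],
     [3.0, 0.0, 0.0, 1.0]]

tfd_mix = matmul(A, Z)
# One component per basis pair: column i of A times row i of Z.
components = [outer([row[i] for row in A], Z[i]) for i in range(3)]
groups = [[0, 2], [1]]                       # hypothetical index sets
rebuilt = add(group_sum(components, groups[0]), group_sum(components, groups[1]))
max_diff = max(abs(u - v) for r1, r2 in zip(tfd_mix, rebuilt) for u, v in zip(r1, r2))
```

Whatever the partition into index sets, the group sums add up to the full mixture matrix; the grouping only decides which components are assigned to which reconstructed source.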

In [28], to reduce computational complexity, the authors used only the $\mathrm{TFD}_i$ bases that account for a specified variance of the mixed signal. The grouping of the bases consists in collecting elements into clusters. The clustering is based on the maximization of the negentropy of the separated components. The ICA-based single-channel


separation methods primarily group components based on similarity in the time or frequency domain. In [28], the authors suggest using a time-frequency structure to measure the similarity of features in both the time and spectral domains.

5 Grouping Process

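The grouping process is performed by clustering the $\mathrm{TFD}_i$ bases according to the distance between them, measured with the β-distance of Gaussian distribution [28]. The generalized Gaussian distribution is expressed as follows:

$$p(y \mid \mu, \sigma, \beta) = \frac{\omega(\beta)}{\sigma} \exp\left[ -c(\beta) \left| \frac{y - \mu}{\sigma} \right|^{2/(1+\beta)} \right] \qquad (20)$$

where $\mu$ denotes the expected value, $\beta$ describes the type of the random variable $y$, i.e., its deviation from the normal distribution, where $-1 \leq \beta \leq 0$, and $\sigma$ is the standard deviation of the random variable $y$. The parameters $\omega(\beta)$ and $c(\beta)$ are given by the following expressions:

$$\omega(\beta) = \frac{\left[ \Gamma\left( \frac{3}{2}(1+\beta) \right) \right]^{1/2}}{(1+\beta) \left[ \Gamma\left( \frac{1}{2}(1+\beta) \right) \right]^{3/2}} \qquad \text{and} \qquad c(\beta) = \left[ \frac{\Gamma\left( \frac{3}{2}(1+\beta) \right)}{\Gamma\left( \frac{1}{2}(1+\beta) \right)} \right]^{1/(1+\beta)}$$

where $\Gamma$ is the Euler Gamma function.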
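The generalized Gaussian density and its two normalizing parameters can be evaluated directly with the Gamma function of the standard library. The sketch below assumes the usual exponent 2/(1+β) and normalization by the standard deviation; the function names are hypothetical. As a sanity check, β = 0 should reproduce the normal density and β = 1 the unit-variance Laplacian density.

```python
import math

def omega(beta):
    """Normalizing factor omega(beta) of the generalized Gaussian density."""
    return math.sqrt(math.gamma(1.5 * (1.0 + beta))) / (
        (1.0 + beta) * math.gamma(0.5 * (1.0 + beta)) ** 1.5)

def c(beta):
    """Scale factor c(beta) inside the exponential."""
    ratio = math.gamma(1.5 * (1.0 + beta)) / math.gamma(0.5 * (1.0 + beta))
    return ratio ** (1.0 / (1.0 + beta))

def gen_gauss_pdf(y, mu, sigma, beta):
    """Generalized Gaussian density; requires beta > -1 and sigma > 0."""
    if beta <= -1.0 or sigma <= 0.0:
        raise ValueError("beta must be > -1 and sigma must be > 0")
    z = abs((y - mu) / sigma)
    return omega(beta) / sigma * math.exp(-c(beta) * z ** (2.0 / (1.0 + beta)))

def log_likelihood(samples, mu, sigma, beta):
    """Log-likelihood of independent samples for a given beta."""
    return sum(math.log(gen_gauss_pdf(y, mu, sigma, beta)) for y in samples)

# Reference values used for the sanity checks.
normal_at_07 = math.exp(-0.5 * 0.7 ** 2) / math.sqrt(2.0 * math.pi)
pdf_beta0 = gen_gauss_pdf(0.7, 0.0, 1.0, 0.0)
pdf_beta1_at_0 = gen_gauss_pdf(0.0, 0.0, 1.0, 1.0)
```

With a flat prior on β, maximizing `log_likelihood` over a grid of β values would give a simple estimate of the shape parameter; the choice of prior and search strategy is left open here.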





=







,



−





(





,



)



(21)

The $\beta$ parameter is estimated as the maximum of its a posteriori distribution, which is given as [32, 35]: $p(\beta \mid y) \propto p(y \mid \beta)\, p(\beta)$, where $p(y \mid \beta)$ denotes the data likelihood [32], given by the following equation:

$$p(y \mid \beta) = \prod_{n} \frac{\omega(\beta)}{\sigma} \exp\left[ -c(\beta) \left| \frac{y_n - \mu}{\sigma} \right|^{2/(1+\beta)} \right] \qquad (24)$$

where $p(\beta)$ is the a priori distribution of the $\beta$ parameter [18].

Statistically independent constituent signals have maximum negentropy [10,50]. The grouping of the $\mathrm{TFD}_i$ bases therefore consists in maximizing the negentropy (negative entropy) of the reconstructed constituent signals: the bases are grouped by finding the reconstructed constituent signals, each formed as a sum $\sum_i \mathrm{TFD}_i$ over one group of bases, with the maximum negentropy. Let $\nu$ be the normalized Gaussian random variable ($\mu = 0$, $\sigma = 1$) and $G(\cdot)$ a nonlinear function of the random variable, usually of the form $G(y) = \frac{1}{a} \log \cosh(a y)$, $a \in (1, 2)$, or $G(y) = -\exp(-y^2/2)$. The negentropy function $J(y)$ is given by the following expression [35]: $J(y) \sim \left( E\{G(y)\} - E\{G(\nu)\} \right)^2$. This approximation of the negentropy has numerous advantages, such as conceptual simplicity and fast computation [35]. As a result, it is very often used as a cost function in algorithms for solving ICA problems [28].
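The negentropy approximation can be sketched numerically as the squared difference between the sample mean of the nonlinearity applied to a standardized signal and its expectation under a standard normal variable. The sketch below assumes the log-cosh nonlinearity with scale 1, computes the Gaussian expectation by a trapezoidal rule and uses seeded pseudo-random test signals; all names and values are illustrative.

```python
import math
import random

def log_cosh(u):
    """Numerically stable log(cosh(u))."""
    au = abs(u)
    return au + math.log1p(math.exp(-2.0 * au)) - math.log(2.0)

def gaussian_expectation(g, half_width=10.0, steps=20001):
    """E{g(nu)} for a standard normal nu, by the trapezoidal rule."""
    h = 2.0 * half_width / (steps - 1)
    total = 0.0
    for i in range(steps):
        u = -half_width + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * g(u) * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return total * h

def standardize(samples):
    """Remove the mean and scale to unit variance."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in samples) / n)
    return [(v - mean) / std for v in samples]

def negentropy(samples, g=log_cosh):
    """Approximate negentropy: (E{g(y)} - E{g(nu)})^2 on standardized samples."""
    y = standardize(samples)
    e_y = sum(g(v) for v in y) / len(y)
    return (e_y - gaussian_expectation(g)) ** 2

rng = random.Random(0)                      # fixed seed for repeatability
gauss = [rng.gauss(0.0, 1.0) for _ in range(20000)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
j_gauss = negentropy(gauss)
j_uniform = negentropy(uniform)
```

The Gaussian signal gives a value close to zero, while the non-Gaussian (uniform) signal gives a clearly larger one, which is the property exploited when candidate groupings are ranked by negentropy.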

6 Results and Discussion

To evaluate the performance of the proposed approach, simulations were performed. The proposed method was applied to speech data selected from the TIMIT [36] and NOIZEUS [37] databases. The instantaneous mixture is simulated from the recordings of three sentences $s_1(t)$, $s_2(t)$ and $s_3(t)$. The sentences are pronounced by male and female speakers and were recorded at the sampling frequency $f_s$. The instantaneous mixture is defined by the following equation:

$$x(t) = a_1 s_1(t) + a_2 s_2(t) + a_3 s_3(t) \qquad (22)$$

where $a_1$, $a_2$ and $a_3$ are constant parameters. The proposed method operates in the time-frequency domain and is summarized by the following steps for each frame:

1. Compute the Short Time Fourier Transform (STFT) of the observed signal $x(t)$.

2. Apply the variational scaling and wavelet functions to every mode to obtain the time-frequency distribution using equations (7) and (8).

3. Use the resulting spectrogram as the input data for ICA, which is applied to this multichannel signal (that is, to the spectral rows corresponding to different time intervals).

4. Use the β-distance of Gaussian distribution to measure the distance between the time-frequency domain components of the mixed signal obtained by ICA.

5. Solve the optimization problem by maximizing the negentropy of the reconstructed constituent signals.

6. Reconstruct the appearance of the particular source in the original signal.


The time-frequency is considered as a random

variable, its distribution is given in parametric

terms. Therefore, it is poss

Figure 1. Flowchart of the proposed method

As an illustrative example, the proposed method is applied to separate the instantaneous mixture defined by equation (22). Fig. 2(a) shows the time-domain representation of the three speech signals. First, the observed single-channel signal presented in Fig. 2(b) is transformed to the frequency domain using the STFT. Fig. 2(c) presents the STFT of one frame of the observed mixture. Then, for each frame, the AMSWT method is introduced to obtain the optimal spectral mode separation: the variational scaling and wavelet functions are applied to every mode to obtain the time-frequency distribution using equations (7) and (8), as illustrated by Fig. 2(d).

Once the time-frequency distribution is obtained, the spectrogram, considered as a multichannel observed signal, is used as the input data for ICA-based single-channel separation. Then, as mentioned in step 4 of the algorithm, the β-distance of Gaussian distribution is used to measure the distance between the time-frequency domain components, and the optimization problem is solved as mentioned in step 5. For this example and for one mode, the estimated spectral components are shown in Fig. 2(e). For this frame, after collecting the elements into clusters, the estimated frame of the signal is illustrated in Fig. 2(f). The estimated signals are illustrated in Fig. 2(g). As can be seen, the estimated signals are similar to the original signals shown in Fig. 2(a).

Figure 2. Illustration example. (a) Original sources $s_1(t)$, $s_2(t)$ and $s_3(t)$, time-domain representation (horizontal axis: time in s). (b) Observed signal, time-domain representation (horizontal axis: time in s). (c) FFT of the frame of the observed signal (horizontal axis: frequency in Hz). (d) Time-frequency distribution of one arbitrarily chosen mode.

An objective measurement of the separation quality has been carried out. The performance of the proposed method is compared with existing methods in the literature: the EMD-based single-channel separation [19], the wavelet-based single-channel separation presented in [19], and the single-channel separation of audio signals based on variational mode decomposition (VMD).

Figure 2 (continued). (e) The time-frequency distribution of one arbitrarily chosen mode. (f) The time-frequency distribution of one arbitrarily chosen mode. (g) The time-frequency distribution of one arbitrarily chosen mode (panels: estimated sources $\hat{s}_1(t)$, $\hat{s}_2(t)$ and $\hat{s}_3(t)$).

In [29, 30], a simpler scale-invariant alternative for the evaluation of single-channel separation has been proposed through the introduction of new parameters. These parameters, called scale-invariant (SI) measures, such as SI-SDR, SI-SAR and SI-SIR, are particularly recommended for single-channel separation evaluation. They are defined using a single coefficient $\alpha$ to account for scaling discrepancies. Let $s(t)$ be the original source and $\hat{s}(t)$ the estimated source, expressed as $\hat{s} = e_{target} + e_{res}$, where $e_{res}$ can be decomposed as $e_{res} = e_{interf} + e_{artif}$, where $e_{target}$ is the target source component, and $e_{interf}$ denotes the


interferences from other sources, and $e_{artif}$ includes all other artifacts introduced by the separation algorithm. The SI-SDR is based on the ratio $\|\alpha s\|^2 / \|\alpha s - \hat{s}\|^2$, where $\alpha = \arg\min_{\alpha} \|\alpha s - \hat{s}\|^2$. The optimal scaling factor for the target is obtained as $\alpha = \hat{s}^T s / \|s\|^2$, and the scaled reference is defined as $e_{target} = \alpha s$. The performance criterion is given by the following equation:

$$\mathrm{SI\text{-}SDR} = 10 \log_{10} \left( \frac{\|e_{target}\|^2}{\|e_{res}\|^2} \right) = 10 \log_{10} \left( \frac{\|\alpha s\|^2}{\|\alpha s - \hat{s}\|^2} \right) \qquad (23)$$
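The SI-SDR computation can be sketched in a few lines: project the estimate onto the reference to obtain the scaled target, treat the remainder as the residual, and take the energy ratio in decibels. The toy signals below are hypothetical; the added component is chosen orthogonal to the reference with one hundredth of its energy, so a value of 20 dB is expected, and rescaling the estimate should leave the value unchanged.

```python
import math

def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))

def si_sdr(reference, estimate):
    """Scale-invariant signal-to-distortion ratio in dB."""
    alpha = dot(estimate, reference) / dot(reference, reference)  # optimal scaling
    target = [alpha * r for r in reference]                        # scaled reference
    residual = [e - t for e, t in zip(estimate, target)]
    residual_energy = dot(residual, residual)
    if residual_energy == 0.0:
        return float("inf")
    return 10.0 * math.log10(dot(target, target) / residual_energy)

# Hypothetical toy signals: a sinusoid plus a weaker, orthogonal sinusoid.
n = 200
s = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
distortion = [0.1 * math.sin(2 * math.pi * 9 * t / n) for t in range(n)]
estimate = [a + b for a, b in zip(s, distortion)]

value = si_sdr(s, estimate)
value_scaled = si_sdr(s, [3.0 * e for e in estimate])   # same estimate, different gain
```

The SI-SIR and SI-SAR follow the same pattern once the residual has been split into its interference and artifact parts, which requires the other sources and is not shown here.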

The scale-invariant signal to interference ratio (SI-SIR) is given by the following equation:

$$\mathrm{SI\text{-}SIR} = 10 \log_{10} \left( \frac{\|e_{target}\|^2}{\|e_{interf}\|^2} \right) \qquad (24)$$

and the scale-invariant signal to artifacts ratio (SI-SAR) is defined as follows:

$$\mathrm{SI\text{-}SAR} = 10 \log_{10} \left( \frac{\|e_{target}\|^2}{\|e_{artif}\|^2} \right) \qquad (25)$$

Another performance measure has been evaluated, expressed in terms of the relative root mean squared error (RRMSE), with $\mathrm{RMS}(\cdot)$ the root mean square value, given by the following equation:

$$\mathrm{RRMSE} = \frac{\mathrm{RMS}\left( s_i(t) - \hat{s}_i(t) \right)}{\mathrm{RMS}\left( s_i(t) \right)} \cdot 100\ [\%] \qquad (26)$$

where $s_i(t)$ denotes the signal to be extracted and $\hat{s}_i(t)$ is its estimate (in our case $i = 1, \ldots, 3$). The speech dataset is corrupted at a signal-to-noise ratio SNR = 5 dB, and then the SI-SIR, SI-SAR and SI-SDR are evaluated. The obtained results are shown in Fig. 3. A set of 4 noisy mixtures is simulated by corrupting the clean mixture at signal-to-noise ratios (SNR) ranging from 5 dB to 20 dB with a step of 5 dB, and then the RRMSE is evaluated. Fig. 4 shows the obtained results. To study the relation between the frame length and the β-distance of Gaussian distribution, the mean β-distance of Gaussian distribution has been evaluated for different frame lengths (512, 1024, 2048 and 4096); the obtained results are shown in Fig. 5.

As shown, the proposed method achieves a better separation quality than the existing methods in terms of the scale-invariant SI-SDR, SI-SAR and SI-SIR values. The relative root mean squared error (RRMSE) of the proposed method is also better than that of the existing methods for the different SNR values. The mean β-distance of Gaussian distribution for different frame lengths shows that combining the AMSWT method with the ICA-based single-channel separation method gives better separation than the existing methods. The AMSWT provides a superior time-frequency resolution because the wavelet bank is adaptively built on the intrinsic spectral modes. The use of a time-frequency structure allows the similarity of features to be measured in both the time and spectral domains, and the β-distance of Gaussian distribution is a distance measure based on knowledge of the statistical nature of the spectra of the original constituent signals of the mixed signal.
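For completeness, the RRMSE used in the comparison can be sketched as the root mean square of the estimation error divided by the root mean square of the reference, expressed in percent. This reading of equation (26), the function name and the toy values are assumptions made for illustration.

```python
import math

def rrmse(reference, estimate):
    """Relative root mean squared error in percent."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    error_rms = math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)
    reference_rms = math.sqrt(sum(r * r for r in reference) / n)
    if reference_rms == 0.0:
        raise ValueError("reference has zero energy")
    return 100.0 * error_rms / reference_rms

# Hypothetical toy values: a constant reference with a 10 % alternating error.
score = rrmse([1.0, 1.0, 1.0, 1.0], [1.1, 0.9, 1.1, 0.9])
```

Unlike the SI measures, this error is sensitive to the gain of the estimate, so the estimated sources should be brought to the scale of the references before it is computed.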

Figure 3. Comparison between the proposed method, the EMD-based single-channel separation, the wavelet-based single-channel separation and the VMD-based single-channel separation of audio signals in terms of SI-SIR, SI-SDR and SI-SAR for SNR = 5 dB.

Figure 4. Comparison of the algorithms' performance (RRMSE) for different SNR values.

Figure 5. Comparison of the mean β-distance of Gaussian distribution for different frame lengths.

7 Conclusion

A new method to solve the single-channel blind source separation problem has been proposed. The method combines two powerful methods, the Adaptive Mode Separation-Based Wavelet Transform (AMSWT) and the ICA-based single-channel separation. An objective measure of separation quality based on the scale-invariant (SI) parameters SI-SDR, SI-SAR and SI-SIR has been used to evaluate the performance of the proposed method. Simulation results showed the good performance of the proposed method compared with the existing methods.


References:

[1] Al-Baddai, S.; Al-Subari, K.; Tomé, A.M.;

Volberg, G.; Lang, E.W. Combining EMD with

ICA to Analyze Combined EEG-fMRI Data. In

Proceedings of the MIUA, Egham, UK, 9–11 July

2014; pp. 223–228.

[2] James, C.J.; Hesse, C.W. Independent

component analysis for biomedical signals. Physiol.

Meas. 2005, 26, R15–R39.

[3] Jiménez-González, A.; James, C. Source separation of Foetal Heart Sounds and maternal activity from single-channel phonograms: A temporal independent component analysis approach. In Proceedings of the 2008 Computers in Cardiology, Bologna, Italy, 14–17 September 2008; pp. 949–952.

[4] Zeng, X.; Li, S.; Li, G.J.; Zhou, Y.; Mo, D.H.

Fetal ECG extraction by combining singlechannel

SVD and cyclostationarity-based blind source

separation. Int. J. Signal Process 2013, 6, 367–376.

[5] Draper, B.A.; Baek, K.; Bartlett, M.S.;

Beveridge, J.R. Recognizing faces with PCA and

ICA. Comput. Vis. Image Underst. 2003, 91, 115–

137.

[6] Liu, X.; Srivastava, A.; Gallivan, K. Optimal

Linear Representations of Images for Object

Recognition. In Proceedings of the 2003

Conference on Computer Vision and Pattern

RecognitionWorkshop, Madison, WI, USA, 18–20

June 2003.

[7] X. Huang, L. Yang, R. Song, and W. Lu,

``Effective pattern recognition and nd-densitypeaks

clustering based blind identication for

underdetermined speech mixing systems,''

Multimedia Tools Appl., vol. 77, no. 17, pp.

2211522129, Sep. 2018.

WSEAS TRANSACTIONS on SIGNAL PROCESSING

DOI: 10.37394/232014.2022.18.11

Mina Kemiha, Abdellah Kacha

E-ISSN: 2224-3488

85

Volume 18, 2022

[8] A. Nagathil, C. Weihs, K. Neumann, and R.

Martin, ``Spectral complexity reduction of music

signals based on frequency-domain reduced-rank

approximations: An evaluation with cochlear

implant listeners,'' J. Acoust. Soc. Amer., vol. 142,

no. 3, pp. 12191228, Sep. 2017.

[9] Yang, J.; Williams, D.B. MIMO Transmission

Subspace Tracking with Low Rate Feedback. In

Proceedings of the IEEE International Conference

on Acoustics, Speech, and Signal Processing,

Philadelphia, PA, USA, 23 March 2005.

[10] Wilson, S.; Yoon, J. Bayesian ICA-based

source separation of Cosmic Microwave

Background by a discrete functional approximation.

arXiv 2010, arXiv:1011.4018.

[11] Eronen, A. Musical Instrument Recognition

Using ICA-Based Transform of Features and

Discriminatively Trained HMMs. In Proceedings of

the Seventh International Symposium on Signal

Processing and Its Applications, Paris, France, 4

July 2003.

[12] R. B. Randall, ``A history of cepstrum analysis

and its application to mechanical problems,'' Mech.

Syst. Signal Process., vol. 97, pp. 319, Dec. 2017.

[13] M. A. Haile and B. Dykas, ``Blind source

separation for vibrationbased diagnostics

ofrotorcraft bearings,'' J. Vib. Control, vol. 22, no.

18, pp. 38073820, Oct. 2016.

[14] P.He et al., “Single channel blind source

separation on the instantaneous mixed signal of

multiple dynamic sources,” Mechanical Systems &

Signal Processing, vol.113, pp. 22- 35, December

2018.

[15] X.Cai et al., “Single Channel Blind Source

Separation of Communication Signals Using

Pseudo-MIMO Observations,” IEEE

Communications Letters, vol.22, no.8, pp.1616-

1619, Aug 2018.

[16] D.Wang, W.Guo, and P.W.Tse, “An enhanced

empirical mode decomposition method for blind

component separation of a singlechannel vibration

signal mixture,” Journal of Vibration & Control,

vol.22, no.11, 2015.

[17] Gao, B. Single Channel Blind Source

Separation. Ph.D. Thesis, Newcastle University,

Newcastle, UK, 2011.

[18] Wang, B.; Plumbley, M.D. Investigating

Single-Channel Audio Source Separation Methods

Based on Non-Negative Matrix Factorization. In

Proceedings of the ICA Research Network

InternationalWorkshop, Liverpool, UK, 18–19

September 2006; pp. 17–20.

[19] Mijovic, B.; De Vos, M.; Gligorijevi´c, I.;

Taelman, J.; Van Hu_el, S. Source Separation From

Single-Channel Recordings by Combining

Empirical-Mode Decomposition and Independent

Component Analysis. IEEE Trans. Biomed. Eng.

2010, 57, 2188–2196.

[20] NE. Huang, Z. Shen, SR. Long, MC. Wu, HH.

Shih, Q. Zheng, NC. Yen, CC. Tung, HH. Liu, The

empirical mode decomposition and the Hilbert

spectrum for nonlinear and nonstationary time

series analysis. In proceedings of The Royal Society

A Mathematical Physical and Engineering Sciences

454(1971), 903-995 (1998).

[21] Litvin, Y.; Cohen, I. Single-Channel Source

Separation of Audio Signals Using Bark Scale

Wavelet Packet Decomposition. J. Signal Process.

Syst. 2010, 65, 339–350.

WSEAS TRANSACTIONS on SIGNAL PROCESSING

DOI: 10.37394/232014.2022.18.11

Mina Kemiha, Abdellah Kacha

E-ISSN: 2224-3488

86

Volume 18, 2022

[22] Duan, Z.; Zhang, Y.; Zhang, C.; Shi, Z.

Unsupervised Single-Channel Music Source

Separation by Average Harmonic Structure

Modeling. IEEE Trans. Audio Speech Lang.

Process. 2008, 16, 766–778.

[23] Fangyu Li , Bangyu Wu , Naihao Liu , Ying

Hu, and Hao Wu, ‘‘ Seismic Time Frequency

Analysis via Adaptive Mode Separation-Based

Wavelet Transform’’ . IEEE GEOSCIENCE AND

REMOTE SENSING LETTERS, VOL. 17, NO. 4,

APRIL 2020. Pp 696-700.

[24] J. Gilles, “Empirical wavelet transform,” IEEE

Trans. Signal Process., vol. 61, no. 16, pp. 3999–

4010, Aug. 2013.

[25] W. Liu, S. Cao, and Y. Chen, “Seismic time–

frequency analysis via empirical wavelet

transform,” IEEE Geosci. Remote Sens. Lett., vol.

13, no. 1, pp. 28–32, Jan. 2016.

[26] N. Liu, Z. Li, F. Sun, Q. Wang, and J. Gao,

“The improved empirical wavelet transform and

applications to seismic reflection data,” IEEE

Geosci. Remote Sens. Lett., to be published.

[27] K. Dragomiretskiy and D. Zosso, “Variational

mode decomposition,” IEEE Trans. Signal

Process., vol. 62, no. 3, pp. 531–544, Feb. 2014.

[28] Dariusz Mika, Grzegorz Budzik, and Jerzy

Józwik , Single Channel Source Separation with

ICA-Based Time-Frequency Decomposition.

Sensors.

[29] J. L. Roux, S.Wisdom, H. Erdogan, and J. R.

Hershey, “SDR - half-baked or well done?,” in

Proc. IEEE Int. Conf. Acoust. Speech, Signal,

Process., 2019, pp. 626–630.

[30] M. Torcoli , T. Kastner , and J. Herre.

''Objective Measures of Perceptual Audio Quality

Reviewed: An Evaluation of Their Application

Domain Dependence''. IEEE/ACM

TRANSACTIONS ON AUDIO, SPEECH, AND

LANGUAGE PROCESSING, VOL. 29, 2021. pp.

1530-1541.

[31] onstantin Dragomiretskiy, Dominique Zosso,

‘‘Variational Mode Decomposition ’’. IEEE

TRANSACTIONS ON SIGNAL PROCESSING,

VOL. 62, NO. 3, FEBRUARY 1, 2014. Pp 531- 544

[32] C. K. Peng, S. V. Buldyrev, S. Havlin, M.

Simons, H. E. Stanley, and A. L. Goldberger,

“Mosaic organization of DNA nucleotides,” Phys.

Rev., vol. 49, no. 2, p. 1685, 1994.

[33] I. Daubechies, Ten Lectures on Wavelets.

Philadelphia, PA, USA: SIAM, 1992.

[34] Isomura, T.; Toyoizumi, T. On the

achievability of blind source separation for high-

dimensional nonlinear source mixtures. arXiv 2018,

arXiv:1808.00668.

[35] Hyvarinen, A.; Karhunen, J.; Oja, E.

Independent Component Analysis; JohnWiley &

Sons: New York, NY, USA, 2001.

[36] TIMIT database.

https://catalog.ldc.upenn.edu/ldc93s1

[37] NOIZEUS database

http://ecs.utdallas.edu/loizou/speech/noizeus/

WSEAS TRANSACTIONS on SIGNAL PROCESSING

DOI: 10.37394/232014.2022.18.11

Mina Kemiha, Abdellah Kacha

E-ISSN: 2224-3488

87

Volume 18, 2022

This article is published under the terms of the Creative Commons Attribution License 4.0 (Attribution 4.0 International, CC BY 4.0): https://creativecommons.org/licenses/by/4.0/deed.en_US
