Examination of AI Algorithms for Image and MRI-based Autism Detection

PRASENJIT MUKHERJEE (1,2), GOKUL R. S. (1), MANISH GODSE (3)

(1) Department of Technology, Vodafone Intelligent Solutions, Pune, INDIA

(2) Department of Computer Science, Manipur International University, Manipur, INDIA

(3) Department of IT, Bizamica Software, Pune, INDIA

Abstract: - Precise identification of autism spectrum disorder (ASD) is challenging because of the heterogeneity of the condition. Early diagnosis and intervention have positive effects on treatment and later skills development. Hence, it is necessary to provide families and communities with the resources, training, and tools required to diagnose and help patients. Recent work has shown that artificial intelligence (AI)-based methods are suitable for the identification of ASD. AI-based tools can help parents detect ASD early in their children, and more advanced AI-based tools can help health workers and physicians detect ASD. Facial images and MRI are informative sources for understanding ASD symptoms and are therefore used as inputs for training AI-based models. The trained models are used to classify children with ASD and typically developing children. Deep learning models are reported to be highly accurate in ASD detection. In this paper, we present a comprehensive study of AI techniques such as machine learning, image processing, and deep learning, and their accuracy when applied to facial and MRI images of children with ASD and typically developing children.

Key-Words: - ASD, Autism Detection, Machine Learning, Image Processing, Deep Learning, Support Vector Machine, Haar Cascade, CNN, 3D-CNN.

Received: July 15, 2023. Revised: August 29, 2023. Accepted: October 11, 2023. Published: November 30, 2023.

WSEAS TRANSACTIONS on COMPUTERS, Volume 22, 2023, p. 243. Prasenjit Mukherjee, Gokul R. S., Manish Godse. DOI: 10.37394/23205.2023.22.28. E-ISSN: 2224-2872.

1 Introduction

Autism spectrum disorder (ASD) is a neurological and developmental disorder that affects how patients interact with others. ASD patients have difficulty with communication, learning, and behavior. Autism symptoms generally appear early, by around two years of age. At that age, children cannot describe their difficulties to their parents. However, parents can play a role in detecting autism if they are aware of it: they have to observe their children and talk to doctors about their development, [1]. ASD affects a patient and the family financially and emotionally. Continued care also places a physical burden on the patient and the family caretaker over the patient's lifespan. It also strains the healthcare systems of local and federal agencies, which have to support patients medically and financially throughout their lives. Consequently, continuous research is required to find better ASD-specific interventions and better ways to equip families and communities with the resources, training, and tools required to diagnose and help patients, [2]. About 1% of the world's population has autism, [3]. Hence, serious attention is required for the detection of ASD and for the support of ASD patients. Clinical diagnoses made between the ages of 2 and 3 years have been found to be highly stable, [4]. Thus, early diagnosis and intervention, during preschool or before, have major positive effects on treatment and later skills development, [5]. Because children spend most of their time with their parents, it is better to train parents and provide them with tools so that they can observe their children and report ASD-related symptoms to doctors for further investigation and diagnosis. Artificial intelligence (AI) has the potential to play a big role in developing interactive systems that assist in autism detection using machine learning and deep learning. The data required to develop these systems are images of different types covering the brain and the face. These systems are useful to parents, healthcare workers, and doctors. The data used to develop models for AI-based systems include facial features, facial landmarks, facial expressions, brain MRI, electroencephalogram (EEG) signals, eye tracking, and eye contact, [6]. Classification and clustering approaches are commonly used to detect ASD. Any additional data required can be captured using questionnaires.

This article presents a comprehensive study of existing state-of-the-art artificial intelligence-based models for ASD detection. The data used in the models come from open sources and include facial images as well as MRI images. The article then compares the different models and discusses research gaps and potential areas that should be explored in the future to make further progress in this field. It also suggests potential applications of these approaches for parents of children with ASD and for physicians.

2 Use of Face Recognition in Autism Detection

ASD is a neurodevelopmental disorder that can also affect physical appearance, especially the face of children. The facial features of children with ASD are reported to differ distinctively from those of typically developing children; hence facial features are useful for identifying ASD.

The complexity in face detection arises because of:

1) The large visual differences between human faces against cluttered image backgrounds; for example, extreme illumination and exaggerated expressions can lead to large differences in the visual appearance of a face.

2) The large search space of probable face sizes and positions.

2.1 Support Vector Machine

The support vector machine (SVM) is a supervised binary classification method that finds the optimal linear decision surface based on the concept of structural risk minimization. An SVM operates by placing hyperplanes within a multi-dimensional space so that they separate the classification categories. The essence of SVM lies in determining the optimal boundaries, represented by these hyperplanes, that separate the training dataset into distinct classes. If the decision boundaries are not determined optimally, new data may be misclassified. SVM gives precedence to the extreme data points, known as support vectors, when determining these boundaries. The support vectors define the hyperplane, which is chosen to maximize the margin, that is, the sum of the minimal distances from the hyperplane to the positive and negative data points. SVMs are versatile: they can address both regression and classification problems and can handle datasets with multiple continuous and categorical attributes. SVMs are effective in high-dimensional spaces, even when the number of dimensions is greater than the number of samples. When face recognition is posed as a two-class problem, the two training classes are: (1) the dissimilarities between images of the same individual, and (2) the dissimilarities between images of different people. The SVM model is trained on an image dataset with a chosen kernel and a chosen upper-bound value for the margin parameter. Once trained, the model yields a decision boundary or surface. During the testing phase, any samples that are falsely identified as positive are collected and then used as negative examples in the following training iterations. Incorporating these misclassified negative examples improves the model's accuracy in detecting ASD. In face recognition, the SVM decision boundary is used to gauge the degree of similarity between pairs of facial images, which is the basis for building face-recognition systems. SVM works well with small datasets and is also able to handle complex patterns and noisy data, [7].
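The role of the support vectors and the margin can be illustrated with a minimal sketch. It assumes a hypothetical 2-D toy dataset and a hand-picked separating hyperplane (not a fitted model and not data from any cited study); the function names are introduced here only for illustration. It finds the closest point of each class and adds their distances to the hyperplane.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def distance_to_hyperplane(w, b, x):
    """Geometric distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm

# Hypothetical toy data: two classes in a 2-D feature space.
positives = [(3.0, 3.0), (4.0, 3.0), (2.0, 2.0)]
negatives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Hand-picked separating hyperplane x1 + x2 - 2.5 = 0 (an assumption, not a trained SVM).
w, b = (1.0, 1.0), -2.5

# The closest point of each class plays the role of a support vector.
closest_pos = min(positives, key=lambda p: distance_to_hyperplane(w, b, p))
closest_neg = min(negatives, key=lambda p: distance_to_hyperplane(w, b, p))

# Margin = sum of the minimal distances on both sides of the hyperplane.
margin = (distance_to_hyperplane(w, b, closest_pos)
          + distance_to_hyperplane(w, b, closest_neg))
print(closest_pos, closest_neg, round(margin, 4))
```

An SVM trainer searches over w and b for the hyperplane that makes this margin as large as possible; the sketch only evaluates the margin for one fixed choice.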

2.2 Haar Cascade

The Haar Cascade (HC) algorithm was proposed by Paul Viola and Michael Jones. Haar Cascade is a machine learning approach in which a cascade function is trained on a large number of positive and negative data points. Here, the positive points are derived from regions showing the faces of children with ASD, while the negative points come from regions showing the faces of typically developing children, [8]. Haar Cascade uses Haar-like features extracted from digital images for object recognition. Each feature is defined over a rectangular section of an image that is further divided into multiple sub-rectangles, often illustrated as adjacent black and white rectangles. The value of each feature, which is used for training, is computed by subtracting the sum of the pixel values under the white rectangle from the sum under the black rectangle. Owing to this design, Haar Cascades are good at identifying facial features such as the eyes, nose, and mouth. Consequently, they can be used to distinguish between children with ASD and those without, [8]. Haar Cascade is a multi-stage classifier and a rapid detection framework because it reduces processing time substantially. It also achieves good accuracy and produces fewer false positives than a single-stage classifier, [9]. OpenCV provides a training method as well as a repository of pre-trained Haar Cascade classifiers, [10].
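The feature computation described above can be sketched in a few lines. The example below assumes a hypothetical 4x4 grayscale patch and a single horizontal two-rectangle feature; it uses an integral image (summed-area table), the device that makes rectangle sums cheap in the Viola-Jones framework. The function names are illustrative and do not correspond to the OpenCV API.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Two-rectangle feature: white half on the left, black half on the right.
    Value = sum under black rectangle - sum under white rectangle."""
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return black - white

# Hypothetical patch: dark left half, bright right half.
patch = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
ii = integral_image(patch)
feature_value = two_rect_feature(ii, 0, 0, 4, 4)
print(feature_value)  # 1600 - 80 = 1520
```

A large absolute value signals a strong intensity contrast between the two halves, which is the kind of cue a cascade stage thresholds on.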

2.3 Convolutional Neural Networks (CNN)

A convolutional neural network (CNN) is a type of neural network used in deep learning that contains convolutional layers. A CNN has two main types of hidden layers, convolutional layers and pooling layers, which are arranged alternately in the network. The CNN mimics neurons and their connections and has m × n neurons that are connected to neighboring layers. The connection weights are shared across the network; thus a CNN requires less training time. CNNs can be implemented in 1, 2, and 3 dimensions. A 1-dimensional (1D) CNN can recognize patterns in 1D signals such as time series; 1D-CNNs learn from both the feature values and the order of the features. In a 2-dimensional (2D) CNN, the kernel moves in two directions (x, y) and produces an output that is a 2D matrix. In a 3-dimensional (3D) CNN, joint spatial-spectral information is processed simultaneously, [11]. In digital images, pixel values are stored in a two-dimensional (2D) grid, i.e., a two-dimensional array. In the CNN method, a kernel is applied at every position of the image to extract features. CNNs are highly effective in image processing because they can extract features that may occur anywhere in the image. The output of one layer is passed to the next layer; hence the extracted features become progressively more complex from layer to layer. Training optimizes the kernel parameters by minimizing the difference between the outputs and the ground-truth labels, using backpropagation together with an optimization algorithm such as gradient descent, over multiple passes through the training dataset. The final trained CNN is used for predictions, [12].
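How a kernel is applied at every position of an image, and how a pooling layer follows a convolutional layer, can be sketched as below. The image and the kernel are hypothetical, the kernel weights are fixed by hand rather than learned, and the operation is the unflipped sliding product (cross-correlation) that deep learning libraries commonly call convolution.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over every valid position (stride 1, no padding)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0
            for dy in range(kh):
                for dx in range(kw):
                    acc += image[y + dy][x + dx] * kernel[dy][dx]
            row.append(acc)
        out.append(row)
    return out

def relu(feature_map):
    """Element-wise rectified linear activation."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool2x2(feature_map):
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, len(feature_map[0]) - 1, 2)]
            for y in range(0, len(feature_map) - 1, 2)]

# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-set 2x2 kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool2x2(feature_map)
print(feature_map)
print(pooled)
```

In a trained CNN the kernel values are the parameters adjusted by backpropagation; here they are fixed so that the response at the edge location is easy to follow.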

CNNs are very good at processing visual data, such as images and videos. A CNN can automatically learn features that capture complex visual variations by leveraging a large amount of training data. The CNN structure described in [13] consists of 12-net, 24-net, and 48-net structures. CNN gives better detection accuracy than Haar Cascade. However, for native mobile applications Haar Cascade is more suitable, while for hybrid applications CNN can be better.

2.4 Comparison of Models for SVM, HC, and CNN

The authors in [14] developed models using SVM, HC, and CNN to detect ASD from facial images. They formulate the SVM as an optimization problem and also specify the kernel structure and the kernel function types.
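For reference, the textbook soft-margin SVM problem and the kernel functions most commonly used with it can be written as follows. This is the standard formulation and is only assumed to correspond to the one used in [14], whose notation may differ; the symbols w, b, ξ_i, C, φ, γ, r, and d are introduced here for illustration.

```latex
\begin{align}
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  & \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{N} \xi_i \\
\text{subject to} \quad
  & y_i \left( \mathbf{w}^{\top} \phi(\mathbf{x}_i) + b \right) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0, \qquad i = 1, \dots, N
\end{align}

\begin{align}
K(\mathbf{x}_i, \mathbf{x}_j) &= \phi(\mathbf{x}_i)^{\top} \phi(\mathbf{x}_j) \\
\text{Linear:} \quad K(\mathbf{x}_i, \mathbf{x}_j) &= \mathbf{x}_i^{\top} \mathbf{x}_j \\
\text{Polynomial:} \quad K(\mathbf{x}_i, \mathbf{x}_j) &= \left( \gamma\, \mathbf{x}_i^{\top} \mathbf{x}_j + r \right)^{d} \\
\text{RBF:} \quad K(\mathbf{x}_i, \mathbf{x}_j) &= \exp\!\left( -\gamma \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2} \right) \\
\text{Sigmoid:} \quad K(\mathbf{x}_i, \mathbf{x}_j) &= \tanh\!\left( \gamma\, \mathbf{x}_i^{\top} \mathbf{x}_j + r \right)
\end{align}
```

Here y_i ∈ {−1, +1} is the class label of training sample x_i, the slack variables ξ_i allow some samples to violate the margin, and C is the upper-bound (regularization) parameter that trades margin width against training errors.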

They used images from openly available databases. The ASD image data was divided into training, validation, and test sets. The models were trained with 2536 images (1268 autistic and 1268 non-autistic) and validated with 100 images (50 autistic and 50 non-autistic). Finally, the trained models were tested with 300 images (150 autistic and 150 non-autistic). The dataset is summarized in Table 1.

Table 1. Number of samples for training and test data

| Attributes | Train | Valid | Test |
|---|---|---|---|
| Autism | 1268 | 100 | 300 |
| Non-Autism | 1268 | 100 | 300 |

The classifiers used for SVM, HC, and CNN are as follows.

1) The SVM model was trained using the “Kernel Regularization Function”.

2) The Haar Cascade was implemented using “Cascade Trainer GUI”.

3) The CNN model was implemented using “VGG16”.

The three models were developed using the steps shown in Figure 1.

Fig. 1: Model Development Steps

The accuracy of the trained SVM, HC, and CNN models is summarized in Table 2.

Table 2. Accuracy of models/algorithms, [14]

| Methods / Algorithms | Accuracy (%) |
|---|---|
| Support Vector Machines (SVM) | 65 |
| Haar Cascade (HC) | 72 |
| Convolutional Neural Networks (CNN) | 90 |

Table 2 shows that the CNN model has the highest accuracy, 90%, compared with 65% for SVM and 72% for HC. Among the three models, CNN is therefore the most reliable choice for ASD detection.

3 Use of Pre-Trained CNN Models

A pre-trained model refers to a neural network that

has undergone training on an extensive collection of

images. Such models can either be employed

directly or be fine-tuned using transfer learning to

tailor them for specific tasks. The essence of

transfer learning in image classification lies in its

presumption: if a model has been extensively

trained on a diverse and vast dataset, it's equipped to

handle unfamiliar visual content. Instead of

initiating training from scratch on large datasets, the

features learned by these models can be harnessed

directly or fine-tuned further. Transfer learning

offers the flexibility to repurpose these pre-

established models for diverse image classifications

and predictions, [15]. Over the years, a plethora of

pre-trained models, built on the backbone of

Convolutional Neural Networks (CNN), have

emerged, including but not limited to VGG1,

VGG16, VGG19, MobileNet, MobileNetV2,

Densenet, Inception V3, Resnet50, and Xception.

Each model boasts its unique architecture and set of

parameters. Developed over the previous decade,

these models have undergone various iterations and

enhancements to remain relevant and effective for

different imaging tasks, [16]. Researchers

leveraging transfer learning often draw from the

ImageNet dataset. For instance, when comparing

various models, MobileNetV2 was found to be more

parameter-efficient than its counterparts. In terms of accuracy, MobileNet achieved 70% to 89.5%, MobileNetV2 scored in the 71% to 90% range, both VGG16 and VGG19 ranged from 71% to 90%, while ResNet50 reached 74% to 92%. The study in [16] reported that MobileNetV2 performed relatively well compared with the other models while using less disk space and fewer parameters. The accuracy of the various models is summarized in Table 3.

Table 3. Accuracy of CNN pre-trained models, [16]

| Methods | Dataset | Accuracy (%) |
|---|---|---|
| MobileNet | ImageNet | 70 to 89 |
| MobileNetV2 | ImageNet | 71 to 90 |
| VGG16 | ImageNet | 71 to 90 |
| VGG19 | ImageNet | 71 to 90 |
| ResNet50 | ImageNet | 74 to 92 |

The authors in [17] used three deep learning models to detect ASD from facial images. They used a dataset of 2,940 face images, half of autistic children and half of non-autistic children, collected from various websites and Facebook pages. The dataset was openly available, and no privacy issue was reported. The dataset is summarized in Table 4.

Table 4. Number of samples for Training and Test data

| Attributes | Train | Valid | Test |
|---|---|---|---|
| Autism | 1270 | 100 | 300 |
| Non-Autism | 1270 | 100 | 300 |

The research focused on three pre-trained models for ASD detection from facial images: NASNetMobile, VGG19, and Xception. The empirical results of these models are given in Table 5. The Xception model attained the highest accuracy, 91%, [17].

Table 5. Accuracy of Models/Algorithms, [17]

| Methods / Algorithms | Accuracy (%) |
|---|---|
| NASNetMobile | 75 to 82 |
| VGG19 | 65 to 78 |
| Xception | 70 to 91 |

Using the YOLOv8 model on a Kaggle dataset, Subhash and team achieved 89.6% accuracy in the classification of ASD, with an F1-score of 0.89, [18]. Accuracy is consistently high for the majority of the models. The accuracies in Table 2, Table 3, and Table 5 indicate that deep learning models are better suited to ASD detection than traditional machine learning models. Table 3 and Table 5 further indicate that there is no need to develop new models from scratch; instead, pre-trained models can be used for ASD detection with some retraining.
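Since the studies above are compared on accuracy and F1-score, a short sketch of how both are derived from a binary confusion matrix may be useful. The counts below are hypothetical values for a balanced 300-image test set and are not taken from any cited study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted ASD cases that are correct
    recall = tp / (tp + fn)      # share of true ASD cases that are detected
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 150 ASD images (140 detected, 10 missed)
# and 150 non-ASD images (130 correct, 20 false alarms).
metrics = classification_metrics(tp=140, fp=20, tn=130, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone hides the balance between missed cases and false alarms, which is why F1 is often reported alongside it for screening tasks.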

4 Use of Radiomics

Radiomics involves the precise measurement of characteristics in medical imaging modalities such as MRI (Magnetic Resonance Imaging), CT (Computed Tomography), and PET (Positron Emission Tomography). In this context, MRI plays a crucial role for medical professionals in the accurate diagnosis of Autism Spectrum Disorder (ASD). MRI techniques are divided into functional (fMRI) and structural (sMRI) imaging. However, diagnosing ASD through these MRI techniques can be tedious and time-intensive, [19]. AI-based tools can aid specialists, and machine learning (ML) and deep learning (DL) techniques are increasingly being employed to analyze MRI data for ASD diagnosis. The radiomics workflow proceeds as follows. It starts by identifying and marking the region of interest (ROI) in 2D or the volume of interest (VOI) in 3D, that is, the areas from which radiomic features will be taken. The next phase is image segmentation, which can be performed manually, semi-automatically with methods such as region-growing or thresholding algorithms, or automatically with deep learning algorithms, [20]. Next, the images are processed so that they are homogenized with respect to pixel spacing, grey-level intensities, bins of the grey-level histogram, etc., before radiomic features are extracted, [20]. Feature extraction is the calculation of features, where feature descriptors are used to quantify characteristics of the grey levels within the ROI/VOI, [21]. Not all radiomic features are useful for model development; hence non-reproducible, redundant, and non-relevant features are removed from the feature list. This step is known as dimension reduction, and it may be a multi-step process, [20]. The step-by-step process is shown in Figure 2.

Fig. 2: The Radiomics Pipeline
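Two steps of this pipeline, semi-automatic segmentation by region growing and first-order feature extraction from the resulting ROI, can be sketched as below. The image, seed, tolerance, and binning range are hypothetical, and real radiomics toolkits compute many more features; this is an illustration under those assumptions rather than the method of any cited study.

```python
import math
from collections import deque

def region_grow(image, seed, tolerance):
    """Collect 4-connected pixels whose intensity is within `tolerance` of the seed pixel."""
    h, w = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    roi, queue = {seed}, deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in roi
                    and abs(image[ny][nx] - seed_value) <= tolerance):
                roi.add((ny, nx))
                queue.append((ny, nx))
    return roi

def first_order_features(image, roi, n_bins, lo, hi):
    """Mean intensity and histogram entropy of the ROI after grey-level binning."""
    values = [image[y][x] for y, x in roi]
    mean = sum(values) / len(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    probs = [c / len(values) for c in counts if c]
    entropy = -sum(p * math.log2(p) for p in probs)
    return {"mean": mean, "entropy": entropy}

# Hypothetical 4x4 slice: a bright structure on a dark background.
image = [
    [10, 12, 11, 90],
    [11, 80, 85, 95],
    [12, 82, 88, 92],
    [10, 11, 13, 12],
]
roi = region_grow(image, seed=(1, 1), tolerance=15)
features = first_order_features(image, roi, n_bins=4, lo=64, hi=96)
print(sorted(roi))
print(features)
```

Fixing the bin range and bin count before computing the histogram is one simple form of the homogenization step: it makes the entropy comparable across images.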

4.1 Cortical Thickness and Support Vector Machine

ASD is linked to atypical development of certain brain regions during the initial years of life. MRI scans are valuable tools for detecting these developmental deviations in the brain. Identifying specific markers in brain images associated with autism is crucial for understanding the underlying causes of the condition, [22]. Notably, individuals with autism exhibit increased cortical thickness, [22]. In one study, 76 children, 40 diagnosed with ASD and 36 neurotypical, underwent MRI scans. The T1-MPRAGE sequences were analyzed to extract features of regions of interest, and the average cortical thickness was measured for each ROI. The extracted features were used as input to an SVM classifier to detect children with autism. The best accuracy, 84%, was achieved by concatenating the gray matter thickness of eight ROIs, [22]. Neuroimaging studies have also revealed a connection between human cognitive abilities and specific brain structures, particularly the thickness of the cerebral cortex: general intelligence correlates positively with cortical thickness in various areas of the association cortex spanning both hemispheres of the brain, [23].

4.2 rsfMRI Data and Graph CNN

The study in [24] analyzed a dataset comprising 539 ASD subjects and 573 neurotypical individuals. The dataset included both sMRI and rsfMRI scans of each participant, accompanied by various attributes: scan location, participant's gender, age at the time of scanning, hand dominance, and scores from multiple tests, among other factors. The data required preprocessing before it could be used to construct a model. Given the inherent variability in brain size and structure across individuals, it is essential to ensure consistency across brain images during feature extraction or segmentation. This means that a specific point in one brain image should correspond to the same anatomical location in another. Discrepancies in image sizes can hinder the neural network's ability to discern patterns based on individual brain structures. To counteract this, all brain images are standardized to a uniform shape and size using a predefined template. This standardization improves the neural network's learning efficiency and mitigates potential distortions, [25]. A Graph CNN combining temporal graph convolution and adjacency convolution layers was used to train the model, which reached 70% accuracy, [24].

4.3 rsfMRI Data and 3D-CNN

The authors in [26] used rsfMRI data from the ABIDE-I dataset and applied a 3D CNN model for ASD prediction. Their preprocessing of the ABIDE-I data comprised slice-timing correction, motion correction, global mean intensity normalization, and alignment of the functional data to MNI space at a 3x3x3 mm resolution. They then extracted the time series of regions of interest (ROIs) using seven atlases: Harvard-Oxford (HO), Craddock 200 (CC200), Eickhoff-Zilles (EZ), Talairach and Tournoux (TT), Dosenbach 160 (DOS160), Automated Anatomical Labelling (AAL), and Craddock 400 (CC400). Using a CNN with 10-fold cross-validation, they reported an accuracy of approximately 73%. In a different study, [27], the authors suggested integrating phenotypic data with rsfMRI information. The phenotypic data covered age, gender, hand dominance, overall IQ, and eye status during the fMRI scan (whether the eyes were open or closed per the imaging protocol). They introduced six techniques to combine the phenotypic and fMRI data in one network. Their model fed rsfMRI time-series inputs into an LSTM-based architecture, which achieved an accuracy of 70% on the ABIDE dataset.
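The 10-fold cross-validation protocol mentioned above can be sketched with a small index splitter. The sample count is hypothetical, model fitting is omitted, and in practice subjects would usually be shuffled or stratified by class and site before splitting; none of this is taken from the cited implementations.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is held out exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Hypothetical cohort of 25 subjects split into 10 folds.
folds = list(k_fold_indices(25, 10))
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(fold_number, len(train_idx), test_idx)
```

The reported accuracy of such a protocol is the mean of the per-fold test accuracies, so each subject contributes to the test score once and to training in the other nine folds.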

Meanwhile, the authors of [28] refined the preprocessing of the ABIDE-I dataset and the training of a CNN model, aiming to improve the accuracy of fMRI-based autism detection. Their methodology used a two-phase process to generate 3D data.

1) Time-series data were generated using three different atlases: AAL, Dosenbach, and CC200. Connectivity matrices were then derived using three methods of estimating connectivity: the correlation approach, the covariance approach, and the tangent space embedding technique. Combining the three atlases with the three connectivity methods gave a total of nine base matrices.

2) For each subject, they built enhanced 3D matrices by distinguishing between high-weight and low-weight connections, using both the maximum spanning tree and the minimum spanning tree, which resulted in nine enhanced matrices.
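The two phases above can be illustrated with a minimal sketch that builds a correlation-based connectivity matrix from ROI time series and then extracts a maximum spanning tree of the strongest connections. The four time series are hypothetical, raw correlation values are used as edge weights, and Kruskal's algorithm is chosen here for illustration; the exact procedure in [28] may differ.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long time series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(series):
    """Connectivity matrix: one row and one column per ROI."""
    n = len(series)
    return [[pearson(series[i], series[j]) for j in range(n)] for i in range(n)]

def maximum_spanning_tree(weights):
    """Kruskal's algorithm on a dense symmetric matrix, strongest edges first."""
    n = len(weights)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted(((weights[i][j], i, j)
                    for i in range(n) for j in range(i + 1, n)), reverse=True)
    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:          # keep the edge only if it joins two components
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree

# Hypothetical time series for four ROIs.
roi_series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with ROI 0
    [5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with ROI 0
    [1.0, 3.0, 2.0, 5.0, 4.0],
]
connectivity = correlation_matrix(roi_series)
tree = maximum_spanning_tree(connectivity)
print(tree)
```

A minimum spanning tree, which picks out the weakest connections instead, can be obtained with the same function by negating every entry of the matrix before calling it.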

These 3D matrices were then fed into seven deep learning architectures: ResNet152V2, Inception, ResNet50, InceptionResNet, Xception, VGG19, and VGG16, each pre-trained with ImageNet weights. By combining these seven CNN architectures with the different matrix types, they devised 126 classification strategies (7 architectures × 18 matrix types, counting the nine base and nine enhanced matrices). Over two-thirds of these strategies achieved an accuracy above 70%. Adding a dropout layer to the transfer learning architectures and using cross-validation improved the models' robustness and reduced the risk of over-fitting. Among the results, ResNet152V2 stood out, reaching a peak accuracy of 91% when paired with tangent-enhanced matrices across all atlases. Notably, in every enhancement strategy, ResNet152V2 consistently outperformed the other models. The techniques applied to the MRI data and their respective accuracies are listed in Table 6.

Table 6. Accuracy of Models/Algorithms for MRI data

| Methods / Algorithms | Highest Accuracy (%) |
|---|---|
| MRI (Cortical Thickness) and Support Vector Machine, [22] | 84 |
| rsfMRI and Graph CNN, [24] | 70 |
| rsfMRI and 3D-CNN, [26] | 73 |
| rsfMRI along with phenotypic data and LSTM, [27] | 70 |
| rsfMRI Data and 3D-CNN ResNet152V2, [28] | 96 |

5 Results and Discussion

In this work, we explored several artificial intelligence methods, covering machine learning and deep learning, for classifying ASD and neurotypical subjects. The data types used are face images of children and MRI data. A total of sixteen algorithms are studied in this paper, and their accuracy is presented in Table 7. The highest reported accuracy ranges from 65% to 96%.

The accuracy of the support vector machine and Haar Cascade on face images is lower than that of the deep learning models. However, for MRI data, deep learning models have not performed very well, except for the ResNet model. The pre-trained models gave good results for face image data, with a highest accuracy of 92%. For MRI data, the highest accuracy achieved is 96%, which is also the highest in this study. For face image data, the accuracy of deep learning models is consistent, whereas MRI data gave inconsistent results. To improve the accuracy of deep learning models, more samples covering various geographies of the world need to be added. Better pre-trained models are required for easy implementation of systems with better accuracy. It is also necessary to develop models suited to the mobile apps that field workers need to detect autism. Such mobile applications will also help parents with early detection of ASD. Similarly, decision support systems can be developed using face images and MRI data to support physicians in ASD detection.

Table 7. Summary of Accuracy of Models/Algorithms

| Methods / Algorithms | Data Type | Highest Accuracy (%) |
|---|---|---|
| Support Vector Machines (SVM), [14] | Face Image | 65 |
| Haar Cascade (HC), [14] | Face Image | 72 |
| Convolutional Neural Networks (CNN), [14] | Face Image | 90 |
| MobileNet, [16] | Face Image | 89 |
| MobileNetV2, [16] | Face Image | 90 |
| VGG16, [16] | Face Image | 90 |
| VGG19, [16] | Face Image | 90 |
| ResNet50, [16] | Face Image | 92 |
| NASNetMobile, [17] | Face Image | 82 |
| VGG19, [17] | Face Image | 78 |
| Xception, [17] | Face Image | 91 |
| Support Vector Machine, [22] | MRI (Cortical Thickness) | 84 |
| Graph CNN, [24] | rsfMRI | 70 |
| 3D-CNN, [26] | rsfMRI | 73 |
| LSTM, [27] | rsfMRI along with phenotypic data | 70 |
| 3D-CNN ResNet152V2, [28] | rsfMRI | 96 |

6 Application of Proposed Study

In the 21st century, most organizations were initially unaware of the power of IT; at that time, IT departments were limited to software handling, and the importance of digital data was not recognized, [29]. As the number of applications has grown, the data they generate needs further preprocessing. The data may be characterized by volume, complexity, variation, and specificity, and these characteristics shape the formulation of an application model, [30]. The proposed study helps build a good understanding of supervised and deep learning applications in autism. Various deep learning and supervised learning models for autism detection have been discussed, and each application is an important part of autism detection. Most of the data that has been used for the detection of autism is MRI scan data. Convolutional Neural Networks (CNN), MobileNet, VGG, ResNet, NASNetMobile, Xception, and 3D-CNN are supervised and deep learning models that can accept MRI scan data for the detection of autism, but these techniques are costly. Parents of an autistic baby in a rural area will not benefit from such a system because MRI scanning is not available to them. Detecting autism at an early stage helps in reducing autism symptoms. For the above models, MRI scan data of the brain is a primary requirement, whereas a brain MRI scan is not easy to recommend for a baby. With the rapid growth of social media, massive digital data has been generated that is very useful because of the large number of participating individuals. Many applications have been developed from this data in various domains using NLP techniques and machine learning models, [31]. Many parents of autistic babies use social sites to share their experiences with autism. These statements from parents can be a good source for developing autism detection applications, and any parent from any area can participate in such applications. The detection of autism from parents' experiences is our future research work.

7 Conclusion

This paper provides a thorough examination of

various artificial intelligence (AI) methodologies,

particularly machine learning and deep learning

techniques, in the context of Autism Spectrum

Disorder (ASD) detection through face images and

MRI data. The early identification of ASD is critical

due to its diverse nature, and AI presents a

promising avenue for enhancing early detection

accuracy. Among the techniques evaluated using

facial images, deep learning models, especially

Convolutional Neural Networks (CNN),

consistently outperformed traditional machine

learning methods like Support Vector Machines

(SVM) and Haar Cascade. Pre-trained models on

face images, such as ResNet50, achieved high

accuracies, indicating their potential utility for

practical applications. The utilization of pre-trained

models for image classification, such as VGG16,

VGG19, and MobileNetV2, yielded substantial

accuracy, emphasizing the potential of leveraging

existing architectures and applying transfer learning

for ASD detection. Given the high accuracy of

certain models, there is an opportunity to develop

mobile applications for field workers and parents for

early ASD detection. Such applications can play a

pivotal role in facilitating timely interventions. This

paper underscores the need for more extensive

datasets that cover diverse global populations. This

would ensure the generalizability of the models.

Furthermore, there is a call to develop better pre-

trained models and systems optimized for mobile

devices, enabling broader accessibility and use. In

essence, AI, especially deep learning, offers

promising tools for enhancing the accuracy and

timeliness of ASD detection. With further research

and development, these tools can be refined and

made widely accessible, ensuring early and effective

interventions for individuals with ASD. The

exponential increase in social media usage has led to

the creation of a vast amount of digital data,

enriched by the diverse contributions of its users.

This data trove has become a cornerstone for the

development of numerous applications in various

fields, leveraging Natural Language Processing

(NLP) techniques and advanced machine learning

models. Significantly, parents of children with

autism are increasingly using social media platforms

to share their personal experiences and challenges.

These firsthand accounts are invaluable, offering a

rich resource for developing applications aimed at

detecting autism. Such applications have the

potential to be universally accessible, allowing

parents from any location to participate and

contribute. The exploration of autism detection

through the analysis of parents' shared experiences

on social media is a key area of our future research

endeavors.

Acknowledgement:

The authors extend their appreciation to the

Manipur International University, Imphal, India for

supporting this research work on Autism.

References:

[1] Autism Spectrum Disorder, Mental Health

Information, 2023.

[2] Roger N. Rosenberg, Juan M. Pascual,

Rosenberg's Molecular and Genetic Basis of

Neurological and Psychiatric Disease,

Academic Press, 2015, pp. 1401-1424.

[3] Diagnostic and statistical manual of mental

disorders: DSM-5, American Psychiatric

Association, 5th edition, 2022.

[4] Catherine Lord et al., “Autism From 2 to 9

Years of Age”, Arch Gen Psychiatry. vol.

63(6), 2006, pp. 694–701.

[5] “What are the treatments for autism?”

National Institute of Mental Health, 2023.

[6] Zeyad A. T. Ahmed and Mukti E. Jadhav, A

Review of Early Detection of Autism Based

on Eye-Tracking and Sensing

Technology, International Conference on

Inventive Computation Technologies (ICICT),

Coimbatore, India, 2020, pp. 160-166.

[7] P. Jonathon Phillips, Support Vector

Machines Applied to Face Recognition,

Neural Information Processing, 1998, pp. 1-7.

[8] Zeyad A. T. Ahmed et al., Facial Features

Detection System To Identify Children With

Autism Spectrum Disorder: Deep Learning

Models, Computational and Mathematical

Methods in Medicine, vol. 2022, 2022, pp. 1-9.

[9] Rainer Lienhart and Jochen Maydt, An

extended set of Haar-like features for rapid

object detection, International Conference on

Image Processing, Rochester, 2002, pp. I-I.

[10] Cascade Classifier Training, Open Source

Computer Vision, 2023.

[11] S. Ghaderizadeh, D. Abbasi-Moghadam, A.

Sharifi, N. Zhao and A. Tariq, Hyperspectral

Image Classification Using a Hybrid 3D-2D

Convolutional Neural Networks, IEEE

Journal of Selected Topics in Applied Earth

Observations and Remote Sensing, vol. 14,

2021, pp. 7570-7588.

[12] Rikiya Yamashita, Mizuho Nishio, Richard

Kinh Gian Do & Kaori Togashi,

Convolutional neural networks: an overview

and application in radiology, Insights

Imaging, vol. 9, 2018, pp. 611–629.

[13] Shivkaran Ravidas and M. A. Ansari, Deep

learning for pose-invariant face detection in

unconstrained environment, International

Journal of Electrical and Computer

Engineering (IJECE), vol. 9, no. 1, 2019, pp.

577-584.

[14] Srividhya Ganesan, Raju, J. Senthil,

Prediction of Autism Spectrum Disorder by

Facial Recognition Using Machine Learning,

Information Retrieval and Web Search,

September, vol. 18, 2021, pp. 406-417.

[15] “Transfer learning and fine-tuning”,

TensorFlow Core, 2023.

[16] J Praveen Gujjar, H R Prasanna Kumar,

Niranjan N. Chiplunkar, Image classification

and prediction using transfer learning in colab

notebook, Global Transitions Proceedings,

vol. 2(2), 2021, pp. 382-385.

[17] Fawaz Waselallah Alsaade and Mohammed

Saeed Alzahrani, Classification and Detection

of Autism Spectrum Disorder Based on Deep

Learning Algorithms, Computational

Intelligence and Neuroscience, vol. 2022,

2022, pp. 1-10.

[18] Subash Gautam, Prabin Sharma, Kisan Thapa,

Mala Deep Upadhaya, Dikshya Thapa, Salik

Ram Khanal, Vítor Manuel de Jesus Filipe,

Screening Autism Spectrum Disorder in

children using Deep Learning Approach:

Evaluating the classification model of

YOLOv8 by comparing with other models,

Computer Vision and Pattern Recognition,

2023, pp. 1-15.

[19] Parisa Moridian et al., Automatic autism

spectrum disorder detection using artificial

intelligence methods with MRI neuroimaging:

A review, Frontiers in Molecular

Neuroscience, vol. 15, 2022, pp. 1-32.

[20] Janita E. van Timmeren, Davide Cester,

Stephanie Tanadini-Lang, Hatem Alkadhi and

Bettina Baessler, Radiomics in medical

imaging: a how-to guide and critical

reflection, Insights Imaging, vol. 11, 2020, pp.

1-16.

[21] Alex Zwanenburg, Stefan Leger, Martin

Vallières, Steffen Löck, Image biomarker

standardisation initiative, Computer Vision

and Pattern Recognition, 2019, pp. 1-160.

[22] Letizia Squarcina et al., Automatic

classification of autism spectrum disorder in

children using cortical thickness and support

vector machine, Brain and Behavior, vol.

11(8), 2021, pp. 1-9.

[23] Kyle Menary et al., Associations between

cortical thickness and general intelligence in

children, adolescents and young adults,

Intelligence, vol. 41(5), 2013, pp.

597-606.

[24] Saloni Mahendra Jain, Detection of Autism

using Magnetic Resonance Imaging data and

Graph Convolutional Neural Networks,

Thesis, Rochester Institute of Technology,

2018.

[25] Spatial normalization, 2023, Wikipedia,

[Online].

https://en.wikipedia.org/wiki/Spatial_normalization

(Accessed Date: November 30, 2023).

[26] Meenakshi Khosla, Keith Jamison, Amy

Kuceyeski, Mert Sabuncu, 3D Convolutional

Neural Network for Classification of

Functional Connectomes, Deep Learning in

Medical Image Analysis and Multimodal

Learning for Clinical Decision Support, 2018,

pp. 1-10.

[27] Nicha C. Dvornek, Pamela Ventola, and

James S. Duncan, Combining Phenotypic and

Resting-State fMRI Data for Autism

Classification with Recurrent Neural

Networks, IEEE International Symposium on

Biomedical Imaging, 2018, pp. 725-728.

[28] Fatima Zahra Benabdallah, Ahmed Drissi El

Maliani, Dounia Lotfi, and Mohammed El

Hassouni, A Convolutional Neural Network-

Based Connectivity Enhancement Approach

for Autism Spectrum Disorder Detection,

Journal of Imaging, vol. 9(6), 2023, pp. 1-12.

[29] Bentolhoda Abdollahbeigi, Farhang Salehi,

"A Study of Information Technology

Governance Initiatives On Organizational

Performance", WSEAS Transactions on

Computers, vol. 20, 2021, pp. 39-48.

[30] Stella Vetova, "Big Data Integration and

Processing Model", WSEAS Transactions on

Computers, vol. 20, 2021, pp. 82-87.

[31] Kristofferson Culmer, Jeffrey Uhlmann,

Examining LDA2Vec and Tweet Pooling for

Topic Modeling on Twitter Data, WSEAS

Transactions on Information Science and

Applications, vol. 18, 2021, pp. 102-115.

Contribution of Individual Authors to the

Creation of a Scientific Article (Ghostwriting

Policy)

The authors contributed equally to the present

research, at all stages from the formulation of the

problem to the final findings and solution.

Sources of Funding for Research Presented in a

Scientific Article or Scientific Article Itself

No funding was received for conducting this study.

Conflict of Interest

The authors have no conflicts of interest to declare.

Creative Commons Attribution License 4.0

(Attribution 4.0 International, CC BY 4.0)

This article is published under the terms of the

Creative Commons Attribution License 4.0

https://creativecommons.org/licenses/by/4.0/deed.en_US
