Examination of AI Algorithms for Image and MRI-based Autism
Detection
PRASENJIT MUKHERJEE1,2, GOKUL R. S.1, MANISH GODSE3
1Department of Technology,
Vodafone Intelligent Solutions,
Pune,
INDIA
2Department of Computer Science,
Manipur International University,
Manipur,
INDIA
3Department of IT,
Bizamica Software,
Pune,
INDIA
Abstract: - Precise identification of autism spectrum disorder (ASD) is a challenging task due to the
heterogeneity of ASD. Early diagnosis and interventions have positive effects on treatment and later skills
development. Hence, it is necessary to provide families and communities with the resources, training, and tools
required to diagnose and help patients. Recent work has shown that artificial intelligence-based methods are
suitable for the identification of ASD. AI-based tools can be good resources for parents for early detection of
ASD in their kids. Advanced AI-based tools are also helpful for health workers and physicians in detecting ASD.
Facial images and MRI scans are among the best sources for understanding ASD symptoms, and hence serve as inputs for AI-based
model training. The trained models are used to classify ASD patients and typically developing kids. Deep
learning models are found to be very accurate in ASD detection. In this paper, we present a comprehensive
study of AI techniques like machine learning, image processing, and deep learning, and their accuracy when
these techniques are used on facial and MRI images of ASD and normally developed kids.
Key-Words: - ASD, Autism Detection, Machine Learning, Image Processing, Deep Learning, Support Vector
Machine, Haar Cascade, CNN, 3D-CNN.
Received: July 15, 2023. Revised: August 29, 2023. Accepted: October 11, 2023. Published: November 30, 2023.
1 Introduction
Autism spectrum disorder (ASD) is a neurological
and developmental disorder affecting the interaction
of patients with others. ASD patients have difficulty
in communication, learning, and behaviors. Autism
symptoms generally appear at an early stage in kids,
when they are two years old. At that age, kids are
not in a position to talk about their difficulties with
their parents. However, parents can play a role in
detecting autism in kids if they are aware of it.
Parents have to observe their kids and talk to
doctors about the development of kids, [1]. ASD
affects patients and their families financially and
emotionally. Continued care of a
patient also places a physical burden on family
caretakers over the individual's lifespan. It also
strains the healthcare systems of local and federal
agencies, as they have to support patients medically and
financially throughout their lifespan.
Consequently, continuous research is required to
find better ASD-specific interventions and better
ways to enable families and communities with
resources, training, and tools required to diagnose
and help patients, [2]. About 1% of the
world's population has autism, [3]. Hence serious attention is
required for the detection of ASD and the support of
ASD patients. Clinical diagnoses have been
found to be highly stable
between ages 2 and 3 years, [4]. Thus, early
diagnosis and interventions during preschool or
WSEAS TRANSACTIONS on COMPUTERS
DOI: 10.37394/23205.2023.22.28
Prasenjit Mukherjee, Gokul R. S., Manish Godse
E-ISSN: 2224-2872
243
Volume 22, 2023
before have major positive effects
on treatment and later skills development, [5]. Kids
spend most of their time with their parents; hence it is
better to train parents and provide them with tools so that they
can observe their kids and report ASD-related
symptoms to doctors for further investigation and
diagnosis. Artificial intelligence (AI) has the
potential to play a big role in developing interactive
systems to assist in autism detection using machine
and deep learning. The data points required to
develop these systems are images of different types
covering the brain and face. These systems are
useful to parents, healthcare workers, and doctors.
The data used to develop models for AI-based
systems are facial features, facial landmarks, facial
expressions, brain MRI, electroencephalogram
(EEG) signals, eye tracking, and eye contact, [6].
Classification and clustering approaches are
common to detect ASD. Additional data required
can be captured using questionnaires.
This article presents a comprehensive study of
existing state-of-the-art artificial
intelligence-based models for ASD
detection. The data used in the models comes from
open sources and includes facial images as well as MRI
images. It then provides a comparison of different
models and discusses research gaps and potential
areas that should be explored in the future to make
further progress in this field. It also suggests the
potential applications of these approaches for parents
of ASD kids and physicians.
2 Use of Face Recognition in Autism
Detection
ASD is a neurodevelopmental disorder that can
affect the physical appearance, especially the
face, of children. The facial features of
ASD children are distinctively different from those of
typically developing children; hence facial features
are useful for identifying ASD.
The complexity in face detection arises because of
1) The large visual difference between human faces
in the cluttered background of images, that is,
extreme illuminations and exaggerated
expressions can lead to large differences in the
visual appearance of the face
2) The large search space for probable face size and
position.
2.1 Support Vector Machine
The support vector machine (SVM) is a
supervised binary classification method that finds the
optimal linear decision surface based on the concept
of structural risk minimization. SVMs operate by
delineating
hyperplanes within a multi-dimensional space,
effectively segregating different classification
categories. The essence of SVM lies in determining
optimal boundaries, represented by these
hyperplanes, which segregate the training dataset
into distinct classes. In instances where the decision
boundaries are not optimally determined, there's a
potential risk of misclassifying new data. SVM
gives precedence to extreme data points, known as
support vectors, to ascertain these boundaries. These
support vectors are pivotal in defining the
hyperplane, which is chosen to maximize the margin,
that is, the sum of the minimal distances from the
hyperplane to the positive and negative data
points. SVMs are versatile and can address both
regression and classification challenges, effectively
managing datasets with multiple continuous and
categorical attributes. SVMs are effective in high-
dimensional spaces, even when the number of
dimensions is greater than the number of samples.
In face recognition posed as a two-class problem,
the two classes used for training are: (1) the
dissimilarities between images of the same
individual, and (2) the dissimilarities between images of
different people. The SVM model is trained using
an image dataset, taking into consideration the
kernel and the values for the upper bound margin.
Once the model is trained, it generates a decision
boundary or surface. During the testing phase, any
samples that are falsely identified as positive are
cataloged and then utilized as negative examples in
the following training iterations. By incorporating
these negative examples, particularly those from
misclassified categories, the model's accuracy in
detecting ASD is enhanced. In the realm of face
recognition, SVM evaluates the decision boundary
to gauge the degree of similarity between pairs of
facial images. This evaluation process paves the
way for the development of sophisticated face-
recognition systems. SVM works well with small
datasets, and it is also able to handle complex
patterns and noisy data, [7].
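The idea of a maximum-margin hyperplane described above can be illustrated with a minimal sketch of a linear SVM trained by subgradient descent on the hinge loss. This is not the implementation used in [7] or [14]; the function names, the toy 2D points, and the values of `lam`, `lr`, and `epochs` are assumptions chosen only for illustration.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.01, epochs=500):
    """Fit w and b by subgradient descent on the L2-regularized hinge loss.

    samples: list of feature vectors, labels: list of +1 / -1.
    lam, lr and epochs are illustrative values, not tuned settings.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            # Regularization step: shrink w toward zero (keeps the margin wide).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: only points inside the margin move the boundary.
            if y * score < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data (two features per sample).
X = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.0], [0.0, 0.0], [-1.0, 0.5], [0.5, -1.0]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
train_predictions = [predict(w, b, x) for x in X]
```

In a real face-image pipeline the feature vectors would come from image descriptors rather than hand-written points, and a kernel SVM from an established library would normally be preferred over this sketch.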
2.2 Haar Cascade
The Haar Cascade (HC) algorithm was proposed by
Paul Viola and Michael Jones. Haar Cascade is
grounded in machine learning principles, where the
cascade function is trained using a large number
of positive and negative data points. The positive
points are derived from regions showcasing the
faces of children with ASD, while the negative data
points are sourced from regions depicting the faces
of typically developing children, [8]. The essence of
Haar Cascade lies in its utilization of Haar-like
features extracted from digital images to facilitate
object recognition. These features are characterized
by specific rectangular sections of an image, which
are then further segmented into multiple sections.
Often, these features are illustrated as juxtaposed
black-and-white rectangles. The value of each
feature, crucial for training, is computed by
subtracting the sum of pixel values underneath the
white rectangle from those beneath the black
rectangle. Owing to its design, a Haar
Cascade is well suited to identifying facial features like
eyes, nose, and mouth. Consequently, it can be used
to distinguish between children with ASD
and those without, [8]. Haar Cascade is a multi-stage
classifier and a rapid detection framework, as it
reduces the processing time substantially. It also
achieves good accuracy and fewer
false positives than a single-stage classifier,
[9]. OpenCV provides both a training method and a
repository of pre-trained Haar Cascades, [10].
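The feature computation described above (sum under the black rectangle minus sum under the white rectangle) is usually made fast with an integral image, so that any rectangle sum costs four lookups. The sketch below is a plain-Python illustration under that assumption; the function names and the 4 x 4 toy image are hypothetical and are not taken from [8] or from OpenCV.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar_feature(ii, top, left, height, width):
    """Two-rectangle feature: left half is 'white', right half is 'black'.

    The value is (sum under black) minus (sum under white).
    """
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return black - white


# Toy 4 x 4 grayscale patch: dark left half, bright right half.
patch = [[10, 10, 50, 50] for _ in range(4)]
ii = integral_image(patch)
feature_value = two_rect_haar_feature(ii, 0, 0, 4, 4)
```

A large absolute feature value signals a strong vertical edge inside the window, which is the kind of contrast pattern a cascade stage thresholds on.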
2.3 Convolutional Neural Networks (CNN)
A convolutional neural network (CNN) is a type of
neural network used in deep learning that contains
convolutional layers. A CNN has two main types of
hidden layers: convolutional layers and pooling layers.
These layers are arranged alternately in the network.
The CNN mimics neurons and their connections and
has m × n neurons that are connected to neighboring
layers. The connection weights are shared in the
network, which reduces the number of parameters
and hence the training time. CNNs can be implemented in 1, 2, and 3
dimensions. A 1-dimensional (1D) CNN can
recognize patterns in 1D signals such as time
series. 1D-CNNs can learn from feature values
and the order of the features. In a 2-dimensional
CNN, the kernel moves in two directions (x, y)
and calculates the output, which is a 2D matrix. In
a 3-dimensional CNN, joint spatial-spectral
information is processed simultaneously, [11]. In
digital images, pixel values are stored in a two-
dimensional (2D) grid, i.e., a two-dimensional array.
In the CNN method, a kernel is applied to every
position of the image to extract features. CNN is
highly effective in image processing as it can extract
features that may occur anywhere in the image. In
a CNN, the output of one layer is passed to the next
layer; hence the extracted features can hierarchically
and progressively become more complex.
Training on the dataset optimizes the
parameters of the kernels so as to minimize the
difference between outputs and ground truth labels
using backpropagation and
gradient descent. The final
optimized and trained CNN is used for predictions,
[12].
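As a minimal sketch of how a kernel is applied at every position of a 2D image, the code below implements a "valid" 2D convolution (in the cross-correlation form used by most deep learning frameworks), followed by ReLU and 2 x 2 max pooling. The kernel values and the toy image are assumptions for illustration; in a trained CNN the kernel values are learned.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over every position where it fits entirely."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
    return out


def relu(feature_map):
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2; an odd trailing row or column is dropped."""
    h, w = len(feature_map) // 2, len(feature_map[0]) // 2
    return [
        [
            max(
                feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1],
            )
            for c in range(w)
        ]
        for r in range(h)
    ]


# Toy image with a vertical edge, and a hand-written vertical-edge kernel.
image = [[0, 0, 1, 1] for _ in range(4)]
edge_kernel = [[-1, 1], [-1, 1]]
feature_map = relu(conv2d_valid(image, edge_kernel))
pooled = max_pool_2x2(feature_map)
```

The feature map responds only in the column where the edge sits, regardless of the row, which illustrates why the same kernel can detect a feature anywhere in the image.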
CNN is very good at processing visual data, such as images
and videos. CNN can automatically learn features to
capture complex visual variations by leveraging a
large amount of training data. The CNN cascade structure
consists of 12-net, 24-net, and 48-net
structures, [13]. CNN gives better detection
accuracy than Haar Cascade. However, for
native mobile applications, Haar Cascade is more
suitable, while for hybrid applications CNN can be the
better choice.
2.4 Comparison of Models for SVM, HC,
and CNN
The authors in, [14], developed models using SVM,
HC, and CNN to detect ASD using facial images.
They formulate the SVM mathematical
model as an optimization problem and also
specify the kernel structure and the kernel
function types, [14].
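For orientation, the textbook soft-margin SVM optimization problem and the most commonly used kernel functions are sketched below. This is the standard formulation and is an assumption here: the notation (w, b, slack variables ξ_i, penalty C, feature map φ, kernel parameters γ, r, d) is not taken from [14] and may differ from the form the authors give.

```latex
% Soft-margin SVM, standard primal form (notation assumed, not from [14])
\[
\begin{aligned}
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  & \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad
  & y_i \left( \mathbf{w}^{\top} \phi(\mathbf{x}_i) + b \right) \ge 1 - \xi_i,
    \qquad \xi_i \ge 0, \qquad i = 1, \dots, n
\end{aligned}
\]

% Common kernel functions K(x, x') = phi(x)^T phi(x')
\[
\begin{aligned}
K_{\text{linear}}(\mathbf{x}, \mathbf{x}') &= \mathbf{x}^{\top}\mathbf{x}' \\
K_{\text{poly}}(\mathbf{x}, \mathbf{x}') &= \left( \gamma\, \mathbf{x}^{\top}\mathbf{x}' + r \right)^{d} \\
K_{\text{RBF}}(\mathbf{x}, \mathbf{x}') &= \exp\!\left( -\gamma \lVert \mathbf{x} - \mathbf{x}' \rVert^{2} \right) \\
K_{\text{sigmoid}}(\mathbf{x}, \mathbf{x}') &= \tanh\!\left( \gamma\, \mathbf{x}^{\top}\mathbf{x}' + r \right)
\end{aligned}
\]
```

Here C is the upper bound on the penalty for margin violations: a larger C tolerates fewer misclassified training samples at the cost of a narrower margin.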
They used images from openly available
databases. The ASD image data was divided into training,
validation, and test sets. The models were
trained with 2536 images (1268 autistic and 1268
non-autistic) and validated with 100 images (50
autistic and 50 non-autistic). Finally, trained models
were tested with 300 images (150 autistic and 150
non-autistic). The dataset is summarized in
Table 1 below.
Table 1. Number of samples for training and test data
Attributes    Train    Valid
Autism        1268     100
Non-Autism    1268     100
The classifiers used for SVM, HC, and CNN are
as below.
1) The SVM model was trained using the “Kernel
Regularization Function”.
2) The Haar Cascade was implemented using
“Cascade Trainer GUI”.
3) The CNN model has been implemented using
“VGG16”.
The three models were developed using the
steps shown in Figure 1.
Fig. 1: Model Development Steps
The accuracies of the trained SVM, HC, and CNN
models are summarized
in Table 2 below.
Table 2. Accuracy of models/algorithms, [14]
Methods / Algorithms                   Accuracy (%)
Support Vector Machines (SVM)          65
Haar Cascade (HC)                      72
Convolutional Neural Networks (CNN)    90
Table 2 shows that
CNN has the highest accuracy, 90%, compared with
the SVM (65%) and HC (72%) models. This suggests that
CNN is the most reliable of the three for ASD
detection.
3 Use of Pre-Trained CNN Models
A pre-trained model refers to a neural network that
has undergone training on an extensive collection of
images. Such models can either be employed
directly or be fine-tuned using transfer learning to
tailor them for specific tasks. The essence of
transfer learning in image classification lies in its
presumption: if a model has been extensively
trained on a diverse and vast dataset, it's equipped to
handle unfamiliar visual content. Instead of
initiating training from scratch on large datasets, the
features learned by these models can be harnessed
directly or fine-tuned further. Transfer learning
offers the flexibility to repurpose these pre-
established models for diverse image classifications
and predictions, [15]. Over the years, a plethora of
pre-trained models, built on the backbone of
Convolutional Neural Networks (CNN), have
emerged, including but not limited to VGG1,
VGG16, VGG19, MobileNet, MobileNetV2,
DenseNet, InceptionV3, ResNet50, and Xception.
Each model boasts its unique architecture and set of
parameters. Developed over the previous decade,
these models have undergone various iterations and
enhancements to remain relevant and effective for
different imaging tasks, [16]. Researchers
leveraging transfer learning often draw from the
ImageNet dataset. For instance, when comparing
various models, MobileNetV2 was found to be more
parameter-efficient than its counterparts. In terms of
accuracy, MobileNet achieved between 70% and
89.5%, MobileNetV2 scored in the 71% to 90% range,
both VGG16 and VGG19 ranged from 71% to 90%,
while ResNet50 achieved 74% to
92% accuracy. The results showed that MobileNetV2
performed relatively well compared with the other models
while using less disk space and fewer parameters,
[16]. The
accuracy of the various models is summarized in
Table 3 below.
Table 3. Accuracy of CNN pre-trained models, [16]
Methods        Dataset     Accuracy (%)
MobileNet      ImageNet    70 to 89
MobileNetV2    ImageNet    71 to 90
VGG16          ImageNet    71 to 90
VGG19          ImageNet    71 to 90
ResNet50       ImageNet    74 to 92
The authors in, [17], used three types of deep
learning algorithms to detect ASD using facial
images. They used a dataset consisting of 2,940 face
images. Half of the images were of autistic children
while the other half were of non-autistic children.
They collected the images from various websites and
Facebook pages. The dataset was openly available, and there
was no issue of privacy. The dataset is summarized
in Table 4 below.
Table 4. Number of samples for Training and Test data
Attributes    Train    Valid    Test
Autism        1270     100      300
Non-Autism    1270     100      300
The research focused on three
pre-trained models for ASD detection using facial
images: NASNetMobile, VGG19, and Xception.
The empirical results of these models are given in Table
5. The Xception model attained
the highest accuracy of 91%, [17].
Table 5. Accuracy of Models/Algorithms, [17]
Methods / Algorithms    Accuracy
NASNetMobile            75 to 82 %
VGG19                   65 to 78 %
Xception                70 to 91 %
Using the YOLOv8 model on a Kaggle dataset,
Subhash and team achieved 89.6%
accuracy in the classification of ASD with an F1-
score of 0.89, [18]. Accuracy is consistently
high for the majority of models. The accuracies in
Table 2, Table 3, and Table 5 indicate that deep learning
models are better suited to ASD detection than
traditional machine learning models. Table 3 and
Table 5 further indicate that there is no need to
develop new models from scratch; rather, pre-trained
models can be used
for ASD detection with some retraining.
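Since the comparisons above rest on accuracy and, for [18], on the F1-score, a short sketch of how both are computed from predicted and true labels may help. The label vectors below are hypothetical and are not data from any of the cited studies; label 1 is assumed to denote the ASD class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = ASD, 0 = non-ASD.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
```

On balanced test sets such as the ones described above, accuracy and F1 tend to be close; they diverge when one class is much rarer than the other.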
4 Use of Radiomics
Radiomics involves the precise
measurement of characteristics in medical imaging
modalities such as MRI (Magnetic Resonance
Imaging), CT (Computed Tomography), and PET
(Positron Emission Tomography). In this context,
MRI plays a crucial role for medical professionals in
the accurate diagnosis of Autism Spectrum Disorder
(ASD). MRI techniques are divided into functional
(fMRI) and structural (sMRI) imaging. However,
the process of diagnosing ASD through these MRI
techniques can be quite tedious and time-intensive,
[19]. To aid specialists, the application of AI
(Artificial Intelligence)-based tools is beneficial.
Techniques in machine learning (ML) and deep
learning (DL) are increasingly being employed to
analyze MRI data for ASD diagnosis. The radiomics
workflow is depicted in Figure 2. Initially, the
workflow involves identifying and marking the
region of interest (ROI) in 2D or the volume of
interest (VOI) in 3D. These ROIs/VOIs are areas
identified for their significant radiomic features.
Following this, the next phase is image
segmentation, which can be performed manually,
through semi-automatic methods like region-
growing or thresholding algorithms, or
automatically by employing deep learning
algorithms, [20]. Next, the images are preprocessed so that
they are homogenized with respect to pixel spacing,
grey-level intensities, bins of the grey-level
histogram, etc., before radiomic feature extraction,
[20]. Feature extraction refers to the calculation of
features, where feature descriptors are used to
quantify characteristics of the grey levels within the
ROI/VOI, [21]. Not all radiomic features are useful in
model development; hence non-reproducible, redundant,
and non-relevant features are removed from the
feature list. This step is known as dimension
reduction, and it may be a multi-step process, [20].
The step-by-step process is shown
in Figure 2.
Fig. 2: The Radiomics Pipeline
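The feature extraction step of this pipeline can be sketched with a few first-order features computed from the grey levels inside an ROI mask. This is a simplified illustration, not the feature set used in [20] or [21]: the function names, the choice of features (mean, variance, histogram entropy), the number of bins, and the toy image are all assumptions.

```python
import math


def roi_values(image, mask):
    """Collect the grey levels of the pixels marked as ROI in the mask."""
    return [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if mask[r][c]
    ]


def first_order_features(values, n_bins=4):
    """Mean, variance and histogram entropy of the ROI grey levels."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    lo, hi = min(values), max(values)
    # Fixed number of equal-width bins; fall back to width 1 for a flat ROI.
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    return {"mean": mean, "variance": variance, "entropy": entropy}


# Toy 3 x 3 image with a 4-pixel ROI (1 = inside the ROI).
image = [[10, 10, 99], [30, 30, 99], [99, 99, 99]]
mask = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
features = first_order_features(roi_values(image, mask))
```

The number of histogram bins is one of the homogenization settings mentioned above: changing it changes the entropy value, which is why it has to be fixed across all images before features are compared.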
4.1 Cortical Thickness and Support Vector
Machine
ASD is linked to atypical development of certain
brain regions during the initial years of life. MRI
scans serve as valuable tools in detecting these
developmental deviations in the brain. Identifying
specific markers in brain images associated with
autism is crucial for understanding the underlying
causes of the condition, [22]. Notably, individuals
with autism exhibit increased cortical thickness,
[22]. In a study, a group of 76 children, comprising
40 diagnosed with ASD and 36 neurotypical
children, were subjected to MRI scans. The T1-
MPRAGE sequences were analyzed to extract
features of regions of interest and average cortical
thickness (CT) was measured for each ROI. The
extracted features were used as input for an SVM
classifier to detect kids with autism. The best
accuracy, 84%, was achieved by concatenating the
gray matter thickness of eight ROIs, [22].
Studies in the realm of neuroimaging have revealed
a connection between human cognitive abilities and
specific brain structures, particularly the thickness
of the cerebral cortex. There's a positive correlation
between general intelligence and cortical thickness
in various areas of the association cortex spanning
both hemispheres of the brain, [23].
4.2 rsfMRI Data and Graph CNN
The study, [24], analyzed a dataset comprising 539
ASD subjects and 573 neurotypical individuals.
This dataset encompassed both sMRI and rsfMRI
scans of each participant, accompanied by various
attributes: scan location, participant's gender, age at
the time of scanning, hand dominance, and scores
from multiple tests, among other factors. The data
required preprocessing before it could be used to
construct a model. Given the inherent
variability in brain size and structure across
individuals, it's essential during the feature
extraction or segmentation process to ensure
consistency across brain images. This means that a
specific point in one
brain image should correspond to the same
anatomical location in another. Discrepancies in
image sizes can hinder the neural network's ability
to discern patterns based on individual brain
structures. To counteract this, it's pivotal to
standardize all brain images to a uniform shape and
size, utilizing a predefined template. This
standardized approach enhances the neural
network's learning efficiency and mitigates potential
distortions, [25]. A Graph CNN combining temporal
graph convolution and adjacency convolution
layers was used to train the model, which achieved
an accuracy of 70%, [24].
4.3 rsfMRI Data and 3D-CNN
The authors in, [26], utilized rsfMRI data from the
ABIDE-I dataset, applying a 3D CNN model for
ASD prediction. Their preprocessing steps for the
ABIDE-I data encompassed slice timing
adjustments, motion rectification, global mean
intensity normalization, and alignment of functional
data to the MNI space at a 3x3x3 mm resolution.
Subsequently, they extracted the time series of
regions of interest (ROIs). For this extraction, they
employed seven atlases: Harvard-Oxford
(HO), Craddock 200 (CC200), Eickhoff-Zilles (EZ),
Talairach and Tournoux (TT), Dosenbach 160
(DOS160), Automated Anatomical Labelling
(AAL), and Craddock 400 (CC400). Implementing a
CNN with 10-fold cross-validation, they reported an
accuracy of approximately 73%. In a different
study, the authors of [27] suggested integrating phenotypic data
with rsfMRI information. This phenotypic data
covered age, gender, hand dominance, overall IQ,
and eye status during the fMRI scan (whether the
eyes were open or closed per the imaging protocol).
They introduced six techniques to amalgamate the
phenotypic and fMRI data into one cohesive
network. For their model, they fed rsfMRI time-
series inputs into an LSTM-based architecture,
which, when trained, achieved an accuracy of 70%
on the ABIDE dataset.
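The 10-fold cross-validation mentioned above splits the subjects into ten disjoint test folds, training on the other nine each time and averaging the ten accuracies. The sketch below generates such splits; the function name is hypothetical, shuffling and stratification by class or scan site are omitted for brevity, and the sample count of 25 is only an example.

```python
def k_fold_indices(n_samples, k=10):
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(25, k=10)
fold_test_sizes = [len(test) for _, test in folds]
```

In practice the subject order would be shuffled first, and every preprocessing step fitted on training data would be refitted inside each fold to avoid leaking test information.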
Meanwhile, the authors of [28] refined the approach to
preprocess the ABIDE-I dataset and train a CNN
model, aiming to elevate the accuracy of autism
detection based on fMRI. Their methodology
involved a two-phase process to generate 3D data:
1) Time series data were generated utilizing
three different atlases: AAL, Dosenbach, and
CC200. Subsequently, connectivity matrices
were derived using three distinct methods to
determine connectivity likelihood: the
correlation approach, the covariance approach,
and the tangent space embedding technique. By
combining the three atlases with the three
connectivity likelihood methods, a total of nine
foundational metrics were established.
2) For each subject, they formulated enhanced 3D
matrices. This was done by distinguishing
between high-weight and low-weight
connections, leveraging both the maximum
spanning tree and the minimum spanning tree,
resulting in nine refined metrics.
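Two ingredients of the steps above can be sketched in a few lines: a connectivity matrix built with the correlation approach, and a maximum spanning tree that keeps the strongest connections. This is an illustration only. The covariance and tangent space embedding variants are not shown, the exact way [28] combines the maximum and minimum spanning trees into enhanced matrices is not reproduced, and the function names and the three toy ROI time series are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(series):
    """Connectivity matrix: entry (i, j) is the correlation of ROI i and ROI j."""
    n = len(series)
    return [
        [1.0 if i == j else pearson(series[i], series[j]) for j in range(n)]
        for i in range(n)
    ]


def maximum_spanning_tree(weights):
    """Kruskal's algorithm on edges sorted by descending weight."""
    n = len(weights)
    parent = list(range(n))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = sorted(
        ((weights[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding the edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree


# Hypothetical ROI time series: ROI 0 and 1 move together, ROI 2 moves opposite.
roi_series = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
connectivity = correlation_matrix(roi_series)
strong_edges = maximum_spanning_tree(connectivity)
```

Sorting the same edges in ascending order would give the minimum spanning tree, which picks out the weakest connections instead.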
These 3D matrices were then fed into
seven advanced deep learning architectures:
ResNet152V2, Inception, ResNet50,
InceptionResNet, Xception, VGG19, and VGG16,
each pre-trained with ImageNet weights. By
employing these seven CNN architectures in various
combinations, they devised 126 unique
classification strategies. Impressively, over two-
thirds of these strategies achieved an accuracy
surpassing 70%. By incorporating a dropout layer
into the transfer learning architectures and using
cross-validation, they enhanced the models'
robustness, mitigating the risk of over-fitting.
Among the results, the ResNet152V2 stood out,
reaching a pinnacle accuracy of 91% when paired
with tangent-enhanced matrices across all atlases.
Notably, in every enhancement strategy scenario,
ResNet152V2 consistently outperformed other
models. A comprehensive breakdown of the
techniques applied to the MRI data and their
respective accuracies can be found in Table 6.
Table 6. Accuracy of Models/Algorithms for MRI data
Methods / Algorithms                                         Highest Accuracy (%)
MRI (Cortical Thickness) and Support Vector Machine, [22]    84
rsfMRI and Graph CNN, [24]                                   70
rsfMRI and 3D-CNN, [26]                                      73
rsfMRI along with phenotypic data and LSTM, [27]             70
rsfMRI Data and 3D-CNN ResNet152V2, [28]                     96
5 Results and Discussion
In this work, we explored several methods of
artificial intelligence covering machine learning and
deep learning to classify ASD and neurotypical
subjects. The data types used are face images of kids
and MRI data. A total of sixteen algorithms are
studied in this paper, and their accuracy is reported in
Table 7. The accuracy ranges from 65% to 96%.
On face images, the accuracy of the support vector machine and
Haar Cascade is lower than that of deep learning
models. However, for MRI data, deep learning
models have not performed very well, except for the
ResNet152V2 model. The pre-trained models have given
good results for face image data and achieved the
highest accuracy of 92%. For MRI data, the highest
accuracy achieved is 96%, which is also the highest
in this study. For face image data, the accuracy
results are consistent across deep learning models,
whereas MRI data has given inconsistent results.
Considering the accuracy of deep learning models,
efforts are required to add more samples covering
various geographies of the world. Better pre-trained
models are required for easy implementation of
systems with better accuracy. It is also necessary to
develop models considering the need for mobile
apps required for field workers to detect autism.
These mobile-based applications will be helpful to
parents for early detection of ASD. Similarly,
decision support systems can be developed using face
images and MRI data to support physicians in ASD
detection.
Table 7. Summary of Accuracy of Models/Algorithms
Methods / Algorithms                        Data Type                       Highest Accuracy (%)
Support Vector Machines (SVM), [14]         Face Image                      65
Haar Cascade (HC), [14]                     Face Image                      72
Convolutional Neural Networks (CNN), [14]   Face Image                      90
MobileNet, [16]                             Face Image                      89
MobileNetV2, [16]                           Face Image                      90
VGG16, [16]                                 Face Image                      90
VGG19, [16]                                 Face Image                      90
ResNet50, [16]                              Face Image                      92
NASNetMobile, [17]                          Face Image                      82
VGG19, [17]                                 Face Image                      78
Xception, [17]                              Face Image                      91
Support Vector Machine, [22]                MRI (Cortical Thickness)        84
Graph CNN, [24]                             rsfMRI                          70
3D-CNN, [26]                                rsfMRI                          73
LSTM, [27]                                  rsfMRI along with phenotypic    70
3D-CNN ResNet152V2, [28]                    rsfMRI                          96
6 Application of Proposed Study
In the 21st century, most organizations were unaware
of the power of IT; at that time, the IT department was
limited to software handling, and the importance
of digital data was unknown, [29].
As the number of applications grows, the
generated data needs further preprocessing.
The data may be characterized by volume,
complexity, variation, and specificity, and these
characteristics define the formulation of an
application model, [30]. The proposed study can
help in understanding supervised and
deep learning applications in autism detection. Various
deep learning and supervised learning models
have been discussed for autism detection. Each
application is an important part of autism detection.
Much of the data utilized for the
detection of autism is MRI scan data. Convolutional
Neural Networks (CNN), MobileNet, VGG, ResNet,
NASNetMobile, Xception, and 3D-CNN are
supervised deep learning models that can accept
MRI scan data for the detection of autism, but these
techniques are costly. Parents of an
autistic baby in a rural area will not benefit
from such a system because MRI scanning
is often not available to them. The
detection of autism at an early stage helps in
reducing autism symptoms. The above
models require MRI scan data of the brain as a primary
input, whereas an MRI scan of a
baby's brain is often not advisable.
With the rapid growth of social media,
massive digital data has been generated that is very
useful due to the large number of participating
individuals. Many NLP-based applications have
been developed from this data in
various domains using NLP techniques and machine
learning models, [31]. Many parents of autistic
babies use social sites to share their
experiences with autism. These statements from the
parents of autistic babies can be a good source for
developing autism detection applications, and
any parent from any area can participate in such
applications. The detection of autism from
parents' experiences is our future research work.
7 Conclusion
This paper provides a thorough examination of
various artificial intelligence (AI) methodologies,
particularly machine learning and deep learning
techniques, in the context of Autism Spectrum
Disorder (ASD) detection through face images and
MRI data. The early identification of ASD is critical
due to its diverse nature, and AI presents a
promising avenue for enhancing early detection
accuracy. Among the techniques evaluated using
facial images, deep learning models, especially
Convolutional Neural Networks (CNN),
consistently outperformed traditional machine
learning methods like Support Vector Machines
(SVM) and Haar Cascade. Pre-trained models on
face images, such as ResNet50, achieved high
accuracies, indicating their potential utility for
practical applications. The utilization of pre-trained
models for image classification, such as VGG16,
VGG19, and MobileNetV2, yielded substantial
accuracy, emphasizing the potential of leveraging
existing architectures and applying transfer learning
for ASD detection. Given the high accuracy of
certain models, there is an opportunity to develop
mobile applications for field workers and parents for
early ASD detection. Such applications can play a
pivotal role in facilitating timely interventions. This
paper underscores the need for more extensive
datasets that cover diverse global populations. This
would ensure the generalizability of the models.
Furthermore, there is a call to develop better pre-
trained models and systems optimized for mobile
devices, enabling broader accessibility and use. In
essence, AI, especially deep learning, offers
promising tools for enhancing the accuracy and
timeliness of ASD detection. With further research
and development, these tools can be refined and
made widely accessible, ensuring early and effective
interventions for individuals with ASD. The
exponential increase in social media usage has
produced a vast amount of digital data contributed
by a diverse user base. This data underpins
numerous applications in various fields that rely
on Natural Language Processing (NLP) techniques and
advanced machine learning models. Notably, parents
of children with autism increasingly use social
media platforms to share their personal experiences
and challenges. These firsthand accounts are a
valuable resource for developing applications aimed
at detecting autism. Such applications could be
widely accessible, allowing parents in any location
to participate and contribute. Detecting autism by
analyzing the experiences that parents share on
social media is a key direction of our future
research.
Acknowledgement:
The authors extend their appreciation to Manipur
International University, Imphal, India, for
supporting this research work on autism.
References:
[1] Autism Spectrum Disorder, Mental Health
Information, 2023.
[2] Roger N. Rosenberg, Juan M. Pascual,
Rosenberg's Molecular and Genetic Basis of
Neurological and Psychiatric Disease,
Academic Press, 2015, pp. 1401-1424.
[3] Diagnostic and statistical manual of mental
disorders: DSM-5, American Psychiatric
Association, 5th edition, 2022.
[4] Catherine Lord et al., “Autism From 2 to 9
Years of Age”, Arch Gen Psychiatry. vol.
63(6), 2006, pp. 694–701.
[5] “What are the treatments for autism?”
National Institute of Mental Health, 2023.
[6] Zeyad A. T. Ahmed and Mukti E. Jadhav, A
Review of Early Detection of Autism Based
on Eye-Tracking and Sensing
Technology, International Conference on
WSEAS TRANSACTIONS on COMPUTERS
DOI: 10.37394/23205.2023.22.28
Prasenjit Mukherjee, Gokul R. S., Manish Godse
E-ISSN: 2224-2872
250
Volume 22, 2023
Inventive Computation Technologies (ICICT),
Coimbatore, India, 2020, pp. 160-166.
[7] P. Jonathon Phillips, Support Vector
Machines Applied to Face Recognition,
Neural Information Processing, 1998, pp. 1-7.
[8] Zeyad A. T. Ahmed et al., Facial Features
Detection System To Identify Children With
Autism Spectrum Disorder: Deep Learning
Models, Computational and Mathematical
Methods in Medicine, vol. 2022, 2022, pp. 1-
9.
[9] Rainer Lienhart and Jochen Maydt, An
extended set of Haar-like features for rapid
object detection, International Conference on
Image Processing, Rochester, 2002, pp. I-I.
[10] Cascade Classifier Training, Open Source
Computer Vision, 2023.
[11] S. Ghaderizadeh, D. Abbasi-Moghadam, A.
Sharifi, N. Zhao and A. Tariq, Hyperspectral
Image Classification Using a Hybrid 3D-2D
Convolutional Neural Networks, IEEE
Journal of Selected Topics in Applied Earth
Observations and Remote Sensing, vol. 14,
2021, pp. 7570-7588.
[12] Rikiya Yamashita, Mizuho Nishio, Richard
Kinh Gian Do & Kaori Togashi,
Convolutional neural networks: an overview
and application in radiology, Insights
Imaging, vol. 9, 2018, pp. 611–629.
[13] Shivkaran Ravidas and M. A. Ansari, Deep
learning for pose-invariant face detection in
unconstrained environment, International
Journal of Electrical and Computer
Engineering (IJECE), vol. 9, no. 1, 2019, pp.
577-584.
[14] Srividhya Ganesan, Raju, J. Senthil,
Prediction of Autism Spectrum Disorder by
Facial Recognition Using Machine Learning,
Information Retrieval and Web Search,
September, vol. 18, 2021, pp. 406-417.
[15] “Transfer learning and fine-tuning”,
TensorFlow Core, 2023
[16] J Praveen Gujjar, H R Prasanna Kumar,
Niranjan N. Chiplunkar, Image classification
and prediction using transfer learning in colab
notebook, Global Transitions Proceedings,
vol. 2(2), 2021, pp. 382-385.
[17] Fawaz Waselallah Alsaade and Mohammed
Saeed Alzahrani, Classification and Detection
of Autism Spectrum Disorder Based on Deep
Learning Algorithms, Computational
Intelligence and Neuroscience, vol. 2022,
2022, pp. 1-10.
[18] Subash Gautam, Prabin Sharma, Kisan Thapa,
Mala Deep Upadhaya, Dikshya Thapa, Salik
Ram Khanal, Vítor Manuel de Jesus Filipe,
Screening Autism Spectrum Disorder in
children using Deep Learning Approach:
Evaluating the classification model of
YOLOv8 by comparing with other models,
Computer Vision and Pattern Recognition,
2023, pp. 1-15.
[19] Parisa Moridian et al., Automatic autism
spectrum disorder detection using artificial
intelligence methods with MRI neuroimaging:
A review, Frontiers in Molecular
Neuroscience, vol. 15, 2022, pp. 1-32.
[20] Janita E. van Timmeren, Davide Cester,
Stephanie Tanadini-Lang, Hatem Alkadhi and
Bettina Baessler, Radiomics in medical
imaging: a how-to guide and critical
reflection, Insights Imaging, vol. 11, 2020, pp.
1-16.
[21] Alex Zwanenburg, Stefan Leger, Martin
Vallières, Steffen Löck, Image biomarker
standardisation initiative, Computer Vision
and Pattern Recognition, 2019, pp. 1-160.
[22] Letizia Squarcina et al., Automatic
classification of autism spectrum disorder in
children using cortical thickness and support
vector machine, Brain and Behavior, vol.
11(8), 2021, pp. 1-9.
[23] Kyle Menary et al., Associations between
cortical thickness and general intelligence in
children, adolescents and young adults.
Intelligence, Intelligence, vol. 41(5), 2013, pp.
597-606.
[24] Saloni Mahendra Jain, Detection of Autism
using Magnetic Resonance Imaging data and
Graph Convolutional Neural Networks,
Thesis, Rochester Institute of Technology,
2018.
[25] Spatial normalization, 2023, Wikipedia,
[Online].
https://en.wikipedia.org/wiki/Spatial_normaliz
ation (Accessed Date: November 30, 2023).
[26] Meenakshi Khosla, Keith Jamison, Amy
Kuceyeski, Mert Sabuncu, 3D Convolutional
Neural Network for Classification of
Functional Connectomes, Deep Learning in
Medical Image Analysis and Multimodal
Learning for Clinical Decision Support, 2018,
pp. 1-10.
[27] Nicha C. Dvornek, Pamela Ventola, and
James S. Duncan’ Combining Phenotypic and
Resting-State Fmri Data for Autism
Classification with Recurrent Neural
Networks, IEEE International Symposium on
Biomedical Imaging, 2018, pp. 725-728.
WSEAS TRANSACTIONS on COMPUTERS
DOI: 10.37394/23205.2023.22.28
Prasenjit Mukherjee, Gokul R. S., Manish Godse
E-ISSN: 2224-2872
251
Volume 22, 2023
[28] Fatima Zahra Benabdallah, Ahmed Drissi El
Maliani, Dounia Lotfi, and Mohammed El
Hassouni, A Convolutional Neural Network-
Based Connectivity Enhancement Approach
for Autism Spectrum Disorder Detection,
Journal of Imaging, vol. 9(6), 2023, pp. 1-12.
[29] Bentolhoda Abdollahbeigi, Farhang Salehi,
"A Study of Information Technology
Governance Initiatives On Organizational
Performance", WSEAS Transactions on
Computers, vol. 20, 2021, pp. 39-48.
[30] Stella Vetova, "Big Data Integration and
Processing Model", WSEAS Transactions on
Computers, vol. 20, 2021, pp. 82-87.
[31] Kristofferson Culmer, Jeffrey Uhlmann,
Examining LDA2Vec and Tweet Pooling for
Topic Modeling on Twitter Data, WSEAS
Transactions on Information Science and
Applications, vol. 18, 2021, pp. 102-115.
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The authors contributed equally to the present
research at all stages, from the formulation of the
problem to the final findings and solution.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors have no conflicts of interest to declare.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US
WSEAS TRANSACTIONS on COMPUTERS
DOI: 10.37394/23205.2023.22.28
Prasenjit Mukherjee, Gokul R. S., Manish Godse
E-ISSN: 2224-2872
252
Volume 22, 2023