Revolutionizing Space:
The Potential of Artificial Intelligence
AHMAD AL-DAHOUD1, MOHAMED FEZARI2, ALI-AL-DAHOUD3, DARAH AQEL3,
HANI MIMI3, MOHAMMAD SH. DAOUD4
1Faculty Architecture and Design,
Al-Zaytoonah University of Jordan,
JORDAN
2Dept. Electronics,
University Badji Mokhtar Annaba,
ALGERIA
3Faculty of Science and IT, Al-Zaytoonah University of Jordan,
JORDAN
4College of Engineering,
Al Ain University,
UNITED ARAB EMIRATES
Abstract: - Generative AI is a branch of artificial intelligence that creates new content using neural
networks and machine learning methods. Generative AI systems can produce music, images, text, speech,
and other types of content by learning patterns from huge datasets. By automating tedious tasks, creating
personalized content, and improving accuracy in difficult tasks, generative AI has the potential to
transform a variety of industries, including gaming, advertising, and healthcare. There are many
types of generative AI models, each with its own pros and cons. Despite being a relatively young technology,
generative AI has many potential applications that make it a fascinating field to research, and further
research, growth, and advancement can be expected. Future potential uses for generative AI include improving
cybersecurity by identifying and preventing cyberattacks, creating human-interactive virtual assistants, and
creating intelligent robots that can do challenging tasks in various industries. As generative AI continues to be
developed, we should expect to see increasingly sophisticated applications in the years to come, which will
open up new opportunities for growth across numerous industries.
Key-Words: - Generative AI, Artificial Intelligence, Machine Learning, OpenAI models, ChatGPT, DALL-E,
and GPT-4.
Received: January 29, 2024. Revised: July 11, 2024. Accepted: August 9, 2024. Published: September 4, 2024.
1 Introduction
Artificial intelligence (AI) has gained growing
momentum across many fields, with seemingly
limitless possibilities. It has changed many aspects of
business, industry, and everyday life through uses
such as intelligent marketing, fraud detection, and
customer support. It has been applied in many
important areas, such as natural language
processing [1], [2], [3], agriculture [4], and stock
markets [5]. AI now also enables machines to use
visual or textual data to develop new content, which
is known as Generative AI [6]. Generative AI is thus
a developing field of AI concerned with generating
new content, such as images, music, and text. In
particular, it generates new content based on the
patterns recognized in the data it was trained on.
Generative AI can transform many industries,
including healthcare, gaming, and advertising, by
enhancing personalization, creating
recommendations, automating repetitive duties, and
generating unique content that can be used to
improve customer engagement and satisfaction.
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.40
Ahmad Al-Dahoud, Mohamed Fezari,
Ali-Al-Dahoud, Darah Aqel, Hani Mimi,
Mohammad Sh. Daoud
E-ISSN: 2415-1521
404
Volume 12, 2024
Fig. 1: Example of Generative AI, Transforming a Text into an Image
It also increases accuracy in the healthcare
domain, including drug discovery and medical
diagnosis, and helps automate time-consuming
tasks, which saves time and reduces costs for
industries. ChatGPT [7] and DALL-E [8] are among
the generative AI models currently dominating the
headlines.
Different types of models can be applied for
Generative AI, such as autoencoders,
variational autoencoders (VAEs) [9], generative
adversarial networks (GANs) [10], Boltzmann
Machines [11], and transformers [12]. Each type of
model has its benefits and restrictions. Selecting the
best Generative AI model depends on the task and
the type of data being created. All of these models
rely on advanced machine learning algorithms and
neural networks [13].
Although Generative AI is still a new
technology, its advantages and the demand for it
make it an interesting field of development and
research. We expect to see more advanced and
innovative uses of Generative AI in the coming
years.
OpenAI models such as ChatGPT, GPT-4 [14],
and DALL-E are attracting wide attention in
business, industry, and content creation. In this
paper, we describe Generative AI, the types of
models that it uses, test results on some Generative
AI models, and the dangers and limitations of using
Generative AI. The rest of the paper is organized as
follows: Section 2 defines Generative AI and
presents how it works. Section 3 describes the most
popular Generative AI examples. Section 4
illustrates the types of Generative AI models.
Section 5 demonstrates the main benefits of
Generative AI. Section 6 presents the most
important applications of Generative AI. Section 7
shows the dangers and limitations of Generative AI.
Finally, Section 8 concludes the paper and presents
the future work.
2 Defining Generative AI and How It
Works
Generative AI is a sub-domain of artificial
intelligence in which computer algorithms are
applied to create outputs that resemble
human-created content, such as text, images,
music, graphics, and computer code. It uses
existing content, such as text, images, or audio
files, to generate new, plausible content. In
particular, it enables computer algorithms to extract
the most important patterns in a given input and
then use them to produce similar content.
Generative AI algorithms learn from training data
examples. By exploring the patterns within these
examples, Generative AI models can generate new
content that has the same features as the original
input data and appears reliable and human-like.
Generative AI is based on machine learning and deep
learning techniques such as neural networks. To train
a Generative AI model, large amounts of data must
be given to a machine learning or deep learning
algorithm so that it can learn useful patterns from
the data. This data could be text, code, graphics, or
any other type of content suitable for the given
task. After the training data is collected, the
Generative AI model analyzes the patterns within
the data to find the main rules that govern the
content. The model then adjusts its parameters
during the learning phase to improve its ability to
imitate human-created content. The more content the
model learns from, the more sophisticated, accurate,
and believable its output tends to be.
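The learn-then-generate cycle described above can be illustrated with a deliberately tiny model. The sketch below is not a neural network: it is a character-level bigram model, chosen only because it fits in a few lines. The corpus and the function names are illustrative assumptions.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Learning phase: count which character follows which in the training text."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, length, rng):
    """Generation phase: sample one character at a time from the learned counts."""
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:  # character was never seen with a successor
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


corpus = "the cat sat on the mat. the cat ate the rat."  # toy training data
model = train_bigram_model(corpus)
sample = generate(model, "t", 20, random.Random(0))
print(sample)
```

Every pair of adjacent characters in the generated sample was observed in the training text, which is the sense in which the output "has the same features" as the input. Real generative models replace the count table with millions or billions of learned parameters.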
3 The Most Popular Generative AI
Examples
This section describes, with examples, the most
popular Generative AI models developed by OpenAI:
ChatGPT, GPT-4, and DALL-E.
3.1 ChatGPT (GPT-3.5)
OpenAI introduced ChatGPT [7], an AI language
model built on the Generative Pre-trained
Transformer (GPT) architecture. It is designed to
respond to natural language inputs in a human-like
way, enabling conversational interaction between
humans and AI. Thanks to the vast body of text that
it was trained on, ChatGPT can recognize and
respond to a range of themes, from straightforward
questions and answers to more sophisticated
conversations. The model can learn from vast
volumes of data and produce high-quality responses
because it employs a deep learning design known as
the transformer architecture. ChatGPT is capable of
comprehending context, identifying patterns, and
generating meaningful and logical responses to a
wide range of inputs.
Example 1: general use.
Input: What is generative deep learning? Figure 1
shows ChatGPT's answer to that question.
Fig. 1: Answer to “What is generative deep
learning?”
Example 2: generating code.
Input: Generate code python bandpass active filter.
Figure 2 shows ChatGPT's answer to this prompt.
Fig. 2: ChatGPT response to “Generate code
python bandpass active filter”
Assuming an 8000 Hz sampling frequency, the
preceding example creates a second-order
Butterworth bandpass filter with a 1000 Hz center
frequency and a 200 Hz bandwidth. The filter
coefficients B and A are obtained with the
signal.butter() function. The bandpass response is
selected by setting the btype argument to
'bandpass'. The frequency response of the filter is
then computed using the signal.freqz() function and
displayed using matplotlib. The filter specifications
can be modified to design other bandpass filters.
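Since the generated code itself appears only in the figure, the following is a minimal reconstruction of what such a script could look like. It is not ChatGPT's actual output. It assumes SciPy and NumPy are installed, takes the band edges as the center frequency plus and minus half the bandwidth, and omits the matplotlib plotting step so that it runs without a display.

```python
import numpy as np
from scipy import signal

fs = 8000.0        # sampling frequency in Hz
center = 1000.0    # center frequency in Hz
bandwidth = 200.0  # passband width in Hz
low, high = center - bandwidth / 2, center + bandwidth / 2

# N=2 is an assumption; for band-pass designs SciPy returns a filter of order 2*N.
b, a = signal.butter(2, [low, high], btype="bandpass", fs=fs)

# Frequency response at 2048 points between 0 Hz and fs/2.
freqs, response = signal.freqz(b, a, worN=2048, fs=fs)
gain_db = 20 * np.log10(np.maximum(np.abs(response), 1e-12))

peak_hz = freqs[np.argmax(np.abs(response))]
print(f"peak gain near {peak_hz:.0f} Hz")
```

To display the response, `freqs` and `gain_db` can be passed to a plotting call such as matplotlib's `plt.plot(freqs, gain_db)`.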
3.2 GPT-4
GPT-4 [14] works like its predecessor, GPT-3.5, in
that it generates output in response to natural
language queries and other requests. OpenAI says
that GPT-4 can “follow complex instructions in
natural language and solve difficult problems with
accuracy”. Specifically, GPT-4 can solve
mathematical problems, write program code,
respond to questions, and tell stories. Furthermore,
GPT-4 can review and summarize large amounts of
content, which supports business use cases and
customers.
OpenAI examined GPT-4's capability to
reproduce information in a consistent order by using
many assessments, such as AP and Olympiad exams
and the Uniform Bar Examination. GPT-4's scores
varied across the AP examinations. The results of
running GPT-4 through standardized tests have
shown that the model can construct accurate
and current responses. GPT-4 works by predicting
which token is most likely to come next in a
sequence. (A token may be a word, number, letter,
symbol, or punctuation mark.)
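Next-token prediction can be sketched in a few lines. A language model assigns a score (logit) to every token in its vocabulary, and a softmax turns those scores into probabilities. The four-word vocabulary and the logit values below are made up for illustration and do not come from GPT-4.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the token following "the cat sat on the".
vocab = ["mat", "moon", "dog", "."]
logits = [3.2, 0.5, 1.1, -0.7]

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy choice: most probable token
print(next_token, [round(p, 3) for p in probs])
```

Picking the most probable token is only one decoding strategy. Sampling from `probs` instead gives more varied output.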
In addition, GPT-4's training data does not
include information more recent than September
2021. Google Bard, one of GPT-4's competitors,
does have up-to-date data, material, and
information because it is trained on current
internet content.
3.3 GPT-4 vs. ChatGPT
OpenAI has stated that GPT-4 can process more
data and perform more computations than ChatGPT,
which itself has billions of parameters. GPT-4 has
also proven more capable of writing a wide variety
of material, such as stories and fiction.
Additionally, GPT-4 achieves higher performance
than ChatGPT on the standardized tests mentioned
above. GPT-4's responses to prompts may be more
accurate and simpler to parse. Moreover, GPT-4 is
preferred over GPT-3.5 in decision-making, content
summarization, and time scheduling. OpenAI
reported that GPT-4 is 82% less likely to respond to
requests for disallowed content and 40% more likely
to produce factual responses.
Figure 3 presents an illustration of GPT-2 as a
translator.
Fig. 3: An Illustration of GPT-2 as a Translator
3.4 DALL-E
DALL-E (pronounced like "Dali") [8] is an AI
model developed by OpenAI. It is a neural
network-based generative model that creates novel
images from textual descriptions, introduced by
OpenAI in early 2021. DALL-E can interpret natural
language descriptions of images and create a new
image that matches the description. The name
"DALL-E" is a combination of Salvador Dali, the
famous surrealist painter, and WALL-E, the robot
character from the Pixar film of the same name.
DALL-E is based on the GPT-3 architecture and
generates visuals from textual inputs by using a
combination of transformers and convolutional
neural networks. The model can either generate new
images to fit a textual description or alter an
existing image to fit a new textual input. It was
trained on a large set of image-caption pairs.
DALL-E is notable for its ability to generate
remarkably novel and surreal images that defy the
expectations of image synthesis models. This is
because the model is not constrained by the laws of
physics, allowing it to produce images that are
physically implausible yet consistent with the
textual description. For example, it can draw
pictures such as "snail made of harps" or "pizza
with a giraffe pattern". This makes DALL-E a
useful tool both for artistic content production,
such as making visuals for novels, movies, or
advertisements, and for scientific and engineering
applications, such as constructing realistic
representations of complicated machines or
biological systems.
DALL-E makes use of two deep learning
techniques: a transformer-based language model and
a denoising autoencoder. Using a huge dataset of
image-text pairs, it was trained to recognize the
relationships between textual descriptions and their
related images. DALL-E is useful in several ways.
For example, it can be applied in the design,
fashion, and entertainment industries to produce
creative designs or visual content for motion
pictures.
It can create images in different styles, from
photorealistic imagery to paintings and emoji. It is
also capable of manipulating and rearranging
objects in its images. Figure 4 and Figure 5 present
some photos generated by DALL-E as outputs.
Example 1 of DALL-E 2
Input: cat with sunglasses on a boat in a sunny day
DALL-E response:
Fig. 4: DALL-E response to “cat with sunglasses on a
boat on a sunny day”
Example 2:
Input: astronaut in virtual space
Fig. 5: Some photos generated by DALL-E as
Outputs
3.5 Other Examples of Generative AI
Models
Generative AI has made important advances in
recent years, with several tools attracting public
attention and creating a sensation among content
developers and creators. Big technology enterprises
such as Google, Microsoft, and Amazon have
released their own Generative AI tools.
Midjourney: developed by the San Francisco-based
research lab Midjourney Inc., it interprets text
prompts and context to create visual content, like
DALL-E 2.
GitHub Copilot: an AI programming tool
developed by GitHub and OpenAI. It helps
users write code faster by proposing code
completions in development environments such
as Visual Studio and JetBrains.
Cisco: brings a ChatGPT-like experience to
Webex.
Based on articles from Gartner.com, we believe that
Generative AI will speed up design work in many
businesses and can discover new and effective
designs. The output of AI systems that use
Generative AI may contain high-value artifacts such
as video, code, narrative, and synthetic data. Use
cases for Generative AI are increasing, particularly
in five areas: parts design, drug design, chip design,
material science, and synthetic data.
4 Types of Generative AI Models
Generative AI models can be used to create new
content and samples that resemble the original
data by sampling from the complex distribution
learned from that data. There are many types of
models used for Generative AI, each with its
benefits and weaknesses. Choosing the right model
depends on the type of data and the task at hand,
since each model is designed to address certain
problems and applications. These generative AI
models fall into the following categories.
4.1 Transformer-based Models
Transformer-based models are a deep learning
architecture that revolutionized the processing
of sequential data, as in natural language
processing (NLP) [12], [13], [15]. Since its
introduction in the 2017 paper "Attention Is All
You Need" [12], the transformer architecture has
become an industry standard for many NLP
projects.
The key component of the transformer is the
self-attention mechanism, which lets the model
weigh every other token in the input while it
progressively builds the representation of each
token. Attention scores are computed for each
pair of tokens and determine how much each
input token contributes to the representation of
the others. As a result, the model concentrates
on the most significant parts of the input when
it makes a prediction.
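A minimal sketch of scaled dot-product self-attention, the formulation used in [12], is given below in plain Python. As a simplifying assumption, the learned query, key, and value projections are omitted, so the toy token embeddings are used directly as queries, keys, and values.

```python
import math


def softmax(scores):
    """Normalize scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """For each query, score every key and return the weighted mix of values."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # One score per token pair, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy token embeddings (hypothetical values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(X, X, X)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence. Multi-head attention runs several such computations in parallel with different learned projections.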
Among all transformer-based neural network
models, one could argue that BERT
(Bidirectional Encoder Representations from
Transformers), introduced by Google in 2018, is
the most widely applied model so far. BERT is
pre-trained on large amounts of raw text using a
transformer encoder and a masked language
modeling objective. BERT stands out for its
performance, having recorded top scores on
natural language understanding (NLU) tasks
such as question answering and natural
language inference.
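The masked language modeling objective can be sketched as follows: some input tokens are hidden, and the model is trained to recover them. The example uses a masking rate of 15%, the rate reported for BERT. It is a simplification, because BERT's full procedure also leaves some selected tokens unchanged or replaces them with random tokens. The function name is illustrative.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob, rng):
    """Hide a random subset of tokens; labels keep the originals to be predicted."""
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)   # the model must recover this token
        else:
            masked.append(tok)
            labels.append(None)  # no prediction needed at this position
    return masked, labels


tokens = "the cat sat on the mat".split()
masked, labels = mask_tokens(tokens, 0.15, random.Random(1))
print(masked)
print(labels)
```

During pre-training, the loss is computed only at positions where `labels` is not `None`.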
XLNet (a generalized autoregressive pre-training
method), T5 (Text-to-Text Transfer Transformer),
and GPT (Generative Pre-trained Transformer)
are other well-known transformer-based models.
By enabling models to recognize long-range
syntactic relationships and comprehend word
context, transformer models have brought
about a dramatic shift in natural language
processing (NLP) systems. In the field of NLP,
these models may be used for a wide range of
tasks, including machine translation, text
classification, and language modeling.
Transformer-based neural networks produce the
predictions of the ChatGPT and GPT-3.5 OpenAI
models. These models are trained on large
amounts of data, and training continues until
the underlying patterns are identified.
Additionally, they can quickly capture the
interactions within sequential data, which
makes them effective for tasks involving text
generation.
4.2 Example: Vision Transformers (ViTs)
[12], [15], [16]
The Vision Transformer (ViT) is a deep learning
model used for image classification tasks [12],
[15]. It applies transformers over patches of the
image, where patches are assumed to be independent
and identically distributed. The components of a
ViT are positional encoding, self-attention, and
multi-head self-attention. The positional encoding
component records the position of each patch in the
sequence of input tokens, which allows the model to
learn the relative distances between patches and
recognize the spatial structure of an image. The
self-attention method determines the significance of
each patch and estimates its correlation and context
with respect to the other patches. Multi-head
self-attention uses several self-attention blocks to
account for many types of interactions between the
patches and integrates them into a single
self-attention output. The main architecture of ViT
is illustrated in Figure 6. A whole-slide image (WSI)
is transformed into a series of patches, where each
patch is associated with its positional information.
Learnable encoders transform each patch and its
position into one embedding vector called a token.
An extra token is added for the classification
procedure. The transformer encoder takes the class
token along with the patch tokens as inputs to
compute multi-head self-attention and then outputs
the class token and the learned embeddings of the
patches. The output class token is used as a
slide-level representation for the final
classification of the model. The transformer encoder
contains many stacked identical blocks. Each block
includes multi-head self-attention and an MLP,
coupled with layer normalization and residual
connections. The multiple self-attention heads and
the positional encoding help integrate spatial
information and increase the context awareness and
effectiveness of the ViT technique over other
techniques [17]. However, a limitation of ViTs is
that they are considered to be more data-hungry
[15].
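The first step of this pipeline, turning an image into a sequence of position-tagged patches with an extra class token, can be sketched as follows. The 4x4 toy image, the patch size, and the "[CLS]" marker are illustrative assumptions. A real ViT would additionally project each flattened patch through a learnable embedding layer.

```python
def image_to_patches(image, patch_size):
    """Split a 2-D image (list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            patches.append(patch)
    return patches


# Toy 4x4 "image" whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = image_to_patches(image, 2)

# Position 0 holds the extra classification token; patches follow in raster order.
CLS = "[CLS]"
sequence = [(0, CLS)] + [(i + 1, p) for i, p in enumerate(patches)]
for position, content in sequence:
    print(position, content)
```

The position index stored with each patch plays the role of the positional encoding: without it, the transformer would see the patches as an unordered set.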
Weakly supervised methods offer many
advantages. Eliminating manual annotations
decreases the data preprocessing cost and minimizes
bias and interrater variability. Therefore, the
models can be applied easily to huge datasets for
various tasks. Because the models are free to learn
from the whole scan, they can identify predictive
features regardless of which regions were assessed
by pathologists. These weakly supervised methods
achieve good performance and suggest that many
tasks can be solved without expensive manual
annotations. Figure 6 presents the application of
the vision transformer architecture to X-ray image
analysis for COVID-19.
Fig. 6: Vision Transformers Architecture
Application in X-ray Image Analysis for COVID-19
Detection
4.3 Generative Adversarial Networks
(GANs)
Generative Adversarial Networks (GANs) [10]
consist of two neural network models: a generator
and a discriminator. The generator produces data,
while the discriminator estimates the authenticity
and quality of that data. Over the course of
training, both networks become better at their
roles, and the generated outputs become more
accurate.
In the adversarial training process of GANs,
the generator and discriminator are trained together
on a given dataset. In this process, the generator
attempts to generate data that fools the
discriminator, while the discriminator attempts to
recognize whether the given data is fake or real. The
output of the generator is then tuned depending on
the discriminator's feedback, and the process
continues until the generator can create data that is
similar to the real data. GANs have several
applications, such as data augmentation, image
synthesis, and anomaly detection. GANs have
also been applied to produce realistic images of
buildings, faces, and cities. They have proven their
worth in the healthcare domain, as they can be
used to create synthetic medical images and data,
which can then be used for training machine
learning methods for disease diagnosis and
treatment. Figure 7 displays the GAN architecture.
Fig. 7: The GAN Architecture
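The opposing goals of the two networks can be made concrete through their loss functions. The sketch below computes the binary cross-entropy loss of the discriminator and the commonly used non-saturating loss of the generator for given discriminator outputs. The probability values are invented for illustration, and no networks are actually trained here.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and generated samples score near 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Low when the discriminator scores generated samples near 1 (it is fooled)."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs: probability that each sample is real.
d_real = [0.9, 0.8]
early_fake = [0.05, 0.10]  # early in training, fakes are easy to spot
late_fake = [0.45, 0.55]   # later, the discriminator is close to guessing

print("generator loss, early:", round(generator_loss(early_fake), 3))
print("generator loss, late: ", round(generator_loss(late_fake), 3))
print("discriminator loss, early:", round(discriminator_loss(d_real, early_fake), 3))
print("discriminator loss, late: ", round(discriminator_loss(d_real, late_fake), 3))
```

As the generated samples become harder to distinguish from real ones, the generator's loss falls while the discriminator's loss rises, which is the tug-of-war that adversarial training exploits.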
4.4 Autoencoders
The autoencoder is a neural network design used in
unsupervised learning. It is intended to learn a
compressed representation of the input data, known
as the encoding or latent space. It does so by first
mapping the input data to a lower-dimensional
representation (encoding) and then mapping that
representation back to the original input
(decoding). Many kinds of data, including text,
music, and images, can be used to train
autoencoders.
The following are the main uses for autoencoders:
Image Compression and Reconstruction: By
using autoencoders for image compression and
reconstruction, massive image datasets can be
stored and transmitted more effectively.
Anomaly Detection: By highlighting data points
that the model is unable to correctly reconstruct,
autoencoders can be used to find abnormalities
in datasets.
Feature extraction: Autoencoders are capable
of efficiently encoding data and extracting
features, which can be used as input for
machine learning models.
Dimensionality reduction: The dimensionality of
high-dimensional datasets can be reduced by
using autoencoders, which facilitates the
visualization and study of complicated data.
Data Generation: New data samples can be
produced by autoencoders. To do this, samples
are selected at random from the latent space
and then decoded.
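The anomaly detection use listed above relies on reconstruction error, which can be sketched with a toy autoencoder that compresses 2-D points to a single number. To keep the example short, the encoder and decoder weights are set by hand (a projection onto the direction of the line y = x) as a stand-in for weights that would normally be learned. The data points and the threshold rule are illustrative assumptions.

```python
import math

# Hand-set unit direction standing in for learned encoder/decoder weights.
U = (1 / math.sqrt(2), 1 / math.sqrt(2))


def encode(x):
    """Compress a 2-D point to a single latent number."""
    return x[0] * U[0] + x[1] * U[1]


def decode(z):
    """Map the latent number back to a 2-D point."""
    return (z * U[0], z * U[1])


def reconstruction_error(x):
    """Distance between a point and its encode-then-decode reconstruction."""
    x_hat = decode(encode(x))
    return math.sqrt((x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2)


# "Normal" data lies close to the line y = x, which the latent space captures.
normal_points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.05)]
threshold = 2 * max(reconstruction_error(p) for p in normal_points)


def is_anomaly(x):
    return reconstruction_error(x) > threshold


print(is_anomaly((4.0, 4.05)))  # close to the learned structure
print(is_anomaly((2.0, -1.0)))  # far from it, so poorly reconstructed
```

Points that follow the structure captured by the latent space are reconstructed almost perfectly, while points that do not are flagged because the model cannot reproduce them.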
In brief, autoencoders are widely used in many
fields, such as computer vision, signal processing,
and natural language processing. They are a helpful
tool for unsupervised learning and may help
uncover patterns and insights in massive
and complex datasets. Figure 8 displays a denoising
model that uses an autoencoder.
Fig. 8: Illustration of autoencoder as denoising
model
4.5 Variational Autoencoders
Variational autoencoders (VAEs) [9] use an encoder
and a decoder to produce content. After receiving
input data, such as text or images, the encoder
compresses the data into an encoded representation.
The decoder then uses this encoded data to
reassemble it into new data that bears a resemblance
to the original input data.
VAEs are a class of generative models used to
learn a condensed representation of input data.
They are regarded as a kind of autoencoder, with the
addition of a probabilistic model that allows them
to sample data from the compact representation.
VAEs are trained using unsupervised learning
and can be applied to many data types, such as
images, text, and sound. Figure 9 presents an
illustration of VAE with the latent space defined by
mean and variance.
The primary applications of VAEs include:
Data Compression: VAEs can be used to compress
data by encoding it into a lower-dimensional
representation and then decoding it back into
its original structure.
Anomaly Detection: VAEs can be used to detect
anomalies in datasets by comparing the
reconstruction error of a given data sample with
the reconstruction error of the training data.
Semi-Supervised Learning: VAEs can be used
for semi-supervised learning tasks by
incorporating label information into the model
training process.
Data Augmentation: VAEs can be utilized to
create new data samples that can be applied to
increase the training data for other machine
learning models.
Image and Video Generation: VAEs can be
trained to generate new images and videos by
sampling from the compressed representation of
the input data.
Fig. 9: VAE with the Latent Space Defined by Mean and Variance
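A latent space "defined by mean and variance" means that the encoder outputs a mean and a (log-)variance for each latent dimension, and a latent vector is drawn from the corresponding Gaussian. The sketch below shows this sampling step (often called the reparameterization trick) together with the Kullback-Leibler term that keeps the latent distribution close to a standard normal. The encoder outputs used here are invented values, not the result of a trained network.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for one input, with a 2-D latent space.
mu = [0.5, -1.0]
log_var = [-0.2, 0.1]

rng = random.Random(0)
z = sample_latent(mu, log_var, rng)
print("latent sample:", [round(v, 3) for v in z])
print("KL term:", round(kl_to_standard_normal(mu, log_var), 4))
```

New data is generated by drawing `z` from a standard normal distribution and passing it through the decoder. The KL term is what makes such samples fall in a region the decoder has learned to handle.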
4.6 Multimodal Models
Multimodal AI [17], [18], [19], [20], [21] is a new
AI paradigm in which different data types
(numerical data, image, text, and speech) are
combined with multiple intelligence processing
algorithms to attain superior performance.
Multimodal AI often achieves better results than
single-modal AI in real-world applications.
Multimodal deep learning is a new area in which
algorithms learn from data of multiple modalities.
For instance, a human can use both hearing and
sight to recognize an object or a person. Multimodal
deep learning aims to develop similar capabilities
for computers. Multimodal models can analyze and
process many types of input data, including audio,
text, and images. They combine different modalities
to generate richer outputs. Examples of these
models include DALL-E 2 and OpenAI’s GPT-4,
which accepts both image and text inputs.
Multimodal deep learning has many important uses,
such as:
Automatically creating descriptions of images,
such as captioning for blind people.
Searching for images that match text queries (e.g.
“find for me an image of a yellow cat”).
Generative art systems that create images from
text descriptions (e.g. “create a picture of a
yellow cat”).
Like the other models in this section, multimodal
models can be used to create new data samples
that look like the original data by sampling from
the complex distribution learned from it. In
summary, many types of models are applied for
Generative AI, each with its strengths and
weaknesses. Choosing the right model depends on
the data type and the task at hand.
5 Benefits of Generative AI
The main benefit that Generative AI offers is
efficiency: it allows businesses to automate specific
tasks and direct their effort, time, and resources
toward more significant strategic objectives. This
reduces labor costs and maximizes operational
efficiency.
Generative AI offers several benefits that can
help in transforming various industries and
improving efficiency and personalization. Some of
the main benefits of Generative AI are:
Creative content creation: Generative AI can
help to create new and unique content, including
images, music, and text, that can be used in
various industries, such as advertising, gaming,
and art.
Time and money savings: Businesses may save
time and money by using generative artificial
intelligence to automate repetitive and
time-consuming processes such as data analysis,
data processing, and content production.
Increased precision: artificial intelligence
examines huge amounts of data and finds
relevant patterns that people normally miss.
This can help in increasing accuracy in a variety
of activities that include drug discovery and
medical diagnostics.
Novel uses: Generative AI creates new
opportunities and uses in several sectors,
including gaming and healthcare, that were
either impractical or too difficult to implement
in the past.
In summary, Generative AI offers several
benefits that can help in transforming industries and
improving efficiency, personalization, accuracy, and
creativity.
Generative AI offers additional benefits to
businesses and entrepreneurs, such as:
Creating new ideas, content, or designs.
Writing, developing, testing, and optimizing
computer code.
Formulating templates for articles or essays.
Improving customer support with virtual
assistants, dialog systems, and chatbots.
Streamlining the procedure for gathering and
expanding data so that machine-learning
algorithms may use it.
Enhancing decision-making through data
processing and analysis.
6 Additional Domains in Which
Generative AI is Employed
Generative AI is artificial intelligence that uses
algorithms to create new content, such as writing,
photos, movies, and music. Its uses are numerous
and spread across several industries, such as:
Healthcare: The application of generative AI in
healthcare expedites the search for new
treatments and reduces the time and expense of
research. Medical imaging may also benefit from
the use of generative AI. Researchers can
design novel medicines with the use of AI
models, which can produce new medicinal
molecules and forecast their efficacy.
Generative artificial intelligence is used to
create 3D medical images for treatment
planning and improving diagnosis.
Marketing: Advertisers can organize
personalized campaigns and adapt material
to the preferences of their clients using
Generative Artificial Intelligence. Product
recommendations and advertisements are
examples. Because they are trained on consumer
data, artificial intelligence algorithms may
produce marketing messages that are both
targeted at and appealing to the intended
audience.
Education: Many teachers use generative
Artificial Intelligence models to create
assessments and learning materials that are
tailored to the unique learning preferences of
each student.
Finance: Financial analysis tools based on
generative artificial intelligence are used to
examine market trends and predict the direction
of the stock market.
Virtual assistants: Generative AI may be used
to create human-like virtual assistants that can
understand and answer natural language
queries. Virtual assistants that resemble people
can improve customer satisfaction. In addition,
generative AI can increase the value of virtual
assistants for a variety of tasks, including
scheduling appointments and responding to
queries.
Climate science: Climate scientists use
generative AI models to study the environment,
forecast weather patterns, and understand
climate change.
Creative industries: Generative AI may be used
in creative sectors such as music, design, and
painting. For example, a computer program
may create new music by drawing on
previously composed pieces.
Video games: Characters, environments, and
levels may all be created with generative AI.
Game developers can save time and money
while providing more original and captivating
game content.
In conclusion, generative artificial intelligence
has a wide range of applications and the
capacity to transform many industries by
creating new content and improving efficiency
and personalization.
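The idea that a program may create new music by drawing on previously composed music can be illustrated with a very small sketch. The example below is not one of the neural models discussed in this paper; it assumes a first-order Markov chain as a stand-in, and the note sequence, function names, and seed are all hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict


def train_markov(melody):
    """Map each note to the list of notes that followed it in the source melody."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return transitions


def generate(transitions, start, length, seed=None):
    """Walk the transition table to produce a new note sequence."""
    rng = random.Random(seed)
    note = start
    output = [note]
    for _ in range(length - 1):
        choices = transitions.get(note)
        # If a note was never followed by anything, restart from the start note.
        note = rng.choice(choices) if choices else start
        output.append(note)
    return output


# Hypothetical "previously composed" melody used as training material.
source = ["C", "D", "E", "C", "E", "G", "E", "D",
          "C", "D", "E", "E", "D", "D", "C"]

model = train_markov(source)
new_melody = generate(model, start="C", length=16, seed=0)
print(" ".join(new_melody))
```

Every step in the generated melody reuses a note-to-note transition that occurs in the source, yet the overall sequence is generally not a copy of it. Deep generative models learn far richer structure, but the principle of learning patterns from existing works and sampling new ones is the same.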
7 Dangers and Limitations of
Generative AI
Generative AI raises many difficulties that require
consideration. One problem is that it might spread
false, sensitive, or dangerous content, which may
harm individuals or companies and endanger
national security.
Policymakers have given careful thought to
these risks. For instance, in April 2023, the
European Union proposed new copyright regulations
for enterprises that develop generative AI tools.
These regulations require the disclosure of any
copyrighted content used in the development of
these tools. They would reduce intellectual property
exploitation while also promoting ethics and
openness in the field of artificial intelligence
development. Moreover, these rules would protect
content creators from having their work mimicked
or plagiarized by generative AI tools.
Automating tasks with generative AI will
affect the workforce, and affected employees will
need to be reskilled or upskilled. Furthermore,
generative AI tools can magnify biases that exist in
the training data, producing problematic results that
spread stereotypes and harmful beliefs. ChatGPT,
Bing AI, and Google Bard have all stirred up
controversy by producing damaging outputs since
their release. All these issues should be addressed
over time as generative AI develops.
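The claim that a generative model can magnify, rather than merely reproduce, a bias in its training data can be made concrete with a toy calculation. The sketch below does not describe how ChatGPT, Bing AI, or Google Bard work. It assumes a model that has learned the training frequencies exactly and then decodes with a temperature below 1, which is one common way for likely outputs to become even more likely. The labels, the 70/30 split, and the temperature value are hypothetical.

```python
from collections import Counter


def fit(labels):
    """Estimate the empirical distribution of labels in the training data."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def sharpen(probs, temperature):
    """Rescale each probability as p ** (1 / temperature) and renormalize.

    A temperature below 1 favors outcomes that were already likely,
    as low-temperature decoding does.
    """
    scaled = {label: p ** (1.0 / temperature) for label, p in probs.items()}
    total = sum(scaled.values())
    return {label: s / total for label, s in scaled.items()}


# Hypothetical training set with a 70/30 imbalance between two groups.
training = ["majority"] * 70 + ["minority"] * 30

learned = fit(training)                       # what the model learned
decoded = sharpen(learned, temperature=0.25)  # what it tends to output

print(round(learned["majority"], 3), round(decoded["majority"], 3))
# prints: 0.7 0.967
```

Under these assumptions, a 70/30 imbalance in the data becomes roughly a 97/3 imbalance in the outputs, so the minority group almost disappears from what the model generates. Real systems are far more complex, but this shows why balanced training data and careful decoding choices both matter.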
8 Conclusion
In summary, generative AI is a powerful artificial
intelligence tool that has the potential to
revolutionize a wide range of industries through the
production of fresh, original content, enhanced
personalization, cost and time savings, improved
accuracy, increased efficiency, and the creation of
new opportunities. By utilizing machine learning
and deep learning algorithms, generative AI can
create text, images, music, and other types of
content that are useful in industries like marketing,
gaming, healthcare, and the arts. Even though
generative AI is still relatively new and has some
drawbacks, such as the need for enormous volumes
of training data, its potential benefits and uses
make it a worthwhile field for further study and
advancement. Overall, the prospects for generative
AI appear promising since the technology is still
developing, and we expect to see more cutting-edge
and inventive applications of generative AI in the
future. However, there are other obstacles to
overcome, such as the huge amounts of training
data required, the possibility of bias and ethical
issues, and the need for the generated content to be
transparent and interpretable.
Hence, future research on the possible risks of
generative AI is essential. As generative AI
technology develops, it is essential to make sure
that AI is used ethically by reducing biases,
increasing accountability, increasing transparency,
and supporting data governance. A balance between
human interaction and task automation is essential
to maximize the benefits of generative AI. At the
same time, any adverse effects on the workforce
must be reduced or eliminated.
Furthermore, we hypothesize that combining
several generative models, such as VAEs and GANs,
produces reliable multimodal outcomes. Finally, we
performed a thorough analysis to determine how
missing data and weak supervision affect
multimodal learning. For large amounts of missing
data, we investigated the possibility that the
suggested VAEVAE and VAEGAN models perform
better than the other generative AI models.
Subsequent research on multimodal data may
examine the outcomes of applying similar concepts
to video, where each frame comprises text, audio,
and visual elements. Further study and development
in this area will certainly provide new opportunities
and solutions for people, companies, and society as
a whole.
References:
[1] Kanan, T., Mughaid, A., Al-Shalabi, R. Al-
Ayyoub, M. Business intelligence using deep
learning techniques for social media contents.
Cluster Comput 26, 2023, 1285–1296.
[2] Aqel, D. and Hawashin, B., Arabic relative
clauses parsing based on inductive logic
programming. Recent Patents on Computer
Science, 11(2), 2018, pp.121-133.
[3] Radford, A., Narasimhan, K., Salimans, T.
and Sutskever, I., Improving language
understanding by generative pre-training.
2018, [Online].
https://openai.com/research/language-
unsupervised (Accessed Date: April 28,
2024).
[4] Igried, B., AlZu’bi, S., Aqel, D., Mughaid, A.,
Ghaith, I. and Abualigah, L., An Intelligent
and Precise Agriculture Model in Sustainable
Cities Based on Visualized Symptoms.
Agriculture, 13(4), 2023, p.889.
[5] Mukherjee, S., Sadhukhan, B., Sarkar, N.,
Roy, D. and De, S., Stock market prediction
using deep learning algorithms. CAAI
Transactions on Intelligence Technology,
8(1), 2023, pp.82-94.
[6] Dwivedi, Y.K., Kshetri, N., Hughes, L.,
Slade, E.L., Jeyaraj, A., Kar, A.K.,
Baabdullah, A.M., Koohang, A., Raghavan,
V., Ahuja, M. and Albanna, H., “So what if
ChatGPT wrote it?” Multidisciplinary
perspectives on opportunities, challenges and
implications of generative conversational AI
for research, practice and policy. International
Journal of Information Management, 71,
2023, p.102642.
https://doi.org/10.1016/j.ijinfomgt.2023.1026
42.
[7] Brown, T., Mann, B., Ryder, N., Subbiah, M.,
Kaplan, J.D., Dhariwal, P., Neelakantan, A.,
Shyam, P., Sastry, G., Askell, A. and
Agarwal, S., Language models are few-shot
learners. Advances in neural information
processing systems, 33, 2020, pp.1877-1901.
[8] Daras, G. and Dimakis, A.G., Discovering the
hidden vocabulary of dalle-2. 2022 arXiv
preprint arXiv: 2206.00169,
DOI:10.48550/arXiv.2206.00169.
[9] Kingma, D.P. and Welling, M., Auto-
encoding variational bayes. 2013, arXiv
preprint arXiv:1312.6114,
https://doi.org/10.48550/arXiv.1312.6114.
[10] I. Goodfellow, J. Pouget-Abadie, M. Mirza,
B. Xu, D. Warde-Farley, S. Ozair, A.
Courville, and Y. Bengio, “Generative
adversarial nets,” in Advances in Neural
Information Processing Systems 27, Montreal,
Quebec, Canada, 2014, pp. 2672-2680.
[11] Ackley, D.H., Hinton, G.E. and Sejnowski,
T.J., A learning algorithm for Boltzmann
machines. Cognitive science, 9(1), 1985,
pp.147-169.
[12] Vaswani, A., Shazeer, N., Parmar, N.,
Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser,
Ł. and Polosukhin, I., Attention is all you
need. 31st Conference on Neural Information
Processing Systems, 2017. arXiv preprint
arXiv.v,
https://doi.org/10.48550/arXiv.1706.03762.
[13] Basem S. Abunasser, Salwani Mohd Daud
and Samy S. Abu-Naser, Predicting Stock
Prices using Artificial Intelligence: A
Comparative Study of Machine Learning
Algorithms, International Journal of
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.40
Ahmad Al-Dahoud, Mohamed Fezari,
Ali-Al-Dahoud, Darah Aqel, Hani Mimi,
Mohammad Sh. Daoud
E-ISSN: 2415-1521
413
Volume 12, 2024
Advances in Soft Computing and its
Application, 15(3), 2023, 41-53. doi:
10.15849/IJASCA.231130.03.
[14] Liu, Y., Han, T., Ma, S., Zhang, J., Yang, Y.,
Tian, J., He, H., Li, A., He, M., Liu, Z. and
Wu, Z., Summary of chatgpt/gpt-4 research
and perspective towards the future of large
language models. Meta-Radiology, 1(2). 2023.
pp. 100017,
https://doi.org/10.1016/j.metrad.2023.100017
[15] Dosovitskiy, A., Beyer, L., Kolesnikov, A.,
Weissenborn, D., Zhai, X., Unterthiner, T.,
Dehghani, M., Minderer, M., Heigold, G.,
Gelly, S. and Uszkoreit, J., An image is worth
16x16 words: Transformers for image
recognition at scale. 2020, arXiv preprint
arXiv:2010.11929.
[16] Akb Akbari, H., Yuan, L., Qian, R., Chuang,
W.H., Chang, S.F., Cui, Y. and Gong, B.,
Vatt: Transformers for multimodal self-
supervised learning from raw video, audio and
text. Advances in Neural Information
Processing Systems, 34, 2021, pp.24206-
24221.
[17] Bao, H., Wang, W., Dong, L., Liu, Q.,
Mohammed, O.K., Aggarwal, K., Som, S.,
Piao, S. and Wei, F., Vlmo: Unified vision-
language pre-training with mixture-of-
modality-experts. Advances in Neural
Information Processing Systems, 35, 2022,
pp.32897-32912.
[18] Varghese, E.B. and Thampi, S.M., A
multimodal deep fusion graph framework to
detect social distancing violations and FCGs
in pandemic surveillance. Engineering
Applications of Artificial Intelligence, 103,
2021, p.104305.
[19] Baruch, E.B. and Keller, Y., Multimodal
matching using a hybrid convolutional neural
network. 2018, (Doctoral dissertation, Ben-
Gurion University of the Negev).
[20] Wang, C., Wang, X., Long, Z., Yuan, J., Qian,
Y. and Li, J., August. Multimodal gait
analysis based on wearable inertial and
microphone sensors. In 2017 IEEE
SmartWorld, Ubiquitous Intelligence &
Computing, Advanced & Trusted Computed,
Scalable Computing & Communications,
Cloud & Big Data Computing, Internet of
People and Smart City Innovation, (pp. 1-8).
2017, IEEE.
[21] Mohammad Husien Almajali, Laith Nasrawin,
Faisal Tayel Alqudah, Ahmad Abdallah
Althunibat and Nasir Albalawee, Technical
Service Error as a Pillar of Administrative
Responsibility for Artificial Intelligence (AI)
Operations, International Journal of Advances
in Soft Computing and its Application, 15(3),
2023, 274-287. doi:
10.15849/IJASCA.231130. 18
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The authors contributed equally to the present
research, at all stages from the formulation of the
problem to the final findings and solution.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors have no conflicts of interest to declare.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US