New chatGPT 3.5 Instruction (Prompt) to Calculate Statistical
Indicators for Student Graduation Projects
VALERY OKULICH-KAZARIN
National Louis University,
Nowy Sącz,
POLAND
Abstract: - The paper aims to develop a new chatGPT 3.5 instruction (prompt) for computing statistical
indicators in student graduation projects. A bibliometric analysis of 79606 sources published in the Scopus
database revealed a high level of interest in solving problems related to "graduation projects" and "statistical
indicators." Numerous studies emphasize the importance of probability and statistics education. Concurrently,
educators are advised to abandon teaching manual calculation methods to students. ChatGPT could serve as a
modern tool for computing statistical indicators. Modern methods employed in this research included reviewing
scientific literature, analysis and synthesis, bibliometric analysis, mathematical modeling, computation of
statistical indicators, and verification of statistical hypotheses using Z-statistics. Five examples of calculating
statistical indicators are provided in this paper. Three tools were used for computing statistical indicators, with
the new chatGPT 3.5 instruction (prompt) serving as the experimental method, while Excel tables and
Windows calculator were used as control methods. Verification of statistical hypotheses using Z-statistics
demonstrated the equality of results between experimental and control methods. The standard testing level was
set at α = 0.05. The novelty of this work lies in the creation of the new chatGPT 3.5 instruction (prompt) for
computing statistical indicators in student graduation projects. Additionally, a User's Guide has been published.
The practical value of this work lies in reducing the time and simplifying the method for computing statistical
indicators in preparing graduation projects, as well as in improving their quality. An additional benefit is the
expanded use of computers for educational purposes.
Key-Words: - chatGPT 3.5, prompt, computer, graduation project, statistical indicators, student.
Received: April [...]. Revised: February [...]. Accepted: March [...]. Published: May [...].
1 Introduction
The development of information technologies
expands the possibilities for implementing new tools
in education, [1], [2], [3], [4], [5], [6], [7],
mathematical modeling, [8], [9] and the use of
intelligent systems, [10], [11], [12]. For instance,
ChatGPT provides users with the ability to interact
with a computer using natural language. This makes
ChatGPT a valuable tool in educational processes.
Currently, such tools as ChatGPT are becoming
increasingly popular in various fields, including
higher education, [12], [13], [14], [15], [16].
"ChatGPT is an AI-powered language model
developed by OpenAI. It has been trained on a
massive amount of text data from the internet and
can generate human-like text responses to a given
prompt. It can answer questions, converse on a
variety of topics, and generate creative writing
pieces" (https://chatgpt.org/). Model 3.5 is a
publicly available free model. Therefore, it was
chosen in this study.
This article is dedicated to a new guide on using
ChatGPT version 3.5 for computing statistical
indicators in student graduation projects.
In the modern educational process, student
graduation projects play a significant role in the
development and assessment of students' skills.
Computing statistical indicators is a key stage in the
analysis of empirical data. Therefore, scientists from
different countries are researching the process of
teaching students the methodology of computing
statistical indicators, [17], [18], [19], [20], [21],
[22], [23], [24], [25], [26].
This article presents an instruction for
computing statistical indicators, such as the mean
and standard deviation, using the neural network
ChatGPT version 3.5. The possibilities of using
various software tools and technologies, including
artificial intelligence, for automating the process of
computing and presenting statistical data are
discussed.
The use of ChatGPT 3.5 in student graduation
projects opens up new perspectives for students.
This research enables students to more effectively
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.30
Valery Okulich-Kazarin
E-ISSN: 2415-1521
307
Volume 12, 2024
apply statistical methods in their graduation
projects, thus contributing to the development of
their professional skills and the improvement of the
quality of their final work.
The goal was to provide students with an
instruction (prompt) for effectively using this
technology to compute statistical indicators in their
graduation projects. This instruction (prompt) can
also be used in other research and practical projects
of students. The author will present the prompt and
a guide for its application. A very important
advantage of the new chatGPT 3.5 prompt is that
it does not require entering an array of data to
compute statistical indicators.
Additionally, the author will provide 5 examples
of its usage to simplify the processing of statistical
data in student projects.
The author hopes that this instruction (prompt)
will become a valuable resource for students,
helping them reduce the time spent on preparing
their graduation projects and improve their quality.
2 Problem Formulation
Works, [27], [28], [29], demonstrate that graduation
projects as a didactic tool have been endorsed by
practitioners and researchers in the field of higher
education. They value this didactic tool for its rigor,
depth of content, and engagement of students in
cutting-edge academic work. Student graduation
projects reflect a causal model of behavior, the
ability to exchange knowledge, and teamwork. Such
projects have had a positive impact on graduates.
The results of the first step of a bibliometric
analysis are visualized in Figure 1. The analysis was
conducted using 2590 document search results from
the last 30 years (from 1994 to 2023). Document
search was performed using the keywords
"graduation project" in the Scopus database
(https://www.scopus.com/term/analyzer.uri).
Fig. 1: Number of documents from 1994 to 2023 by
topic «graduation project»
Figure 1 illustrates a generally unstable growth
trend in the number of publications on the topic of
"graduation project" from the beginning to the end
of the analyzed period. The overall trend of
increasing publication numbers indicates a growing
interest among researchers in the topic of
"graduation project." Figure 1 provides indirect
evidence of existing issues in the field of
"graduation project" that require increasing attention
from researchers.
The second stage of the bibliometric analysis is
visualized in Figure 2. Here, 72774 document
search results from the last 30 years (from 1994 to
2023) were processed. Document search was
conducted using the keywords "statistical
indicators" in the Scopus database
(https://www.scopus.com/term/analyzer.uri).
Fig. 2: Number of documents from 1994 to 2023 by
topic «statistical indicators»
Figure 2 depicts a consistent and stable growth
trend in the number of publications on the topic of
"statistical indicators" from the beginning to the end
of the analyzed period. The increasing trend in
publication numbers indicates a growing interest
among researchers in the topic of "statistical
indicators." This suggests that researchers are
continuously working on addressing issues in the
field of "statistical indicators" (Figure 2).
Numerous scientific works emphasize the
importance of this topic. For example, the
importance of studying probability and statistics is
recognized in the current Australian mathematics
curriculum, [21]. Meanwhile, work [23], asserts that
psychology students often experience anxiety about
studying statistics, which can affect their
performance. Similarly, the author of paper
[25], states that sociology students may approach
the study of statistical methods reluctantly and with
apprehension. Furthermore, statistics courses are
perceived by students as challenging, [17]. The
attitude, motivation, and preparedness of students
can negatively impact their experience and
potentially hinder students' learning of statistics,
[17].
One of the reasons for students' negative attitude
towards studying statistics is the limited amount of
time dedicated to diversity in courses. Teachers
refer to barriers such as lack of relevance
and insufficient time, [26]. Integrating diverse
content into statistics courses is a challenging task;
however, the effort is worthwhile and highly
beneficial, [26].
Paper [19], notes that statistics textbooks
typically teach manual methods of computing
statistical indicators. The authors recommend that
teachers move away from teaching manual
computations and focus on conceptual information
and software applications, [19].
Thus, the research problem was justified by the
increasing interest in solving issues in the fields of
"graduation project" and "statistical indicators." The
results of previous research directly guide
researchers in seeking a method of computing
statistical indicators without manual computations.
This implies the need to create new information
tools for such computations.
At the third stage of the bibliometric analysis, the
author visualized 4242 document search results for
the keyword "chatGPT" (Figure 3). Document
search was conducted in the Scopus database
(https://www.scopus.com/term/analyzer.uri) over
the last 30 years (from 1994 to 2023).
Unfortunately, the Scopus database only provided
results for the period 2015-2023.
Fig. 3: Number of documents from 2015 to 2023 by
topic «chatGPT»
Figure 3 illustrates a consistent absence of
publications on this topic until 2022. However, in
2023, there were 4233 scientific documents related
to chatGPT published. This indicates a sharp
increase in research interest in this topic (Figure 3).
Summarizing the results of the bibliometric
analysis (Figure 1, Figure 2 and Figure 3), three
observations can be made:
1. The interest of researchers in the topic of
“graduation project” can be described by an
unstable but growing trend line.
2. The interest of researchers in the topic of
“statistical indicators” can be described by a
steadily growing trend line.
3. Research interest in the topic “chatGPT” is
characterized by a sharp increase from zero in
2022 to 4233 publications in 2023.
As a result, a joint analysis of these three
Figures may indicate a high scientific novelty of the
idea of using chatGPT to perform statistical
calculations when completing graduation projects.
Figure 1 and Figure 2 point to the practical
significance of this article.
The author chose to compute statistical
indicators using chatGPT 3.5. When a student uses
the new chatGPT prompt for computing statistical
indicators, they save time on manual computations.
This time can be allocated to focusing on conceptual
information and software applications, as
recommended, [19].
Why did the author choose version 3.5? This
version of chatGPT was chosen due to its free
availability and accessibility for students. The
possibility of errors in calculations using chatGPT
3.5 was also confirmed in the study, [30].
The research goal is to create a new chatGPT
3.5 instruction (prompt) to calculate statistical
indicators for student graduation projects.
The research goal was achieved using modern
methods: review of scientific sources, analysis and
synthesis, bibliometric analysis, mathematical
modeling, calculation of statistical indicators,
verification of statistical hypotheses (Z-statistics).
3 Problem Solution
The research was performed from December 2023
to February 2024. The empirical part of the study
was carried out at National Louis University. The
calculations relied on the standard tools used
in the normal educational process (including the
implementation of graduation projects) of the named
university: the standard Microsoft Office Excel 2007
package and Windows 10.
After the problem formulation, the problem was
solved in five steps:
Step 1: Search for important sample sizes by
mathematical modeling.
Step 2: Creation of the new chatGPT 3.5
instruction (prompt) for computing statistical
indicators.
Step 3: Check of the new chatGPT 3.5 instruction
(prompt) using 5 examples, including 2 real cases
and 3 abstract cases.
Step 4: Execution of Z-statistics to compare the
results of calculations of statistical indicators
obtained by the new prompt and by using the
Windows calculator.
Step 5: Writing a user's guide to the new chatGPT
3.5 instruction (prompt) for students conducting
graduation projects.
Note that the subsections below do not follow this
step numbering: the user's guide (Step 5) is given in
Section 3.3, before the examples and the Z-statistics.
3.1 Sample Sizes
The first step involved calculating 5 sample size
values for which modeling the instruction (prompt)
to calculate statistical indicators for student
graduation projects would be performed.
The logic of this step is based on the experience
of supervising student graduation projects (works).
The maximum size of the population under
investigation by students when conducting
graduation projects was taken as the basis. In
practice, this population size was assumed to be less
than 50000 units. The minimum number of units for
a master's graduation project should be around 200.
So, these population sizes (200-50000) are considered
sufficient for a graduation project.
It is considered that for making business
decisions, the sampling error should not exceed
4.00%, [31]. The standard testing (confidence) levels
of 0.95 or 0.99 are most commonly used, [31]. In our
case, a sampling error of 4.00% and the standard
testing level of 0.95 were chosen.
Table 1 shows the sample size values for further
calculations. Mathematical modeling is performed
using an online calculator, [32].
Table 1. Results of mathematical modeling for choosing the sample sizes
General population size    200    500    3000    50000
Sample size, N             150    273    500     593
Table 1 shows that with a population size of
50,000 units, the sample size is N=593 units.
Increasing the population size to 100000 units
resulted in a sample size of N=597 units, [32]. For
populations exceeding 100000 units, the sample size
is N=600 units, [32]. This means that the chosen
maximum sample size of N=593 units is justified for
graduation projects.
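The online calculator, [32], is not described in the text. The following is a minimal sketch, assuming that it implements Cochran's formula with a finite population correction, a response proportion of p = 0.5, z = 1.96 for the 0.95 level, and rounding to the nearest integer. The function name `sample_size` is hypothetical. Under these assumptions the sketch reproduces the sample sizes listed in Table 1 and the values quoted above for larger populations.

```python
def sample_size(population: int, margin: float = 0.04,
                z: float = 1.96, p: float = 0.5) -> int:
    """Sample size for a proportion with finite population correction.

    Assumed formula (not confirmed by the source calculator):
        n0 = z^2 * p * (1 - p) / margin^2
        n  = n0 / (1 + (n0 - 1) / population)
    """
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return round(n)


if __name__ == "__main__":
    # Population sizes taken from Table 1 and the paragraph above.
    for size in (200, 500, 3000, 50000, 100000):
        print(size, sample_size(size))
```

With the default 4.00% sampling error, the result approaches 600 as the population grows, which agrees with the limit quoted from [32].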
Additionally, Table 1 displays 5 sample size
values for which calculations of statistical indicators
M(x) and δx will be performed, [33], [34]. In other
words, further calculations will be conducted for
five sample size values (Table 1).
3.2 Creating the New chatGPT 3.5
Instruction (Prompt) to Calculate
Statistical Indicators
As mentioned earlier, chatGPT 3.5 easily
computed statistical indicators for sample sizes of
fewer than 10 units. However, for sample sizes
exceeding 10 units, chatGPT 3.5 occasionally
produced errors in the results. As shown in Table 1,
the sample size can theoretically reach up to 593
units in graduation projects.
When creating the instruction (prompt), the
following simple scenarios were considered:
students conduct a basic study where only 2 states
need to be evaluated (agree - disagree; on - off; yes -
no; black - white; growth-decrease; etc.). These are
situations where responses can be digitized as "0"
and "1". Creating an instruction (prompt) for
situations with a larger number of responses (e.g.,
0.0, 0.5, 1.0) may be a task for future research.
Multiple refinements led to the following new
instruction (prompt) for chatGPT 3.5:
"The sample of numbers consists of N=Х
digits, of which N1=Х1 digits "1", N2=Х2 digits
"0" ::
1) Calculate the value of the sample mean M(x).
Write down the value of M(x) ::
2) Calculate Yi as the difference between each
element of the sample and the average value of the
sample, Yi = xi−M(x). Write the values of Yi in the
column ::
3) Calculate Bi as the squares of the differences
between each element of the sample and the average
value of the sample, Bi = (xi−M(x))². Write down
the Bi values in the column ::
4) Calculate Zi as the product of Ni and Bi, Zi =
Ni × Bi. Write down the values of Zi in the column ::
5) Calculate Z as the sum of Z = ΣZi. Write
down the value of Z ::
6) Calculate C as a quotient of C = Z/N. Write
down the value of C ::
7) Calculate the value of the standard deviation
for the sample δx as the square root of C, that is, δx
= √ C. Write down the value of δx ::
8) Write down the result in the form: M(x) = the
result of calculations up to the fourth decimal place,
δx = the result of calculations up to the fourth
decimal place. Write down the letters M(x) and δx
in bold, please".
This instruction (prompt) contains the following
input symbols, [33], [34]:
- N - the sample size,
- N1, N2 - the numbers of alternative answers (for
example, for "yes" and for "no").
This instruction (prompt) contains the following
numerical input values:
- Х, Х1, Х2 - the number of respondents'
responses, and Х = Х1 + Х2.
This instruction (prompt) contains the following
output symbols, [33], [34]:
- M(x) - the expected value,
- δx - the standard deviation for the sample.
The rest of the symbols do not matter to the user.
The division of the prompt into separate steps
addresses a problem that the author encountered
when asking chatGPT 3.5 to calculate the statistical
indicators directly. ChatGPT 3.5 easily performed
computations of statistical indicators for sample
sizes of less than 10 units. However, for larger
sample sizes, chatGPT 3.5 did not yield stable
results. Nevertheless, student graduation projects
require sample sizes larger than 10 units. ChatGPT
3.5 performed well once the process of
calculating statistical indicators was divided into
small individual steps. Each step is responsible for
performing one simple mathematical operation.
At the first step, chatGPT calculated the
mathematical expectation M(x). Steps 2-7 were
needed for step-by-step calculation of the standard
deviation for the sample δx. Step 8 gives the
command to print the obtained results with a
precision of 4 decimal places.
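The eight steps of the prompt can be mirrored in a few lines of code, which may help when checking a chatGPT answer. This is a minimal sketch rather than the author's control procedure: the function name `binary_stats` and its variable names are hypothetical, and the division by N (not N−1) follows step 6 of the prompt.

```python
from math import sqrt


def binary_stats(n1: int, n2: int) -> tuple[float, float]:
    """Mean and standard deviation of a sample of n1 ones and n2 zeros."""
    n = n1 + n2
    if n == 0:
        raise ValueError("the sample must contain at least one element")
    mean = n1 / n                                    # step 1: M(x)
    y = {1: 1 - mean, 0: 0 - mean}                   # step 2: Yi = xi - M(x)
    b = {value: dev ** 2 for value, dev in y.items()}  # step 3: Bi
    z_parts = (n1 * b[1], n2 * b[0])                 # step 4: Zi = Ni * Bi
    z = sum(z_parts)                                 # step 5: Z
    c = z / n                                        # step 6: C = Z / N
    sd = sqrt(c)                                     # step 7: standard deviation
    return round(mean, 4), round(sd, 4)              # step 8: four decimals


if __name__ == "__main__":
    # Counts used for the N=375 sample shown in Figure 4 and Figure 5.
    print(binary_stats(34, 341))
```

Because only the two counts are needed, the sketch shares the main property of the prompt: no data array has to be entered.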
A very important advantage of the new chatGPT
3.5 instruction (prompt) is that it relieves the student
from entering data arrays: only three numbers are
inserted into the prompt.
3.3 The User's Guide to the New chatGPT
Instructions (Prompt)
The user of the new chatGPT 3.5 instruction
(prompt) performs the following actions:
1. Registers on the website
https://chat.openai.com/auth/login.
2. Copies the prompt from section 3.2 of this paper.
3. Inputs three individual numbers into the
instruction (prompt): X, X1, X2. These
represent the sample size and the number of
responses from respondents for each of the two
alternatives.
4. Pastes the instruction (prompt) into the chatGPT
dialog box and presses the "Send message"
button.
5. Writes down the obtained values of the statistical
indicators M(x) and δx. These values are
reported to four decimal places.
The advantage of the new instruction (prompt)
is the absence of the need to input arrays of
empirical data or perform any manipulations with
them.
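Action 3 of the guide can also be scripted so that the condition X = X1 + X2 always holds. The sketch below is illustrative only: `PROMPT_HEAD` and `fill_prompt_head` are hypothetical names, and only the opening sentence of the prompt is templated; steps 1) to 8) from section 3.2 would be appended unchanged.

```python
# Opening sentence of the prompt from section 3.2 with placeholders
# for X, X1 and X2 (hypothetical template name).
PROMPT_HEAD = (
    'The sample of numbers consists of N={n} digits, '
    'of which N1={n1} digits "1", N2={n2} digits "0" ::'
)


def fill_prompt_head(n1: int, n2: int) -> str:
    """Insert the two response counts; N is derived as N1 + N2."""
    if n1 < 0 or n2 < 0 or n1 + n2 == 0:
        raise ValueError("counts must be non-negative and not both zero")
    return PROMPT_HEAD.format(n=n1 + n2, n1=n1, n2=n2)


if __name__ == "__main__":
    print(fill_prompt_head(34, 341))
```

Deriving N from the two counts avoids a mismatch between the three numbers typed into the prompt.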
3.4 Five Examples of Calculations of
Statistical Indicators
Five examples are provided in Table 2 to assess the
accuracy of the new instruction (prompt). The initial
data N are taken from Table 1, which presents
sample sizes obtained through mathematical
modeling for various population sizes. Calculations
of statistical indicators M(x) and δx were performed
using three methods:
- Using the new instruction (prompt) for chatGPT
3.5,
- Using Excel tables,
- Using the Windows calculator.
One limitation of our study is that only two
simple and accessible tools, Excel tables and the
Windows calculator, were used as control methods
in the empirical part of the study. However,
these are tools accessible to ordinary students and
the use of more complex tools is not planned.
Example 1 combines rows 1-3. Example 2
combines rows 4-6. Example 3 combines rows 7-9.
Example 4 combines rows 10-12. Example 5
combines rows 13-15.
When deciding on the sample size, the
following conditions were chosen when considering
the examples:
- Sampling error 4.00%, [31],
- Standard testing level 0.95, [31],
- Boundaries of the general population from 200 to
50,000.
Intermediate values of the general population size
were selected randomly as psychologically easily
perceived values.
The sample size values N, N1, and N2 for
Examples 1 and 5 (Table 2) were taken from a real-
life case, [35]. They are borrowed from source, [35]
because the sample sizes of N = 144 and N = 599 in
the real cases are close to the modelled sample size
(Table 1), [32]. In the first example, N is slightly
smaller than what was obtained through
mathematical modeling (Table 1). In the fifth
example, N is slightly larger than what was obtained
through mathematical modeling (Table 1).
In Examples 2, 3, and 4 (Table 2), the values of
N1 and N2 were arbitrarily chosen.
Table 2. Five examples of calculations of statistical indicators M(x) and δx
Row   N     N1    N2    The tool       M(x)     δx
1     144   51    93    New prompt     0.3542   0.4787
2                       Excel tables   0.3542   0.4783
3                       Windows        0.3542   0.4783
4     273   33    240   New prompt     0.1209   0.3269
5                       Excel tables   0.1209   0.3266
6                       Windows        0.1254   0.3302
7     375   34    341   New prompt     0.0907   0.2871
8                       Excel tables   0.0907   0.2871
9                       Windows        0.0933   0.2909
10    500   40    460   New prompt     0.0800   0.2716
11                      Excel tables   0.0800   0.2713
12                      Windows        0.0820   0.2744
13    599   65    534   New prompt     0.1085   0.3114
14                      Excel tables   0.1085   0.3110
15                      Windows        0.1085   0.3110
Table 2 shows that in Examples 1 and 5 the values of
M(x) coincide for all three calculation methods, while
the values of δx obtained with the new prompt differ
from the control values by 0.0004. In Examples 2, 3, and
4 (Table 2), the values of M(x) coincide for
the new instruction (prompt) and Excel tables, and the
values of δx differ by no more than 0.0003.
The figures obtained in Table 2 indicate
differences in the calculation of statistical indicators
using Excel tables and the Windows calculator in
Examples 2, 3, and 4. The agreement between the results
obtained using Excel tables and the new prompt in all
cases argues against errors in the new method. The
difference in values obtained using Excel tables and the
Windows calculator could cast doubt on the
reliability of using the Windows calculator.
An example of computing statistical indicators
M(x) and δx using chatGPT 3.5 instructions
(prompt) for a sample with N=375 is presented in
Figure 4 and Figure 5. In Table 2, this example is
located in row 7.
Fig. 4: Screenshot: Example of the new chatGPT 3.5
instruction (prompt) for the sample with N=375
(Table 2, example 3, lines 7-9)
Figure 4 illustrates the detailed appearance of the
instruction (prompt).
Fig. 5: Screenshot: Results of computing statistical
indicators M(x) and δx for the sample with N=375
(Table 2, example 3, lines 7-9)
Figure 5 displays the obtained result in the red
rectangle.
The words "ChatGPT 3.5" have been shifted to
the right by the author in Figure 4 and Figure 5 to
make the figures more compact.
The values of M(x) obtained using ChatGPT
differ by 0.0020-0.0045 from those obtained using
the Windows calculator in Examples 2-4 (Table 2).
Thus, the presented calculation results differ in the
third decimal place. This means that high accuracy
of calculations has been achieved, which is
sufficient to complete students’ graduation projects.
A rule of thumb in current prompt engineering
practice holds that 3 checks of a new prompt are
sufficient for debugging it
(https://clockwork-school.com/aistaff?utm_source=vebinar&utm_medium=030424&utm_campaign=workshop#about).
In our case, we checked the reliability of the prompt 5
times.
Although there is equal reason to trust
the accuracy of all three tools, the author
sequentially verifies three pairs of statistical
hypotheses for Examples 2-4.
3.5 Z-statistics
During the verification of the statistical hypotheses,
the author utilized the method of comparing the
means of two independent samples. The essence of
this method lies in computing the Z-statistic, [33],
[34].
Z-statistics are often used to analyze whether
the means of two sets of data are the same, provided
that the population variance is known
(https://habr.com/ru/companies/otus/articles/793678/).
Thus, hypothesis testing is based on assessing the
significance of the difference M(x1) − M(x2).
When performing the Z-statistic, the research
hypothesis is the assumption that there is no
significant difference between the variables under
study, [33], [34]. In the context of the Z-statistic,
which is often used to compare means, the research
(null) hypothesis typically states that the means of
the two groups being compared are equal. The
alternative hypothesis is the opposite of the null
hypothesis: the means of the two groups being
compared are not equal. Therefore, the author chose
the Z-statistics method.
The comparison of the two means will be
conducted for two methods of calculating statistical
indicators: using the new prompt and using the
Windows calculator. The comparison is carried out
for three pairs of rows in Table 2:
- Row 4 and Row 6,
- Row 7 and Row 9,
- Row 10 and Row 12.
So, the author made three pairs of hypotheses,
[33], [34], each consisting of a Research hypothesis
and an Alternative hypothesis.
Research hypothesis: M(x1) − M(x2) = 0.00,
apart from random deviations.
Alternative hypothesis: M(x1) − M(x2) ≠ 0.00,
apart from random deviations. In this case, the
difference can be either greater than 0.00 or less
than 0.00.
If the difference M(x1) − M(x2) equals 0.00, it
means that the two sample means are equal to each
other. There is no statistically significant difference
between the two means. The results of calculating
the statistical indicator M(x) using two methods are
equal; their difference is explained by random
deviations.
If the difference M(x1) − M(x2) is greater or
less than 0.00, it means that the two sample means
are not equal to each other. There is a statistically
significant difference between the two means. The
results of calculating the statistical indicator M(x)
using two methods differ from each other.
A detailed description of the Z-statistic is
provided in statistics textbooks, including [33], [34],
[36]. Here, a two-sided test was adopted because the
comparison result can be both greater than 0.00
and less than 0.00.
For each verification case (Table 3, Table 4,
Table 5), the author used the standard significance
level, [33], which is 0.05 (α = 0.05).
Table 3. Results of comparing two sample averages for row 4 and row 6
Calculation stage                                        Row 4      Row 6
The size of a sample, N                                  273        273
The expected value, M(x)                                 0.1209     0.1254
| M(x1) − M(x2) |                                        0.0045
μ1 − μ2                                                  0.00
The standard deviation for the sample, δx                0.3269     0.3302
Average error, Ṡ = δx / √n                               0.0198     0.0200
Ṡ²                                                       0.00039    0.00040
| Ṡ1² − Ṡ2² |                                            0.00001
√ (Ṡ1² − Ṡ2²)                                            0.00316
| zstat | = [M(x1) − M(x2) − (μ1 − μ2)] / √ (Ṡ1² − Ṡ2²)  1.4240
The value ztabl for the standard testing level of 0.05, [33], [34]   1.96
Result, | zstat | > ztabl                                No
Table 3 shows that the Z-statistic | zstat | is less
than ztabl. In this case, the Research hypothesis is
accepted: M(x1) − M(x2) = 0.00. This means that
the results of calculating the statistical indicator M(x)
are equal for both methods; their difference is
explained by random deviations.
Table 4. Results of comparing two sample averages for row 7 and row 9
Calculation stage                                        Row 7      Row 9
The size of a sample, N                                  375        375
The expected value, M(x)                                 0.0907     0.0933
| M(x1) − M(x2) |                                        0.0026
μ1 − μ2                                                  0.00
The standard deviation for the sample, δx                0.2871     0.2909
Average error, Ṡ = δx / √n                               0.0148     0.0150
Ṡ²                                                       0.00022    0.00023
| Ṡ1² − Ṡ2² |                                            0.00001
√ (Ṡ1² − Ṡ2²)                                            0.00316
| zstat | = [M(x1) − M(x2) − (μ1 − μ2)] / √ (Ṡ1² − Ṡ2²)  0.8228
The value ztabl for the standard testing level of 0.05, [33], [34]   1.96
Result, | zstat | > ztabl                                No
Table 4 shows that the Z-statistic | zstat | is less
than ztabl. In this case, the Research hypothesis is
accepted: M(x1) − M(x2) = 0.00. This means that
the results of calculating the statistical indicator M(x)
are equal for both methods; their difference is
explained by random deviations.
Table 5. Results of comparing two sample averages for row 10 and row 12
Calculation stage                                        Row 10     Row 12
The size of a sample, N                                  500        500
The expected value, M(x)                                 0.0800     0.0820
| M(x1) − M(x2) |                                        0.0020
μ1 − μ2                                                  0.00
The standard deviation for the sample, δx                0.2716     0.2744
Average error, Ṡ = δx / √n                               0.0121     0.0123
Ṡ²                                                       0.000147   0.000151
| Ṡ1² − Ṡ2² |                                            0.000004
√ (Ṡ1² − Ṡ2²)                                            0.002
| zstat | = [M(x1) − M(x2) − (μ1 − μ2)] / √ (Ṡ1² − Ṡ2²)  1.000
The value ztabl for the standard testing level of 0.05, [33], [34]   1.96
Result, | zstat | > ztabl                                No
Table 5 shows that the Z-statistic | zstat | is less
than ztabl. In this case, the Research hypothesis is
accepted: M(x1) − M(x2) = 0.00. This means that
the results of calculating the statistical indicator M(x)
are equal for both methods; their difference is
explained by random deviations.
Thus, the Z-statistic confirmed the equality of
the statistical indicators calculated using the new
chatGPT 3.5 instruction (prompt) and two other
traditional methods. The standard testing level is α =
0.05.
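Tables 3-5 take the square root of the difference between the two squared average errors. The conventional textbook form of the two-sample Z-statistic adds the squared average errors instead. The following is a minimal sketch of that conventional form, applied to the same three pairs of rows from Table 2; it is offered as a cross-check under this assumption, not as the calculation used in the tables, and the names `two_sample_z`, `PAIRS` and `Z_CRIT` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(m1: float, sd1: float, n1: int,
                 m2: float, sd2: float, n2: int) -> float:
    """Two-sample Z-statistic for H0: mu1 - mu2 = 0 (summed squared errors)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / standard_error


ALPHA = 0.05
# Two-sided critical value; about 1.96 for alpha = 0.05.
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)

# (M(x1), sd1, N1, M(x2), sd2, N2) for new prompt vs Windows calculator, Table 2.
PAIRS = {
    "rows 4 and 6": (0.1209, 0.3269, 273, 0.1254, 0.3302, 273),
    "rows 7 and 9": (0.0907, 0.2871, 375, 0.0933, 0.2909, 375),
    "rows 10 and 12": (0.0800, 0.2716, 500, 0.0820, 0.2744, 500),
}

if __name__ == "__main__":
    for label, values in PAIRS.items():
        z = two_sample_z(*values)
        print(f"{label}: |z| = {abs(z):.4f}, reject H0: {abs(z) > Z_CRIT}")
```

Under this form the denominator is larger than in Tables 3-5, so | zstat | is smaller and stays below ztabl for all three pairs; the decision on the Research hypothesis does not change.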
This means that students can use the new
chatGPT 3.5 instruction (prompt) to calculate
statistical indicators when completing graduation
and other projects. However, we may encounter
both a prejudice that the use of artificial intelligence
is not in demand for simple statistical calculations,
and a prejudice about the uselessness of Artificial
Intelligence in general.
4 Conclusion
The purpose of the study has been achieved.
The research problem has two main aspects.
Firstly, the three stages of bibliometric analysis on
79606 sources published in the Scopus database
showed increased interest in addressing issues
related to "graduation project" and "statistical
indicators." The research results suggest the need for
finding a way to calculate statistical indicators
without manual computations. Secondly, the
possibility of errors in calculations using chatGPT
3.5 was empirically confirmed, as well as in
scientific literature.
The novelty of the work lies in the creation of
the new chatGPT 3.5 instruction (prompt) to
calculate statistical indicators for student graduation
projects. The new chatGPT 3.5 instruction (prompt)
enables the calculation of statistical indicators
without manual computations. The significant
advantage of the new chatGPT 3.5 instruction
(prompt) lies in relieving the student from entering
data arrays into the tool for computing statistical
indicators.
The new chatGPT 3.5 instruction
(prompt) should become a valuable resource for
students, helping them save time in preparing their
graduation projects. Students, instructors, and other
interested parties can use the prompt from section
3.2 and the user's guide from section 3.3.
The study has scientific and practical
significance, since it reduces the time and simplifies
the process of calculating statistical indicators of
graduation projects, and also improves their quality:
1. The new ChatGPT 3.5 instruction (prompt) is a
useful resource for students to save time on
graduation projects.
2. Using the proposed prompt makes it possible to
improve the quality of graduation projects.
3. The new ChatGPT 3.5 instruction (prompt) can
help students with varying levels of statistics
knowledge.
4. This result has the potential to expand the
use of computers in education.
One of the limitations of the study is the choice
of the simplest option for calculating statistical
indicators. As empirical data and prompt
engineering experience are collected, instructions
(prompts) will be compiled for solving more
complex statistical problems. Exploring how the
performance of ChatGPT 3.5 might vary with
different types of statistical indicators, sample sizes,
or data distributions is one of the next steps of our
study. The next goal of the research is to create a
new chatGPT 3.5 instruction (prompt) for
computing statistical indicators for a larger number
of respondents (units). Creating instructions
(prompts) for situations with a larger number of
response options (e.g., 0.0, 0.5, 1.0) and collecting
quality feedback from students who use the new
instruction may also be tasks for future research.
Acknowledgement:
The study was carried out with the support of the
Eastern European Scientific Group. The author
would like to thank the reviewers for their valuable
comments.
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.30
Valery Okulich-Kazarin
E-ISSN: 2415-1521
314
Volume 12, 2024
References:
[1] Sassy Bayona-Oré, Student Learning Styles in
Information Technology, WSEAS
Transactions on Computer Research, vol. 11,
2023, pp. 385-392,
https://doi.org/10.37394/232018.2023.11.35.
[2] Razeale G. Resultay, Rudjane C. Tunac, Mary
Joy O. Macaraeg, Anti-Bullying Intervention
(ABBI) Program Manual: Grounding
Teachers and Counselors, WSEAS
Transactions on Computer Research, vol. 12,
2024, pp. 181-195,
https://doi.org/10.37394/232018.2024.12.18.
[3] Jakoet-Salie, Amina, Ramalobe, Kutu, The
digitalization of learning and teaching
practices in higher education institutions
during the Covid-19 pandemic, Teaching
Public Administration, Vol. 41, No.1, 2023,
pp.59-71. DOI: 10.1177/01447394221092275.
[4] Mariam A. Shomoye, Najeem O. Adelakun,
Kehinde L. Adebisi, Exploring the Use of
Electronic Resources for Undergraduate
Learning at the National Open University of
Nigeria in Kwara State, WSEAS Transactions
on Computers, vol. 22, 2023, pp. 302-310,
https://doi.org/10.37394/23205.2023.22.34.
[5] Siti Nur Zahirah Omar, Rusnifaezah Musa,
Maliani Mohamad, Che Mohd Syaharuddin
Che Cob, Azmahani Yaacob Othman, Razli
Ramli, Efficiency Online Learning During
Covid-19 Pandemic, WSEAS Transactions on
Business and Economics, Vol. 20, 2023, pp.
30-39,
https://doi.org/10.37394/23207.2023.20.4.
[6] Liliana Sobreira, David Pascoal Rosado,
Helga Santa Comba Lopes, Maria Jose Sousa,
The Implementation of e-Learning in the
Continuous Training of the Military of the
Republican National Guard. Case Study:
Lisbon Territorial Command, WSEAS
Transactions on Systems and Control, Vol.
16, 2021, pp. 66-76,
https://doi.org/10.37394/23203.2021.16.4.
[7] Nursel Selver Ruzgar, Clare Chua-Chow,
Perception of Students on Online Exams and
How Sequential Exams and the Lockdown
Browser Affect Student Anxiety and
Performance, WSEAS Transactions on
Computer Research, vol. 11, 2023, pp. 92-
110,
https://doi.org/10.37394/232018.2023.11.9.
[8] Bazzi A., Chafii M., On Outage-Based
Beamforming Design for Dual-Functional
Radar-Communication 6G Systems, IEEE
Transactions on Wireless Communications,
vol. 22, no. 8, 2023, pp. 5598-5612, doi:
10.1109/TWC.2023.3235617.
[9] Bazzi A., Chafii M., Secure Full Duplex
Integrated Sensing and Communications,
IEEE Transactions on Information Forensics
and Security, vol. 19, 2024, pp. 2082-2097,
doi: 10.1109/TIFS.2023.3346696.
[10] Pence H., Artificial Intelligence in Higher
Education: New Wine in Old Wineskins?
Journal of Educational Technology Systems,
Vol.48, No.1, 2019, pp. 5-13. doi:
10.1177/0047239519865577.
[11] Gellai Dániel Béla, Enterprising Academcs:
Heterarchical Policy Networks for Artificial
Intelligence in British Higher Education,
ECNU Review of Education, December 19,
2022. doi: 10.1177/20965311221143798.
[12] West C. G., AI and the FCI: Can ChatGPT
project an understanding of introductory
physics? 2023, arXiv preprint
arXiv:2303.01067
[13] Choi J., Hickman K., Monahan A., Schwarcz
D., ChatGPT goes to law school. Minnesota
Legal Studies Research Paper, 23(03), 2023.
[14] Leswing K., OpenAI announces GPT-4,
claims it can beat 90% of humans on the SAT.
CNBC, [Online].
https://www.cnbc.com/2023/03/14/openai-
announces-gpt-4-says-beats-90percent-of-
humans-on-sat.html (Accessed Date: March
26, 2023).
[15] Geerling Wayne, Mateer G. Dirk, Wooten
Jadrian, Damodaran Nikhil. ChatGPT has
Aced the Test of Understanding in College
Economics: Now What? The American
Economist, 68(2), 2023, рр. 233-245, DOI:
10.1177/05694345231169654.
[16] Gilson A., Safranek C. W., Huang T.,
Socrates V., Chi L., Taylor R. A., Chartash D.
How does ChatGPT perform on the United
States medical licensing examination? The
implications of large language models for
medical education and knowledge assessment,
JMIR Medical Education, Vol.9, No.1,
e45312, 2023. DOI: 10.2196/45312.
[17] Wilson S. G. The Flipped Class: A Method to
Address the Challenges of an Undergraduate
Statistics Course, Teaching of Psychology,
40(3), 2013, рр. 193-199, DOI:
10.1177/0098628313487461.
[18] Trinh M. P., Chico R. J., Reed R. M. How
Fun Overcame Fear: The Gamification of a
Graduate-Level Statistics Course, Journal of
Management Education, 0(0), 2023,
https://doi.org/10.1177/10525629231181120.
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.30
Valery Okulich-Kazarin
E-ISSN: 2415-1521
315
Volume 12, 2024
[19] Pirlott A. G., Hines J. C. Eliminating ANOVA
Hand Calculations Predicts Improved Mastery
in an Undergraduate Statistics Course.
Teaching of Psychology, 0(0), 2023, DOI:
10.1177/00986283231183959.
[20] Reynolds J. J. Let’s Talk about Stats:
Revising Our Approach to Teaching Statistics
in Psychology. Psychological Reports, 126(1),
2023, рр. 5-33, DOI:
10.1177/00332941211043447.
[21] Callingham R., Watson J., Oates G. Learning
progressions and the Australian curriculum
mathematics: The case of statistics and
probability, Australian Journal of Education,
65(3), 2021, рр. 329-342,
https://doi.org/10.1177/00049441211036521.
[22] Eastridge J. A., Benson W. L. Comparing
Two Models of Collaborative Testing for
Teaching Statistics, Teaching of Psychology,
47(1), 2020, рр. 68-73,
https://doi.org/10.1177/0098628319888113.
[23] Paltoglou A. E., Morys-Carter W. L., Davies
E. L. From Anxiety to Confidence: Exploring
the Measurement of Statistics Confidence and
its Relationship with Experience, Knowledge
and Competence within Psychology
Undergraduate Students, Psychology Learning
& Teaching, 18(2), 2019, рр. 165-178,
https://doi.org/10.1177/1475725718819290.
[24] Fielding S, Poobalan A, Prescott G, Marais D,
Aucott L. Views of medical students: what,
when and how do they want statistics taught?
Scottish Medical Journal, 60(4), 2015, рр.
164-169. doi:10.1177/0036933015608329.
[25] Ralston K. Sociologists Shouldn’t Have to
Study Statistics: Epistemology and Anxiety of
Statistics in Sociology Students. Sociological
Research Online, 25(2), 2020, рр. 219-235.
https://doi.org/10.1177/1360780419888927
[26] Walker R. V., Osborn, H., Madden, J., &
Jennings Black, K. Integrating Diversity into
Psychology Statistics Courses: Advice,
Reflections, and Special Considerations.
Teaching of Psychology, 0(0), 2023,
https://doi.org/10.1177/00986283231199461.
[27] Yang H.-H., Lin Y.-T. How Knowledge
Sharing and Cohesion Become Keys to a
Successful Graduation Project for Students
from Design College, SAGE Open, 12(3),
2022,
https://doi.org/10.1177/21582440221121785.
[28] Conner J. O. Student Engagement in an
Independent Research Project: The Influence
of Cohort Culture, Journal of Advanced
Academics, 21(1), 2009, рр. 8-38.
https://doi.org/10.1177/1932202X0902100102
[29] Herzallah R. Practically Oriented Design
Projects in Mechatronics Engineering: A Case
Study of Design Experiences and Outcomes,
International Journal of Electrical
Engineering & Education, 47(1), 2010, рр.
31-46. doi:10.7227/IJEEE.47.1.4.
[30] Leswing K., OpenAI announces GPT-4,
claims it can beat 90% of humans on the SAT.
CNBC, [Online].
https://www.cnbc.com/2023/03/14/openai-
announces-gpt-4-says-beats-90percent-of-
humans-on-sat.html (Accessed Date: March
26, 2023).
[31] Sample. Size is not the main thing. Or the
main thing, [Online].
https://scanmarket.ru/blog/vyborka-razmer-
ne-glavnoe-ili-glavnoe (Accessed Date:
January 31, 2024).
[32] A calculator to calculate a sufficient sample
size, [Online].
https://scanmarket.ru/blog/vyborka-razmer-
ne-glavnoe-ili-glavnoe#calc1 (Accessed Date:
May 24, 2024).
[33] Singpurwalla D., A Handbook of Statistics: An
Overview of Statistical Methods, bookboon.
2015.
[34] Kravchenko A., Sociologia: textbook for
students [in Russian], Yurayt. 2014.
[35] Okulich-Kazarin V., Artyukhov A., Skowron
Ł., Artyukhova N., Dluhopolskyi O., Cwynar
W. Sustainability of Higher Education: Study
of Student Opinions about the Possibility of
Replacing Teachers with AI
Technologies, Sustainability, 16(1), 2024, 55,
https://doi.org/10.3390/su16010055.
[36] BUS_9641. Business_Statistics_5, Textbook
for the Program 'Masters of Business
Administration', Kingston University, 2010.
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The author contributed to the present research at all
stages, from the formulation of the problem to the
final findings and solution.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The author has no conflicts of interest to declare.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US