Applying Compiling Techniques in DTD Conversion
SOUHEIL TAWK1, MAROUN ABI ASSAF *2, JOSEPH CONSTANTIN1, KABLAN BARBAR1,
JACQUES BOU ABDO3
1LaRRIS, Faculty of Sciences, Lebanese University, Fanar, LEBANON
2Department of Computer Science and Information Technology,
Holy Spirit University of Kaslik, Jounieh, LEBANON
3School of Information Technology, University of Cincinnati, USA
*Corresponding Author
maroun.abiassaf@usek.edu.lb
Abstract: In this paper, we propose to apply compilation techniques to Documents Type Definition (DTD). At
the syntactic level, we create a context free grammar, denoted GDT D , for the set of all DTDs in order to generate
them. The semantic level is defined by using attributes. Each nonterminal of GDT D is assigned a set of attributes
and for each production of GDT D a set of attribute equations is defined. This DTD compilation process can
be used in different domains. In this article, we provide an attribute grammar that can be used to eliminate
unnecessary parentheses in element declarations in a DTD. In addition, the DTD compilation process can be
applied to the domain of model conversion. It makes it possible to split the model conversion into two steps: the
syntactic transformation step, which consists of rewriting the source DTD in terms of XML syntax, and the second
step of the effective conversion algorithm, which can be easily expressed by an Extensible Stylesheet Language
Transformation (XSLT).
Key-Words: Attributed Grammar, Compiler Design, Computation Theory, DTD, Models Conversion
Received: March 14, 2024. Revised: September 13, 2024. Accepted: November 13, 2024. Published: December 20, 2024.
1 Introduction
DTDs are textual models for the structure of Exten-
sible Markup Language (XML) documents which as
well as used for data exchange over the Internet, [1].
They are integrated into XML documents as well as
converted into relational models of databases. This is
because the Structured Query Language (SQL) lan-
guage offers a very powerful way of extracting data
from databases, [2]. It is important to know that we
store data in relational databases if we are interested
in data manipulation. If we need to exchange or trans-
form data over the Internet, we store it in XML docu-
ments.
Many mapping algorithms have been proposed in the
literature for the problem of mapping DTDs to rela-
tional schemas, [3], [4], [5], [6], [7], [8], [9], [10],
[11], [12], [13]. Some of these algorithms did not con-
sider the semantics of DTDs during the mapping pro-
cess, so the resulting relational schema loses the orig-
inal DTD information, [3], [4], [5]. In [6], the authors
use a table and object relational mapping. The ta-
ble based mapping cannot handle mixed content, and
the object based mapping of mixed content is ineffi-
cient. Therefore, these two models are poorly suited
for centered documents. Other mapping algorithms
preserved the cardinality constraints, [7], [8]. How-
ever, they did not take functional dependencies via
DTDs into account. In [9], the authors developed a
mapping algorithm that preserves both the structure
of DTDs and the semantics implied by the constraints
of DTDs, but the multivalued dependencies are not
preserved by the mapping algorithm. The mapping
algorithm in [10], was developed to map DTDs to a
relational schema that preserves not only the content
and structure, but also the semantics of the original
XML documents. A hybrid inline algorithm was pro-
posed to map DTDs to relational schemas while pre-
serving functional dependencies, [11]. In [12], the au-
thors defined the dependency relationships in XML
documents and, based on these relationships, created
a set of rules for mapping the XML document to a
relational database. In [13], the authors described a
technique for mapping a specific DTD to a relational
schema. This technique consisted of converting the
DTD to an Xschema, simplifying the Xschema con-
straints, and mapping the collection types.
Unfortunately, all studies focus on the conversion al-
gorithms and ignore how to read the DTD source
and divide it into small units for the conversion al-
gorithms. We believe that our approach is the first
to focus on applying of syntatic and semantic analy-
sis of the compilation process to DTDs. In fact, the
concept of attributed grammar has already been ap-
plied in various research area, [14], [15], [16]. In [14],
the author introduces the concept of attributed gram-
mar for compiler description as a declarative way by
associating attributes with nonterminals and attribute
equations with productions. In logic programming,
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
547
Volume 21, 2024
the authors of [15], consider an abstract substitution
as an attribute attached to the nodes of a tree, and
then the propagation process of abstract substitutions
through the tree can be expressed in terms of attribute
evaluation. In the field of web development, the au-
thors of [16], give a system that is able to automati-
cally generate web forms or editors to collect data re-
lated to predefined DTDs. Once the the data entry is
complete, the web based form creates the correspond-
ing XML documents. The system receives a DTD
as input and relies on attributed grammars to assign
semantics to the elements and generates an HTML
based editor on the fly.
In this paper, we want to create a formal framework
for analyzing DTDs viewed as text models or strings.
For this purpose, we consider the set of DTDs as a
context free language called LDT D . We define a con-
text free grammar GDT D that recognizes the language
LDT D . Then, any DT D conversion can be realized
by enriching the grammar GDT D with an attributed
system which consists of associating attributes with
the nonterminals and attribute equations with the pro-
ductions of the grammar GDT D . We give an exam-
ple of an attribute grammar that eliminates unneces-
sary parentheses in an element declaration in a DTD.
An element declaration is a list or regular expres-
sion composed of the operator ’,’ , the disjunction se-
quence ’!’ , the opening and closing parentheses, and
the names of the subelements. Another example is the
model conversion from DT D to a relational model,
which consists of converting the DTD to another in-
termediate model in terms of XML syntax. The con-
tribution of this approach is that in the case of mul-
tiple transformations, a stylesheet can be written and
applied to a suitable intermediate XML based model
for each transformation without having to reanalyze
the source DTD, [17], [18]. The main advantage of
our approach:
1. The formalization of the set of DTDs by a context
free grammar GDT D . Then each operation over a
DTD consists of defining attributes over GDT D ,
which means developing an attributed grammar
based compiler.
2. The application of GDT D in the field of DTD
conversion. We can define attributes via GDT D
that convert a given DTD into another DTD
while preserving the declaration of elements
and attributes and respecting the XML syntax.
The intermediate model thus obtained is called
DT DXM L and any conversion algorithm for
the source DTD can be expressed by an XSLT
stylesheet applicable to the intermediate model
DT DXM L. This approach makes it possible to
write the DTD conversion process in two sepa-
rate modules: one module is an attribute gram-
mar application for DTDS or a DTD compiler
and the other module is just an XSLT stylesheet
for the conversion algorithm (Fig. 1).
The paper is structured as follows: Section 2 ex-
plains the principle of algebraic and attributed gram-
mars in arithmetic theory. Section 3 defines an alge-
braic grammar for generating DTDs. Section 4 de-
fines an attributed grammar for eliminating unnec-
essary parentheses in DTD declarations. Section 5
shows the results of the experiment, while section 6
summarizes the paper with a conclusion.
2 Algebraic and Attributed
Grammars
This section is dedicated to explain the syntactic anal-
ysis of DTDs. These concepts include context free
and attributed grammars, which form the basis of
compilation theory in order to create compilers for
programming languages. A context free grammar
(CFG) is a system of rules that enables the genera-
tion of a set of words from an axiom. The alphabet of
the language is divided into two groups: Terminals,
i.e. the letters of the generated words, and nontermi-
nals, i.e. intermediate symbols that can be replaced by
terminals or other nonterminals. In formal language
theory, a context free grammar is a formal grammar
whose production rules can be applied to a nontermi-
nal symbol independently of its context. In particu-
lar, in a context free grammar, each production rule
has the form Aα, where A is a single nonterminal
symbol and αis a string of terminals and/or nontermi-
nals. αcan be empty). Regardless of which symbols
surround it, the single nonterminal A on the left can
always be replaced by αon the right.
Attributed grammars are a formalism for describ-
ing the semantic analysis of programming languages
(Fig. 2). The principle is to assign attributes to each
nonterminal and to assign a system of attribute equa-
tions to each production of the grammar, [19], [20].
The attributes are evaluated along a derivation tree
Figure 1: Conversion steps: formalisation of the set of
DTDs by a context free grammar GDT D . Transform
of the DTD into intermediate model DT DXM L using
GDT D with attributes. Conversion algorithm can be
expressed by an XSLT stylesheet
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
548
Volume 21, 2024
(for a specific input word) to determine the seman-
tics of the input word. In the case of programming
languages, there is a context free grammar for each
programming language. The attributes are used to
calculate the semantics of all statements of a pro-
gram when they are considered as input words of
the grammar of the corresponding programming lan-
guage. An Attribute Grammar (AG) is defined as
AG =< G, A, E > where:
G=<Σ, N, S, P > where:
Σis the set of terminals symbols.
N is is a set of nonterminal symbols.
S is the start symbol (axiom) and XN.
P is the set of the productions rules.
P = p / p: X0α1X1α2αnXnαn+1
where i[0 . . . n],XiNand j
[1 . . . n + 1] ,αjΣ
A is a set of attributes divided into two subsets
Asand Ah,A=AsAh, and for each XN
are associated: As(X)is the set of synthesized
attributes where As(X)As,Ah(X)is the set
of inherited attributes where Ah(X)Ah.
Eis the set of equations in which for each syn-
thesized attribute sAs(X0), there is an at-
tribute equation defining s(X0)of the form :
s(X0) = fs(s1(X1), . . . , sn(Xn), h(X0)) where
i[1 . . . n], si(Xi)As(Xi)and h
Ah(X0), as shown in Fig. 3, and for each inher-
ited attribute hjAh(Xj), there is an attribute
equation defining hj(Xj)of the form:
hj(Xj) = fhj (h0(X0), s1(X1), . . . , sn(Xn))
where h0Ah(X0)and i[1 . . . n],si
As(Xi)as shown in Fig. 4.
Figure 2: Structure of a compiler. It composed of syn-
tax analysis and semantic analysis.
Figure 3: Order of computation of synthesized at-
tributes.
Figure 4: Order of computation of synthesized at-
tributes.
3 Definition of an Algebraic
Grammar for DTDs
In our approach for the formalization of the syntac-
tical analysis of the DTDs, we define the algebraic
grammar GDT D =<Σ, N, S, P > by :
Σ = {<, >, (,),|,,?,+,",!,[,],#, , , identifier,
DOCTYPE, ELEMENT, ATTLIST, EMPTY,
ANY, PCDATA, id, CDATA, ID, IDREF,
FIXED, REQUIRED, IMPLIED, string } : de-
notes the alphabet.
N = {Start, DTD, LineType, Attlist, AttSuite, El-
ementType, List, Occurs, AttType, Enumerate,
EnumerateSuite, AttOption, InitialVal}: is the
set of non terminals,
Start is the axiom.
P is the following set of productions:
P0: Start DTD
P1: DTD <!DOCTYPE identifier [Line-
Type]
P2: LineType <!ELEMENT identifier
ElementType>AttList LineType
P3: LineType
P4: AttList <!ATTLIST identifier1 identi-
fier2 AttType AttOption AttSuite>
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
549
Volume 21, 2024
P5: AttList
P6: AttSuiteidentifier AttType AttOption
AttSuite
P7: AttSuite
P8: ElementType EMPTY
P9: ElementType ANY
P10: ElementType(#PCDATA)
P11: ElementType(List) Occurs
P12: List id Occurs , List
P13: List id Occurs |List
P14: List (List) Occurs, List
P15: List (List) Occurs |List
P16: List (List) Occurs
P17: List id Occurs
P18: Occurs *
P19: Occurs ?
P20: Occurs
P21: Occurs +
P22: AttType CDATA
P23: AttType ID
P24: AttType IDREF
P25: AttType (Enumerate) Initialval
P26: Enumerate string EnumerateSuite
P27: EnumerateSuite Enumerate
P28: EnumerateSuite
P29: AttOption #FIXED InitialVal
P30: AttOption #REQUIRED
P31: AttOption #IMPLIED
P32: InitialVal string
P33: InitialVal ””
We start in the grammar GDT D to write the produc-
tions P0and P1that define the syntax of a DT D. We
declare the elements with their productions. We asso-
ciate a nonterminal to produce an element and another
nonterminal to define the type of an element. These
nonterminals are ElementType and LineType respec-
tively. We add the Attlist nonterminal because the el-
ement can have attributes and the LineType can pro-
duce multiple elements (P2and P3). The Attlist can
produce attribute declarations, the AttType nontermi-
nal stands for the type of an attribute and the AttOp-
tion nonterminal produces the options for an attribute.
We write the productions P4-P7. The element type
can be (#PCDATA), ANY, EMPTY, composed or
generate a list of operators (P8-P11). To develop
the nonterminal List, let’s look at an example of the
element declaration <!element A(B,C*,(D|E),F*)>.
The structure of a type element is therefore a paren-
thesized expression with operators (+, *, …) and sep-
arators (, and |). We therefore suggest the productions
P12 -P17 for List. The productions P18 -P21 generate
terminal operators such as *, ?, +, and . The produc-
tions P22 -P25 generate the attribute type, which can
be CDATA, ID, IDREF and enumerate with initial
value. The productions P26 -P28 are used to generate
the character string enumerate in XML. The produc-
tions P29 -P33 are used to generate attribute options
with an initial value. The attribute options can be
#FIXED, #REQUIRED, #IMPLIED. The initial value
is generated in the P32 and P33 productions. This
value can be empty or defined as a string.
Based on this grammar, the derivation tree of the
first two lines of following DTD is shown in Fig. 5
DTD <!DOCTYPE A [
<!ELEMENT A(B,(C,D)) >
<!ELEMENT B(#PCDATA) >
<!ELEMENT C(#PCDATA) >
<!ELEMENT D(#PCDATA) >
]>
Figure 5: Example of a derivation tree for a given
DTD: The productions P0and P1generate the first
line of the DTD. The production P2defines the ele-
ment A and calls P11 using the nonterminal Element-
Type. We use the nonterminal List of productions
P11,P12,P16 and P17 to define the elements B, C and
D.
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
550
Volume 21, 2024
4 Attributed Grammar for
Simplifying DTD Declarations.
In this section, we give an example of an attributed
grammar for eliminating useless parenthesis element
declarations in DTDs. The structure of an element
declaration in a DTD is a rational expression on the
alphabet composed of the operator (,), the or operator
(|), the star operator (*), the exclamation mark (!), the
parentheses, and the identifiers. We distinguish three
categories of lists:
The lists that contain only the comma operator,
e.g. (B, C, D, E , F).
The lists that contain only the or operator. For
example, (B|C|D|E|F).
The list with the commas and an or operator. As
an example, (B, C |D, E ).
Now, we will define the notion of useless parenthesis
in the structure of the element declaration.
In general, parentheses are considered unnecessary if
they do not change the meaning of the declaration
and do not improve the readability of the code. In
our case, we consider parentheses unnecessary if they
only occur in inner lists. For example, the parenthe-
ses of the inner list (C, D) are useless in the list (B, (C,
D), E) and then (B, (C, D), E) can be reduced to the
corresponding list (B, C, D, E). In contrast, the paren-
theses of the inner list (C, D) should be retained in the
list (B |(C, D), E) as the operators of the inner and
outer lists are different.
The parentheses of the inner list (D, E, F) in the outer
list (B, C, ( D, E ,F), G) are useless, as the only opera-
tor of the list D, E, F is a comma (,) and the preceding
and following operators are also a comma (,). The
list (B, C ,(D, E, F), G) can be reduced to (B, C , D,
E, F, G). In addition, the parenthesis of the inner list
(C|D) should be retained in the list (A,B,(C|D),E). By
definition, the inner brackets are useless iff :
Its list has a unique operator (, or |).
The two operators before and after the inner list in
the outer list are the same and equal to the unique
operator of the inner list.
Now, we consider the occurrence of the inner list in
a derivation tree in the grammar GDT D where we ex-
press:
The unique operator of the inner list.
The preceding and following operators in the
form of attributes over the production of GDT D .
First, we need an attribute to calculate the reduced list
and the other parts of an element declaration; this at-
tribute is the synthesized attribute which is called ”d”.
The unique operator of an inner list is represented by
a synthesized attribute ”o”. To calculate the value
of ”o” for an inner list, we use another inherited at-
tribute o1”, which is initialized to null at the begin-
ning of an inner list. Thus, the productions and their
attributes are given in Table 1, Table 2, Table 3, Ta-
ble 4, Table 5, Table 6, and Table 7 (Appendix). We
can divide the productions of the grammar GDT D
into three groups:
group 1 : from P0to P11.
group 2 : from P12 to P17.
group 3 : from P18 to P33.
The productions of group 1 and group 3 reproduce in
the attribute ”d” all parts of the given DTD that are
outside the inner parentheses. If we use the DTD :
<! Doctype A [
<! ELEMENT A ((B, (C))) >
<! ELEMENT B (#REQUIRED) >
<! ELEMENT C (#REQUIRED) >
]>
The evaluation of the attributes via the productions of
group 1 and group 3 generates the main parts of the
previous DTD, namely:
<! Doctype A [
<! ELEMENT A ( ) >
<! ELEMENT B (#REQUIRED) >
<! ELEMENT C (#REQUIRED) >
]>
The attribute evaluation on the productions of group
2 decides whether the parentheses of the inner lists
(B, (C)) and (C) in the productions P14 ,P15,P16
should be eliminated or not. We note that the produc-
tion P11 generates the outer parentheses in the dec-
larations <! ELEMENT A (. . .)>and initializes
the preceding operator of the inner lists to the empty
string. The production P12 and P13 do the same as
P11 in terms of attributes. Note that if a nonterminal
appears multiple times in a production, these occur-
rences are numbered in order. Let us define in pro-
duction P14,o(List2) is the operator of the inner list
”List2”. The nonterminal List appears three times
in the production. These occurrences are numbered
from 0 to 2. o1(List0) is the operator preceding the
inner list List2”. The operator ”,” which appears
after the nonterminal Occurs is the following op-
erator of the list List2”. We check if the opera-
tor of the inner list List2 and the preceding oper-
ator o1(List0) are the same as the following operator
”,”. If it is true, we eliminate the parentheses, oth-
erwise we regenerate them in the statement defining
d(List0). The production P15 does the same as the
production P14 by using the operator |. The produc-
tion P16 verifies if the the operator preceding the inner
list o1(List0) is the same as the operator of the inner
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
551
Volume 21, 2024
list o(List2), then the parenthesis are eliminated, oth-
erwise they should be generated. The production P17
generates the last identifier.
5 Experimental Results
At the implementation level, the syntactic analyzer
consists of functions that correspond to all GDT D
productions while the semantic analyzer consists of
a single function for the evaluation of attributes in a
derivation tree. This evaluation function is a switch
statement, in which there is one case for each GDT D
production. Fig. 6 shows the structure of the DTD
compiler. It is worth mentioning here that the syntac-
tic analyzer is the same for all DTD conversion meth-
ods. However, we create a new attribute evaluation
function for each DTD conversion method. The de-
veloped program takes a DTD as input and produces
a DTD as output by calculating the structure of an ele-
ment and removing unnecessary parentheses. An ex-
ample of the DTD:
<!DOCTYPE A [
<!ELEMENT A ((B, (C)))>
<!ATTLIST A id ID #REQUIRED>
<!ELEMENT B (#PCDATA)>
<!ELEMENT C (((E) |F))>
<!ELEMENT E (#PCDATA)>
<!ELEMENT F (#PCDATA)>
]>
We obtain the following output:
<!DOCTYPE A [
<!ELEMENT A (B, C)>
<!ATTLIST A id ID #REQUIRED>
<!ELEMENT B (#PCDATA)>
<!ELEMENT C (E |F)>
<!ELEMENT E (#PCDATA)>
<!ELEMENT F (#PCDATA)>
]>
6 Conclusion
In this paper, we have created a new formal frame-
work for the set of DTDs in the form of a context free
language and the corresponding context free gram-
mar. The context free grammar generates a tree based
implementation, called derivation tree, for each DTD
in the computer memory. In addition to this, we
can enrich the context free grammar GDT D with at-
tributes:
Figure 6: Algorithm Architecture using Compiling
Techniques in DTD Conversion
Table 1: Semantic rules for the productions P0-P4
of the grammar GDT D
P0:
d(Start) = d(DTD)
P1:
d(DTD) = <!DocType & identifier & ”[” &
d(LineType) & ”] >"
P2:
d(LineType0) = <! Element” & identifier &
d(ElementType) > & d(AttList) & d(Line-
Type8)
P3:
d(LineType) = ””
P4
d(AttList) = <!ATTLIST” & identifier1 &
identifier2 & d(AttType) & d(AttOption) &
d(AttSuite) & >
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
552
Volume 21, 2024
Perform syntactic transformation of the source
DTD such as eliminating the unnecessary paren-
theses in the declaration in the DTD.
Rewrite the source DTD into an XML docu-
ment or an intermediate model while retaining
the DTD declarations. Then each DTD transfor-
mation can be expressed by an XSLT stylesheet
applicable to the intermediate model.
Our approach differs from existing approaches in that
it provides a formal framework for analyzing DTDs
based on compilation techniques. We define a con-
text free grammar that recognizes the language of the
DTD. Then any DTD conversion can be realized by
enriching this grammar with an attributed system.
Our contribution is that in the case of multiple trans-
formations, we can convert the DTD to another inter-
mediate model such as XML and apply a stylesheet to
the intermediate model for each transformation with-
out having to analyze the source DTD.
In the future, the entire framework, grammar with at-
tributes, can be used to split the model conversion
into two steps: The first step is to convert the DTD
into an intermediate model in terms of XML syntax,
while the second step of effective conversion can be
expressed by an XSLT stylesheet. We will also ex-
plore a way to map DTDs to relational schemas and
use traditional relational database engines to process
XML documents, since XML is emerging as the data
format of the Internet and there is a need for storing
and querying XML documents.
Acknowledgment:
Special thanks to Miss Ferial Srour Nemr for her
excellent proofreading.
References:
[1] A. M. Saba, E. Shahab, H. Abdolrahimpour, M.
Hakimi, and A. Moazzam, A comparative
analysis of xml documents, xml enabled
databases and native xml databases, arXiv, vol.
abs/1707.08259, 2017. [Online]. Available:
https://arxiv.org/abs/1707.08259.
[2] M. Reed, Python Programming and SQL,
Independently published, January 2023.
[3] S. Lu, Y. Sun, M. Atay, and F. Fotouhi, A new
inlining algorithm for mapping XML DTDs to
relational schemas, in: Jeusfeld, M.A., Pastor,
O. (eds) Conceptual Modeling for Novel
Application Domains. Lecture Notes in
Computer Science, Springer, Berlin,
Heidelberg., vol. 2814, 2003, pp. 366–377.
[4] P. Bohannon, J. Freire, P. Roy, and J. Simeon,
Form XML Schemas to relations: a cost based
approach to XML Storage, in Proceedings of the
18th International Conference on Data
Engineering (ICDE2002) IEEE Computer
Society, 2002, p. 564–580.
[5] P. Bohannon, J. Freire, J. Haritsa, M. Ramanath,
P. Roy, and J. Simeon, Bridging the XML
relational divide with LegoDB: a demonstration,
in Proceedings 19th International Conference
on Data Engineering (Cat. No.03CH37405),
Bangalore, India, 2003, p. 759–761, doi:
10.1109/ICDE.2003.1260859.
[6] R. Bourret, C. Bornhovd, and A. Buchmann, A
generic load/extract utility for data transfer
between XML documents and relational
databases, in Proceedings Second International
Workshop on Advanced Issues of Ecommerce
and Web Based Information Systems, WECWIS
2000, Milpitas, CA, USA, 2000, pp. 134–143,
doi: 10.1109/WECWIS.2000.853868.
[7] Y. Chen, S. Davidson, and Y. Zheng,
Constraints preserving XML storage in relations,
International Workshop on the Web and
Databases, 2002. [Online]. Available:
https://api.semanticscholar.org/Cor-
pusID:59848379
[8] D. Lee, M. Mani, and W. Chu, Schema
conversion methods between XML and
relational models, B. Omelayenko, M. Klein
(Eds.), Knowledge Transformation for the
Semantic Web, IOS Press, 2003, pp. 1–17,.
[9] T. Lv and P. Yan, Mapping DTDs to relational
schemas with semantic constraints, Information
and Software Technology, vol. 48, no. 4, 2006,
pp. 245–252.
[10] Z. Tan, J. Xu, W. Wang, , and B. Shi, Storing
normalized xml documents in normalized
relations, in The Fifth International Conference
on Computer and Information Technology
(CIT’05), Shanghai, 2005, pp. 123–129, doi:
10.1109/CIT.2005.175.
[11] A. J. Rafsanjani and S. H. Mirian
Hosseinabadi, RIAL: Redundancy Reducing
Inlining Algorithm to Map XML DTD to
Relations, In Proceedings of International
Conferences on Computational Intelligence for
Modelling Control and Automation, 2008, pp.
25–30, doi: 10.1109/CIMCA.2008.19.
[12] Q. Zhu and W. Yang, Mapping from the XML
Schema to the Relational Database with
Functional Dependency Preserved, in 2010
International Conference on Machine Vision
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
553
Volume 21, 2024
and Human machine Interface, Kaifeng, China,
2010, pp. 412–415.
[13] A. A. A. El Aziz and A. Kannan, Survey of
mapping xml dtds(documents) to relational
schemas, International Journal of Computer
Science and Management Research, vol. 2,
2013, pp. 1175–1193.
[14] D. Knuth, Semantics of context free languages,
Math. Systems Theory, vol. 2, 1968, pp.
127–145, https://doi.org/10.1007/BF01692511.
[15] K. Barbar and K. Musumbu, Implementation of
abstract interpretation algorithms by means of
attribute grammar, in Proceedings of 26th
Southeastern Symposium on System Theory,
Athens, OH, USA, 1994, pp. 87–92, doi:
10.1109/SSST.1994.287905.
[16] K. Barbar, Automatic generator of xml
documents editors based on attributed
grammars, in Proceedings of the 5th
international conference on Soft computing as
transdisciplinary science and technology, 2008,
pp. 166–172,
https://doi.org/10.1145/1456223.1456260.
[17] D. Li, X. Li, V. Stolz, QVT based model
transformation using XSLT, ACM SIGSOFT
Software Engineering Notes, Vol.36, No.1,
2011,pp. 1-8,
https://doi.org/10.1145/1921532.1921563.
[18] H. Mizumoto, N. SUZUKI, An XSLT
transformation method for distributed XML,
Journal of information processing, Vol.23, No.3,
2015,pp. 353-365, doi:10.2197/ipsjjip.23.353.
[19] A. V. Aho, Compilers: Principles, Techniques,
and Tools,ser. Always Learning. Pearson, 2014.
[20] T. Mogensen, Introduction to Compiler Design,
ser. Undergraduate Topics in Computer Science.
Springer Cham, 2024. [Online]. Available:
https://doi.org/10.1007/978-3-031-46460-7
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
SOUHEIL TAWK is a PHD student of the the-
sis ”Systems based on attribute grammars for
model conversion: Application to the conver-
sion of DTD into relational schema”. He carried
out the experimental and the optimization parts.
MAROUN ABI ASSAF was responsible of the
Logic, algorithms, and the bibliography parts.
JOSEPH CONSTANTIN is the cosupervisor
of the thesis. He was responsible of find-
ing the Grammar productions, the attributed
Grammar equations, and writing of the paper.
KABLAN BBRBAR is the supervisor of the thesis.
He was responsible of the Logic part and the imple-
mentation the attributed Grammar equations.
JACQUES BOU ABDO was an examiner
of the thesis. He was responsible for the
correction of the paper from logic errors.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflicts of Interest
The authors have no conflicts of interest to
declare that are relevant to the content of this
article.
Creative Commons Attribution License 4.0
(Attribution 4.0 International , CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en
_US
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
554
Volume 21, 2024
APPENDIX
Table 2: Semantic rules for the productions P5-P10
of the grammar GDT D
P5:
x(AttList) =
P6
d(Attsuite0) = identifier & d(AttType) & d(At-
tOption) & d(AttSuite4)
P7:
d(AttSuite) =
P8:
d(ElementType) = ”Empty”
P9:
d(ElementType) = ”ANY”
P10:
d(ElementType) = ”#PCDATA”
Table 3: Semantic rules for the productions P11-P13
of the grammar GDT D
P11:
o1(List2) =
d(ElementType) = ”(” & d(List) & ”)” & d(Oc-
curs)
P12:
o1(List4) =
0,0if(o1(List0) =00 or
o1(List0) =0,0)
0m0,otherwise
d(List0) = id & d(Occurs)& ”,” & d(List4)
o(List0) = o(List4)
P13:
o1(List4) =
0|0, if(o1(List0) =00 or
o1(List0) =0|0)
0m0,otherwise
d(List0)= id & d(Occurs) & | & d(List4)
o(List0) = o(List4)
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
555
Volume 21, 2024
Table 4: Semantic rules for the productions P14-P15
of the grammar GDT D
P14:
o1(List2) =’
o1(List6) =
0,0, if(o1(List0) =00 or
o1(List0) =0,0)
0m0,otherwise
if (o1(List0) = o(List2) & o(List2) =’,’) then
d(List0) = d(List2) & d(Occurs) & ”,” & d(List6)
else
d(List0) = ( & d(List2) & ) & d(Occurs) &
”,” & d(List6)
endif
o(List0) =o(List6)
P15:
o1(List2) =’
o1(List6) =
0|0, if(o1(List0) =00 or
o1(List0) =0|0)
0m0,otherwise
if (o1(List0) = o(List2) & o1(List2) = |’) then
d(List0) = d(List2) & d(Occurs) & | & d(List6)
else
d(List0) = ”(” & d(List2) & ”)” & d(Occurs) &
| & d(List6)
endif
o(List0) = o(List6)
Table 5: Semantic rules for the productions P16-P20
of the grammar GDT D
P16:
o1(List2) =’
if (o1(List0) = o(List2) ) then
d(List0) = d(List2) & d(Occurs)
else
d(List0) = “(” & d(List2) & ”)” & d(Occurs)
endif
o(List0) = o1(List0)
P17:
d(List) = identifier & d(Occurs)
o(List) = o1(List)
P18:
d(Occurs) = ”*”
P19:
d(Occurs) = ”?”
P20:
d(Occurs) =
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
556
Volume 21, 2024
Table 6: Semantic rules for the productions P21-P27
of the grammar GDT D
P21:
d(Occurs) = ”+”
P22:
d(AttType) = ”CDATA”
P23:
d(AttType) = ”ID”
P24:
d(AttType) = ”IDREF”
P25:
d(AttType) = ”(” & d(Enumerate) & ”)” & d(Ini-
tialVal)
P26:
d(Enumerate) = string & d(EnumerateSuite)
P27:
d(EnumerateSuite) = d(Enumerate)
Table 7: Semantic rules for the productions P28-P33
of the grammar GDT D
P28:
d(Enumerate) = ””
P29:
d(AttOption) = ”#” & ”FIXED” & d(InitialVal)
P30:
d(AttOption) = ”#REQUIRED”
P31:
d(AttOption) = ”#IMPLIED”
P32:
d(InitialVal) = string
P33:
d(InitialVal) =
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.51
Souheil Tawk, Maroun Abi Assaf,
Joseph Constantin, Kablan Barbar,
Jacques Bou Abdo
E-ISSN: 2224-3402
557
Volume 21, 2024