THE 3D XML BENCHMARK
Mohammed Al-Badawi, Siobhán North, Barry Eaglestone
Department of Computer Science, The University of Sheffield, Sheffield, UK
m.badawi@dcs.shef.ac.uk, s.north@dcs.shef.ac.uk, b.eaglestone@sheffield.ac.uk
Keywords: XML Benchmark, XQuery Processing, Performance Evaluation.
Abstract: In the context of benchmarking XML implementations, several XML benchmarks have been produced either to test an application's overall performance or to evaluate individual XML functionalities of a specific XML implementation. All six of the popular XML benchmarks investigated in this article rely on code-generated datasets which disregard many of XML's irregular aspects, such as variation in the depth and breadth of the XML documents' structure. This paper introduces a new test-model, called the "3D XML Benchmark", which aims to address these limitations by extending the dataset and query-set of existing XML benchmarks. Our experimental results show that XML techniques can perform inconsistently over different XML databases for some query classes, thus justifying the use of an improved benchmark.
1 INTRODUCTION

In the context of XML technology, an XML benchmark is a tool for evaluating and comparing the performance of new XML developments against existing XML technology (Lu et al. 2005). Because of the nature of XML databases and the variety of different platforms used to store them (e.g. RDBMS and OO-RDBMS), the benchmarking process mainly examines the performance of the underlying storage model, the associated query processor and the update handler (Lu et al. 2005). In terms of query processing, the literature (Schmidt et al. 2001) identifies ten functionalities to be tested: XML data bulk-loading, XML reconstruction, path traversals, data-type casting, missing elements, order access, reference navigation, joins, construction of large results, containment, and full-text searching.

Most of the existing XML benchmarks evaluate the above functionalities using an XML application scenario in which the benchmark consists of one or more interrelated XML documents with, in most cases, limited variation in the database's dimensions, namely its depth, breadth and size. The query-set pays little or no attention to the impact of the document's nested structure on the XML querying/updating processes. This paper introduces a new XML test-model (called "The 3D XML Benchmark") that extends the design of existing benchmarks to include these features. Experiments, discussed in Section 5, show that the performance of an individual XML technique is determined by two main factors: the nature of the XML database processed and the inclusion of these features (e.g. the database's three dimensions) in the XQuery syntax.

The rest of this paper is organized as follows. Section 2 reviews XML benchmarking, while the new benchmark is introduced and tested in Sections 3 and 5 respectively. Section 4 describes a node-based scaling algorithm used by the new benchmark to reduce the size of the XML databases, and Section 6 concludes the paper.

2 RELATED WORK

XML benchmarks can be divided into application benchmarks and micro benchmarks. This section reviews the six most popular XML benchmarks from both categories, showing their strengths and weaknesses. The characteristics of these benchmarks are summarised in Table 1.

XMark: This benchmark (Schmidt et al. 2002) is widely used by the XML development community because it generates XML databases of any size and covers all the XML query-able aspects identified in (Schmidt et al. 2001). The underlying dataset consists of a single, code-generated XML document of a size controlled by a positive floating-point
Table 1: A comparison between different XML benchmarks.

Benchmark            | Source    | DB Environment (TC/DC)† | #of Docs                  | Size                                                   | Min/Max Depth          | #of Search Queries | #of Update Queries | Depth Aware?
XMark                | Synthetic | 1 TC, rest DC           | 1                         | Controlled by SF: tiny (KB) to huge (GB)               | 12                     | 20                 | 0                  | No
XOO7                 | Synthetic | Majority DC, few TC     | 1                         | Small, Medium, Large                                   | 5                      | 23                 | 0                  | No
XBench               | Synthetic | Mixed                   | Mixed                     | Small (10MB), Normal (100MB), Large (1GB), Huge (10GB) | Limited variation      | 20                 | 0                  | No
XMach-1              | Synthetic | Mostly TC, few DC       | Multi (10^4 to 10^7 docs) | 2KB to 100KB per document                              | 6 levels               | 8                  | 3                  | No
Michigan (MBench v1) | Synthetic | DC                      | 1                         | Multiples of 728K nodes (max. 100 times)               | 5 to 16                | 28                 | 3                  | Yes
TPoX                 | Synthetic | Mix of TC and DC        | 3.6x10^6 to 3.6x10^11     | 3KB to 20KB each                                       | Controlled by template | 7                  | 10                 | No

† TC = document-centric DB, DC = data-centric DB
scaling factor (SF=1.0 produces 100MB), and with a depth which is always 12.

Although it simulates a real-life database scenario, elements in the corresponding XML tree tend to be evenly distributed at each level. This omits several irregular aspects of the underlying database, such as diversity in the nodes' fanouts. The use of fixed-depth XML documents is a further issue in this benchmark.

XOO7: This benchmark (Li et al. 2001) is the XML version of "OO7" (Carey et al. 1993), an evaluation technique used to benchmark object-oriented RDBMSs. The benchmark's dataset contains a single document, translated from its base benchmark. The XML file can be produced in three versions: small, medium and large. Regardless of its size, the depth of any generated XML file is always 5.

The use of code-generated, fixed-depth documents in only three sizes makes the benchmark impractical for scalability tests and other evaluations of irregularity.

XBench: XBench (Yao et al. 2004) is another XML benchmark which uses code-generated XML documents. In XBench, the underlying dataset can be one of four types: single-document/data-centric (SD/DC), single-document/text-centric (SD/TC), multiple-documents/data-centric (MD/DC) and multiple-documents/text-centric (MD/TC). The size of these documents varies from small (10MB) through normal (100MB) and large (1GB) to huge (10GB), but the depth ranges over a very limited domain. Although XBench uses a template-based generation algorithm which simulates some real database scenarios, the features of the database produced are restricted by those encoded in the generation templates. The benchmark also does not incorporate the document's depth-variation into the XML querying process.

XMach-1: Among the benchmarks investigated in this paper, XMach-1 (Böhme and Rahm 2003) is an XML benchmark that targets multi-user environments using a Web-based application scenario. This section only discusses the structure of its underlying dataset and query-set.

The benchmark's dataset can contain a huge number (10^4 to 10^7) of XML documents, with file sizes ranging from 2KB to 100KB. The interrelated XML documents are generated by a parameterised algorithm which controls the size of the document, the number of elements and attributes, the length of textual contents, and the number of levels in each document. The number of levels is restricted to 6 in all documents, and variation in the number of levels is not incorporated in the query-set design.

Besides the depth restriction, XML documents generated by XMach-1 are very small, making the benchmark inappropriate for evaluating large-scale implementations and/or scalability testing. Furthermore, the query-set does not cover all the XML query-able functionalities identified in (Schmidt et al. 2001); examples include path traversal, joins, and aggregation.
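The structural limitations criticised above (fixed depth and evenly distributed fanout) can be made concrete with a short sketch. The generator and the fanout-spread measure below are illustrative assumptions, not part of any of the reviewed benchmarks; they use only Python's standard library.

```python
import random
import xml.etree.ElementTree as ET
from statistics import pstdev

def generate(depth, fanout, rng=None):
    """Grow a synthetic XML tree of the given depth. A fixed integer
    fanout yields the evenly distributed trees most of the reviewed
    benchmarks produce; passing an rng makes per-node fanout vary."""
    root = ET.Element("root")

    def grow(node, level):
        if level == depth:
            return
        width = rng.randint(1, fanout) if rng else fanout
        for _ in range(width):
            grow(ET.SubElement(node, "lvl%d" % level), level + 1)

    grow(root, 0)
    return root

def fanout_spread(root):
    """Population standard deviation of internal-node fanouts:
    0.0 signals a perfectly even structure."""
    fanouts = [len(node) for node in root.iter() if len(node)]
    return pstdev(fanouts)

regular = generate(depth=4, fanout=2)  # even shape, fanout spread is 0.0
irregular = generate(depth=4, fanout=4, rng=random.Random(7))
print(fanout_spread(regular), fanout_spread(irregular))
```

Applying a measurement of this kind to a benchmark's generated documents would quantify the structural variation in depth and breadth that an improved benchmark is meant to exercise.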