Native XML Databases

Overview

As defined by the members of the XML:DB mailing list, a native XML database is one that:

Defines a (logical) model for an XML document -- as opposed to the data in that document -- and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order. Examples of such models are the XPath data model, the XML Infoset, and the models implied by the DOM and the events in SAX 1.0.

Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage.

Is not required to have any particular underlying physical storage model. For example, it can be built on a relational, hierarchical, or object-oriented database, or use a proprietary storage format such as indexed, compressed files.

Native XML databases fall into two broad categories:

Document-based storage: Store the entire document in text or binary form and provide some sort of database functionality in accessing the document. A simple strategy for this might store the document as a BLOB in a relational database or as a file in a file system and provide XML-aware indexes over the document. A more sophisticated strategy might store the document in a custom, optimized data store with indexes, transaction support, and so on.
Node-based storage: Store individual nodes of the document (such as the DOM or a variant thereof) in an existing or custom data store. For example, this might map the DOM to relational tables such as Elements, Attributes, Entities or store the DOM in pre-parsed form in a data store written specifically for this task. This includes the category formerly known as "Persistent DOM Implementations".

There are two major differences between the two strategies. First, document-based storage can exactly round-trip the document, down to such trivialities as whether single or double quotes surround attribute values. Node-based storage can only round-trip documents at the level of the underlying document model. This should be adequate for most applications but applications with special needs in this area should check to see exactly what the database supports.

The second major difference is speed. Document-based storage obviously has the advantage in returning entire documents or fragments. Node-based storage probably has the advantage in combining fragments from different documents, although this does depend on factors such as document size, parsing speed (for document-based storage), and retrieval speed (for node-based storage). Whether it is faster to return an entire document as a DOM tree or SAX events probably depends on the individual database, again with parsing speed competing against retrieval speed.
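The round-trip difference described above can be illustrated with a minimal Python sketch. It assumes the standard library's xml.dom.minidom as a stand-in for a node-based store and a plain string as a stand-in for a document-based store; it does not reflect how any particular product works, and the sample document is invented for illustration.

```python
from xml.dom import minidom

# Original document: note the single quotes around the attribute value.
original = "<order id='42'><item>widget</item></order>"

# Document-based storage (stand-in): keep the text exactly as received.
stored_text = original

# Node-based storage (stand-in): keep only the parsed model and
# re-serialize it on retrieval.
stored_model = minidom.parseString(original)
retrieved = stored_model.documentElement.toxml()

# The text copy round-trips exactly; the model copy does not, because
# the serializer chooses its own quote style.
text_round_trips = (stored_text == original)
model_round_trips = (retrieved == original)

# At the level of the document model, nothing has been lost.
same_model = (
    minidom.parseString(retrieved).documentElement.getAttribute("id")
    == stored_model.documentElement.getAttribute("id")
)
```

Here the quote style is lost on retrieval but the attribute value is not, which is the sense in which node-based storage round-trips "at the level of the underlying document model".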

Native XML databases differ from XML-enabled databases in three main ways:

Native XML databases can preserve physical structure (entity usage, CDATA sections, etc.) as well as comments, PIs, DTDs, etc. While XML-enabled databases can do this in theory, this is generally not (never?) done in practice.
Native XML databases can store XML documents without knowing their schema (DTD), assuming one even exists. Although XML-enabled databases could generate schemas on the fly, this is rarely practical, especially when dealing with schema-less documents.
The only interface to the data in native XML databases is XML and related technologies, such as XPath, the DOM, or an XML-specific API, such as the XML:DB API. XML-enabled databases, on the other hand, offer direct access to the data, such as through ODBC.

For more information about native XML databases, see "Native XML Databases".

Related categories

Content (Document) Management Systems: Applications built on top of native XML databases and/or the file system for content/document management. Include features such as check-in/check-out, versioning, and editors.
XML-Enabled Databases: Databases with extensions for transferring data between XML documents and themselves. Some of these also support native XML storage.

Products

4Suite, 4Suite Server

Developer: FourThought
URL: http://4suite.org/index.xhtml
License: Open Source
Database type: Object-oriented
Entry last updated: November, 2001

From the Web site:

"4Suite is a collection of Python tools for XML processing and object database management. It is an integrated package of several components: 4DOM, DbDom, 4XPath, 4XSLT, 4XPointer, 4XLink, 4RDF and 4ODS. All the tools are designed for maximum extensibility through custom Python code."

"DbDOM is a DOM implementation that is stored persistently in a 4ODS object database in order to support arbitrarily large documents and applications with specialized persistent needs."

"4XPath is a library implementing the W3C's full XPath 1.0 specification for indicating and selecting portions of an XML document."

"4Suite Server is a platform for XML processing. It is an XML data repository with a rules-based engine. It supports DOM access, XSLT transformation, XPath and RDF-based indexing and query, XLink resolution and many other XML services. It also provides support services such as distributed transactions, and access control lists. It supports remote, cross-platform and cross-language access through CORBA, SOAP and HTTP GET."

Berkeley DB XML

Developer: Oracle (formerly owned by Sleepycat Software)
URL: http://www.oracle.com/database/berkeley-db/xml/index.html
License: Open Source
Database type: Key-value
Entry last updated: December, 2005

Berkeley DB XML is a native XML database built on top of Berkeley DB, adding an XML parser, XML indexes, and an XQuery engine. From Berkeley DB it inherits a storage engine, transaction support (including XA), automatic recovery, and other features.

Berkeley DB XML stores XML documents in logical groupings called containers, which are the same as collections in other native XML databases. Users can specify a number of properties on a per-container basis, including whether to validate documents, whether to store documents whole or as individual nodes, and what indexes to create (element, attribute, or metadata). It is worth noting that schemas are specified through schemaLocation hints in documents, rather than being associated with the container as a whole.

In addition to storing XML documents, Berkeley DB XML can store non-XML documents (in the underlying Berkeley DB data store) as well as metadata for XML documents. The latter take the form of user-specified property-value pairs and can be queried as if they were child elements of the root element, although they do not actually appear in stored XML documents.

Berkeley DB XML supports XQuery as its query language. It provides an API for updating documents that uses XQuery to identify a set of nodes to update and allows users to append a new child to a target node, insert a new node before or after a target node, remove a target node, rename a target node, or change the value of a target node. Updates are performed at the node level.
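The kinds of node-level updates listed above (append, insert before, rename, change value, remove) can be sketched in plain Python. This is not the Berkeley DB XML API: it assumes the standard library's xml.etree.ElementTree, uses that module's limited path syntax in place of XQuery to pick target nodes, and the select helper and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<order><item>widget</item><item>gadget</item><note>rush</note></order>"
)

# ElementTree has no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}

def select(path):
    """Return the target nodes for an update (hypothetical helper)."""
    return doc.findall(path)

# Append a new child to the target node (here, the root element).
for target in select("."):
    ET.SubElement(target, "total").text = "2"

# Insert a new node before the target node.
for target in select("note"):
    parent = parent_of[target]
    new_node = ET.Element("priority")
    new_node.text = "high"
    parent.insert(list(parent).index(target), new_node)

# Rename the target node.
for target in select("note"):
    target.tag = "comment"

# Change the value of the target node.
for target in select("item[1]"):
    target.text = "sprocket"

# Remove the target node.
for target in select("item[2]"):
    parent_of[target].remove(target)

result = ET.tostring(doc, encoding="unicode")
```

Each step first identifies a set of nodes and then applies one operation to every node in that set, which mirrors the two-part shape of the update API described above.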

Like Berkeley DB, Berkeley DB XML is a library that is linked directly to applications, rather than being used in client-server mode. It has a command-line interface as well as APIs for C++, Java, Tcl, Perl, Python, and PHP. Third-party APIs for other languages are available as well.

Birdstep RDM XML

Developer: Birdstep
URL: http://www.birdstep.com/database_technology/rdm_xml.php3
License: Commercial
Database type: Object-oriented
Entry last updated: October, 2002

Birdstep RDM XML is a native XML database built on top of the Birdstep RDM Mobile database. Documents can be stored in one of two ways. DTD-less documents are stored using a set of predefined classes that roughly correspond to the information items defined in the XML Information Set. Documents with DTDs can be stored with these classes or with classes generated from the DTD. The latter classes provide an "XML data binding" solution; that is, binding of XML documents to DTD-specific classes. In addition, they inherit from the generic classes, so they can also be viewed generically.

Birdstep RDM XML implements XPath as its query language. XPath queries can be compiled and stored in the database for faster re-execution. It also provides implementations of DOM and SAX. The DOM interface is available in both Java and C++ and is live. That is, changes made through the DOM are immediately visible to other users. The SAX implementation uses Expat to parse incoming documents; the database itself can act as a SAX Reader for outgoing documents.

The Birdstep RDM Mobile database on which Birdstep RDM XML is based is an "ultra small" database designed for use on handheld devices and in embedded systems. It stores data as "molecules", each of which consists of two or three "atoms": a content atom (which contains the actual data value when it is longer than 4 bytes), an instance atom (which contains data values of 4 bytes or less, or pointers to data values longer than 4 bytes), and a type atom (which defines the type of an instance atom). Because content atoms are reused, no data longer than 4 bytes is stored more than once. In addition, content atoms contain pointers to all instances, so they serve as indexes. By constructing chains of instances and links between chains, Birdstep RDM Mobile can store virtually any kind of data structure, including XML.
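The atom scheme described above can be sketched as a small data structure. This is only an illustration of the idea of inline short values, shared long values, and back-pointers that double as an index. The class names, the use of a string in place of a type atom, and the dictionary lookup are all assumptions, not Birdstep's actual design.

```python
from dataclasses import dataclass, field
from typing import Optional

INLINE_LIMIT = 4  # values of this many bytes or fewer live in the instance atom

@dataclass(eq=False)
class ContentAtom:
    value: bytes
    # Back-pointers to every instance using this value; they act as an index.
    instances: list = field(default_factory=list)

@dataclass(eq=False)
class InstanceAtom:
    type_name: str                          # stands in for the type atom
    inline: Optional[bytes] = None          # short values are stored directly
    content: Optional[ContentAtom] = None   # long values are stored by pointer

    def value(self) -> bytes:
        return self.inline if self.content is None else self.content.value

class Store:
    def __init__(self):
        self.contents = {}  # long value -> its single ContentAtom

    def add(self, type_name: str, value: bytes) -> InstanceAtom:
        if len(value) <= INLINE_LIMIT:
            return InstanceAtom(type_name, inline=value)
        # Reuse the existing content atom if this value was seen before.
        atom = self.contents.setdefault(value, ContentAtom(value))
        instance = InstanceAtom(type_name, content=atom)
        atom.instances.append(instance)
        return instance

store = Store()
first = store.add("name", b"Alice Smith")
second = store.add("name", b"Alice Smith")
age = store.add("age", b"42")
```

In this sketch both name instances share one content atom, so the long value is stored once, and that atom's instances list answers the question "which instances have this value?" without a separate index.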

Birdstep RDM Mobile supports object inheritance, encapsulation, and polymorphism; indexes; queries based on object type, value, or hierarchical relation; and transactions.

Centor Interaction Server

Developer: Centor Software Corp.
URL: http://www.centor.com/solutions/technology.shtml
License: Commercial
Database type: Proprietary
Entry last updated: September, 2001

Centor Interaction Server consists of the following parts. (Description quoted from the Web page.)

"Interaction Store - ... Data resides in the Interaction Store in a structured and unstructured layout, using XML documents with a fast and scalable indexing technology. ..."

"Data Engines - The Centor Interaction Server has four engines used to manage and manipulate data in the Interaction Store. ..."

"o The Query Engine gives users the ability to categorize, organize and search data ... including capabilities for popular decision support features such as "drill-down", query by range of values, compare/contrast, and content export to other applications via XML data interchange. ..."

"o The Index Engine indexes all data stored in the Interaction Store. It supports advanced searching capabilities that provide intelligent search criteria via either metadata, attribute and value, or text keywords ..."

"o The Update Engine is a command-driven application that is used to create or modify XML entries using the Centor Data Exchange Language (CDXL). CDXL is a neutral file format used to exchange data between the Interaction Server and external sources."

"o The Data Processing Engine provides complex decision support functionality such as rules-based query, traffic lighting, and complicated mathematical calculations."

"Security Engine - The Interaction Server provides a complete security policy management infrastructure for Web-based enterprise applications. It manages end user information including user name, password, departments and the role of each user. ..."

"Data Access (API) - The Data Access layer provides two-way content access and integration capabilities. The Interaction Server offers a number of Application Programming Interfaces (APIs), including URL, C++, and EJB programming interfaces. This layer also provides ODBC and JDBC interfaces used to access data stored in relational databases."

Centor Interaction Server also includes a workflow engine, styling engine, and presentation manager.

DBDOM

Developer: K. Ari Krupnikov
URL: http://dbdom.sourceforge.net/
License: Open Source
Database type: Relational
Entry last updated: November, 2000

DBDOM is an implementation of the DOM over a relational database, using a fixed set of tables to store the DOM tree in the database. DOM methods are implemented as stored procedures, but also included are a set of adapters so these can be called from Java. The initial version will run on PostgreSQL, with later versions planned for Oracle, DB2, and Microsoft SQL Server.
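The general idea of a fixed set of tables holding a DOM tree can be sketched as follows. This is not DBDOM's schema or code: the table and column names are hypothetical, SQLite (from the Python standard library) stands in for PostgreSQL, and ordinary Python functions stand in for the stored procedures.

```python
import sqlite3
from xml.dom import minidom

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES nodes(id),
        position  INTEGER,   -- preserves document order among siblings
        kind      TEXT,      -- 'element' or 'text'
        name      TEXT,
        value     TEXT
    );
    CREATE TABLE attributes (
        element_id INTEGER REFERENCES nodes(id),
        name       TEXT,
        value      TEXT
    );
""")

def store(node, parent_id=None, position=0):
    """Recursively copy a DOM subtree into the tables; return the row id."""
    if node.nodeType == node.ELEMENT_NODE:
        cur = conn.execute(
            "INSERT INTO nodes (parent_id, position, kind, name) "
            "VALUES (?, ?, 'element', ?)",
            (parent_id, position, node.tagName),
        )
        node_id = cur.lastrowid
        for name, value in node.attributes.items():
            conn.execute(
                "INSERT INTO attributes VALUES (?, ?, ?)", (node_id, name, value)
            )
        for pos, child in enumerate(node.childNodes):
            store(child, node_id, pos)
        return node_id
    if node.nodeType == node.TEXT_NODE:
        cur = conn.execute(
            "INSERT INTO nodes (parent_id, position, kind, value) "
            "VALUES (?, ?, 'text', ?)",
            (parent_id, position, node.data),
        )
        return cur.lastrowid
    return None  # other node types are ignored in this sketch

def child_element_names(parent_id):
    """A DOM-style child lookup expressed as a query."""
    rows = conn.execute(
        "SELECT name FROM nodes WHERE parent_id = ? AND kind = 'element' "
        "ORDER BY position",
        (parent_id,),
    )
    return [name for (name,) in rows]

doc = minidom.parseString(
    '<order id="42"><item>widget</item><item>gadget</item></order>'
)
root_id = store(doc.documentElement)
```

Because the tables are the same for every document, no schema or DTD is needed before a document can be stored; the position column is what keeps sibling order recoverable.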

dbXML

Developer: dbXML Group
URL: http://www.dbxml.com/product.html
License: Open Source
Database type: Proprietary
Entry last updated: March, 2004

dbXML is a native XML database that supports four different data stores. The first of these is a proprietary data store that uses B trees. The second is an in-memory data store, which is used for temporary storage and whose contents are deleted when the database is stopped. The third is the file system. And the fourth is a mapping to a relational database (it is not known what mapping is used). Which data store to use is specified on a per-collection basis.

dbXML has a directory-like collection model. Collections can be nested and can store documents that match any XML schema, although it is suggested that a single collection contain documents that match a single XML schema to simplify indexing and querying. Collections can also contain binary streams (such as JPEG files), although a collection cannot contain both binary streams and XML documents.

dbXML supports XPath, XSLT, XUpdate, and full-text searches. XPath and XSLT have been extended for use against collections, and both XSLT and full-text searches can be run against the results of an XPath query.

dbXML supports three different types of indexes. Name indexes index element and attribute names. Value indexes index element and attribute values and support strings, characters, bytes, integers, real numbers, and booleans. Full text indexes index tokens in element and attribute values. They are case insensitive and actually index word stems; for example, both "happening" and "happen" have the same stem. Individual indexes are associated with a particular collection and users specify what to index according to an XPath-like expression.
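The three index types can be illustrated with a small in-memory sketch. It is not dbXML code: the dictionaries, the "@" prefix for attribute names, the sample documents, and above all the stem function (a crude suffix stripper standing in for a real stemmer) are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

docs = {
    "d1": "<event status='open'><title>Something is happening</title></event>",
    "d2": "<event status='closed'><title>It did happen</title></event>",
}

name_index = defaultdict(set)      # element/attribute name -> document ids
value_index = defaultdict(set)     # (name, exact value)     -> document ids
fulltext_index = defaultdict(set)  # word stem               -> document ids

def stem(word):
    """Crude stand-in for a real stemmer: lower-case and strip 'ing'."""
    word = word.lower()
    return word[:-3] if word.endswith("ing") and len(word) > 5 else word

def index_value(doc_id, name, value):
    name_index[name].add(doc_id)
    value_index[(name, value)].add(doc_id)
    for token in re.findall(r"\w+", value):
        fulltext_index[stem(token)].add(doc_id)

for doc_id, text in docs.items():
    for elem in ET.fromstring(text).iter():
        index_value(doc_id, elem.tag, (elem.text or "").strip())
        for attr_name, attr_value in elem.attrib.items():
            index_value(doc_id, "@" + attr_name, attr_value)

# A full-text lookup stems the search term the same way as the indexer.
hits_for_happening = fulltext_index[stem("Happening")]
hits_for_open = value_index[("@status", "open")]
```

The value index only matches exact values, while the full-text index matches both documents for "happening" because the term and the indexed tokens are reduced to the same stem and case.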

dbXML supports triggers. These are user-specified Java classes that can be fired before or after an insert, update, delete, or data retrieval. They can be used to do such things as validating documents on insertion or modifying documents on retrieval. dbXML also supports extensions to the server through Java classes.

dbXML supports transactions and security. Security options are no security, a single user name and password for the entire database, and role-based security (the default).

dbXML has four different APIs: the direct API, the client API, XML:DB, and Web services. The direct API allows applications to work directly with dbXML. The client API allows applications to use dbXML in client-server fashion. This can be done where both client and server are in the same process, or through XML-RPC. The Web services interface supports both XML-RPC and REST (URL encoding).

dbXML comes with a set of command line tools for connecting to the database, managing collections, indexes, security, triggers, and extensions, and storing and retrieving documents.

NOTE: dbXML is a complete rewrite of the code that became Xindice and is therefore different from that product.

Dieselpoint

Developer: Dieselpoint, Inc.
URL: http://www.dieselpoint.com/xmlsearch.html
License: Commercial
Database type: None (indexes only)
Entry last updated: January, 2007

Dieselpoint is a search engine, not a native XML database. It indexes documents and data specified by the user and then executes queries against those indexes. Dieselpoint is written in Java and will run in any J2EE-compliant application server. It is designed to be called from a user-written application and its API is designed with such applications in mind. For example, it returns metadata about search results so applications can dynamically create user interfaces relevant to those results. Applications can call Dieselpoint through a Java API, a JSP front end, JDBC, or XML. For users who do not want to write their own application, Dieselpoint ships with a number of sample applications (including a product catalog application) and a generic, JSP-based user interface that is "suitable for common uses".

Dieselpoint indexes documents and data retrieved by a crawler from Web sites, directories, and databases. It can index documents (XML, HTML, PDF, Microsoft Office), databases (via JDBC), and flat files (comma-separated, tab-separated, and so on). Data in other formats can be indexed via calls to a user-implemented API. The indexer extracts data in the form of attributes, such as document metadata, XML elements and attributes, and database columns. A preprocessor allows user-written code to modify, categorize, or reject items before they are indexed.

Dieselpoint uses a proprietary query language, which supports full-text and parametric searching. (Parametric searching limits a search to a particular attribute, such as a title, part number, or description.) Search clauses can be joined in any way by AND, OR, NOT, and parentheses, and can include comparisons (=, >, >=, <, <=, <>), wildcards, and regular expressions. Full-text features include stemming, thesauruses, stop words, misspellings, relevance, hit highlighting, and support for 40 languages and 140 dialects. Search results can be returned as a JDBC result set or XML document and can be sorted by relevance or attribute value.

XML-specific features include searching by element or attribute and by XML path. (The indexer preserves the XML hierarchy.) The query engine can return complete documents or fragments, and can also treat fragments of a document (headed by a particular element name) as separate documents. Dieselpoint understands both ECCMA (an XML language for catalogs) and Dublin Core and provides special processing for both. In addition, it can handle XMP metadata (RDF documents) embedded in PDF documents.

Dieselpoint includes an administrator for performing such tasks as managing indexes, defining data sources, and scheduling the crawler. It also contains a Web server and servlet container.

DOMSafeXML

Developer: Ellipsis
URL: http://www.ellipsis.nl/content/products.htm
License: Commercial
Database type: File system(?)
Entry last updated: June, 2004

DOMSafeXML is a main-memory native XML database that stores XML files on disk and monitors them "for external changes". It supports XPath, SAX, DOM level 2, and the XML:DB API, with language bindings for COM, C++, Java, and C#. DOMSafeXML supports multi-user access through transactions and node-level locking and comes with a built-in Web server.

eXist

Developer: Wolfgang Meier
URL: http://exist.sourceforge.net
License: Open Source
Database type: Proprietary
Entry last updated: March, 2004

eXist is a native XML database that uses a proprietary data store (B+ trees and paged files). It can be run as a standalone database server, as an embedded database, or in the servlet engine of a Web application. Documents are stored in a hierarchy of collections. Collections can contain child collections and do not constrain documents to any particular schema or document type.

eXist supports XQuery/XPath 2.0 and XQuery statements can query any combination of collections and documents. eXist does not support strong data typing but does provide a number of extensions to XQuery. In particular, eXist's implementation of XQuery can execute full text searches, call the XML:DB API (such as to store query results in the database), execute dynamically constructed XQuery statements, apply XSLT stylesheets to a node, work with HTTP, and execute arbitrary Java methods. eXist also provides partial support for XInclude and XPointer.

Updates are primarily supported through XUpdate. When eXist is being used as an embedded database, live DOM trees are supported as well. eXist supports the XML:DB API, with additional services for preparing and executing XQuery statements, managing users, managing multiple database instances, and querying indexes. DOM and SAX are supported for documents returned through the XML:DB API. eXist can also be called via XML-RPC, a REST-style Web services API, SOAP, and WebDAV.

eXist automatically indexes all element and attribute structure. By default, it creates full text indexes over all text and attribute values, but users can turn this off for selected parts of a document. It supports concurrent read/write access for multiple users, as well as access control at both the collection and document level. It does not currently support transactions.

Of note, eXist has complete documentation.

eXtc

Developer: M/Gateway Developments Ltd.
URL: http://www.mgateway.tzo.com/php/mgw/extc.php
License: Commercial
Database type: Post-relational (Cache)
Entry last updated: March, 2004

eXtc is a native XML database built on top of the Cache database. It provides an implementation of the DOM over Cache, storing documents as DOM objects. Because eXtc is written in Cache ObjectScript (Cache's extension of the MUMPS programming language), the DOM implementation inherits the features of that language, such as integral support for transactions, multi-user access, and remote access. DOM level 2 is supported, with additional support for the Abstract Schemas and Load and Save features of DOM level 3.

eXtc supports XPath queries over individual documents, as well as SQL queries against the Cache tables used to store DOM trees. The latter is useful for SQL-like queries -- for example, finding the value of all CustomerName elements -- as well as finding the IDs of documents that match certain criteria.

eXtc also supports XSL-FO, SVG (through a library of functions for creating SVG documents), WebDAV, and HTTP access through Cache's WebLink module. In addition, it includes client and server implementations of SOAP and WSDL, which allow Cache applications to be exposed as Web services and to be integrated with Web services.

Extraway

Developer: 3D Informatica
URL: http://www.3di.it/h3/h3/aSito_3DI/finizio (Italian)
http://www.3di.it/nuovo/html/extraway.pdf (English, PDF)
License: Commercial
Database type: Files plus indexes
Entry last updated: August, 2005

From the company:

"Extraway is a native XML database that is designed to preserve data as "Information Units", which are objects defined by the database administrator and which use an XML data model. By default, information units correspond to the root element of a document, but can also correspond to lower-level elements. For example, a single document may contain multiple bibliographic records where each record is considered to be a single information unit."

"Extraway supports synchronous and asynchronous use cases. In the synchronous case, client applications create and retrieve information units. The engine receives XML information units and stores them on a private area of the file system. The storage process is configured by the database administrator, who determines the aggregation policy, the directory, and the file name settings. For example, the administrator can organize the file system by year, department, and classification and arrange information units of a given type in the same XML document."

"During the aggregation process, Extraway adds system metadata for each unit: time/username of submission/modification, an integrity hash, and the current versions of the organization's structure, DTD/XML Schema, classification plan, and client software."

"Extraway also manages multimedia objects, storing them in the same directory as the corresponding information unit. For the most common formats, the text is extracted and indexed, and metadata like size, resolution, compression, duration, and hash code are extracted and added to the information unit."

"In the asynchronous case, Extraway monitors XML files at a given network address and simply indexes them."

"Indexes are built relative to the root of each information unit. Each index has a specific type (string, number, or date) and can index the entire content of an element or attribute or individual tokens within that content. Indexes can also concatenate individual values, or be created from custom code that is run at index time. Indexes can be built on demand, at regular intervals, or, by default, in response to events such as adding, deleting, and modifying information units."

"Extraway has a proprietary query language that allows users to combine path expressions with boolean operators. Path expressions can be declared and aliased at design time. This allows path details to be abstracted, which is useful when merging different paths in the same model and in handling different DTD / XML Schema versions. The language supports equality, arithmetic, and full-text operators. Query results are returned as a named result set, which can be browsed, refined, referenced in other queries, or made persistent."

"Extraway can also be queried with SQL, which is used for its joining operators: when the selected columns return non-repetitive values, the result is a trivial table having information unit as rows; in the other cases the result is an array of information unit identifiers."

"Extraway includes a GUI-based DTD editor, an administration console, Java, .Net, and Web services APIs, and a JDBC driver. Other features include support for thesauruses and encryption of XML units stored on the file system."

GoXML DB

Developer: XML Global
URL: http://www.xmlglobal.com/prod/db/index.jsp
License: Commercial
Database type: Proprietary (Model-based)
Entry last updated: May, 2003

GoXML DB is the same as XStreamDB. For more information, see the XStreamDB entry.

Infonyte DB (formerly PDOM)

Developer: Infonyte
URL: http://www.infonyte.com/prod_db.html
License: Commercial
Database type: Proprietary (Model-based)
Entry last updated: February, 2002

Infonyte is a native XML database built from two components: Infonyte PDOM (Persistent DOM) and Infonyte XQL (which can be purchased separately). Infonyte PDOM is a storage engine for storing the XML documents in indexed, binary files. The PDOM engine provides an implementation of the DOM over these files. The DOM implementation can handle arbitrarily large documents because it swaps DOM nodes to disk as needed. It includes defragmentation and garbage collection facilities, commit points (for writing the in-memory tree to disk), file compression with gzip, and thread-safe operation.

Infonyte XQL is an implementation of XQL with extensions for variables, multi-document queries, restructuring of query results, full-text search, result construction, and sequencing. It is addressable through HTTP.

Version 2.0 (feature complete and in beta as of December, 2001) includes support for XPath, DOM Level 2, and XSLT. XSLT support is provided by Xalan, which has been extended to work directly on data in the database.

Ipedo

Developer: Ipedo
URL: http://www.ipedo.com/html/products.html
License: Commercial
Database type: Proprietary
Entry last updated: August, 2003

Ipedo consists of three main components: the XML Database (provides native XML storage and XQuery engine), the Integration Manager (integrates external data using XML), and the Web Express (creates output documents).

The XML Database is a native XML database that uses a proprietary data store. It supports XQuery, indexing, schema management, a proprietary linking language, collections, and transactions. The XQuery engine can query documents in the native data store as well as views built over external data sources (see below). It supports proprietary extensions for updates and full-text searches and uses XML Schemas if they are available. The linking language is used to build virtual documents. It uses URLs to include documents or fragments stored on the Web, in the native data store, or in views. Links may be parameterized. The Schema Manager supports XML Schemas and DTDs, which are converted to XML Schemas. If a schema is associated with a collection, documents in that collection are validated on insert and on update. The Schema Manager also supports versioning.

The Integration Manager has three components: XML Views, Adapters, and Content Converters. XML Views provide dynamic, read-only access to external data. They are supported for relational databases (using a table-based mapping?), SOAP, HTTP, and XML documents stored in local or remote Ipedo databases. Views are accessed in XQuery statements via the proprietary view() function and are evaluated at query time. Adapters synchronize local copies of external data with their external source. They consist of an inbound adapter (running in Ipedo) and an outbound adapter (running in the external source). Adapters are run by triggers in both Ipedo and the external source and use a proprietary XML protocol that runs over HTTP or JMS. Adapters are available for Oracle and DB2; users can write their own Adapters as well. Content Converters provide static, read-only access to external documents. They convert documents from a variety of formats (Word, PDF, etc.) into XML documents, which are then stored in the XML Database.

The Web Express provides a User Profile Manager, a Transformation Engine, Pipelines, and Web Tags. The User Profile Manager allows designers to customize views of the data on a per-user basis. The Transformation Engine is an index-aware XSLT engine. Pipelines are named sets of transformations; these can be performed by the Transformation Engine or components written by the user. Pipelines can use any source of XML as input (an XML document, an XQuery statement, another Pipeline, etc.) and send the output to a specific destination, such as a Web site or a specific application. Pipelines can be parameterized. Web Tags are a tag library for writing Java Server Pages (JSPs) that use Ipedo.

Ipedo is written in Java and is accessible from Java, EJBs, COM, SOAP, and WebDAV. It provides security through users, groups, and access control lists; supports journaling, on-line backups, and external clustering mechanisms; and allows read-only replicas of the data store to be deployed. It comes with GUI-based design and administration tools.

Lore

Developer: Stanford University
URL: http://www-db.stanford.edu/lore/home/index.html
License: Research
Database type: Semi-structured
Entry last updated: November, 2000

Semi-structured data is data with more structure than a conversation, but less structure than a telephone book. A good example is a resume (curriculum vitae). While virtually all resumes include a name, address, and telephone number, only some will include an email address, Web site, or FAX number. Most will include a list of previous jobs, but others might include only a list of university courses. Depending on the profession, there might be a list of software used or licenses held.

XML is well-suited to storing semi-structured data and shares a feature common to many semi-structured data models: it is self-describing. That is, it carries a certain amount of metadata with the data. In the case of XML, this is in the form of element type and attribute names. The legality of well-formed documents mirrors another feature found in many semi-structured data models: the data model is not required to have a definitive schema, and the model can be extended at will by the addition of new fields.

Lore is a database designed for storing semi-structured data. Although it predates XML, it has recently been migrated for use as an XML database. It includes a query language (Lorel), multiple indexing techniques, a cost-based query optimizer, multi-user support, logging, and recovery, as well as the ability to import external data. Because Lore is designed for use with semi-structured data, XML documents without DTDs can be easily stored.

An interesting feature of Lore is a DataGuide, which is a "structural summary of all paths in the database". Unlike structured databases, in which the structure is specified first and data is added according to that structure, data is entered first into Lore and the structure is then summarized. The resulting information is useful for query processing.
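
The idea of summarizing structure after the data has been loaded can be sketched in a few lines of Python. This is only an illustration of a path summary, not Lore's DataGuide implementation; the function name summarize_paths and the sample resume document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def summarize_paths(xml_text):
    """Return the sorted list of distinct label paths found in one document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(element, prefix):
        path = prefix + "/" + element.tag
        paths.add(path)
        # Attributes are recorded as extra path steps prefixed with "@".
        for name in element.attrib:
            paths.add(path + "/@" + name)
        for child in element:
            walk(child, path)

    walk(root, "")
    return sorted(paths)


# Hypothetical semi-structured document: the two <job> elements differ in shape.
resume = """
<resume>
  <name>A. Person</name>
  <job><title>Editor</title></job>
  <job><title>Writer</title><employer>Example Co.</employer></job>
  <contact email="a@example.org"/>
</resume>
"""

for path in summarize_paths(resume):
    print(path)
```

Each distinct path appears once no matter how often it occurs in the data, which is what lets a query processor check quickly whether a path can match anything at all.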

The Lore executables are "available for public use". Source code may be available in some circumstances.

MarkLogic Server (formerly Cerisent Content Interaction Server)

Developer: Mark Logic Corp.
URL: http://www.marklogic.com/products/index.html
License: Commercial
Database type: Proprietary (?)
Entry last updated: June, 2004

MarkLogic Server is a native XML database that uses a proprietary(?) data store. It stores all content as XML, converting documents from formats such as Microsoft Office, PDF, and StarOffice when they are loaded. At load time, documents are indexed with both full-text and parametric (structured?) indexes. They can be stored with "configurable levels of document fidelity". Presumably, this means that users can choose which types of XML structures to store. For example, users might be able to discard comments, processing instructions, insignificant white space, and sibling order from data-centric documents.

MarkLogic Server supports XQuery, with extensions for full-text queries, updates, and transaction handling. Updates can be performed at the node level through the XQuery extensions or a proprietary API, and can be grouped together into a single transaction. Queries are "lock-free" and journaling is performed to allow recovery in the event of a system crash. MarkLogic Server supports XML Schemas as well.

Applications interact with MarkLogic Server through the Java XDBC API or an integrated HTTP server. In addition, the server can be customized through the Services Component Adapter Layer.

myXMLDB

Developer: Mladen Adamovic
URL: http://sourceforge.net/projects/myxmldb/
License: Open Source
Database type: MySQL
Entry last updated: January, 2005

myXMLDB is a native XML database implemented on top of MySQL. It stores documents as BLOBs and can store documents up to 256 MB in size. It supports XPath and XQuery through Saxon and provides a Java implementation of the XML:DB API. A GUI interface is provided through XMLdbGUI.

Natix

Developer: data ex machina
URL: http://www.dataexmachina.de/natix.html
http://pi3.informatik.uni-mannheim.de/~moer/natix.html (in German)
License: Commercial
Database type: Proprietary(?)
Entry last updated: June, 2004

Natix is a native XML database designed to:

"...support compact storage of structure and content of XML documents, index structures for content and structure retrieval, validation, recovery, isolation of multiple users that work on the same document(s), query evaluation, a rich set of application programming interfaces (APIs) for languages like C++, Java, and support for legacy applications."

Although the Web site is unclear about the status of Natix, it is a released product and is the engine behind Xyleme Zone Server. It is primarily designed to be embedded in applications or other systems, but can be bought separately.

NaX Base (formerly Lucid XML Data Manager)

Developer: Naxoft (bought from Lucid'i.t.)
URL: http://www.naxoft.com/produit-presentation.html (in French)
License: Commercial
Database type: Proprietary
Entry last updated: May, 2004

NaX Base is a native XML database built on a proprietary data store, which maintains both node-level and full-text indexes. Of interest, indexes are optimized asynchronously, which might allow for faster updates. Documents can be organized in collections, which can be nested inside each other.

The query language is an extended version of XPath which supports queries along the hierarchy of collections as well as a grep operator for doing full-text searches. Full-text search features include case-sensitive and -insensitive searches, wildcards, and searches that omit specific words. Updates are done through an API with methods such as insertBefore, insertAfter, and appendChild. These methods accept the new node and an XPath expression identifying the location of the change.
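
The update style described above, in which each call takes a new node plus an XPath expression locating the change, can be sketched with Python's standard library. The function names below mirror the method names mentioned in the text, but the signatures are assumptions; this is not NaX Base's actual API, and ElementTree supports only a subset of XPath.

```python
import copy
import xml.etree.ElementTree as ET


def _parent_of(root, target):
    # ElementTree elements do not record their parent, so search for it.
    for parent in root.iter():
        if any(child is target for child in parent):
            return parent
    raise ValueError("target has no parent under this root")


def insert_before(root, xpath, new_node):
    """Insert a copy of new_node before each element matched by xpath."""
    for target in root.findall(xpath):
        parent = _parent_of(root, target)
        parent.insert(list(parent).index(target), copy.deepcopy(new_node))


def insert_after(root, xpath, new_node):
    """Insert a copy of new_node after each element matched by xpath."""
    for target in root.findall(xpath):
        parent = _parent_of(root, target)
        parent.insert(list(parent).index(target) + 1, copy.deepcopy(new_node))


def append_child(root, xpath, new_node):
    """Append a copy of new_node as the last child of each matched element."""
    for target in root.findall(xpath):
        target.append(copy.deepcopy(new_node))


order = ET.fromstring("<order><item>pen</item><item>ink</item></order>")
insert_before(order, "./item[1]", ET.fromstring("<note>rush</note>"))
insert_after(order, "./item[2]", ET.fromstring("<item>paper</item>"))
append_child(order, ".", ET.fromstring("<total>3</total>"))
print(ET.tostring(order, encoding="unicode"))
```

Because the location is an expression rather than a node handle, a single call can change every node that the expression matches.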

NaX Base allows users to assign access privileges on a per-user and per-collection basis. In addition, individual documents can be locked by a single user. NaX Base can be run either locally or via a network, and has both Java and COM APIs. A GUI-based administration and development tool is provided.

NeoCore XMS

Developer: Xpriori (who bought NeoCore's intellectual property)
URL: http://www.xpriori.com/index.html
License: Commercial
Database type: Proprietary
Entry last updated: Summer, 2001

From the company:

"NeoCore XML Management System (XMS) is a fully transactional native XML database system that serves as a bi-directional web server, accepting and returning XML documents and fragments via HTTP(S). It supports all basic database functions, including storage, delete, copy, and query for XML documents, and insert, modify, and query for XML data elements. NeoCore XMS is schema-independent and requires no database or schema design before using the system. That is, rows, columns, tables, fields, or indexing instructions do not need to be created before documents are added to the database. When new documents are added to the database, their structure and data - metadata and data - are derived and automatically indexed. Users then can change the structure of existing documents without database system redesign. Specific features of NeoCore XMS include XPath-based query support, access control, user-defined document data management, GUI-based administration, and session control."

"NeoCore XMS uses a variant of XPath as its query language; query responses return elements, document fragments, full documents, and multiple documents. Queries can be made without knowing the document structure - context can be queried for data, and data can be queried for context. Boolean and wildcard options are fully supported and all query results are well-formed XML. For query processing, NeoCore XMS uses Digital Pattern Processing, a patented technology that streamlines queries by using fixed-length icons."

"NeoCore XMS has built-in access control to set permissions at the document or fragment level, and at user or group levels. NeoCore XMS also supports access control by specifying IP addresses and supporting X.509 certificates via Netscape Server."

"Interfaces to NeoCore XMS include HTTP(S)/SSL, Java, C++, and Microsoft COM. NeoCore XMS also can interface with existing databases through the X-Aware data integration tool."

ozone

Developer: ozone-db.org
URL: http://ozone-db.org/frames/home/what.html
License: Open Source
Database type: Object-oriented
Entry last updated: March, 2004

From the Web site:

"ozone is a fully featured, object-oriented database management system completely implemented in Java ... ozone includes a fully W3C compliant DOM implementation that allows you to store XML data. You can use any XML tool to provide and access these data. Support classes for Apache Xerces-J and Xalan-J are included."

"Besides the native API, ozone provides a ODMG 3.0 interface. Although not fully ODMG compliant it helps you to port applications to/from ozone."

"ozone does not depend on any back-end database or mapping technology to actually save objects. It contains its own clustered storage and cache system to handle persistent Java objects."

"[ozone] includes the following features:
o multi-user, multi-thread support
o object level access rights
o fully transaction based
o JTA/XA support
o deadlock recognition
o BLOB support
o XML (DOM) support
o ODMG 3.0 support
o Garbage collection"

ozone is part of the Infozone framework.

Sedna XML DBMS

Developer: Management Of Data & Information Systems, Institute for System Programming of the Russian Academy of Sciences
URL: http://modis.ispras.ru/sedna/index.htm
License: Free
Database type: Proprietary
Entry last updated: June, 2004

From the Web site:

"Sedna XML DBMS is a native full-featured data management system. It is designed having the following main goals in mind:

o Support for all traditional DBMS features (such as update and query languages, query optimization, fine-grain concurrency control, various indexing techniques, recovery and security),

o Efficient support for unlimited volumes of document-centric and data-centric XML documents that may have a complex and irregular structure,

o Full support for the W3C XQuery language in such a way that the system can be efficiently used for solving problems from different domains such as XML data querying, XML data transformations and even business logic computation (in this case XQuery is regarded as a general-purpose functional programming language)."

"[Features include:]

o Support for the W3C XQuery language

o Support for a declarative update language

o Native XML data storage structures designed for efficient support for both queries and updates (no underlying relational or another DBMS). The XML data storage is based on descriptive schema (also called DataGuide)

o JAVA API and Scheme API for application development

o Open client/server protocol over sockets that allows implementing APIs for other programming languages

o Administration via easy-to-use command line utilities"

[Ed. -- The declarative update language is based on the extensions to XQuery proposed by the W3C and Patrick Lehti.]

Sekaiju (known as Yggdrasill in Japan)

Developer: Media Fusion
URL: http://www.mediafusion.co.jp/usa/seihin/sekaiju/index.html
License: Commercial
Database type: Proprietary
Entry last updated: February, 2002

Sekaiju is a native XML database that has a proprietary data store designed to store well-formed XML documents. This uses "baskets" and "pockets" (the latter are "like a table" in a relational database), supports two-byte characters, and can store documents that are up to 2 GB in size.

Sekaiju has local and remote COM interfaces, making it accessible via Visual Basic, as well as an HTTP interface. Its query language is XBath, a proprietary language based on XQL. Indexes are automatically built for all nodes (element tags, attributes, and PCDATA) in version 1.0 and for user-specified nodes in version 1.5. Updates are supported only by replacing entire documents.

Transactions are supported through a versioning (log file) mechanism which is designed to minimize conflicts due to reading and writing the same document at the same time. Locking is done at the pocket level in version 1.0 and at the pocket or document level in version 1.5. Rollbacks occur automatically when problems occur in version 1.0; users can also request them directly in version 1.5. Security features include 256-bit encryption and password protection, with access controllable at the pocket level.

Tools include a forms editor, a GUI-based management tool, backup/restore tools, and a toolkit for parallel processing.

SQL/XML-IMDB

Developer: QuiLogic
URL: http://www.quilogic.cc/
License: Commercial
Database type: Proprietary XML store plus relational store
Entry last updated: February, 2003

SQL/XML-IMDB is an in-memory database with both native XML and relational data stores. While both data stores organize data in tables, a "table" in the XML data store is what most other native XML databases refer to as a collection, with one XML document per "row". Tables can be created as either local to a particular process or shared among processes and use compression to minimize memory use. Both types of tables are indexed with TST-trees, which "combine the speed advantage of a hash table with the ordered access of a binary tree", and XML tables are also indexed with "Reverse-Lookup" and "Token-Segment-Build-Up" mechanisms. While there does not appear to be a way to directly store the entire database to disk, individual relational tables can be saved as text files and individual XML tables can be saved as XML documents.

SQL/XML-IMDB supports both XQuery and a "significant subset" of SQL92. This allows XML queries against XML data and SQL queries against relational data. In addition, it extends XQuery so that users can mix XML and relational data. To do this, it allows SQL statements in "any part of [an] XQuery statement where an expression is allowed". From a practical standpoint, it appears that this means SELECT statements are used anywhere except in a RETURN clause and INSERT, UPDATE, and DELETE statements are used in RETURN clauses.

When a SELECT statement is used, the returned result set is mapped to an XML document with a table-based mapping. That is, each row in the result set is mapped to a <row> element and each column is mapped to a child of that <row> element. This allows XQuery variables to be bound to individual rows or columns in the result set. When any type of SQL statement is used, it can include XQuery variables. For example, these can be used in the WHERE clause of a SELECT statement to correlate relational and XML data, or in the VALUES clause of an INSERT statement to transfer data from XML documents to relational tables.
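
The table-based mapping can be illustrated with Python's sqlite3 and ElementTree modules standing in for the product. The <row> element and the column-named children follow the description above; the root element name "table", the function name result_set_to_xml, and the sample parts table are assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def result_set_to_xml(cursor, root_name="table"):
    """Map each row to a <row> element and each column to a child of it."""
    root = ET.Element(root_name)
    # Column names become element names, so they must be valid XML names.
    column_names = [description[0] for description in cursor.description]
    for record in cursor:
        row = ET.SubElement(root, "row")
        for name, value in zip(column_names, record):
            column = ET.SubElement(row, name)
            if value is not None:
                column.text = str(value)
    return root


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE parts (id INTEGER, name TEXT)")
connection.executemany(
    "INSERT INTO parts VALUES (?, ?)", [(1, "bolt"), (2, "nut")]
)
cursor = connection.execute("SELECT id, name FROM parts ORDER BY id")
document = result_set_to_xml(cursor)
print(ET.tostring(document, encoding="unicode"))
```

Once the result set has this shape, binding a variable to a row or a column is ordinary path navigation, for example document.findall("./row/name").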

SQL/XML-IMDB also extends XQuery with operators to update XML documents. Supported operations include deleting nodes, renaming nodes, updating node values, replacing nodes, and inserting new nodes before or after existing nodes. Note that these operations cannot be performed inside a transaction.

SQL/XML-IMDB has a proprietary API for interacting with the database. This includes functions for preparing and executing SQL and XQuery statements, beginning, committing, and rolling back transactions, transferring data between internal tables and external files or application variables, and bi-directional iteration over result sets. It is worth noting that XQuery results are returned in result sets just like SQL results. Each item in an XQuery sequence is returned as a separate column, with atomic values mapped to columns of the appropriate data type and nodes mapped to XML strings. When an XQuery statement returns multiple sequences, these are mapped to multiple rows in the result set.

SQL/XML-IMDB can be used from Microsoft .NET, Visual C++, Visual Basic, Office, and IIS/ASP, Borland C++ and Delphi, Perl, and PHP.

Sonic XML Server (formerly eXcelon)

Developer: Sonic Software (who bought eXcelon Corp.)
URL: http://www.sonicsoftware.com/products/sonic_xml_server/index.ssp
License: Commercial
Database type: Object-oriented (ObjectStore). Relational and other data through Data Junction
Entry last updated: April, 2003

[Note: The following is a description of eXcelon's eXtensible Information Server (XIS). Sonic Software bought eXcelon and renamed XIS as Sonic XML Server. It is not known whether the following description is still accurate, since the Sonic Web site has little technical information about Sonic XML Server. Ed. -- 4/04]

eXtensible Information Server is a native XML database built on top of ObjectStore. Documents are parsed on import, with individual nodes stored as hierarchically linked objects. This means that documents do not have to be parsed at run time and large documents can be processed without having to read the entire document into memory. Documents are not required to have a DTD or conform to a predescribed schema. They can be indexed using both value and structural indexes. (Value indexes index element and attribute values; structural indexes index element and attribute names.) They can also be arranged in collections; these can be nested, resulting in a file system metaphor.

eXtensible Information Server supports queries through XQuery, XPath with extension functions, and a proprietary update language (updategrams). Updategrams consist of an XPath to a node, an operation on that node (insert before/after, update, delete), and any data needed to carry out the operation. As an add-on, eXtensible Information Server supports full-text search through the Verity engine. Users pass queries (using Verity's query language) to eXtensible Information Server, which passes them to Verity. Verity executes the queries (using its own indexes) and returns pointers to the relevant documents in eXtensible Information Server.

eXtensible Information Server supports two kinds of server-side functions, which can be written in Java, VB, or COM. The first, known as server extensions, run inside the current transaction and are commonly used in XPath expressions or as triggers associated with inserts, updates, or deletes. These can directly manipulate data in the cache using a server-side DOM implementation. The second, known as servlets, must define their own transaction boundaries and are generally used to implement extensions to the database as a whole, such as a JMS queue.

eXtensible Information Server also supports a concept called "Binder Documents". This allows users to link existing documents as well as to build virtual documents that consist of nothing but links. Links are traversed transparently during queries and update operations, which means that virtual documents can be used to perform queries and updates over multiple documents in a single operation. Note that the application must currently enforce the referential integrity of links (such as through triggers). That is, it must ensure that the document/fragment to which a link points actually exists.

eXtensible Information Server can integrate backend data through the XConnects Integration Engine, which uses the Data Junction Universal Translation Suite. This provides links to many different data formats, including relational databases. Because the links are two-way, it means that backend data sources can be updated through eXtensible Information Server. Users can also write their own XConnects connectors with a Java API, a scripting language, and Stylus Studio (an IDE for XSLT and XML).

eXtensible Information Server supports transactions and can participate in XA transactions. However, it cannot currently manage XA transactions, so the application must coordinate any XA transactions that include eXtensible Information Server and other data sources, such as backend data stores. Other database features include distributed caching, partitioning, online backup and restore, and clustering support.

Finally, eXtensible Information Server comes with Java, COM, and .NET APIs, a JCA-compliant driver, a built-in XSLT processor, and a set of GUI development tools. These include an XML editor, an XSLT editor, a schema editor (XML Schemas and DTDs), an XSLT/Java debugger, an XML-to-XML mapping tool, and tools for mapping backend data to XML documents.

Tamino

Developer: Software AG
URL: http://www.softwareag.com/Corporate/products/wm/tamino/default.asp
License: Commercial
Database type: Proprietary. Relational through ODBC.
Entry last updated: November, 2002

Tamino XML Server is a suite of products built in three layers -- core services, enabling services, and solutions (third-party applications) -- which may be purchased in a variety of combinations. Core services include a native XML database, an integrated relational database, schema services, security, administration tools, and Tamino X-Tension, a service that allows users to write extensions that customize server functionality.

The XML engine uses the Data Map, which describes where the data in a given XML document is stored. This allows individual XML documents to be composed of data from multiple, heterogeneous sources, such as the native XML data store, relational databases, and the file system. Since the connections to external data (made through the X-Node module) are live and bidirectional, Tamino may thus be used to perform heterogeneous joins and updates.

Tamino's XML support includes the DOM, JDOM, SAX, and XML:DB APIs, an extended XPath implementation called X-Query (not to be confused with W3C XQuery, which it predates), full-text retrieval, processing of XML documents with server-side XSL and CSS, and limited support for SOAP. It can store schema-less documents and can use schema information (including a subset of XML Schemas) if it is available.

The internal SQL engine is directly addressable through ODBC, JDBC, and OLE DB. However, when addressed via these APIs, it cannot integrate data from the internal XML data store or from external data sources. (As noted above, the reverse is true. That is, with the help of the X-Node, the XML engine can integrate data from the XML data store and other databases, including the internal SQL engine.)

Enabling services include X-Port, X-Plorer, X-Application, various APIs (mentioned above), X-Node (also mentioned above), and the WebDAV Server. X-Port provides URL-based data transfer through various standard HTTP servers, X-Plorer is a browser-based navigation tool for documents stored in Tamino, and X-Application is a set of JSP tags for accessing Tamino through Web pages.

The WebDAV Server adds namespace management (nested collections or directories), additional properties (such as last-modified, content length or content type) and overwrite protection (persistent locking) to the existing Tamino XML Server functionality. This allows Tamino to serve as a virtual file system (Web folder) where the information can be stored and retrieved using a standard Web browser and the common drag and drop metaphor.

(Note: In spite of rumors to the contrary, Tamino is not built on top of Adabas, a hierarchical database from Software AG. Instead, the Tamino data store was built from the ground up as a native XML database, obviously drawing on the knowledge gained from developing Adabas.)

TeraText DBS (formerly SIM (Structured Information Manager))

Developer: TeraText Solutions (A Division of SAIC)
URL: http://www.teratext.com/get/page/browser/browser?category=Products/TeraText%20DBS
http://www.saic.com/products/software/teratext/
License: Commercial
Database type: Proprietary
Entry last updated: August, 2002

From the Web site:

"TeraText DBS was designed specifically to store, retrieve and manipulate structured text. ... [It] also indexes all or part of the document using XML standards, enabling complex and comprehensive searching."

"[TeraText DBS is] designed to support XML, SGML, Unicode, Z39.50, HTTP and other industry standards, [and its] components are modular. They can be installed as a suite or as individual modules to work with existing database management and document-authoring systems."

"A content server enables searches on structural elements or document characteristics ... [It] also supports the ... worldwide industry standard protocol for information retrieval, Z39.50."

"A unique applications server provides immediate access to any TeraText database. TeraText DBS supports plug and play modules for complex value added Web services."

"Java, C++ and SOAP APIs as well as WebDav, LDAP, Microsoft Word, PDF and other plug-in adapters are available."

TEXTML Server

Developer: IXIASOFT, Inc.
URL: http://www.ixiasoft.com/default.asp?xml=/xmldocs/webpages/textml-server.xml
License: Commercial
Database type: Proprietary (Document-based)
Entry last updated: June, 2005

TEXTML Server is a native XML database that stores, indexes, and retrieves whole XML documents. A TEXTML Server installation consists of one or more document bases, each of which consists of a document repository and a set of indexes. The document repository is organized as a hierarchical set of collections and can store both XML and non-XML documents. All documents are stored intact. The major difference between XML and non-XML documents is that XML documents are parsed at insert time to create indexes. While non-XML documents are not parsed, they can be associated with an XML document that provides indexable metadata for the non-XML document.

Unlike most native XML databases, the indexes in TEXTML Server effectively form an additional schema layer on top of the documents stored in the database. This is because indexes are defined using one or more XPath expressions. Since these can refer to any document in the database, the effect is that a single index can refer to more than one field. For example, an author index might refer to the AuthorName element in one set of documents and the StoryAuthor attribute in another set of documents. Furthermore, because indexes are defined using XPath expressions, it is possible to transform values and index the transformed values. TEXTML Server supports five different types of indexes: word (token), string, numeric, date, and time.
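
A single index fed by several XPath expressions can be sketched as follows. This is an illustration of the idea only, not TEXTML Server's index definition format: build_index, the (xpath, attribute) pairs, and the sample documents are assumptions, and the value transformation is done here in Python rather than inside the XPath expression.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(documents, extractors):
    """Build one index from several XPath-based extractors.

    documents:  mapping of document id -> XML text
    extractors: list of (xpath, attribute) pairs; attribute is None when
                the element's text should be indexed
    """
    index = defaultdict(set)
    for doc_id, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for xpath, attribute in extractors:
            for node in root.findall(xpath):
                value = node.get(attribute) if attribute else node.text
                if value:
                    # Index a transformed (trimmed, lower-cased) value.
                    index[value.strip().lower()].add(doc_id)
    return index


documents = {
    "book1": "<book><AuthorName>Jane Roe</AuthorName></book>",
    "book2": "<book><AuthorName>John Doe</AuthorName></book>",
    "mag1": "<magazine><story StoryAuthor='JANE ROE'/></magazine>",
}

# One "author" index covering an element in one document type and an
# attribute in another.
author_index = build_index(
    documents,
    [(".//AuthorName", None), (".//*[@StoryAuthor]", "StoryAuthor")],
)
print(sorted(author_index["jane roe"]))
```

A lookup on the author index finds both the book and the magazine story even though the two document types record the author in different places.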

TEXTML Server has its own, XML-based query language. Queries are defined as a series of boolean tests over specific indexes or the full text of the documents. Tests are generally for equality. In addition, numeric, date, and time indexes support range tests, and word and string indexes support wild-card tests. Tests can then be joined with a number of operators, including And, Or, And Not, Near, adjacency, and frequency. Queries return whole documents and can sort results based on index values, document properties, and hit counts.
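
The semantics of combining index tests can be sketched with plain sets. The code below does not reproduce TEXTML Server's XML-based query syntax; the index layout, the helper names equals and in_range, and the sample data are assumptions, and operators such as Near, adjacency, and frequency are not modeled.

```python
# Hypothetical word index: word -> set of ids of documents containing it.
word_index = {
    "xml": {"d1", "d2", "d3"},
    "database": {"d2", "d3"},
    "draft": {"d3"},
}
# Hypothetical numeric index: document id -> year.
year_index = {"d1": 1999, "d2": 2003, "d3": 2005}


def equals(index, value):
    """Equality test against a word index."""
    return set(index.get(value, ()))


def in_range(index, low, high):
    """Range test against a numeric index (inclusive bounds)."""
    return {doc for doc, value in index.items() if low <= value <= high}


# (xml And database) And Not draft
finished = (equals(word_index, "xml") & equals(word_index, "database")) - equals(
    word_index, "draft"
)

# ... Or anything dated 1990 through 2000
widened = finished | in_range(year_index, 1990, 2000)

# Whole documents are returned, sorted here by an index value.
ranked = sorted(widened, key=year_index.get)
print(ranked)
```

And, Or, and And Not correspond to set intersection, union, and difference over the document ids that each test selects.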

In addition to being able to associate XML documents with non-XML documents, TEXTML Server also has a Universal Converter that can convert more than 225 file formats (word processor, spreadsheet, presentation, drawing, bitmap, and so on) to XML. This uses Stellent's Outside In XML Export and extracts document "contents, presentation information, and metadata". Extracted information is stored in a document that uses the SearchML schema, also defined by Stellent. Converted documents can then be searched directly or associated with the original documents as indexing documents.

Other features of TEXTML Server include check-in/check-out, versioning, support for plug-ins that are run at insert time, and COM, Java, .NET, WebDAV, and OLE DB APIs. Security can be specified at the document, collection, or document-base level. System features include fault tolerance, replication, load management, and automated recovery.

TigerLogic XML Data Management Server (XDMS)

Developer: Raining Data
URL: http://www.rainingdata.com/products/tl/abouttl.html
License: Commercial
Database type: Pick
Entry last updated: January, 2003

TigerLogic XML Data Management Server (XDMS) is a database designed to store multiple kinds of data, including "structured, XML, and unstructured information". (Examples of the latter are office documents, email, and graphics.) Data is stored in the TigerLogic Native XML Data Store, which "leverages the Pick Universal Data Model". As XML documents are inserted into the database, an XML Profiler reads the incoming documents and gathers information to build indexes. These are used by the query processor, which supports XPath. TigerLogic XDMS also supports XSLT.

TigerLogic XDMS has a Java API and is also accessible over SOAP, HTTP, and JCA. It supports both DTDs and XML Schemas. Of interest, it supports XA transactions, and provides "on-line backup and recovery".

Timber

Developer: University of Michigan
URL: http://www.eecs.umich.edu/db/timber
License: Open Source (for non-commercial users)
Database type: Shore, Berkeley DB
Entry last updated: October, 2005

Timber is a native XML database that has an architecture "as close as possible to that of a relational database," in order to "reuse, where appropriate, the technologies developed for relational databases over the past several decades". The basis of Timber is "an XML algebra that manipulates sets of ordered, labeled trees". The primary difficulties of such an algebra include the "complex and variable structure of trees in a set, and issues of ordering."

By default, Timber uses Shore as its underlying data store. It can also use Berkeley DB. It supports a number of different types of indexes, including element, attribute, text, inverted, parent, and join indexes.

Timber supports a subset of XQuery. Users can enter queries either as XQuery expressions or as logical or physical query plans using Timber's logical or physical plan syntax. The latter allows advanced users to optimize queries by hand, as well as to perform some operations not supported through XQuery. Timber extends XQuery with functions for deleting nodes or their contents, updating the contents of a node, and inserting elements or attributes. In addition, Timber has a command line option for appending the contents of an XML document to a document already in the database.

Timber has command line, GUI, SOAP, and Web interfaces for performing both queries and administrative functions.

TOTAL XML (formerly Socrates XML)

Developer: Cincom
URL: http://tiger.cincom.com/pages/aboutTotalXML.html
License: Commercial
Database type: Object-relational, external relational through ODBC
Entry last updated: July, 2003

TOTAL XML is a native XML database that can store documents as objects or text. It can store data in its own object-relational data store, an external relational database, or a combination of the two. It is therefore possible to distribute the data for a document across multiple databases. In addition, TOTAL XML can store non-XML data, such as "standard relational data" and BLOBs.

Unlike other native XML databases, the objects used to store XML documents are specific to each DTD, but inherit from an object model that supports the Infoset. Thus, TOTAL XML has characteristics similar to both native XML databases and XML-enabled relational databases. As with an XML-enabled relational database, the data can be queried directly with SQL. However, documents cannot be stored until the user has defined a map from the DTD to the database. (A utility is available for generating maps for DTD-less documents.) As with a native XML database, TOTAL XML stores information about the full physical structure of a document and it is possible to round-trip documents.

TOTAL XML supports three different query languages. XML documents can be queried with XPath or an extended form of SQL, which can query relational data and BLOBs as well. Text data can be queried with regular expressions. TOTAL XML also supports the XML:DB API.

When XML documents are stored as DTD-specific objects, applications can access these objects through the object-oriented capabilities of JDBC 2.0. The objects can be used directly or converted to a DOM tree using the previously defined maps. The DOM is lazily populated, so data is retrieved from the database only when needed. When documents are stored as text, applications can access them with JDBC or ODBC and they are returned as text.
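
Lazy population can be sketched with a node class that fetches its children only on first access. This is a generic illustration, not TOTAL XML's implementation: the LazyNode class, the in-memory STORE that stands in for the database, and the fetch_log used to show which nodes were read are all assumptions.

```python
class LazyNode:
    """A tree node whose children are fetched from storage on first access."""

    def __init__(self, node_id, fetch_children):
        self.node_id = node_id
        self._fetch_children = fetch_children
        self._children = None  # None means "not loaded yet"

    @property
    def children(self):
        if self._children is None:
            child_ids = self._fetch_children(self.node_id)
            self._children = [
                LazyNode(child_id, self._fetch_children) for child_id in child_ids
            ]
        return self._children


# Hypothetical stand-in for the database: parent id -> child ids.
STORE = {
    "doc": ["head", "body"],
    "head": [],
    "body": ["p1", "p2"],
    "p1": [],
    "p2": [],
}
fetch_log = []


def fetch_children(node_id):
    fetch_log.append(node_id)  # record each trip to the "database"
    return STORE[node_id]


root = LazyNode("doc", fetch_children)
body = root.children[1]
print([child.node_id for child in body.children])
print(fetch_log)
```

Only the nodes along the path actually visited are read; the children of "head" are never requested from the store.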

TOTAL XML can integrate data from legacy databases (including VSAM, IMS, IDM, and Adabas) using Striva DETAIL. The integrated data can be live or a copy stored in TOTAL XML.

TOTAL XML ships with a number of tools, including utilities to generate classes and maps from DTDs and administration tools.

Virtuoso

Developer: OpenLink Software
URL: http://www.openlinksw.com/virtuoso/
License: Commercial
Database type: Proprietary. Relational through ODBC
Entry last updated: November, 2000

Virtuoso is a heterogeneous join engine featuring security, transactions (including two-phase commit), and replication. Its query engine supports heterogeneous views, stored procedures, scrollable cursors, and full-text search. It accesses external data sources through ODBC, as well as having its own relational data store.

Virtuoso supports XML in a number of ways. First, it contains a native XML data store, which is non-relational and can store and index XML documents in parsed or unparsed form. Second, it can transfer data from relational databases to XML documents (although not in the other direction), using the same mapping found in the FOR XML clause in Microsoft SQL Server. Third, it includes an implementation of XPath. Although this only works on "native" XML data, relational data can be included by first transferring it to XML. Finally, it includes support for XSLT, executing stored procedures through SOAP, and WebDAV.

[February, 2002] Virtuoso has a demo implementation of XQuery that runs over its database. Of interest, this can query virtual documents, such as those built at run time from a relational database.

XDBM

Developer: Matthew Parry, Paul Sokolovsky
URL: http://sourceforge.net/projects/xdbm/
License: Open Source
Database type: Proprietary (Node-based)
Entry last updated: November, 2000

XDBM uses an interface that is "based upon the DOM standard". It stores XML documents in a pre-parsed, indexed format and resolves memory problems by leaving parts of the document on disk until they are needed.

XDB

Developer: ZVON.org
URL: http://zvon.org/index.php?nav_id=61
License: Open Source
Database type: Relational (PostgreSQL only?)
Entry last updated: November, 2001

A native XML database built on a relational database. (It is not clear if databases other than PostgreSQL are supported.) The database stores data in a proprietary set of tables and includes a partial implementation of XPath. Written in C++.

XediX TeraSolution

Developer: AM2 Systems
URL: http://www.am2systems.com/technologies-EN.html
License: Commercial
Database type: Proprietary
Entry last updated: June, 2005

XediX TeraSolution is a native XML database built on a proprietary data store. Users can specify which elements and attributes to index, and searches are performed with a proprietary language that "permits addressing in the XML tree in accordance with XPath expressions". Search results can further be refined through the use of regular expressions.

XediX TeraSolution can also store non-XML documents through the use of external entities. Apparently, an XML document provides metadata for one or more non-XML documents and references them through external entities. The non-XML documents are stored alongside the metadata document and are also indexed and searched via the XML metadata document.

Security is provided through the use of users and groups, and can be applied at any level of granularity within XML documents. This allows administrators to assign access rights such that specific users can view only parts of a given XML document.

X-Hive/DB

Developer: X-Hive Corporation
URL: http://www.x-hive.com/products/db/index.html
License: Commercial
Database type: Proprietary
Entry last updated: May, 2005

X-Hive/DB is a native XML database that includes support for XQuery, XPath, XML Schemas, DOM Level 3, XSLT, and XSL-FO, as well as transactions, user- and group-level access control, JAAS (Java Authentication and Authorization Service), replication, load balancing across multiple servers, and BLOB storage. Additional features include:

o Indexes. X-Hive/DB supports element name, value, full-text, and custom indexes, as well as "library, ID attribute, and context-conditioned" indexes. Full-text indexes use a proprietary indexing mechanism; these indexes can be searched from XQuery through the xhive:fts (full-text search) function. In addition, users can integrate their own full-text index engines. Custom indexes are based on a user-implemented DOM NodeFilter.

o Linking. A link engine that implements XLink and XPointer supports bi-directional links, link-bases, and link management.

o External data. The JDBC Bridge can retrieve a snapshot of relational data through JDBC. The data is converted to XML using a table model and can be integrated into other documents.

o WebDAV. Remote clients can directly access collections and documents in the database through WebDAV.

o SOAP. Applications can store and retrieve documents, execute XQuery queries, retrieve XML schemas, and so on through SOAP.

o Custom JSP tags. A tag library for calling X-Hive/DB through Java Server Pages.

o J2EE Resource Adapter. An implementation of J2EE Resource Adapter allows X-Hive/DB applications to use the transaction management facilities of an EJB application server.

o Versioning. Both linear and branched versioning (multiple versions of the same document) are supported.

In addition, an implementation of XUpdate (from the XML:DB Initiative) that uses Lexus may be downloaded from the X-Hive Web site.

Xindice (see also dbXML)

Developer: Apache Software Foundation
URL: http://xml.apache.org/xindice/
License: Open Source
Database type: Proprietary (Node-based)
Entry last updated: June, 2005

Xindice is a native XML database written in Java that is designed to store large numbers of small XML documents, as well as non-XML documents. It can index element and attribute values and compresses documents to save space. Documents are arranged into a hierarchy of collections and can be queried with XPath. (Collection names can be used as part of the XPath query syntax, meaning it is possible to perform XPath queries across documents.) For updates, Xindice supports the XUpdate language from the XML:DB Initiative. Finally, Xindice comes with an experimental linking language that is similar to XLink, and allows users to replace or insert content in an XML document at query time.

Xindice supports three APIs: the XML:DB API (also from the XML:DB Initiative), a CORBA API, and an XML-RPC plugin which supports access from languages such as PHP, Perl, and Applescript. In addition, Xindice provides XMLObjects, which allows users to extend the server functionality.

Xindice comes with a set of command line tools for using and administering the database, as well as complete documentation.

XML Transactional DOM

Developer: Ontonet
URL: http://ontonet.com/XML_Product.html
License: Commercial
Database type: Object-oriented (R1 Enterprise)
Entry last updated: August, 2003

XML Transactional DOM is a native XML database built on top of Ontonet's R1 Enterprise object-oriented database. XML documents are stored in the database as Infoset objects. They may be created by passing an existing XML document to the database, which is then parsed and used to create Infoset objects, or directly through a DOM tree.

The XML Transactional DOM implements DOM Level 2 on top of the Infoset objects. It lazily instantiates nodes and uses Java Soft Reference Caching to allow the Java Garbage Collector to collect nodes when the JVM runs out of memory. (Collected nodes that are still in use are reinstantiated from the database as needed.) The XML Transactional DOM also implements XQuery over the same Infoset objects, with query results returned as DOM nodes.
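The lazy-instantiation pattern described above can be sketched as follows. This is only an analogy: Python has no soft references, so weak references stand in for them here (a weakly referenced object can be collected as soon as nothing else uses it, whereas a Java soft reference is normally cleared only under memory pressure). The Node and LazyNodeCache classes and the dictionary used as the "database" are hypothetical names for the example, not part of the product.

```python
import gc
import weakref


class Node:
    """A minimal stand-in for a DOM node."""

    def __init__(self, node_id, name):
        self.node_id = node_id
        self.name = name


class LazyNodeCache:
    """Instantiate nodes on first access and let unused ones be collected."""

    def __init__(self, store):
        self._store = store                          # stand-in for the database
        self._cache = weakref.WeakValueDictionary()  # does not keep nodes alive
        self.loads = 0                               # number of reads from the store

    def get(self, node_id):
        node = self._cache.get(node_id)
        if node is None:                             # never loaded, or already collected
            node = Node(node_id, self._store[node_id])
            self._cache[node_id] = node
            self.loads += 1
        return node


store = {1: "Books", 2: "Book"}
cache = LazyNodeCache(store)

first = cache.get(1)          # loaded from the store
same = cache.get(1)           # still referenced, so served from the cache
reused = first is same

del first, same               # drop the last strong references
gc.collect()
again = cache.get(1)          # re-instantiated from the store on demand
```

Node 2 is never touched, so it is never read from the store, which is the point of populating the tree lazily.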

Transactions are supported by an additional interface on the Document object. This has a method to return a DOMTransaction object, which implements JTA (Java Transaction API) transactions as well as savepoints. Savepoints are implemented using the same methods as are used in JDBC. They can be nested to arbitrary depth.
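In JDBC, savepoints are handled through Connection.setSavepoint, Connection.rollback(Savepoint), and Connection.releaseSavepoint. The nesting behaviour can be illustrated with SQLite's SAVEPOINT statements, used here purely as a stand-in; the DOMTransaction interface itself is not shown, and the node table is a hypothetical example.

```python
import sqlite3

# isolation_level=None lets the script issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO node (name) VALUES ('Books')")

conn.execute("SAVEPOINT outer_sp")                  # first savepoint
conn.execute("INSERT INTO node (name) VALUES ('Book')")

conn.execute("SAVEPOINT inner_sp")                  # nested savepoint
conn.execute("INSERT INTO node (name) VALUES ('Price')")

conn.execute("ROLLBACK TO SAVEPOINT inner_sp")      # undo only the innermost work
conn.execute("RELEASE SAVEPOINT outer_sp")          # keep everything up to here
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT name FROM node ORDER BY id")]
```

Rolling back to the inner savepoint discards the Price row while the work done before it survives the commit.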

Additional features of the XML Transactional DOM include URI-addressable document collections, a JAXP implementation, the ability to store XML documents in their original form (such as for legal reasons), and serialization of DOM trees or fragments to Java OutputStreams or Writers.

XpSQL

Developer: Makoto Yui
URL: http://gborg.postgresql.org/project/xpsql/projdisplay.php
License: Open Source
Database type: Relational (PostgreSQL)
Entry last updated: March, 2004

XpSQL is a native XML database built on top of PostgreSQL. It stores documents by decomposing them into fragments and storing these fragments in a set of predefined tables. XpSQL has a command line utility for loading XML documents into the database, as well as PostgreSQL functions for retrieving document fragments by node ID. It also has PostgreSQL functions that implement DOM Level 2 and XPath.

There are two main XPath functions. XPath2SQL converts an XPath query into an SQL query over the tables used to store XML documents. The SQL query can then be used to "execute" the XPath query. Results from the SQL query are returned as XML(?). The XPath_Eval function accepts an XPath query and returns rows containing two columns: document ID and node ID. In other words, it returns a list of nodes that satisfy a given XPath query. XPath_Eval is typically used in a FROM clause. For example, the following query uses XPath_Eval to retrieve the value of Price elements. (xml_node is the table used by XpSQL to store individual nodes.)

   SELECT xml_node.value
   FROM xml_node, XPath_Eval('/Books/Book/Price') AS price_nodes
   WHERE xml_node.id = price_nodes.id
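To make the idea concrete, here is a small, self-contained sketch of the same pattern: a document is decomposed into rows of a node table, an XPath evaluation yields (document ID, node ID) pairs, and those pairs are joined back to the node table. It uses SQLite and Python rather than PostgreSQL, and the column layout of xml_node, the shred helper, and the xpath_eval helper are assumptions for illustration; XpSQL's real schema and functions may differ.

```python
import sqlite3
import xml.etree.ElementTree as ET

BOOKS = """<Books>
  <Book><Title>XML Basics</Title><Price>29.95</Price></Book>
  <Book><Title>Databases</Title><Price>45.00</Price></Book>
</Books>"""


def shred(conn, doc_id, xml_text):
    """Store every element as one row of xml_node, numbered in document order."""
    root = ET.fromstring(xml_text)
    ids = {}
    for node_id, elem in enumerate(root.iter(), start=1):
        ids[elem] = node_id
        conn.execute(
            "INSERT INTO xml_node (doc_id, id, name, value) VALUES (?, ?, ?, ?)",
            (doc_id, node_id, elem.tag, (elem.text or "").strip()),
        )
    return root, ids


def xpath_eval(root, ids, doc_id, path):
    """Hypothetical stand-in for XPath_Eval: return (doc_id, node_id) pairs."""
    holder = ET.Element("document")   # synthetic parent so "/Books/..." resolves
    holder.append(root)
    return [(doc_id, ids[elem]) for elem in holder.findall("." + path)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xml_node (doc_id INTEGER, id INTEGER, name TEXT, value TEXT)")
root, ids = shred(conn, 1, BOOKS)

# Materialize the XPath result so it can take part in a join.
conn.execute("CREATE TEMP TABLE price_nodes (doc_id INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO price_nodes VALUES (?, ?)",
    xpath_eval(root, ids, 1, "/Books/Book/Price"),
)

prices = [
    row[0]
    for row in conn.execute(
        "SELECT xml_node.value FROM xml_node, price_nodes "
        "WHERE xml_node.id = price_nodes.id ORDER BY xml_node.id"
    )
]
```

In XpSQL the XPath function appears directly in the FROM clause; the temporary table here only imitates that because SQLite has no such function.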

XQuantum XML Database Server

Developer: Cognetic Systems
URL: http://www.cogneticsystems.com/server.html
License: Commercial
Database type: Proprietary
Entry last updated: June, 2006

XQuantum XML Database Server is a native XML database built on a proprietary data store. It supports a subset of XQuery, a subset of the XQuery full-text specification, and XSLT.

XQuantum optimizes queries with a cost-based algorithm, which uses statistics about the data to optimize the search process. The query processor also relies on "recursive XML indexing" (a schemaless indexing method), lazy query evaluation, and stream processing of queries.

XQuantum supports static typing through its own typing mechanism, which "generalizes XQuery's sequence type syntax to include full regular expression types" and is used instead of XML Schemas. Types (effectively schemas for individual XML documents) can be declared in the prolog of an XQuery query or in external type modules. They are applied in the query through explicit validation and are used to provide type information to the query processor.

XQuantum includes a Web server, which allows it to use HTTP as its API. That is, queries are embedded in URLs and results are returned as an XML stream. Queries can also be placed in XQuery Server Pages. These are preferable for URLs exposed to the public, as they are more secure (the query is not exposed to the public) and less fragile (the query can be changed without changing the URL).
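Embedding a query in a URL means the query text has to be percent-encoded, since XQuery uses characters such as $, quotes, and slashes. The sketch below shows only that encoding step. The endpoint and the parameter name xquery are hypothetical, because the source does not document XQuantum's actual URL scheme.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and parameter name, chosen only for illustration.
BASE_URL = "http://localhost:8080/query"

query = 'for $b in doc("books.xml")/Books/Book where $b/Price > 30 return $b/Title'

# urlencode percent-encodes reserved characters and turns spaces into "+".
url = BASE_URL + "?" + urlencode({"xquery": query})

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["xquery"][0]
```

The encoded URL also shows why such links are fragile and revealing: the whole query is visible to anyone who sees the address.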

XQuantum is also available as the XQuantum XML Database Appliance, a dedicated server running Linux and XQuantum.

XStreamDB Native XML Database

Developer: Bluestream Database Software Corp.
URL: http://www.bluestream.com/products/xstreamdb32
License: Commercial
Database type: Proprietary (Node-based)
Entry last updated: May, 2003

From the company:

"Bluestream XStreamDB(tm) version 3.0 is a native XML database, built in pure Java with XQuery, full text search, Java API, and support for schemas, DTDs, and binary and other non-XML datatypes. XStreamDB is accessible using a JDBC-like Java API, the XStreamDB Explorer GUI application, scripter, or using WebDAV to reach documents exposed as URIs. Security is enforced using MD5 message digest authentication and a user permissions scheme."

"XML documents are stored in a compressed object representation, using Bluestream's Streamstore database storage engine (also available separately). The database has a full transactions architecture that meets the four ACID requirements: Atomic, Consistent, Isolated, and Durable. Transaction support includes read, write, and update locks, as well as deadlock detection and victim selection. Commits and rollbacks are supported so that the system can recover in the case of a crash."

"It supports multiple, concurrent sessions, as well as session pooling, and recycles free space automatically, so compaction is not required. In addition, it allows partial document updates and document fragment insertion."

"Documents are stored in 'roots' in 'databases' on the XStreamDB server. A root is equivalent to a collection. Schemas or DTDs can be loaded and stored in a collection of 'schemas', and users are kept in a collection of 'users'. Access permissions can be assigned on documents with the built-in user permission scheme. XStreamDB stores both XML and binary document types, with associated mimetypes."

"Collections of XML documents in document roots can be forced to be schema valid by attaching a schema to the root. XStreamDB supports both W3C XML Schemas and DTDs. The XStreamDB resource manager can assign resource information to documents to expose them as URI unique identifiers (Universal Resource Identifier) through WebDAV, or the Resources API. Databases and roots are exposed as 'categories', and documents are exposed as 'resources' within those categories. The resource manager supports mimetypes, created sub-categories, locking, and naming. Resources can also be checked out and checked in to the file system by users."

"XStreamDB supports the XQuery query language for XML data, and has extended it to support insert, update, and full text searching capabilities."

"XStreamDB supports both value indexes and full text indexing. XML document roots with value indexes, will index on the value of data in a specified element or attribute. Full text indexes store a complete index of all content in all documents in the root."

"XQuery queries with full text expressions will find text within XML document content using wildcard matching, word proximity, and phrase matching. Results are matched to the element or attribute in matching documents, and can be automatically marked."

A note about the history of XStreamDB, also from the company:

"XStreamDB was introduced by Bluestream Database Software Corp. in the spring of 2000. Soon after its introduction, Bluestream was acquired by XML Global and its XML database product renamed GoXML DB. In September 2002, XML Global spun off the XML database division, reinstating the original company and product names. Bluestream XStreamDB version 3.0 is built by Bluestream and marketed by XML Global and other authorized resellers."

Xyleme Zone Server

Developer: Xyleme SA
URL: http://www.xyleme.com/xml_server
License: Commercial
Database type: Proprietary (Natix)
Entry last updated: July, 2002

Xyleme Zone Server is a native XML database that uses Natix as its engine. It supports XQuery and indexes documents at run time as they are added to the database. Xyleme Zone Server can run in clusters and can distribute queries across multiple machines. Local applications can access the server directly from C++ or Java, and remote applications can access it with SOAP. Security is provided on a per-document basis and the product ships with a set of administration tools.

Users can categorize documents according to their semantic type -- financial statements, product documentation, legal documents, etc. Each category is defined by an "abstract view", which is mapped to the schema of each class of documents in the category. This allows users to query all documents in a category by querying the view, rather than having to query each class of documents separately. The query processor translates the query against the view into queries against each schema and returns results that correspond to the view.
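The translation step can be pictured with a small sketch: a view defines abstract fields, a mapping gives the path of each field in every document class, and a query on the view is rewritten into one path lookup per schema. The document classes, field names, and mapping format below are all invented for the example and are not Xyleme's actual view language.

```python
import xml.etree.ElementTree as ET

# Two classes of documents in one category, each with its own schema.
INVOICE = "<Invoice><Customer>Acme</Customer><Total>120.50</Total></Invoice>"
RECEIPT = "<Receipt><Buyer><Name>Bolt Ltd</Name></Buyer><Sum>80.00</Sum></Receipt>"

# Hypothetical view definition: view field -> path in each document class.
VIEW_MAPPING = {
    "Invoice": {"party": "./Customer", "amount": "./Total"},
    "Receipt": {"party": "./Buyer/Name", "amount": "./Sum"},
}


def query_view(documents, fields):
    """Translate a query on view fields into per-schema paths and merge results."""
    results = []
    for text in documents:
        root = ET.fromstring(text)
        paths = VIEW_MAPPING[root.tag]       # mapping for this document class
        results.append({field: root.findtext(paths[field]) for field in fields})
    return results


rows = query_view([INVOICE, RECEIPT], ["party", "amount"])
```

The caller names only the view fields; the per-schema paths stay hidden behind the mapping, and every result has the shape of the view.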

Users can also subscribe to a service that notifies them of changes to documents. Individual subscriptions are defined as queries, using a proprietary language that (apparently) extends XQuery. Subscription queries run at specified intervals and applications check the output of these queries to determine what has changed.

Of interest, Xyleme SA provides an online repository of Web pages. This may be queried across the Web, presumably as part of queries that also query local data.

Posted on 2008-07-29 13:50 by gembin. Categories: Database, XML
