Swank Wiki
Swank v0.04.04

Semi-Structured Databases

Unorganized Links and Notes

http://www.unixspace.com/context/databases.html has a good summary of different data models.

Several systems equate "semi-structured data" with XML.  I am not sure this equivalence always holds, and I currently use XML mainly as a convenient storage format.

Research Issues in Structured and Semistructured Databases

"On Database Theory and XML" is much too focused on the issues of mapping XML to relational databases, and is even defensive ("XML was imposed on us"), but it makes a good point about developing XML standards: "standards and concepts are sometimes created faster than research communities can validate them."

Implementations

Early research in semi-structured databases used a graph-based conceptual model.  This paper from 1999 is a good example: Designing Good Semi-Structured Databases and Conceptual Modeling

SODA2: its homepage at unsw.edu is broken, but a cached version is here (from 2000).

LORE: "declared a success" and abandoned in 2000.

XML databases

Recent research seems to concentrate on XML databases, since the tree-shaped graph structure used by the earlier models maps well onto an XML tree, and query languages are already defined.
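To illustrate why the mapping is natural, here is a minimal sketch that turns an edge-labeled tree into an XML tree and queries it.  The record, the `to_xml` helper, and the field names are all made up for illustration, and the sketch only handles tree-shaped data; a general graph with shared nodes or cycles would need something like ID references, which is not shown.

```python
import xml.etree.ElementTree as ET

def to_xml(label, value):
    """Convert an edge-labeled tree (nested dicts, lists, atoms) to an XML element."""
    elem = ET.Element(label)
    if isinstance(value, dict):
        for child_label, child_value in value.items():
            # A list stands for several outgoing edges that share one label.
            children = child_value if isinstance(child_value, list) else [child_value]
            for child in children:
                elem.append(to_xml(child_label, child))
    else:
        # Leaf node: store the atomic value as element text.
        elem.text = str(value)
    return elem

# Hypothetical semi-structured record: the two authors carry different fields.
book = {
    "title": "Example",
    "author": [
        {"name": "A. Writer", "email": "a@example.org"},
        {"name": "B. Writer"},
    ],
}

root = to_xml("book", book)

# Path queries follow edge labels, as in the graph-based models.
names = [e.text for e in root.findall("./author/name")]
emails = [e.text for e in root.findall(".//email")]
```

Note that the second author simply lacks an `email` element; no schema has to declare the field optional, which is the "semi-structured" part.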

wikipedia:XML database

There is an XML:DB initiative for developing standards for XML databases.

eXist-db (uses Lucene internally for full text indexing)

MonetDB/XQuery looks fairly good

Apache Xindice

BaseX does not seem mature; for example, its XQuery implementation does not yet use indexes (as of v5.0).  It has nice interactive features, though.

Sedna

Whole bunch of links here: http://www.rpbourret.com/xml/XMLDBLinks.htm

The XML:DB initiative defines three requirements for a Native XML Database (NXD), which are succinctly summarized as:

  1. The database is specialized for storing XML data and stores all components of the XML model intact.
  2. Documents go in and documents come out.
  3. An NXD may not actually be a standalone database at all.

This is not a whole lot to satisfy, and it says nothing about the ability to run queries, perform updates, enforce integrity, establish relations other than the inherent document tree hierarchy, or do other things we might want from structured-ness.  Even the "simplest thing which can possibly work" approach currently taken by Swank satisfies these requirements, and goes beyond them by providing a query facility.  These deficiencies are addressed by other standards and subprojects, especially XQuery.
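To show how little the three requirements demand, here is a minimal sketch of a store that meets them and adds a small query facility.  This is not Swank's actual code; `TinyXmlStore`, its method names, and the sample documents are hypothetical.  Queries use the limited XPath subset of Python's `xml.etree.ElementTree`, which is much weaker than XQuery.

```python
import xml.etree.ElementTree as ET

class TinyXmlStore:
    """Hypothetical in-memory document store (illustration only)."""

    def __init__(self):
        self._docs = {}  # document name -> original XML text

    def put(self, name, xml_text):
        # Documents go in: reject anything that is not well-formed XML.
        ET.fromstring(xml_text)  # raises ET.ParseError on bad input
        self._docs[name] = xml_text

    def get(self, name):
        # Documents come out: the text is returned exactly as stored.
        return self._docs[name]

    def query(self, path):
        """Return (document name, element) pairs matching an ElementTree path."""
        hits = []
        for name, text in sorted(self._docs.items()):
            root = ET.fromstring(text)
            hits.extend((name, elem) for elem in root.findall(path))
        return hits

store = TinyXmlStore()
store.put("a", "<page><title>Home</title><link href='b'/></page>")
store.put("b", "<page><title>Notes</title></page>")

titles = [(name, e.text) for name, e in store.query("./title")]
links = [e.get("href") for _, e in store.query(".//link[@href]")]
```

Nothing here handles updates inside a document, integrity constraints, or relations between documents: the `href` on the first page is just a string, and nothing checks that page `b` exists.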