Tukwila Overview

Use of Tukwila

Linux Execution Engine

From any Linux box, go to /homes/gws/zives/tukwila.

Add /homes/gws/zives/lib to your LD_LIBRARY_PATH and LIBRARY_PATH (this is necessary for several loadable libraries).

From csh:
- setenv LD_LIBRARY_PATH /homes/gws/zives/lib:$LD_LIBRARY_PATH
- setenv LIBRARY_PATH /homes/gws/zives/lib:$LIBRARY_PATH
From bash:
- export LD_LIBRARY_PATH=/homes/gws/zives/lib:$LD_LIBRARY_PATH
- export LIBRARY_PATH=/homes/gws/zives/lib:$LIBRARY_PATH

Start the execution engine:

./exec

Windows Execution Engine

Run Tukwila.exe. All DLLs should be in the same directory.

GUI

The GUI can be run with:

./TukwilaGUI [{hostname}]

where {hostname} defaults to localhost and is the name of the machine running the execution engine.

To test your connection, open "Maya1.xq" or "GS.xq" and run it.

Java API

The API is documented in InvokeXQuery.html.

Web Status Monitoring

Go to http://{hostname}:7777 to see the system status.

Code Overview

Execution Engine Core

Memory/Page Management

The memory management system is divided into several components. The core component is CMemoryManager, which is provided with a pool size (currently specified in tukwila_config.xml) and grants memory to the other managers in terms of extents, via the requestExtent() and freeExtent() calls.
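
The extent-granting idea can be illustrated with a minimal sketch. Only the names requestExtent() and freeExtent() come from the description above; the class name ExtentPool, the signatures, the fixed extent size, and the first-fit policy are assumptions made for illustration and are not CMemoryManager's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CMemoryManager: a fixed pool carved into
// equal-sized extents. Names and signatures are illustrative only.
class ExtentPool {
public:
    ExtentPool(std::size_t extentCount, std::size_t extentBytes)
        : extentBytes_(extentBytes),
          pool_(extentCount * extentBytes),
          inUse_(extentCount, false) {}

    // Grant the first free extent, or nullptr if the pool is exhausted.
    char* requestExtent() {
        for (std::size_t i = 0; i < inUse_.size(); ++i) {
            if (!inUse_[i]) {
                inUse_[i] = true;
                return pool_.data() + i * extentBytes_;
            }
        }
        return nullptr;
    }

    // Return an extent previously granted by requestExtent().
    void freeExtent(char* extent) {
        std::size_t index =
            static_cast<std::size_t>(extent - pool_.data()) / extentBytes_;
        assert(index < inUse_.size() && inUse_[index]);
        inUse_[index] = false;
    }

    // Number of extents that are currently not granted to anyone.
    std::size_t freeExtents() const {
        std::size_t count = 0;
        for (std::size_t i = 0; i < inUse_.size(); ++i)
            if (!inUse_[i]) ++count;
        return count;
    }

private:
    std::size_t extentBytes_;
    std::vector<char> pool_;
    std::vector<bool> inUse_;
};
```

In this sketch a client such as a buffer manager would call requestExtent() until it has enough memory and hand every extent back through freeExtent() when it shuts down.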

Extents are used by the following components:

CBufferManager, which creates the buffer pool and grants portions of the pool to implementors of the IRelation interface, currently descendants of CRelationBase:

CXMLRelation is responsible for buffering and retrieving XML content. It encodes this content in compressed form. The compression is handled by internal hash tables and a tokenizer. Note that these structures aren't currently persistent, nor is the CXMLRelation itself.
CFlatRelation is responsible for buffering and retrieving standard tuples. In general, this is how we
CMappedFile is a memory-mapped file, useful for nested loops joins and such. It adds a variety of additional functions to CFlatRelation, for instance the ability to search based on a key. These are basically equivalent to the features supported by CHashTable.

CRelationBase's descendants are assumed to have a single reader and a single writer. This could be changed by switching from CMutex to CSem, but CSem is currently untested.

Internally, there is a RelationManager that creates and tracks mappings between relation IDs and paths.

CHashTable, which directly allocates a hash table from the memory manager and allows explicit control of buckets. There are functions to find tuples, replace them, etc.

Hash tables are not thread-safe; the assumption is that all reads and writes will be synchronous.

CPageManager, which requests a group of pages from the buffer manager and allows for indirect access to them. Basically, the role of the PageManager is to reserve a fixed portion of the buffer pool for use by an operator (instead of relying on the default LRU replacement of the pool).

Typically, CPageManager itself isn't directly used -- instead, its descendant, CTupleManager, is. CTupleManager allows us to directly read and write tuples to/from pages that were allocated in a block.

A particular user of CTupleManager is the external sort, implemented in CTupleSorter.

Data values

IOperand is the basic entity of manipulation, and it can be a computed expression or a primitive data value. Both algebraic operators and data values should subclass from this interface. Operands can be queried as to their return type, null-ness, and so on.

Note that null-ness is not preserved by an entity when it is written (to disk or to memory). Instead, for efficiency reasons the tuple container should record and maintain null-ness of its attributes. (This way each null only takes 1 bit of storage.)
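
A minimal sketch of how a tuple container might keep one null bit per attribute is shown here. The class NullBitmap and its methods are hypothetical and are not taken from the Tukwila sources; they only illustrate the 1-bit-per-null storage cost mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tuple-level null tracker: one bit per attribute.
class NullBitmap {
public:
    explicit NullBitmap(std::size_t attributeCount)
        : count_(attributeCount), bits_((attributeCount + 7) / 8, 0) {}

    // Mark attribute `attr` as null or non-null.
    void setNull(std::size_t attr, bool isNull) {
        assert(attr < count_);
        std::uint8_t mask = static_cast<std::uint8_t>(1u << (attr % 8));
        if (isNull)
            bits_[attr / 8] = static_cast<std::uint8_t>(bits_[attr / 8] | mask);
        else
            bits_[attr / 8] = static_cast<std::uint8_t>(bits_[attr / 8] & ~mask);
    }

    // True if attribute `attr` is currently marked null.
    bool isNull(std::size_t attr) const {
        assert(attr < count_);
        return ((bits_[attr / 8] >> (attr % 8)) & 1u) != 0;
    }

    // Bytes needed to store the null flags for the whole tuple.
    std::size_t storageBytes() const { return bits_.size(); }

private:
    std::size_t count_;
    std::vector<std::uint8_t> bits_;
};
```

With this layout a 10-attribute tuple needs only 2 bytes of null flags, and the attribute values themselves never have to encode null-ness.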

IValue is the basic data value, and it has a sub-interface template called IPrimitiveType<T> that is used for primitive scalar types.

Specific classes include:

CInteger
CDouble
CBool
CString
CXML

Values generally implement:

read() and write() to memory and to files
assign() of values
get...Value() and get...ValueAt() to read and return values (optionally at pointers)
cmp() and cmpAt() to compare with values (optionally at pointers)

CTuple is the basic tuple. It is currently not a subclass of IValue, but this could easily be changed. The big question is what operators would return tuples (e.g., instead of relations), and how scalar-based operators would handle things when given a tuple.

Tuples are responsible for knowing which of their attributes are null, and they store and read this from memory. They call the appropriate attribute containers to make sure nullness is carried down to the attribute, and read from it.

Operators

IOperator is a child of IOperand, and thus every operator can return an operand that can be used in an expression (and similarly, every IValue can be an operand). There are three "interface"-style classes that define the arity of an operator:

INullary
IUnary
IBinary

Additionally, there is an IBoolean interface for classes that return booleans (namely, boolean operators and comparisons).

Operators must implement computeTypeInfo(), which determines their return types and creates any internal data structures. This function is given the schema of the tuple upon which it can operate.
Operators must implement rebind(), which is given a tuple that can supply values to variables.
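
The two-step protocol (resolve against a schema once, then rebind to each tuple) might look roughly like the sketch below. The types Schema, Tuple, Operand, NamedVariable, and Add are simplified stand-ins invented for this example, and values are restricted to integers; only the method names computeTypeInfo() and rebind() come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative types only; the real schema and tuple classes differ.
typedef std::vector<std::string> Schema;  // attribute names
typedef std::vector<int> Tuple;           // integer-only attribute values

struct Operand {
    virtual ~Operand() {}
    virtual void computeTypeInfo(const Schema& schema) = 0;  // once per plan
    virtual void rebind(const Tuple& tuple) = 0;             // once per tuple
    virtual int value() const = 0;
};

// A reference to a value within a tuple, resolved by attribute name.
class NamedVariable : public Operand {
public:
    explicit NamedVariable(const std::string& name)
        : name_(name), index_(0), tuple_(nullptr) {}

    void computeTypeInfo(const Schema& schema) override {
        for (index_ = 0; index_ < schema.size(); ++index_)
            if (schema[index_] == name_) return;
        assert(false && "variable not found in schema");
    }
    void rebind(const Tuple& tuple) override { tuple_ = &tuple; }
    int value() const override { return (*tuple_)[index_]; }

private:
    std::string name_;
    std::size_t index_;
    const Tuple* tuple_;
};

// A binary arithmetic operator that forwards both calls to its children.
class Add : public Operand {
public:
    Add(Operand& left, Operand& right) : left_(left), right_(right) {}

    void computeTypeInfo(const Schema& schema) override {
        left_.computeTypeInfo(schema);
        right_.computeTypeInfo(schema);
    }
    void rebind(const Tuple& tuple) override {
        left_.rebind(tuple);
        right_.rebind(tuple);
    }
    int value() const override { return left_.value() + right_.value(); }

private:
    Operand& left_;
    Operand& right_;
};
```

The design choice illustrated is that name lookup happens once in computeTypeInfo(), so rebind() stays cheap enough to call for every tuple.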

Subclasses of IOperator include:

IBoolean
- CUnaryBoolean
- CBinaryBoolean
- CComparison
CUnaryArithmetic
CBinaryArithmetic
CNamedVariable, which is a reference to a value within a tuple

IQueryOperator is the interface for all query operators. Typically operators actually subclass one of its descendants, which are named according to the number of child operators:

ILeafOperator has no child query operators and is typically implemented directly. Instances include tablescans, tuple producers, and x-scans.
IOneChildOperator has descendants:
- IProjectingOperator affects the arity of the child operator's tuple (e.g., projection)
  - IAttributeAddingOperator adds precisely one attribute to the child's tuple (e.g., XML tagging)
- IPassThroughOperator does not modify the child operator's arity (e.g., selection)
ITwoChildOperator is a binary query operator, e.g., a union or a join.

IQueryOperator is not currently a subclass of IOperator because it isn't clear that we want to interchange between tables and scalars. However, it is possible to extend in this way (and in fact, there are datatypes such as GROUP that are reserved for such a feature).

The (standard) order of iterator execution goes:

open()
- opSpecificOpen() does operator-specific initialization before metadata exists.
  It sets _initFirstTuple if setNewTuple() should be called on the first tuple (as opposed to having this called by opSpecificNext()).
- createMetadata() determines the output metadata based on the child or any initialization
- bindSyms() computes any necessary bindings (e.g. between select exprs and the schema)
next()
- if complete then exit
- if new plan then set new child
- set _firstTuple
- if _initFirstTuple then
  - setNewTuple() is called here
- opSpecificNext() which should call setNewTuple() as necessary, e.g., on firstTuple()
- setNewTuple() should call rebind() on any predicates
close() should deallocate any tuples created by the operator

Tuple creation and deletion is done by the parent operator, after it has read schema information from the child. setNewTuple() establishes the necessary bindings between the output tuple and the child tuple.
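
The call order listed above can be sketched as a template-method base class. This is a simplified illustration, not the actual IQueryOperator code: the class names are hypothetical, the "new plan / new child" step is omitted, and it assumes that _initFirstTuple triggers setNewTuple() only before the first tuple.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical base class; only the call order is taken from the text above.
class QueryOperatorSketch {
public:
    QueryOperatorSketch()
        : _initFirstTuple(false), _firstTuple(true),
          _opened(false), _complete(false) {}
    virtual ~QueryOperatorSketch() {}

    void open() {
        opSpecificOpen();  // may set _initFirstTuple
        createMetadata();
        bindSyms();
        _opened = true;
    }

    // Returns false once the operator has no more tuples.
    bool next() {
        assert(_opened);
        if (_complete) return false;
        if (_firstTuple && _initFirstTuple) setNewTuple();
        bool produced = opSpecificNext();
        _firstTuple = false;
        _complete = !produced;
        return produced;
    }

    virtual void close() = 0;  // should release tuples the operator created

protected:
    virtual void opSpecificOpen() = 0;
    virtual void createMetadata() = 0;
    virtual void bindSyms() = 0;
    virtual void setNewTuple() = 0;
    virtual bool opSpecificNext() = 0;

    bool _initFirstTuple;
    bool _firstTuple;

private:
    bool _opened;
    bool _complete;
};

// Toy operator that records the order in which its hooks are invoked.
class TraceOperator : public QueryOperatorSketch {
public:
    explicit TraceOperator(int tuples) : _remaining(tuples) {}
    void close() override { trace.push_back("close"); }
    std::vector<std::string> trace;

protected:
    void opSpecificOpen() override {
        trace.push_back("opSpecificOpen");
        _initFirstTuple = true;
    }
    void createMetadata() override { trace.push_back("createMetadata"); }
    void bindSyms() override { trace.push_back("bindSyms"); }
    void setNewTuple() override { trace.push_back("setNewTuple"); }
    bool opSpecificNext() override {
        trace.push_back("opSpecificNext");
        return _remaining-- > 0;
    }

private:
    int _remaining;
};
```

Running open(), next() until it returns false, and close() on a TraceOperator leaves the hook names in trace in the order they were called, which makes the sequence easy to inspect.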

Current query operators include:

NSelect
NProject
NJoin (using any of several TableAccessors and JoinConsumers)
NGroup
NPseudoGroup
NRemap
NRootOp
NWebJoin
NIdentity
NDematerialize
NMaterialize
NMateralizePoint
NXAttribute
NXElement
NXLiteral
NXLookup
NXOutput
NXScan

Internal Query Optimizer/Re-Optimizer

The optimizer works on PlanNodes, which are higher-level representations than standard query operators. In particular, a PlanNode encapsulates the following operations:

An optional grouping operation done at the end
Any number of projections
Some number of selection predicates
Either a join or x-scan

Of course, the PlanNode also maintains information about costs, join types, available attributes, etc.

The main class is OptimizePlan. The actual optimizer is split among ExecPlan.cpp (actual operator generation) and Optimizer.cpp (PlanNode optimization). Each re-optimization creates a new dynamic programming enumeration; after optimization, only the actual nodes in the final plan are preserved.

Important methods in OptimizePlan are:

optimize() does an initial optimization
reoptimize() does either a cleanup or a normal re-optimization
shouldReoptimize() is supposed to determine whether stats have changed enough to re-optimize.
duplicate() will create a new phase context (i.e. it'll copy all of the PlanNodes from the previous phase)
scaleBy() attempts to rescale memory allocation by a factor (this is really more a kludge than a good thing)
enumerateAlternatives() is the main DP exploration routine, but it makes use of the following subroutines:
- createLeaf() creates a leaf-node operator
- createJoin() creates a join operator with two children. Note that createJoin() tries to find the least expensive join method internally.
- Both of these subroutines construct a new PlanNode that includes estimated cost information for the subtree. Depending on whether bushy enumeration is enabled, enumerateAlternatives may try multiple subplans.
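
A minimal sketch of dynamic-programming enumeration over subsets of relations follows. It borrows the names createLeaf(), createJoin(), and enumerateAlternatives() from the list above, but the PlanSketch struct, the signatures, the cost model (scan cost equals cardinality, join cost equals output size), and the single fixed selectivity are all assumptions. Unlike the real createJoin(), this version does not choose among join methods.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a PlanNode: only cost and size estimates.
struct PlanSketch {
    double cost;         // estimated cost of the subtree
    double cardinality;  // estimated output size
    bool valid;          // false until a plan has been found
};

// Leaf: scanning a relation is assumed to cost its cardinality.
PlanSketch createLeaf(double cardinality) {
    PlanSketch plan = {cardinality, cardinality, true};
    return plan;
}

// Join: assumed cost is both children's costs plus the output size.
PlanSketch createJoin(const PlanSketch& left, const PlanSketch& right,
                      double selectivity) {
    double out = left.cardinality * right.cardinality * selectivity;
    PlanSketch plan = {left.cost + right.cost + out, out, true};
    return plan;
}

// Best plan for joining all relations; subsets are encoded as bitmasks.
PlanSketch enumerateAlternatives(const std::vector<double>& cardinalities,
                                 double selectivity, bool bushy) {
    const std::size_t n = cardinalities.size();
    assert(n > 0 && n < 20);
    // Value-initialized, so every entry starts with valid == false.
    std::vector<PlanSketch> best(std::size_t(1) << n);
    for (std::size_t i = 0; i < n; ++i)
        best[std::size_t(1) << i] = createLeaf(cardinalities[i]);

    // Subsets are numerically smaller than their supersets, so increasing
    // order guarantees that both halves of a split are already solved.
    for (std::size_t set = 1; set < best.size(); ++set) {
        if (best[set].valid) continue;  // single relation, already a leaf
        for (std::size_t left = (set - 1) & set; left != 0;
             left = (left - 1) & set) {
            std::size_t right = set & ~left;
            // Left-deep mode: the right input must be a single relation.
            if (!bushy && (right & (right - 1)) != 0) continue;
            PlanSketch candidate =
                createJoin(best[left], best[right], selectivity);
            if (!best[set].valid || candidate.cost < best[set].cost)
                best[set] = candidate;
        }
    }
    return best.back();
}
```

Passing bushy = true lets the inner loop consider every two-way split of a subset, while bushy = false restricts it to left-deep plans, mirroring the note that bushy enumeration may try multiple subplans.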

External Interface

AQPDaemon.cpp contains the server daemon (which now handles web requests, AQP/CQP requests, and also XML query plans).

Currently the daemon resides by default on port 7777. Queries are triggered via SOAP over HTTP. Status information can be obtained simply via HTTP requests (GET / will return the root status page). Executing queries can be monitored for performance; by default, the dot tool (part of Graphviz) is used to render query plans as graphs in GIF or PNG.

Query.cpp is the main routine and includes keyboard I/O control (which will probably be deprecated and replaced by web forms for reconfiguring the system).

Pre-Optimizer

The pre-optimizer is a Java component that is mostly responsible for parsing and language-level rewrites. It is not 100% compliant with the current XQuery specs, and in fact it does not even support multiple lexing modes (so some words are reserved).

The rewrites of the pre-optimizer mostly involve the following:

Remove [] filter expressions from XPaths (note that the current version may be buggy in some cases)
Flatten source mappings and conjunctive predicates (may not work with disjunction)
Do analysis of conjuncts to add transitive relationships (e.g., $a < $b, $b = "abc" => $a < "abc")
(Currently ignores expressions that do arithmetic, e.g. $a = $b + 10)
Separate query blocks (may not work with nested FORs -- only nested queries in RETURN)
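
The transitive analysis of conjuncts mentioned above can be illustrated with a small sketch. It is written in C++ rather than the pre-optimizer's Java, and the Conjunct struct, isVariable(), and addTransitive() are hypothetical names. The sketch assumes every conjunct has a variable on its left side, performs a single pass, and only substitutes constants for variables on the right side of a comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation of one conjunct, e.g. $a < $b.
struct Conjunct {
    std::string left;   // variable name, e.g. "$a"
    std::string op;     // comparison operator, e.g. "<" or "="
    std::string right;  // variable name or quoted constant
};

bool isVariable(const std::string& term) {
    return !term.empty() && term[0] == '$';
}

// For every "$v = constant", derive "$x op constant" from each "$x op $v".
std::vector<Conjunct> addTransitive(const std::vector<Conjunct>& conjuncts) {
    std::vector<Conjunct> result = conjuncts;
    for (std::size_t i = 0; i < conjuncts.size(); ++i) {
        const Conjunct& eq = conjuncts[i];
        assert(isVariable(eq.left));  // precondition of this sketch
        if (eq.op != "=" || isVariable(eq.right)) continue;
        for (std::size_t j = 0; j < conjuncts.size(); ++j) {
            const Conjunct& c = conjuncts[j];
            if (c.right == eq.left) {
                Conjunct derived = {c.left, c.op, eq.right};
                result.push_back(derived);
            }
        }
    }
    return result;
}
```

Given the conjuncts $a < $b and $b = "abc", the function appends $a < "abc" and leaves the original two in place.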