Historically, Daytona™ databases have scaled up to the point where the customer either runs out of data to store or runs out of the means to store it. Our best recent example is AT&T's 312 terabyte SCAMP database, which is implemented using Daytona as a federation of two HP Integrity Itanium Superdome partitions running HP-UX.

The 64 CPU (SMP) partition houses 6 largish Daytona tables of call detail data. As of Sept 14, 2005, the largest of these tables contained 743 billion records whose average (compressed) length was 52.2 bytes. Each such record uncompresses on average to 216 bytes, implying an expansion factor of roughly 4:1. (The uncompressed format is essentially a human-readable ASCII format.) This compressed table takes up 38.8 terabytes; it is clearly much better to buy 38.8 terabytes of disk than 159 terabytes. In all, this partition contains 1.026 trillion records. (FYI, all terabytes here are 1024-based.)
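
As a quick back-of-the-envelope check of those figures, the short Python sketch below recomputes the expansion factor and the implied uncompressed size; the inputs are the numbers quoted above, and everything else is derived from them.

    # Back-of-the-envelope check of the compression figures quoted above.
    compressed_avg = 52.2      # average compressed record length, in bytes
    uncompressed_avg = 216.0   # average uncompressed (ASCII) record length, in bytes
    compressed_tb = 38.8       # size of the compressed table, in terabytes

    expansion = uncompressed_avg / compressed_avg    # ~4.1, i.e. roughly 4:1
    uncompressed_tb = compressed_tb * expansion      # ~160 TB, in line with the ~159 TB quoted

    print(f"expansion factor:   {expansion:.1f}:1")
    print(f"uncompressed size: ~{uncompressed_tb:.0f} terabytes")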

The second partition is qualitatively different: it consists of a low-level data store that does not itself use Daytona and that stores all of the raw data that is distilled into the Daytona records in the first partition. Most of this raw data is in the so-called AMA format, a standard format long used by telecoms. The raw records contain a good deal of information that is not present in the corresponding distilled Daytona records; conversely, the distilled Daytona records contain some information that is not in the raw records. This raw data is stored using an exceedingly effective AT&T compression algorithm called pzip, which achieves compression ratios here of 6:1. Special home-grown C code is used to parse the un-pzip'd AMA data into a human-readable form.

There is a web interface that binds these two partitions into a federated database: the interface invokes precompiled, web-parameterized Daytona queries on the first partition and displays their output. Then, if the user would like to see the corresponding raw records on the second partition, they press a button and the web interface performs a Daytona query on the first partition that uses Daytona's indices to tell the custom C code on the second partition where to find those raw records. This is essentially an indexed nested-loops join. The fact that this integration is so simple and was so easy to implement is a testament to Daytona's flexibility.
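
A rough sketch of that two-partition lookup, written in Python purely for illustration, appears below. Every name in it is a hypothetical stand-in: query_call_detail() plays the role of a precompiled, web-parameterized Daytona query on the first partition, and fetch_raw_ama() plays the role of the custom C code on the second partition. It shows the indexed nested-loops pattern, not Daytona's actual interface.

    # Illustrative sketch of the federated lookup described above; every name
    # here is a hypothetical stand-in, not Daytona's actual interface.

    def query_call_detail(search_params):
        """Stand-in for a precompiled, web-parameterized Daytona query on the
        first partition: yields (distilled record, location of its raw data)."""
        # In the real system this is an indexed Daytona query; here we just
        # return canned data so that the sketch runs.
        yield {"number": search_params["number"], "minutes": 3}, ("feed-17", 123456)

    def fetch_raw_ama(raw_location):
        """Stand-in for the custom C code on the second partition that locates
        and decodes the corresponding raw AMA records."""
        feed, offset = raw_location
        return [f"raw AMA record from {feed} at offset {offset}"]

    def indexed_nested_loops_join(search_params):
        # Outer loop: the indexed query on partition 1 supplies the distilled
        # records plus enough location information to find their raw counterparts.
        for detail_record, raw_location in query_call_detail(search_params):
            # Inner step: a direct, index-guided fetch on partition 2;
            # no scan of the raw store is needed.
            yield detail_record, fetch_raw_ama(raw_location)

    for detail, raw in indexed_nested_loops_join({"number": "555-0100"}):
        print(detail, raw)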

Various size measures for this federated database

When all the data in this federated database is decompressed, it amounts to more than 312 terabytes (stored in over 1.9 trillion records). This compares very favorably with the roughly 67.7 terabytes of disk space (not counting any RAID redundancy) required to store the compressed version of the data.

Of course, the first partition also functions as an independent database in an SMP environment. For this first partition, the disk footprint (i.e., total filesystem space) used by the data is 46.3 terabytes, and the efficiently constructed, compact indices take up about 24.6 terabytes. The uncompressed size of the data is 201 terabytes. Naturally, the size of the indices depends on the number of B-trees deemed appropriate for the application -- and on the kind of B-trees chosen: well-chosen cluster B-trees take up much less space, and SCAMP uses three of them. Consequently, the ratio of index bytes to data bytes for SCAMP has no predictive import for other applications.

The largest table has a disk footprint of 35.3 terabytes, spread over 90,583 UNIX files holding 743 billion records. Daytona imposes no upper limit on the number of files that can be used, so this largest table could grow indefinitely.

Comparing database sizes when different DBMSs use different measures of size

How does all this compare with other commercial alternatives? The first thing to keep in mind is that the less disk needed to accomplish the same ends, the better. Thanks to its compression (and other) capabilities, Daytona is very efficient in that regard. After all, it's not how much disk you have attached, it's how well you use what you do have. For example, B-trees built by inserting entries one at a time can become quite bloated with unused space; Daytona has an initial bulk-load feature that packs its B-trees very tightly, so large applications find it highly useful to rebuild indices in this fashion when they have become bloated from random record-at-a-time updates. Likewise, some commercial data management systems store data on disk block by block in a bloated manner, with lots of free space reserved to handle random updates quickly. Daytona tables are by nature more efficient because they simply store their records one right after the other -- with reuse of freed space leading to efficient steady-state disk utilization in the face of ongoing updates. So, just because a database claims to be using a lot of disk doesn't mean it is using that disk wisely. It's your money.
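
To make the bloat point concrete, the following sketch compares the space an index occupies when its pages are left partially full (as happens under random record-at-a-time inserts) with the space the same entries take in tightly packed pages (as after a bulk load or rebuild). The entry count, entry size, page size, and fill factors are illustrative assumptions chosen only to show the arithmetic, not measurements of Daytona or any other product.

    # Illustrative comparison of index footprint under different page fill factors.
    # All the numbers below are assumptions chosen to show the arithmetic.
    entries = 10_000_000_000     # index entries
    entry_bytes = 40             # bytes per entry (key + pointer), assumed
    page_bytes = 8192            # B-tree page size, assumed

    def index_size_tb(fill_factor):
        entries_per_page = int(page_bytes * fill_factor / entry_bytes)
        pages = -(-entries // entries_per_page)          # ceiling division
        return pages * page_bytes / 1024**4

    random_insert_tb = index_size_tb(0.67)   # pages roughly two-thirds full
    bulk_loaded_tb = index_size_tb(0.98)     # pages packed tightly by a bulk load

    print(f"random-insert index: {random_insert_tb:.2f} TB")
    print(f"bulk-loaded index:   {bulk_loaded_tb:.2f} TB")
    print(f"bloat factor:        {random_insert_tb / bulk_loaded_tb:.2f}x")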

Likewise, when comparing database sizes, it's important to normalize the data in each to some common, canonical (e.g., human-readable) format so as not to penalize a database that uses compression (like SCAMP) for doing such a good job of saving money on disk (and of speeding up queries through decreased I/O time).

For the record (as it were), Daytona on this HP Itanium Superdome platform had no problem loading all this data and could easily have loaded at least twice as much, but that didn't happen because we ran out of data.

In short, a superior measure of database size is the total number of bytes taken up by the data as written in a human-readable or uncompressed format. Such a measure highlights the utility of data compression and does not penalize the DBMS for using it. It is much better than measuring size by the total disk filesystem footprint for both data and indices, because the latter figure can be bloated by storage inefficiencies -- inefficiencies that can be gauged by taking the ratio of the filesystem footprint to the uncompressed data size. In Daytona's case, when compression is being used, this ratio is almost always less than one; for other DBMSs, this ratio can well be upwards of 9 or more: what are they doing with that extra space? The uncompressed-size measure is also better than the total number of records, which, while easy to understand and meaningful up to a point, is clearly not the whole story: a trillion records averaging 200 bytes each is more impressive than a trillion records averaging, say, 20 bytes each. So, that's the question: who else has a two-machine federated database larger than 312 terabytes of uncompressed information, with one of the machines managing a trillion 50+ byte (compressed) records?
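
For the first partition, that footprint-to-uncompressed ratio can be worked out directly from the figures quoted earlier in this document:

    # Footprint-to-uncompressed-size ratio for the first (SMP) partition,
    # computed from the figures quoted earlier in this document.
    data_footprint_tb = 46.3     # compressed data on disk
    index_footprint_tb = 24.6    # indices on disk
    uncompressed_tb = 201.0      # the same data written out uncompressed

    ratio = (data_footprint_tb + index_footprint_tb) / uncompressed_tb
    print(f"footprint / uncompressed size = {ratio:.2f}")     # ~0.35, well under 1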



