Difference between revisions of "Table Segmentation in Content Manager OnDemand"

m
Minor corrections to IBM CMOD Table Segmentation.
m (A variety of small fixes and spelling corrections to IBM CMOD Table segmentation.)
m (Minor corrections to IBM CMOD Table Segmentation.)
 
Line 1: Line 1:
== Introduction ==


In order to keep database queries fast, OnDemand uses a concept called "database table segmentation".    The term 'table segmentation' refers to splitting extremely large tables into 'segments' of smaller tables for performance and/or ease of maintenance.  Although the latest versions of database engines can do this natively, at the time Content Manager OnDemand was created, there was no built-in support for this functionality, so it still uses the old style segmentation (based on date fields) to achieve the scalability and speed that customers require.
In order to keep database queries fast, CMOD uses a concept called "database table segmentation". The term 'table segmentation' refers to splitting extremely large tables into 'segments' of smaller tables for performance and/or ease of maintenance. Although the latest versions of database engines can do this natively, there was no built-in support for this functionality at the time Content Manager OnDemand was created, so it still uses the old-style segmentation (based on date fields) to achieve the scalability and speed that customers require.


When an end user performs a search, IBM CMOD performs the search on one or more tables, based on the date range contained in individual tables.  This 'date range' is called the 'segment date'.
When an end user performs a search, IBM CMOD queries one or more tables, using the date fields in the submitted search to determine which individual tables to look through to satisfy the request. The date field used to organize the tables is called the 'segment date'.


Before DB2 supported its own table segmentation natively, the Content Manager OnDemand developers decided to split index data into tables of 10 million rows each. Using this method keeps search performance linear, as only the tables containing documents in the date range you're looking for (for example, 3 months or 1 year) are actually searched.
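
The idea of searching only the tables whose date range overlaps the query can be illustrated with a minimal sketch. This is not CMOD's actual implementation: the `Segment` class, the `tables_to_search` function, and the table names `SEG1` to `SEG3` are hypothetical and exist only to show the overlap test.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Segment:
    """Hypothetical description of one segment table."""
    table_name: str  # made-up table name, for illustration only
    start: date      # earliest segment date stored in the table
    stop: date       # latest segment date stored in the table


def tables_to_search(segments, query_start, query_stop):
    """Return the names of tables whose date range overlaps the query range."""
    return [
        s.table_name
        for s in segments
        # Two closed ranges overlap when each one starts before the other ends.
        if s.start <= query_stop and s.stop >= query_start
    ]


segments = [
    Segment("SEG1", date(2020, 1, 1), date(2020, 6, 30)),
    Segment("SEG2", date(2020, 7, 1), date(2020, 12, 31)),
    Segment("SEG3", date(2021, 1, 1), date(2021, 6, 30)),
]

# A three-month search touches two of the three tables; SEG1 is skipped.
print(tables_to_search(segments, date(2020, 11, 1), date(2021, 1, 31)))
# → ['SEG2', 'SEG3']
```

A narrower date range in the search selects fewer tables, which is why the date fields in a query matter so much for response time.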


In order to complete queries as quickly as possible, it's important that you minimize the number of tables that are searched.  Each additional table is more work for the CPU and more data transfer from disks ("I/O") that must be performed -- and delaying the response to the end user.
In order to complete queries as quickly as possible, it's important that you minimize the number of tables that are searched. Each additional table searched means more work for the CPU and more data transfer from disks (Input/Output, or "I/O"), which delays the response to the end user.


== Optimizing Segment Size ==
