Caveman's Blog

My commitment to learning.

SQL Server: Table Partitions as an archiving solution

leave a comment »


Problem Statement

How can you design an archiving solution on a large table without deleting any data and also achieve improved performance on CRUD operations (on the same table)? The assumption in this scenario is that the most recent data would be accessed more often that the older data.

Solution

Archiving solutions can be of two types; 1) passive and 2) active. A passive solution is one where the historic data is archived in another table on another database, making the data unavailable. An active solution is one where the historic data is archived and will still be available for access without much of an impact on the application performance. A large table typically contains millions of rows and probable has a size that runs into several gigabytes. Just the size of the table makes it very expensive to perform CRUD operations and difficult to maintain indexes.

Table Partitioning in SQL Server 2005 and up lets a table data and indexes to be stored in several smaller partitions. This feature is very useful in providing a way to easily maintain and perform database operations on that table. Each partition will be stored in a different file which can be part of a filegroup. Data is distributed evenly between the files in a filegroup. Allowed columns can be used as a partition key which is the criteria for partitioning. You will be able to define the boundaries of a partition by defining the limits of the partition key. This division is based on how you access the rows of the table. In addition to identifying the partition key, we also will want to include that key as part of the index and also partition the index. Partitioning the index is called as index alignment. This was when the index portion will be stored along with the data rows stored in the partition.

Dividing a table into several files gives us the flexibility of storing those files on separate drives. We can store the files that contain data from the recent years on faster drives as opposed to storing the older data on slower drives. Going with assumption in the problem statement that the most recent data is accessed more often that the older data, we will have improved the performance on this table because we will have faster response times thanks to the faster drives.

Note: All data types that are valid for use as index columns can be used as a partitioning column, except timestamp. The ntext, text, image, xml, varchar(max), nvarchar(max), or varbinary(max) data types cannot be specified. [1]

References:
1. Partitioned Tables and Indexes
2. Create Partitioned Tables and Indexes

Advertisements

Written by cavemansblog

July 2, 2013 at 9:32 pm

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: