Sliding window sanario in PTF vs Availability of recently loaded data in the staging table for reporting purpose

Mar 23, 2015 at 1:46 PM
Hello everybody, I am a SQL server DBA and I am planning to implement table partitioning on some of our large tables in our data warehouse. I am thinking to design it using the sliding window scenario. I do have one concern though; I think the staging tables we use for new data loading and for switching out the old partition are going to be non-partitioned, right?? Well, I don't have an issue with the second staging table that is used for switching out the old partition. My concern is on the first staging table that we use it for switch in purpose, since this table is non-partitioned and holding the new data, HOW ARE WE GOING TO USE/access THIS DATA FOR REPORTING PURPOSE before we switch in to our target partitioned table????? say, this staging table is holding a one month worth of data and we will be switching it at the end of the month. Correct me if I am wrong okay, one way I can think of accessing this non-portioned staging table is by creating views, which we don’t want to change our codes.

Do you guys share us your thoughts, experiences???

We really appreciate your help.
Coordinator
Mar 23, 2015 at 3:27 PM

Like you already mentioned, one way to solve this problem is by using a view that unions the selects on the partitioned table and the staging table (but make sure to have check constraints on the partitioning column in place on both tables that check the upper and lower limit so that the optimizer is able to decide which part of the union holds which data).

Another option would be to use one partition per month and switch in the data incrementally on a daily basis, e.g. by using SwapPartition or SwitchOut/SwitchIn in PTF, or you might change the granularity of your partitions, e.g. daily versus monthly. Thereby, everything up to today’s data always would be in the partitioned table.

HansPo

Mar 23, 2015 at 9:56 PM
Thanks Hanspo,
I will look in to the Swap Partition option as you suggested and make sure that the space requirement for the staging tables is fair enough.
Aug 21, 2015 at 10:23 AM
This is actually quite an interesting problem for me too.

I'm building a reporting database whereby I am pulling in data from a variety of sources to provide metric information for a client website. I plan on using monthly partitions, with a process to swap in the staging table at month end. Using monthly partitions because clients view the data as monthly reports.

Talking to the guys who actually will be using the front-end tp analyse the data before sending it to clients, they obviously want data daily, if not perhaps weekly which I am resistant to. Would it be possible, using PTF to stick to a monthly partition system, but have a process to duplicate the staging table once a week, swap in the weekly table into the active partition without then dropping the staging table. The following week the staging table contains the data from the previous week, adds the new data, and then swaps that in. At month end, we'll still do a final swap into the partition, rolling the next partition forward? Like an incremental staging table?

Essentially I want to dump the staging data into the partitioned metric table every week, but consolidate the partition and move the sliding window at month end?