
I have a cube where each measure group is split into 2 partitions covering different time frames, as below:

  1. the latest 2 years of data (PRT_DATA_2YEARS)

  2. data older than 2 years (PRT_DATA_REST)

How can I make sure that processing the 1st partition (with any processing option) refreshes only the last 2 years of data? Meaning only the latest 2 years of data will be updated, and the rest of the data will remain intact.

Example: let's say the underlying data in the database got updated for all years, and so far the cube has not reflected those changes because it hasn't been processed yet. Now we run the cube processing job, which should update or fetch the changed data only for the latest 2 years in the cube. The rest of the data should stay intact: there should be no change at all in the cube data older than 2 years.
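For illustration, the two partitions' query bindings look roughly like this (a simplified sketch; the data source ID and the exact year boundary are placeholders, while the table and column names are from my schema):

```xml
<!-- Sketch only: query binding for PRT_DATA_2YEARS (latest 2 years).
     "SourceDW" and the YEAR(GETDATE()) - 1 boundary are placeholders. -->
<Source xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="QueryBinding">
  <DataSourceID>SourceDW</DataSourceID>
  <QueryDefinition>
    SELECT * FROM factEALDeposits
    WHERE PRD_ID IN (SELECT PRD_ID FROM DimPeriod
                     WHERE PRDYR >= YEAR(GETDATE()) - 1)
  </QueryDefinition>
</Source>

<!-- Sketch only: query binding for PRT_DATA_REST (everything older). -->
<Source xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="QueryBinding">
  <DataSourceID>SourceDW</DataSourceID>
  <QueryDefinition>
    SELECT * FROM factEALDeposits
    WHERE PRD_ID NOT IN (SELECT PRD_ID FROM DimPeriod
                         WHERE PRDYR >= YEAR(GETDATE()) - 1)
  </QueryDefinition>
</Source>
```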

Please help me resolve this.

Any suggestions/hints/tricks are welcome.

Regards

ScriptSoft

2 Answers


I would recommend you change away from sliding-window partitions. For example, yesterday when you processed the "latest 2 years" partition it contained January 18, 2018 through January 17, 2020. (Or I'm not sure if it's full-year boundaries, but the point is the same.) If you reprocess that partition today you will lose the January 18, 2018 data and will thus have to reprocess the older partition too.

Instead, switch to 5 partitions: 2016, 2017, 2018, 2019, and 2020. Then you should be able to reprocess the 2020 partition only and not have to reprocess the prior partitions. Basically, you can achieve incremental processing. When you are ready to stop reporting on 2016 data, drop that partition and create the 2021 partition.
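As a minimal XMLA sketch (all object IDs here are placeholders for your own database, cube, and measure group), reprocessing only the 2020 partition would look something like:

```xml
<!-- Sketch only: fully reprocess just the 2020 partition.
     Substitute your real DatabaseID/CubeID/MeasureGroupID/PartitionID. -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <DatabaseID>YourSsasDb</DatabaseID>
      <CubeID>YourCube</CubeID>
      <MeasureGroupID>YourMeasureGroup</MeasureGroupID>
      <PartitionID>PRT_DATA_2020</PartitionID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
</Batch>
```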

GregGalloway
  • Thanks Greg for your response. Since I have to process dimensions as well, don't you think processing 2 partitions (2019 & 2020) will drop the rest of the data from the other partitions? – ScriptSoft Jan 20 '20 at 04:55
  • @ScriptSoft it’s Multidimensional I assume. No. ProcessUpdate oh dimensions then ProcessData on 2019 and 2020 partitions then ProcessIndexes on the cube will get you processed without reprocessing previous years. But time full reprocess vs incremental as it’s possible full reprocess (ProcessFull on the Database) is faster. – GregGalloway Jan 20 '20 at 05:02
  • Thanks @Greg, time is not a constraint at this point; meeting the requirement is. I will try and test the way you suggested and let you know. Can I also try the same approach on my original partition (PRT_DATA_2YEARS), so that the other partition (PRT_DATA_REST) will not get impacted? – ScriptSoft Jan 21 '20 at 06:19
  • I tested the way you suggested: ProcessUpdate(Dim) --> ProcessData(Partitions) --> ProcessIndexes(Cube). But it's not giving the desired result. It is wiping out data from the old partitions and loading only the latest 2 years of data. Can you please comment on this? – ScriptSoft Jan 28 '20 at 07:42
  • @ScriptSoft that doesn’t make any sense. Are you 100% sure? Can you retry the test and make sure the older partitions are processed to begin with? I assume when you process the old partitions on Monday they point to Date dimension members which still exist when you ProcessUpdate on Tuesday? How did you test to confirm it is wiping out old partitions? Did you run a query or just assume based upon what you thought it was doing in a processing dialog? – GregGalloway Jan 28 '20 at 07:59
  • There are 2 SSIS jobs. One processes the complete SSAS DB to fetch all data from the source DB; the other job processes only the partitions, fetching the latest 2 years of data. I run the first job and can see all records in the cube. Then I update the facts for all periods/years in the source DB, and then I run the 2nd job per this approach: ProcessUpdate(Dim) --> ProcessData(partition PRT_DATA_2YEARS) --> ProcessIndexes(Cube). My expectation is: when I browse the cube, it should have the latest data ONLY for the last 2 years, and the data older than 2 years should contain the old data which was there before I updated the facts. – ScriptSoft Jan 29 '20 at 03:12
  • But after the 2nd job's execution, when I browse the cube, I can see only the latest 2 years of data with the updated facts, and there is no older data at all. No rows exist for data older than 2 years, so it completely wiped out the previous records. – ScriptSoft Jan 29 '20 at 03:15
  • @ScriptSoft yeah. That doesn’t make sense. Feel free to include the code you use for incremental processing or screenshots. Do any dimension keys in the old data become invalid (disappear from dimension tables) after ProcessUpdate on the dimensions? That’s the only way I can fathom that old data disappears. – GregGalloway Jan 29 '20 at 05:06
  • I have included 3 screenshots of the SSIS job showing how exactly it is configured. Please do let me know in case you require anything specific. – ScriptSoft Jan 29 '20 at 08:04
  • @ScriptSoft that’s a bunch and plenty of room for error. Are you sure you don’t accidentally have one ProcessFull in there on one dimension? – GregGalloway Jan 29 '20 at 13:24
  • @ScriptSoft I’m not optimistic that running multiple processing tasks in parallel (rather than one processing task with all the steps) will improve performance. You might actually simplify into a single processing task for everything (including dimensions, partitions, and indexes). SSAS will understand dependencies and do it’s own parallelism. I can’t see how that’s the culprit but it’s not typical the way you have done it. I would recommend you start simple and test one dimension and one partition and ProcessIndexes on that one cube. Then expand from there as you confirm it works. – GregGalloway Jan 29 '20 at 13:28
  • Thanks Greg, I will test the way you suggested. Also, the update I am running before the 2-year job execution is a simple update query, as below: **UPDATE factEALDeposits SET bln_cls_pos = bln_cls_pos / 1000, bln_opn_pos = bln_opn_pos / 1000 WHERE PRD_ID IN (SELECT PRD_ID FROM DimPeriod WHERE PRDYR IN ('2019', '2018', '2017', '2016'));** – ScriptSoft Jan 29 '20 at 14:59
  • Hello Greg, apologies, as it's been long since I reverted back. I have tested the way you suggested: 1. ProcessUpdate(Dimensions), ProcessData(Partitions), ProcessIndexes(Cube). 2. I started this approach with 1 dimension, 1 partition, and the related cube, and extended it as it grew; every time it gave the expected result. Finally, with all dimensions, partitions, and cubes, it succeeded and I got the expected result. 3. As suggested, I also included all objects in one processing task from the beginning of the test. Many thanks for your suggestions; they helped me a lot. – ScriptSoft Mar 06 '20 at 02:53
  • One question here: in case some dimensions are getting updated every day, does this method still work? My thought is no, because the dimension data will change for the older partitions as well, which are not being processed, and hence data for older years may be missing. Please guide. Many thanks. – ScriptSoft Mar 06 '20 at 03:02
  • 1
    @ScriptSoft if you need to ProcessUpdate dimensions daily then ensure you also so ProcessIndexes on the cube and measures groups should be fine. – GregGalloway Mar 06 '20 at 04:10
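For reference, the sequence that worked above (ProcessUpdate on the dimensions, ProcessData on the changed partition, ProcessIndexes on the cube) can be sent as a single XMLA batch. A minimal sketch, with placeholder object IDs; commands in a Batch without `<Parallel>` run sequentially, in order:

```xml
<!-- Sketch only: incremental processing sequence. All IDs are placeholders;
     repeat the ProcessUpdate step for each dimension that changes. -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" Transaction="true">
  <!-- 1. Pick up dimension member changes without invalidating partitions -->
  <Process>
    <Object>
      <DatabaseID>YourSsasDb</DatabaseID>
      <DimensionID>DimPeriod</DimensionID>
    </Object>
    <Type>ProcessUpdate</Type>
  </Process>
  <!-- 2. Reload fact data only for the partition whose source data changed -->
  <Process>
    <Object>
      <DatabaseID>YourSsasDb</DatabaseID>
      <CubeID>YourCube</CubeID>
      <MeasureGroupID>YourMeasureGroup</MeasureGroupID>
      <PartitionID>PRT_DATA_2YEARS</PartitionID>
    </Object>
    <Type>ProcessData</Type>
  </Process>
  <!-- 3. Rebuild the indexes/aggregations that ProcessUpdate may have dropped -->
  <Process>
    <Object>
      <DatabaseID>YourSsasDb</DatabaseID>
      <CubeID>YourCube</CubeID>
    </Object>
    <Type>ProcessIndexes</Type>
  </Process>
</Batch>
```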

First, make sure that every partition's storage mode is defined as MOLAP (not ROLAP or HOLAP); then just process one partition and the others will not be affected. But make sure that you do not process the dimensions related to the other partitions (if reprocessing is needed, try to use the ProcessAdd method).
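As a rough sketch (the object IDs are placeholders), a ProcessAdd command on a partition looks like the following. Note that in practice you usually pair ProcessAdd with an out-of-line query binding restricted to the new rows only; otherwise the partition's full source query is appended again and rows get duplicated:

```xml
<!-- Sketch only: ProcessAdd appends rows to an existing partition.
     Pair it with an out-of-line binding limited to new rows. -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <DatabaseID>YourSsasDb</DatabaseID>
      <CubeID>YourCube</CubeID>
      <MeasureGroupID>YourMeasureGroup</MeasureGroupID>
      <PartitionID>PRT_DATA_2YEARS</PartitionID>
    </Object>
    <Type>ProcessAdd</Type>
  </Process>
</Batch>
```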

Hadi
  • Hello Hadi, many thanks for the response, but I have to process the dimensions as well, because there are 8 cubes in the SSAS DB and many dimensions are shared among those cubes. – ScriptSoft Jan 20 '20 at 04:48
  • @ScriptSoft try to do incremental processing if possible (read about ProcessAdd) – Hadi Jan 20 '20 at 06:41
  • 1
    Thanks for your suggestion, but base tables of the cube can expect any type of sql (Insert/update/delete) which should be reflected in cube after processing, hence it won't work this way. – ScriptSoft Jan 21 '20 at 06:14
  • @ScriptSoft I don't think that you can do that, since if you need to process dimensions each time (ProcessFull or ProcessUpdate), all related partitions should be processed. – Hadi Jan 21 '20 at 06:21
  • By saying this, do you mean this cannot be achieved in SSAS at all? Is there no workaround, even from your experience? – ScriptSoft Jan 28 '20 at 07:43