1

A friend told me that his new employer needs an SSRS report that parse a column that contains n consecutive occurrences of 1) the literal "Date:" 2) An optional separator character 3) followed by a date in DD-MM-YY format (leading zeros are optional) 4) a separator space 5) A single "WORD" of data that is associated with the date. This word will have no embedded spaces.

I'll populate a Sample table with data that meets this critera to give you an example to make it clear:

CREATE TABLE [dbo].[Sample](
    [RowNumber] [int] NOT NULL,
    [DataMess] [varchar](max) NOT NULL
) ON [PRIMARY]


INSERT [dbo].[Sample] ([RowNumber], [DataMess]) VALUES (1, N'Date:12-21-13 12/13/14/15 Date:4-2-11 39/12/134/14 Date:4-1-13 19/45/5/12')
INSERT [dbo].[Sample] ([RowNumber], [DataMess]) VALUES (2, N'Date:7-21-13 12/13/14/15 Date:8-21-12 39/12/34/14 Date:12-1-13 19/4/65/12')
INSERT [dbo].[Sample] ([RowNumber], [DataMess]) VALUES (3, N'Date:3-21-13 12/11233/14/15 Date:4-28-13 39/12/34/14 Date:9-19-13 19/45/65/12')

For the first record, "12/13/14/15" is considered to be the "Word" of data that is associated with the Date 12-21-13.

He was aked to produce the following report in SSRS:

Row Number               DataMess
1                        Date: 12-21-13 12/13/14/15 
                         Date: 4-1-13 19/45/5/12 
                         Date: 4-2-11 39/12/134/14 

2                        Date:12-1-13 19/4/65/12 
                         Date:7-21-13 12/13/14/15 
                         Date:8-21-12 39/12/34/14 

3                        Date:9-19-13 19/45/65/12 
                         Date:4-28-13 39/12/34/14 
                         Date:3-21-13 12/11233/14/15  

Note that the Dates for each source row number are sorted in descending arder alomng with the associated wor of data.

I don't know SSRS, but my reaction was to recommend to him that he not even attempt the task but to tell his employer that the data shouldn't really be trying to do all of that ugly string parsing with T-SQL. Instead this repeating "Date: DATA" should be stored in individual child records that are associated with a parent Row record. I believe that the code would be ugly, inefficient, brittle and hard to maintain. What are your thoughts?

Assuming that management\client is always right or to conceed that "ideally" this is correct, but "for now" we need a SQL that produces the following report, how would one do this? The expectation was that this can be produced quickly ( a half day, for example)

Nathan Griffiths
  • 12,277
  • 2
  • 34
  • 51
Chad
  • 23,658
  • 51
  • 191
  • 321

2 Answers2

1

You are of course correct, it's certainly far from the best way of storing the data. Any way of extracting the data for this report will be much more complex than it would be if it was stored differently.

However, based on the data it still wouldn't be too tough to actually generate the report. Due to the table structure actually generating the dataset for the report would be the hardest part.

So to generate the dataset, you need to split the data in DataMess to get one row per Date/Word, and be able to extract the date from that split data to be able to order by date as required.

Take your pick on how you want to split the data:

Split function equivalent in T-SQL? has many options, as does this link - Best Split Function.

Here's a SQL Fiddle with one of the functions in action.

Once you've split the data, use the appropriate function to extract the date portion, i.e. between the colon and the space before the word data, then CAST this as a date.

Once you've actually got the dataset, it's the most trivial of reports - just add a row group based on RowNumber, add the split Date/Word data as a detail field and you're done. Make sure the dataset is ordered by the extracted date field, even though you don't actually display this in the report.

As an interim measure I would certainly expect this to be doable in half a day of work or so. So for just this report it's not too bad, but for anything else you'll probably have trouble.

For a few rows it will likely run fine, but on any non-trivial sized dataset performance will be suboptimal.

Community
  • 1
  • 1
Ian Preston
  • 38,816
  • 8
  • 95
  • 92
0

Thank you. Here's what I did for the remaining part to get the dates sorted in DESC order.

SELECT
     RowNumber
     ,'Date: ' + ss.Item AS Data
     --,cast(substring(ss.Item, 1, charindex(' ' , ss.Item) ) AS date)
FROM
     Sample s
     CROSS apply dbo.SplitStrings_XML(s.DataMess,
                                      N'Date:') ss
WHERE
     Item IS NOT NULL
ORDER BY
     rownumber, 
     cast(substring(ss.Item, 1, charindex(' ' , ss.Item) ) AS date) desc

If the data fails to hold up to this expected format and we encounter a date that is not valid, then the whole report blows up.

Chad
  • 23,658
  • 51
  • 191
  • 321
  • Yep, perfect, that's exactly the sort of thing I was thinking for the extra date sorting the report requires... It will probably work if everything is perfect in the data. As you say, a poor situation to begin with. – Ian Preston Apr 28 '13 at 00:06