0

I am currently working on a logging system in sql server 2014, and need to find a way to assign similar strings into groups. Each string contains a prefix, an ID or Guid and then a suffix. I'd like to find a way find if there already exists strings with the same pre- and suffix in my table.

A typical string might look like:

'ApsisQueue.Synchronize @ApiKey = AA24FA75-CB78-451E-A11F-39115ABF8AA1, @CallbackId = 1'

Or

'ApsisQueue.Synchronize @ApiKey = AA24FA75-CB78-451E-A11F-39115ABF8AA1, LogMessage = NULL'

Or

'ApsisQueue.Synchronize @ApiKey = BC2CFC45-CB78-411E-A66F-3CC654AB125, LogMessage = NULL'

Or

'File id19957 does not exist'

Or

'File id1955 does not exist'

I'd like to group the messages depending on what appears before and after the Id/Guid values, so the messages above would be grouped into 3 groups.

I've considered using Levenshtein to find similar strings, but it seems a bit overkill since I know all differences will follow one another (it also causes performance issues due to the amount of strings that needs to be compared). The problem is that I don't know where in string the Guid/Id starts or ends.

Community
  • 1
  • 1
Kull
  • 21
  • 1
  • 4

1 Answers1

0

Not really sure what you are trying to accomplish but I suspect that once you figure out how to isolate the key values you are working with you can move forward. This works with your sample data. If you have other patterns this may not work.

create table #Something
(
    SomeString varchar(200)
)

insert #Something (SomeString) values
('ApsisQueue.Synchronize @ApiKey = AA24FA75-CB78-451E-A11F-39115ABF8AA1, @CallbackId = 1')
,('ApsisQueue.Synchronize @ApiKey = AA24FA75-CB78-451E-A11F-39115ABF8AA1, LogMessage = NULL')
,('ApsisQueue.Synchronize @ApiKey = BC2CFC45-CB78-411E-A66F-3CC654AB125, LogMessage = NULL')
,('File id19957 does not exist')
,('File id1955 does not exist')


select *
    , Case when SomeString like 'ApsisQueue%' 
        then SUBSTRING(SomeString, charindex('=', SomeString) + 2, CHARINDEX(',', SomeString) - charindex('=', SomeString) - 2) 
        else left(SUBSTRING(SomeString, 8, 200), charindex(' ', SUBSTRING(SomeString, 8, 200)) - 1)
    end as YourKeyValue
from #Something

drop table #Something
Sean Lange
  • 33,028
  • 3
  • 25
  • 40