1

This is kind of complicated, so bear with me.

I've got the basic concept figured out thanks to THIS QUESTION

SELECT LENGTH(col) - LENGTH(REPLACE(col, 'Y', ''))

Very clever solution.

Problem is, I'm trying to count the number of instances of a string token, then take that counter and multiply it by a modifier that represents the string's numeric value. Oh, and I've got a list of 50-ish tokens, each with a different value.

So, take the string "{5}{X}{W}{b/r}{2/u}{pg}"

Looking up the list of tokens and their numeric value, we get this:

{5}   5
{X}   0
{W}   1
{b/r} 1
{2/u} 2
{pg}  1
....  ....

Therefore, the sum value of the string above is 5+0+1+1+2+1 = 10

Now, what I'm really looking for is to a way to do a Join and perform the aforementioned replace-token-get-length trick for each column of the TokenValue table.

Does that make sense?

Psuedo-SQL example:

SELECT StringColumn, TotalTokenValue
???
FROM TableWithString, TokenValueTable

Perhaps this would work better as a custom Function?


EDIT

I think I'm about halfway there, but it's kind of ugly.

SELECT        StringColumn, LEN(StringColumn) AS TotalLen, Token, 
                { fn LENGTH(Token) } AS TokenLength, TokenValue, 
                { fn REPLACE(StringColumn, Token, '') AS Replaced, 
                { fn LENGTH(Replaced) } AS RepLen,
                { TotalLen - RepLen / TokenLength} AS TokenCount }, 
                { TokenCount * TokenValue} CalculatedTokenValue
FROM            StringTable CROSS JOIN
                         TokenTable

Then I need to wrap that in some kind of Group By and get SUM(CalculatedTokenValue) I can picture it in my head, having trouble getting the SQL to work.

Community
  • 1
  • 1
Tinkerer_CardTracker
  • 3,397
  • 3
  • 18
  • 21
  • Ooh, what about replacing the token with the value, then doing some kind of SUM() function? Still stuck on the looping. In code, I'd do some kind of FOR loop, but still learning this "do everything in database" trend. lol. – Tinkerer_CardTracker Aug 03 '13 at 02:47
  • 1
    What is the DB? Oracle? A custom function would be database specific. – AngelWarrior Aug 03 '13 at 02:51
  • Whatever comes packaged with .NET 2013. Probably a variant of SQL Server. There's a FUNCTION folder, so I assume I can make custom ones. :P – Tinkerer_CardTracker Aug 03 '13 at 02:52
  • Oracle has the link below which looks like what you would want. I'm not sure about SQL Server. https://forums.oracle.com/thread/585623 – AngelWarrior Aug 03 '13 at 02:54
  • Don't do everything in the database -- this kind of semantic parsing will probably be faster in your client language of choice. A string based one like Awk, perl or Lisp would be the easiest to make fast. – Hogan Aug 03 '13 at 04:07
  • Is there a possibility of tokens getting repeated in the string? – Noel Aug 03 '13 at 05:42
  • Tokens FREQUENTLY repeat. This is actually more of an error checking routine, or maintenance routine. It will not need to be computed very frequently. I was hoping to generate a View that does a lot of my error checking and presents me with a list of possible errors at the Data layer. Makes a lot of sense to have it there, not the program. – Tinkerer_CardTracker Aug 03 '13 at 12:32

1 Answers1

1

If you create a view like this one :

Create or replace view ColumnsTokens as 
  select StringColumn, Token, TokenValue, 
   (length(StringColumn) - length(replace(StringColumn, token, ''))) / length(Token) TokenCount 
from StringTable 
join TokenTable on replace(StringColumn, Token, '') <> StringColumn ;

that will act as a many-to-many relationship table between columns and tokens, I think you can easily write any query you need. For example,this will give you the total score :

select StringColumn, sum(TokenCount * TokenValue) TotalTokenScore
   from ColumnsTokens 
group by StringColumn ;

(You're only missing the StringColumns with no tokens)

Nicolas
  • 923
  • 7
  • 11
  • OK, so the REPLACE() function isn't working properly. It appears to only be replacing the LAST token. Therefore, only the TokenValue of the final token is being calculated. I tried several variations in the StringColumn, and the results are consistent. – Tinkerer_CardTracker Aug 03 '13 at 22:23
  • That's weird. Maybe your database provides some kind of regexp function that you could use instead. Or -if you can make you a safe assumption on the maximum number of tokens in a column- you coould use nested replace but that would be a bit messy. – Nicolas Aug 04 '13 at 14:37
  • Got this to work! Replace() issue resolved with: http://stackoverflow.com/questions/18039188 (need a TRIM() in the REPLACE() ) – Tinkerer_CardTracker Aug 05 '13 at 20:40