6

I have a field in my database which is encoded. After using from_base64 on the field it looks like this:

<string>//<string>//<string>/2017//06//21//<string>//file.txt

There may be an undetermined number of strings at the beginning of the path, however, the date (YYYY//MM//DD) will always have two fields to the right (a string followed by file extension).

I want to sort by this YYYY//MM//DD pattern and get a count for all paths with this date.

So basically I want to do this:

select '<YYYY//MM//DD portion of decoded_path>', count(*) from table group by '<YYYY//MM//DD portion of decoded_path>' order by '<YYYY//MM//DD portion of decoded_path>';
Steve Chambers
  • 37,270
  • 24
  • 156
  • 208
JimRomeFan
  • 407
  • 6
  • 19
  • 2
    About the only way you'll get this to work is if you break out the text you want to search into it's own field. Since the data doesn't have identifiers (ie. name/value pairs or json or xml), it will be processor intensive to parse the string inside a query. I presume that's a typo in your first sentence, and you meant encoded instead of decoded. You will see a significant performance hit by using the method you propose. It would be much faster to have an update run that parses the decoded column and puts the searchable info into it's own column. – Sloan Thrasher Jun 21 '17 at 22:07
  • Why don't you insert the date as a separate column when you insert a row? – mikep Jun 25 '17 at 12:43
  • Is the single rather than double slash before 2017 intentional? – Steve Chambers Jun 25 '17 at 14:49

2 Answers2

5

Summary

MySQL's SUBSTRING_INDEX comes in useful for doing this by looking for the specified delimiter and counting backwards from the end if a negative count value is specified.

Demo

Rextester demo: http://rextester.com/TCJ65469

SQL

SELECT datepart, 
       COUNT(*) AS occurrences
FROM
(SELECT CONCAT(
     LEFT(SUBSTRING_INDEX(txt, '//', -5), INSTR(SUBSTRING_INDEX(txt, '//', -5), '//') - 1),
     '/',
     LEFT(SUBSTRING_INDEX(txt, '//', -4), INSTR(SUBSTRING_INDEX(txt, '//', -4), '//') - 1),
     '/',
     LEFT(SUBSTRING_INDEX(txt, '//', -3), INSTR(SUBSTRING_INDEX(txt, '//', -3), '//') - 1))
   AS datepart
 FROM tbl) subq
GROUP BY datepart
ORDER BY datepart;

Assumptions

Have assumed for now that the single slash before the year in the example given in the question was a typo and should have been a double slash. (If it turns out this isn't the case I'll update my answer.)

Steve Chambers
  • 37,270
  • 24
  • 156
  • 208
1

little crazy but it works

select REPLACE(SUBSTRING_INDEX(SUBSTRING_INDEX(REPLACE('<string>//<string>//<string>/2017//06//21//<string>//file.txt',"//","-"),"/",-1),"-<",1),"-","/"), count(*) from `chaissilist` group by REPLACE(SUBSTRING_INDEX(SUBSTRING_INDEX(REPLACE('<string>//<string>//<string>/2017//06//21//<string>//file.txt',"//","-"),"/",-1),"-<",1),"-","/") order by REPLACE(SUBSTRING_INDEX(SUBSTRING_INDEX(REPLACE('<string>//<string>//<string>/2017//06//21//<string>//file.txt',"//","-"),"/",-1),"-<",1),"-","/");
Towfik Alrazihi
  • 536
  • 3
  • 8
  • 25