1

I'm trying to pull a substring in a query using instr. The following command:

SELECT instr(ZSYNCPAYMENT, '{"t') 
FROM ZPAYMENT;

provides a result of 64.

I copied and pasted the output of SELECT ZSYNCPAYMENT FROM ZPAYMENT into a hex editor and selected the number of bytes up to and including the { symbol, which is part of my instr function above. The selected number of bytes shows as decimal length of 71. Why is my instr output showing a value of 64? The screenshot below is the output of the SELECT ZSYNCPAYMENT above.

enter image description here

forpas
  • 160,666
  • 10
  • 38
  • 76
SnakeDoc65
  • 91
  • 7

1 Answers1

1

From INSTR():

The instr(X,Y) function finds the first occurrence of string Y within string X and returns the number of prior characters plus 1, or 0 if Y is nowhere found within X. Or, if X and Y are both BLOBs, then instr(X,Y) returns one more than the number bytes prior to the first occurrence of Y, or 0 if Y does not occur anywhere within X.

In the hex editor you use you see the difference in bytes of the position of the string '{"t' from the start of the value of ZSYNCPAYMENT.
This is not the same as the difference in characters when a string contains unicode characters, which I suspect is the case with the string you posted in the image.

If you want the difference in bytes cast both ZSYNCPAYMENT and '{"t' to BLOBs:

SELECT INSTR(
         CAST(ZSYNCPAYMENT AS BLOB), 
         CAST('{"t' AS BLOB)
       ) 
FROM ZPAYMENT 

See a simplified demo.

forpas
  • 160,666
  • 10
  • 38
  • 76
  • This worked thank you! However I am having problems understanding the difference in the positioning. So I look at the data in DB Browser, and same thing, counting each character, I get to 71. So I just don't understand where the SQLite command is getting 64 from. Thank you for your assistance, I really do appreciate it! – SnakeDoc65 Apr 15 '21 at 15:56
  • @SnakeDoc65 for a better explanation about unicode chars you can read this topic: https://stackoverflow.com/questions/5290182/how-many-bytes-does-one-unicode-character-take For example in the demo I provided the unicode char `'Ä'` although is just 1 char it is stored in 2 Bytes and not just 1 like the other chars. INSTR() counts characters for strings and bytes for BLOBs. – forpas Apr 15 '21 at 16:00
  • Maybe you can help me a little further then. In my database, as well as in your demo, I applied this command: SELECT substr(ZSYNCPAYMENT, INSTR(cast(ZSYNCPAYMENT as blob), cast('{"t' AS BLOB))) FROM ZPAYMENT So I expected the output from your demo (edited with the above command) to produce the same result of {"t156. However, one value returned is {"t156 and the other is "t156. Because of the instr() function I thought it would tell the substr where the substring is located that I want pulled. If that makes sense. – SnakeDoc65 Apr 15 '21 at 18:09
  • 1
    @SnakeDoc65 you must also cast the 1st argument of SUBSTR() to BLOB. Check this: https://dbfiddle.uk/?rdbms=sqlite_3.27&fiddle=f4fabba82cc81757937a8b35c27d3f2f – forpas Apr 15 '21 at 18:23
  • You're my hero. Thank you! – SnakeDoc65 Apr 15 '21 at 18:25
  • @SnakeDoc65 if what you need is to return the substring of `ZSYNCPAYMENT` starting from `'{"t'` then do not do any casting. Treat the values as regular strings. Check this: https://dbfiddle.uk/?rdbms=sqlite_3.27&fiddle=973a8b7c4dbddae0582db84da3edc4a8 My answer was about the difference in the results of SQLite's INSTR() and the hex editor. This does not mean that you have to treat the string values as BLOBs when you want to perform simple string operations like INSTR() or SUBSTR(). – forpas Apr 15 '21 at 18:43