I am starting with data like this:
- John went "to the store"
I'd like to tokenize this into:
- John
- went
- to the store
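I am trying to use the REGEXP_SUBSTR function to do this, but I can't seem to come up with a regular expression that says 'match runs of non-space characters, but treat text between double quotes as a single token'. This is what I have so far; it splits on every space, including the ones inside the quotes: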
SELECT DISTINCT
    REGEXP_SUBSTR('John went "to the store"', '[^[:space:]]+', 1, LEVEL) AS word
FROM
    dual
CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE('John went "to the store"', '[^[:space:]]+')) + 1
ORDER BY word ASC;
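One direction that might work is an alternation that tries a whole quoted phrase first and falls back to a run of non-space characters. The following is only a sketch under assumptions: it presumes Oracle 11g or later (for REGEXP_COUNT), balanced double quotes, and no escaped quotes inside a phrase. The `src` subquery and its `str` column are illustrative names, not part of the original query.

```sql
-- Sketch only: assumes Oracle 11g+ (REGEXP_COUNT) and balanced, unescaped double quotes.
WITH src AS (
    -- Hypothetical single-row source holding the sample string
    SELECT 'John went "to the store"' AS str FROM dual
)
SELECT
    -- Match either a complete "quoted phrase" or a run of non-space characters,
    -- then strip the surrounding double quotes from the token
    TRIM(BOTH '"' FROM REGEXP_SUBSTR(str, '"[^"]*"|[^[:space:]]+', 1, LEVEL)) AS word
FROM
    src
-- Generate one row per token by counting matches of the same pattern
CONNECT BY LEVEL <= REGEXP_COUNT(str, '"[^"]*"|[^[:space:]]+')
ORDER BY LEVEL;
```

Ordering by LEVEL keeps the tokens in their original sequence rather than sorting them alphabetically. Note that an empty quoted phrase ("") would come back as NULL after the TRIM, and a quote that opens in the middle of a word would not be treated as the start of a phrase.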
My brain is a little regexp-fried at this point. Thanks in advance!