0

I have a database, which contains a field 'topic'. This field is a short text, about 200-300 words, I would say.

I need to find a string in this field, but in a particular way: the field must contains the string as specified. It can't be a substring of another word. Such things as

SELECT * FROM table WHERE abstract like '%myString%"

Are not possible, because it would also return something+my_string+something.

SELECT * FROM table WHERE abstract like '_myString_"

Does not work either, because my field could start with myString, and it would not match.

I would love to use REGEXP, but sqlite doesn't seem to support it yet. I read it is possible to tweak it, and use a personal function, but I would need a way to implement it for PyQt, to use it in my application.

NoDataDumpNoContribution
  • 10,591
  • 9
  • 64
  • 104
JPFrancoia
  • 4,866
  • 10
  • 43
  • 73

1 Answers1

0

If I understand correctly, you want to select records in whose abstract your string 'myString' appears as a word delimited by spaces and/or either end of the string. You say otherwise -- "only the string, and nothing else" to quote you -- but if that were indeed the case then obviously WHERE abstract = 'myString' would suffice; hence my inference that you're after something else despite what you say.

So, if I've read your mind correctly, then

WHERE ' ' || abstract || ' ' LIKE '% myString %'

would do the job -- bypassing your worry that "my field could start with myString, and it would not match" by concatenating spaces before and after the actual abstract.

Maybe you're after something more, e.g words in your abstract can also be delimited by (for example) commas; in that case,

' ' || replace(abstract, ',', ' ') || ' ' LIKE '% myString %'

would do the trick. Of course, if there are too many possible punctuation characters that you want to tread this way, this gets cumbersome due to a "cascade" of replace, and that's where the craving for regular expressions gets fierce. And in that case it is indeed possible to augment sqlite with a regexp library (depending on what platform you're on: I don't see what PyQt has to do with the issue), but it can be messy. See How do I use regex in a SQLite query? if that's really the only avenue you can pursue... though Python may make it easier, e.g:

import sqlite3
import re

def resub(pat, sub, s):
return re.sub(pat, sub, s)

con = sqlite3.connect(":memory:")
con.create_function("resub", 3, resub)
cur = con.cursor()
cur.execute("select resub('[,!?]',' ', 'O Romeo, Romeo, wherefore art thou Romeo?')")
print cur.fetchone()[0]

displays "O Romeo Romeo wherefore art thou Romeo " as desired... so maybe it's not quite as messy as with other non-Python uses of sqlite!-)

Community
  • 1
  • 1
Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • You nailed it, thanks. Actually, your answer gave me leads to solve my issue. I created a field abstract_simple (a sluggified version of abstract), and added a blank space at the beginning and at the end. Then I can use '% myString %' with no risk. And you also solve another issue, for the 'authors' field :) – JPFrancoia Feb 01 '15 at 21:07
  • @Rififi, always glad to be of help! – Alex Martelli Feb 01 '15 at 21:13