-2

I have a DB, which holds words with position identifier. For instance the following string:

The Quick Brown Fox Jumps Over The Lazy Dog

The position identifiers would be:

  1. The
  2. Quick
  3. Brown
  4. Fox
  5. Jumps
  6. Over
  7. The
  8. Lazy
  9. Dog

What I would like to accomplish is that I select a string where Fox is followed directly by Jumps. I use the following MySQL syntax:

SELECT PositionIdentifier 
FROM TableName 
WHERE Word='Fox' 
   OR PositionIdentifier + 1 IN 
       (SELECT PositionIdentifier 
        FORM TableName 
        WHERE Word='Jumps'
       )

However, the plus one statement is not working as expected.

How could I achieve this?

Update:

Oke, to clearify. This is my table:

StringID    Word     Counter     Positions

1           Fox      2           6|35

1           Jumps    4           7|12|36|46

1           Over     3           8|37|41

So, from this tabel I would like to retrieve the Positions where Over follows Jumps which follows Fox. So in this example, I would like to retrieve 6,7 and 8 as a pair and 35, 36 and 37 as a pair of positions

I hope this is more clear. Thx

TVA van Hesteren
  • 1,031
  • 3
  • 20
  • 47

3 Answers3

1

Give it a try like this

SELECT PositionIdentifier 
FROM TableName 
WHERE Word='Fox' 
   AND PositionIdentifier IN
       (SELECT (PositionIdentifier -1 ) AS PrevPosID
        FROM TableName 
        WHERE Word='Jumps'
       )

assuming that your PositionIdentifier column is an INT otherwhise you might have to CAST the substraction result

Thomas G
  • 9,886
  • 7
  • 28
  • 41
0

Do a self-join on the positions table using positionidentifier+1 as the join condition to get the strings where jumps follows fox. I assume that your positions table has the following 3 fields:

  • string_id: identifier of a string that has words. Words with a common identifier for the string. If you simply join on the PositionIdentifier field, then your results may be incorrect, since multiple strings may contain these 2 words.

  • PositionIdentifier: identifies a word's position within a string

  • word: the actual word

I would use the following query:

select p1.string_id, p1.PositionIdentifier
from positions p1
inner join positions p2 on p1.string_id=p2.string_id and p1.PositionIdentifier+1=p2.PositionIdentifier
where p1.Word='Fox' and p2.Word='Jumps'

Alternatively, you can use an exists() subquery as well, if you do not need any details from the 'jumps' record.

Shadow
  • 33,525
  • 10
  • 51
  • 64
  • Well, I would also like to have the PositionIdentifier of the 2nd word. And what if I don't want to find Fox Jumps but Fox Jumps Over The. The Query would be growing rapidly, and I need all PositionIdentifiers returned – TVA van Hesteren Feb 16 '17 at 19:47
  • @TVAvanHesteren then you should have described these criteria in your question, not in a subsequent comment to an answer. Btw, just add p2.PositionIdentifier to the select list and you have your wish. – Shadow Feb 16 '17 at 19:56
  • yes I understand that. Is it also possible to make SQL select the position from a | symbol separated string? For instance: positions are : |6|58|74|97| ? – TVA van Hesteren Feb 16 '17 at 22:34
  • Pls ask your new question appropriately in its own question on SO. – Shadow Feb 16 '17 at 22:56
  • Normalise your data structure. – Shadow Feb 17 '17 at 09:29
  • well how would you store it? I can't give it a table reference and give each reference from a string it's own position of that word, because I have millions of words in my table. – TVA van Hesteren Feb 17 '17 at 09:36
  • That's exactly what you need to do. Probably you can reduce the number of records by having a separate word table where you store each word only once. In your positions table you reference the word id, not the actual word. And now you have your own fulltext index. – Shadow Feb 17 '17 at 09:45
  • I understand, I had exactly that storing Word and WordID in Words table with the ID and positions stored separately in the WordPositions table. However, I have many words will this affect speed? I thought this approach (stated above) would be faster but I'm not sure.. :) – TVA van Hesteren Feb 17 '17 at 09:48
  • See the following SO topic on this: http://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad – Shadow Feb 17 '17 at 11:51
0

I tried your sql and it looks like it's working fine? Other than the fact that FROM was misspelled (FORM) and it looks like you want "AND" instead of "OR" for what you want.

SELECT PositionIdentifier FROM TableName WHERE Word='Fox' OR PositionIdentifier+1 IN (SELECT PositionIdentifier FROM TableName WHERE Word='Jumps')

Could you clarify what you mean by "the plus one statement is not working as expected."?

ndev
  • 1
  • 2