How to find values that contain less than three words in SQL?

Question

How to select strings with less than three words?

For example:

id keyword
1  my name is john
2  im mike
3  david cameron
4  electra
5  goliath is my name
6  michael jordan

So i want to select the keyword column with less than 3 word count, it should be :

im mike
david cameron
electra
michael jordan

What database are you using (oracle, sql server, mysql, etc. )? — Brian DeMilia, Sep 25 '14 at 00:59
possible duplicate of [count the number of spaces in values in sql server](http://stackoverflow.com/questions/21454684/count-the-number-of-spaces-in-values-in-sql-server) — radar, Sep 25 '14 at 01:14

score 2 · Answer 1 · answered Sep 25 '14 at 01:02

2

Try:

select *
  from table_name
 where length(keyword) - length(replace(keyword, ' ', '')) + 1 < 3

This takes the length of the whole string minus the length of the string without spaces (calculating the # of spaces), and then adds one (ie. if a string has one space that means there are 2 words not 1 word), and filters out any rows where the calculation is not < 3

answered Sep 25 '14 at 01:02

Brian DeMilia

13,103
1
23
33

1

This will be thrown off if words are separated by multiple spaces. – user2864740 Sep 25 '14 at 01:07
@user2864740 that is true but I'm not sure that it's important, will wait until the OP says he has keywords split w/ multiple spaces – Brian DeMilia Sep 25 '14 at 01:10

score 1 · Answer 2 · answered Sep 25 '14 at 02:01

Using Jeff Moden's 8K Tally-Ho CSV splitter

create function [dbo].[DelimitedSplit8K](
   @pString varchar(8000) --WARNING!!! DO NOT USE MAX DATA-TYPES HERE!  IT WILL KILL PERFORMANCE!
  ,@pDelimiter char(1)
)
returns table with schemabinding as
return
  with E1(N) AS ( 
    select N from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1) )E1(N)
  ),                                             --10E+1 or 10 rows
  E2(N) as (select 1 from E1 a cross join E1 b), --10E+2 or 100 rows
  E4(N) as (select 1 from E2 a cross join E2 b), --10E+4 or 10,000 rows max
  cteTally(N) as (
    select top (isnull(datalength(@pString),0))
      row_number() over (order by (select null)) 
    from E4
  ),
  start(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
    select 1 union all
    select t.N+1 from cteTally t where substring(@pString,t.N,1) = @pDelimiter
  ),
  Len(N1,L1) as(--==== Return start and length (for use in substring)
    select s.N1,
        isnull(nullif(charindex(@pDelimiter,@pString,s.N1),0) - s.N1,8000)
    from start s
  )
  --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
  select
    ItemNumber = row_number() over(order by l.N1),
    Item       = substring(@pString, l.N1, l.L1)
  from Len l
;
go

and sample data as suggested in a CTE

with 
data as (
  select id,keyword from (values
     --id   keyword
     (1,  'my name is john')
    ,(2,  'im mike')
    ,(3,  'david cameron')
    ,(4,  'electra')
    ,(5,  'goliath is my name')
    ,(6,  'michael jordan')
    ,(7,  'michael   jordan')
  )data(id,keyword)
),

this SQL

strings as (
  select
     data.id
    ,keyword
    ,s.Item
    ,s.ItemNumber
  from data
  cross apply dbo.DelimitedSplit8K(data.keyword,' ') as s
)

select
    strings.id
from strings
where len(strings.item) > 0  -- remove zero-length string and thus doubled-up delimters
group by strings.id
having count(itemnumber) <=3;

yields as desired:

id
-----------
2
3
4
6
7

Note that the doubled-up spaces in id=7 are compressed by the removal of zero-length strings.

Most of it is actually Jeff's - but mine's no good without his. ;-) — Pieter Geerkens, Sep 25 '14 at 22:22

How to find values that contain less than three words in SQL?

2 Answers2