-1

I need to remove words having specific length from a column. Here in my case I need to remove words having length <=2. I have total 1 million records.

+---------------------------+------------------------+
| Input                     | Output                 |
+---------------------------+------------------------+
| Steel rod 10 X 15 MM      | Steel rod              |
+---------------------------+------------------------+
| Syringe SM B 2.5 ML       | Syringe 2.5            |
+---------------------------+------------------------+
| 3.5 ML Syringe Disposable | 3.5 Syringe Disposable |
+---------------------------+------------------------+
| Syringe Disposable 2.5 ML | Syringe Disposable 2.5 |
+---------------------------+------------------------+

Even I don't need numbers as well from my Input Description column. I have the function to remove numbers. Please advise.

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
Shiva
  • 432
  • 8
  • 20

1 Answers1

1

SQL Server has very poor string processing functions. This might be something that you want to do when you load the data into the database.

One solution within the database is a recursive CTE:

with cte as (
      select input, convert(varchar(max), input + ' ') as rest, convert(varchar(max), '') as output, 1 as lev
      from t
      union all
      select input,
             stuff(rest, 1, charindex(' ', rest), ''),
             (case when charindex(' ', rest) <= 3 then output
                   else output + left(rest, charindex(' ', rest))
              end),
             lev + 1
      from cte
      where rest <> ''
     )
select input, output
from (select cte.*, max(lev) over (partition by input) as max_lev
      from cte
     ) cte
where lev = max_lev;

Here is a db<>fiddle.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786