13

Basically i would like to return rows based on one column value.

If the column contains non numeric values, then return those rows from a hive table.

Any UDF is available in Hive?

leftjoin
  • 36,950
  • 8
  • 57
  • 116
Shankar
  • 8,529
  • 26
  • 90
  • 159

2 Answers2

21

I believe Hive supports rlike (regular expressions). So, you can do:

where col rlike '[^0-9]'

This looks for any non-digit character. You can expand this, if your numeric values might have decimal points or commas.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • an extra like for 'rlike' – Ash Mar 08 '18 at 02:01
  • 1
    The above answer is wrong ,if white space is there in between. Try the below , u can see wrong results select if(substring(trim("10 10"),1,4) rlike '[^0-9]',"NUMERIC","STRING") as result from limit 1; – vinu.m.19 Oct 15 '18 at 18:16
  • 1
    @vinu.m.19 . . . Spaces are non-numeric in all databases I'm familiar with. Of course, if you want to know if there are digits *and* spaces, then the regular expression would need to be adapted. – Gordon Linoff Oct 15 '18 at 18:34
15

Use cast(expr as <type>). A null is returned if the conversion does not succeed.

case when cast(col as double) is null then 'N' else 'Y' end as isNumber 

or simply use Boolean expression in the WHERE: cast(col as double) is not null

Also you can create isNumber macro:

create temporary macro isNumber(s string)
       cast(s as double) is not null;

And use it in your queries:

hive> select isNumber('100.100'), isNumber('100'), isNumber('.0'), isNumber('abc');
OK
_c0     _c1     _c2     _c3
true    true    true    false

If you need to check for Integer then use cast(s as Int)

This approach works correctly with negative and fractional numbers.

leftjoin
  • 36,950
  • 8
  • 57
  • 116