0

I have some input information where I'm trying to remove the part .0 from my input where an ID string ends with .0.

select student_id, regexp_replace(student_id, '.0','') from school_result.credit_records where student_id like '%.0';

Input:

01-0230984.03
12345098.0
34567.0

Expected output:

01-0230984.03 
12345098
34567

But the result I'm getting is as follows: It's removing any character having with a 0 next to it instead of removing only the occurrences that end with .0

0129843
123498
34567

What am I doing wrong? Can someone please help?

leftjoin
  • 36,950
  • 8
  • 57
  • 116
LearneR
  • 2,351
  • 3
  • 26
  • 50

1 Answers1

0

Dot in regexp has special meaning (it means any character). If you need dot (.) literally, it should be shielded using double-slash (in Hive). Also add end-of-the-line anchor($):

with mydata as (
select stack(3,
'01-0230984.03',
'12345098.0',
'34567.0'
) as str
)

select regexp_replace(str,'\\.0$','') from mydata;

Result:

01-0230984.03
12345098
34567

Regexp '\\.0$' means dot zero (.0) literally, end of the line ($).

leftjoin
  • 36,950
  • 8
  • 57
  • 116
  • Just wanted to know: If I want to specify something to be removed at the beginning of the line instead of the end, then what should I specify in place of `$` ?? – LearneR Sep 11 '19 at 13:55
  • 1
    @LearneR beginning anchor is `^` Like this: '^0' - means zero at the beginning – leftjoin Sep 11 '19 at 13:56