
I want to use regexp_replace or regexp_extract in Hive to keep everything before the last occurrence of the special character "^", dropping the last "^" and the text after it. For example:

  • 101^11 should extract 101
  • 101^1111^232 should extract 101^1111
  • 144^899^890^1 should extract 144^899^890

I have to do this with regex only, not with string functions or a UDF.

ae8

1 Answer


Try with the regexp_extract function.

In hive:

hive> select regexp_extract('101^11',"(.*)\\^",1);
+------+--+
| _c0  |
+------+--+
| 101  |
+------+--+
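The pattern relies on `(.*)` being greedy: it consumes as much as possible, so the trailing `\^` is forced to match the last caret and group 1 captures everything before it. A minimal sketch of the same regex in plain Python (outside Hive) against the question's three inputs:

```python
import re

# Greedy (.*) backtracks only as far as needed for \^ to match,
# so the caret matched is the LAST one in the string.
pattern = re.compile(r"(.*)\^")

def extract_prefix(s):
    m = pattern.match(s)
    return m.group(1) if m else s

print(extract_prefix("101^11"))         # 101
print(extract_prefix("101^1111^232"))   # 101^1111
print(extract_prefix("144^899^890^1"))  # 144^899^890
```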

In Scala:

scala> val hiveContext = new HiveContext(sc)
scala> val df= hiveContext.sql("""select regexp_extract('101^11','(.*)\\\^',1)""");
scala> df.show()
+---+
|_c0|
+---+
|101|
+---+

In Pyspark:

>>> hiveContext=HiveContext(sc)
>>> df=hiveContext.sql("select regexp_extract('101^11','(.*)\\\^',1)")
>>> df.show()
+---+
|_c0|
+---+
|101|
+---+
notNull
    I have tried this example and it was a success. Thanks for the quick help. – ae8 Jun 10 '18 at 12:24
  • 1
    If I try the same query with Spark SQL, no data is returned for the extracted column: DataFrame df = hiveContext.sql("select regexp_extract(eid,'(.*)\\^',1) as eid1, name from employee"); System.out.println("schema " + df.schema()); System.out.println("first record " + df.first()); prints first record[,rakes] – ae8 Jun 10 '18 at 14:15
  • Please enclose with triple quotes ***val df= hiveContext.sql("""select regexp_extract('101^11','(.*)\\^',1)""");*** it should work..!! – notNull Jun 10 '18 at 16:27
    I am trying this in Spark SQL with Java; it shows an error with the triple quotes in Eclipse. – ae8 Jun 11 '18 at 02:33
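The Java problem in the comments is most likely one escaping layer too few rather than the triple quotes themselves: the regex engine must see `(.*)\^`, Hive unescapes backslashes inside SQL string literals, and the host-language literal doubles each backslash once more, so a Java string would need `"(.*)\\\\^"` (this exact Java literal is an assumption; it was not run in Spark here). A Python sketch of how the backslashes collapse layer by layer:

```python
import re

# Host-language literal with four backslashes; the compiler
# collapses it to the SQL text  (.*)\\^
sql_text = "(.*)\\\\^"
assert sql_text == r"(.*)\\^"

# Hive's SQL parser unescapes the literal once more,
# leaving the regex  (.*)\^  for the engine.
hive_regex = sql_text.replace("\\\\", "\\")
assert hive_regex == r"(.*)\^"

# The greedy (.*) then stops at the last caret:
print(re.match(hive_regex, "144^899^890^1").group(1))  # 144^899^890
```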