My question is how to properly use SerDeProperties to parse the lines below. I have tried multiple variations, and my tables keep getting filled with NULL values. Below are the SerDe definition and the sample data. From my understanding, ([^\s]*) should be anything before (^) whitespace (\s), matching 0 or more characters (*). Likewise, the next regex, ([^\n]*), should put everything before the line return into the next column.
My intent is to divide the numbers into one column and everything else into another column. What is wrong with my interpretation of the SerDe?
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "([^\s]*) ([^\n]*)");
1134999 06Crazy Life
6821360 Pang Nakarin
10113088 Terfel, Bartoli- Mozart: Don
10151459 The Flaming Sidebur
6826647 Bodenstandig 3000
10186265 Jota Quest e Ivete Sangalo
6828986 Toto_XX (1977
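One way to check the pattern on its own, outside Hive, is a small script like the sketch below. It rests on several assumptions: that Python's re module treats this particular pattern the same way as the Java regex engine behind RegexSerDe, that the SerDe requires the whole line to match, and that the fields are separated by a single space as they appear above. The helper name split_line is made up for illustration.

```python
import re

# Pattern copied from the SERDEPROPERTIES above, written as a raw string
# so that Python passes the backslashes to the regex engine unchanged.
PATTERN = re.compile(r"([^\s]*) ([^\n]*)")

# A few of the sample lines, assuming a single space between the fields.
SAMPLE_LINES = [
    "1134999 06Crazy Life",
    "6821360 Pang Nakarin",
    "10113088 Terfel, Bartoli- Mozart: Don",
]


def split_line(line):
    """Return the captured groups if the whole line matches, else None."""
    match = PATTERN.fullmatch(line)
    return match.groups() if match else None


for line in SAMPLE_LINES:
    print(split_line(line))
```

If every line splits cleanly here but Hive still returns NULLs, the pattern logic is probably not the problem, and the difference is more likely in what actually reaches the regex engine: for example, how the backslashes in the quoted input.regex string are handled, or whether the real file uses a tab rather than a space as the delimiter (a tab-separated line returns None in this sketch).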