In our project, we move data from RDBMS tables to HDFS using Scala and Spark. Before moving the data, we apply nested regexp_replace calls to it to eliminate some discrepancies. Below is the expression:
regexp_replace(
  regexp_replace(
    regexp_replace(
      regexp_replace(
        regexp_replace(..., E'[\\n]+', ' ', 'g'),
        E'[\\r]+', ' ', 'g'
      ),
      E'[\\t]+', ' ', 'g'
    ),
    E'[\\cA]+', ' ', 'g'
  ),
  E'[\\ca]+', ' ', 'g'
)
What is the meaning of the E that precedes the single-quoted strings in each regexp_replace call?
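For context, here is a minimal Scala sketch of what I understand the nested calls above to do to a single value: replace each run of newlines, carriage returns, tabs, or Ctrl-A characters with one space. The object name `CleanupSketch` and the method `clean` are hypothetical and not part of our codebase, and I am assuming that both `\cA` and `\ca` denote the Ctrl-A character (code point 1) on the database side, so the sketch uses `\x01` once for both.

```scala
// Hypothetical illustration only: mimics the nested regexp_replace calls
// on one string value using java.util.regex via String.replaceAll.
object CleanupSketch {
  // Each pattern matches one or more consecutive occurrences of one character.
  private val patterns: Seq[String] = Seq(
    "[\\n]+",   // runs of newlines
    "[\\r]+",   // runs of carriage returns
    "[\\t]+",   // runs of tabs
    "[\\x01]+"  // runs of Ctrl-A (assumed meaning of \cA and \ca)
  )

  // Apply the patterns in the same order as the nested calls,
  // replacing every match with a single space (the 'g' flag behaviour).
  def clean(value: String): String =
    patterns.foldLeft(value)((acc, pattern) => acc.replaceAll(pattern, " "))

  def main(args: Array[String]): Unit = {
    val raw = "a\n\nb\tc\r" + 1.toChar + "d"
    println(clean(raw))
  }
}
```

Because each character class is replaced in its own pass, adjacent runs of different characters (for example a carriage return followed by Ctrl-A) each become their own space rather than collapsing into one.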