I have an Oracle query that fetches 25 million records. There is no primary key, and no column is distributed evenly enough to use as the split-by column, so I thought of generating a sequence number with ROW_NUMBER() OVER () AS RANGEGROUP. But when I use this derived column as the split-by column, the job fails with the following error:
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
    Caused by: java.sql.SQLSyntaxErrorException: ORA-00904: "P"."RANGEGROUP": invalid identifier
    at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:91)
I am giving the alias correctly, and I also tried the derived column without an alias, but I still get the same error. Can derived columns be used with Sqoop's split-by, or must the column physically exist in the table?
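
For context, here is a minimal sketch of the kind of import described above. The JDBC URL, user, table name (some_table), mapper count, and target directory are hypothetical placeholders, not taken from the actual job. ORDER BY NULL is added inside OVER () on the assumption that Oracle expects an ORDER BY for ROW_NUMBER(). The script only prints the command rather than running it:

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
JDBC_URL='jdbc:oracle:thin:@//dbhost:1521/ORCL'
DB_USER='scott'
TARGET_DIR='/tmp/sqoop_rangegroup'

# Free-form query with a derived sequence column aliased as RANGEGROUP.
# $CONDITIONS is a literal token that Sqoop replaces for each mapper,
# so the query is single-quoted to stop the shell from expanding it.
# ORDER BY NULL is an assumption; the original post used OVER ().
QUERY='SELECT p.*, ROW_NUMBER() OVER (ORDER BY NULL) AS RANGEGROUP FROM some_table p WHERE $CONDITIONS'

# Print the full command for inspection instead of executing it.
print_sqoop_cmd() {
  printf "sqoop import --connect '%s' --username %s -P --query '%s' --split-by RANGEGROUP --num-mappers 8 --target-dir %s\n" \
    "$JDBC_URL" "$DB_USER" "$QUERY" "$TARGET_DIR"
}

print_sqoop_cmd
```

On a host with Sqoop installed, the printed command can be pasted into a shell. The single quotes around the query keep $CONDITIONS literal so that Sqoop, not the shell, substitutes it.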