Impala Column Name Issue

Question

We are facing a problem with the Impala column naming convention, which is unclear to us.

The CDH Impala documentation (http://www.cloudera.com/documentation/archive/impala/2-x/2-0-x/topics/impala_identifiers.html), third bullet point, says: "An identifier must start with an alphabetic character. The remainder can contain any combination of alphanumeric characters and underscores. Quoting the identifier with backticks has no effect on the allowed characters in the name."
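To make the quoted rule concrete, here is a minimal Python sketch that checks a name against it. It encodes only the sentence quoted above, not Impala's actual parser; reading "alphabetic" as ASCII letters is an assumption, and the column names are illustrative.

```python
import re

# Rule as quoted from the Impala 2.0 documentation: the first character is
# alphabetic, the remainder is alphanumeric or underscore.
# Assumption: "alphabetic" is read here as ASCII letters only.
_DOCUMENTED_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def follows_documented_rule(name: str) -> bool:
    """Return True if the name matches the documented identifier rule."""
    return _DOCUMENTED_IDENTIFIER.fullmatch(name) is not None


if __name__ == "__main__":
    # Illustrative column names, not taken from the actual tables.
    for column in ("MATERIAL", "0MATERIAL", "doc_number_2"):
        print(column, follows_documented_rule(column))
```

A check like this only shows which names fall outside the documented rule. It does not explain why Impala itself accepts them while the SDA path rejects them.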

Now, because of a dependency on the upstream SAP systems, we had to give a column a name that starts with the digit zero (0). Impala does not report any semantic error when we define the table or query records from it. But when we connect Impala to SAP HANA through SDA (Smart Data Access), extraction fails for this particular column with the leading zero (0), while it works for the rest of the columns, whose names start with a letter. The error shows as "... ^ Encountered: DECIMAL LITERAL "

I have two points.

If the documentation says an identifier cannot start with anything other than an alphabetic character, how does the Impala query run without any issues?
Why is the error raised only when the data is extracted from SAP HANA?

Any insight would be highly appreciated.

Answer

OK, I can only speak to the SAP HANA side here, so you will have to check the Impala side yourself. The error message you get while accessing an external table via SDA typically comes from the third-party client software, in this case the ODBC driver you use to connect to Impala. So SAP HANA tries to access the table through the Impala ODBC driver, and that driver returns the error message.

I assume that the object name check for Impala is implemented in the client in this case. I am not sure whether the way you run the query in Impala also goes through the driver.

But even if Impala has this naming limitation in place, I fail to see why it would force you to use the same names in SAP HANA. If the upstream data access requires the leading 0, just create a view on top of the table and you're good to go.
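As a rough illustration of the view idea, the Python sketch below builds a CREATE VIEW statement that re-exposes Impala-safe column names under their original leading-zero names. All object and column names are hypothetical, and the sketch assumes that the target database accepts double-quoted identifiers that start with a digit, which should be verified on the actual system.

```python
def quote_ident(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def build_view_sql(view_name: str, table_name: str, column_aliases: dict) -> str:
    """Build a CREATE VIEW statement.

    column_aliases maps the column name in the underlying table to the
    name the view should expose.
    """
    select_list = ", ".join(
        f"{quote_ident(source)} AS {quote_ident(alias)}"
        for source, alias in column_aliases.items()
    )
    return (
        f"CREATE VIEW {quote_ident(view_name)} AS "
        f"SELECT {select_list} FROM {quote_ident(table_name)}"
    )


if __name__ == "__main__":
    # Hypothetical names: VT_SALES stands for the virtual table over Impala.
    print(build_view_sql(
        "V_SALES",
        "VT_SALES",
        {"XMATERIAL": "0MATERIAL", "PLANT": "PLANT"},
    ))
```

With this arrangement the Impala table keeps names that satisfy its documented rule, and only the view on the SAP HANA side carries the leading zero.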

answered Feb 05 '16 at 11:27

Lars Br.


Thanks for responding. I am not sure I follow you correctly. The column starts with 0 because that is how it is named in the SAP systems, which HANA is also linked to. So Impala gets the data from HANA, which already has the leading zero in the column name. We are doing some heavy processing in Big Data, as HANA is not capable of doing so. After processing, we want to get the data back from Impala into HANA, and this is where it fails. – Murari Goswami Feb 08 '16 at 15:32
OK, so first of all, SAP HANA is perfectly capable of doing Big Data analytics - it's merely a matter of the costs that would be incurred. So your source tables have the zero as the first character, so what? As you apparently pump the data out, you can name the target structure whatever you like. Instead of a 0 you could put an X in the target column name. You simply have to adjust the column mapping then. – Lars Br. Feb 09 '16 at 03:07
Thanks for the feedback. Renaming the column is always an option. However, we are very keen to understand the limitations on the naming convention and the synchronization. We need to know whether it is a genuine limitation that HANA cannot read columns starting with a digit in Hive/Impala. If that is the case, we have to architect accordingly. Also, renaming is not a good option for the enterprise, as we would lose the naming synchronization across the different SAP systems and the Big Data apps. – Murari Goswami Feb 16 '16 at 10:30
As you have shown with the link to the Impala documentation, this limitation is due to Impala, not SAP HANA. It seems that Impala is not able to support your enterprise naming approach. – Lars Br. Feb 17 '16 at 01:46
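The renaming with an adjusted column mapping that is suggested in the comments could look roughly like the following Python sketch. The X prefix comes from the comment; the column names and the collision check are illustrative assumptions. Keeping the mapping in one place and inverting it on the way back is one way to limit the loss of naming synchronization raised above.

```python
def to_impala_name(source_name: str, prefix: str = "X") -> str:
    """Replace a leading 0 with the prefix; leave other names unchanged."""
    if source_name.startswith("0"):
        return prefix + source_name[1:]
    return source_name


def build_mapping(source_columns):
    """Map each source column name to an Impala-safe target name."""
    mapping = {name: to_impala_name(name) for name in source_columns}
    # Guard against two source names collapsing to the same target name,
    # for example "0ABC" and an existing "XABC".
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("renaming produced duplicate target column names")
    return mapping


def invert_mapping(mapping):
    """Reverse the mapping to restore the original names on the way back."""
    return {target: source for source, target in mapping.items()}


if __name__ == "__main__":
    # Illustrative column names, not taken from the actual tables.
    forward = build_mapping(["0MATERIAL", "0PLANT", "QUANTITY"])
    print(forward)
    print(invert_mapping(forward))
```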
