We have an HDInsight cluster in our setup, and we store our data in Hive tables (the data lives in ADLS as external tables, the metadata lives in an external metastore, and both are accessed through the Hive service on our Azure cluster). What is the best way to share this data with other Azure clusters, not necessarily within the same subscription?
Azure has the concept of service principals, so we would need to set up ACLs that allow the other cluster's service principal to access the ADLS folders backing the Hive tables we share. Additionally, how can the other Azure instances use our cluster's HiveServer2 URL as a JDBC connection so that they can query the data? What cluster login should we provision for them so they can query our Hive tables through our HiveServer2?
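To make the two pieces concrete, here is a rough sketch of what I have in mind. It only builds strings and does not call any Azure API. The cluster name and object ID are placeholders, the helper names are made up for this post, and the URL shape is the HTTPS-on-port-443 form that HDInsight documents for HiveServer2, as far as I understand it.

```python
def hive_jdbc_url(cluster_name, database="default"):
    """Build the HTTPS JDBC URL for HiveServer2 on an HDInsight cluster.

    Assumes the public gateway endpoint <cluster>.azurehdinsight.net on port 443.
    """
    return (
        f"jdbc:hive2://{cluster_name}.azurehdinsight.net:443/{database}"
        ";transportMode=http;ssl=true;httpPath=/hive2"
    )


def read_acl_entries(object_id):
    """Return POSIX-style ACL entries granting read and traverse access.

    object_id is the (placeholder) object ID of the other cluster's
    service principal.
    """
    return [
        f"user:{object_id}:r-x",          # access ACL on existing folders and files
        f"default:user:{object_id}:r-x",  # default ACL inherited by new children
    ]


if __name__ == "__main__":
    # "mycluster" and the object ID below are hypothetical values.
    print(hive_jdbc_url("mycluster"))
    for entry in read_acl_entries("00000000-0000-0000-0000-000000000000"):
        print(entry)
```

The ACL entries would have to be applied to each shared table folder, and the service principal would also need execute permission on every parent folder up to the root, which is part of why this feels clumsy to manage by hand.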
I understand the right way to do this would be to use HDInsight's Enterprise Security Package (ESP), but that is apparently a costly choice.
Giving them access only to the ADLS folders also seems wrong, since the metastore metadata would then not be used to access the data...