Questions tagged [azure-data-lake-gen2]

Ask questions related to Azure Data Lake Storage Gen2.

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage.

669 questions
25
votes
3 answers

Parquet vs Delta format in Azure Data Lake Gen 2 store

I am importing fact and dimension tables from SQL Server to Azure Data Lake Gen 2. Should I save the data as "Parquet" or "Delta" if I am going to wrangle the tables to create a dataset useful for running ML models on Azure Databricks? What is the…
10
votes
3 answers

Refresh Power BI data with an additional column

I have built a Power BI dashboard with Data Lake Gen2 as the data source. I am trying to add a new column to my original data source. How can I refresh from the Power BI side without issues, and what is the best way to do it?
Ajju Bajju
  • 105
  • 1
  • 1
  • 4
9
votes
1 answer

Processing upserts on a large number of partitions is not fast enough

The Problem: We have a Delta Lake setup on top of ADLS Gen2 with the following tables: bronze.DeviceData, partitioned by arrival date (Partition_Date); silver.DeviceData, partitioned by event date and hour (Partition_Date and Partition_Hour). We…
9
votes
1 answer

How to connect AMLS to ADLS Gen 2?

I would like to register a dataset from ADLS Gen2 in my Azure Machine Learning workspace (azureml-core==1.12.0). Given that service principal information is not required in the Python SDK documentation for .register_azure_data_lake_gen2(), I…
6
votes
3 answers

Reading Excel file from Azure Databricks

I am trying to read an Excel file (.xlsx) from Azure Databricks; the file is in ADLS Gen 2. Example: srcPathforParquet = "wasbs://hyxxxx@xxxxdatalakedev.blob.core.windows.net//1_Raw//abc.parquet" srcPathforExcel =…
Sreedhar
  • 29,307
  • 34
  • 118
  • 188
5
votes
2 answers

Display image in Databricks notebook error

I am creating a Databricks notebook template with the company logo. Using the code below to display the image throws an error. Code: %md Error: HTTP ERROR 403: Invalid or missing CSRF token. Please guide…
ITHelpGuy
  • 969
  • 1
  • 15
  • 34
5
votes
2 answers

java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem not found

I am new to the world of Spark and Kubernetes. I built a Spark docker image using the official Spark 3.0.1 bundled with Hadoop 3.2 using the docker-image-tool.sh utility. I have also created another docker image for Jupyter notebook and am trying to…
5
votes
2 answers

Transfer from ADLS2 to Compute Target very slow Azure Machine Learning

During a training script executed on a compute target, we're trying to download a registered Dataset from an ADLS2 Datastore. The problem is that it takes hours to download ~1.5 GB (split into ~8500 files) to the compute target with the following…
4
votes
1 answer

Azure Databricks - Resolve : User does not have permission SELECT on any file error stopping from executing 'save'

We have two different Azure cloud resource groups, RG1 and RG2, where RG1 hosts the ADB_source of the data source, and RG2 hosts the ADB_sink & ADLS_sink(gen2) of the data sink. Use Case: We have a few delta tables in ADB_source (ACL enabled) where…
vvk24
  • 470
  • 5
  • 18
4
votes
1 answer

Unable to Query Serverless Pool View in Azure Synapse using SQL Admin Credentials

I have set up a Serverless SQL pool in Azure Synapse that is querying a view I had set up of a linked Azure Data Lake. CREATE VIEW DeviceTelemetryView AS SELECT corporationid, deviceid, version, Convert(datetime, dateTimestamp, 126) AS…
4
votes
3 answers

Azure Databricks cluster doesn't have access to mounted ADLS2

I followed the documentation azure-datalake-gen2-sp-access and mounted an ADLS2 storage in Databricks, but when I try to view data from the GUI I get the following error: Cluster easy-matches-cluster-001 does not have the proper credentials to view the…
Andrés Bustamante
  • 442
  • 1
  • 4
  • 15
4
votes
1 answer

Not able to see 'Lifecycle management' option for ADLS Gen2

I have created an ADLS (Azure Data Lake Storage) Gen2 resource (StorageV2 with hierarchical namespace enabled). The region I created the resource in is Central US, the performance/access tier is Standard/Hot, and replication is LRS. But for this…
3
votes
3 answers

Azure Storage connection to Data Lake Gen2 in SSIS using access key

Testing the connection manager's connection to Azure Storage in SSIS using an access key succeeded. However, copying data using the Flexible File Task in SSIS throws an error: "[Flexible File Task] Error: Could not load file or assembly…
3
votes
0 answers

How to read a Delta table inside Azure Functions using Python

I'm currently working on Azure Functions, where I need to read a Delta table from ADLS Gen2 directly. Is there any way to do this, such as with the Azure SDKs or other alternatives?
3
votes
2 answers

Azure ADLS Gen2 file created by Azure Databricks doesn't inherit ACL

I have a Databricks notebook that writes a dataframe to a file in ADLS Gen2 storage. It creates a temp folder, outputs the file, and then copies that file to a permanent folder. For some reason the file doesn't inherit the ACL correctly. The…
Bee_Riii
  • 814
  • 8
  • 26