Need some guidance as I am new to Power BI and Redshift ..
My Raw JSON data is stored in Amazon S3 bucket in the form of .gz files (Each .gz file has multiple rows of JSON data) I wanted to connect Power BI to Amazon s3 Bucket. As of now based on my research I got three ways:
- Amazon S3 is a web service and supports the REST API. We can try to use web data source to get data
Question: Is it possible to unzip the .gz file (inside the S3 bucket or Inside Power BI), extract JSON data from S3 and connect to Power BI
- Importing data from Amazon S3 into Amazon Redshift. Do all data manipulation inside Redshift using SQL workbench. Use Amazon Redshift connector to get data in Power BI
Question 1: Does Redshift Allows Loading .gzzipped JSON data from the S3 bucket? If Yes, is it directly possible or do I have to write any code for it?
Question 2: I have the S3 account, do I have to separately purchase Redshift Account/Space? What is the cost?
- Move data from an AWS S3 bucket to the Azure Data Lake Store via Azure Data Factory, transform the data with Azure Data Lake Analytics (U-SQL), and then output the data to PowerBI
U-SQL recognize GZip compressed files with the file extension .gz and automatically decompress them as the part of the Extraction process. Is this process valid, if my gzipped files contain JSON data rows?
Please let me if there is any other method, also your valuable suggestions on this post.
Thanks in Advance.