I'm using Python 3.7.13 and have created a virtual environment (venv) for an MLOps project.
A dvc package (==2.10.2) that is compatible with Python 3.7.13 is installed in this venv.
(venv) (base) tony3@Tonys-MacBook-Pro mlops % dvc…
I want to access Google Cloud Storage as in the code below.
# Amazon S3 connection
import s3fs
from PIL import Image

fs = s3fs.S3FileSystem()
with fs.open("s3://mybucket/image1.jpg") as f:
    image = Image.open(f).convert("RGB")
# Is there equivalent code like this on the GCP side?
with…
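A minimal sketch of the GCS side: gcsfs implements the same fsspec interface as s3fs, so the code ports almost verbatim. The bucket name below comes from the question; GCP credentials are assumed to be configured. The in-memory backend stands in here so the sketch runs without cloud access.

```python
import fsspec

# With gcsfs installed, the GCS equivalent of the s3fs snippet is simply:
#
#     with fsspec.open("gs://mybucket/image1.jpg", "rb") as f:
#         image = Image.open(f).convert("RGB")
#
# The "memory" backend below demonstrates the same open/read pattern offline.
fs = fsspec.filesystem("memory")
fs.makedirs("/mybucket", exist_ok=True)
with fs.open("/mybucket/image1.jpg", "wb") as f:
    f.write(b"fake-image-bytes")

with fs.open("/mybucket/image1.jpg", "rb") as f:
    data = f.read()
print(data)  # b'fake-image-bytes'
```

Because every fsspec backend exposes the same file-like `open()`, swapping `s3://` for `gs://` is usually the only change needed.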
I'm testing this locally, where I have a ~/.aws/config file.
~/.aws/config looks something like:
[profile a]
...
[profile b]
...
I also have an AWS_PROFILE environment variable set to "a".
I would like to read a file which is accessible with…
I'm trying to read a file geodatabase into a GeoDataFrame using the geopandas Python library. The geodatabase file is on S3, so I'm using fsspec to read it in, but I'm getting an error:
import geopandas as gpd
import fsspec
fs =…
I have a several-gigabyte CSV file residing in Azure Data Lake. Using Dask, I can read this file in under a minute, as follows:
>>> import dask.dataframe as dd
>>> adl_path = 'adl://...'
>>> df = dd.read_csv(adl_path, storage_options={...})
>>>…
Recently I got into data analysis with some friends, and to improve our data exchange we set up a Linux server which we use as an SFTP server. Following this, we no longer want to write outputs to our local filesystem and then move them to the SFTP…
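One hedged sketch: fsspec speaks SFTP directly (via paramiko), so outputs can be written straight to the server without touching the local disk first. The host, username, and password below are placeholders; the in-memory backend stands in so the sketch runs without a server.

```python
import fsspec

# With the SFTP server, this would look like (server/user/secret are
# placeholders for your server's details):
#
#     with fsspec.open("sftp://server/data/results.csv", "w",
#                      username="user", password="secret") as f:
#         df.to_csv(f)
#
# The "memory" backend below demonstrates the same write-then-read flow offline.
fs = fsspec.filesystem("memory")
fs.makedirs("/data", exist_ok=True)
with fs.open("/data/results.csv", "w") as f:
    f.write("a,b\n1,2\n")

with fs.open("/data/results.csv", "r") as f:
    content = f.read()
print(content)
```

Libraries that accept file-like objects (pandas, matplotlib, etc.) can write into the `fsspec.open(...)` handle directly, so no intermediate local copy is needed.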
I am working with Google Drive in Python using fsspec to perform various operations like listing and downloading files and directories. However, I have encountered a challenge when dealing with items that share the same name. For example, there…
I want to initialize an fsspec filesystem based on a URL, covering both the protocol and the root directory.
E.g. I could create a filesystem from gcs://my-bucket/prefix that would use my-bucket on GCS, or from file:///tmp/test that would use /tmp/test…
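A sketch of how this can be done with `fsspec.core.url_to_fs`, which resolves a URL into a filesystem instance plus the path within it; wrapping the result in `DirFileSystem` (assuming a reasonably recent fsspec) then pins the root directory. Paths below are from the question's `file:///tmp/test` example.

```python
import fsspec
from fsspec.implementations.dirfs import DirFileSystem
from fsspec.implementations.local import LocalFileSystem

# url_to_fs resolves both pieces at once: the implementation from the
# protocol, and the root path within it. (gcs://my-bucket/prefix works
# the same way once gcsfs is installed.)
fs, root = fsspec.core.url_to_fs("file:///tmp/test")
print(type(fs).__name__, root)  # LocalFileSystem /tmp/test

# To treat the root directory as the filesystem's "/", wrap it:
rooted = DirFileSystem(root, fs=fs)
```

All paths on `rooted` are then interpreted relative to `/tmp/test`, regardless of which backend the URL selected.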
I have a program that, for data-security reasons, should never persist anything to local storage when deployed in the cloud; all input/output must go to the connected (encrypted) storage instead.
To allow deployment locally as…
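One hypothetical pattern for this: a single configured URL decides where all I/O goes, so the same code runs locally against `file://` and in the cloud against the encrypted remote store. `STORAGE_ROOT` and the paths below are illustrative names, not from the original; `memory://` is used so the sketch runs anywhere.

```python
import os
import fsspec

# In the cloud, STORAGE_ROOT would point at the encrypted remote store
# (e.g. "s3://secure-bucket/app"); locally it could be "file:///tmp/app".
# "memory://app" is a stand-in so this sketch needs no credentials.
STORAGE_ROOT = os.environ.get("STORAGE_ROOT", "memory://app")

fs, root = fsspec.core.url_to_fs(STORAGE_ROOT)
fs.makedirs(root, exist_ok=True)
with fs.open(f"{root}/output.txt", "w") as f:
    f.write("written to the configured store, never to local disk")

with fs.open(f"{root}/output.txt", "r") as f:
    result = f.read()
print(result)
```

Since every backend shares the same interface, no code path needs to branch on "local vs. cloud"; only the configured URL changes.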
Given a filepath, how do I obtain the parent directory containing the file using fsspec? The filepath can be on the local filesystem or in cloud storage, which is why fsspec is preferred.
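A sketch of one way to do this (`parent_dir` is a hypothetical helper, not an official fsspec function): split the URL into a filesystem and a path, take the path's dirname, and re-attach the protocol with `unstrip_protocol`.

```python
import posixpath

import fsspec

def parent_dir(filepath: str) -> str:
    """Return the parent directory of a local or remote filepath."""
    fs, path = fsspec.core.url_to_fs(filepath)
    parent = posixpath.dirname(path)   # fsspec paths are posix-style
    return fs.unstrip_protocol(parent)

# "memory://" is used here so the sketch runs offline; s3:// or gcs://
# URLs work the same way once the matching backend is installed.
print(parent_dir("memory://data/sub/file.txt"))
print(parent_dir("/a/b/c.txt"))
```

The exact protocol prefix in the result (e.g. number of slashes) depends on the backend, so comparisons are safest on the path portion.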
I am reading GOES-16 data with xarray directly from S3, without downloading it to the local system. The issue is that I cannot concatenate S3Files. I am retrieving 24 files from S3 and want to read and extract the data from these files for the time range:
This is the…
I want to use s3fs (based on fsspec) to access files on S3, mainly because of two neat features:
local caching of files to disk, with a check for whether files have changed, i.e. a file gets re-downloaded if the local and remote copies differ
file version id support for…
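The caching feature above maps to fsspec's `filecache` wrapper, which keeps whole-file copies in a local directory and, with `check_files=True`, compares the cached copy against the remote before reuse. A sketch, with the in-memory backend standing in for S3 so it runs without AWS access (for S3 it would be `target_protocol="s3"` with s3fs installed):

```python
import tempfile

import fsspec

# Put a "remote" file into the stand-in backend
mem = fsspec.filesystem("memory")
mem.makedirs("/bucket", exist_ok=True)
mem.pipe("/bucket/data.txt", b"hello")

# filecache wraps any backend with a local whole-file cache;
# check_files=True re-downloads when local and remote copies differ.
fs = fsspec.filesystem(
    "filecache",
    target_protocol="memory",
    cache_storage=tempfile.mkdtemp(),
    check_files=True,
)
with fs.open("/bucket/data.txt", "rb") as f:
    cached = f.read()
print(cached)  # b'hello'
```

Subsequent opens of the same path are served from `cache_storage` unless the remote copy has changed.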
I get an error when reading parquet into pandas on Databricks, as follows:
Does anyone have an idea? The following is my Databricks runtime.
My pandas version:
I would like to read the remote zarr store at https://hrrrzarr.s3.amazonaws.com/index.html#sfc/20210208/20210208_00z_anl.zarr/. Info about the zarr store is at https://mesowest.utah.edu/html/hrrr/zarr_documentation/zarrFileVariables.html
I am able…
I'm looking to read a remote zarr store using xarray.open_mfdataset()
I'm getting a zarr.errors.GroupNotFoundError: group not found at path ''. Traceback at the bottom.
import xarray as xr
import s3fs
fs = s3fs.S3FileSystem(anon=True)
uri =…