I am using the Python 3.6 interpreter in my PyCharm venv, and I am trying to convert a CSV file to Parquet.

import pandas as pd    
df = pd.read_csv('/parquet/drivers.csv')
df.to_parquet('output.parquet')

Error-1 ImportError: Unable to find a usable engine; tried using: 'pyarrow', 'fastparquet'. pyarrow or fastparquet is required for parquet support

Solution-1 Installed fastparquet 0.2.1

Error-2
File "/Users/python parquet/venv/lib/python3.6/site-packages/fastparquet/compression.py", line 131, in compress_data
    (algorithm, sorted(compressions)))
RuntimeError: Compression 'snappy' not available. Options: ['GZIP', 'UNCOMPRESSED']

I installed python-snappy 0.5.3 but I am still getting the same error. Do I need to install any other library?

If I use PyArrow 0.12.0 engine, I don't experience the issue.

rpanai
Himalay Majumdar

2 Answers

In fastparquet, snappy compression is an optional feature, which is why the error lists only GZIP and UNCOMPRESSED as available options.

To quickly check a conversion from CSV to Parquet, you can run the following script (it only requires pandas and fastparquet):

import pandas as pd
from fastparquet import ParquetFile

# Build a small test frame
df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": ["a", "b", "c", "d"]})
# df.head()  # Test your initial value
df.to_csv("/tmp/test_csv", index=False)
df_csv = pd.read_csv("/tmp/test_csv")
df_csv.head()  # Test your intermediate value
# Pin the engine so pandas does not pick pyarrow; GZIP needs no extra library
df_csv.to_parquet("/tmp/test_parquet", engine="fastparquet", compression="GZIP")
df_parquet = ParquetFile("/tmp/test_parquet").to_pandas()
df_parquet.head()  # Test your final value

However, if you need to write or read using snappy compression, you might follow this answer about installing the snappy library on Ubuntu.
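
Until snappy is working, one way to avoid the RuntimeError is to fall back to GZIP, which the error message lists as available. The following is a minimal sketch, not fastparquet's own logic: the helper name `choose_compression` and its parameters are hypothetical, and it assumes that an importable `snappy` module (the module python-snappy provides) is what older fastparquet releases such as 0.2.1 look for. That assumption may not hold for newer releases.

```python
import importlib


def choose_compression(preferred="SNAPPY", fallback="GZIP", module="snappy"):
    """Return `preferred` if `module` can be imported, otherwise `fallback`.

    Hypothetical helper: it only checks that the Python module imports,
    which is an assumption about what the Parquet engine requires.
    """
    try:
        importlib.import_module(module)
    except ImportError:
        # Module is missing or its native library failed to load
        return fallback
    return preferred
```

The result can then be passed along, for example df.to_parquet('output.parquet', engine='fastparquet', compression=choose_compression()).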

MarcosBernal

I've used the following versions: Python 3.10.9, fastparquet==2022.12.0, pandas==1.5.2

This code works seamlessly for me:

import pandas as pd

df = pd.read_csv('/parquet/drivers.csv')
df.to_parquet('output.parquet', engine="fastparquet")

I'd recommend you move away from Python 3.6, as it has reached end of life and is no longer supported.
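
To compare your own environment with the versions listed above, something like the following sketch may help. The function name `installed_versions` and the default package list are illustrative assumptions, and `importlib.metadata` requires Python 3.8 or later, so this will not run on 3.6 itself.

```python
import sys
from importlib import metadata  # standard library from Python 3.8


def installed_versions(packages=("pandas", "fastparquet", "pyarrow", "python-snappy")):
    """Map 'python' and each distribution name to its version (None if absent)."""
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Distribution is not installed in this environment
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```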

Azan