I am trying the following code in a Jupyter notebook (pandas and pyarrow are both installed via pip):

import pandas as pd

parquet_file = r'C:\Users\Future\Desktop\userdata1.parquet' 
df = pd.read_parquet(parquet_file, engine='auto')
print(df.head())

When I run the code in a Jupyter notebook, the kernel appears to die. I restarted the kernel and tried again, but got the same result. I also put the code in a .py file and ran it from the terminal, but got no output.

The engine is set to 'auto'; I also tried the 'pyarrow' engine.

I have installed Python 3.8.6, pandas 1.1.4, and pyarrow 2.0.0. When I run the code, I get the following error:

 ** On entry to DGEBAL parameter number  3 had an illegal value
 ** On entry to DGEHRD  parameter number  2 had an illegal value
 ** On entry to DORGHR DORGQR parameter number  2 had an illegal value
 ** On entry to DHSEQR parameter number  4 had an illegal value
Traceback (most recent call last):
  File "demo.py", line 1, in <module>
    import pandas as pd
  File "C:\Users\Future\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\__init__.py", line 11, in <module>
    __import__(dependency)
  File "C:\Users\Future\AppData\Local\Programs\Python\Python38\lib\site-packages\numpy\__init__.py", line 305, in <module>
    _win_os_check()
  File "C:\Users\Future\AppData\Local\Programs\Python\Python38\lib\site-packages\numpy\__init__.py", line 302, in _win_os_check
    raise RuntimeError(msg.format(__file__)) from None
RuntimeError: The current Numpy installation ('C:\\Users\\Future\\AppData\\Local\\Programs\\Python\\Python38\\lib\\site-packages\\numpy\\__init__.py') fails to pass a sanity check due to a bug in the windows runtime. See this issue for more information: https://tinyurl.com/y3dm3h86
YasserKhalil

1 Answer

Running

import pandas as pd

parquet_file = r'userdata1.parquet' 
df = pd.read_parquet(parquet_file, engine='auto')
print(df.head())

returns

    registration_dttm  id first_name last_name                     email  \
0 2016-02-03 07:55:29   1     Amanda    Jordan          ajordan0@com.com   
1 2016-02-03 17:04:03   2     Albert   Freeman           afreeman1@is.gd   
2 2016-02-03 01:09:31   3     Evelyn    Morgan   emorgan2@altervista.org   
3 2016-02-03 00:36:21   4     Denise     Riley          driley3@gmpg.org   
4 2016-02-03 05:05:31   5     Carlos     Burns  cburns4@miitbeian.gov.cn   

   gender      ip_address                cc       country  birthdate  \
0  Female     1.197.201.2  6759521864920116     Indonesia   3/8/1971   
1    Male  218.111.175.34                          Canada  1/16/1968   
2  Female    7.161.136.94  6767119071901597        Russia   2/1/1960   
3  Female   140.35.109.83  3576031598965625         China   4/8/1997   
4          169.113.235.40  5602256255204850  South Africa              

      salary                   title comments  
0   49756.53        Internal Auditor    1E+02  
1  150280.17           Accountant IV           
2  144972.51     Structural Engineer           
3   90263.05  Senior Cost Accountant           
4        NaN 

for me, using pyarrow 2.0.0 on Python 3.8.6 with pandas 1.1.4, and df.shape gives (1000, 13).
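
Since the same pandas and pyarrow versions work here, and the traceback in the question stops at `import pandas` inside numpy's own import check (before `read_parquet` is ever reached), it may be worth comparing the installed package versions, numpy included. The following is only a sketch: the helper name `installed_versions` is made up for illustration, and it reads package metadata rather than importing the packages, so it should still run when the import itself fails.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}.

    Hypothetical helper: it only reads the installed distribution
    metadata and never imports the packages themselves.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages involved in the question's traceback and in read_parquet.
for name, version in installed_versions(["numpy", "pandas", "pyarrow"]).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

If the numpy version differs between a working and a failing machine, that is the first thing I would look at, since the error is raised by numpy and not by pandas or pyarrow.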

Paul Brennan