
I am reading a file from Google Cloud Storage in Google Datalab. The contents end up in a variable, but I need to convert that variable to a pandas DataFrame.

This is how I read the file:

%%gcs read --object $objeto1 --variable prueba

The variable prueba looks like:

1/1/2016 08:35:56,1,4756798,"7501073831988",1.00,15.00,0.16,"S0394",4388,2,10.43\r\n1,1/1/2016 08:35:56,1,4756798,"850697002395",1.00,13.50,0.00,"S0394",4388,2,10.36\r\n1,1/1/2016 08:35:56,1,4756798,"850697002425",1.00,10.00,0.00,"S0394",4388,2,7.29\r\n1,1/1/2016 08:38:55,2,1013642,"8469760102003",1.00,200.00,0.16,"C0278",2595,1,161.20\r\n

Any help please?

  • When I run a BigQuery query, for example df = bq.Query('select * from tabla').to_dataframe(), that is enough to get a pandas DataFrame. But when I try the same thing on the variable read from storage, I get: AttributeError: 'str' object has no attribute 'to_dataframe' – Jaime Hernandez Apr 04 '17 at 22:09
  • Wrap your variable in StringIO as shown here: https://stackoverflow.com/questions/37990467/how-can-i-load-my-csv-from-google-datalab-to-a-pandas-data-frame – Tautvydas Jun 27 '17 at 10:17

1 Answer


I would suggest copying the file from GCS onto your Datalab machine and then reading it with pandas:

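import pandas as pd

def read_csv_from_gcs(gcs_path, csv_file_name):
    # Copy the object into the current directory; gcs_path must end with '/'.
    get_ipython().system(u'gsutil cp ' + gcs_path + csv_file_name + ' .')
    df = pd.read_csv(csv_file_name)
    return df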
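
If the data is already in a variable (as with %%gcs read ... --variable prueba), another option is the one suggested in the comment on the question: wrap the text in StringIO and hand it to pd.read_csv. The following is only a minimal sketch. The helper name csv_text_to_dataframe is made up for illustration, the sample rows are copied from the question, and header=None is an assumption based on the sample having no header row.

```python
import io

import pandas as pd


def csv_text_to_dataframe(raw, **read_csv_kwargs):
    """Parse CSV content that is already held in memory (hypothetical helper)."""
    # The variable may hold bytes rather than text, depending on the Python
    # version; UTF-8 is assumed here.
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    # StringIO makes the string look like a file, which read_csv accepts.
    return pd.read_csv(io.StringIO(raw), **read_csv_kwargs)


# Two rows taken from the sample in the question (no header row).
sample = (
    '1,1/1/2016 08:35:56,1,4756798,"850697002395",1.00,13.50,0.00,"S0394",4388,2,10.36\r\n'
    '1,1/1/2016 08:35:56,1,4756798,"850697002425",1.00,10.00,0.00,"S0394",4388,2,7.29\r\n'
)

df = csv_text_to_dataframe(sample, header=None)
```

In the notebook this would be called as csv_text_to_dataframe(prueba, header=None). Because the file has no header, pandas labels the columns 0 through 11; pass names=[...] if you know what each column means.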