
I am trying to read 6 files into 7 different data frames, but I can't figure out how to do that. The file names can be completely arbitrary: I know which files I need, but they do not follow a pattern like data1.csv, data2.csv.

I tried using something like this:

import sys
import os
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
f1='Norway.csv'
f='Canada.csv'
f='Chile.csv'

Norway = pd.read_csv(Norway.csv)
Canada = pd.read_csv(Canada.csv)
Chile = pd.read_csv(Chile.csv )

I need to read multiple files into different dataframes. It works fine when I do it with one file, like this:

file='Norway.csv'
Norway = pd.read_csv(file)

But with the code above I am getting this error:

NameError: name 'norway' is not defined
Abhinav Kumar

2 Answers

3

You can read all the .csv files into one single dataframe.

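import pandas as pd

all_files = ['Norway.csv', 'Canada.csv', 'Chile.csv']  # the files you want to read
dfs = []  # collects one dataframe per file
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    dfs.append(df)

# concatenate all dfs into one
big_df = pd.concat(dfs, ignore_index=True)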

Then split the large dataframe into multiple dataframes (7 in your case). For example:

import numpy as np
num_chunks = 3  
df1,df2,df3 = np.array_split(big_df,num_chunks)
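
Note that splitting by chunk count divides the rows evenly, so the pieces only match the original files if every file has the same number of rows. A minimal sketch of one way to keep the file boundaries, assuming small made-up frames in place of the real CSV data (the column names and values here are hypothetical): passing `keys` to `pd.concat` labels each block of rows with its source, so each one can be recovered exactly.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the frames read from the CSV files.
norway = pd.DataFrame({"year": [2018, 2019, 2020], "value": [1, 2, 3]})
canada = pd.DataFrame({"year": [2018, 2019], "value": [4, 5]})
chile = pd.DataFrame({"year": [2018], "value": [6]})

# keys= adds an outer index level naming the source of each row.
big_df = pd.concat([norway, canada, chile], keys=["Norway", "Canada", "Chile"])

# Selecting on the outer level returns exactly the rows of one source.
canada_again = big_df.loc["Canada"]

# For comparison: an even split of the 6 row positions into 3 chunks
# ignores the file boundaries (the files have 3, 2 and 1 rows).
chunk_sizes = [len(chunk) for chunk in np.array_split(np.arange(len(big_df)), 3)]
```

Here `canada_again` holds the two Canada rows, while the even split produces chunks of two rows each regardless of where one file ends and the next begins.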

Hope this helps.

  • Merging and splitting is not possible in my case. The data may be almost equal in each frame, but I cannot risk "almost"; even a difference of one row will create a lot of differences – Abhinav Kumar Feb 19 '19 at 13:07
  • pd.read_csv() expects first argument as file_path of type `str`, so you should use `Norway = pd.read_csv('')` – Prashant Jamkhande Feb 19 '19 at 13:07
  • Thank you for this, I was missing the '' around the file name there. – Abhinav Kumar Feb 19 '19 at 13:11
  • @AbhinavKumar, well, you said you "want to read 6 files into 7 different data frames". You will have to be sure that the total row count of all 6 files combined can be divided equally by 7. – Prashant Jamkhande Feb 19 '19 at 13:26
  • No, there is no way I can combine and split. I need to perform an identical operation on 3 files and a completely different operation on the other 3. They contain information that is also identical, but I cannot merge them into one dataframe. I hope I am being clear. – Abhinav Kumar Feb 19 '19 at 13:29
  • Your help worked for me. Now I am able to do what I was trying to get done: files in different dataframes. – Abhinav Kumar Feb 19 '19 at 13:31
  • @AbhinavKumar glad it helped. Can you please mark it as answered? – Prashant Jamkhande Feb 19 '19 at 13:36
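
The fix discussed in these comments can be sketched as follows. `Norway.csv` written without quotes is parsed as the attribute `csv` of a variable named `Norway`, which does not exist yet, hence the `NameError`; `pd.read_csv` needs the path as a string (or a variable holding one). The sample files below are hypothetical and are created in a temporary folder only so the sketch runs on its own; the column names are made up.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample files; in practice the CSV files already exist.
tmp_dir = tempfile.mkdtemp()
for name in ("Norway", "Canada", "Chile"):
    with open(os.path.join(tmp_dir, name + ".csv"), "w") as fh:
        fh.write("year,value\n2018,1\n2019,2\n")

# Each path is a string stored under its own variable name.
norway_path = os.path.join(tmp_dir, "Norway.csv")
canada_path = os.path.join(tmp_dir, "Canada.csv")
chile_path = os.path.join(tmp_dir, "Chile.csv")

# Pass the string (not a bare name like Norway.csv) to read_csv.
Norway = pd.read_csv(norway_path)
Canada = pd.read_csv(canada_path)
Chile = pd.read_csv(chile_path)
```

Each file ends up in its own dataframe, with no merging or splitting involved.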
1

After googling for a while looking for an answer, I decided to combine answers from different questions into a solution to this question. This solution will not work for all possible cases. You have to tweak it to meet all your cases.

Check out the solution to this question:

# Import libraries
import pandas as pd
import numpy as np
import glob
import os


# Declare a function for extracting a string between two characters
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""


path = '/path/to/folder/containing/your/data/sets'  # use your path
all_files = glob.glob(path + "/*.csv")
list_of_dfs = [pd.read_csv(filename, encoding="ISO-8859-1") for filename in all_files]
# 'sets' is the last word in your path
list_of_filenames = [find_between(filename, 'sets/', '.csv') for filename in all_files]

# Create a dictionary with table names as the keys and data frames as the values
dfnames_and_dfvalues = dict(zip(list_of_filenames, list_of_dfs))
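
A minimal sketch of how such a dictionary might be used, under some assumptions: the sample files, the `value` column and the `group_a` list are hypothetical, and the keys are derived with `os.path.basename` and `os.path.splitext` instead of `find_between`, which avoids depending on the last word of the folder path.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical sample files so the sketch runs on its own.
tmp_dir = tempfile.mkdtemp()
for name in ("Norway", "Canada", "Chile"):
    with open(os.path.join(tmp_dir, name + ".csv"), "w") as fh:
        fh.write("year,value\n2018,1\n2019,2\n")

all_files = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))

# Key each dataframe by its file name without folder or extension.
frames = {
    os.path.splitext(os.path.basename(path))[0]: pd.read_csv(path)
    for path in all_files
}

# Apply one operation to a chosen subset of the frames.
group_a = ["Norway", "Canada"]  # hypothetical grouping
totals = {name: frames[name]["value"].sum() for name in group_a}
```

Looking frames up by name makes it straightforward to run one operation on some files and a different one on the rest, without ever merging them.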
Confusion Matrix