How to read files matching a specific pattern (regex) and create a DataFrame in Python using pandas

Question

I am trying to create a DataFrame by reading CSV files. In place of the path, I want to give a regex pattern so that all files matching the pattern are read. However, I don't get the files I expect.

Please help me to solve the problem.

import pandas as pd

df=pd.to_csv(path+"^\d{8}_\d{6}$",sep="|",Header=none,names=col)

But this line does not fetch the files matching the pattern; the regular expression is used directly as the file name to search for. Please help me solve this.

Please provide a code sample, including the error traceback, and improve the syntax of your post. — Arn, Nov 14 '19 at 19:11
Please [edit] your question to include a [mcve] including sample input sample output, and _code for what you have tried_ so far — G. Anderson, Nov 14 '19 at 19:13
Let me see if I can understand your question correctly. You want to read a set of files under `path` that match a specific pattern and create a single dataframe from those files? Please confirm this is what you are looking for. Check whether this link helps: https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe — nitin3685, Nov 15 '19 at 04:59
Yes @nitin3685, `path` contains both the directory and the common part of the file name. — Sudhakar, Nov 15 '19 at 05:12

score 1 · Answer 1 · answered Nov 15 '19 at 05:33

The solution has two steps. First, find all paths that match the specific pattern. Second, read the data from each file into a DataFrame and concatenate the results. As far as I know, pandas does not support the first step (I still need to check this), so you can use the glob library for that. Note that glob uses shell-style wildcards rather than regular expressions, so `\d{8}` has to be written as `[0-9]` repeated eight times.

Code sample:

import glob

import pandas as pd

root_path = './'
# glob uses shell-style wildcards, not regex: 8 digits, '_', then 6 digits
datasheet_path_pattern = root_path + ('[0-9]' * 8) + '_' + ('[0-9]' * 6)
datasheet_paths = sorted(glob.glob(datasheet_path_pattern))

# `col` is the list of column names from the question
frames = []
for datasheet_path in datasheet_paths:
    df = pd.read_csv(datasheet_path, sep="|", header=None, names=col)
    frames.append(df)

# pd.concat raises ValueError if no files matched the pattern
datasheet = pd.concat(frames)
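
If a true regular expression is needed rather than a glob pattern, one option is to list the directory and filter the file names with `re.fullmatch`. The following is only a minimal sketch, not the asker's actual setup: the helper name `read_matching_files`, the column names, and the sample files are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path

import pandas as pd


def read_matching_files(directory, pattern, **read_csv_kwargs):
    """Read every file in `directory` whose name fully matches `pattern`
    (a regular expression) and concatenate the results into one DataFrame.

    Hypothetical helper for illustration; not part of pandas.
    """
    regex = re.compile(pattern)
    # Keep only regular files whose whole name matches the regex.
    paths = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and regex.fullmatch(p.name)
    )
    if not paths:
        raise FileNotFoundError(f"no files in {directory!r} match {pattern!r}")
    frames = [pd.read_csv(p, **read_csv_kwargs) for p in paths]
    return pd.concat(frames, ignore_index=True)


# Small self-contained demo with made-up files and column names.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "20191114_191100").write_text("1|a\n2|b\n")
    Path(tmp, "20191115_053300").write_text("3|c\n")
    Path(tmp, "notes.txt").write_text("ignored\n")  # does not match the regex

    col = ["id", "label"]  # assumed column names
    df = read_matching_files(tmp, r"\d{8}_\d{6}", sep="|", header=None, names=col)
    print(df)
```

Because `fullmatch` anchors the pattern at both ends of the file name, the `^` and `$` from the question are not needed here.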
