
I have some CSV files whose names follow the pattern below:

IU_2022-09-01T09_43_56_0100-0018-0000-0002

The bold part of the name is important, and I want to create a separate data frame whenever it changes. However, I am not able to read CSV files by specifying something in the middle of the name. I want to read only those CSV files that have 0100 in their names. I used the glob method:

ls_data = list()
for idx, f in glob.glob('[0100]*.csv'):
    df_temp = pd.read_csv(f, delimiter=';')
    df_temp["layer_number"] = idx
    ls_data.append(df_temp)
    print(idx)

df_L = pd.concat(ls_data, axis=0)

But I am getting empty data.

broo
  • To me, it is unclear what you are trying to accomplish. Can you provide more detail? – It_is_Chris Sep 21 '22 at 19:34
  • `for filename in os.listdir(your_folder): if 'some_pattern' in filename: pd.read_csv(f'{your_folder}/{filename}')`? Of course, with proper indentation. – Quang Hoang Sep 21 '22 at 19:37
  • I have a specific pattern in my file names (IU_date_time_pnumber_weldid_layernumber_job_program). I want to read CSV files based on the weld ID, that is, all CSV files that have the same weld ID in their names. – broo Sep 21 '22 at 19:43
  • use the `glob` module or the `pathlib.Path.glob` method – Paul H Sep 21 '22 at 19:48
  • `ls_data = list() # for filename in glob.glob('/path/to/csvfiles/*.csv'): for idx, f in glob.glob('[0100]*.csv'): df_temp = pd.read_csv(f, delimiter=';') df_temp["layer_number"] = idx # df = pd.concat([df, df_temp], axis=0) ls_data.append(df_temp) print (idx) df_L = pd.concat(ls_data, axis=0)` but I am getting empty data frames – broo Sep 21 '22 at 19:49
  • Does this answer your question? [How do I read a large csv file with pandas?](https://stackoverflow.com/questions/25962114/how-do-i-read-a-large-csv-file-with-pandas) – bad_coder Sep 21 '22 at 22:10

1 Answer


Use `pathlib`:

from pathlib import Path
import pandas as pd

ls_data = []

csv_directory = r'/path/to/csvfiles/'

for idx, filename in enumerate(Path(csv_directory).glob('*_0100-*.csv')):
    df_temp = pd.read_csv(filename, delimiter=';')
    df_temp.insert(0, 'layer_number', idx)
    ls_data.append(df_temp) 

df = pd.concat(ls_data, axis=0)
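
As a rough illustration of why the pattern in the question matched nothing, the sketch below checks both patterns with the standard-library `fnmatch` function, which uses the same wildcard rules as `glob`. In those rules `[0100]` is a character class that matches a single character, `0` or `1`, so it can never match a name that starts with `IU_`. The file names, and the helpers `id_field` and `group_by_id`, are hypothetical. The sketch also assumes that the ID of interest is the first dash-separated field after the last underscore, which may not hold for every file.

```python
from collections import defaultdict
from fnmatch import fnmatch
from pathlib import PurePath

# Hypothetical file names modelled on the example in the question.
names = [
    "IU_2022-09-01T09_43_56_0100-0018-0000-0002.csv",
    "IU_2022-09-01T09_45_10_0100-0018-0001-0002.csv",
    "IU_2022-09-01T09_50_02_0200-0018-0000-0002.csv",
]

# '[0100]' is a character class: exactly one character, either '0' or '1'.
# Every name here starts with 'I', so this list is empty.
no_match = [n for n in names if fnmatch(n, "[0100]*.csv")]

# '*_0100-*.csv' lets the first '*' absorb everything before '_0100-'.
matched = [n for n in names if fnmatch(n, "*_0100-*.csv")]


def id_field(name):
    """Return the first dash-separated field after the last underscore.

    Assumption: this is the field the files should be grouped by.
    """
    return PurePath(name).stem.rsplit("_", 1)[-1].split("-")[0]


def group_by_id(file_names):
    """Map each ID to the sorted list of file names that carry it."""
    groups = defaultdict(list)
    for name in sorted(file_names):
        groups[id_field(name)].append(name)
    return dict(groups)


groups = group_by_id(names)
print(no_match)
print(matched)
print(groups)
```

Each list in `groups` could then be read with `pd.read_csv(..., delimiter=';')` and concatenated as above, giving one data frame per ID. Sorting the names first keeps the enumeration order reproducible, since `glob` does not guarantee any particular order.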
Timeless