I have multiple CSV files in the same directory that contain different survey response data (different questions, in a different order). I want to loop through all of the CSVs, find specific column headings, and store the data from those columns in a pandas DataFrame.
What I have so far:
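import glob
import os

import pandas as pd

path = "file/path/"
all_files = glob.glob(os.path.join(path, "*.csv"))  # make list of paths

dfs = {}  # keep every dataframe, keyed by file name, so none is overwritten
for file in all_files:
    # splitext returns a (root, extension) tuple; keep only the root
    file_name = os.path.splitext(os.path.basename(file))[0]
    dfn = pd.read_csv(file, encoding='latin1')
    dfn.index.name = file_name
    dfs[file_name] = dfn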
So far the code reads in all of the CSVs from the directory. I think I now need another loop that goes through each file and finds the data within one particular column. The column I'm looking for has a heading that contains the text "would recommend". The headings may not be worded identically in every file, so the match needs to be a "contains" check rather than an exact comparison. I am quite new to Python and really struggling, so any help is greatly appreciated.
Example of CSV1:
Programme,"Overall, I am satisfied with the quality of the programme",I would recommend the company to a friend or colleague,Please comment on any positive aspects of your experience of this programme
Nursing,4,4,[IMAGE]
Nursing,1,3,very good
Nursing,4,5,I enjoyed studying tis programme
Example of CSV2:
Programme,I would recommend the company to a friend,The programme was well organised and running smoothly,It is clear how students' feedback on the programme has been acted on
IT,4,2,4
IT,5,5,5
IT,5,4,5
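
For illustration, here is a minimal sketch of one way this could be done. It is not a definitive solution. The function name collect_matching_columns and the output column names source_file, question and response are hypothetical names chosen for this example. It assumes the match should be a case-insensitive substring check on the heading, and that the results should be stacked into one long DataFrame with a row per response.

```python
import glob
import os
import tempfile

import pandas as pd


def collect_matching_columns(path, text="would recommend", encoding="latin1"):
    """Stack every column whose heading contains `text` into one DataFrame.

    Hypothetical helper: the name and the output layout are assumptions.
    """
    frames = []
    for file in sorted(glob.glob(os.path.join(path, "*.csv"))):
        file_name = os.path.splitext(os.path.basename(file))[0]
        df = pd.read_csv(file, encoding=encoding)
        # Case-insensitive "contains" check on each heading.
        matches = [c for c in df.columns if text.lower() in str(c).lower()]
        for col in matches:
            frames.append(pd.DataFrame({
                "source_file": file_name,  # which CSV the rows came from
                "question": col,           # the exact heading that matched
                "response": df[col],       # the answers in that column
            }))
    if not frames:
        # No file had a matching heading: return an empty, well-formed frame.
        return pd.DataFrame(columns=["source_file", "question", "response"])
    return pd.concat(frames, ignore_index=True)


if __name__ == "__main__":
    # Recreate the two example CSVs in a temporary directory for the demo.
    csv1 = (
        'Programme,"Overall, I am satisfied with the quality of the programme",'
        "I would recommend the company to a friend or colleague,"
        "Please comment on any positive aspects of your experience of this programme\n"
        "Nursing,4,4,[IMAGE]\n"
        "Nursing,1,3,very good\n"
        "Nursing,4,5,I enjoyed studying tis programme\n"
    )
    csv2 = (
        "Programme,I would recommend the company to a friend,"
        "The programme was well organised and running smoothly,"
        "It is clear how students' feedback on the programme has been acted on\n"
        "IT,4,2,4\n"
        "IT,5,5,5\n"
        "IT,5,4,5\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        for name, content in (("CSV1.csv", csv1), ("CSV2.csv", csv2)):
            with open(os.path.join(tmp, name), "w", encoding="latin1") as f:
                f.write(content)
        print(collect_matching_columns(tmp))
```

Keeping the file name and the original heading next to each response means the differently worded questions can still be told apart after everything is combined.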