
I have built a dataframe from the files in a zip archive, and I need to confirm that every relevant file was loaded. I can check the number of records, but what I really want is to verify that all 120 files from the zip were read in.

My code so far:

import pandas as pd

ColumnNames=['date','time','s-sitename','s-ip','cs-method','cs-uri-stem','cs-uri-query','s-port','cs-username','c-ip','cs(User-Agent)','cs(Referer)','sc-status','sc-substatus','sc-win32-status']

# Collect one frame per file and concatenate once (DataFrame.append was removed in pandas 2.0)
frames = []
for entry in file_names:
  df_csv = pd.read_csv('//xx//drive//MyDrive//xx//'+entry, header=None, skiprows=4, delimiter=" ", names=ColumnNames, encoding = 'iso-8859-1')
  frames.append(df_csv)
df_ht = pd.concat(frames)
print(df_ht)

Any suggestions?

Dimitar
  • Can you share the `file_names` variable as well? One solution could be to list all the files in the directory (https://stackoverflow.com/questions/3207219/how-do-i-list-all-files-of-a-directory) and then check that list against your `file_names` variable. – Dimitar Apr 15 '21 at 23:54
  • all_files = Zipfile (xxxxxx) ........ file_names = all_files.namelist() – user15652938 Apr 16 '21 at 05:21
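
The check suggested in the comments could be sketched roughly as follows. This is only a minimal illustration, not a tested solution for the asker's data: the helper `compare_file_lists` and the example file names are hypothetical, and it assumes you have some record of which files actually contributed rows.

```python
def compare_file_lists(expected_names, loaded_names):
    """Return (missing, unexpected) file names as sorted lists."""
    expected = set(expected_names)
    loaded = set(loaded_names)
    missing = sorted(expected - loaded)      # expected, but never loaded
    unexpected = sorted(loaded - expected)   # loaded, but not expected
    return missing, unexpected


# Hypothetical names, for illustration only
expected = ['log_001.log', 'log_002.log', 'log_003.log']
loaded = ['log_001.log', 'log_003.log']

missing, unexpected = compare_file_lists(expected, loaded)
print(f"{len(set(loaded))} of {len(set(expected))} files loaded")
print("missing:", missing)
```

One way to obtain the loaded names, assuming an extra column is acceptable, is to set `df_csv['source_file'] = entry` inside the loop and then pass `file_names` and `df_ht['source_file'].unique()` to the helper. A file that contributes no data rows would then show up as missing, which is probably worth knowing about anyway.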

0 Answers