I am trying to build an app that loops through a directory (the folder and its subfolders), searches for multiple extensions or keywords, and outputs the list of matching files in a dynamic frame (or returns zero if nothing is found).

The current code fails in two ways: it breaks when multiple extensions (or keywords) are entered, and it groups multiple results into a single row of the frame.

I need help in debugging my code posted below.

Thank you

    from st_aggrid import AgGrid
    import streamlit as st
    import pandas as pd
    import os

    office= st.text_input("Enter your Office : ")
    path= st.text_input("Enter directory path to search : ")
    extensions= st.text_input("Enter the File Extension :")

    file_names = [fn for fn in os.listdir(path) if any(fn.endswith(ext) for ext in extensions)]
    df = pd.DataFrame({'Office' : [office],'Directory' : [path],'File_Name' : file_names})
    AgGrid(df, fit_columns_on_grid_load=True)
Pynew

1 Answer

There are some problems in your code:

  1. st.text_input returns a single string, not a list of extensions, so iterating over it compares each file name against individual characters. To solve this, ask the user to enter the extensions separated by commas and split the string into a list, dropping empty entries so that an empty input gives an empty list.

    extensions = [ext.strip() for ext in st.text_input("Enter the File Extension (Separated with comma):").split(",") if ext.strip()]
    
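  2. When the Streamlit script first runs, office, path, and extensions are still empty (st.text_input returns an empty string until the user types something), and calling os.listdir on an empty path raises FileNotFoundError. The code that loops through the directory and displays the files should therefore run only once these values have been filled in.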

     if office and path and extensions:
         # code
    
  3. The search is not recursive, so it has to be changed to include files in subfolders. This can be done with the approach (os.walk) suggested in the question How to do a recursive sub-folder search and return files in a list?. Each file also needs to be checked for the extension or keyword.
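The three points above can also be tried outside Streamlit. The following is only a sketch: parse_terms and find_matches are hypothetical helper names that do not appear in the original code, and the matching is the same plain substring test on the file name.

```python
import os


def parse_terms(raw):
    """Split a comma-separated string into clean, non-empty search terms."""
    # "".split(",") returns [""], so empty pieces are filtered out explicitly.
    return [piece.strip() for piece in raw.split(",") if piece.strip()]


def find_matches(root, terms):
    """Walk root recursively and return (directory, full_path, term) tuples."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            for term in terms:
                if term in name:  # substring test: extension or keyword
                    matches.append((dirpath, os.path.join(dirpath, name), term))
                    break  # one entry per file, first matching term wins
    return matches
```

Keeping these steps in plain functions makes it possible to check them against a small test folder before wiring them to the text inputs.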

Putting it all together:

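import os
import streamlit as st
import pandas as pd
from st_aggrid import AgGrid

office = st.text_input("Enter your Office : ")
path = st.text_input("Enter directory path to search : ")
raw_terms = st.text_input("Enter the File Extension (Separated with comma):")
# Strip whitespace and drop empty entries: "".split(",") returns [""], which would match every file
extensions = [term.strip() for term in raw_terms.split(",") if term.strip()]

if office and path and extensions:
  file_names = []
  file_names_ext = []
  dirs = []
  # os.walk visits path and every subfolder beneath it
  for dp, dn, filenames in os.walk(path):
    for fn in filenames:
      for ext in extensions:
        # Substring test, so a term matches as an extension or as a keyword in the file name
        if ext in fn:
          dirs.append(dp)
          file_names.append(os.path.join(dp, fn))
          file_names_ext.append(ext)
          break  # record each file once, under the first matching term

  # Office is a scalar, so it is repeated on every row; the lists supply one row per file
  df = pd.DataFrame({'Office': office, 'Directory': dirs, 'File_Name': file_names, 'Term': file_names_ext})
  AgGrid(df, fit_columns_on_grid_load=True)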
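The row layout is what changed relative to the question: there, office and path were wrapped in one-element lists, which pandas cannot line up with a longer list of file names. As a rough illustration of the one-row-per-file idea without pandas, the sketch below builds a list of row dictionaries. build_rows and the sample data are hypothetical, not part of the original code; pd.DataFrame accepts such a list of dictionaries, and its length doubles as the "zero if nothing found" count.

```python
def build_rows(office, matches):
    """Return one dict per matched file, repeating the office on every row."""
    return [
        {"Office": office, "Directory": directory, "File_Name": full_path, "Term": term}
        for directory, full_path, term in matches
    ]


# Hypothetical sample data standing in for the results of the directory walk.
sample = [
    ("docs", "docs/report.csv", ".csv"),
    ("docs/old", "docs/old/notes.txt", ".txt"),
]
rows = build_rows("HQ", sample)
print(len(rows))  # number of matching files; 0 when nothing was found
```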

For example, given the following folder structure:

[image: folder structure]

these are examples of the output of the working code:

[image: output 1]

[image: output 2]

RoseGod
  • Thank you very much! @RoseGod would this work also if I want to search for a specific key-word-string series (comma separated) within the file full path in a specified directory? Lastly, how can I add the related key-word-string, or file extension on a new column in the dataframe to understand why a file was marked for review? – Pynew Jan 15 '22 at 20:05
  • @Pynew I updated the code to retrieve all files with the specific key-word-string and to add the key-word-string as a new column. Also, if this or any answer has solved your question, please consider [accepting](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work) it by clicking the check-mark. This indicates to the wider community that you've found a solution and gives some reputation to both the answerer and yourself. There is no obligation to do this. – RoseGod Jan 15 '22 at 21:47
  • Thank you for your help. I have accepted it. Last question, how can I modify the search to be Not-Case Sensitive? @RoseGod – Pynew Jan 16 '22 at 03:07
  • @Pynew Changing the line `if ext in fn:` to `if ext.lower() in fn.lower():` should make the search case-insensitive. – RoseGod Jan 16 '22 at 08:03
  • I posted a new question. I hope you can help. – Pynew Feb 21 '22 at 22:58
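
The case-insensitive change suggested in the comments, together with the earlier question about matching against the full path, could be folded into a single helper. This is only a sketch: first_matching_term and its case_sensitive parameter are hypothetical names, and searching the full path simply means passing the result of os.path.join(dp, fn) as text instead of fn.

```python
def first_matching_term(text, terms, case_sensitive=False):
    """Return the first term found in text, or None when no term matches."""
    haystack = text if case_sensitive else text.lower()
    for term in terms:
        needle = term if case_sensitive else term.lower()
        if needle in haystack:
            return term  # keep the user's original spelling for the Term column
    return None
```

Inside the loop, a result other than None would mark the file for review, and the returned term would go into the Term column.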