0

the below holder have 100+ XML files. I have to open and read all those files.

F:\Process\Process_files\xmls

So far, I did the below code to open single XML file from the folder. What I need to change to open/read all the XML files from the folder.

from bs4 import BeautifulSoup
import lxml
import pandas as pd

infile = open("F:\\Process\\Process_files\\xmls\\ABC123.xml","r")
contents = infile.read()
soup = BeautifulSoup(contents,'html.parser')
Community
  • 1
  • 1

2 Answers2

3

Use the glob and the os module to iterate over every file in a given path with a given file extension:

import glob
import os

path = "F:/Process/Process_files/xmls/"

for filename in glob.glob(os.path.join(path, "*.xml")):
    with open(filename) as open_file:
        content = open_file.read()

    soup = BeautifulSoup(content, "html.parser")

Tip: Use the with statement so the file gets automatically closed at the end.

Source: Open Every File In A Folder

winklerrr
  • 13,026
  • 8
  • 71
  • 88
0

So you need to iterate over files in the folder? You can try something like this:

for file in os.listdir(path):
    filepath = os.path.join(path, file)
    with open(filepath) as fp:
        contents = fp.read()
        soup = BeautifulSoup(contents, 'html.parser')
dmitrybelyakov
  • 3,709
  • 2
  • 22
  • 26