How to save results into dataframe?

Question

I am using the code from "BeautifulSoup on multiple .html files". It saves the extracted text into .txt files. Instead, I want to save each extracted record as a separate row in a DataFrame.

The DataFrame should have a single column named "file". How can I achieve this?

import glob
import os.path
from bs4 import BeautifulSoup
dir_path = r"C:\My_folder\tmp"
results_dir = r"C:\My_folder\tmp\working"

for file_name in glob.glob(os.path.join(dir_path, "*.html")):
    with open(file_name) as html_file:
        soup = BeautifulSoup(html_file)

    results_file = os.path.splitext(file_name)[0] + '.txt'
    with open(results_file, 'w') as outfile:        
        for i in soup.select('font[color="#FF0000"]'):
            print(i.text)
            outfile.write(i.text + '\n')

Can you please provide the code that you tried to use to solve this so far? We need to see what you tried to be able to help you. :) — Mike_H, Apr 09 '19 at 11:02
https://stackoverflow.com/questions/31674557/how-to-append-rows-in-a-pandas-dataframe-in-a-for-loop — B. Go, Apr 09 '19 at 11:52

score 0 · Accepted Answer · answered Apr 09 '19 at 11:58

You could create an empty dataframe at the beginning of your code, and then append to it row by row within the loop.

import pandas as pd
df = pd.DataFrame(columns=['columname'])

Then in your loop (at the place where print(i.text) is at the moment), you could add a row. DataFrame.append was removed in pandas 2.0, so this uses label-based assignment instead:

df.loc[len(df)] = [i.text]

Alternatively, create a list, append each i.text to it inside the loop, and then turn the list into a DataFrame with:

df = pd.DataFrame({'columname':created_list})
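
The list-based approach could look roughly like the sketch below, which reuses the selector and folder layout from the question. The helper names extract_red_text and html_files_to_dataframe are made up for illustration, and passing "html.parser" is an assumption, since the question does not name a parser.

```python
import glob
import os.path

import pandas as pd
from bs4 import BeautifulSoup


def extract_red_text(html):
    """Return the text of every <font color="#FF0000"> element in an HTML string."""
    # "html.parser" is an assumed choice; the question relies on the default parser.
    soup = BeautifulSoup(html, "html.parser")
    return [tag.text for tag in soup.select('font[color="#FF0000"]')]


def html_files_to_dataframe(dir_path, column="file"):
    """Collect the extracted text from every .html file in dir_path into one column."""
    created_list = []
    for file_name in glob.glob(os.path.join(dir_path, "*.html")):
        with open(file_name) as html_file:
            created_list.extend(extract_red_text(html_file.read()))
    # Build the DataFrame once, after the loop, instead of growing it row by row.
    return pd.DataFrame({column: created_list})


# Small in-memory check that does not need any files on disk.
sample = '<p><font color="#FF0000">first</font> x <font color="#FF0000">second</font></p>'
sample_df = pd.DataFrame({"file": extract_red_text(sample)})
print(sample_df)
```

With the paths from the question, the call would be html_files_to_dataframe(r"C:\My_folder\tmp"). Building the frame once from a list is usually faster than adding rows inside the loop, because each row-wise addition to a DataFrame copies or reallocates data.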

Ana Goessens

Great. Thanks a lot. Second one worked (created_list) – Keval Apr 10 '19 at 09:11
