Export to csv in perfect format

Question

I want to export this data to CSV so that I can loop over many companies in my web-scraping code.

I got this code with help from Stack Overflow, and I want to write the printed output shown below to Excel or CSV, with or without the "₨ 149 Each" column.

import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.zaubacorp.com/documents/KAKDA/U01122MP1985PTC002857'
res = requests.get(url)
soup = bs(res.content,'lxml')
headers = [header.text for header in soup.select('h3.pull-left')]
tables = pd.read_html(url)
items = zip(headers,tables)
for header, table in items:
    print(header)
    print(table)

Output:

Certificates
         Date                         Title   ₨ 149 Each
0  2006-04-24  Certificate of Incorporation  Add to Cart
1  2006-04-24  Certificate of Incorporation  Add to Cart
Other Documents Attachment
         Date Title   ₨ 149 Each
0  2006-04-24   AOA  Add to Cart
1  2006-04-24   AOA  Add to Cart
2  2006-04-24   MOA  Add to Cart
3  2006-04-24   MOA  Add to Cart
Annual Returns and balance sheet Eform
         Date                    Title   ₨ 149 Each
0  2006-04-24  Annual Return 2002_2003  Add to Cart
1  2006-04-24  Annual Return 2003_2004  Add to Cart

Python comes with its own [CSV input and output library](https://docs.python.org/3/library/csv.html). Please try that first as currently the question is a bit too broad, IMO. — Ken Y-N, Feb 12 '19 at 05:26
You're storing the table using pandas. Why not use `df.to_csv()`? — chitown88, Feb 12 '19 at 11:42
That doesn't work, because the header value is a str and the table is a DataFrame. — akshit aggarwal, Feb 12 '19 at 11:47
You can use that string to set the column names of the dataframe. — chitown88, Feb 12 '19 at 22:25
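
One way to reconcile the str heading with the DataFrame, if the goal is a file that looks like the printed output, is to write each heading as its own line and then append the table with `to_csv` on the same file handle. This is only a sketch: the helper name `write_sections` is made up for illustration, and the two small DataFrames are hand-typed stand-ins for what `pd.read_html` returns, trimmed from the output in the question.

```python
import io

import pandas as pd


def write_sections(items, handle):
    """Write each (heading, DataFrame) pair as a heading line followed by the table."""
    for heading, table in items:
        handle.write(heading + "\n")       # the section heading on its own line
        table.to_csv(handle, index=False)  # column names plus rows, no index column
        handle.write("\n")                 # blank line between sections


# Hand-typed stand-ins for the scraped tables (assumption, not live data).
certificates = pd.DataFrame(
    {"Date": ["2006-04-24"], "Title": ["Certificate of Incorporation"]}
)
other_documents = pd.DataFrame(
    {"Date": ["2006-04-24", "2006-04-24"], "Title": ["AOA", "MOA"]}
)

# A StringIO buffer stands in for a real file opened with open(..., "w", newline="").
buffer = io.StringIO()
write_sections(
    [("Certificates", certificates), ("Other Documents Attachment", other_documents)],
    buffer,
)
print(buffer.getvalue())
```

The result opens in Excel but is not a strictly rectangular CSV, so it suits reading by eye more than loading back into pandas. A heading that itself contains a comma would be split across cells in this simple version.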

score 0 · Answer 1 · answered Feb 13 '19 at 10:54

It's unclear exactly what output you expect, but you can use pandas to write the data to CSV once you combine the DataFrames.

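from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.zaubacorp.com/documents/KAKDA/U01122MP1985PTC002857'
res = requests.get(url)
soup = bs(res.content, 'lxml')
# Section headings, one per table on the page
headers = [header.text.strip() for header in soup.select('h3.pull-left')]
# Parse the HTML that was already downloaded instead of fetching the page twice
tables = pd.read_html(StringIO(res.text))

# Tag every row with the heading of the table it came from
frames = [table.assign(Section=header) for header, table in zip(headers, tables)]

# Stack the tables into one DataFrame with a fresh 0..n-1 index
df = pd.concat(frames, ignore_index=True)
# Drop the "₨ 149 Each" column, which only holds "Add to Cart"
df = df.drop(columns=[col for col in df.columns if 'Each' in str(col)])

df.to_csv('path/to/filename.csv', index=False)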
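
Since the question mentions looping over many companies, the same idea can be wrapped in a helper and applied once per company before a single `to_csv` call. The sketch below is an illustration under assumptions: `combine_company` is a hypothetical helper, the `scraped` dictionary holds hand-typed stand-ins for the per-company headings and tables (in real use they would come from `soup.select` and `pd.read_html` for each URL), and "EXAMPLE CO" with its row is a made-up placeholder.

```python
import pandas as pd


def combine_company(company, headers, tables, drop_containing="Each"):
    """Stack one company's tables, tagging each row with company and section."""
    frames = []
    for heading, table in zip(headers, tables):
        # Skip the price column ("₨ 149 Each"), which only holds "Add to Cart".
        keep = [col for col in table.columns if drop_containing not in str(col)]
        frame = table[keep].copy()
        frame.insert(0, "Section", heading)
        frame.insert(0, "Company", company)
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Hand-typed stand-ins for scraped data (assumption, not live results).
scraped = {
    "KAKDA": (
        ["Certificates", "Other Documents Attachment"],
        [
            pd.DataFrame({"Date": ["2006-04-24"],
                          "Title": ["Certificate of Incorporation"],
                          "₨ 149 Each": ["Add to Cart"]}),
            pd.DataFrame({"Date": ["2006-04-24", "2006-04-24"],
                          "Title": ["AOA", "MOA"],
                          "₨ 149 Each": ["Add to Cart", "Add to Cart"]}),
        ],
    ),
    "EXAMPLE CO": (  # placeholder company, invented for the sketch
        ["Certificates"],
        [pd.DataFrame({"Date": ["2010-01-01"],
                       "Title": ["Certificate of Incorporation"],
                       "₨ 149 Each": ["Add to Cart"]})],
    ),
}

all_rows = pd.concat(
    [combine_company(name, heads, tabs) for name, (heads, tabs) in scraped.items()],
    ignore_index=True,
)
csv_text = all_rows.to_csv(index=False)  # pass a file path instead to write to disk
print(csv_text)
```

Keeping the company and section as ordinary columns gives one rectangular table, which is easy to filter later in Excel or pandas.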
