0

Update - I solved the below problem by adding rows to the data frame and then only writing the data frame to Excel once. Other readers may find Add one row to pandas DataFrame helpful.

Update 2 - If you want to stop the header names being written to Excel then you may find this help How do you remove the column name row from a pandas DataFrame?.

Update 3 - If you want to remove the row numbers when you write the data frame to Excel then you may find this link helpful Is there any way to remove column and rows numbers from DataFrame.from_dict?.

I want Python to read a list of Amazon URLs from one Excel file (python.xlsx) and then populate a different Excel file (python2.xlsx) with the URL, product title and product price. I don't want to create a new Excel sheet for each product. I don't want existing data to be overwritten when new data is written to sheet1. And I don't need the to_excel command to write header names and row numbers to Excel either.

I looked at How to write to an existing excel file without overwriting data (using pandas)? but could not apply it to my problem.

Is there a way of stopping the header names and row numbers being written to Excel? Is there a way of stopping existing text being overwritten in the spreadsheet?

Row = 0
MaxRow = len(df)
while Row <= MaxRow:
    URL = (df.iloc[Row,0]) 
    headers  = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36'}
    page = requests.get(URL, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')
    title = soup.find(id="productTitle").get_text()
    price = soup.find(id="priceblock_ourprice").get_text()
    converted_price=float(price[1:6])
    df2 = pd.DataFrame({'Url':[URL],
                        'Title':[title.strip()],
                        'Price':[converted_price]})
    writer = ExcelWriter(r'C:\Users\HP\Documents\python2.xlsx')
    df2.to_excel(writer, sheet_name='Sheet1', startrow=Row,startcol=2)
    writer.save()
    Row = Row + 1
    if Row == MaxRow:
        break
Ross Symonds
  • 690
  • 1
  • 8
  • 29
  • You're writing to the same file, row by row. If the frame contains 10 rows, this means you'll keep overwiting the previous rows. At the very least increment `Row` by the actual number of rows. Better yet, open the file only once and create a *single* writer, outside the loop – Panagiotis Kanavos Oct 09 '19 at 10:23
  • So are you suggesting I write information to the data frame first and then write the data frame to Excel? – Ross Symonds Oct 09 '19 at 10:47

0 Answers0