1

AssertionError: 22 columns passed, passed data had 21 columns when scraping an HTML table

import requests
from bs4 import BeautifulSoup
import pandas as pd

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.3"
}
r = requests.get("https://www.worldometers.info/coronavirus/?utm_campaign=homeAdvegas1?")
soup = BeautifulSoup(r.content, "lxml")
table = soup.find("table", id="main_table_countries_yesterday")

header = [th.get_text(strip=True) for th in table.tr.select("th")][1:]

all_data = []
for row in table.select("tr:has(td)"):
    tds = [td.get_text(strip=True) for td in row.select("td")]
    all_data.append(tds)

df = pd.DataFrame(all_data, columns=header)
print(df)
df.to_csv("data.csv", index=False)
Arslan Aziz
  • Does this answer your question? [AssertionError: 22 columns passed, passed data had 21 columns](https://stackoverflow.com/questions/40855030/assertionerror-22-columns-passed-passed-data-had-21-columns) – johnchase Aug 18 '21 at 17:42

2 Answers

0

Try removing the [1:] slice from the header list:

header = [th.get_text(strip=True) for th in table.tr.select("th")]

The [1:] slice drops the first header cell, so the header has only 21 items while each data row has 22. That mismatch is what triggers the error.
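To see the mismatch before pandas raises, it can help to compare widths directly. The following is a minimal sketch using toy data rather than the scraped table; the helper name `find_width_mismatches` and the sample values are hypothetical, not part of the original code.

```python
def find_width_mismatches(header, rows):
    """Return (row_index, row_width) for every row whose width differs from the header."""
    expected = len(header)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


# Toy data standing in for the scraped table (made-up values, not real figures).
full_header = ["#", "Country", "Total Cases"]
rows = [["1", "USA", "100"], ["2", "India", "90"]]

# Same slice as in the question: it drops the first column name.
sliced_header = full_header[1:]

# With the slice, every row is one cell wider than the header.
print(find_width_mismatches(sliced_header, rows))
# Without the slice, header and rows line up and the list is empty.
print(find_width_mismatches(full_header, rows))
```

Running a check like this on `header` and `all_data` just before `pd.DataFrame(...)` shows whether the header or the rows are the shorter side.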

johnchase
  • `header = [th.get_text(strip=True) for th in table.tr.select("th")]` raises `AttributeError: 'NoneType' object has no attribute 'tr'` – Arslan Aziz Aug 18 '21 at 16:05
  • That error is not related to the change I suggested. It appears that the previous line is not finding anything, so your `table` variable is being assigned `None`. I was able to run the code without errors; try restarting and running it again – johnchase Aug 18 '21 at 17:39
0

Your header has 21 columns, but each item in all_data has 22 columns.

Just add one more column name to header and it works. I have added S.No as the additional column name.

header.insert(0, 'S.No')

After this modification, your code prints:

    S.No  Country,Other  ... New Deaths/1M pop Active Cases/1M pop
0                  Asia  ...                                      
1         North America  ...                                      
2         South America  ...                                      
3                Europe  ...                                      
4                Africa  ...                                      
..   ...            ...  ...               ...                 ...
233              Total:  ...                                      
234              Total:  ...                                      
235              Total:  ...                                      
236              Total:  ...                                      
237              Total:  ...                                      

[238 rows x 22 columns]
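If the number of missing names is not known in advance, the same idea can be generalized. The sketch below pads the header on the left with placeholder names until it matches the widest row; `pad_header` and the `col_N` placeholder names are hypothetical, and it assumes the missing names belong to the leading columns, as with the serial-number column here.

```python
def pad_header(header, rows, prefix="col"):
    """Left-pad the header with placeholder names so it is as wide as the widest row."""
    width = max(len(row) for row in rows)
    missing = width - len(header)
    if missing < 0:
        # More names than cells: padding cannot fix this case.
        raise ValueError("header is wider than the data rows")
    placeholders = [f"{prefix}_{i}" for i in range(missing)]
    return placeholders + list(header)


# Toy data (made-up values): two names for three cells per row.
header = ["Country", "Total Cases"]
rows = [["1", "USA", "100"], ["2", "India", "90"]]

print(pad_header(header, rows))
```

The padded list can then be passed as `columns=` when building the DataFrame; renaming a placeholder to something meaningful such as S.No is still a manual step.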
Ram