I have read in the following dataset:
from bs4 import BeautifulSoup as bs
import requests
import pandas as pd
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
}
url = 'https://www.un.org/securitycouncil/sites/www.un.org.securitycouncil/files/consolidated.xml'
soup = bs(requests.get(url, headers=headers).text, 'lxml')
df = pd.read_xml(str(soup), xpath='.//individual')
In that dataset there are three columns called:
- first_name
- second_name
- third_name
I need to concatenate those three columns to get a new column (called name). Based on answers to similar questions, I have tried the following code:
df['name'] = [''.join(i) for i in zip(df["first_name"].map(str),df["second_name"].map(str), df['third_name'].map(str))]
However, the resulting dataset is not what I want.
So, this is what I get from the code above:
Basically:
- there is no space between the concatenated names
- when one of the names is missing, the string "None" is concatenated in its place.
What I'd like to get is this:
Can anyone help me please?
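To make the two symptoms above concrete, here is a minimal sketch on a small made-up frame. The sample names are hypothetical and not taken from the UN list, and `dtype=object` is an assumption so that missing values stay as `None`. It first reproduces my attempt, then shows one possible way to skip missing values and join the rest with a space; I am not sure it is the most idiomatic approach.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real data.
# dtype=object keeps the missing values as None.
df = pd.DataFrame(
    {
        "first_name": ["ALPHA", "DELTA", "ECHO"],
        "second_name": ["BRAVO", None, "FOXTROT"],
        "third_name": ["CHARLIE", None, None],
    },
    dtype=object,
)

cols = ["first_name", "second_name", "third_name"]

# My attempt: map(str) turns None into the string 'None',
# and ''.join puts nothing between the parts.
attempt = ["".join(i) for i in zip(*(df[c].map(str) for c in cols))]
print(attempt)
# ['ALPHABRAVOCHARLIE', 'DELTANoneNone', 'ECHOFOXTROTNone']

# Possible alternative: drop missing values row by row, then join with a space.
df["name"] = df[cols].apply(
    lambda row: " ".join(row.dropna().astype(str)), axis=1
)
print(df["name"].tolist())
# ['ALPHA BRAVO CHARLIE', 'DELTA', 'ECHO FOXTROT']
```

The row-wise `apply` is simple to read but may be slow on a large frame, so a vectorised alternative would be welcome too.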