-1

I am trying to split a column, but I noticed that the split changes values in other columns. For example, some values in row 10 are swapped with values from row 8. Why is that?

Actual data on ID 10

| vat_number | email                                            | foi_mail       | website 
|     10     | abc@test.com;example@test.com;example@test.com   | xyz@test.com   | example.com

After executing this line of code:

base_data[['email','email_1','email_2']] = pd.DataFrame(base_data.email.str.split(';').tolist(),
                                                        columns = ['email','email_1','email_2'])

base_data becomes:

| vat_number | email                  | foi_mail               | website     | email_1 | email_2
|     10     | some other row value   | some other row value   | example.com | ------  | -----

Before:

[Image: before executing the code that splits the column]

After:

[Image: after executing the code that splits the column]

The data contains thousands of rows, but I have shown only one.

Trenton McKinney
  • As posted, there isn't enough data to reproduce the error. Consider posting more data. Please [provide a reproducible copy of the DataFrame with `to_clipboard`](https://stackoverflow.com/questions/52413246/provide-a-reproducible-copy-of-the-dataframe-with-to-clipboard/52413247#52413247) – Trenton McKinney Nov 06 '19 at 19:18

3 Answers

0

Try a table within a table (a list of lists):

def test():
    # build the table as a list of rows
    base_data = []
    base_data.append(['12', '32'])
    base_data.append(['352', '335'])
    base_data.append(['232', '32'])

    print(base_data)
    a = base_data[0]  # first row
    print(a)
    print(a[0])
    print(a[1])

    input("Press Enter to continue. . .")

and use a loop to add the rows.

0

If I understand the case correctly, I believe you need something like this:

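 # drop the original `email` first so the merge does not produce email_x / email_y
 base_data = base_data.drop(columns='email').merge(
     base_data['email'].str.split(';', expand=True).rename(columns={0: 'email', 1: 'email_1', 2: 'email_2'}),
     left_index=True, right_index=True)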

Here is the logic on a small example:

import pandas as pd

a1 = list('abcdef')
b1 = list('fedcba')
c1 = [f'{x[0]};{x[1]}' for x in zip(a1, b1)]
df1 = pd.DataFrame({'c1':c1})
df1

Out[1]:

    c1
0   a;f
1   b;e
2   c;d
3   d;c
4   e;b
5   f;a

df1 = df1.merge(df1['c1'].str.split(';', expand = True).rename(columns = {0:'c2',1:'c3'}), left_index = True, right_index = True)
df1

Out[2]:

    c1  c2  c3
0   a;f a   f
1   b;e b   e
2   c;d c   d
3   d;c d   c
4   e;b e   b
5   f;a f   a
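
As for why the code in the question moves values between rows: a likely cause is index alignment. This is an assumption, since the question does not show the index of `base_data`. If that index is not the default 0..n-1 (for example after filtering or sorting), a frame built from `.tolist()` gets a fresh 0..n-1 index, and the assignment then matches rows by label rather than by position. A minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical data: the index is NOT 0..n-1, as it could be after filtering or sorting.
base_data = pd.DataFrame(
    {"email": ["a@x.com;b@x.com", "c@x.com;d@x.com", "e@x.com;f@x.com"]},
    index=[2, 0, 1],
)

# Approach from the question: the new frame gets a fresh 0..n-1 index,
# so the assignment aligns on labels and values land in other rows.
fresh = pd.DataFrame(base_data["email"].str.split(";").tolist(),
                     columns=["email_1", "email_2"])
misaligned = base_data.copy()
misaligned[["email_1", "email_2"]] = fresh

# expand=True keeps the original index, so every row keeps its own values.
aligned = base_data.copy()
aligned[["email_1", "email_2"]] = base_data["email"].str.split(";", expand=True)

print(misaligned)
print(aligned)
```

In `misaligned`, the row labelled 2 ends up with the parts of the row that was third by position. In `aligned`, each row holds the parts of its own `email` value. If `base_data` already has a default index, this is not the cause.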
Alex
0

Use the expand parameter of .str.split:

import pandas as pd

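# your dataframe (rebuilt from the row shown in the question)
df = pd.DataFrame({'vat_number': [float('nan')],
                   'email': ['abc@test.com;example@test.com;example@test.com'],
                   'foi_mail': ['xyz@test.com'],
                   'website': ['example.com']})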

# split and expand
df[['email_1', 'email_2', 'email_3']] = df['email'].str.split(';', expand=True)

# drop `email` col
df.drop(columns='email', inplace=True)

# result
 vat_number      foi_mail      website       email_1           email_2           email_3
        NaN  xyz@test.com  example.com  abc@test.com  example@test.com  example@test.com
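
With thousands of rows, the number of addresses per row may vary, and a fixed list of three target columns would then no longer match the width of the split result. The following is only a sketch under that assumption, with hypothetical sample data: it names the new columns from whatever `expand=True` returns and attaches them with `join`, which matches rows on the index.

```python
import pandas as pd

# Hypothetical rows with 3, 1 and 0 addresses.
df = pd.DataFrame({
    "website": ["one.com", "two.com", "three.com"],
    "email": ["a@x.com;b@x.com;c@x.com", "d@x.com", None],
})

# One column per part; shorter rows are padded with missing values.
parts = df["email"].str.split(";", expand=True)

# Name the columns by position instead of hard-coding how many there are.
parts.columns = [f"email_{i + 1}" for i in parts.columns]

# join aligns on the index, so each row keeps its own parts.
result = df.drop(columns="email").join(parts)
print(result)
```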
Trenton McKinney