Separate dataframe row to multiple rows by newline character

Question

I have a dataframe in which some rows contain newline characters. I want each of those rows to be split into separate rows.

Here is a sample of my dataframe:

   col1     col2
    a       123
    b       234
    c\nd\ne 345
    f       456
    g       567

I want to make it like this:

Can anybody help me?

Welcome to Stack Overflow! Please include a _small_ subset of your data as a __copyable__ piece of code that can be used for testing as well as your expected output for the __provided__ data. See [MRE - Minimal, Reproducible, Example](https://stackoverflow.com/help/minimal-reproducible-example), and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888) for more information. — Henry Ecker, Sep 06 '21 at 16:44
Please post your data as code and not as screenshots. For example, assuming you read your excel/csv to a dataframe called `df`, include the output of `df.to_dict()` — not_speshal, Sep 06 '21 at 16:44

Ali Crash · Accepted Answer · 2021-09-07T10:16:00.377

This code should help. It can split any column of a dataframe on any character you want and return the split dataframe.

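import pandas as pd


class Seperator:
    def __init__(self, df, column_name, split_char):
        # Keep the list on the instance; a class-level list would be
        # shared by every Seperator object and keep growing between uses.
        self.row_list = []
        # apply is used only for its side effect: collecting the split rows.
        df.apply(lambda row: self.seperate(row, column_name, split_char), axis=1)

    def seperate(self, row, column_name, split_char):
        items = row[column_name].split(split_char)
        row_dic = dict(row)
        for item in items:
            # Copy the row so each appended dict is independent,
            # then overwrite only the split column.
            tmp = dict(row_dic)
            tmp[column_name] = item
            self.row_list.append(tmp)
        return row

    def dataframe(self):
        return pd.DataFrame(self.row_list)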

Now let's use this class:

df = pd.DataFrame({'col1':['a','b','c\nd\ne','f','g'],'col2':[123,234,345,456,567]})
df
      col1  col2
0        a   123
1        b   234
2  c\nd\ne   345
3        f   456
4        g   567

After that:

seperator = Seperator(df,column_name='col1',split_char='\n')
seperator.dataframe()

  col1  col2
0    a   123
1    b   234
2    c   345
3    d   345
4    e   345
5    f   456
6    g   567

Enjoy.
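
For what it's worth, the same result can probably be reached without a helper class by using pandas' built-in `Series.str.split` together with `DataFrame.explode` (available from pandas 0.25). This is a different technique from the class above, not a rewrite of it. A minimal sketch, assuming the column to split holds strings and reusing the sample data from this answer:

```python
import pandas as pd

# Sample data from the answer above.
df = pd.DataFrame({'col1': ['a', 'b', 'c\nd\ne', 'f', 'g'],
                   'col2': [123, 234, 345, 456, 567]})

result = (
    # Turn each string in col1 into a list of pieces split on newline.
    df.assign(col1=df['col1'].str.split('\n'))
      # Emit one row per list element; the other columns are repeated.
      .explode('col1')
      # explode repeats the original index label, so renumber the rows.
      .reset_index(drop=True)
)

print(result)
```

The `reset_index(drop=True)` call is optional; without it the rows for 'c', 'd' and 'e' all keep the original index label 2.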

Thank you for your reply, but your output is not exactly what I need. In this output 'e' appears three times, but I need 'c', 'd', 'e' in separate rows. Please reply if you can help. — nikhil kumar, Sep 07 '21 at 07:09
