Split a string in pandas row and insert new rows by enlarging the dataframe

Question

I have the following DataFrame:

	no	word	status
0	0	one	to_check
1	1	two	to_check
2	2	:)	emoticon
3	3	dr.	to_check
4	4	"future"	to_check
5	5	to	to_check
6	6	be	to_check

I want to iterate through each row, find quotes at word-initial and word-final positions, and create a DataFrame like this:

	no	word	status
0	0	one	to_check
1	1	two	to_check
2	2	:)	emoticon
3	3	dr.	to_check
4	4	"	quotes
5	4	future	word
6	4	"	quotes
7	5	to	to_check
8	6	be	to_check

I can strip the quotes and split the word into three pieces, but I get this DataFrame, in which the new rows overwrite the last two rows:

	no	word	status
0	0	one	to_check
1	1	two	to_check
2	2	:)	emoticon
3	3	dr.	to_check
4	4	"	quotes
5	4	future	word
6	4	"	quotes

I tried df.loc[index], df.iloc[index], and df.at[index], but none of them let me extend the number of rows in the DataFrame.

Is it possible to add new rows at a specific index without overwriting the last two rows?

Did you try any of these: https://stackoverflow.com/questions/24284342/insert-a-row-to-pandas-dataframe ? — Chaos_Is_Harmony, Sep 14 '21 at 01:31

BENY · Answer 1 · 2021-09-14T02:37:44.170

6

In your case you can split and then explode:

out = df.assign(word = df.word.str.split(r'(\")')).explode('word').\
           loc[lambda x : x['word']!='']
   no    word    status
0   0     one  to_check
1   1     two  to_check
2   2      :)  emoticon
3   3     dr.  to_check
4   4       "  to_check
4   4  future  to_check
4   4       "  to_check
5   5      to  to_check
6   6      be  to_check

To change the status of the quote rows:

import numpy as np
out['status'] = np.where(out['word'].eq('"'), 'quotes', out['status'])
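Putting the two steps together, a minimal end-to-end sketch might look like the following. The sample frame is rebuilt from the question. The helper column `was_quoted`, the relabelling of the middle piece as `word`, and the final `reset_index` are assumptions added here to match the desired output; they are not part of the steps above.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "no": range(7),
    "word": ["one", "two", ":)", "dr.", '"future"', "to", "be"],
    "status": ["to_check", "to_check", "emoticon", "to_check",
               "to_check", "to_check", "to_check"],
})

# Remember which rows are wrapped in quotes before splitting them
quoted = df["word"].str.startswith('"') & df["word"].str.endswith('"')

# The capture group keeps the quote characters as separate list items
out = (
    df.assign(word=df["word"].str.split(r'(")'), was_quoted=quoted)
      .explode("word")
      .loc[lambda x: x["word"] != ""]
)

# Quote characters become 'quotes'; the remaining piece of a quoted row becomes 'word'
out["status"] = np.where(
    out["word"].eq('"'), "quotes",
    np.where(out["was_quoted"], "word", out["status"]),
)

# Drop the helper column and renumber the rows 0..n-1
out = out.drop(columns="was_quoted").reset_index(drop=True)
print(out)
```

Rows that were not quoted keep their original status, and the three pieces of a quoted word share the original `no` value.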

edited Sep 14 '21 at 02:37

answered Sep 14 '21 at 01:38

BENY

This is beautiful! You must change `status` as well though. – haneulkim Sep 14 '21 at 01:50

haneulkim · Answer 2 · 2021-09-14T01:54:46.403

This is not very efficient, but I think it is very readable; you could use it if efficiency is not a major concern.

import pandas as pd

no_lst, word_lst, status_lst = [], [], []

def check_quote(w):
    # True when the word is wrapped in double quotes, e.g. '"future"'
    return len(w) >= 2 and w.startswith('"') and w.endswith('"')

for row in df.itertuples(index=False):
    if check_quote(row.word):
        # Emit three rows sharing the original `no`: quote, bare word, quote
        no_lst += [row.no] * 3
        word_lst += ['"', row.word.strip('"'), '"']
        status_lst += ["quotes", "word", "quotes"]
        continue

    # Any other word is copied through unchanged
    no_lst.append(row.no)
    word_lst.append(row.word)
    status_lst.append(row.status)

new_df = pd.DataFrame({"no": no_lst,
                       "word": word_lst,
                       "status": status_lst})
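The loop above only handles words quoted on both sides. If a word may carry a quote on just one side, one possible variation (a sketch, not part of the answer above) tokenizes each word with `re.findall` and builds the rows from the pieces. The helper name `split_quotes` and the sample data are assumptions for illustration.

```python
import re
import pandas as pd

def split_quotes(word):
    # Return the quote characters and the text between them as separate pieces
    return re.findall(r'"|[^"]+', word)

# Hypothetical sample: one plain word, one fully quoted, one with a trailing quote
df = pd.DataFrame({
    "no": [0, 1, 2],
    "word": ['one', '"future"', 'end"'],
    "status": ["to_check", "to_check", "to_check"],
})

rows = []
for row in df.itertuples(index=False):
    pieces = split_quotes(row.word)
    for piece in pieces:
        if piece == '"':
            status = "quotes"
        elif len(pieces) > 1:
            status = "word"      # text that had a quote split off
        else:
            status = row.status  # untouched word keeps its status
        rows.append({"no": row.no, "word": piece, "status": status})

new_df = pd.DataFrame(rows)
print(new_df)
```

Collecting the rows as a list of dicts and building the DataFrame once at the end avoids growing the frame row by row.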

score 0 · Answer 3 · answered Sep 14 '21 at 01:49

You could try creating a function like this:

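import pandas as pd

def InsertRow(df, idx):
    # Split the frame at position idx
    df1 = df.iloc[:idx]
    df2 = df.iloc[idx:]

    # Quote row to insert between the two halves
    new_row = pd.DataFrame([[idx - 1, '"', 'quotes']], columns=df.columns)

    # Concatenate instead of assigning into df1, so no existing row is overwritten
    return pd.concat([df1, new_row, df2], ignore_index=True)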
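As a usage sketch of the slice-and-concatenate idea, the hypothetical helper `insert_row` below (its name and signature are assumptions, not pandas API) places one new row at a given position and renumbers the index so that no existing row is overwritten.

```python
import pandas as pd

def insert_row(df, pos, row):
    """Return a new frame with `row` (a dict) inserted at position `pos`."""
    new_row = pd.DataFrame([row], columns=df.columns)
    return pd.concat([df.iloc[:pos], new_row, df.iloc[pos:]], ignore_index=True)

# Hypothetical excerpt of the question's data, with the quotes already stripped
df = pd.DataFrame({
    "no": [3, 4, 5],
    "word": ["dr.", "future", "to"],
    "status": ["to_check", "word", "to_check"],
})

# Put a quote row before and after "future"; positions shift after each insert
df = insert_row(df, 1, {"no": 4, "word": '"', "status": "quotes"})
df = insert_row(df, 3, {"no": 4, "word": '"', "status": "quotes"})
print(df)
```

Each call copies the frame, so for many insertions it is usually cheaper to collect all rows first and build the DataFrame once.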

answered Sep 14 '21 at 01:49

Hanseo Park

