Concatenating Strings and Items from a Pandas Data Frame

Question

I have the following dframe and attempted code below.

import pandas as pd
dframe = pd.DataFrame({
        
            'A' : [1,2,3],
            'B' : ['Guitar', 'Piccolo', 'Sax']
        })
    
dframe

for item in dframe:
    print(f"{"Mary's favorite number is"}{dframe[0]}{"and her favorite instrument is"}{dframe[1]}")

The expected output is one formatted string per row of the data frame: "Mary's favorite number is 1 and her favorite instrument is Guitar", "Mary's favorite number is 2 and her favorite instrument is Piccolo", etc.

However, the code does not seem to produce these results.

Could we do a data frame where each line would contain the print statement? @KaviHarjani — HelpMeCode, Jun 19 '22 at 04:36
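
A note on why the loop in the question does not behave as hoped: iterating over a DataFrame yields its column labels rather than its rows, and `dframe[0]` looks for a column labelled 0, which does not exist here. The sketch below only illustrates those two points on the same data; the names `labels` and `lookup_failed` are made up for the illustration.

```python
import pandas as pd

dframe = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['Guitar', 'Piccolo', 'Sax'],
})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = [item for item in dframe]
print(labels)

# Square brackets select a column by label, and no column is labelled 0.
try:
    dframe[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```

Separately, reusing double quotes inside a double-quoted f-string, as in the question, is a syntax error on Python versions before 3.12.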

Corralien · Accepted Answer · 2022-06-19T05:20:10.723

3

You can use string concatenation. Column 'A' has to be converted to string first:

out = "Mary's favorite number is " + dframe['A'].astype(str) + " and her favorite instrument is " + dframe['B']
print(*out, sep='\n')

# Output
Mary's favorite number is 1 and her favorite instrument is Guitar
Mary's favorite number is 2 and her favorite instrument is Piccolo
Mary's favorite number is 3 and her favorite instrument is Sax

Update: to convert the `out` series to a dataframe, use `to_frame`:

out = out.to_frame('colname')
print(out)

# Output
                                             colname
0  Mary's favorite number is 1 and her favorite i...
1  Mary's favorite number is 2 and her favorite i...
2  Mary's favorite number is 3 and her favorite i...
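
The trailing "..." in the printed frame is only pandas shortening long cells for display; the stored strings are complete. A minimal sketch of printing them in full, assuming pandas 1.0 or later (where `display.max_colwidth` accepts `None`); the variable `full_text` is just an illustrative name:

```python
import pandas as pd

dframe = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['Guitar', 'Piccolo', 'Sax'],
})

out = (
    "Mary's favorite number is " + dframe['A'].astype(str)
    + " and her favorite instrument is " + dframe['B']
).to_frame('colname')

# Temporarily lift the column-width limit so cells are not cut off with "...".
with pd.option_context('display.max_colwidth', None):
    full_text = repr(out)

print(full_text)
```

The option is only changed inside the `with` block, so other output keeps the default width.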

edited Jun 19 '22 at 05:20

answered Jun 19 '22 at 04:50

Corralien


How would I save the output as a separate data frame? @Corralien – HelpMeCode Jun 19 '22 at 05:07
`out` is already a `Series`; you can add it to your dataframe with `dframe['colname'] = out` or convert it to a dataframe with `out = out.to_frame('colname')` – Corralien Jun 19 '22 at 05:08
Would like the entire output as a data frame, if possible. Sorry if I did not make that clear. @Corralien – HelpMeCode Jun 19 '22 at 05:09
out = print(*out, sep='\n')? @Corralien – HelpMeCode Jun 19 '22 at 05:11
No, just use `out = out.to_frame('colname')`. Is that what you expect? – Corralien Jun 19 '22 at 05:11

ddejohn · Answer 2 · 2022-06-19T04:42:05.347

1

Here's something that works (assuming the data frame is named `df`):

df["Sentence"] = df.apply(lambda row: f"Mary's favorite number is {row.A} and her favorite instrument is {row.B}", axis=1)

for sentence in df["Sentence"]:
    print(sentence)

Output:

Mary's favorite number is 1 and her favorite instrument is Guitar
Mary's favorite number is 2 and her favorite instrument is Piccolo
Mary's favorite number is 3 and her favorite instrument is Sax

You could get fancier:

def sentence_maker(name, possessive_pronoun, number, instrument):
    return f"{name}'s favorite number is {number} and {possessive_pronoun} favorite instrument is {instrument}"


for sentence in df.apply(lambda row: sentence_maker("Leela", "their", row.A, row.B), axis=1):
    print(sentence)

Output:

Leela's favorite number is 1 and their favorite instrument is Guitar
Leela's favorite number is 2 and their favorite instrument is Piccolo
Leela's favorite number is 3 and their favorite instrument is Sax
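
As an alternative to `apply(axis=1)`, the same sentences can be built with a plain list comprehension over the two columns. This is only a sketch on the sample data from the question, and it has not been benchmarked here, so treat any speed difference as something to measure rather than assume.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['Guitar', 'Piccolo', 'Sax'],
})

# Pair up the two columns and format one sentence per row in plain Python.
sentences = [
    f"Mary's favorite number is {number} and her favorite instrument is {instrument}"
    for number, instrument in zip(df["A"], df["B"])
]

# Store the result as a new column, as in the apply-based version.
df["Sentence"] = sentences

for sentence in df["Sentence"]:
    print(sentence)
```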

edited Jun 19 '22 at 04:42

answered Jun 19 '22 at 04:36

ddejohn


Just wondering if this is the optimal solution, isn't the loop happening twice? Once in apply lambda and once in sentence? – Kavi Harjani Jun 19 '22 at 04:41
Huh? What is "stence"? Either way, no, the lambda runs only once. – ddejohn Jun 19 '22 at 04:42
I read it uses a loop under the hood https://stackoverflow.com/questions/47749018/why-is-pandas-apply-lambda-slower-than-loop-here – Kavi Harjani Jun 19 '22 at 04:45
Oh, sure. Looping over a dataframe is a terrible solution, period. Whether it's an explicit loop like OP is trying to use, or using `.apply()`. If OP really needs the kind of performance where the difference between `.iterrows()` and `.apply()` becomes important, then they're kind of screwed because strings can't take advantage of vectorized operations. – ddejohn Jun 19 '22 at 04:47
How would I save the output as a separate data frame? @ddejohn – HelpMeCode Jun 19 '22 at 05:08

score 1 · Answer 3 · answered Jun 19 '22 at 04:40

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['Guitar', 'Piccolo', 'Sax']
})

# iterrows() yields (index label, row) pairs, so read the values from row directly
for index, row in df.iterrows():
    print(f"Mary's favorite number is {row['A']} and her favorite instrument is {row['B']}")

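Output:

Mary's favorite number is 1 and her favorite instrument is Guitar
Mary's favorite number is 2 and her favorite instrument is Piccolo
Mary's favorite number is 3 and her favorite instrument is Sax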
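
If the goal is to keep the sentences in a separate data frame rather than print them, one possible sketch is to collect them in a list and build a new frame from it. The column name "Sentence" and the variables `lines` and `result` are illustrative choices, and `itertuples` is used here in place of `iterrows` only as one option.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['Guitar', 'Piccolo', 'Sax'],
})

# itertuples() yields one namedtuple per row; columns are read as attributes.
lines = []
for row in df.itertuples(index=False):
    lines.append(
        f"Mary's favorite number is {row.A} and her favorite instrument is {row.B}"
    )

# Build a separate one-column data frame holding the sentences.
result = pd.DataFrame({"Sentence": lines})
print(result)
```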

answered Jun 19 '22 at 04:40

Kavi Harjani


How would I save the output as a separate data frame? @KaviHarjani – HelpMeCode Jun 19 '22 at 05:08
