-1

I have a pandas dataframe:

ind  
0   ['C']
1   ['C']
2   ['C']
3   ['C']
4   ['E']
5   ['E']

I want to convert it into a string: CCCCEE

jpp
kdba
  • Possible duplicate of [Python Pandas concatenate a Series of strings into one string](https://stackoverflow.com/questions/41400381/python-pandas-concatenate-a-series-of-strings-into-one-string) – Dodge May 22 '18 at 22:17

3 Answers

4

You can use the `.str` accessor:

df['ind'].str[0].sum()
Out[197]: 'CCCCEE'
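
As a minimal, self-contained sketch of why this works (the sample frame below is an assumption that mirrors the question's data): `.str[0]` indexes into each list element-wise, and summing the resulting strings concatenates them. Joining the same Series with `str.join` is shown alongside for comparison.

```python
import pandas as pd

# Assumed sample data mirroring the question: each cell holds a one-element list.
df = pd.DataFrame({'ind': [['C'], ['C'], ['C'], ['C'], ['E'], ['E']]})

# .str[0] takes the first item of each list, giving a Series of plain strings.
first_items = df['ind'].str[0]

# Summing a Series of strings concatenates them in row order.
via_sum = first_items.sum()

# str.join over the same Series builds the same string.
via_join = ''.join(first_items)

print(via_sum, via_join)
```

Note that `.str[0]` keeps only the first item of each list, so this approach fits the question's one-element lists; rows with longer lists would lose their remaining items.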
BENY
3

Using itertools.chain:

from itertools import chain

import pandas as pd

df = pd.DataFrame({'ind': [['C'], ['C'], ['C'], ['C'], ['E'], ['E']]})

res = ''.join(chain.from_iterable(df['ind']))

print(res)

CCCCEE
jpp
  • Sure but shouldn't `''.join(chain.from_iterable(df['ind']))` suffice? – Anton vBR May 22 '18 at 21:55
  • @AntonvBR, The `list` conversion is intentional, [see here why](https://stackoverflow.com/a/37782238/9209546). You're right that accessing `values` is unnecessary, updated. – jpp May 22 '18 at 21:59
  • Oh, it is that same old thing. Yeah, that is probably right, but I feel it loses readability when applying multiple layers of parentheses. – Anton vBR May 22 '18 at 22:03
  • OK, I ran some timings with a very large dataframe and I interpret them like this: `str.join` converts a generator to a list before joining, which is why a `[]` list comprehension is recommended over a `()` generator expression. However, when the value being passed is already an iterator, as in this case, there is no point in converting it to a list, since `str.join` will do that anyway. I therefore think you are wrong in this particular case and that `list()` only adds an extra layer of parentheses. – Anton vBR May 22 '18 at 22:08
  • @AntonvBR, That's interesting, and I agree `list` is expensive. Looks like I need to dig a little deeper. Thanks :). – jpp May 22 '18 at 22:16
  • In any case. Nice solution. Sorry for spamming comments. – Anton vBR May 22 '18 at 22:17
  • @AntonvBR, No.. in fact, I think this is precisely the purpose of comments. Learn something new! – jpp May 22 '18 at 22:17
  • OK last thing. Let's leave this question open for the future. The answer might not be right but I strongly believe it is. – Anton vBR May 22 '18 at 22:18
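
The timing question raised in the comments above can be checked with a rough sketch like the following. The frame size, the repeat count, and the helper names `join_iterator` and `join_list` are all assumptions chosen for illustration; absolute timings depend on the machine and on the Python and pandas versions, so no result is claimed here.

```python
from itertools import chain
from timeit import timeit

import pandas as pd

# Assumed test data: 100,000 rows of one-element lists, like the question's column.
df = pd.DataFrame({'ind': [['C'], ['E']] * 50_000})

def join_iterator():
    # Pass the chain iterator straight to str.join.
    return ''.join(chain.from_iterable(df['ind']))

def join_list():
    # Materialize the iterator into a list before joining.
    return ''.join(list(chain.from_iterable(df['ind'])))

# Both variants must build the same string.
assert join_iterator() == join_list()

# Time each variant over a few repetitions and print the total seconds.
for fn in (join_iterator, join_list):
    print(fn.__name__, timeit(fn, number=20))
```

Running it on your own data is the more reliable way to decide whether the explicit `list()` call makes any difference in this case.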
0

You can do this:

result = ""
for index, row in df.iterrows():
    # Each cell holds a list such as ['C'], so join its items before appending.
    result += "".join(row['ind'])
print(result)

If you have a problem iterating over a dataframe, see "How to iterate over rows in a DataFrame in Pandas?"