67

What is the method to convert a Python list of strings to a pd.Series object?

(A pandas Series can be converted to a list with the tolist() method, but how do I do the reverse conversion?)

user2314737
Hypothetical Ninja

4 Answers

50

I understand that your list is in fact a list of lists, so it has to be flattened to one dimension first:

import pandas as pd

thelist = [ ['sentence 1'], ['sentence 2'], ['sentence 3'] ]
# take the single string out of each inner list; the result is a Series, not a DataFrame
series = pd.Series(v[0] for v in thelist)
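The generator above keeps only the first element of each inner list. If the inner lists can hold more than one string, a variation is to flatten everything with itertools.chain before building the Series. This is only a sketch: the names nested, flat and series and the sample data are made up for illustration.

```python
from itertools import chain

import pandas as pd

# Hypothetical data: inner lists may hold more than one string.
nested = [['sentence 1'], ['sentence 2', 'sentence 3'], ['sentence 4']]

# chain.from_iterable walks every inner list in order,
# yielding one flat stream of strings.
flat = list(chain.from_iterable(nested))

series = pd.Series(flat)
print(series.tolist())
```

Unlike v[0], this keeps every string, so the Series can end up longer than the outer list.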
Colin Bernet
  • From your edits and comments, I understand that the list you're talking about is a list of lists. You have to make it 1D to make the Series. I edited my post to show how to do that with a generator. – Colin Bernet Feb 08 '14 at 14:01
  • 1
    It was simple: df = pd.Series(data) automatically converted the whole text into a Series object. Thanks! You can edit your post to include this too, for others to benefit. :) – Hypothetical Ninja Feb 08 '14 at 14:02
  • 1
    ok, still not sure about what a sentence is in your case, but I'm happy I could help anyway :-) - cheers – Colin Bernet Feb 08 '14 at 14:04
39

To convert the list myList to a pandas Series, use:

mySeries = pd.Series(myList)

This is also one of the basic ways to create a Series from a list in pandas.

Example:

import pandas as pd

myList = ['string1', 'string2', 'string3']
mySeries = pd.Series(myList)
mySeries
# Out: 
# 0    string1
# 1    string2
# 2    string3
# dtype: object

Note that pandas infers a single data type (dtype) for the whole Series, whereas a Python list can freely mix types. In the example above the inferred dtype is object, the dtype pandas uses here for Python strings; it is the most general dtype and can hold values of any type.

The inferred data type depends on the elements of the list:

myList = [1, 2, 3]

# inferred data type is integer
pd.Series(myList).dtype
# Out:
# dtype('int64')

myList = ['1', 2, 3]

# data type is object
pd.Series(myList).dtype
# Out:
# dtype('O')

It's also possible to specify the data type explicitly when creating a Series, for example as integer:

myList = ['1', 2, '3']
mySeries = pd.Series(myList, dtype='int64')
mySeries.dtype
# Out:
# dtype('int64')

But this works only if every element in the list can be cast to the requested data type; otherwise pandas raises an error.
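When some entries might not be castable, one option is to build the Series first and convert afterwards with pd.to_numeric, which can turn unparseable entries into NaN instead of raising. A minimal sketch, assuming the hypothetical input raw below; the names raw, numbers and clean are illustrative only.

```python
import pandas as pd

# Hypothetical input: numeric strings plus one entry that is not a number.
raw = ['1', '2', 'three']

# errors='coerce' replaces anything that cannot be parsed with NaN
# instead of raising, so the result has a float dtype.
numbers = pd.to_numeric(pd.Series(raw), errors='coerce')

# After dropping the bad entry, the remaining values can be cast
# to a fixed-width integer dtype.
clean = numbers.dropna().astype('int64')

print(numbers.isna().tolist())
print(clean.tolist())
```

Whether dropping the bad rows is acceptable depends on the data; inspecting numbers.isna() first shows which entries failed to parse.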

user2314737
11
import pandas as pd
sentence_list = ['sentence 1', 'sentence 2', 'sentence 3', 'sentence 4']
print("List of Sentences: \n", sentence_list)
sentence_series = pd.Series(sentence_list)
print("Series of Sentences: \n", sentence_series)

Documentation

Even if sentence_list is a list of lists, this code still produces a pandas Series object; each element of the Series is then one of the inner lists rather than a string.
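To see what that means in practice, here is a small check, assuming a hypothetical nested list named nested: the Series is created without error, but its elements are still Python lists.

```python
import pandas as pd

# Hypothetical nested input, one single-item list per sentence.
nested = [['sentence 1'], ['sentence 2'], ['sentence 3']]

series = pd.Series(nested)

# The Series has one entry per inner list, stored with dtype object.
first = series.iloc[0]
print(type(first))
print(series.dtype)
```

So string operations on such a Series would act on lists, not on the sentences themselves, unless the data is flattened first.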

JustCurious
2

pd.Series(l) actually works on almost any type of list and returns a Series object:

import numpy as np
import pandas as pd

# each assignment below is an input that pd.Series accepts; the last one is used
l = [ ['sentence 1'], ['sentence 2'], ['sentence 3'] ] #works
l = ['sentence 1', 'sentence 2', 'sentence 3'] #works
l = np.array(['sentence 1', 'sentence 2', 'sentence 3'], dtype='object') #works

print(l, type(l))
ds = pd.Series(l)
print(ds, type(ds))

0    sentence 1
1    sentence 2
2    sentence 3
dtype: object <class 'pandas.core.series.Series'>
prosti