I understand that the former gives me a Series whereas the latter gives a DataFrame. What I don't get is the argument. `df[['column_name']]` returns a DataFrame. Is that because I'm passing `['column_name']`, an iterable, as its `data=` parameter? I'm struggling to understand how Python works here. My results are as follows:

df['Yil']=
bir     2021
ikki    2020
19      2019
18      2018
17      2017
16      2016
15      2015
10      2010

df[['Yil']]=

        Yil
bir     2021
ikki    2020
19      2019
18      2018
17      2017
16      2016
15      2015
10      2010
Name: Yil, dtype: int64 
  • 1
    `Name: Yil, dtype: int64` seems to be in the wrong place. Seems like you copy-pasted `df[['Yil']]` into the middle – wjandrea Oct 08 '22 at 18:34
  • 1
    BTW, welcome to Stack Overflow! Check out the [tour], and [ask] if you want tips. I edited your question to remove some unnecessary parts since Stack Overflow is meant to be like a reference ([more details](https://meta.stackoverflow.com/q/260776/4518341)). – wjandrea Oct 08 '22 at 18:36
  • What do you mean by "`data=` parameter"? Are you confusing the dataframe constructor with indexing? – wjandrea Oct 08 '22 at 18:47
  • Related: [Selecting multiple columns in a Pandas dataframe](/q/11285613/4518341), [Keep selected column as DataFrame instead of Series](/q/16782323/4518341) – wjandrea Oct 08 '22 at 19:23

2 Answers


`df['column_name']` returns a Series containing that column.

`df[['column_name']]` returns a DataFrame with a single column named `column_name`.

You clearly noticed that already.

DataFrames and Series have somewhat different methods available, so it's hard to tell which one you want without more info.
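To make the difference concrete, here is a minimal sketch. It assumes a frame rebuilt from the output printed in the question (the index labels and `Yil` values are copied from there; the variable names `s` and `sub` are just for illustration):

```python
import pandas as pd

# Rebuild a frame shaped like the one in the question
# (index labels and values copied from its printed output).
df = pd.DataFrame(
    {"Yil": [2021, 2020, 2019, 2018, 2017, 2016, 2015, 2010]},
    index=["bir", "ikki", 19, 18, 17, 16, 15, 10],
)

s = df["Yil"]      # key is a single label        -> Series
sub = df[["Yil"]]  # key is a list of labels      -> DataFrame

print(type(s).__name__)    # Series
print(type(sub).__name__)  # DataFrame
print(s.shape, sub.shape)  # (8,) (8, 1)

# The data is the same either way; only the container differs.
assert sub["Yil"].equals(s)
```

Note that this is indexing (`df[...]`), not the `pd.DataFrame(data=...)` constructor: the list is the key that says which columns to select.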

wjandrea
Joran Beasley

To select multiple columns of a DataFrame, the indexer can't be just any iterable. (For example, strings are iterable, but a string is treated as a single column label.) According to the documentation, it has to be a list, although from some quick testing, some other iterables also work:

Iterators

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': [2, 3], 'b': [4, 5], 'c': [6, 7]})

In [3]: df[['a']]
Out[3]: 
   a
0  2
1  3

In [4]: df[iter(['a'])]  # Dummy iterator
Out[4]: 
   a
0  2
1  3

In [5]: df[(x for x in ['a'])]  # Dummy generator, a kind of iterator
Out[5]: 
   a
0  2
1  3

Ranges

In [6]: df1 = pd.DataFrame([['a', 'b'], ['c', 'd']])

In [7]: df1[range(1)]
Out[7]: 
   0
0  a
1  c

Dicts and sets also work, but using them as indexers is deprecated.


In contrast, a tuple cannot be used to select multiple columns:

In [8]: df[('a',)]
Traceback (most recent call last):
  ...
KeyError: ('a',)

That's because a tuple is treated as a single key, which is what makes multilevel (MultiIndex) column indexing possible:

In [9]: df2 = pd.DataFrame(
   ...:    [[2, 4], [3, 5]],
   ...:    columns=pd.MultiIndex.from_tuples([('a', 'b'), ('a', 'c')]))

In [10]: df2
Out[10]: 
   a   
   b  c
0  2  4
1  3  5

In [11]: df2[('a', 'c')]
Out[11]: 
0    4
1    5
Name: (a, c), dtype: int64
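
As a rough sketch of how the key type decides the result, the snippet below rebuilds `df2` from above and compares a tuple key with a list containing that tuple. It also checks the string case mentioned at the start. The variable names `col`, `sub`, `df` and `string_is_single_label` are only for illustration.

```python
import pandas as pd

# Same MultiIndex frame as df2 above.
df2 = pd.DataFrame(
    [[2, 4], [3, 5]],
    columns=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")]),
)

col = df2[("a", "c")]    # tuple: one key spanning both levels -> Series
sub = df2[[("a", "c")]]  # list holding one tuple: a selection -> DataFrame

print(type(col).__name__)  # Series
print(type(sub).__name__)  # DataFrame

# A string is iterable too, but it is looked up as one label,
# not as the sequence of its characters.
df = pd.DataFrame({"a": [2, 3], "b": [4, 5]})
try:
    df["ab"]
    string_is_single_label = False
except KeyError:
    string_is_single_label = True

print(string_is_single_label)  # True
```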
wjandrea