
Newbie here.

I'm trying to learn Python and work with datasets; I've kind of been thrown in at the deep end at work. The language is clearly very powerful, but very different from anything else I've experienced before.

I need some clarity / help / explanation on the following please.

Partial Algo code

history = data.history(context.stock_list, fields="price", bar_count=300, frequency="1d")
hs = history.iloc[-20:]
p = history.iloc[-1]

What is the difference between the 3 variables shown?

hs1 = history.iloc[:20]   
hs2 = history.iloc[20:]
hs3 = history.iloc[-20]

history creates a dataset of 4 asset prices, as can be seen from the image under "Additional Info".

I've researched and learned that iloc is a pandas indexing and referencing function.

However, what I do not understand is the [:20], [20:], [-20] indexes(?) attached to the iloc function in the 3 example variables shown above.

Questions

  • hs1 = history.iloc[:20]: According to the Python tutorial on pandas dataframes I have been following, hs1 = history.iloc[:20] singles out or deletes the first 20 columns within the dataframe. Is this correct?
  • hs2 = history.iloc[20:]: What is the difference from the above variable?
  • hs3 = history.iloc[-20]: Why a minus - and no : inside the index?

Additional Info

The history variable creates a dataset of 3 assets

[Image: screenshot of the history dataset, not preserved]

Hope this makes sense. Please comment if you need any additional info; any help and advice is much appreciated.

Marilee
  • This is basic indexing in python. I find it hard to believe that you are being made to develop products in a language you don't even know, without at least first pointing you to basic documentation and references. Wow, that's harsh. – cs95 Nov 02 '17 at 06:14
  • As a favour, I've closed this question as a duplicate of a canonical question on the topic of slicing. Read that and understand how slicing works. Once you have understood basic slicing, it becomes easy to apply the concept to dataframes - `iloc` is just a convenience function for row/column slicing by index. – cs95 Nov 02 '17 at 06:16
  • @COLDSPEED yeah it sucks but it is what it is, my job is on the line here so kinda stressed. To make it worse I come from the PHP world, ugh. Thanks for the link, that does help a lot, reading through it now. Unfortunately I do not have much time to "play around with code" and test / explore different scenarios. Allow me to be a bit cheeky: may I ask a huge favor of you? Would you be so kind as to shortly summarize / explain the different scenarios specific to my question? – Marilee Nov 02 '17 at 06:24
  • Well, if you ask me so nicely, how can I refuse? Give me a bit. – cs95 Nov 02 '17 at 06:27
  • Understanding code and functions is one thing, but if you have a more specific question on how to apply a particular operation on your data, you could open a new question. You should provide some sample data and expected output along with a brief description of what you want to achieve. Read [this question](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples), it gives you a good idea of what is expected in a good pandas "How do I...?" question. – cs95 Nov 02 '17 at 06:44
  • hey, don't feel so bad :) many probably have the same problem. the number of times i put in **li = [1,2,3]** and use the interpreter to play with indices like yours is a bit humbling. slicing syntax is powerful but it never seems to stick long in my head :( – JL Peyret Nov 02 '17 at 07:54
  • @Marilee *coming from* PHP is not a sin - quite possibly even an act of repentance. *Returning to* PHP on the other hand is the one that Dante forgot from the list of eight deadly vices. – Antti Haapala -- Слава Україні Nov 02 '17 at 17:08

1 Answer

Before beginning anything else, I recommend reading Understanding Python's slice notation to get a first-class insight into how Python's slicing notation works. In particular, look at the different slice modes available to you:

a[start:end] # items start through end-1
a[start:]    # items start through the rest of the array
a[:end]      # items from the beginning through end-1
a[:]         # a copy of the whole array

  • a[start:end] returns a sub-slice from index start (inclusive) to end - 1, so end itself is excluded.

    >>> lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    >>> lst[2:5]
    [3, 4, 5]
    
  • a[start:] returns a sub-slice from start till the end of the list.

    >>> lst[5:]
    [6, 7, 8, 9, 10]
    
  • a[:end] returns a sub-slice from the beginning of the list till end - 1.

    >>> lst[:5]
    [1, 2, 3, 4, 5]
    
  • a[:] just returns a new copy of the same list.

    >>> lst[:]
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
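
The code in the question also uses negative positions ([-20:], [-1], [-20]), which the modes above do not show. As a small sketch with the same lst: a negative index counts from the end, and it can be used on its own or as either bound of a slice.

```python
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# A single negative index returns one element, counted from the end.
print(lst[-1])    # 10 (last element)
print(lst[-3])    # 8  (third from the end)

# A negative bound inside a slice returns a list.
print(lst[-3:])   # [8, 9, 10]  (the last 3 elements)
print(lst[:-3])   # [1, 2, 3, 4, 5, 6, 7]  (everything except the last 3)

# lst[-n] is the same element as lst[len(lst) - n].
print(lst[-3] == lst[len(lst) - 3])  # True
```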
    

Understand this, and you've understood dataframe indexing.


As I've already mentioned, iloc is used to select dataframe subslices by integer position, and the same rules apply. Here's the documentation:

DataFrame.iloc

Purely integer-location based indexing for selection by position.

.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.

It's a bit much to take in, but the pandas cookbook makes it simple. The basic syntax is:

df.iloc[x, y] 

Where x is the row index/slice and y is the column index/slice. If the second argument is omitted, row slicing is assumed. In your case, you have:

  • history.iloc[:20] which returns the first 20 rows.

  • history.iloc[20:] which returns everything after the first 20 rows.

  • history.iloc[-20], which is interpreted as history.iloc[len(history) - 20] which is the 20th row from the end (negative indices specify indexing from the end).
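
To make the three cases concrete, here is a rough sketch that stands in for your history variable. The real one comes from data.history(...), which I cannot reproduce here, so the dates, the column names (AAA to DDD) and the values below are all made up; only the shape (300 rows, 4 assets) follows the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `history`: 300 daily rows, 4 invented asset columns.
dates = pd.date_range("2017-01-02", periods=300, freq="D")
history = pd.DataFrame(
    np.arange(300 * 4, dtype=float).reshape(300, 4),
    index=dates,
    columns=["AAA", "BBB", "CCC", "DDD"],
)

hs1 = history.iloc[:20]   # first 20 rows            -> DataFrame
hs2 = history.iloc[20:]   # everything after row 19  -> DataFrame
hs3 = history.iloc[-20]   # one row, 20th from end   -> Series
hs = history.iloc[-20:]   # last 20 rows             -> DataFrame
p = history.iloc[-1]      # last row                 -> Series

print(hs1.shape)  # (20, 4)
print(hs2.shape)  # (280, 4)
print(hs3.shape)  # (4,)
print(hs.shape)   # (20, 4)
print(p.shape)    # (4,)

# The negative index is equivalent to counting from the front.
print(hs3.equals(history.iloc[len(history) - 20]))  # True
```

Note that none of these delete anything or touch columns: every call selects rows and leaves history itself unchanged.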

Consider a dataframe:

df

   A
0  0
1  1
2  2
3  3
4  4
5  5
6  6
7  7
8  8
9  9

Here are the different slice modes in action.

df.iloc[:5]

   A
0  0
1  1
2  2
3  3
4  4

df.iloc[5:]

   A
5  5
6  6
7  7
8  8
9  9

df.iloc[-5]

A    5
Name: 5, dtype: int64
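
Notice that the last result is printed differently from the first two. As a short sketch on the same single-column df (rebuilt here so the snippet runs on its own): a slice gives back a DataFrame, a single integer gives back one row as a Series, and adding a second argument picks a column by position as well.

```python
import pandas as pd

# Same dataframe as above: one column A holding 0..9.
df = pd.DataFrame({"A": range(10)})

row = df.iloc[-5]      # single integer   -> one row, returned as a Series
tail = df.iloc[-5:]    # slice            -> DataFrame with the last 5 rows
cell = df.iloc[-5, 0]  # row -5, column 0 -> a single value

print(type(row).__name__)   # Series
print(type(tail).__name__)  # DataFrame
print(tail["A"].tolist())   # [5, 6, 7, 8, 9]
print(cell)                 # 5
```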


cs95
  • I get it now, it's actually kind of simple. Just a very different syntax. Thank you so much for your time and effort... appreciate it more than you realize! Feel like I owe you something, stay awesome :) – Marilee Nov 02 '17 at 08:01
  • @Marilee Just super glad to know it helped. If you've any more questions, feel free to ping me, here, or at my email (it's in my bio). Cheers. – cs95 Nov 02 '17 at 08:03