
I'm working on a machine learning project for university and I'm having trouble understanding some bits of code online. Here's the example:

digits = np.loadtxt(raw_data, delimiter=",")
x_train, y_train = digits[:,:-1], digits[:,-1:].squeeze() 

What do the slices in the second line mean? I'm trying to select the first 2/3 of the array. I've done that before with something like [:2*array_elements // 3], but I don't understand how to do it when there's a comma in the middle of the index.

desertnaut
Johnny
  • Possible duplicate of [Understanding Python's slice notation](https://stackoverflow.com/questions/509211/understanding-pythons-slice-notation) – Bracula Dec 06 '18 at 23:02
  • Question has nothing to do with `machine learning` (the `numpy` tag is more than enough) - kindly do not spam the tag (removed). – desertnaut Dec 06 '18 at 23:04
  • @Vladimir I'm not sure that's a great dupe target (but I'm sure there is one that's numpy specific somewhere) – jedwards Dec 06 '18 at 23:07
  • @jedwards probably, however, that question looks quite good to start with. This one is very basic, all you need is some documentation, a few minutes of reading and some practice. Nevertheless, if you think that it's not a duplicate, so be it. UPD. What do you think about this one: https://stackoverflow.com/questions/12116830/numpy-slice-of-arbitrary-dimensions ? – Bracula Dec 06 '18 at 23:53

2 Answers


numpy (or any other library, but this looks like numpy) can implement __getitem__ to accept tuples, so an index can have one part per axis. That differs from the built-in sequences such as list and str, which accept only an integer or a slice object.

You want to look at the slice "parts" individually, as separated by the comma delimiters. So [:,:-1] is actually : and :-1, which are completely independent: the first part applies to the rows and the second to the columns.
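To see the tuple for yourself, here is a minimal sketch. The class ShowKey is a made-up helper (not part of numpy or the standard library) that simply returns whatever the [] operator hands to __getitem__.

```python
class ShowKey:
    """Hypothetical helper: returns the key that the [] operator receives."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

# A single slice arrives as one slice object.
single = probe[:-1]

# Comma-separated parts arrive as a tuple with one entry per part.
pair = probe[:, :-1]

print(single)
print(pair)
```

Trying the same two-part index on a plain list raises a TypeError, because list does not accept tuples as indices.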

First slice

: is "all", no slicing along that axis.

:x is all up until (and not including) x and -1 means the last element, so...

:-1 is all up until (and not including) the last.

Second slice

x: is all after (and including) x, and we already know about -1 so...

-1: is all after (and including) the last -- in this case just the last.
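Here is a small sketch of both slices on a toy array. The values are made up for illustration (three feature columns plus one label column); your real digits array will be much larger.

```python
import numpy as np

# Toy stand-in for the loaded data: 4 rows, 3 feature columns + 1 label column.
digits = np.array([
    [0.0, 0.1, 0.2, 7.0],
    [1.0, 1.1, 1.2, 3.0],
    [2.0, 2.1, 2.2, 7.0],
    [3.0, 3.1, 3.2, 1.0],
])

x_train = digits[:, :-1]      # all rows, every column except the last
y_column = digits[:, -1:]     # all rows, last column only, still 2-D
y_train = y_column.squeeze()  # drop the length-1 axis, giving a 1-D array

print(x_train.shape)   # (4, 3)
print(y_column.shape)  # (4, 1)
print(y_train.shape)   # (4,)
```

The .squeeze() call is there because -1: keeps the column axis, so the labels would otherwise have shape (4, 1) instead of (4,).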

jedwards

There are two mechanisms involved here.

  1. Python's notation for slicing sequences: see "Understanding Python's slice notation" (https://stackoverflow.com/questions/509211/understanding-pythons-slice-notation)

    Basically the syntax is array[x:y], where the resulting slice starts at x (included) and ends at y (excluded). If the start (resp. end) is omitted, it means "from the first item" (resp. "through the last item"). Negative indices count from the end of the array, so -1 is the index of the last element:

    array[-1:]
    # The elements from the last index (included) through the end of the array,
    # which means a list containing only the last element:
    # array[-1:] == [array[-1]]
    array[-1:0]
    # Not the same thing: the slice stops before index 0, so it is empty
    # (slices do not wrap around).
    
  2. numpy's 2-dimensional arrays (assuming np stands for numpy): numpy frequently uses 2-dimensional arrays as matrices. To access the element in row x and column y, you can write matrix[x,y]. Python's slice notation also applies here, one slice per axis, to cut a matrix into a smaller sub-matrix.
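Both mechanisms can be tried on small inputs. This is only an illustrative sketch; the names items and matrix and their values are made up.

```python
import numpy as np

# Mechanism 1: slice notation on a plain Python list.
items = [10, 20, 30, 40]

first_two = items[:2]       # start omitted: from the first item
all_but_last = items[:-1]   # -1 counts from the end
last_only = items[-1:]      # from the last item through the end
empty = items[-1:0]         # stops before index 0, so nothing is selected

# Mechanism 2: one index or slice per axis on a 2-D numpy array.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

element = matrix[1, 2]      # row 1, column 2
sub = matrix[:, 1:]         # all rows, columns 1 onward

print(first_two, all_but_last, last_only, empty)
print(element)
print(sub)
```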

So, back to your problem:

digits[:,:-1]
= digits[start:end , start:-1]
= digits[start:end , start:end-1]
= the sub-matrix where you take all the rows (start:end) and you take all the columns except the last one (start:end-1)

And

digits[:,-1:]
= digits[start:end, -1:end]
= digits[start:end, end-1:end]
= the sub-matrix with all the rows and only the last column
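For the "first 2/3" part of the question, one possible approach is to put your usual 2/3 slice in the row position and leave the column slice as it is. This is a sketch on made-up data; n_train, x_test and y_test are names I chose for illustration, not anything from the original snippet.

```python
import numpy as np

# Toy stand-in for the loaded data: 6 rows, 2 feature columns + 1 label column.
digits = np.arange(18, dtype=float).reshape(6, 3)

n_train = 2 * digits.shape[0] // 3   # number of rows in the first two thirds

x_train = digits[:n_train, :-1]      # first 2/3 of the rows, all but the last column
y_train = digits[:n_train, -1]       # first 2/3 of the rows, last column as 1-D
x_test = digits[n_train:, :-1]       # remaining rows, same column split
y_test = digits[n_train:, -1]

print(x_train.shape, y_train.shape)  # (4, 2) (4,)
print(x_test.shape, y_test.shape)    # (2, 2) (2,)
```

Using the plain index -1 instead of the slice -1: drops the column axis directly, so no .squeeze() is needed in this variant.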
MoaMoaK