
I have been following this Python linear regression tutorial: https://medium.com/@contactsunny/linear-regression-in-python-using-scikit-learn-f0f7b125a204

Using the following dataset: https://github.com/contactsunny/data-science-examples/blob/master/salaryData.csv

My problem is with the following piece of code:

x = dataset.iloc[:, :-1].values

What does the negative index (-1) do here? Why do I get an error if I use the following as an alternative:

x = dataset.iloc[:, 0].values
desertnaut
kzs

1 Answer


It means "select all columns except the last one":

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0, 100, (5, 5)), index=[*'abcde'], columns=[*'ABCDE'])

df.iloc[:,:-1]

Output:

    A   B   C   D
a  79  23   9  89
b  67  60  32  82
c  66  18  41  67
d  90  51  63  29
e  34  65  82  82

This statement selects all rows and slices the columns to drop the last one. Your second statement does not raise an error by itself; it is a valid statement:

df.iloc[:, 0]

Output:

a    79
b    67
c    66
d    90
e    34
Name: A, dtype: int32

Get all rows of the first column (position 0).
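
Since the second statement is valid on its own, the error most likely comes from a later step. Here is a minimal sketch of one probable cause, assuming x is later passed to a scikit-learn estimator's fit() as in the linked tutorial: a slice such as :-1 returns a DataFrame, so .values is a 2-D array, while a single integer such as 0 returns a Series, so .values is a 1-D array, and scikit-learn expects a 2-D feature array. The small frame below is made up to stand in for salaryData.csv; its column names and values are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame standing in for salaryData.csv
# (column names and values are made up for illustration).
dataset = pd.DataFrame({
    "YearsExperience": [1.1, 2.0, 3.2, 4.5],
    "Salary": [39000, 43000, 60000, 61000],
})

x_2d = dataset.iloc[:, :-1].values  # slice -> DataFrame -> 2-D array
x_1d = dataset.iloc[:, 0].values    # integer -> Series -> 1-D array

print(x_2d.shape)  # (4, 1)
print(x_1d.shape)  # (4,)

# Two ways to select only the first column and keep the 2-D shape:
x_list = dataset.iloc[:, [0]].values  # a list of positions returns a DataFrame
x_reshaped = x_1d.reshape(-1, 1)      # turn the 1-D array into a single column

print(np.array_equal(x_2d, x_list))      # True
print(np.array_equal(x_2d, x_reshaped))  # True
```

If the error you see mentions an expected 2-D array, either of the last two forms should be a drop-in replacement for dataset.iloc[:, 0].values.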

Scott Boston