
123.csv contains the following data:

47608.62, 47573.99, 47530.34, 48089.44, 48105.55,.......

There are several ways to index and slice in Python. What are the differences between the following?

1. a=dataset[i]
2. a=dataset[i,0]
3. a=dataset[i:i+1, 0]

Results (each assignment was run inside the loop shown in the code below, and `a` was appended to a list):

1. a = dataset[i]
[array([47608.62]), array([47573.99]), array([47530.34]), ...

2. a = dataset[i,0]
[47608.62, 47573.99, 47530.34, 48089.44, 48105.55, ...

3. a = dataset[i:i+1, 0]
[array([47608.62]), array([47573.99]), array([47530.34]), ...

The code is below:

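import numpy
from pandas import read_csv

# Load only the second column (index 1) of the CSV file
dataframe = read_csv('123.csv', usecols=[1], engine='python')
dataset = dataframe.values  # 2-D NumPy array of shape (n_rows, 1)
dataX = []
for i in range(len(dataset) - 2):
    a = dataset[i:i+1,0]  # row slice plus column index: 1-D array of length 1
    dataX.append(a)
print(dataX)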

I expected #2 and #3 to give the same result, but they do not. Why do I get the results above? I'm confused about how indexing and slicing differ between lists and arrays. Can you explain it?

avermaet
Shumagel
  • Look at your data types. `dataset[1]` and `dataset[1:2,0]` are numpy arrays, whereas `dataset[1,0]` is a float. Thus, when you put them in a list you get a list of arrays and a list of floats, respectively. The first two take slices of the array `dataset`, while the third one pulls a value directly given its row and column location. – Todd Burus Aug 17 '19 at 13:31
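The point in the comment above can be checked with a small sketch. It assumes a hypothetical three-row, one-column array standing in for `dataframe.values` (the real file is not available here), and the variable names `row`, `scalar`, `window`, and `flat` are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for dataframe.values: a 2-D array with a single column.
dataset = np.array([[47608.62], [47573.99], [47530.34]])

i = 0
row = dataset[i]              # integer index on axis 0 only: 1-D array, shape (1,)
scalar = dataset[i, 0]        # integer index on both axes: a single NumPy float
window = dataset[i:i+1, 0]    # slice on axis 0, integer on axis 1: 1-D array, shape (1,)

print(type(row), row.shape)
print(type(scalar), np.ndim(scalar))
print(type(window), window.shape)

# If a flat list of floats is the goal, the whole column can be taken at once.
flat = dataset[:, 0].tolist()
print(flat)
```

An integer index removes the axis it is applied to, while a slice keeps that axis even when it selects only one element. That is why `dataset[i:i+1, 0]` still yields an array of length 1, whereas `dataset[i, 0]` yields a bare number.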
