I am fairly new to Python and am confused about how to read documentation. In particular, the scikit-learn documentation for its Logistic Regression classifier describes a method `fit` with the following parameters:

Parameters:
X : {array-like, sparse matrix}, shape (n_samples, n_features)
    Training vector, where n_samples is the number of samples and n_features is the number of features.
y : array-like, shape (n_samples,)
    Target vector relative to X.

The first part of the description of X seems to ask for a dictionary, but the shape is a tuple? I'm confused: what type of input is the function expecting for X here?

Max von Hippel
Mike
  • It's asking for an array-like object (so, like an `np.array`) or a sparse matrix with a particular shape... those aren't literals, those are just brackets... – juanpa.arrivillaga Dec 08 '17 at 23:07
  • Also that would be a set, not a dictionary, as it lacks a colon. – jonrsharpe Dec 08 '17 at 23:07
  • Going off @jonrsharpe's comment, a dictionary would be of the form `{key: value, ..., key: value}`, rather than a set, which is `{key, key, ..., key}`. – Max von Hippel Dec 08 '17 at 23:08
  • If it was expecting a `dict` or a `set` it would say "dict" or "set" or "tuple" or whatever. – juanpa.arrivillaga Dec 08 '17 at 23:10
  • @juanpa.arrivillaga In Java terms, would that be similar to an ArrayList where each item in the list is a sample feature vector? And thanks everyone for the clarification. – Mike Dec 08 '17 at 23:11
  • @Mike No, it would *not* be like an ArrayList. The term "array-like" in scipy parlance is defined a bit tautologically: it's anything that can be converted into a `numpy.ndarray`, see [here](https://stackoverflow.com/questions/40378427/numpy-formal-definition-of-array-like-objects). Basically, though, it wants a two-dimensional numpy array. – juanpa.arrivillaga Dec 08 '17 at 23:14
  • @juanpa.arrivillaga How do I figure out what should be the array's rows and columns? Is that defined by shape (n_samples, n_features)? – Mike Dec 08 '17 at 23:21
  • @Mike yes, exactly. Rows are samples, columns are features. – juanpa.arrivillaga Dec 08 '17 at 23:22
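
To make the points from the comments concrete, here is a minimal sketch of what `fit` might be given, assuming NumPy and scikit-learn are installed. The toy values and the names `X_list`, `y_list`, and `clf` are made up for illustration; they do not come from the documentation. It shows that a plain nested list counts as "array-like" because it can be converted to a `numpy.ndarray`, and that the resulting shape is (n_samples, n_features).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: a nested list is "array-like" because NumPy
# can convert it to an ndarray. Each inner list is one sample (a row);
# each position within it is one feature (a column).
X_list = [
    [0.0, 1.0],
    [1.0, 1.5],
    [4.0, 5.0],
    [5.0, 4.5],
]
y_list = [0, 0, 1, 1]  # one target label per sample

X = np.asarray(X_list)
y = np.asarray(y_list)

print(X.shape)  # (4, 2): n_samples = 4, n_features = 2
print(y.shape)  # (4,):   n_samples = 4

# fit accepts the array-like inputs directly; passing the nested lists
# or the converted ndarrays should behave the same way.
clf = LogisticRegression()
clf.fit(X_list, y_list)

# predict returns one label per row of its input.
predictions = clf.predict(X)
print(predictions.shape)  # (4,)
```

The braces in `{array-like, sparse matrix}` are only the documentation's way of listing the accepted kinds of input, so neither a `dict` nor a `set` is involved anywhere in this call.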
