using CSV, DataFrames
iris = CSV.read(joinpath(dirname(pathof(DataFrames)),"..","test/data/iris.csv"))
head(iris)
6×5 DataFrame
│ Row │ SepalLength │ SepalWidth │ PetalLength │ PetalWidth │ Species │
│ │ Float64⍰ │ Float64⍰ │ Float64⍰ │ Float64⍰ │ String⍰ │
├─────┼─────────────┼────────────┼─────────────┼────────────┼─────────┤
│ 1 │ 5.1 │ 3.5 │ 1.4 │ 0.2 │ setosa │
│ 2 │ 4.9 │ 3.0 │ 1.4 │ 0.2 │ setosa │
│ 3 │ 4.7 │ 3.2 │ 1.3 │ 0.2 │ setosa │
│ 4 │ 4.6 │ 3.1 │ 1.5 │ 0.2 │ setosa │
│ 5 │ 5.0 │ 3.6 │ 1.4 │ 0.2 │ setosa │
│ 6 │ 5.4 │ 3.9 │ 1.7 │ 0.4 │ setosa │
I want to find all rows where Species is in setosa
or virginica
. Note that the answer must use a lookup into an array of values to find since I want the result to work when looking for arbitrarily many values.
There is a function called indexin. It gets me halfway there:
iris[indexin(iris.Species ,["setosa", "virginica"])]
But when I try to use it for indexing the result is:
ERROR: ArgumentError: Only Integer values allowed when indexing by vector of numbers