
I'd like to get the unique values from a column in a dataframe. With the R package dplyr, it should be possible.


This distinct(select(dataframe, column)) works great on my Mac. In RStudio on Windows 7 I encounter this:

[screenshot of the error]

when I run this R code:

library(dplyr)
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))


unique_values <- distinct(select(df, X1))


EDIT

Please check if dplyr::distinct(select(df, X1)) works? – akrun

Of course - here is the console output:

[screenshot of the console output]

EDIT

I've not used distinct, but perhaps unique would work for you? unique(df$X1) – NPE

It does work, and it's concise too! I would still like to understand this dplyr error...


EDIT

Please add the output of sessionInfo() instead. – Roland

[screenshot of the sessionInfo() output]

EDIT

Some comments note that dplyr_0.2 is an old version. On this machine, install.packages("dplyr") still fetches the old package from CRAN, so the next step is to figure out how to manually install dplyr_0.3.0.2.
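A quick way to confirm this diagnosis is to ask R which dplyr version is installed and whether that version exports distinct(). This is a minimal sketch using only base R helpers; the variable names dplyr_installed and has_distinct are illustrative.

```r
# Report the installed dplyr version and whether it exports distinct().
dplyr_installed <- requireNamespace("dplyr", quietly = TRUE)
has_distinct <- dplyr_installed && "distinct" %in% getNamespaceExports("dplyr")

if (dplyr_installed) {
  message("dplyr version: ", as.character(packageVersion("dplyr")))
}
message("distinct() available: ", has_distinct)
```

If has_distinct is FALSE while dplyr is installed, the installed version most likely predates distinct().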


smci
Micah Stubbs

1 Answer


Figured it out! Old R means old dplyr means no distinct() function.

To fix this, install the latest version of R:

  1. go to http://www.r-project.org
  2. click on 'CRAN'
  3. then choose the CRAN site that you like. I like Kansas: http://rweb.quant.ku.edu/cran/
  4. click on 'Download R for X' [where X is your operating system]
  5. follow the installation procedure for your operating system
  6. restart RStudio
  7. rejoice

source: this very nice answer

Then run the command install.packages("dplyr") in the RStudio Console.
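After reinstalling, it may be worth checking that both R and dplyr are actually new enough. The sketch below uses 0.3.0.2 as the threshold only because that is the version named in the question; treat it as an assumption rather than an official minimum. The names min_dplyr and dplyr_ok are illustrative.

```r
# Threshold taken from the question; an assumption, not an official minimum.
min_dplyr <- package_version("0.3.0.2")

message("R version: ", as.character(getRversion()))

# TRUE only if dplyr is installed and at least as new as the threshold.
dplyr_ok <- requireNamespace("dplyr", quietly = TRUE) &&
  packageVersion("dplyr") >= min_dplyr

if (!dplyr_ok) {
  message("dplyr is missing or older than ", as.character(min_dplyr),
          "; update R, then run install.packages(\"dplyr\")")
}
```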

Now you can create a dataframe and use the distinct() function to get the unique values from one of its columns:

library(dplyr)

# create a dataframe with some values
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))
df

# select a column from that dataframe and get its unique values
# (distinct() returns them as a one-column data frame, not a list)
unique_values <- distinct(select(df, X1))
unique_values

In the console you should see the full dataframe printed first, followed by a one-column data frame holding the distinct values of X1.
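For comparison, base R can get the same values without dplyr, as NPE's comment on the question suggested. This is a minimal sketch; the set.seed() call and the names vals and one_col are added here only to make the example reproducible and readable.

```r
set.seed(1)  # assumption: fixed seed so the random sample is reproducible
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))

# unique() on the column itself returns a plain vector of the distinct values
vals <- unique(df$X1)

# unique() on a one-column data frame keeps the data frame shape,
# similar to what distinct(select(df, X1)) returns
one_col <- unique(df["X1"])

vals
one_col
```

Use the vector form when you only need the values, and the data frame form when the result feeds into further dplyr-style steps.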

Thanks to David Arenburg and Richard Scriven for pointing out that dplyr-0.2 is old and lacks the distinct() function. This line of thinking led to the answer.

Micah Stubbs