
I am following a video lecture on Coursera and was given a table of salary data, which I read in like this:

data02 = read.table("~/R/introstats/NYRedBullsSalaries.txt", header = F)

The table looks like this:

           V1
1    33750.00
2    44000.00
3   138188.00
4    45566.67
5    44000.00

I want to collect all the values in V1 into a single line while keeping them numeric. I tried this:

salaries = paste(as.character(data02), sep = " ", collapse =",")
salaries

and it appears to work:

# [1] "c(33750, 44000, 138188, 45566.67, 44000)"

But when I want to make a box plot, it fails:

boxplot(salaries)
## Error in x[floor(d)] + x[ceiling(d)] : non-numeric argument to binary operator

I could only get it to work by manually copying the numbers:

salaries_revised = c(33750, 44000, 138188, 45566.67, 44000)
boxplot(salaries_revised)

Question:

This is fine for 5 values, but highlighting and copying 5,000 values is not practical. How can I get a large column of data into a numeric vector without typing or pasting it by hand?

Brian Tompsett - 汤莱恩
ronzenith
  • This might be useful: http://stackoverflow.com/questions/20297564/converting-text-file-into-data-frame-in-r – Docconcoct Nov 07 '14 at 04:38

1 Answer


You could get the boxplot by just using

 boxplot(data02) #it will give a boxplot for each column

From the post, I am assuming that you want to mimic the output that is generated using

salaries_revised = c(33750, 44000, 138188, 45566.67, 44000)
str(salaries_revised)
#num [1:5] 33750 44000 138188 45567 44000

You don't have to copy the elements manually to get a numeric vector that boxplot accepts. Just extract the column:

salaries_revised <- data02[,"V1"]

Or

salaries_revised <- data02$V1

str(salaries_revised)
# num [1:5] 33750 44000 138188 45567 44000
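
As a self-contained sketch of the same idea, the snippet below rebuilds the five-row table from inline text rather than the original file (the inline text is an assumption so the example runs anywhere), extracts the column, and checks that it is numeric before handing it to boxplot.

```r
# Rebuild the example table from inline text; the real data would come from
# read.table("~/R/introstats/NYRedBullsSalaries.txt", header = FALSE).
data02 <- read.table(text = "33750.00
44000.00
138188.00
45566.67
44000.00", header = FALSE)

# Extract the single column as a plain numeric vector.
salaries_revised <- data02$V1

is.numeric(salaries_revised)   # TRUE
length(salaries_revised)       # 5

# boxplot() returns its summary statistics; plot = FALSE skips the drawing.
stats <- boxplot(salaries_revised, plot = FALSE)
stats$stats[3, 1]              # median of the five salaries: 44000
```

The same two lines (read the file, take the column) should work unchanged for 5,000 rows, since nothing depends on the number of values.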

Regarding the paste code you used, it creates a single character string, not a numeric vector:

salaries <- paste(as.character(data02), sep = " ", collapse =",")
str(salaries)
# chr "c(33750, 44000, 138188, 45566.67, 44000)"

One way to get the result you wanted is to use eval(parse(text = ...)), which evaluates the string as R code:

boxplot(eval(parse(text=salaries)))

You don't even need paste to get the above string:

 as.character(data02)
 #[1] "c(33750, 44000, 138188, 45566.67, 44000)"

 boxplot(eval(parse(text=as.character(data02))))
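
If the numbers really do exist only as text, a different technique from the eval(parse(...)) approach shown here is to read them with scan(), which avoids evaluating a string as code. This is only a sketch; the variable names and the comma-separated input string are hypothetical.

```r
# A comma-separated string of numbers (hypothetical input for illustration).
salaries_text <- "33750,44000,138188,45566.67,44000"

# scan() parses the numbers straight from the text;
# quiet = TRUE suppresses the "Read 5 items" message.
salaries_num <- scan(text = salaries_text, sep = ",", quiet = TRUE)

is.numeric(salaries_num)   # TRUE
length(salaries_num)       # 5
```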

Also, you were passing the entire data.frame to paste. Suppose your dataset has multiple columns:

data03 <- data02
data03$V2 <- 1:5
as.character(data03)
#[1] "c(33750, 44000, 138188, 45566.67, 44000)"
#[2] "1:5"  

Calling eval(parse(...)) directly on the above evaluates both expressions but returns only the value of the last one:

 eval(parse(text=as.character(data03)))
 #[1] 1 2 3 4 5

Using paste collapses both elements into one string:

salaries <- paste(as.character(data03), sep = " ", collapse =",")
salaries 
#[1] "c(33750, 44000, 138188, 45566.67, 44000),1:5"

Parsing that string ends in an error:

boxplot(eval(parse(text=salaries)))
#Error in parse(text = salaries) : <text>:1:41: unexpected ','

If you need only the V1 column:

 salaries <- paste(as.character(data03[,"V1", drop=FALSE]), 
                                       sep = " ", collapse =",")

By default, subsetting a single column from a data.frame with [ drops the result to a vector. Specifying drop=FALSE keeps it as a one-column data.frame, which is why as.character() above still returns the "c(...)" string.
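
A small sketch of the difference, using the same example values as data03 (built here with data.frame() so the snippet is self-contained; as_vector and as_frame are illustrative names):

```r
# Same example values as data03 in the answer.
data03 <- data.frame(V1 = c(33750, 44000, 138188, 45566.67, 44000), V2 = 1:5)

as_vector <- data03[, "V1"]                 # default drop = TRUE: numeric vector
as_frame  <- data03[, "V1", drop = FALSE]   # stays a one-column data.frame

is.data.frame(as_vector)   # FALSE
is.data.frame(as_frame)    # TRUE
dim(as_frame)              # 5 1

# as.character() on the vector gives one string per value,
# while on the data.frame it gives one string per column.
length(as.character(as_vector))   # 5
length(as.character(as_frame))    # 1
```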

Or

 salaries <- paste0("c(",paste(as.character(data03[,"V1"]),
                                        sep=" ", collapse=","), ")") 

 salaries
 #[1] "c(33750,44000,138188,45566.67,44000)"

data

 data02 <- structure(list(V1 = c(33750, 44000, 138188, 45566.67, 44000)), 
 .Names = "V1", class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
akrun
  • Nice and thorough explanation! One little thing you might add is a short note of the drop = False/True argument in [. (+1) – talat Nov 07 '14 at 06:20
  • Thank you so much, David! The most wonderful thing is the code [boxplot(eval(parse(text=as.character(data02))))] Then I don't need to copy the chunk of numbers in plotting graphs or other calculations. – ronzenith Nov 07 '14 at 09:16