I was playing around with a data frame and I can't wrap my head around a problem. Here is the code I used:

Died.At <- c(22,40,72,41)
Writer.At <- c(16, 18, 36, 36)
First.Name <- c("John", "Edgar", "Walt", "Jane")
Second.Name <- c("Doe", "Poe", "Whitman", "Austen")
Sex <- c("MALE", "MALE", "MALE", "FEMALE")
Date.Of.Death <- c("2015-05-10", "1849-10-07", "1892-03-26","1817-07-18")
writersdataframe <- data.frame(Died.At, Writer.At, I(First.Name), I(Second.Name), Sex, as.Date(Date.Of.Death))

This is the result:

str(writersdataframe)
'data.frame':   4 obs. of  6 variables:
 $ Died.At               : num  22 40 72 41
 $ Writer.At             : num  16 18 36 36
 $ First.Name            : 'AsIs' chr  "John" "Edgar" "Walt" "Jane"
 $ Second.Name           : 'AsIs' chr  "Doe" "Poe" "Whitman" "Austen"
 $ Sex                   : Factor w/ 2 levels "FEMALE","MALE": 2 2 2 1
 $ as.Date.Date.Of.Death.: Date, format: "2015-05-10" "1849-10-07" "1892-03-26" ...

I wrote the code like this because I want R to interpret Date.Of.Death as a date, but I do not want as.Date to show up in the name of the column inside the data frame. I found a way to do it, which is to convert the variable before creating the data frame:

Date.Of.Death <- as.Date(Date.Of.Death)
writersdataframe <- data.frame(Died.At, Writer.At, I(First.Name), I(Second.Name), Sex, I(Date.Of.Death))

I checked with:

class(writersdataframe$Date.Of.Death)
[1] "AsIs" "Date"

What I was wondering is whether I can create the data frame while converting Date.Of.Death with as.Date directly inside the call to data.frame. Is there a reason that doing it this way:

writersdataframe <- data.frame(Died.At, Writer.At, I(First.Name), I(Second.Name), Sex, as.Date(Date.Of.Death))

renames the column, or did I make a mistake?

  • You can specifically name the column when creating the data frame. For example: writersdataframe <- data.frame(Died.At, Writer.At, I(First.Name), I(Second.Name), Sex, Date.of.death = as.Date(Date.Of.Death)) – LMunyan Jun 05 '18 at 17:47
  • Nice, thank you. In my previous attempts I used writersdataframe <- data.frame(Died.At, Writer.At, I(First.Name), I(Second.Name), Sex, Date.of.death <- as.Date(Date.Of.Death)) and for some reason got back Date.of.death....as.Date.Date.Of.Death as the column title. I see that if I use = as you did, instead of <- as I did, this doesn't happen. I'll have to look into why this is. – Alessandro Jun 06 '18 at 18:59
  • You will only want to use = within the data.frame function call. My solution involves naming the column you want to place the Date.Of.Death data into. If it is helpful, I can write it out in more detail as an answer. – LMunyan Jun 07 '18 at 12:52
  • I still don't really understand why <- doesn't work but = does, even after reading a bit more into it ( https://stackoverflow.com/questions/1741820/what-are-the-differences-between-and-in-r ) – Alessandro Jun 10 '18 at 12:38

1 Answer

Please see the below explanation for clarification.

There are several ways to solve the problem in your original question.

Solution 1: directly specify all column names. This is more explicit and makes your code more readable.

writersdataframe <- data.frame(Died.At = Died.At, Writer.At = Writer.At, First.Name = First.Name, Second.Name = Second.Name, Sex = Sex, Date.of.Death = as.Date(Date.Of.Death))

In this case you are explicitly naming each column with whatever is to the left of the '=' sign within the data.frame() function. To the right of the '=' sign you supply the values for that column. You can do so by entering raw data or by entering a variable that already exists in your environment. In this case it looks like you are trying to create the data frame using variables you have already set up.

Generally speaking, you will want to use the '=' sign when you are specifying arguments within a function call, in this case data.frame(), because inside a call '=' names the argument. You will use the assignment operator '<-' when you want to create a new variable, just as you did in the first code chunk of your question.
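To make the difference concrete, here is a small sketch with made-up data. The names x_values, x, y, df_equals and df_arrow are hypothetical and only for illustration; they are not from your question.

```r
# Hypothetical toy data, only for illustration.
x_values <- c(1, 2, 3)

# '=' inside the call names the argument, so the column is called x.
df_equals <- data.frame(x = x_values)
names(df_equals)

# '<-' inside the call is an ordinary assignment passed as an unnamed
# argument: the column name is built from the text of the expression,
# and a variable y is created in the calling environment as a side effect.
df_arrow <- data.frame(y <- x_values)
names(df_arrow)
exists("y")
```

Comparing the two names() results should show that only the '=' version gives you the column name you typed.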

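When you write as.Date(Date.Of.Death) as an unnamed argument to data.frame() in your first code chunk, data.frame() has no name to use for that column, so it builds one from the text of the expression itself and replaces characters that are not valid in a name, such as the parentheses, with dots. That is where as.Date.Date.Of.Death. comes from. The same thing happens with Date.of.death <- as.Date(Date.Of.Death): inside a call, '<-' does not name an argument. It is an ordinary assignment that is passed as an unnamed argument, so the column name is again built from the whole expression and, as a side effect, a variable is created in your environment. By specifying the column name with '=' within the data.frame() function, you are not creating a variable in the global environment. You are simply creating a new column in your data frame based on the existing Date.Of.Death variable.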
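As a rough sketch of where the automatic name comes from, the two base R functions deparse() and make.names() can reproduce it by hand. This only mimics the idea and is not meant as a copy of what data.frame() does internally; expr_text and auto_name are names I made up for the example.

```r
# Text of the unnamed argument, as R sees it.
# quote() keeps the expression unevaluated, so Date.Of.Death
# does not even need to exist for this to run.
expr_text <- deparse(quote(as.Date(Date.Of.Death)))
expr_text

# make.names() turns that text into a syntactically valid name by
# replacing invalid characters such as "(" and ")" with dots.
auto_name <- make.names(expr_text)
auto_name
```

The result should match the column name shown in the str() output in your question.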

Solution 2: convert your original Date.Of.Death variable to a date before creating the data frame. See below.

Date.Of.Death <- as.Date(c("2015-05-10", "1849-10-07", "1892-03-26","1817-07-18"))

writersdataframe <- data.frame(Died.At, Writer.At, I(First.Name), I(Second.Name), Sex, I(Date.Of.Death))
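As a side note, as far as I know the I() wrapper is not needed for a Date column: it only adds the "AsIs" class you saw in your class() check. A minimal sketch, reusing the dates from your question (dates, df_plain and df_asis are names I made up for the example):

```r
dates <- as.Date(c("2015-05-10", "1849-10-07", "1892-03-26", "1817-07-18"))

# Without I(): the column keeps the plain Date class.
df_plain <- data.frame(Date.Of.Death = dates)
class(df_plain$Date.Of.Death)

# With I(): the column additionally carries the "AsIs" class.
df_asis <- data.frame(Date.Of.Death = I(dates))
class(df_asis$Date.Of.Death)
```

Either version gives you a column named Date.Of.Death that R treats as a date.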

Hope this helps.

LMunyan