I'm trying to track down a larger problem, and I think it stems from what exactly happens when I import data from a .txt file. My usual opening commands are:
data <- read.table("mydata.txt", header = TRUE)
attach(data)
So if my data has, say, 3 columns with headers "Var1", "Var2" and "Var3", how exactly is everything imported? It seems as though it is imported as 3 separate vectors that are then bound together, similar to using cbind().
My larger issue is modifying the data. If a row in my data frame has a missing value in any column, I need to remove that row:
data <- data[complete.cases(data),]
That works. Now say the original data frame had 100 rows, 5 of which had a missing value. My new data frame should have 95 rows, right? But if I try:
> length(Var1)
[1] 100
> length(data$Var1)
[1] 95
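Here is a small self-contained version that reproduces the same mismatch. The data frame is made up for illustration (5 rows, one NA) and stands in for mydata.txt; the names n_attached and n_current are only there to hold the two lengths.

```r
# Made-up stand-in for mydata.txt: 5 rows, one missing value in Var1
data <- data.frame(Var1 = c(1, 2, NA, 4, 5),
                   Var2 = c("a", "b", "c", "d", "e"),
                   Var3 = c(TRUE, FALSE, TRUE, FALSE, TRUE))

attach(data)                           # Var1, Var2, Var3 become visible by name
data <- data[complete.cases(data), ]   # drop the row that contains NA

n_attached <- length(Var1)             # length of the column reached via attach()
n_current  <- length(data$Var1)        # length of the column in the filtered data frame
print(c(attached = n_attached, current = n_current))

detach(data)                           # clean up the search path
```

On this toy data the first length is 5 and the second is 4, which is the same pattern as the 100 versus 95 above.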
So it seems like the original column labelled Var1 is unaffected by the line where I rewrote the entire data frame. This is why I believe that when I import the data, I really just have 3 separate columns stored somewhere, called Var1, Var2 and Var3. To get R to recognize that I want the modified version of the column, I think I need to do something along the lines of:
Var1 <- data$Var1 #Repeat for every variable
My issue with this is that I would need to write that line for every single variable. The data frame I have is large, and coding it this way seems tedious. Is there a better way to transform my data and then refer to the modified variables without needing the data$ prefix every time?