The data set I'm using is filled with decimal (non-integer) numbers and NAs. A sample is shown below:
> head(df, n=5)
cheading1 cheading2 cheading1 cheading3 cheading1 cheading1
1 1.0925485 NA 0.714186 NA 0.008650 NA
2 1.0564646 NA 0.714286 NA 0.008651 NA
3 0.9816899 NA 0.714186 NA 0.008652 NA
4 0.9857995 NA 0.714186 NA 0.008651 NA
5 0.9760769 NA 0.714086 NA 0.011350 NA
> dim(df)
[1] 16500 199
Please do not assume that the columns in the sample continue in the same pattern. Further down, as the row number increases, column 1 becomes filled with NAs, and the other columns behave the same way. Every column contains both decimal numbers and NAs. There are also zeros throughout the data frame.
So, of course, when I try to take the natural log of the whole data set, R returns an error, which I assume is caused by the non-numeric "NA" values:
log(df, base=exp(1))
> Error in Math.data.frame(df, base = exp(1)) : non-numeric variable
> in data frame: cheading2
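The error names a specific column, so one way to narrow this down is to look at how R has stored each column. The following is only a minimal sketch on a made-up toy data frame (the name toy and all of its values are hypothetical, not taken from the real data); it lists the class of every column and the names of those that are not numeric.

```r
# Toy stand-in for df (hypothetical values; the real data has 199 columns)
toy <- data.frame(
  a = c(1.0925485, 1.0564646, NA),
  b = c(NA, NA, NA),              # built like this, an all-NA column is stored as logical
  c = c(0.714186, 0, 0.714186)
)

# Class of each column, as a named character vector
col_classes <- sapply(toy, class)
print(col_classes)

# Names of the columns that R does not treat as numeric
non_numeric <- names(toy)[!sapply(toy, is.numeric)]
print(non_numeric)
```

Running the same two sapply() calls on df should show whether cheading2 is stored as logical, character or factor rather than numeric.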
I tried the na.rm argument to tell R to exclude the NAs while taking the natural log of all numeric values, but it again returned an error:
log(df, base=exp(1), na.rm=T)
> Error in log(df, base = exp(1), na.rm = T) : unused argument (na.rm
> = TRUE)
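For context, log() works element by element on a numeric vector and has no na.rm argument. A small sketch with made-up values shows what it does with NA, zero and a negative number when applied to a plain vector; whether the real data contains negative values is an assumption made here only for illustration.

```r
# Hypothetical values covering the special cases
x <- c(1, NA, 0)

# The natural log is the default, so base = exp(1) is not needed
lx <- log(x)
print(lx)   # 0 NA -Inf

# A negative input gives NaN together with a "NaNs produced" warning
neg <- suppressWarnings(log(-1))
print(neg)  # NaN
```

So on a numeric vector the NAs are simply carried through to the result, zeros become -Inf, and negative values become NaN.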
So how does one take the natural log of this entire data frame (with column headers), ignoring all NAs, and end up with another table, e.g. lndf, which still has its headers and NAs?
I've also tried a for loop, but with the same outcome (too many NaNs produced).
I plan to use this data in a fixed-effects regression once this is solved. I'm happy to answer any questions that arise.
I also tried taking the log of each numeric column separately and then combining them. That still doesn't work:
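To make the desired result concrete, here is a rough sketch on a toy data frame of one possible approach: take the log of the numeric columns only and leave everything else untouched. The names toy, lntoy and is_num, and all the values, are hypothetical, and this is not necessarily the best way to handle the real data.

```r
# Toy stand-in for df (hypothetical values and column types)
toy <- data.frame(
  cheading1 = c(1.0925485, NA, 0.9816899),
  cheading2 = c("x", NA, "z"),            # assumed non-numeric column
  cheading3 = c(0.714186, 0, NA),
  stringsAsFactors = FALSE
)

# TRUE for each column that R stores as numeric
is_num <- sapply(toy, is.numeric)

# Copy the data frame, then overwrite only the numeric columns with their logs
lntoy <- toy
lntoy[is_num] <- lapply(toy[is_num], log)

print(lntoy)
```

The column headers and the NAs are kept, non-numeric columns are left as they were, and any zero turns into -Inf, which would need handling before a regression.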
lnoecd <- log(df$oecd, base=exp(1))
lng20 <- log(df$g20, base=exp(1))
lnoecdna <- log(df$oecdna, base=exp(1))
lnifscode <- log(df$ifscode, base=exp(1))
lnccode <- log(df$ccode, base=exp(1))
lnyear <- log(df$year, base=exp(1))
lnoxfx <- log(df$oxfx, base=exp(1))
lnncusd2011 <- log(df$ncusd2011, base=exp(1))
lnncppp2005 <- log(df$ncppp2005, base=exp(1))
...
...
lndf <- c(lnoecd, ...the lot
Whenever I take the log of any numeric column and then check the dimensions of the resulting column(s), it just returns NULL.
Note: I'm very new to programming and have started using R as a foot in the door. Apologies in advance for any gaps in basic knowledge. I hope the question is clear enough for anyone trying to help.
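A small sketch of what may be going on here, using a made-up two-column data frame (toy and its values are hypothetical): a single column pulled out with $ is a plain vector, which has a length but no dim attribute, and c() joins vectors end to end instead of placing them side by side.

```r
# Toy stand-in with two of the column names used above (hypothetical values)
toy <- data.frame(oecd = c(1, 2, NA), g20 = c(4, NA, 6))

lnoecd <- log(toy$oecd)
lng20  <- log(toy$g20)

# A plain vector has no dim attribute, so dim() gives NULL; length() works
print(dim(lnoecd))
print(length(lnoecd))

# c() concatenates into one long vector
flat <- c(lnoecd, lng20)
print(length(flat))

# data.frame() keeps the vectors as separate, named columns
lntoy <- data.frame(lnoecd, lng20)
print(dim(lntoy))
```

If that is the cause, the NULL from dim() does not mean the log failed, only that the result is a vector rather than a table.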