I would like to check every value in my data frame for infinite values, because when I run a function I get this error:

Error in optim(apply(X, 2, median, na.rm = TRUE), fn = medfun, gr = dmedfun,  :
  non-finite value supplied by optim

In addition:

Warning message:
In betadisper(d, rep(1, ncol(as.matrix(d)))) :
  Missing observations due to 'd' removed.

I have a huge file and the error occurs only for some rows, but I have no idea why.

Thanks

Jaap
sepehr
  • You can test for infinite values using `is.infinite`. However, I don't think your problem is infinite values in your data. Without a reproducible example it's difficult to give better advice. – Roland Feb 14 '14 at 11:57
  • Note that I rolled back your edits. You can add to your question, but you shouldn't completely change it. – Roland Feb 15 '14 at 08:24
  • You should [provide a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for people to help with your problem. – nico Feb 15 '14 at 08:29
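
As a minimal sketch of the `is.infinite` suggestion above: the data frame below is made up for illustration (the real data was never posted), and the names `num`, `non_finite`, `bad_rows` and `inf_per_column` are hypothetical. It assumes the columns of interest are numeric and uses `is.finite` as well, since `NA` and `NaN` are also non-finite.

```r
# Hypothetical example data; column names and values are invented for illustration.
df <- data.frame(a = c(1, Inf, 3, 4),
                 b = c(5, 6, NA, 8),
                 c = c(9, 10, 11, -Inf))

# Keep only the numeric columns and convert them to a matrix.
num <- as.matrix(df[sapply(df, is.numeric)])

# TRUE for every cell that is Inf, -Inf, NA or NaN.
non_finite <- !is.finite(num)

# Row and column positions of the offending cells.
print(which(non_finite, arr.ind = TRUE))

# Rows that contain at least one non-finite value.
bad_rows <- which(rowSums(non_finite) > 0)

# Count of strictly infinite values per column (NA and NaN are not counted here).
inf_per_column <- colSums(is.infinite(num))
```

Inspecting `df[bad_rows, ]` then shows which rows might be worth a closer look, although, as the comment says, infinite values may not be the actual cause of the error.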

1 Answer

How about this with dplyr? As written it only works if your numbers are non-negative, because it relies on a column sum being zero only when every value is zero, but you can tweak it if your data is more complex:

df<-read.table(header=F,text="name1  2 2
name2  1 0
name2  0 2
name3  0 2
name3  0 1")

library(dplyr)  # for aggregation; %>% replaces the old %.% operator, which current dplyr no longer provides

df %>%
  group_by(V1) %>%                          # within each subset of V1 (col 1)
  summarise(prod = sum(V2) * sum(V3)) %>%   # prod is 0 if either column sums to 0
  filter(prod != 0) %>%                     # keep the groups where prod != 0
  select(V1) %>%                            # keep only the V1 column
  inner_join(df, by = "V1")                 # join back to the original df as a filter

     V1 V2 V3
1 name1  2  2
2 name2  1  0
3 name2  0  2
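
One possible tweak for data with negative values, sketched in base R. This assumes the goal is to keep every group in which both `V2` and `V3` contain at least one non-zero value; that goal is inferred from the code above rather than stated in the question. The sample data and the names `has_v2`, `has_v3`, `keep_groups` and `result` are hypothetical.

```r
# Hypothetical data: in group name2, V2 sums to 0 even though it has non-zero values.
df <- read.table(header = FALSE, text = "name1  2 2
name2  1 0
name2 -1 2
name3  0 2
name3  0 1")

# For each group, check whether the column contains any non-zero value.
has_v2 <- tapply(df$V2 != 0, df$V1, any)
has_v3 <- tapply(df$V3 != 0, df$V1, any)

# Groups where both columns have at least one non-zero value.
keep_groups <- names(has_v2)[has_v2 & has_v3]

# Filter the original rows down to those groups.
result <- df[df$V1 %in% keep_groups, ]
print(result)
```

Here the product-of-sums test would drop `name2`, because its `V2` values cancel out, whereas the `any()` test keeps it.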

Or this with data.table:

library(data.table)     # prerequisite
merge(df,               # merge the original data frame with the groups that survive the filter
data.table(df, key = "V1")[, list(prod = prod(colSums(.SD))), by = "V1"][prod != 0, ] # multiplies the sums of all columns except the "by" key, then filters; key must be passed by name
)[, 1:ncol(df)]         # this just chops the "prod" column off the end
Troy
  • @sepehr I think `data.table` is the better choice if you want it to be more flexible; see above. If there are columns you wish to exclude from the calculation, you can do so by subsetting in the call, e.g. `data.table(df[, 1:3], key = "V1")` – Troy Feb 15 '14 at 06:23