
I have the following dataset:

df_temp$ID:
chr4       chr4       chr5       chr7       chr7       chr7 

df_temp$value_beta: 

0.01960784 0.01960784 0.01960784 0.00000000 0.00990099 0.01941748 

df_temp$value_model:
    
 0.7605  0.1261  4.3766  0.7605   0.1261   4.3766  ## which are repeated

I would like to keep only the rows with unique values, so that the result is:

df_temp$ID:
chr4       chr4       chr5 

df_temp$value_beta: 

0.01960784 0.01960784 0.01960784

df_temp$value_model:
    
 0.7605  0.1261  4.3766

To accomplish that, I started with the following code:

library(dplyr)  # provides full_join()

# One row per named element of reduced_beta
df_temp_beta <- data.frame("ID" = names(reduced_beta), "value_beta" = as.numeric(reduced_beta))

# One row per named coefficient of the fitted model f
df_temp_model <- data.frame("ID" = names(f$coefficients), "value_model" = as.numeric(f$coefficients))

df_temp <- full_join(df_temp_beta, df_temp_model, by = "ID")

unique(df_temp[, 2:3]) ## fails with a memory error saying the maximum capacity has been reached

Algorithmically speaking, this code seems completely inefficient, and most likely the error is caused not by the memory limit itself but by the algorithm I have used.
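One possible explanation, offered as an assumption rather than a diagnosis: if the same `ID` appears several times in both data frames, a full join pairs every matching row on the left with every matching row on the right, so the row count grows multiplicatively per `ID`. The sketch below uses made-up toy data (`beta_demo` and `model_demo` are hypothetical names) and base R's `merge(..., all = TRUE)` as a stand-in for `full_join`, so that it runs without extra packages.

```r
# Hypothetical toy data: IDs repeat (chr4 twice, chr7 three times).
beta_demo <- data.frame(
  ID = c("chr4", "chr4", "chr5", "chr7", "chr7", "chr7"),
  value_beta = c(0.01960784, 0.01960784, 0.01960784, 0, 0.00990099, 0.01941748)
)

# Assumption for illustration only: the model values carry the same repeated IDs.
model_demo <- data.frame(
  ID = c("chr4", "chr4", "chr5", "chr7", "chr7", "chr7"),
  value_model = c(0.7605, 0.1261, 4.3766, 0.7605, 0.1261, 4.3766)
)

# Full outer join on a non-unique key: every left row with a given ID
# is paired with every right row that has the same ID.
joined_demo <- merge(beta_demo, model_demo, by = "ID", all = TRUE)

# How often each ID occurs in one input (chr4: 2, chr5: 1, chr7: 3)
id_counts <- table(beta_demo$ID)

# Rows after the join: 2*2 + 1*1 + 3*3 = 14, versus 6 rows in each input
n_joined <- nrow(joined_demo)
print(id_counts)
print(n_joined)
```

With thousands of repeated IDs, the same effect could plausibly produce a joined table far larger than either input, which would make a later `unique()` call expensive.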

program
  • Hi program, welcome to Stack Overflow. I am having trouble understanding the structure of your data, the problem, and expected output. Sometimes it is also helpful to provide some background information about the problem. Regardless, it will be much easier to help if you provide a sample of your data with `dput(df_temp[1:20,])`. You can [edit] your question and paste the output. Please surround the output with three backticks (```) for better formatting. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/) for more info. – Ian Campbell Jul 06 '20 at 13:56
  • You could try `df_temp[!duplicated(df_temp$ID), ]`, but there is no guarantee it will work. If you share a sample dataset, that would enable us to help you better. – monte Jul 06 '20 at 14:00
  • Thank you all for your quick responses. Yes, it worked perfectly, @monte. I just made a small adjustment to the code, `df_temp[!duplicated(df_temp$value_model), ]`, and it removed all the rows with duplicated values in `df_temp$value_model`. Thank you once more. – program Jul 06 '20 at 14:17
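The approach from the comments can be sketched on the sample values from the question. This is a minimal illustration, not the asker's actual data: `df_demo` and `df_unique` are hypothetical names, and the rows are assumed to line up as listed in the question.

```r
# Toy data frame built from the values shown in the question.
df_demo <- data.frame(
  ID = c("chr4", "chr4", "chr5", "chr7", "chr7", "chr7"),
  value_beta = c(0.01960784, 0.01960784, 0.01960784, 0, 0.00990099, 0.01941748),
  value_model = c(0.7605, 0.1261, 4.3766, 0.7605, 0.1261, 4.3766)
)

# duplicated() flags every repeat of a value seen earlier in the vector;
# negating it keeps the first row for each distinct value_model.
df_unique <- df_demo[!duplicated(df_demo$value_model), ]
print(df_unique)

# For comparison: deduplicating on ID keeps one row per chromosome instead.
df_one_per_id <- df_demo[!duplicated(df_demo$ID), ]
print(df_one_per_id)
```

Note that `duplicated()` on a numeric column compares values exactly, so two coefficients that differ only by floating-point noise would both be kept.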
