I am currently trying to change specific values in specific rows of a data frame with dimensions (401, 2).

The data frame looks like this:

                   logFC        pval
cg00035864  2.931898e-02 0.519802679
cg00061679 -9.465129e-05 0.519802679
cg00063477 -1.360574e-01 0.244373340
cg00121626  7.946710e-03 0.611252125
cg00212031 -6.052011e-02 0.774827599
cg00213748 -9.248549e-02 0.851445095
cg00214611  8.384351e-02 0.519802679
cg00223952  2.184674e-03 0.998934883
cg00243321  9.606841e-02 0.519802679
cg00271873  1.781436e-01 0.605388199
cg00272582  1.186292e-01 0.191905652
cg00308367  1.496136e-02 0.791579139
cg00311963  1.260400e-01 0.519802679
cg00335297  1.819981e-01 0.405942400
cg00455876  1.107911e-01 0.519802679
cg00576139 -9.465129e-05 0.519802679
cg00599377  9.778042e-02 0.519802679
cg00639218  1.005280e-01 0.719199850
cg00676506  2.603663e-02 0.706729687
cg00679624 -3.499232e-02 0.735048055
cg00762184  3.561985e-02 0.039468075
cg00789540  1.296961e-01 0.519802679
cg00876332 -1.240570e-01 0.991495608
cg00975375  1.242095e-01 0.519802679
cg01053349  6.237889e-02 0.938655973
cg01061520  3.988364e-02 0.529964491
cg01073572 -9.700589e-02 0.000829731
cg01086462 -5.650370e-02 0.519802679
cg01141334  1.130912e-01 0.883360324
cg01209756  9.301333e-02 0.519802679

I would like to change the logFC values in the rows that do not pass a 5% FDR threshold (the pval column, which is already adjusted).

I was doing this in a very rough way: checking which rows were not significant and then setting their logFC to 0, as follows:

data[data$pval >= 0.05,]

  • Here I look up which rows I want to change. Say, for example, that these are rows 2, 3, 5, 8, 10 and 11 of the original data frame. Then I proceed like this:

data$logFC[c(2,3,5,8,10,11)] <- 0

The problem is that I used to do this on a data frame with dimensions (15, 2). Now, as said before, the data frame is much larger (401, 2), so I can't do it "manually".

Does somebody know an effective way to do this?

Thank you very much,

Aina

Aina Jené
  • The proposed solution does not work if the new value is calculated from another one, for example `df$logFC[df$pval >= 0.05] <- df$pval + 3` – meolic May 07 '21 at 10:50
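The mismatch described in this comment arises because the right-hand side is the full-length column while the left-hand side selects only some rows; R then warns that the number of items to replace is not a multiple of the replacement length. A minimal sketch of one way around it, assuming a toy data frame named df with made-up values, is to apply the same logical index to both sides:

```r
# Hypothetical toy data, not the asker's real data frame
df <- data.frame(
  logFC = c(0.03, -0.10, 0.04),
  pval  = c(0.52, 0.0008, 0.04)
)

# Compute the condition once so both sides refer to the same rows
idx <- df$pval >= 0.05

# Subset the right-hand side too, so its length matches the selection
df$logFC[idx] <- df$pval[idx] + 3

print(df)
```

Only the first row satisfies the condition here, so only its logFC is replaced; the other rows keep their original values.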

2 Answers

9

A piece of code like this should work:

df$logFC[df$pval >= 0.05] <- 0

Where df is your dataframe.
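To see what the one-liner does, here is a small self-contained sketch. It assumes a four-row excerpt of the data from the question; the helper name not_significant is introduced only for illustration.

```r
# Four rows copied from the question's data frame
df <- data.frame(
  logFC = c(2.931898e-02, 3.561985e-02, -9.700589e-02, -1.360574e-01),
  pval  = c(0.519802679, 0.039468075, 0.000829731, 0.244373340),
  row.names = c("cg00035864", "cg00762184", "cg01073572", "cg00063477")
)

# Logical vector: TRUE for rows that do not pass the 5% threshold
not_significant <- df$pval >= 0.05

# Assign 0 to logFC only where the condition is TRUE
df$logFC[not_significant] <- 0

print(df)
```

The comparison produces one TRUE/FALSE per row, so the approach scales to any number of rows without listing row indices by hand.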

4rj4n
  • 1
    I see I was 46s too slow... – 4rj4n Jul 11 '17 at 08:58
  • 6
    btw, isn't this question a duplicate of for instance this: https://stackoverflow.com/questions/13871614/replacing-values-from-a-column-using-a-condition-in-r (I have too low rep to comment, except on my own) – 4rj4n Jul 11 '17 at 09:01
3

The normal way to do this would be

df$logFC[df$pval >= 0.05] <- 0 

This is certainly fast enough for your 401 x 2 data frame.

This is very basic R programming. If you are planning on using R more often, I recommend working through a tutorial or something similar.

On a side note, avoid using data as the name of your data frame: data is already the name of a built-in R function (utils::data, used for loading data sets), so reusing it can be confusing. I recommend using df or something similar.

KenHBS