I would like to get all entries where there is a decreasing trend

Question

The data is in the format indicated below. The desired output should be a data frame with the column_ids c,f,h,l,m. I have tried using

t(apply(x, 1, diff) >= 0)

but I keep getting the error:

"Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] : non-numeric argument to binary operator"

Please don't post your data as images; post it as reproducible code instead. See some options for doing so [at this post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) - jpsmith, Feb 23 '23 at 13:07
Also, could you explain your question a bit better? You say the final data should have column_ids c,f,h,l,m, but it isn't clear why those rows qualify. Could you also post an example of your desired output and explain exactly how it is determined? - jpsmith, Feb 23 '23 at 13:11

Andre Wildberg · Answer 1 · 2023-02-23T13:25:04.467

If this is your data

set.seed(42)
mat <- t(sapply(letters[1:12], function(x) rbind(sample(10, 3))))

mat
  [,1] [,2] [,3]
a    1    5   10
b    9    4    2
c   10    1    8
d    8    7    4
e    9    5    6
f    4    2    7
g    3    9    1
h    4    5    7
i    5    4    2
j    8    3    2
k    1    8    6
l    6    8    4

Get the decreasing rows

t(t(mat)[, colSums(diff(t(mat)) <= 0) == 2])
  [,1] [,2] [,3]
b    9    4    2
d    8    7    4
i    5    4    2
j    8    3    2

As a data frame

data.frame(t(t(mat)[, colSums(diff(t(mat)) <= 0) == 2]))
  X1 X2 X3
b  9  4  2
d  8  7  4
i  5  4  2
j  8  3  2
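
The `== 2` above hard-codes the number of steps in a three-column matrix. Here is a sketch of a variant that does not depend on the column count, assuming the same `mat` as above; `is_decreasing` is an illustrative name and not part of the original answer.

```r
set.seed(42)
mat <- t(sapply(letters[1:12], function(x) rbind(sample(10, 3))))

# TRUE for rows in which no step increases
# (ties count as "decreasing", matching the <= 0 test above)
is_decreasing <- apply(mat, 1, function(r) all(diff(r) <= 0))

# drop = FALSE keeps a matrix even if only one row matches
data.frame(mat[is_decreasing, , drop = FALSE])
```

Working row by row with `all()` avoids the double transpose, and `drop = FALSE` guards against the result collapsing to a plain vector when a single row qualifies.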

score 0 · Answer 2 · answered Feb 23 '23 at 13:24

The specific error you are getting occurs because your first column is not numeric, yet you are trying to diff across the whole row. R does not know how to calculate the difference between "a" and 1, for example. We need to pass x[-1] instead of x to apply in this case, since we don't want to include the id column in our diff calculations.
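
A minimal sketch of that fix, reusing the transcribed data from the end of this answer so the block runs on its own. The name `d` is only illustrative.

```r
# Data transcribed from the question's image (same as under "Data used")
x <- data.frame(id = letters[1:13],
                `2` = c(1,1,3,1,1,2,3,9,1,1,1,2,2),
                `3` = c(3,2,2,1,1,1,3,4,2,1,1,1,2),
                `4` = c(13,0,2,1,1,0,4,0,2,2,1,0,1), check.names = FALSE)

# Drop the non-numeric id column before differencing each row;
# apply() returns one column per input row, so transpose back
d <- t(apply(x[-1], 1, diff))
rownames(d) <- x$id

# Logical matrix: TRUE where a step does not decrease
d >= 0
```

With the id column removed, `apply` coerces the remaining columns to a numeric matrix and `diff` no longer fails.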

From your description of the desired output, your definition of "downward trend" is a little odd. If you did a linear regression across each row, then you would have a downward trend in rows with the ids b, c, f, h, l and m:

x$id[apply(x[-1], 1, \(x) lm(x ~ seq_along(x))$coef[2] < 0)]
#> [1] "b" "c" "f" "h" "l" "m"

This, to my mind, is the best definition of "downward trend", but it doesn't quite match what you seem to desire. Instead, your desired output can be obtained by finding the rows where there are no increases across the row and at least one decrease:

x$id[apply(x[-1], 1, \(x) !any(diff(x) > 0) & any(diff(x) < 0))]
#> [1] "c" "f" "h" "l" "m"
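
Since the question asks for a data frame rather than a vector of ids, the same logical index can subset the rows of `x` directly. This is a sketch, assuming the transcribed data from the "Data used" section; `keep` is an illustrative name.

```r
x <- data.frame(id = letters[1:13],
                `2` = c(1,1,3,1,1,2,3,9,1,1,1,2,2),
                `3` = c(3,2,2,1,1,1,3,4,2,1,1,1,2),
                `4` = c(13,0,2,1,1,0,4,0,2,2,1,0,1), check.names = FALSE)

# No increase anywhere in the row, and at least one decrease
keep <- apply(x[-1], 1, function(r) !any(diff(r) > 0) && any(diff(r) < 0))

# Row subset keeps the id column alongside the values
x[keep, ]
```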

Data used

Posting an image of your data instead of including the data as text means that anyone who wants to help you has to transcribe the image by hand.

x <- data.frame(id = letters[1:13], 
                `2` = c(1,1,3,1,1,2,3,9,1,1,1,2,2),
                `3` = c(3,2,2,1,1,1,3,4,2,1,1,1,2),
                `4` = c(13,0,2,1,1,0,4,0,2,2,1,0,1), check.names = FALSE)
