-1

I have a dataframe of two columns TSH values and dates. I want to give a rank number for each day (1 starting from the oldest date) as in the picture.

enter image description here

Ethan
  • 876
  • 8
  • 18
  • 34
Mo Kh
  • 11
  • 2
  • Please post data using `dput` and avoid using screen captures. Please see [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) about making a good reproducible example. – Ben Mar 12 '22 at 23:15

1 Answers1

0

It is preferable to copy data using dput and not use images/screen captures.

Using your example data.frame, we'll call df, you can do the following.

Your first step can be to change your DATE to a date object. The code you provided in your comment can work, but if you use pipes such as %>% below, you won't need to make intermediate variables or objects. Instead, it will feed the result of one line of code into the next line. See this for more information.

Next, using dplyr, you can sort by Date (oldest first) and then use dense_rank.

To write your result to an Excel file, you can use write.xlsx from the xlsx package. There are other options available for this as well.

Here is a complete example:

library(dplyr)
library(xlsx)

df %>%
  mutate(DATE = as.Date(DATE, format = "%d.%m.%y")) %>%
  arrange(DATE) %>%
  mutate(ID = dense_rank(DATE)) %>%
  write.xlsx("filename.xlsx")

An alternative would be cur_group_id after grouping by Date:

df %>%
  mutate(DATE = as.Date(DATE, format = "%d.%m.%y")) %>%
  arrange(DATE) %>%
  group_by(DATE) %>%
  mutate(ID = cur_group_id()) %>%
  ungroup %>%
  write.xlsx("filename.xlsx")
  

Output

  TSH       DATE ID
1 1.3 2015-01-05  1
2 0.9 2015-01-05  1
3 1.4 2015-01-05  1
4 1.6 2015-01-05  1
5 3.1 2015-02-24  2
6 1.6 2015-02-24  2
Ben
  • 28,684
  • 5
  • 23
  • 45
  • Thanks Ben ,it works after i did this acrobatic codes ( i'm novice in R) : temp1=as.Date(data$DATE, format='%d.%m.%y') temp1 A=as.data.frame(temp1) B=as.data.frame(data$TSH) library(dplyr) df=bind_cols(B,A) df %>% arrange(temp1) %>% group_by(temp1) %>% mutate(ID = cur_group_id()) ' dataframe is too long to past it by dput() . However, how do i extract the table to excel? Thanks again – Mo Kh Mar 13 '22 at 08:36
  • @MoKh You don’t need to share all your data. Just try `dput(head(data))` to share first 6 rows. Copy and paste the result into your post above (edit your original question). Then, I can start with that, and will add how to export to excel too… – Ben Mar 13 '22 at 11:53
  • Thanks Ben, this is my dataframe : structure(list(TSH = c(1.3, 0.9, 1.4, 1.6, 3.1, 1.6), DATE = c("05.01.15", "05.01.15", "05.01.15", "05.01.15", "24.02.15", "24.02.15")), row.names = c(NA, 6L), class = "data.frame"). How can i extract the table to excel or CSV file? thank you – Mo Kh Mar 13 '22 at 16:53
  • @MoKh See edited answer, that uses your data, and writes to an Excel file. Good luck! – Ben Mar 13 '22 at 17:06
  • God bless you Dear Ben. – Mo Kh Mar 13 '22 at 17:11
  • Hi again , it seems that day data are few , is it possible to rank by months?. Thanks again – Mo Kh Mar 13 '22 at 18:01
  • You can use the `lubridate` library, and use `floor_date` which will reference the "first" of the month for any day in a given month. Your statement would look something like: `mutate(DATE = floor_date(as.Date(DATE, format = "%d.%m.%y"), unit = "months")) %>%` to use the first day of the month. If you need the original date, you can create a new column with the first of the date too, so you have both. – Ben Mar 13 '22 at 18:06