
Update: I have a dataset that is output in a staircase format. For example:

[Image: rows 80-88, columns 1-6]

I would like the data to move up from rows 80-88 to rows 1-8, with the data before the prompts in row 1 and the data after the prompts lined up with the correct prompt. This example would then look like this:

[Image: preferred outcome]

I am hoping there are functions I can run in RStudio to correct these issues, because I will have to make the same corrections for almost 200 participants. Any ideas?

I have not found a function where I can move a single column up or delete a single cell that would force the data upward. Any ideas would be greatly appreciated.

divibisan
Arnold.D
  • The 2 main things we ask for here so we can be sure that we understand your question are 1) a reproducible example, and 2) desired output. For 1, the best way is to use the `dput` function and paste its output into the question. For your example, just do `dput(MYDF[1:7,1:6])`. That will let us reproduce your problem and try to solve it. Then, just write up by hand what you want the results to look like, so we can be sure that we understand you. As you can see from the answer below, what you think of as a clear explanation might not be – divibisan Jul 21 '23 at 16:49
  • There's definitely a simple solution for this problem, but you've got to help us be sure we understand what you want. This question here is really helpful if you have any problems making your reproducible example: https://stackoverflow.com/q/5963269/8366499 – divibisan Jul 21 '23 at 16:50
  • Thank you. The more I look at this, the crazier the cleaning I need to do becomes... I am sorry, I am new at asking a question on a forum. I am going to update my question above. – Arnold.D Jul 21 '23 at 18:16
  • This is more helpful. A few more questions: 1) Do you want this reordering to happen separately per participant (or per participant/date combination)? 2) Will there ever be more than one (non-NA) value in each of the relevant columns (per participant)? 3) Do you want to keep all the rest of the rows that are now NA? 4) Will only data in the 3 "slider*" columns change? – divibisan Jul 21 '23 at 19:44
  • It would be helpful if you included the data in a usable format, not an image. Instead of taking a screenshot, just use `dput` on the part of your dataframe shown above and paste the result into the question. That way you're not asking people to manually type up all your data again just to answer the question – divibisan Jul 21 '23 at 19:52
  • Thank you @divibisan 1) Separately per participant is fine, so I can check for weird errors and then merge after. 2) Yes, for some reason there are a few columns that have more than one non-NA, including the "prompt" column in this dput. 3) Truly blank NA rows are not important; they were only created by the weird data output. 4) All the data needs to slide at different intervals. For example, column 1 has to go from rows 80-88 all the way up to rows 1-8. – Arnold.D Jul 24 '23 at 14:48

3 Answers


This first part of the code just builds a data.frame with values on the diagonal:

# Recreate the example data.frame
df <- data.frame(diag(c(3, 6, 7, 4, 3), 5, 5))
names(df) <- paste0("Task ", c(1:5))
        
df

# Output 
  Task 1 Task 2 Task 3 Task 4 Task 5
1      3      0      0      0      0
2      0      6      0      0      0
3      0      0      7      0      0
4      0      0      0      4      0
5      0      0      0      0      3

Now this function reads the diagonal values and puts them into the first column:

change_df <- function(df) {
  # Get diagonal values
  values <- diag(as.matrix(df))
  # Empty data.frame
  df[,] <- 0
  df[, 1] <- values
  df
}

change_df(df)

# Output 
  Task 1 Task 2 Task 3 Task 4 Task 5
1      3      0      0      0      0
2      6      0      0      0      0
3      7      0      0      0      0
4      4      0      0      0      0
5      3      0      0      0      0
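
The function above relies on the data frame being square with the values exactly on the diagonal. As a rough sketch for a staircase that is not square and uses NA for the empty cells (both are assumptions, and `stair` and `first_non_na` are made-up names), the first non-missing value of each column could be collected into a single row:

```r
# Hypothetical staircase: 3 columns, 5 rows, NA where nothing was recorded
stair <- data.frame(
  task_1 = c(3, NA, NA, NA, NA),
  task_2 = c(NA, NA, 6, NA, NA),
  task_3 = c(NA, NA, NA, NA, 7)
)

# Hypothetical helper: first non-NA value of a vector (NA if there is none)
first_non_na <- function(x) {
  x[!is.na(x)][1]
}

# Apply the helper to every column and keep the result as a one-row data frame
collapsed <- as.data.frame(lapply(stair, first_non_na))
collapsed
```

This only keeps one value per column, so it would not suit columns that hold several values.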
Enrique Pérez Herrero
  • This is very helpful but I have failed myself in adding an example that is too simple. I have updated the example above to hopefully better show my problem. I tried using the code you gave me and going from there but I see that I was sorely mistaken. – Arnold.D Jul 21 '23 at 15:47

Based on your explanation of the problem, I think this should do what you want:

First, a simplified version of your data (It would have made this much easier if you included this yourself):

df <- structure(list(row = 80:88, participant = c(1, 1, 1, 1, 1, 1, 
1, 1, 1), slider_header = c(9, NA, NA, NA, NA, NA, NA, NA, NA
), slider_2_header = c(NA, 2, NA, NA, NA, NA, NA, NA, NA), prompt = c("Please", 
"Do you", "Does this", "Have you", "How many", "Wjat is", "What is", 
"How man", "How many3"), slider_scale = c(NA, NA, NA, 2, NA, 
NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, -9L
))

Using tidyverse functions, we can do what you want by grouping the data by participant, removing the leading NA values from each of the relevant vectors so all the numbers are in the first row of the group, and then padding them so the vectors are all the same length for the data frame (based on this closely related question: Remove leading NAs to align data):

library(tidyverse)

df %>%
    group_by(participant) %>%
    mutate(across(starts_with('slider'), ~ `length<-`(zoo::na.trim(.x), length(.x))))

  participant slider_header slider_2_header prompt    slider_scale
        <dbl>         <dbl>           <dbl> <chr>            <dbl>
1           1             9               2 Please               2
2           1            NA              NA Do you              NA
3           1            NA              NA Does this           NA
4           1            NA              NA Have you            NA
5           1            NA              NA How many            NA
6           1            NA              NA Wjat is             NA
7           1            NA              NA What is             NA
8           1            NA              NA How man             NA
9           1            NA              NA How many3           NA

By grouping on "participant", this process happens separately for each participant. If you want to group on other variables (like date), just add them to the group_by call. Then we apply a function to each variable starting with "slider": this function is defined with ~ and uses na.trim from the zoo package to remove the leading and trailing NAs from each variable, and then assigns each trimmed vector its original length, so it is padded with trailing NAs as in your requested data.
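
Since the comments mention that some columns hold more than one non-NA value, here is a minimal base-R sketch of the same trim-and-pad idea that drops every NA, including gaps between values. The helper name `shift_up` and the sample vector are made up for illustration; whether interior gaps should really be closed depends on your data.

```r
# Hypothetical helper: drop every NA, then pad back to the original length
shift_up <- function(x) {
  kept <- x[!is.na(x)]
  length(kept) <- length(x)  # extending a vector pads it with NA
  kept
}

# Made-up column with a leading NA and a gap between the two values
x <- c(NA, 2, NA, 5, NA)
shift_up(x)
# [1]  2  5 NA NA NA
```

Under these assumptions it should be possible to pass the helper to across() in place of the ~ formula, for example mutate(across(starts_with('slider'), shift_up)).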

divibisan

You haven't made it easy to help you, since there is no reproducible example, but I can address this:

I have not found a function where I can move a single column up or delete a single cell that would force the data upward.

Use dplyr::lead to slide all or a portion of a column up by n places (dplyr::lag does the slide down).


library(tidyverse)
(example_1 <- tibble(a=1:6,b=1:6))

# Slide column b up so that 4, 5, 6 begin right after 2 instead of after 3; this leaves a hole (NA) where 6 was
example_1$b[3:6] <- dplyr::lead(example_1$b[3:6],n=1)

# look at the result
example_1
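
For readers without dplyr at hand, the same slide can be sketched in base R. `slide_up` is a made-up stand-in, not a dplyr function, and it assumes `n` is between 1 and the length of the vector:

```r
# Hypothetical base-R stand-in for the lead() call above:
# drop the first n values and pad the end with n NAs
slide_up <- function(x, n = 1) {
  stopifnot(n >= 1, n <= length(x))  # assumption: 1 <= n <= length(x)
  c(x[-seq_len(n)], rep(NA, n))
}

b <- 1:6
b[3:6] <- slide_up(b[3:6], n = 1)
b
# [1]  1  2  4  5  6 NA
```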
Nir Graham