How to move data up and get rid of NA?

Question

The data shows NA for some points but the information is right below it. It is the same UPC, Store, and Week. How do I group my data to avoid redundancy and the NA data?

This is my code so far:

`library(tidyverse)
RD <- read.csv("Raw Soft Drinks Sales Data.csv")
U  <- read.csv("UPC Soft Drinks.csv") %>%
  mutate(UPC   = as.factor(UPC),
         BRAND = as.factor(BRAND),
         CLASS = as.factor(CLASS))
RDX <- RD %>%
  filter(UPC != "Total") %>%
  select (-c(Total.Q1,Total.Q2,Total.Q3,Total.Q4))


RDXL <- RDX %>%
  pivot_longer(
    cols = starts_with("Week"),
#    cols = X1:X52,
#    cols = !c("STORE","UPC"),
    names_to = "WEEK", 
    names_prefix = "Week",
    values_to = "UNITS",
    values_drop_na = TRUE)

RDW <- pivot_wider(RDXL, names_from = "ITEM", values_from = "UNITS")%>%
  select(-TOTAL)

`

This is what the original data set looks like: original data

I need Store, UPC, Dollars, Units, Feat, Deal, and Week to be their own columns.

Please provide a [reproducible minimal example](https://stackoverflow.com/q/5963269/8107362). Especially, provide some sample and target data in a clear and ready-to-use format, e.g. with `dput()` and use the [reprex-package](https://reprex.tidyverse.org/). — mnist, Sep 19 '21 at 21:45
Looks like a fairly simple merge/join process: `merge(df[1:2] , df[c(1,3)])` as a start. `merge` is base R. Look up the help pages and examples here on the various `*_join` functions in the tidyverse. — IRTFM, Sep 19 '21 at 22:23
Note the instructions at the top of the [tag:r] tag page and in particular the one NOT to use images because no one else can easily use them without retyping it all. — G. Grothendieck, Sep 19 '21 at 22:53
In addition to G. Grothendieck's comment, here is why: https://meta.stackoverflow.com/a/285557/11374827 — teunbrand, Sep 22 '21 at 14:08

GuedesBF · Accepted Answer · 2021-09-19T23:19:49.973

I think you may want to get all your data in the wide format. If your current data has one value for every data column (DOLLARS:DEAL) for every week, and all other data columns are NAs, you may want to coalesce the data like this:

minimal example data

tibble(store=c(1,1,1), UPC=rep(1200000044, 3), week=c(1, 1, 1), DOLLARS=c(10.1, NA, NA), UNITS=c(NA, 30, NA), FEAT=c(NA, NA, 'unknown'))

# A tibble: 3 x 6
  store        UPC  week DOLLARS UNITS FEAT   
  <dbl>      <dbl> <dbl>   <dbl> <dbl> <chr>  
1     1 1200000044     1    10.1    NA NA     
2     1 1200000044     1    NA      30 NA     
3     1 1200000044     1    NA      NA unknown

A solution

You can group_by the ID columns (UPC, week), and then sort with na.last=TRUE :

library(dplyr)

 df %>% group_by(UPC, week) %>%
         mutate(across(DOLLARS:FEAT, sort, na.last = TRUE)) %>%
         filter(!if_all(DOLLARS:FEAT, is.na))

# A tibble: 1 x 6
# Groups:   UPC, week [1]
  store        UPC  week DOLLARS UNITS FEAT   
  <dbl>      <dbl> <dbl>   <dbl> <dbl> <chr>  
1     1 1200000044     1    10.1    30 unknown

Minimal reprex for the original data

df2<-tibble(store=c(1, 1), UPC=rep(1200000044, 2), ITEM=c('DOLLARS', 'UNITS'), Week.1=c(58.4, 29), Week.2=c(118.8, 55))

df2

# A tibble: 2 x 5
  store        UPC ITEM    Week.1 Week.2
  <dbl>      <dbl> <chr>    <dbl>  <dbl>
1     1 1200000044 DOLLARS   58.4   119.
2     1 1200000044 UNITS     29      55

solution with original data

Starting with the original data, you can sequentially pivot_longer %>% pivot_wider

df2 %>% pivot_longer(cols = starts_with('Week'), names_to = 'week', values_to = 'value')%>%
        pivot_wider(names_from = ITEM)

# A tibble: 2 x 5
  store        UPC week   DOLLARS UNITS
  <dbl>      <dbl> <chr>    <dbl> <dbl>
1     1 1200000044 Week.1    58.4    29
2     1 1200000044 Week.2   119.     55

score 0 · Answer 2 · answered Sep 19 '21 at 23:10

0

We could also do

library(dplyr)
df %>%
    group_by(UPC, week) %>%
    mutate(across(DOLLARS:FEAT, ~ .[order(is.na(.))])) %>%
    filter(if_all(DOLLARS:FEAT, Negate(is.na)))

answered Sep 19 '21 at 23:10

akrun

874,273
37
540
662

How to move data up and get rid of NA?

2 Answers2