
I am looking at a dataset of companies and their prices for several weeks. A blank/empty value means the house is booked, so no price is available.

I have this code, which works, but I want to handle all companies and weeks at once if possible, and then make the result part of the data.

sum(D1$Company=='dc' & D1$`Price week 24`== " ") / sum(D1$Company=='dc' & D1$`Price week 24`!="-10")

Here I count the houses of one company that are booked (no price, hence blank/empty) and divide by the company's total, excluding values of -10.
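A minimal sketch of how the one-liner above could be extended to every company and every week at once, using base R. The toy data, the helper name `occupancy`, and the "Occupancy week NN" column names are assumptions made for illustration; only the blank (" ") and "-10" conventions come from the question.

```r
# Hypothetical toy data: " " = booked (no price), "-10" = excluded from the total
D1 <- data.frame(
  Company = c("dc", "dc", "dc", "nv", "nv"),
  `Price week 24` = c(" ", "1639", "-10", " ", " "),
  `Price week 25` = c("860", " ", "399", "645", "-10"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# All weekly price columns, found by name
price_cols <- grep("^Price week", names(D1), value = TRUE)

# Same ratio as the one-liner: booked cells / cells that are not "-10"
occupancy <- function(x) sum(x == " ") / sum(x != "-10")

# One row per company, one occupancy column per week
occ <- aggregate(D1[price_cols], by = list(Company = D1$Company), FUN = occupancy)
names(occ)[-1] <- sub("^Price", "Occupancy", price_cols)

# Copy each company's rates back onto its rows in the original data
D1[names(occ)[-1]] <- occ[match(D1$Company, occ$Company), -1, drop = FALSE]
```

Using `match()` rather than `merge()` keeps the original row order of `D1`. If blanks are stored as `NA` instead of " ", the helper would need `is.na(x)` in place of `x == " "`.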

My data could look like this (sorry for the poor presentation, but I cannot paste in a screenshot). I have several more weeks and several companies. I imagine a new column named "Occupancy week 24" that contains the value for the company in that row.

EDIT : the data

# dput(DF1)
structure(list(Company = 1:6, Price_week_24 = structure(c(1L, 
1L, 1L, 1L, 1L, 1L), .Label = "ns", class = "factor"), Price_week_25 = c(1639L, 
860L, NA, NA, 399L, 645L), Price_week_26 = c(NA, 860L, NA, NA, 
399L, NA), Price_week_27 = c(NA, 1010L, 1010L, 699L, 399L, 1010L
), Price_week_28 = c(NA, 1399L, NA, 1129L, 640L, 1399L)), class = "data.frame", row.names = c(NA, 
-6L))

df$occupancy_rate <- apply(df[,2:6], 1,function(x) sum(x>0, na.rm = TRUE)/length(x))

This solves many of my problems, but not all of them. I want a value for every single company, not a total for all of them.

I am looking forward to getting some help. Thank you.

Best Regards

  • Hi! You can share your data more easily using `dput(D1)` or `dput(head(D1))`, as mentioned here: [how-to-make-a-great-r-reproducible-example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Can you specify, or give an example of, what you want to achieve? – cbo Jun 21 '19 at 10:55
  • I want to add a new column for week 24, for example. I want that column to contain the occupancy rate for the company in the row at that position. The occupancy rate can be found with the calculation shown above. This means that the first row will look like this: c(ns, 1639, "", "", "", occupancy rate) (e.g. .333 in this case, since we have 2 empty and 6 rows in total). Does that make sense? – jack petersen Jun 21 '19 at 11:09
  • And remember that we have several companies, and I want each row of the column to contain the occupancy rate for that row's company. – jack petersen Jun 21 '19 at 11:15
  • OK, it would be even more effective to add an edit to your post (people sometimes don't read comments) marked __EDIT__. Concretely, for the first row it gives `sum(as.numeric ( D1[,2:7] )>0 ) / 6` – cbo Jun 21 '19 at 11:19
  • To add a column `foo` to a `data.frame`, just use `df$name <- foo`. Does this solve the issue? – David Jun 21 '19 at 11:48

1 Answer


Here is how the data was created in order to share the example. I have included one example solution using base R:

#Create a reprex
df <- read.table(text =
"1 ns     1639            ' '             ' '             ' '
2 ns      860             860             1010            1399
3 ns      ' '             ' '             1010            ' '
4 ns      ' '             ' '             699             1129
5 ns      399             399             399             640
6 ns      645             ' '             1010            1399")

names(df) <- c("rows", "Company", paste0("Price_week_", 24:27) )

# to share the data
dput(df)

# Using base R
df$occupancy_rate <- apply(df[,2:6], 1,function(x) sum(x>0, na.rm = TRUE)/length(x))
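
The `apply()` call above counts cells holding a price greater than 0. If "occupancy" is instead meant as the share of booked (missing) weeks per row, a possible sketch is shown below. It assumes the `DF1` from the question's `dput()` output, where booked weeks are `NA`, and it leaves out `Price_week_24`, which only holds "ns" there; both choices are assumptions, not something the question confirms.

```r
# Numeric week columns copied from the question's dput(DF1); NA = booked
DF1 <- data.frame(
  Company = 1:6,
  Price_week_25 = c(1639L, 860L, NA, NA, 399L, 645L),
  Price_week_26 = c(NA, 860L, NA, NA, 399L, NA),
  Price_week_27 = c(NA, 1010L, 1010L, 699L, 399L, 1010L),
  Price_week_28 = c(NA, 1399L, NA, 1129L, 640L, 1399L)
)

# Select the weekly price columns by name rather than by position
week_cols <- grep("^Price_week_", names(DF1), value = TRUE)

# is.na() gives a logical matrix; its row means are the share of booked weeks
DF1$occupancy_rate <- rowMeans(is.na(DF1[week_cols]))
```

Selecting columns by name avoids accidentally including the company column, which a positional range such as `2:6` can do when the column layout changes.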
NelsonGon
cbo
  • It is a big step in the right direction, but the problem is that there is more than one company in the first row. – jack petersen Jun 21 '19 at 12:03
  • You should then reuse my code to share such an example and edit your question so that others can understand what you want to achieve. – cbo Jun 21 '19 at 12:06