
I have a data frame with a single column that mixes text and digits. I want to split each value, then combine the digits into one column and the text into another. Here is an example dataset:

Text
7 LIFTING & SHORING
-00 General
-10 Jacking
-20 Shoring
8 LEVELING & WEIGHING
-00 General
-10 Weighing and Balancing
-20 Leveling

I need to get the below results:

dig      Desc.
700      LIFTING & SHORING General
710      LIFTING & SHORING Jacking
720      LIFTING & SHORING Shoring
800      LEVELING & WEIGHING General
810      LEVELING & WEIGHING Weighing and Balancing 
820      LEVELING & WEIGHING Leveling

I tried this in R with a for loop, but I'm new to R and couldn't find a solution.

I appreciate your help.

Marjan

1 Answer


Using tidyverse:

library(tidyverse)

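df %>%
  # split at a leading dash: headings land in text1, sub-items in text2
  separate(Text, c("text1", "text2"), "^-", fill = "right") %>%
  # sub-item rows leave text1 empty; set it to NA so fill() carries the heading down
  mutate(text1 = ifelse(text1 == "", NA, text1)) %>%
  fill(text1) %>%
  # drop the heading rows themselves
  filter(!is.na(text2)) %>%
  # split the number from the description at the first non-alphanumeric run
  separate(text1, c("dig1", "Desc1"), convert = TRUE, extra = "merge") %>%
  separate(text2, c("dig2", "Desc2"), convert = TRUE, extra = "merge") %>%
  transmute(dig = dig1 * 100 + dig2,
            Desc. = paste(Desc1, Desc2))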

#   dig                                      Desc.
# 1 700                  LIFTING & SHORING General
# 2 710                  LIFTING & SHORING Jacking
# 3 720                  LIFTING & SHORING Shoring
# 4 800                LEVELING & WEIGHING General
# 5 810 LEVELING & WEIGHING Weighing and Balancing
# 6 820               LEVELING & WEIGHING Leveling
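The same steps can also be sketched in base R, for readers who do not have tidyverse installed. This is an alternative illustration, not the method used above; the helper names (`is_sub`, `heading`, `parent`, `item`) are made up for this sketch, and it assumes the first row is always a heading and that every row follows the `<number> <description>` or `-<number> <description>` pattern.

```r
# Example data from the question.
df <- data.frame(
  Text = c("7 LIFTING & SHORING", "-00 General", "-10 Jacking", "-20 Shoring",
           "8 LEVELING & WEIGHING", "-00 General",
           "-10 Weighing and Balancing", "-20 Leveling"),
  stringsAsFactors = FALSE
)

# Rows starting with "-" are sub-items; every other row is treated as a heading.
is_sub <- startsWith(df$Text, "-")

# cumsum() over the heading flags gives each row the index of the
# most recent heading at or above it (assumes row 1 is a heading).
heading <- df$Text[!is_sub][cumsum(!is_sub)]

parent <- heading[is_sub]   # heading text for each sub-item row
item   <- df$Text[is_sub]   # the sub-item rows themselves

result <- data.frame(
  # heading number * 100 + sub-item number, e.g. 7 and "10" give 710
  dig = as.integer(sub(" .*$", "", parent)) * 100L +
        as.integer(sub("^-([0-9]+) .*$", "\\1", item)),
  # heading description followed by sub-item description
  Desc. = paste(sub("^[^ ]+ ", "", parent), sub("^-[0-9]+ ", "", item)),
  stringsAsFactors = FALSE
)

print(result)
```

Rows that match neither pattern would produce `NA` or wrong values here, so real data may need cleaning first.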

data

df <- read.table(text="Text
'7 LIFTING & SHORING'
'-00 General'
'-10 Jacking'
'-20 Shoring'
'8 LEVELING & WEIGHING'
'-00 General'
'-10 Weighing and Balancing'
'-20 Leveling'", stringsAsFactors = FALSE, header = TRUE)
moodymudskipper
  • Moody, thank you! I sent you an email with more detailed information. – Marjan Jul 12 '18 at 15:10
  • You're welcome. You should find out which rows are problematic and post an additional example with expected output, as you did already. I believe some of your rows don't start with dashes but expect similar treatment, and my code expects a dash as the first character, but I need reproducible data (in text format, no picture) to do more. – moodymudskipper Jul 12 '18 at 19:15
  • This is the data link: http://itlims-zsis.meil.pw.edu.pl/pomoce/ESL/2016/ATA_Chapters.pdf and this is how I extract the part of the data I need:
      temp <- data.frame(Essai = pdf_text("ATA_Chapters.pdf") %>% strsplit(split = "\r\n") %>% unlist)
      x = nrow(temp)
      ata <- temp[90:x, ]
      ata <- as.character(ata)
      df <- data.frame(Text = ata)
    The output I want is exactly what you achieved. Thanks – Marjan Jul 13 '18 at 07:59
  • Please include the relevant info in your question. You'll find great advice on how to do so here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – moodymudskipper Jul 13 '18 at 08:06
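
The diagnostic step suggested in the comments (finding which rows are problematic) could be sketched as below. The sample vector is hypothetical: its last two rows are invented to show lines that match neither expected pattern, and the names `is_heading`, `is_sub`, and `problem_rows` are illustrative only.

```r
# Hypothetical sample; rows 4 and 5 are made up to show non-matching lines.
text <- c("7 LIFTING & SHORING", "-00 General", "-10 Jacking",
          " -20 Shoring",   # leading space before the dash
          "ATA Chapters")   # stray line such as a page header

is_heading <- grepl("^[0-9]+ ", text)    # e.g. "7 LIFTING & SHORING"
is_sub     <- grepl("^-[0-9]+ ", text)   # e.g. "-10 Jacking"

# Rows that fit neither pattern are the ones worth inspecting and posting.
problem_rows <- which(!is_heading & !is_sub)
print(text[problem_rows])
```

Note that a sub-item missing its dash (for example `20 Shoring`) would pass the heading test, so this check only catches rows that fit neither shape.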