How to separate text from numbers and turn text into column names?
-
This discussion may be helpful for the splitting part: https://stackoverflow.com/questions/4350440/split-data-frame-string-column-into-multiple-columns – Thoughtful_monkey Nov 30 '22 at 07:50
-
What is your desired output? – diomedesdata Nov 30 '22 at 08:00
-
I just need numbers, not data names in each box. – TUSTLGC Nov 30 '22 at 08:23
-
You don't need to parse out the numbers. Instead of running `summary()` on your data frame, run it on each column separately - using `mtcars` as an example: `data.frame(sapply(mtcars, summary))` or transposed `data.frame(t(sapply(mtcars, summary)))`. – Ritchie Sacramento Nov 30 '22 at 09:15
-
@RitchieSacramento great point, this could very easily be OP's true goal (if indeed they have the raw data to begin with). – diomedesdata Nov 30 '22 at 10:21
-
My goal has been achieved. Thank you. – TUSTLGC Nov 30 '22 at 10:26
-
Try to ask questions that represent your actual goal, rather than intermediate questions that may answer it - it helps us help you! – diomedesdata Nov 30 '22 at 10:39
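The per-column approach suggested in the comments can be sketched as follows. This is only an illustration on the built-in `mtcars` data; the name `stats` is made up for the example, and it assumes you still have the raw numeric data rather than only the printed summary.

```r
# summary() on a numeric vector returns six named statistics,
# so sapply() over the columns gives a 6-row numeric matrix
stats <- data.frame(t(sapply(mtcars, summary)))

# One row per original column, one numeric column per statistic
head(stats)
```

Because the values never pass through the formatted `summary()` table, there is nothing to parse out afterwards.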
1 Answer
Based on your comment, I think the following gives you what you want.
Using data.table (but you could easily use base R):
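library(data.table)
library(stringr)  # provides str_replace()

setDT(df)  # convert df to a data.table by reference
# Label each row with the statistic it holds
df[, statistic := c("Min", "1st Q", "Median", "Mean", "3rd Q", "Max")]
setcolorder(df, "statistic")  # move statistic to the front
# Every column except statistic holds "label: value" strings
cols <- setdiff(names(df), "statistic")
# Drop everything up to and including the colon, then convert to numeric
df[, (cols) := lapply(.SD, function(x) str_replace(x, "^.*:\\s?", "") |> as.numeric()),
   .SDcols = cols]
df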
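For the base R route mentioned above, a minimal sketch might look like this. It runs on a small made-up data frame called `toy` (a hypothetical stand-in for your data, assuming every cell has the form "label: value") and uses `sub()` in place of `str_replace()`.

```r
# Hypothetical data in the assumed "label: value" shape
toy <- data.frame(
  x = c("Min. : 2", "1st Q: 3", "Median.: 4", "Mean:5", "3rd Q: 6", "Max: 7"),
  y = c("Min. : 3", "1st Q: 4", "Median.: 5", "Mean: 6", "3rd Q: 7", "Max: 8")
)

# Strip everything up to and including the colon, then convert to numeric
values <- lapply(toy, function(x) as.numeric(sub("^.*:\\s?", "", x)))

# Put the statistic labels in front of the cleaned columns
out <- data.frame(
  statistic = c("Min", "1st Q", "Median", "Mean", "3rd Q", "Max"),
  values
)
out
```

The regular expression is the same one used in the data.table version, so both should behave alike on the same input.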
If this is not what you want, please edit your post with the output of dput(head(df)). In general, you should not post images of data or code - it makes your question much harder to answer.
As a toy example on some similar looking data:
df <- data.table(x = c("Min. : 2", "1st Q: 3",
"Median.: 4", "Mean:5",
"3rd Q: 6", "Max: 7"),
y = c("Min. : 3", "1st Q: 4",
"Median.: 5", "Mean: 6",
"3rd Q: 7", "Max: 8"))
df
            x          y
1:   Min. : 2   Min. : 3
2:   1st Q: 3   1st Q: 4
3: Median.: 4 Median.: 5
4:     Mean:5    Mean: 6
5:   3rd Q: 6   3rd Q: 7
6:     Max: 7     Max: 8
The code above returns:
   statistic x y
1:       Min 2 3
2:     1st Q 3 4
3:    Median 4 5
4:      Mean 5 6
5:     3rd Q 6 7
6:       Max 7 8
You could make this more sophisticated by pulling out the exact text in the columns to make statistic, but this doesn't seem like a problem for your data.
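If you did want to build the statistic column from the text itself, one possible sketch in base R is shown below. The names `toy`, `labels` and `result` are made up for the illustration, and it assumes the labels in the first column apply to every column.

```r
# Hypothetical data in the same "label: value" shape as the toy example
toy <- data.frame(
  x = c("Min. : 2", "1st Q: 3", "Median.: 4", "Mean:5", "3rd Q: 6", "Max: 7"),
  y = c("Min. : 3", "1st Q: 4", "Median.: 5", "Mean: 6", "3rd Q: 7", "Max: 8")
)

# Label = text before the colon, minus any trailing dots or whitespace
labels <- sub("[.[:space:]]*:.*$", "", toy[[1]])

# Value = text after the colon, converted to numeric
values <- lapply(toy, function(x) as.numeric(sub("^.*:\\s?", "", x)))

result <- data.frame(statistic = labels, values)
result
```

This avoids hard-coding the six names, at the cost of trusting that the first column's labels are clean and in the same order as the rest.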

diomedesdata