I have a string that looks like this:
t2 <- "============================================
                        Model 1    Model 2
--------------------------------------------
education               3.66 ***    2.80 ***
                       (0.65)      (0.59)
income                  1.04 ***    0.85 ***
                       (0.26)      (0.23)
type: blue collar      -5.91      -27.55 ***
                       (3.94)      (5.41)
type: white collar     -8.82 **   -24.12 ***
                       (2.79)      (5.35)
income x blue collar                3.01 ***
                                   (0.58)
income x white collar               1.91 *
                                   (0.81)
prop. female            0.01        0.08 *
                       (0.03)      (0.03)
--------------------------------------------
R^2                     0.83        0.87
Adj. R^2                0.83        0.86
Num. obs.              98          98
============================================
*** p < 0.001, ** p < 0.01, * p < 0.05"
I'm trying to extract the left-hand column so that I get a vector that looks like this:
education
income
type: blue collar
type: white collar
income x blue collar
income x white collar
prop. female
I'm new to regex and stringr, and I'm trying to extract the words that follow a line break:
library(stringr)
covariates <- str_extract_all(t2, "\n\\w+")
covariates
This gets me a bit closer:
[1] "\neducation" "\nincome" "\ntype" "\ntype" "\nincome" "\nincome" "\nprop" "\nR"
[9] "\nAdj" "\nNum"
but I can't work out how to capture the entire column of text, e.g. getting the full "type: blue collar" instead of "\ntype".
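
For context on why the attempt stops short: `\w` only matches letters, digits and underscores, so `\w+` ends at the first colon, space, period or caret. It also matches the summary rows (R^2, Adj. R^2, Num. obs.), which are not wanted. One possible direction, offered only as a rough sketch, is to split the string into lines, keep the rows between the two dashed rules, drop the standard-error rows, and then strip everything from the first number onward. The name `tbl` is a hypothetical, shortened stand-in for `t2`, and the sketch assumes that no covariate label contains a space followed by a digit.

```r
library(stringr)

# Hypothetical, abbreviated stand-in for t2 (same layout, fewer rows)
tbl <- "============================================
                        Model 1    Model 2
--------------------------------------------
education               3.66 ***    2.80 ***
                       (0.65)      (0.59)
type: blue collar      -5.91      -27.55 ***
                       (3.94)      (5.41)
income x blue collar                3.01 ***
                                   (0.58)
prop. female            0.01        0.08 *
                       (0.03)      (0.03)
--------------------------------------------
R^2                     0.83        0.87
Num. obs.              98          98
============================================
*** p < 0.001, ** p < 0.01, * p < 0.05"

# One element per line of the table
lines <- str_split(tbl, "\n")[[1]]

# The coefficient rows sit between the first and second dashed rule
rules <- which(str_detect(lines, "^-+\\s*$"))
body  <- lines[(rules[1] + 1):(rules[2] - 1)]

# Keep rows that start with a letter; standard-error rows start with
# whitespace or "(" and are dropped here
coef_rows <- body[str_detect(body, "^[A-Za-z]")]

# Remove everything from the first whitespace-preceded number to the end
# (assumption: labels never contain a space followed by a digit)
covariates <- str_trim(str_remove(coef_rows, "\\s+-?\\d.*$"))
covariates
# [1] "education"            "type: blue collar"    "income x blue collar"
# [4] "prop. female"
```

Working line by line instead of with one pattern over the whole string keeps each step easy to check, and it should behave the same whether the columns are padded with many spaces or separated by a single one.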