I have some text data that looks something like this:

BUYER: [TEXT]

SELLER: [TEXT]

BUYER: [TEXT]

When exported to a CSV, everything is condensed into a single paragraph and looks more like this:

BUYER: [TEXT]. SELLER: [TEXT]. BUYER: [TEXT].

Is there a way to track only the BUYER responses and disregard what the SELLER says? I've been using the tidytext library and trying regex commands, but nothing seems to be taking me in the right direction.

yungFanta

1 Answer

Test data
Since no sample data was provided, the following is test data. Let's say that the output from your code is

BUYER: [TEXT]. SELLER: [TEXT]. BUYER: [TEXT]

Potential solution
The code below is a modified version of the solution available here.

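library(stringr)
s <- "BUYER: [TEXT]. SELLER: [TEXT]. BUYER: [TEXT]"
# (?=BUYER:)      start each match where a "BUYER:" label begins
# .*?             lazily take as little text as possible
# ((?=SELLER)|$)  stop just before the next "SELLER", or at the end of the string
buyerStrings <- str_extract_all(s, pattern = "(?=BUYER:).*?((?=SELLER)|$)")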

Extracted data:

print(buyerStrings)

[[1]]
[1] "BUYER: [TEXT]. " "BUYER: [TEXT]"
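
If only the response text is wanted, the matches above still carry the "BUYER:" label and a trailing space. A minimal sketch of one way to strip both, assuming the same test string (the names buyer_raw and buyer_text are just illustrative):

```r
library(stringr)

s <- "BUYER: [TEXT]. SELLER: [TEXT]. BUYER: [TEXT]"

# Same pattern as above; [[1]] picks the matches for the single input string
buyer_raw <- str_extract_all(s, "(?=BUYER:).*?((?=SELLER)|$)")[[1]]

# Drop the leading "BUYER:" label, then trim surrounding whitespace
buyer_text <- str_trim(str_remove(buyer_raw, "^BUYER:\\s*"))

print(buyer_text)
```

For the test string this should leave the two buyer segments without their labels.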

Note: The extracted strings can then be cleaned up and reshaped as needed before exporting.
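
As a rough sketch of that last step, the same pattern can be applied to a whole column of conversations and flattened to one row per BUYER turn. The data frame, its column names, and the output file name below are hypothetical stand-ins, since the real data was not posted:

```r
library(stringr)

# Hypothetical input: one exported conversation per row
conversations <- data.frame(
  id = c(1, 2),
  text = c(
    "BUYER: Hello. SELLER: Hi there. BUYER: How much is it?",
    "SELLER: Can I help? BUYER: Just looking."
  ),
  stringsAsFactors = FALSE
)

pattern <- "(?=BUYER:).*?((?=SELLER)|$)"

# List with one character vector of BUYER segments per conversation
matches <- str_extract_all(conversations$text, pattern)

# One row per BUYER turn; repeat each id once per match in that conversation
buyer_turns <- data.frame(
  id = rep(conversations$id, lengths(matches)),
  response = str_trim(str_remove(unlist(matches), "^BUYER:\\s*")),
  stringsAsFactors = FALSE
)

print(buyer_turns)

# Uncomment to export (file name is an assumption):
# write.csv(buyer_turns, "buyer_responses.csv", row.names = FALSE)
```

One caveat with this approach: the lookahead stops at any occurrence of "SELLER", so a buyer message that happens to contain that word in capitals would be cut short. Tightening the lookahead to "SELLER:" may help if that matters for your data.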