I have the following data frame named bbchealth:
head(bbchealth)
# A tibble: 6 x 1
Tweets
<chr>
1 Breast cancer risk test devised http://bbc.in/1CimpJF
2 GP workload harming care - BMA poll http://bbc.in/1ChTBRv
3 Short people's 'heart risk greater' http://bbc.in/1ChTANp
4 New approach against HIV 'promising' http://bbc.in/1E6jAjt
5 Coalition 'undermined NHS' - doctors http://bbc.in/1CnLwK7
6 Review of case against NHS manager http://bbc.in/1Ffj6ci
As you can see, each row, which contains a single tweet, has a URL at the end. I would like to remove only this URL while leaving the rest of the data frame unaffected.
If I try to use something like rm_url, I get the following:
[1] "c(\"Breast cancer risk test devised \"GP workload harming care - BMA poll \"Short people's 'heart risk greater' \"New approach against HIV 'promising' \"Coalition 'undermined NHS' - doctors \"Review of case against NHS manager \"\\\"VIDEO: 'All day is empty, what am I going to do?' \"VIDEO: 'Overhaul needed' for end-of-life care \"Care for dying 'needs overhaul' \"VIDEO: NHS: Labour and Tory key policies \"Have GP services got worse? \"A&E waiting hits new worst level \"Parties row over GP opening hours \"Why strenuous runs may not be so bad after all \"VIDEO: Health surcharge for non-EU patients \"VIDEO: Skin cancer spike 'from 60s holidays' \"\.........
That is, a single vector(?) holding one long string that contains all of the tweets with the URLs removed.
The code I used was rm_url(bbchealth, replacement = "").
If I use gsub("http.*","",bbchealth), I get the following output:
[1] "c(\"Breast cancer risk test devised "
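The truncated result above appears to come from the whole data frame being coerced to a single string before the pattern is applied. A minimal sketch of that behaviour, using a hypothetical two-row base R data.frame named df as a stand-in for bbchealth (the names df, flat, whole and per_row are illustrative only):

```r
# Hypothetical two-row stand-in for bbchealth (base R data.frame, not a tibble)
df <- data.frame(
  Tweets = c("Breast cancer risk test devised http://bbc.in/1CimpJF",
             "GP workload harming care - BMA poll http://bbc.in/1ChTBRv"),
  stringsAsFactors = FALSE
)

# gsub() converts non-character input with as.character(); for a
# one-column data frame that yields one deparsed string, not one per row
flat <- as.character(df)
length(flat)

# "http.*" is greedy, so everything from the first URL to the end of
# that single string is removed
whole <- gsub("http.*", "", df)

# Passing the column itself (a character vector) handles each tweet separately
per_row <- gsub("http.*", "", df$Tweets)
```

Here whole is a single truncated string, while per_row has one element per tweet (each still ending in the space that preceded the URL).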
However, this is not what I want. I want to retain the columnar structure, with one cleaned tweet per row. That is,
# A tibble: 6 x 1
Tweets
<chr>
1 Breast cancer risk test devised
2 GP workload harming care - BMA poll
3 Short people's 'heart risk greater'
4 New approach against HIV 'promising'
5 Coalition 'undermined NHS' - doctors
6 Review of case against NHS manager
How can I accomplish this?
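One possible approach, sketched here rather than offered as the definitive fix, is to apply the substitution to the Tweets column and assign the result back, instead of passing the whole data frame. The example assumes a hypothetical stand-in named tweets_df built with base R; the same column assignment should also work on a tibble.

```r
# Hypothetical stand-in for bbchealth; a base R data.frame is used so the
# sketch runs without extra packages
tweets_df <- data.frame(
  Tweets = c("Breast cancer risk test devised http://bbc.in/1CimpJF",
             "GP workload harming care - BMA poll http://bbc.in/1ChTBRv",
             "Short people's 'heart risk greater' http://bbc.in/1ChTANp"),
  stringsAsFactors = FALSE
)

# Remove each URL together with the whitespace before it, then assign the
# cleaned vector back to the same column so the data frame keeps its shape
tweets_df$Tweets <- gsub("\\s*http\\S+", "", tweets_df$Tweets, perl = TRUE)

print(tweets_df)
```

The pattern "\\s*http\\S+" stops at the first whitespace after the URL, so any text following a URL would be kept; this is an assumption about the desired behaviour, since all the tweets shown end with their URL.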