
I am having trouble extracting a certain value for each row of my data frame. More precisely, the problem concerns the list column entities.urls, whose elements are themselves data frames.

Each element of entities.urls is a data frame with the variables start, end, url, expanded_url and display_url.

I need to get the second entry of the expanded_url variable that is embedded in the entities.urls variable for each row in my original dataframe. To give you some context: I want to get the expanded url for each tweet.

I have not figured out any way to solve this issue and I would highly appreciate any help.

The structure of my dataframe is the following:

glimpse(data2)

[Output]: 
Rows: 66
Columns: 28
$ entities.urls                  <list> [<data.frame[2 x 5]>], [<data.frame[2 x 5]>], [<data.fram~
$ id                             <chr> "1498290646406340615", "1498234776737792000", "14982113720~ ....

The entities.urls column has the following structure:

str(entities_urls)

[Output]
List of 66
 $ :'data.frame':   2 obs. of  5 variables:
  ..$ start       : int [1:2] 240 264
  ..$ end         : int [1:2] 263 287
  ..$ url         : chr [1:2] "shortened url not allowed" "shortened url not allowed"
  ..$ expanded_url: chr [1:2] "shortened url not allowed" **"https://twitter.com/Agilent/status/1498290646406340615/photo/1"**
  ..$ display_url : chr [1:2] "bit.ly/3qDpMxH" "pic.twitter.com/Re4DtJ7U2z" 
keevinhoo
  • [See here](https://stackoverflow.com/q/5963269/5325862) on making a reproducible example that is easier for folks to help with. It's hard to do more than guess based on a picture of data. What have you tried? – camille Mar 01 '22 at 16:59
  • Please post a reprex – Bruno Mar 01 '22 at 16:59
  • Something like `purrr::map_chr(your_data$entities.url, \(x) x$expanded_url[2])`. If that doesn't work or you need more help, please post a reproducible sample of data with `dput()`, e.g., `dput(your_data[1:5, c("id", "entities.url")])` for the first 5 rows of those two columns. – Gregor Thomas Mar 01 '22 at 17:47
  • As noted here, it's tough to help without a reprex. You'll find plenty of tutorials online, however, for dealing with lists as variables in a data frame, like this [one](https://jennybc.github.io/purrr-tutorial/ls13_list-columns.html#lists_as_variables_in_a_data_frame). – rdelrossi Mar 01 '22 at 17:53
  • @all: Thank you for your comments, I really appreciate your help. Sorry for my formatting; I am new here and still need to improve. Unfortunately I cannot provide `dput()` output, since Stack Overflow does not allow me to post shortened URLs and the data frame contains a lot of them. – keevinhoo Mar 02 '22 at 13:08
  • @GregorThomas: If I incorporate your line of code, I get a null output. Do you have any idea why this is the case? Also, please consider that I incorporated some more information on my post above. Maybe that helps... Thank you so much! – keevinhoo Mar 02 '22 at 13:09
  • Did you adjust the data frame and column names to match your actuals? Looking at the extra info you've provided I would try `purrr::map_chr(data2$entities.urls, \(x) x$expanded_url[2])` – Gregor Thomas Mar 02 '22 at 20:55
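
A minimal sketch of the approach suggested in the comments, run on a made-up stand-in for `data2` (the placeholder URLs, the helper name `second_url` and the new column `expanded_url_2` are assumptions, not part of the original data). It uses base R's `vapply()` so that it runs without extra packages; `purrr::map_chr(data2$entities.urls, second_url)` should give the same result. The helper guards against rows that have no URL data frame at all, which would otherwise make the extraction fail.

```r
# Hypothetical stand-in for data2: a list column whose elements are small
# data frames, mirroring the str() output in the question.
data2 <- data.frame(id = c("1", "2", "3"))
data2$entities.urls <- list(
  data.frame(
    expanded_url = c("https://example.com/a", "https://example.com/a/photo/1"),
    display_url  = c("example.com/a", "pic.example.com/a")
  ),
  data.frame(
    expanded_url = "https://example.com/b",  # tweet with only one URL
    display_url  = "example.com/b"
  ),
  NULL                                       # tweet without any URLs
)

# Hypothetical helper: return the second expanded_url of one element,
# or NA when the element is missing or has fewer than two URLs.
second_url <- function(x) {
  if (is.null(x) || length(x$expanded_url) < 2) {
    return(NA_character_)
  }
  as.character(x$expanded_url[2])
}

# Apply the helper to every row; vapply() enforces one string per row.
data2$expanded_url_2 <- vapply(data2$entities.urls, second_url, character(1))

print(data2[, c("id", "expanded_url_2")])
```

With this guard, rows with fewer than two URLs get `NA` instead of raising an error, so the result always has one value per tweet and can be stored as an ordinary character column.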
