-1

I have a string that is a sentence of eight words. What I'm trying to do is take the third, fourth, and fifth words of the sentence. I have tried using indexing such as:

string[3][4][5]

But this raises an IndexError. What am I missing here?

Christian Dean
James Dean
  • **(1)** What you're doing is called **indexing**. **(2)** What I believe you want is called **_slicing_**: `page_soup.title.string[3:6]`. – Christian Dean Oct 30 '17 at 03:07
  • That assumes `page_soup.title.string` is a list of words. It seems more likely that it's just one big string, so that solution would grab the third, fourth, and fifth _characters_. – John Gordon Oct 30 '17 at 03:10
  • @JohnGordon True, but you're assuming that when he says "word" he means a "group of characters". That's not necessarily true. He was already confused about the terminology for list slicing. He could be doing the same here. – Christian Dean Oct 30 '17 at 03:11
  • Are you trying to extract words from a string like "This is my title with seven words"? – omijn Oct 30 '17 at 03:12
  • It is one sentence with 8 words. I want to grab the third word and the fifth word. – James Dean Oct 30 '17 at 03:13
  • Possible duplicate of [Understanding Python's slice notation](https://stackoverflow.com/questions/509211/understanding-pythons-slice-notation) – Nir Alfasi Oct 30 '17 at 03:13
  • Ah, then what you actually want is: `page_soup.title.string.split()[3:6]` – Christian Dean Oct 30 '17 at 03:14

2 Answers

2
# split the title string into words (split by spaces)
thead_list = page_soup.title.string.split()

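# slice out indices 3, 4, 5 (zero-based, so the 4th, 5th, and 6th words);
# for the 3rd, 4th, and 5th words, use thead_list[2:5] instead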
words = thead_list[3:6]

Or, if you want just the third and fifth words, use `thead_list[2]` and `thead_list[4]` (indices start at 0).

If you need to concatenate the resulting words that you extracted, then do this:

new_title = " ".join(words) # converts ["word1", "word2"] to "word1 word2"

Combining all of the above steps into one line of code:

thead = " ".join(page_soup.title.string.split()[3:6])
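As a quick check of the indexing, here is a minimal sketch that uses a made-up eight-word title in place of `page_soup.title.string` (the sample sentence and variable names are assumptions for illustration). Python indices start at 0, so the third, fourth, and fifth words sit at indices 2, 3, and 4, which is the slice `[2:5]`; the slice `[3:6]` picks the fourth through sixth words.

```python
# Hypothetical stand-in for page_soup.title.string
title = "one two three four five six seven eight"

words = title.split()          # split on whitespace into a list of words

third_to_fifth = words[2:5]    # ['three', 'four', 'five']
fourth_to_sixth = words[3:6]   # ['four', 'five', 'six']

# Printing the list itself shows brackets and quotes;
# joining the words first gives a plain string.
as_text = " ".join(third_to_fifth)
print(third_to_fifth)
print(as_text)
```

Printing the list directly is the likely source of the stray brackets and quotes; printing the joined string avoids them.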
omijn
  • When I use your code, I get odd results. For some reason, it grabs the third word of every word. Also, it shows something like this: `[u'MyThirdWord']` - adding `[u']` is not what I want. – James Dean Oct 30 '17 at 03:28
  • @JamesDean: The third word of every word? Also, the 'u' isn't actually part of the word, so don't worry about that – omijn Oct 30 '17 at 03:30
  • Yeah, I know, but it prints that on my HTML page. Also, why can't it be as simple as: `thead = page_soup.title.string.split()[3:4]` ? – James Dean Oct 30 '17 at 03:32
  • @JamesDean: Sure, you can do that. I just wrote it in two lines so I could add in some comments – omijn Oct 30 '17 at 03:34
  • Right, I used the "simple" method; however, it prints unusual results, adding the bracket [, this ' and another '. Also, it's not printing the third or fourth word. – James Dean Oct 30 '17 at 03:35
  • @JamesDean: Which words are you trying to extract? It wasn't too clear from your question. If you want the third word, use an index of 2, for example. And the brackets will appear if you're printing the list itself. If you need the extracted words concatenated into a string, try using `" ".join(list)` – omijn Oct 30 '17 at 03:40
  • @omijn Can you add to the answer how that would look? – James Dean Oct 30 '17 at 03:41
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/157769/discussion-between-omijn-and-james-dean). – omijn Oct 30 '17 at 03:49
0

You can try this:

thead = page_soup.title.string
words = thead.split()  # split once and reuse the list
final_word1, final_word2 = words[2], words[4]  # third and fifth words
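For context on the original error: chained indexing such as `string[3][4][5]` does not select three items. `string[3]` is a single character, and indexing that one-character string at position 4 raises `IndexError`. The sketch below reproduces this and wraps word selection in a hypothetical helper, `pick_words`, that takes 1-based word positions; the helper name and the sample sentence are assumptions, not part of the answers here.

```python
def pick_words(sentence, positions):
    """Return the words at the given 1-based positions, skipping any out of range."""
    words = sentence.split()
    return [words[p - 1] for p in positions if 1 <= p <= len(words)]


sentence = "one two three four five six seven eight"  # made-up example

# sentence[3] is one character, so indexing it again at 4 fails
try:
    sentence[3][4][5]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

third, fifth = pick_words(sentence, [3, 5])
print(third, fifth)
```

Taking 1-based positions keeps the call close to how the question is phrased ("third", "fifth"), while the zero-based offset stays in one place.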
Ajax1234