
I have a corpus of text, which is separated into paragraphs with \n\n.

\n\n"Well done, Mrs. Martin!" thought Emma.  "You know what you are about."\n\n"And when she had come away, Mrs. Martin was so very kind as to send\nMrs. Goddard a beautiful goose--the finest goose Mrs. Goddard had\never seen.  Mrs. Goddard had dressed it on a Sunday, and asked all\nthe three teachers, Miss Nash, and Miss Prince, and Miss Richardson,\nto sup with her."\n\n"Mr. Martin, I suppose, is not a man of information beyond the line\nof his own business? He does not read?"\n\n"Oh yes!--that is, no--I do not know--but I believe he has\nread a good deal--but not what you would think any thing of.\nHe reads the Agricultural Reports, and some other books that lay\nin one of the window seats--but he reads all _them_ to himself.\nBut sometimes of an evening, before we went to cards, he would read\nsomething aloud out of the Elegant Extracts, very entertaining.\nAnd I know he has read the Vicar of Wakefield.  He never read the\nRomance of the Forest, nor The Children of the Abbey.  He had never\nheard of such books before I mentioned them, but he is determined\nto get them now as soon as ever he can."\n\nThe next question was--\n\n"What sort of looking man is Mr. Martin?"

Or if printed,

"Well done, Mrs. Martin!" thought Emma.  "You know what you are about."

"And when she had come away, Mrs. Martin was so very kind as to send
Mrs. Goddard a beautiful goose--the finest goose Mrs. Goddard had
ever seen.  Mrs. Goddard had dressed it on a Sunday, and asked all
the three teachers, Miss Nash, and Miss Prince, and Miss Richardson,
to sup with her."

"Mr. Martin, I suppose, is not a man of information beyond the line
of his own business? He does not read?"

"Oh yes!--that is, no--I do not know--but I believe he has
read a good deal--but not what you would think any thing of.
He reads the Agricultural Reports, and some other books that lay
in one of the window seats--but he reads all _them_ to himself.
But sometimes of an evening, before we went to cards, he would read
something aloud out of the Elegant Extracts, very entertaining.
And I know he has read the Vicar of Wakefield.  He never read the
Romance of the Forest, nor The Children of the Abbey.  He had never
heard of such books before I mentioned them, but he is determined
to get them now as soon as ever he can."

The next question was--

"What sort of looking man is Mr. Martin?"

Given a certain paragraph, I would like to know where its boundaries are. That is, I would like to locate the paragraph by the \n\n line breaks that surround it.

My goal is to click on a certain paragraph with my cursor and then determine that paragraph's boundaries from the positions of the nearest \n\n on either side.

text.find("\n\n")  # `text` is the corpus held in a str

will return the index of the first \n\n in the string. But what about a certain paragraph? If I "click" on the fourth paragraph (at Vicar of Wakefield), how can I search for the first \n\n above this position and the first \n\n below it?

EB2127

1 Answer

Assuming you know the position pos where you "clicked" inside the long text string, you can use str.find() and str.rfind() to locate the paragraph's boundaries.

To look "forward", use:

text.find("\n\n", pos)  # searches for "\n\n" starting at position `pos`; returns the index of the first match, or -1 if there is none

and to look "backward", use:

text.rfind("\n\n", 0, pos)  # searches for "\n\n" from the beginning up to `pos`; returns the index of the last match, or -1 if there is none

For documentation on both methods, see https://docs.python.org/2/library/string.html
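
Putting the two searches together, a minimal sketch might look like the following. The function name `paragraph_bounds` and its `sep` parameter are made up for illustration, and it assumes `pos` falls inside the paragraph's text rather than on the "\n\n" separator itself. When no separator is found on one side (first or last paragraph), it falls back to the start or end of the string.

```python
def paragraph_bounds(text, pos, sep="\n\n"):
    """Return (start, end) such that text[start:end] is the paragraph around pos.

    Hypothetical helper; assumes pos lies inside paragraph text, not on a separator.
    """
    # Last separator that ends at or before pos; -1 means pos is in the first paragraph.
    before = text.rfind(sep, 0, pos)
    start = 0 if before == -1 else before + len(sep)

    # First separator at or after pos; -1 means pos is in the last paragraph.
    after = text.find(sep, pos)
    end = len(text) if after == -1 else after

    return start, end


sample = "first para\n\nsecond line one\nsecond line two\n\nthird"
start, end = paragraph_bounds(sample, sample.index("line two"))
print(repr(sample[start:end]))
```

Slicing with the returned pair yields the paragraph without its surrounding blank lines; single "\n" line wraps inside the paragraph are left untouched.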

sal
  • Any ideas on finding out the position where one "clicked"? – EB2127 Jan 13 '16 at 01:22
  • That requires you to give more context around the "system" you have. How do you display the paragraph? How do you handle input devices? I would post that as another question. – sal Jan 13 '16 at 01:26
  • I am going to use TkInter as an interactive GUI. Basically, input the text via Text Widget, and allow the user to "click" a paragraph. – EB2127 Jan 13 '16 at 01:51
  • I am not familiar with TkInter. In some cases GUI widgets have methods that respond to "click" or other events and report a lot of useful info, but I can't help you with that. Please post a separate question. I believe my answer addresses the original question. – sal Jan 13 '16 at 02:03
  • Actually, you should be able to use something from [this post](http://stackoverflow.com/questions/31023744/getting-cursor-position-in-tkinter-entry-widget). Basically using the text.index(INSERT) call. See also [this other one](http://effbot.org/tkinterbook/text.htm). – sal Jan 13 '16 at 02:08
  • Perfect! This should work well! INSERT or CURRENT is what I'm looking for, I believe – EB2127 Jan 13 '16 at 02:18
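
One gap between the comments and the answer: a Tkinter Text widget reports positions such as index(INSERT) or index(CURRENT) as "line.column" strings (lines counted from 1, columns from 0), while str.find and str.rfind expect a single character offset. A rough sketch of the conversion follows. The helper name `tk_index_to_offset` is hypothetical, and it assumes the string being searched is exactly the text shown in the widget.

```python
def tk_index_to_offset(text, index):
    """Convert a Tk Text index like "3.14" into a 0-based offset into text.

    Hypothetical helper; assumes text matches the widget's contents exactly.
    """
    line_str, col_str = index.split(".")
    line, col = int(line_str), int(col_str)

    # Every line before the target contributes its length plus one newline.
    lines = text.split("\n")
    return sum(len(l) + 1 for l in lines[: line - 1]) + col


sample = "ab\n\ncd"
print(tk_index_to_offset(sample, "3.1"))  # offset of "d" on the third line
```

The resulting offset can then serve as pos in the find and rfind calls from the answer.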