A sentence is an ordered sequence of words in a given language, often referred to as a "document" in Natural Language Processing.
Questions tagged [sentence]
364 questions
35
votes
6 answers
JavaScript RegExp for splitting text into sentences and keeping the delimiter
I am trying to use JavaScript's split to get the sentences out of a string but keep the delimiters, e.g. !?.
So far I have
sentences = text.split(/[\\.!?]/);
which works but does not include the ending punctuation for each sentence (.!?).
Does anyone…

daktau
34
votes
6 answers
How to break up document by sentences with spaCy
How can I break a document (e.g., paragraph, book, etc.) into sentences?
For example, "The dog ran. The cat jumped" into ["The dog ran", "The cat jumped"] with spaCy?

Ulad Kasach
15
votes
4 answers
Split sentence into words but having trouble with the punctuation in C#
I have seen a few similar questions but I am trying to achieve this.
Given a string, str="The moon is our natural satellite, i.e. it rotates around the Earth!"
I want to extract the words and store them in an array.
The expected array elements…

Richard N
15
votes
7 answers
R break corpus into sentences
I have a number of PDF documents, which I have read into a corpus with library tm. How can one break the corpus into sentences?
It can be done by reading the file with readLines followed by sentSplit from package qdap [*]. That function requires a…

Henk
12
votes
2 answers
NLP for extracting actions from text
I'm hoping somebody can point me in the right direction to learn about separating out actions from a bunch of text.
Suppose I have this text
Drop off the dry cleaning, and go to the corner store and pick-up a jug of milk and get a pint of…

pedalpete
10
votes
1 answer
Custom sentence segmentation using spaCy
I am new to spaCy and NLP. I'm facing the issue below while doing sentence segmentation with spaCy.
The text I am trying to tokenise into sentences contains numbered lists (with a space between the numbering and the actual text), like below.
import spacy
nlp…

Satheesh K
10
votes
1 answer
Making a meaningful sentence from a given set of words
I am working on a program that needs to create a sentence that is grammatically correct from a given set of words. Here I will be passing an input of a list of strings to the program and my output should be a meaningful sentence made with those…

FaultyProgrammer3107
9
votes
1 answer
Sentence Structure identification - spacy
I intend to identify the sentence structure in English using spaCy and textacy.
For example:
The cat sat on the mat - SVO. The cat jumped and picked up the biscuit - SVVO.
The cat ate the biscuit and cookies - SVOO.
The program is supposed to…

Programmer_nltk
9
votes
1 answer
Python regex for finding all words in a string
Hello, I am new to regex and I'm starting out with Python.
I'm stuck at extracting all words from an English sentence.
So far I have:
import re
shop="hello seattle what have you got"
regex = r'(\w*) '
list1=re.findall(regex,shop)
print list1
This…
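One possible approach, as a minimal sketch rather than the accepted answer: the pattern above requires a trailing space after every word, so the last word in the string cannot match. Matching runs of word characters directly avoids that. The variable name `words` is illustrative.

```python
import re

shop = "hello seattle what have you got"

# \w+ matches each maximal run of word characters,
# so no trailing space is needed to pick up the final word.
words = re.findall(r"\w+", shop)
print(words)
```

Note that `\w` also matches digits and underscores, so this treats tokens such as `abc_1` as single words.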

TNT
9
votes
2 answers
Python autocomplete user input
I have a list of teamnames. Let's say they are
teamnames=["Blackpool","Blackburn","Arsenal"]
In the program I ask the user which team he would like to do stuff with. I want Python to autocomplete the user's input if it matches a team and print…
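A rough sketch of one way to do this, assuming "autocomplete" means resolving a typed prefix to a team name when exactly one team matches. The helper name `autocomplete` and the case-insensitive matching are assumptions, not part of the question.

```python
teamnames = ["Blackpool", "Blackburn", "Arsenal"]

def autocomplete(prefix, names):
    """Return the single name starting with prefix, or None if the prefix
    is empty, unknown, or ambiguous (hypothetical helper)."""
    if not prefix:
        return None
    matches = [n for n in names if n.lower().startswith(prefix.lower())]
    return matches[0] if len(matches) == 1 else None

print(autocomplete("Ars", teamnames))    # unique prefix
print(autocomplete("Black", teamnames))  # ambiguous prefix
```

Returning `None` for ambiguous input leaves it to the caller to decide whether to re-prompt or list the candidates.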

user3142412
9
votes
4 answers
Maven: If sentences in pom.xml in the property tag
I'd like to set a property if an environment variable is set. I googled a lot on it and all I found is something similar to the code below, but I keep getting the error:
[FATAL] Non-parseable POM Y:\Maven\parent-pom\pom.xml: TEXT must be…

Elyahu
8
votes
3 answers
I wish to create a system where I give a sentence and the system spits out sentences similar in meaning to the input sentence I gave
This is an NLP problem and I was wondering how I should proceed.
How difficult is the problem?
Could I replace the word with synonyms and check that the grammar is correct?

kosmos
7
votes
5 answers
Convert a list of string sentences to words
I'm essentially trying to take a list of strings containing sentences such as:
sentence = ['Here is an example of what I am working with', 'But I need to change the format', 'to something more useable']
and convert it into the following:
word_list =…
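The target format is cut off in the excerpt, so the following is only a sketch under two possible readings: a flat list of all words, or one list of words per sentence. The names `flat_words` and `nested_words` are illustrative.

```python
sentence = ['Here is an example of what I am working with',
            'But I need to change the format',
            'to something more useable']

# Flat list: every word from every sentence, in order.
flat_words = [word for s in sentence for word in s.split()]

# Nested list: one list of words per sentence.
nested_words = [s.split() for s in sentence]

print(flat_words[:4])
print(nested_words[1])
```

`str.split()` with no arguments splits on any run of whitespace, so punctuation attached to a word would stay attached.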

George Burrows
7
votes
2 answers
How to "transform" an array into a sentence?
I am using Ruby on Rails v3.0.9 and I would like to "transform" an array of strings into a sentence, including punctuation. That is, if I have an array like the following:
["element 1", "element 2", "element 3"]
I would like to get/build:
# Note: I…

Backo
7
votes
3 answers
Splitting Chinese document into sentences
I have to split Chinese text into multiple sentences. I tried the Stanford DocumentPreProcessor. It worked quite well for English but not for Chinese.
Can you please recommend any good sentence splitters for Chinese, preferably in Java or Python?
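As a starting point, and not a substitute for a dedicated library, a plain regex can split on the full-width sentence-final marks 。！？. This is a minimal sketch: the function name `split_chinese_sentences` is hypothetical, and it does not handle closing quotes, ellipses, or ASCII punctuation.

```python
import re

def split_chinese_sentences(text):
    """Split text after runs of 。！？, keeping the punctuation
    with the sentence it ends (hypothetical helper)."""
    # Each match is a run of non-terminator characters followed by
    # zero or more terminators, so a trailing fragment is kept too.
    parts = re.findall(r"[^。！？]+[。！？]*", text)
    return [p.strip() for p in parts if p.strip()]

sentences = split_chinese_sentences("今天天气很好。我们去公园吧！你觉得呢？")
for s in sentences:
    print(s)
```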

pjesudhas