Questions tagged [sentence]

A sentence is an ordered sequence of words in a given language, often referred to as a "document" in Natural Language Processing.

364 questions
35
votes
6 answers

Javascript RegExp for splitting text into sentences and keeping the delimiter

I am trying to use JavaScript's split to get the sentences out of a string while keeping the delimiters, e.g. !?. So far I have sentences = text.split(/[\\.!?]/); which works but does not include the ending punctuation for each sentence (.!?). Does anyone…
daktau
  • 633
  • 1
  • 7
  • 17
34
votes
6 answers

How to break up document by sentences with Spacy

How can I break a document (e.g., paragraph, book, etc.) into sentences with spaCy? For example, "The dog ran. The cat jumped" into ["The dog ran", "The cat jumped"].
Ulad Kasach
  • 11,558
  • 11
  • 61
  • 87
15
votes
4 answers

Split sentence into words but having trouble with the punctuations in C#

I have seen a few similar questions but I am trying to achieve this. Given a string, str="The moon is our natural satellite, i.e. it rotates around the Earth!" I want to extract the words and store them in an array. The expected array elements…
Richard N
  • 895
  • 9
  • 19
  • 36
15
votes
7 answers

R break corpus into sentences

I have a number of PDF documents, which I have read into a corpus with library tm. How can one break the corpus into sentences? It can be done by reading the file with readLines followed by sentSplit from package qdap [*]. That function requires a…
Henk
  • 3,634
  • 5
  • 28
  • 54
12
votes
2 answers

NLP for extracting actions from text

I'm hoping somebody can point me in the right direction to learn about separating out actions from a bunch of text. Suppose I have this text Drop off the dry cleaning, and go to the corner store and pick-up a jug of milk and get a pint of…
pedalpete
  • 21,076
  • 45
  • 128
  • 239
10
votes
1 answer

Custom sentence segmentation using Spacy

I am new to Spacy and NLP. I'm facing the below issue while doing sentence segmentation using Spacy. The text I am trying to tokenise into sentences contains numbered lists (with space between numbering and actual text), like below. import spacy nlp…
Satheesh K
  • 501
  • 1
  • 3
  • 16
10
votes
1 answer

Making a meaningful sentence from a given set of words

I am working on a program that needs to create a sentence that is grammatically correct from a given set of words. Here I will be passing an input of a list of strings to the program and my output should be a meaningful sentence made with those…
9
votes
1 answer

Sentence Structure identification - spacy

I intend to identify the sentence structure in English using spacy and textacy. For example: The cat sat on the mat - SVO, The cat jumped and picked up the biscuit - SVVO, The cat ate the biscuit and cookies - SVOO. The program is supposed to…
Programmer_nltk
  • 863
  • 16
  • 38
9
votes
1 answer

Python regex for finding all words in a string

Hello I am new into regex and I'm starting out with python. I'm stuck at extracting all words from an English sentence. So far I have: import re shop="hello seattle what have you got" regex = r'(\w*) ' list1=re.findall(regex,shop) print list1 This…
TNT
  • 480
  • 1
  • 4
  • 11
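A minimal sketch for the question above, assuming the goal is simply to list every word in the sample string. The pattern in the excerpt, `(\w*) `, only matches a word that is followed by a space, so the final word is never returned; `\w+` does not have that requirement. The variable name `words` is illustrative and not taken from the question.

```python
import re

shop = "hello seattle what have you got"

# \w+ matches each run of word characters. No trailing space is required,
# so the last word in the string is captured too.
words = re.findall(r"\w+", shop)
print(words)
```

Punctuation and whitespace are skipped because they are not word characters, so this may be enough for plain English text; apostrophes and hyphens would need a wider pattern.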
9
votes
2 answers

Python autocomplete user input

I have a list of teamnames. Let's say they are teamnames=["Blackpool","Blackburn","Arsenal"] In the program I ask the user which team he would like to do stuff with. I want python to autocomplete the user's input if it matches a team and print…
user3142412
  • 91
  • 1
  • 2
  • 4
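One way to read the question above is as prefix matching against the list. The sketch below assumes that reading and assumes matching should ignore case; the helper name `complete` is hypothetical and not from the question.

```python
teamnames = ["Blackpool", "Blackburn", "Arsenal"]


def complete(prefix, names):
    """Return every name that starts with prefix, ignoring case."""
    prefix = prefix.lower()
    return [name for name in names if name.lower().startswith(prefix)]


# A unique match can be accepted directly; several matches mean the
# input is still ambiguous and the user should type more characters.
print(complete("ars", teamnames))
print(complete("black", teamnames))
```

Returning a list rather than a single name leaves the ambiguous case ("black" matches two teams) to the caller.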
9
votes
4 answers

Maven: If sentences in pom.xml in the property tag

I'd like to set a property if an environment variable is set. I googled a lot on it and all I found is something similar to the code below, but I keep getting the error: [FATAL] Non-parseable POM Y:\Maven\parent-pom\pom.xml: TEXT must be…
Elyahu
  • 226
  • 1
  • 2
  • 15
8
votes
3 answers

I wish to create a system where I give a sentence and the system spits out sentences similar in meaning to the input sentence I gave

This is an NLP problem and I was wondering how I should proceed. How difficult is the problem? Could I replace the word with synonyms and check that the grammar is correct?
kosmos
  • 359
  • 5
  • 13
7
votes
5 answers

Convert a list of string sentences to words

I'm trying to essentially take a list of strings containing sentences such as: sentence = ['Here is an example of what I am working with', 'But I need to change the format', 'to something more useable'] and convert it into the following: word_list =…
George Burrows
  • 3,391
  • 9
  • 31
  • 31
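The excerpt above is cut off before it shows the desired format, so the following is only a sketch under the assumption that `word_list` should be one flat list of words. It splits each sentence on whitespace and flattens the result with a nested list comprehension.

```python
sentence = [
    "Here is an example of what I am working with",
    "But I need to change the format",
    "to something more useable",
]

# str.split() with no argument splits on any run of whitespace;
# the nested comprehension flattens the per-sentence lists into one.
word_list = [word for line in sentence for word in line.split()]
print(word_list)
```

If a list of lists is wanted instead (one inner list per sentence), `[line.split() for line in sentence]` would be the corresponding form.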
7
votes
2 answers

How to "transform" an array in a sentence?

I am using Ruby on Rails v3.0.9 and I would like to "transform" an array of strings in a sentence including punctuation. That is, if I have an array like the following: ["element 1", "element 2", "element 3"] I would like to get\build: # Note: I…
Backo
  • 18,291
  • 27
  • 103
  • 170
7
votes
3 answers

Splitting Chinese document into sentences

I have to split Chinese text into multiple sentences. I tried the Stanford DocumentPreProcessor. It worked quite well for English but not for Chinese. Can you recommend any good sentence splitters for Chinese, preferably in Java or Python?
pjesudhas
  • 399
  • 4
  • 13