One of the most useful techniques in text matching is the regular expression (regex).
Questions tagged [textmatching]
78 questions
62
votes
19 answers
Is regular expression recognition of an email address hard?
I recently read somewhere that writing a regexp to match an email address, taking into account all the variations and possibilities of the standard, is extremely hard and significantly more complicated than one would initially assume.
Why is…

shoosh
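A minimal sketch of why this is harder than it looks, assuming a commonly seen "pragmatic" pattern (the pattern and the name `looks_like_email` are illustrative assumptions, not taken from the question or its answers). It accepts everyday addresses, yet rejects forms the standard allows, such as a quoted local part or a domain literal, and accepts a local part with consecutive dots, which the standard does not allow unquoted:

```python
import re

# Illustrative pattern only: a deliberate simplification, not a full
# implementation of the address grammar in the email standard.
SIMPLE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def looks_like_email(candidate):
    """Return True if the whole string matches the simplified pattern."""
    return SIMPLE_EMAIL.fullmatch(candidate) is not None


if __name__ == "__main__":
    samples = [
        "user@example.com",          # ordinary address
        '"john doe"@example.com',    # quoted local part, allowed by the standard
        "user@[192.168.0.1]",        # domain literal, allowed by the standard
        "a..b@example.com",          # consecutive dots, not allowed unquoted
    ]
    for sample in samples:
        print(sample, looks_like_email(sample))
```

The mismatches in both directions are the point: each fix to the pattern tends to add another special case.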
16
votes
4 answers
How to do `cy.notContains(text)` in Cypress?
I can check whether text exists in Cypress with cy.contains('hello'). Now that I have deleted hello from the page, I want to check that hello doesn't exist. How do I do something like cy.notContains('hello')?

Alien
14
votes
2 answers
How to do Java String matching using Boolean Search Syntax?
I'm looking for a Java/Scala library that can take a user query and a text and return whether there was a match or not.
I'm processing a stream of information, i.e. a Twitter stream, and can't afford to use a batch process; I need to evaluate each…

arjones
13
votes
4 answers
Search with various combinations of space, hyphen, casing and punctuations
My schema:

Sudheer Aedama
11
votes
1 answer
Postgresql - converting text to ts_vector
Sorry for the basic question.
I have a table with the following columns.
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 doc_id | bigint  |
 text   | text    |
I am trying to do text…

CISCO
5
votes
5 answers
Python dictionary replacement with space in key
I have a string and a dictionary, and I have to replace every occurrence of each dict key in that text.
text = 'I have a smartphone and a Smart TV'
dict = {
'smartphone': 'toy',
'smart tv': 'junk'
}
If there are no spaces in the keys, I will break the…

James
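One possible approach, sketched under assumptions: build a single regex alternation from the keys so that multi-word keys such as 'smart tv' are matched as whole phrases instead of being split on spaces. The function name `replace_phrases`, the case-insensitive matching, and the longest-key-first ordering are assumptions made for illustration; the truncated question does not state them.

```python
import re


def replace_phrases(text, mapping):
    """Replace every key of `mapping` found in `text`, including keys with spaces."""
    # Assumption: matching is case-insensitive, so keys are normalised to lower case.
    lowered = {key.lower(): value for key, value in mapping.items()}
    # Longest keys first, so a longer phrase wins over a shorter overlapping key.
    keys = sorted(lowered, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(key) for key in keys) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: lowered[match.group(0).lower()], text)


if __name__ == "__main__":
    text = "I have a smartphone and a Smart TV"
    replacements = {"smartphone": "toy", "smart tv": "junk"}
    print(replace_phrases(text, replacements))  # I have a toy and a junk
```

Because the text is scanned once with one compiled pattern, a replacement value is never itself re-scanned and replaced again.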
5
votes
5 answers
Data Comparison
We have a SQL Server table containing Company Name, Address, and Contact name (among others).
We regularly receive data files from outside sources that require us to match up against this table. Unfortunately, the data is slightly different since…

wcm
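A rough sketch of one way to match slightly different names, using Python's standard `difflib` in place of SQL Server, which is a swapped-in technique and not what the question asks for. The helper names `normalise` and `best_match`, the sample company names, and the 0.6 cutoff are all assumptions for illustration.

```python
import difflib


def normalise(name):
    """Lower-case the name and replace punctuation with spaces, then collapse spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def best_match(incoming, known_names, cutoff=0.6):
    """Return the known name most similar to `incoming`, or None if none is close enough."""
    by_key = {normalise(name): name for name in known_names}
    hits = difflib.get_close_matches(normalise(incoming), list(by_key), n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None


if __name__ == "__main__":
    # Hypothetical reference data standing in for the company table.
    known = ["Acme Corporation", "Apex Holdings Ltd"]
    print(best_match("ACME Corp.", known))
    print(best_match("Qwerty", known))
```

The cutoff trades false matches against missed matches, so in practice it would need tuning against real data and probably a manual review step for borderline scores.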
4
votes
1 answer
Cluster sequences of strings in R
I have the following data:
attributes <- c("apple-water-orange", "apple-water", "apple-orange", "coffee", "coffee-croissant", "green-red-yellow", "green-red-blue", "green-red","black-white","black-white-purple")
attributes
attributes
1 …

constiii
4
votes
1 answer
How do I group companies that have different names but are essentially the same semantically?
I am doing competitor analysis using Open Government Data from the UK public sector, but there are some anomalies in my results. When I group the contracts by company name, there are a lot of issues: company names are misspelt or they vary in…

Tejasvi Gaurav
3
votes
7 answers
How to match URIs in text?
How would one go about spotting URIs in a block of text?
The idea is to turn such runs of text into links. This is pretty simple to do if one only considers the http(s) and ftp(s) schemes; however, I am guessing the general problem (considering…
Ufuk Kayserilioglu
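For the simple case the question mentions (only the http(s) and ftp(s) schemes), a sketch might look like the following. The pattern, the name `find_uris`, and the choice to trim trailing punctuation are assumptions for illustration; this does not attempt the general URI grammar the question goes on to ask about.

```python
import re

# Scheme, "://", then everything up to whitespace or a few delimiter characters.
URI_PATTERN = re.compile(r"\b(?:https?|ftps?)://[^\s<>\"']+", re.IGNORECASE)


def find_uris(text):
    """Return http(s)/ftp(s) URIs found in `text`, with trailing punctuation trimmed."""
    # Assumption: sentence punctuation directly after a URI is not part of it.
    return [match.group(0).rstrip(".,;:!?)") for match in URI_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "See https://example.com/a?b=1, or ftp://files.example.org/x.txt."
    print(find_uris(sample))
```

Trimming trailing punctuation is a heuristic: it will also cut a URI that legitimately ends in one of those characters.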
3
votes
2 answers
Is there a better way to capture all the regex patterns in matching with nested lists within a dictionary?
I am trying out a simple text-matching activity where I scrape the titles of blog posts and try to match them with my predefined categories when I find specific keywords.
So for example, the title of the blog post is
"Capture Perfect Night Shots with…

Nicoconut
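A hedged sketch of one way to structure this: compile one regex per category up front, then test each title against every compiled pattern. The categories, keywords, sample title, and the flat list-per-category layout are hypothetical; the question's actual nested structure is not shown in the excerpt.

```python
import re

# Hypothetical categories and keywords, for illustration only.
CATEGORY_KEYWORDS = {
    "photography": ["night shots", "camera"],
    "travel": ["itinerary", "road trip"],
}


def compile_categories(category_keywords):
    """Build one case-insensitive, whole-word pattern per category."""
    return {
        category: re.compile(
            r"\b(?:" + "|".join(re.escape(word) for word in keywords) + r")\b",
            re.IGNORECASE,
        )
        for category, keywords in category_keywords.items()
    }


def categorise(title, compiled):
    """Return every category whose pattern occurs in the title."""
    return [category for category, pattern in compiled.items() if pattern.search(title)]


if __name__ == "__main__":
    compiled = compile_categories(CATEGORY_KEYWORDS)
    print(categorise("Night Shots on a Road Trip", compiled))
```

Compiling once and reusing the patterns avoids rebuilding a regex for every title and keeps the keyword lists as plain data.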
3
votes
2 answers
Android Espresso test always fails in text matching
I have a problem with an Espresso test: I don't know why matching the text always fails for me. I even tried creating a simple app that has two activities; the first activity has a TextView and two buttons, one button shows a toast and the other goes to the next activity,…

Rooh Al-mahaba
3
votes
0 answers
Record Linkage with multiple datasets
The problem
fastLink and RecordLinkage packages do extremely well in matching records (rows) from database A to database B and vice-versa. The developers are working on extending from matching only 2 databases to multiple databases.
A simple example…

Yeshyyy
3
votes
0 answers
text matching, semantic similarity, match the similar phrase/ words python semantic wordNet FuzzyMatch
By using WordNet text matching I realized that WordNet can only match a single word to a single word. It cannot match a single word to a phrase.
As you can see, I have two lists.
list1=['fruit', 'world']
list2=[u'domain', u'creation Year',…

bob90937
3
votes
1 answer
How can I match string order between two documents in Perl?
I have a problem writing a Perl program to match the words in two documents. Let's say there are documents A and B.
I want to delete the words in document A that are not in document B.
Example 1:
A: I eat pizza
B: She go to the market and…

Randy
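The filtering step described above can be sketched as follows, written in Python, not Perl, purely for illustration. The name `keep_shared_words`, the whitespace tokenisation, the case-insensitive comparison, and the second sample sentence are assumptions; the question's own example for document B is truncated and is not reproduced here.

```python
def keep_shared_words(doc_a, doc_b):
    """Keep the words of doc_a, in their original order, that also occur in doc_b."""
    # Assumption: words are whitespace-separated and compared case-insensitively.
    vocabulary = set(doc_b.lower().split())
    return " ".join(word for word in doc_a.split() if word.lower() in vocabulary)


if __name__ == "__main__":
    document_a = "I eat pizza"
    document_b = "She and I like pizza"  # hypothetical stand-in for document B
    print(keep_shared_words(document_a, document_b))  # I pizza
```

Building a set from document B once makes each membership check cheap, and iterating over document A preserves its word order.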