Grammar induction is the process of generating a grammar that matches a set of strings.
Questions tagged [grammar-induction]
12 questions
109
votes
10 answers
Is it possible for a computer to "learn" a regular expression by user-provided examples?
To clarify:
I do not want to learn regular expressions.
I want to create a program which "learns" a regular expression from examples which are interactively…

Daniel Rikowski
30
votes
4 answers
How to auto generate regex from given list of strings?
You are given 2 lists of Strings - A and B. Find the shortest regex that matches all strings in A and none in B. Note that this regex can match/not-match other strings that are not in A and not in B. For simplicity, we can assume that our…

pathikrit
11
votes
4 answers
Generating the shortest regex to match an arbitrary word list
I'm hoping someone might know of a script that can take an arbitrary word list and generate the shortest regex that could match that list exactly (and nothing else).
For example, suppose my list is
1231
1233
1234
1236
1238
1247
1256
1258
1259
Then…

Asmor
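One common approach to the word-list question above is to put the words in a trie and turn the trie into a pattern, merging shared prefixes and collapsing single-character endings into character classes. The following is only a sketch of that idea, not an answer taken from the thread: the names `build_trie`, `trie_to_regex` and `regex_from_words` are made up for illustration, and the result is compact but not guaranteed to be the shortest possible regex.

```python
import re

def build_trie(words):
    """Store the words in a nested-dict trie; the key "" marks end of word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return trie

def trie_to_regex(node):
    """Turn a trie node into a pattern matching every suffix stored below it."""
    if "" in node and len(node) == 1:
        return ""  # only the end marker: nothing more to match
    optional = "" in node  # a word ends here, so the rest is optional
    singles, alternatives = [], []
    for ch in sorted(k for k in node if k):
        suffix = trie_to_regex(node[ch])
        if suffix:
            alternatives.append(re.escape(ch) + suffix)
        else:
            singles.append(ch)  # final character, candidate for a class
    single_atom = not alternatives  # only one char or one class remains
    if len(singles) == 1:
        alternatives.append(re.escape(singles[0]))
    elif singles:
        alternatives.append("[" + "".join(re.escape(c) for c in singles) + "]")
    if len(alternatives) == 1:
        pattern = alternatives[0]
    else:
        pattern = "(?:" + "|".join(alternatives) + ")"
    if optional:
        if single_atom or len(alternatives) > 1:
            pattern += "?"
        else:
            pattern = "(?:" + pattern + ")?"
    return pattern

def regex_from_words(words):
    """Hypothetical helper: a pattern that matches exactly the given words."""
    return trie_to_regex(build_trie(words))

words = ["1231", "1233", "1234", "1236", "1238", "1247", "1256", "1258", "1259"]
print(regex_from_words(words))  # 12(?:3[13468]|47|5[689])
```

Use the pattern with `re.fullmatch` so that it matches the listed words and nothing else.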
10
votes
2 answers
Find simplest regular expression matching all given strings
Is there an algorithm that can produce a regular expression (maybe limited to a simplified grammar) from a set of strings such that the evaluation of all possible strings that match the regular expression reproduces the initial set of strings?
It is…

fuenfundachtzig
8
votes
4 answers
Automatically built regex expressions that fit set of strings
We have written a system to analyse log messages from a large network. The system takes log messages from many different network elements and analyses them with regular expressions. For example, a user may have written two…

Archie
8
votes
1 answer
Genetic algorithm grammar induction program/code?
Does anyone know of a program that uses a GA to perform grammar induction/inference? I've read tonnes of research papers and articles on this, such as Lankhorst and De Pauw, but I can't find any implementations or programs that use this technique…

Matt Robinson
6
votes
3 answers
creating a regular expression for a list of strings
I have extracted a series of tables from the scientific literature which consist of columns each of which is a distinct type. Here is an example
I'd like to be able to automatically generate regular expressions for each column. Obviously there are…

peter.murray.rust
5
votes
3 answers
Grammar inference library?
What are the best (or any) open source libraries for regular or context-free grammar inference from a set of examples believed to be generated by a common grammar? I'd prefer a good library in Java, Python or Ruby, but of course beggars can't be…

Lucas Wiman
3
votes
1 answer
Grammar Induction Program - Sequitur
Does anyone know of a program that does grammar induction? For example, where can I find the source code for the Sequitur context-free grammar program?
user562688
2
votes
2 answers
Generate RegEx from matches
I want to generate a RegEx pattern from a given set of matches.
For example, I want to get \d+<\b> from the following array of matches:
1
2
3
4
5
...
Any ideas?

MrBassam
0
votes
1 answer
Automatically generating Regex from set of strings residing in DB using C#
I have about 100,000 strings in a database and I want to know if there is a way to automatically generate a regex pattern from these strings. All of them are alphabetic strings that use a subset of the English letters; (X, W, V) are not used, for example.…

Muhammad Adeel Zahid
-2
votes
2 answers
Generate JS Regex from a set of strings
Is there any way or any library out there that can compute a JS RegEx from a set of strings that I want to be matched?
For example, I have this set of strings:
abc123
abc212
And generate abc\d\d\d ?
Or this set:
aba111
abb111
abc
And generate…
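
For the first example in this question (abc123 and abc212 giving abc\d\d\d), one simple heuristic is to compare the strings column by column and generalize each position to the narrowest class that covers every character seen there. This is a minimal sketch under the assumption that all input strings have the same length; it does not cover the second, mixed-length set. The names `position_class` and `generalize` are hypothetical, not from any library. It is written in Python, but the resulting pattern uses only syntax that JavaScript regexes also accept.

```python
import re
import string

def position_class(chars):
    """Pick the narrowest pattern covering every character seen at one position."""
    if len(chars) == 1:
        return re.escape(next(iter(chars)))  # same character everywhere: literal
    if chars <= set(string.digits):
        return r"\d"
    if chars <= set(string.ascii_lowercase):
        return "[a-z]"
    if chars <= set(string.ascii_uppercase):
        return "[A-Z]"
    # Mixed characters: fall back to an explicit class of what was observed.
    return "[" + "".join(re.escape(c) for c in sorted(chars)) + "]"

def generalize(strings):
    """Hypothetical helper: build a pattern from equal-length example strings."""
    if not strings:
        raise ValueError("need at least one example string")
    if len({len(s) for s in strings}) != 1:
        raise ValueError("this sketch only handles strings of equal length")
    # zip(*strings) yields one tuple of characters per column.
    return "".join(position_class(set(column)) for column in zip(*strings))

print(generalize(["abc123", "abc212"]))  # abc\d\d\d
```

Note that this generalizes beyond the examples: the pattern also matches strings such as abc999 that were never given, which may or may not be what is wanted.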