Questions tagged [grammar-induction]

Grammar induction is the process of generating a grammar that matches a set of strings.

12 questions
109
votes
10 answers

Is it possible for a computer to "learn" a regular expression by user-provided examples?

Is it possible for a computer to "learn" a regular expression by user-provided examples? To clarify: I do not want to learn regular expressions. I want to create a program which "learns" a regular expression from examples which are interactively…
Daniel Rikowski
  • 71,375
  • 57
  • 251
  • 329
30
votes
4 answers

How to auto generate regex from given list of strings?

You are given 2 lists of Strings - A and B. Find the shortest regex that matches all strings in A and none in B. Note that this regex can match/not-match other strings that are not in A and not in B. For simplicity, we can assume that our…
pathikrit
  • 32,469
  • 37
  • 142
  • 221
11
votes
4 answers

Generating the shortest regex to match an arbitrary word list

I'm hoping someone might know of a script that can take an arbitrary word list and generate the shortest regex that could match that list exactly (and nothing else). For example, suppose my list is: 1231 1233 1234 1236 1238 1247 1256 1258 1259. Then…
Asmor
  • 5,043
  • 6
  • 32
  • 42
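
One common way to approach the word-list question above is to build a prefix tree (trie) of the words and serialize it back into a pattern. The following is a minimal sketch under that assumption, not the asker's or any answerer's method. The helper names `build_trie` and `trie_to_regex` are hypothetical, and the result matches the list exactly but is not guaranteed to be the shortest possible regex, since only shared prefixes are merged.

```python
import re

def build_trie(words):
    """Build a nested-dict prefix tree; the key "" marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return trie

def trie_to_regex(node):
    """Serialize a trie node into a regex fragment (prefix merging only)."""
    optional = "" in node  # a word may end at this node
    branches = [re.escape(ch) + trie_to_regex(child)
                for ch, child in sorted(node.items()) if ch]
    if not branches:
        return ""
    if len(branches) == 1 and not optional:
        return branches[0]
    if all(len(b) == 1 for b in branches):
        # Plain single characters can share one character class.
        body = branches[0] if len(branches) == 1 else "[" + "".join(branches) + "]"
    else:
        body = "(?:" + "|".join(branches) + ")"
    return body + "?" if optional else body

words = "1231 1233 1234 1236 1238 1247 1256 1258 1259".split()
pattern = trie_to_regex(build_trie(words))
print(pattern)  # 12(?:3[13468]|47|5[689])
```

Use `re.fullmatch(pattern, candidate)` to test a candidate, so that only the listed words are accepted. Merging common suffixes as well (for example by minimizing the underlying automaton) could shorten the pattern further, but that is beyond this sketch.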
10
votes
2 answers

Find simplest regular expression matching all given strings

Is there an algorithm that can produce a regular expression (maybe limited to a simplified grammar) from a set of strings such that the evaluation of all possible strings that match the regular expression reproduces the initial set of strings? It is…
fuenfundachtzig
  • 7,952
  • 13
  • 62
  • 87
8
votes
4 answers

Automatically built regex expressions that fit set of strings

We have written a system to analyse log messages from a large network. The system takes log messages from many different network elements and analyses them with regular expressions. For example, a user may have written two…
Archie
  • 6,391
  • 4
  • 36
  • 44
8
votes
1 answer

Genetic algorithm grammar induction program/code?

Does anyone know of a program that uses a GA to perform grammar induction/inference? I've read tonnes of research papers and articles on this stuff, like Lankhorst and De Pauw, but I can't find any implementations or programs that use this technique…
6
votes
3 answers

creating a regular expression for a list of strings

I have extracted a series of tables from the scientific literature which consist of columns, each of which is a distinct type. Here is an example. I'd like to be able to automatically generate regular expressions for each column. Obviously there are…
peter.murray.rust
  • 37,407
  • 44
  • 153
  • 217
5
votes
3 answers

Grammar inference library?

What are the best (or any) open source libraries for regular or context-free grammar inference from a set of examples believed to be generated by a common grammar? I'd prefer a good library in Java, Python or Ruby, but of course beggars can't be…
Lucas Wiman
  • 10,021
  • 2
  • 37
  • 41
3
votes
1 answer

Grammar Induction Program - Sequitur

Does anyone know of a program that does grammar induction? For example, where can I find the source code for the Sequitur context-free grammar program?
user562688
2
votes
2 answers

Generate RegEx from matches

I want to generate a regex pattern from a given set of matches. For example, I want to get \d+<\b> from the following array of matches: 1 2 3 4 5 ... Any ideas?
MrBassam
  • 349
  • 1
  • 5
  • 17
0
votes
1 answer

Automatically generating Regex from set of strings residing in DB using C#

I have about 100,000 strings in a database and I want to know if there is a way to automatically generate a regex pattern from these strings. All of them are alphabetic strings and use a subset of the English alphabet; (X, W, V) are not used, for example.…
Muhammad Adeel Zahid
  • 17,474
  • 14
  • 90
  • 155
-2
votes
2 answers

Generate JS Regex from a set of strings

Is there any way, or any library out there, that can compute a JS RegEx from a set of strings that I want to be matched? For example, given this set of strings: abc123 abc212, generate abc\d\d\d? Or given this set: aba111 abb111 abc, generate…
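
For the first example in this question (abc123 and abc212 generalizing to abc\d\d\d), a position-by-position generalization is one possible starting point. This is only a sketch under strong assumptions: the function name `generalize` is hypothetical, all inputs must have the same length, and the fallback classes (`\d`, `[A-Za-z]`, `.`) are an arbitrary choice made for illustration. It does not attempt the second, mixed-length example.

```python
import re

def generalize(strings):
    """Generalize equal-length strings column by column into one pattern."""
    if len({len(s) for s in strings}) != 1:
        raise ValueError("this sketch only handles equal-length strings")
    parts = []
    for column in zip(*strings):
        chars = set(column)
        if len(chars) == 1:
            # Every string agrees here: keep the literal character.
            parts.append(re.escape(column[0]))
        elif all(c in "0123456789" for c in chars):
            parts.append(r"\d")
        elif all(c.isascii() and c.isalpha() for c in chars):
            parts.append("[A-Za-z]")
        else:
            # Mixed character kinds: fall back to "any character".
            parts.append(".")
    return "".join(parts)

print(generalize(["abc123", "abc212"]))  # abc\d\d\d
```

The pattern string uses only syntax that JavaScript regular expressions also accept, so it could be passed to `new RegExp(...)` there, though the two engines differ on what `\d` matches outside ASCII.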