1

I'm writing a simple translator for John Tromp's Binary Lambda Calculus over to De Bruijn Notation Lambda Calculus so that I can understand how his Lambda files are working in his 2012 "Most Functional" International Obfuscated C Code winner

here is an example of the language before translation primes.blc:

00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110

I'm having trouble with a nested regex in commented line before the primes.txt file save section of Bruijn.pl:

#!/usr/bin/env perl
#use strict;
use warnings;
use IO::File;
use Cwd; my $originalCwd = getcwd()."/";
#primes.blc as argument for test conversion
#______________________________________________________________________open file
my ($name) = @ARGV;
$FILE = new IO::File;
$FILE->open("< ".$originalCwd."primes.blc") || die("Could not open file!");
#$FILE->open("< ".$name) || die("Could not open file!");
while (<$FILE>){ $field .= $_; }
$FILE->close;
#______________________________________________________________________Translate
$field =~ s/(00|01|(1+0))/$1 /gsm;
$field =~ s/00 /\\ /gsm;
$field =~ s/01 /(a /gsm;
$field =~ s/(1+)0 /length($1)." "/gsme;

$RecursParenthesesRegex = m/\(([^()]+|(??{$RecursParenthesesRegex}))*\)/;
#$field =~ 1 while s/(\(a){1}(([\s\\]+?(\d+|$RecursParenthesesRegex)){2})/\($2\)/sm;
#______________________________________________________________________save file
#$fh = new IO::File "> ".$name;
$fh = new IO::File "> ".$originalCwd."primes.txt";
if (defined $fh) { print $fh $field; $fh->close; }

An what the translated file primes.txt should be:

\ (\ (1 (1 ((\ (1 1) \ \ \ ((1 \ \ 1) (\ (((4 4) 1) (\ (1 1) \ (2 (1 1)))) \ \ \ \ ((1 3) (2 (6 4)))))) \ \ \ (4 (1 3))))) \ \ ((1 \ \ 2) 2))

Currently with the line commented out it translates to an almost readable format that looks like:

\ (a \ (a 1 (a 1 (a (a \ (a 1 1 \ \ \ (a (a 1 \ \ 1 (a \ (a (a (a 4 4 1 (a \ (a 1 1 \ (a 2 (a 1 1 \ \ \ \ (a (a 1 3 (a 2 (a 6 4 \ \ \ (a 4 (a 1 3 \ \ (a (a 1 \ \ 2 2 

Which needs to find innermost abstractions of (a and 2 of either a number or matching parentheses and all their contents and insert a trailing ) and remove the a all the way up to the outermost application.

GlassGhost
  • 16,906
  • 5
  • 32
  • 45
  • 4
    Instead of commenting out the `use strict` you should rather make your code pass with strict turned on. You also have a bunch don't need Time::HiRes. Please only show a [mcve]. – simbabque Apr 19 '16 at 18:18
  • Perhaps, `$RecursParenthesesRegex = m/(?\((?>[^()]++|(?&rec))*\))/;` – Wiktor Stribiżew Apr 19 '16 at 18:21
  • @simbabque it was made from my template perl file, my bad I'll edit the post. – GlassGhost Apr 19 '16 at 18:37
  • 2
    @MJSuriya: If the OP were to add line numbers to the source for you, it would mean that everyone else who wanted to run the code would have to remove them. Line 22 is the one that starts with `$field =~ 1 while` – Borodin Apr 19 '16 at 18:54
  • 1
    Why are you using `IO::File` instead of just `open`? And please don't use the regex `/m` and `/s` modifiers when they have no effect on the pattern. You code is complicated enough as it is – Borodin Apr 19 '16 at 19:19

2 Answers2

2

Although I don't understand your algorithm, this line is suspicious

$RecursParenthesesRegex = m/\(([^()]+|(??{$RecursParenthesesRegex}))*\)/

You're defining an undeclared variable in terms of whether a pattern that includes it matches $_

It's mistakes like this that use strict is meant to catch, but instead of fixing the errors you turned them off. That's not wise

At a guess you're trying to define a recursive pattern, so you need to use qr// instead of m//, and use (?0) or (?R) in the pattern

And lets call it $re instead shall we? Like this

my $re = qr/\(([^()]+|(?R))*\)/

Also, this line is odd

$field =~ 1 while s/(\(a){1}(([\s\\]+?(\d+|$RecursParenthesesRegex)){2})/\($2\)/sm

which compares the value of $field with the regex pattern 1 and discards the result as long as the substitution changes something in $_

Beyond that I cannot help you without a description of the algorithm and how your code relates to it

Borodin
  • 126,100
  • 9
  • 70
  • 144
2

You probably need a regex like this

 # (\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))

 ( \(a )                       # (1)
 (                             # (2 start)
      (                             # (3 start)
           [\s\\]*? 
           (?:
                \d+ 
             |  
                (?&RecursParens) 
           )
      ){2}                          # (3 end)
 )                             # (2 end)

 (?(DEFINE)

      (?<RecursParens>              # (4 start)
           (?>
                \(
                (?>
                     (?> [^()]+ )
                  |  (?:
                          (?= . )
                          (?&RecursParens) 
                       |  
                     )
                )+
                \)
           )
      )                             # (4 end)
 )

With a Perl code like this

use strict;
use warnings;
use feature qw{say};

my $field = "00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110";

$field =~ s/(00|01|(1+0))/$1 /g;
$field =~ s/00 /\\ /g;
$field =~ s/01 /(a /g;
$field =~ s/(1+)0 /length($1)." "/ge;

1 while $field =~ s/(\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))/\($2\)/g;

$field =~ s/\( /\(/g;

say $field;

That will give you an output like this

\ (\ (1 (1 ((\ (1 1) \ \ \ ((1 \ \ 1) (\ (((4 4) 1) (\ (1 1) \ (2 (1 1)))) \ \ \ \ ((1 3) (2 (6 4)))))) \ \ \ (4 (1 3))))) \ \ ((1 \ \ 2) 2))

That can be formatted to look like this

 \ 
 (                             # (1 start)
      \ 
      (                             # (2 start)
           1 
           (                             # (3 start)
                1 
                (                             # (4 start)
                     (                             # (5 start)
                          \ 
                          ( 1 1 )                       # (6)
                          \ \ \ 
                          (                             # (7 start)
                               ( 1 \ \ 1 )                   # (8)
                               (                             # (9 start)
                                    \ 
                                    (                             # (10 start)
                                         (                             # (11 start)
                                              ( 4 4 )                       # (12)
                                              1
                                         )                             # (11 end)
                                         (                             # (13 start)
                                              \ 
                                              ( 1 1 )                       # (14)
                                              \ 
                                              (                             # (15 start)
                                                   2 
                                                   ( 1 1 )                       # (16)
                                              )                             # (15 end)
                                         )                             # (13 end)
                                    )                             # (10 end)
                                    \ \ \ \ 
                                    (                             # (17 start)
                                         ( 1 3 )                       # (18)
                                         (                             # (19 start)
                                              2 
                                              ( 6 4 )                       # (20)
                                         )                             # (19 end)
                                    )                             # (17 end)
                               )                             # (9 end)
                          )                             # (7 end)
                     )                             # (5 end)
                     \ \ \ 
                     (                             # (21 start)
                          4 
                          ( 1 3 )                       # (22)
                     )                             # (21 end)
                )                             # (4 end)
           )                             # (3 end)
      )                             # (2 end)
      \ \ 
      (                             # (23 start)
           ( 1 \ \ 2 )                   # (24)
           2
      )                             # (23 end)
 )                             # (1 end)
  • Thanks @sln ! I think you remember me from earlier. You're a regex monster(in a good way). – GlassGhost Apr 23 '16 at 03:18
  • @GlassGhost - Glad it worked for you. That blc looks like cryptography with key and all. –  Apr 25 '16 at 22:06
  • Yeah, you should take a look at; http://viruliant.github.io AND http://stackoverflow.com/a/21769444/144020 same principle. I truly believe this is the language of the gods. – GlassGhost Apr 26 '16 at 02:44