How to read all the assembly instructions in a .txt file using the read predicate in Prolog?

Question

This is the .txt file with the assembly instructions:

brz END //comment
sub ONE
sta SECOND
lda RESULT //comment
add FIRST
bra LOOP

We have to load and read the .txt file, delete the comments, and create a list of the instructions like this:

L=[brz END,sub ONE,...]

Have you looked at the SWI Prolog documentation for file I/O? — lurker, Dec 04 '18 at 17:56
@TessellatingHeckler your plan is probably to break the input into lines and then do some separate step per line to parse them. But it turns out to be a lot easier in Prolog to go for the DCG approach all the way and parse the file directly into your target representation rather than doing it in two steps. (I had the same problem years ago when I started learning Prolog.) — Daniel Lyons, Dec 04 '18 at 20:37
@TessellatingHeckler I am busy writing an answer. (You aren't the OP though, so it's weird that you are commenting as though you are.) But you should understand that what makes sense to you from other languages as simple may not be simple in Prolog and other things that are simple in Prolog may not appear simple to you, as a beginner. But that isn't Prolog's fault. :) — Daniel Lyons, Dec 04 '18 at 20:53
@TessellatingHeckler yeah, Prolog I/O is fairly primitive. You could use various other solutions kicking around for reading lines (e.g., [this one on SO](https://stackoverflow.com/questions/4805601/read-a-file-line-by-line-in-prolog)). Looks like you also want to parse out your asm comments(?), which you'd have to deal with separately, but that would not be much code. — lurker, Dec 04 '18 at 21:13

score 4 · Answer 1 · answered Dec 04 '18 at 21:06

In a conventional language like Python you would be tempted to solve this problem with something like this:

import re

result = []
for line in open('file.txt'):
    # drop everything from '//' to the end of the line, then trim whitespace
    line = re.sub(r'//.*', '', line).strip()
    result.append(line)

In Prolog, you will instead find it simpler to write a full DCG for your input as if it were a grammar. Having a more powerful parsing framework right there in the core has sort of prevented Prolog from developing a large and complex suite of string- and character-banging functions. So I would expect that even if you did parse to strings, you would then be stuck again, this time for want of a regular expression library or ways of slicing and dicing strings, which just aren't there.

As with everything in Prolog, it's more wordy than you're probably used to, but there are advantages that are probably not obvious from the outset. Here's the code I came up with for your toy problem (which took me about 15 minutes).

:- use_module(library(pio)).
:- use_module(library(dcg/basics)).

comment --> "//", string_without("\n", _).
comment --> [].

optarget(A) --> string(S), { atom_codes(A, S) }.

instruction(inst(Op, Target)) --> optarget(Op), " ", whites, 
                                  optarget(Target), whites, comment, "\n".

instructions([Inst|Rest]) --> instruction(Inst), instructions(Rest).
instructions([]) --> [].

This will parse your example into something like this:

?- phrase_from_file(instructions(Inst), "test.txt").
Inst = [inst(brz, 'END'), inst(sub, 'ONE'), inst(sta, 'SECOND'), 
        inst(lda, 'RESULT'), inst(add, 'FIRST'), inst(bra, 'LOOP')] .
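The question asks for a list shaped like [brz END, sub ONE, ...], which is not valid Prolog syntax as written (END would be read as a variable). If one atom per instruction is what is wanted, a minimal sketch of a post-processing step could look like the following. The names inst_atom/2 and insts_atoms/2 are made up for this example, and atomic_list_concat/3 is an SWI-Prolog built-in rather than ISO.

```prolog
% Hypothetical helpers: turn inst(Op, Target) terms into atoms like 'brz END'.
inst_atom(inst(Op, Target), Atom) :-
    atomic_list_concat([Op, Target], ' ', Atom).

% Apply the conversion to every instruction in the list.
insts_atoms(Insts, Atoms) :-
    maplist(inst_atom, Insts, Atoms).
```

For example:

```prolog
?- insts_atoms([inst(brz, 'END'), inst(sub, 'ONE')], L).
L = ['brz END', 'sub ONE'].
```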

You should not feel like you are "abusing" dcg/basics by using it for things that are not related to HTTP. The library was extracted some time ago because of its general usefulness.

I'm using whites here to discard whitespace, but because it will succeed with nothing, you need an explicit space between the two optarget calls.
There are more interesting things you could do instead of optarget//1, like parse only your real instructions or only your real arguments, but I don't know what they are, so you're getting atoms here.
When it turns out your instructions take more arguments, you can add additional instruction//1 rules to handle them individually. That's probably what I would do, anyway.
If you realize a different representation would be more beneficial to downstream processing, it should be fairly easy to realize it by changing instruction//1 or instructions//1.
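As a rough illustration of the second and third notes above, here is a sketch that accepts only a fixed set of opcodes and adds a second instruction//1 rule for an opcode that takes no operand. The opcode/1 table, the hlt opcode, and the word//1 helper are assumptions invented for this example, since the question does not say what the real instruction set is. It also assumes there is whitespace between an operand and any trailing comment.

```prolog
:- use_module(library(dcg/basics)).

% Hypothetical opcode table; replace with the real instruction set.
opcode(brz). opcode(sub). opcode(sta).
opcode(lda). opcode(add). opcode(bra).
opcode(hlt).                      % assumed zero-operand instruction

% A word is one or more non-blank codes, returned as an atom.
word(A) --> nonblanks(Cs), { Cs = [_|_], atom_codes(A, Cs) }.

% An optional // comment running to the end of the line.
comment --> "//", string_without(`\n`, _).
comment --> [].

% Known opcode followed by exactly one operand.
instruction(inst(Op, Target)) -->
    word(Op), { opcode(Op) }, white, whites,
    word(Target), whites, comment, "\n".
% Known opcode with no operand.
instruction(inst(Op)) -->
    word(Op), { opcode(Op) }, whites, comment, "\n".

instructions([I|Is]) --> instruction(I), instructions(Is).
instructions([]) --> [].
```

A quick check against an in-memory code list (backquoted text is a code list in SWI-Prolog 7 and later):

```prolog
?- phrase(instructions(L), `brz END //comment\nhlt\n`).
L = [inst(brz, 'END'), inst(hlt)] .
```

A line whose first word is not in opcode/1 makes the whole parse fail, which may or may not be the behavior you want.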

Thank you for your answer, but I can't use this library for my project. I know that's a toy problem, but I started to learn Prolog one week ago. — Prologanti, Dec 04 '18 at 22:01
DCGs are not a library, they are a fundamental feature of Prolog. They're as separate from Prolog as dictionaries are from Python. The dcg/basics and pio libraries are separate, but their code is open-source. It would be manual labor of about four lines to copy the portion of dcg/basics I'm using here into your code. pio is probably another story, but even there it would not be terribly hard to add a small predicate to read the entirety of a file and pass it to `phrase/2`. — Daniel Lyons, Dec 04 '18 at 22:07
@TessellatingHeckler I'm referring to string-regex libraries. I understand your frustration, but it isn't helping you learn Prolog to be mad at it for not being Perl. Prolog really is quite different from other languages. If you want to be proficient at it, you must approach it on its own terms. I was once just as frustrated at Prolog for this exact reason. It's hard to explain in a comment what changed, but it has to do with Prolog taken holistically. Overall, the power you need is there and it is elegant. But you have to enlarge your perspective to grok it. — Daniel Lyons, Dec 05 '18 at 15:18
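For readers who cannot load dcg/basics or pio, the following is a sketch of what a library-free variant might look like, along the lines described in the comment above: read the whole file into a code list with plain stream built-ins and hand it to phrase/2. Every helper here (read_codes/2, word//1, rest_of_line//0, line_end//0, load_instructions/2) is hand-written for this example and is not the library's own code. It assumes operands never contain a '/' character, and it treats a carriage return as whitespace so that Windows line endings do not break the parse.

```prolog
% Read all character codes of File into a list using stream built-ins.
read_codes(File, Codes) :-
    open(File, read, Stream),
    read_stream_codes(Stream, Codes),
    close(Stream).

read_stream_codes(Stream, Codes) :-
    get_code(Stream, C),
    (   C =:= -1                        % get_code/2 yields -1 at end of file
    ->  Codes = []
    ;   Codes = [C|Rest],
        read_stream_codes(Stream, Rest)
    ).

% Hand-written stand-ins for the dcg/basics rules used in the answer.
white --> [C], { ( C =:= 32 ; C =:= 9 ; C =:= 13 ) }.  % space, tab, CR
whites --> white, !, whites.
whites --> [].

rest_of_line --> [C], { C =\= 0'\n }, !, rest_of_line.
rest_of_line --> [].

comment --> [0'/, 0'/], rest_of_line.
comment --> [].

% One or more printable codes other than '/', returned as an atom.
word(A) --> word_codes(Cs), { Cs = [_|_], atom_codes(A, Cs) }.
word_codes([C|Cs]) --> [C], { C > 32, C =\= 0'/ }, !, word_codes(Cs).
word_codes([]) --> [].

line_end --> [0'\n].
line_end([], []).                       % also accept end of input

instruction(inst(Op, Target)) -->
    word(Op), white, whites, word(Target), whites, comment, line_end.

instructions([I|Is]) --> instruction(I), instructions(Is).
instructions([]) --> [].

load_instructions(File, Insts) :-
    read_codes(File, Codes),
    phrase(instructions(Insts), Codes).
```

It would be called as `?- load_instructions('test.txt', Insts).` Note that this sketch does not close the stream if reading throws an error, and it does not handle blank lines.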
