This problem will go down much easier if we break it down into a few sub-problems. Let's try to parse the file into a direct representation of the file first and then tackle loading it into the database in the shape you want.
This kind of problem is a great fit for definite clause grammars (DCGs). It's quite natural to express complex grammars in Prolog using this technique, and you get an efficient implementation based on difference lists. If you're careful you can even use them to generate output as well as parse input!
First let's get the tremendously helpful dcg/basics library in there:
:- use_module(library(dcg/basics)).
I find it easier to write grammars in a top-down fashion, so let's break the input down into parts. First we have a list of elements. Then we have some number of "funct" lines. I don't really know the semantics of what you're trying to accomplish, so the names here are probably poor, but let's take a crack at it.
document(document(Elements, Functs)) -->
    element_list(Elements), blanks, funct_list(Functs).
The result of parsing is going to be a structure document(E, F), where E is an element list and F is a funct list. Notice that we're using --> instead of :-. This is how you define a DCG rule. Internally, Prolog rewrites the rule into an ordinary predicate with two extra arguments: the input before the rule runs and whatever is left of it afterward, which together form a difference list.
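To make the rewriting concrete, here is a minimal sketch that places a tiny DCG rule next to a hand-written predicate that behaves the same way. The names ab and ab_expanded are made up for illustration, and the exact clause a given Prolog system generates may differ in detail.

```prolog
% A tiny DCG rule: matches the two-item list [a, b].
ab --> [a], [b].

% Roughly what the translation amounts to, written by hand under a
% hypothetical name. S0 is the input before the rule, S is what is left.
ab_expanded(S0, S) :-
    S0 = [a|S1],   % consume a
    S1 = [b|S].    % consume b
```

Both phrase(ab, [a,b]) and ab_expanded([a,b], []) succeed, and ab_expanded([a,b,c], Rest) leaves Rest = [c], which is the "after" part of the difference list.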
Now let's do elements first because they're simpler:
element_list([E|Rest]) --> element(E), ",", element_list(Rest).
element_list([E]) --> element(E).
If you've seen a CFG before this should be pretty intuitive. We're getting an element, then a comma, then more elements, or else just an element. Now let's define element:
element(E) --> [Code], { atom_codes(E, [Code]) }.
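We can actually test these with phrase/2 right now (on SWI-Prolog 7 or later, where double-quoted text is a string object by default, write the input in backquotes or set the double_quotes flag to codes first):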
?- phrase(element(X), "a").
X = a.
Good, that's the result we want. You may have to expand this definition if you're going to have more than single-character elements.
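If elements can be longer than one character, one possible extension is sketched below. It assumes an element is a non-empty run of alphanumeric characters, which is a guess about your data; long_element and alnum_codes are hypothetical names, not part of the grammar above.

```prolog
% Hypothetical variant of element//1: a non-empty run of alphanumeric
% codes, converted to an atom.
long_element(E) -->
    alnum_codes(Cs),
    { Cs = [_|_], atom_codes(E, Cs) }.

% Collect alphanumeric codes, trying the longest match first.
alnum_codes([C|Cs]) --> [C], { code_type(C, alnum) }, alnum_codes(Cs).
alnum_codes([]) --> [].
```

For example, atom_codes('foo,bar', Cs), phrase(long_element(E), Cs, Rest) gives E = foo as its first answer, with Rest holding the codes of ",bar". Shorter matches remain available on backtracking, so expect leftover choice points.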
?- phrase(element_list(X), "a,b,c,d,e").
X = [a, b, c, d, e] ;
false.
So now we know what the first part of document/2 is going to look like on the way out of the parser: document([a,b,c,d,e], Functs). Looks just like the file, which is what we want. Our first task is just to bring in the file and all of its structure in a way that Prolog can work with.
Let's do the funct lists next:
funct_list([F|Rest]) --> functp(F), blanks, funct_list(Rest).
funct_list([F]) --> functp(F).
This looks just like the element list, but we're making functs instead of elements. Let's see what it's like to parse a funct:
functp(funct(E1, List)) -->
    "funct(", element(E1), ",", whites, "[", number_list(List), "])".
The part in quotes is basically literal text. Again, you may have to refine this depending on how flexible you want to be in parsing the file, but this will work fine with the sample input you posted. Now we need a number list:
number_list([N|Rest]) --> number(N), ",", number_list(Rest).
number_list([N]) --> number(N).
Again, just like the element list. This is actually everything we need to test it out. Let's put your example text in a file called file.txt (you can use whatever you actually have) and run it through phrase_from_file/2 to parse it. Make sure you do not have a spare newline at the end of the file; we didn't handle that case. Also, you have a typo on line 3 (missing parenthesis).
?- phrase_from_file(document(D), 'file.txt').
D = document([a, b, c, d, e],
             [funct(a, [1, 2, 3, 4, 5]),
              funct(b, [2, 4, 6, 8, 10]),
              funct(c, [1, 3, 5, 7|...]),
              funct(d, [1, 1, 2|...]),
              funct(e, [3, 7|...])]) ;
false.
Bingo, we have file parsing.
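The grammar as written rejects a file that ends with a spare newline. One way to tolerate that, sketched here under the assumption that trailing whitespace is always safe to ignore, is a small wrapper that runs a one-argument nonterminal and then skips any remaining blanks. The name with_trailing_blanks is hypothetical.

```prolog
:- use_module(library(dcg/basics)).

% Hypothetical wrapper: parse with the one-argument nonterminal NT,
% then skip any whitespace (including newlines) left at the end.
with_trailing_blanks(NT, Result) -->
    call(NT, Result),
    blanks.
```

With the grammar above loaded, the call would look like phrase_from_file(with_trailing_blanks(document, D), 'file.txt'). Adding blanks to the end of the document rule itself would have the same effect.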
Step two is to use this to create your funct/3 structures. Let's make a predicate to handle one funct/2. It takes the element list, pairs each element with the number in the same position, and generates a list of funct/3 terms.
do_normalize([E|Es], funct(F,[N|Ns]), [funct(F,E,N)|F3s]) :-
    do_normalize(Es, funct(F,Ns), F3s).
do_normalize([], funct(_, []), []).
Let's try it out:
?- do_normalize([a,b,c,d,e], funct(a,[1,2,3,4,5]), X).
X = [funct(a, a, 1), funct(a, b, 2), funct(a, c, 3), funct(a, d, 4), funct(a, e, 5)].
This looks pretty good so far!
Edit: And we're back.
The predicate above is good, but we need to apply it to each of the funct/2 terms coming in from the file to produce all the funct/3 terms. We can do that with maplist, but we need to bridge to that from the output of the parser. We'll also need append/2, because the results come back as a list of lists and we want one flat list.
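% normalize/2: turn a parsed document into a flat list of funct/3 terms.
normalize(document(Elements, Funct2s), Funct3s) :-
    normalize(Elements, Funct2s, NestedFunct3s),
    append(NestedFunct3s, Funct3s).
% normalize/3: one list of funct/3 terms per funct/2 term.
normalize(Elements, Funct2s, NestedFunct3s) :-
    maplist(do_normalize(Elements), Funct2s, NestedFunct3s).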
Now let's see if it works:
?- phrase_from_file(document(D), 'file.txt'), normalize(D, Normalized).
Normalized = [funct(a, a, 1),
              funct(a, b, 2),
              funct(a, c, 3),
              funct(a, d, 4),
              funct(a, e, 5),
              funct(b, a, 2),
              funct(b, b, 4),
              funct(b, c, 6),
              funct(..., ..., ...)|...]
So we're now two-thirds of the way there. We've successfully read the file and converted its contents into the structures we want in the database. Now we just need to put them into the database and we'll be done!
We must first tell Prolog that funct/3 is dynamic and can be modified at runtime:
:- dynamic funct/3.
We can use a forall/2 loop to run through the list and assert everything:
?- phrase_from_file(document(D), 'file.txt'),
   normalize(D, Normalized),
   forall(member(Fact, Normalized), assertz(Fact)).
Proof it's working:
?- funct(X, Y, Z).
X = Y, Y = a,
Z = 1 ;
X = a,
Y = b,
Z = 2 ;
X = a,
Y = c,
Z = 3 ;
X = a,
Y = d,
Z = 4 ;
X = a,
Y = e,
Z = 5
...
Now let's just package the whole works up in a single nice predicate:
load_funct(Filename) :-
    phrase_from_file(document(D), Filename),
    normalize(D, Functs),
    forall(member(Funct, Functs), assertz(Funct)), !.
Try it:
?- load_funct('file.txt').
true.
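One caveat: assertz/1 only ever adds clauses, so calling load_funct/1 twice on the same file leaves every fact in the database twice. If you would rather have a reload replace the old facts, a sketch along these lines may help. The helper name replace_functs is hypothetical, and it assumes nothing else in your program owns funct/3 facts.

```prolog
:- dynamic funct/3.

% Hypothetical helper: discard all stored funct/3 facts, then assert
% the given ones in order.
replace_functs(Facts) :-
    retractall(funct(_, _, _)),
    forall(member(Fact, Facts), assertz(Fact)).
```

Swapping the forall/2 line in load_funct/1 for a call to replace_functs(Functs) would give reload-safe behavior under that assumption.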
And you're done! Came to about 23 lines.
Hope this helps, and hope you enjoy Prolog and stick with it!