0

I had an unexpected error when parsing a (string) s-expression into a nested array/lists representing the AST. The s-expression comes from (SerAPI https://github.com/ejgallego/coq-serapi), but it looks fine to me:

sexp = b'(Answer 3(ObjList((CoqGoal((fg_goals(((name 3)(ty(App(Ind(((Mutind(MPfile(DirPath((Id Logic)(Id Init)(Id Coq))))(DirPath())(Id eq))0)(Instance())))((Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id bool))0)(Instance())))(App(Const((Constant(MPfile(DirPath((Id Nat)(Id Init)(Id Coq))))(DirPath())(Id odd))(Instance())))((App(Construct((((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)2)(Instance())))((Construct((((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)1)(Instance())))))))(Construct((((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id bool))0)1)(Instance()))))))(hyp()))))(bg_goals())(shelved_goals())(given_up_goals()))))))\n'

When I try to parse that I get the following error message:

train(policy,optimizer,env,gamma,nb_episodes=nb_episodes,ema_alpha=ema_alpha)
  File "main.py", line 88, in train
    state, reward, done, _ = env.step('Example test_oddb1: Nat.odd 1 = true.')
  File "/Users/korkejudith/home_simulation_research/coq-serapi-python/python_api/coq_env.py", line 120, in step
    state = self.state_embedder(state)
  File "/Users/korkejudith/home_simulation_research/coq-serapi-python/python_api/ai_mathematician.py", line 33, in __call__
    psexp = loads(str(sexp))
  File "/Users/korkejudith/miniconda3/envs/rltp/lib/python3.6/site-packages/sexpdata.py", line 243, in loads
    obj = parse(string, **kwds)
  File "/Users/korkejudith/miniconda3/envs/rltp/lib/python3.6/site-packages/sexpdata.py", line 675, in parse
    return Parser(string, **kwds).parse()
  File "/Users/korkejudith/miniconda3/envs/rltp/lib/python3.6/site-packages/sexpdata.py", line 655, in parse
    (i, sexp) = self.parse_sexp(0)
  File "/Users/korkejudith/miniconda3/envs/rltp/lib/python3.6/site-packages/sexpdata.py", line 641, in parse_sexp
    (i, subsexp) = self.parse_sexp(i + 1)
  File "/Users/korkejudith/miniconda3/envs/rltp/lib/python3.6/site-packages/sexpdata.py", line 642, in parse_sexp
    append(Quoted(subsexp[0]))
IndexError: list index out of range

I tried a different version of the string but none seem to work. Does someone know what might be going wrong?


Code to reproduce the error with different versions that failed for me:

from sexpdata import loads, dumps

def buffer_test():
    print('buffer_test')
    sexp = ''' b'(Answer 3(ObjList((CoqGoal((fg_goals(((name 3)(ty(Prod(Name(Id n))(Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Ind(((Mutind(MPfile(DirPath((Id Logic)(Id Init)(Id Coq))))(DirPath())(Id eq))0)(Instance())))((Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Const((Constant(MPfile(DirPath((Id Nat)(Id Init)(Id Coq))))(DirPath())(Id add))(Instance())))((Construct((((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)1)(Instance())))(Rel 1)))(Rel 1)))))(hyp()))))(bg_goals())(shelved_goals())(given_up_goals()))))))\n'
    '''
    print(f'sexp = {sexp}')
    psexp = loads(sexp)

def buffer_str_test():
    print('buffer_str_test')
    sexp = str(''' b'(Answer 3(ObjList((CoqGoal((fg_goals(((name 3)(ty(Prod(Name(Id n))(Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Ind(((Mutind(MPfile(DirPath((Id Logic)(Id Init)(Id Coq))))(DirPath())(Id eq))0)(Instance())))((Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Const((Constant(MPfile(DirPath((Id Nat)(Id Init)(Id Coq))))(DirPath())(Id add))(Instance())))((Construct((((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)1)(Instance())))(Rel 1)))(Rel 1)))))(hyp()))))(bg_goals())(shelved_goals())(given_up_goals()))))))\n'
    ''')
    print(f'sexp = {sexp}')
    psexp = loads(sexp)

def str_str_test():
    print('str_str_test')
    sexp = str(''' (Answer 3(ObjList((CoqGoal((fg_goals(((name 3)(ty(Prod(Name(Id n))(Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Ind(((Mutind(MPfile(DirPath((Id Logic)(Id Init)(Id Coq))))(DirPath())(Id eq))0)(Instance())))((Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Const((Constant(MPfile(DirPath((Id Nat)(Id Init)(Id Coq))))(DirPath())(Id add))(Instance())))((Construct((((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)1)(Instance())))(Rel 1)))(Rel 1)))))(hyp()))))(bg_goals())(shelved_goals())(given_up_goals()))))))\n'
    ''')
    print(f'sexp = {sexp}')
    psexp = loads(sexp)

def str_test():
    print('str_test')
    sexp = ''' (Answer 3(ObjList((CoqGoal((fg_goals(((name 3)(ty(Prod(Name(Id n))(Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Ind(((Mutind(MPfile(DirPath((Id Logic)(Id Init)(Id Coq))))(DirPath())(Id eq))0)(Instance())))((Ind(((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)(Instance())))(App(Const((Constant(MPfile(DirPath((Id Nat)(Id Init)(Id Coq))))(DirPath())(Id add))(Instance())))((Construct((((Mutind(MPfile(DirPath((Id Datatypes)(Id Init)(Id Coq))))(DirPath())(Id nat))0)1)(Instance())))(Rel 1)))(Rel 1)))))(hyp()))))(bg_goals())(shelved_goals())(given_up_goals()))))))'
    '''
    print(f'sexp = {sexp}')
    psexp = loads(sexp)

if __name__ == '__main__':
    print('running main')
    buffer_test()
    buffer_str_test()
    str_str_test()
    str_test()
    print('sucessful main')

I also opened this gitissue:

https://github.com/jd-boyd/sexpdata/issues/18


crossposted:

https://www.reddit.com/r/Python/comments/bg87nx/how_does_one_fix_a_parsing_error_with_what_seems/

https://www.quora.com/unanswered/How-does-one-fix-a-parsing-error-with-what-seems-a-valid-s-expression-in-Python

Charlie Parker
  • 5,884
  • 57
  • 198
  • 323
  • 1
    Without knowing about `sexpdata`, it seems to me that it could be about the string type. The `''' b'` for example seems like a big confusion to me. – fabianegli Apr 22 '19 at 22:57

1 Answers1

1

You need to fix sexp to be a normal string.


The failing code referenced in your stacktrace

        elif c == "'":
            (i, subsexp) = self.parse_sexp(i + 1)
            append(Quoted(subsexp[0]))
            sexp.extend(subsexp[1:])

suggests that somehow, you have an apostrophe in your expression. (Debugging would show that clearly.) Where could it come from?

  File "/Users/korkejudith/home_simulation_research/coq-serapi-python/python_api/ai_mathematician.py", line 33, in __call__
    psexp = loads(str(sexp))

This is the culprit. In Python 3, str(bytes) is its repr.

ivan_pozdeev
  • 33,874
  • 19
  • 107
  • 152
  • hmmm ok so somehow I am not converting it to a string properly. How do I convert it to a string properly? – Charlie Parker Apr 22 '19 at 23:14
  • ok so `sexp.decode('utf-8')` seems to work for me. But how does one choose an encoding? I sort of just chose one randomly. – Charlie Parker Apr 22 '19 at 23:30
  • @CharlieParker You need to choose it depending on what characters can appear in the bytestring and how they are supposed to be encoded (in case of e.g. national characters). If nothing except ASCII characters can appear, you can as well use `ascii`. – ivan_pozdeev Apr 22 '19 at 23:37
  • 1
    Hi, maintainer of sexpdata here; the problem is indeed in string quoting, it is just exacerbated by the fact that apostrophe is a lisp shorthand for quoting lists and symbols ( `'(foo)` is equivalent to `(quote (foo))`), so seeing `b'(foo)'` sexpdata was trying to parse it as symbol `b`, followed by `'(foo)`, followed by `'` and raised a cryptic error from it. I have a fix pending to be merged in a short while. – immerrr Apr 24 '19 at 07:12
  • 1
    Actually, writing this comment made me think we should probably remove the quoting shorthand implementation, because it really belongs in LISP, not s-expression parsing per se. – immerrr Apr 24 '19 at 07:15
  • @immerrr so what should I do with my code/string? right now encoding to `utf-8` makes the parser work. – Charlie Parker Apr 25 '19 at 15:19
  • As Ivan has correctly pointed out, you need to replace `str(sexp)` with `sexp.decode('utf-8')`, as that is the correct way of converting `bytes` to `string` in Python. – immerrr Apr 26 '19 at 16:23