I'm working with transformers.

I have a python script there, that takes arguments as input via argparse.

Here is part of it:

    parser = argparse.ArgumentParser()
    parser.add_argument("--model_type", default=None, type=str, required=True,
                        help="Model type selected in the list: " + ", ".join(MODEL_CLASSES.keys()))
    parser.add_argument("--model_name_or_path", default=None, type=str, required=True,
                        help="Path to pre-trained model or shortcut name selected in the list: " + ", ".join(ALL_MODELS))

I want to be able to call the script repeatedly with different arguments. I can call the script with %run or !python <bash command>, but I cannot pass variables to it as arguments, because the variable names are treated as literal strings:

    %run examples/run_lm_finetuning.py --gradient_accumulation_steps=1 --output_dir='output_medium' --model_type='gpt2' \
                          --model_name_or_path=model_load --do_train --train_data_file='/root/sharedfolder/omri/data/pieces/backup/{}'.format(file)\
                          --overwrite_output_dir --per_gpu_train_batch_size=1 --per_gpu_eval_batch_size=1 --save_total_limit=5

Returns:

    OSError: Model name 'model_load' was not found in
    model name list (gpt2, gpt2-medium, gpt2-large, gpt2-xl, distilgpt2).

    We assumed 'model_load' was a path or url to a configuration file named
    config.json or a directory containing such a file but couldn't find any
    such file at this path or url.
hpaulj
pyxai

3 Answers

It looks like {} expands variables. I stumbled on this while trying to use Python f-string-style formatting. I don't see it in the %run docs; it must be part of the general %magic syntax.

Anyways, with a simple echo script:

    In [29]: cat echo.py
    import sys
    print(sys.argv)

    In [30]: foo = "a string"

    In [31]: run echo.py {foo} bar
    ['echo.py', 'a', 'string', 'bar']
    In [32]: run echo.py "{foo}" bar
    ['echo.py', 'a string', 'bar']
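The difference between the two calls above comes from the order of operations: the variable is substituted into the line first, and the result is then split into arguments in a shell-like way. The plain-Python sketch below approximates that with `str.format` and `shlex.split`. It only illustrates the behaviour and is not IPython's actual implementation; `expand_and_split` is a hypothetical helper name.

```python
import shlex

def expand_and_split(line, namespace):
    """Substitute {name} from namespace, then split the line like a shell."""
    expanded = line.format(**namespace)
    return shlex.split(expanded)

ns = {"foo": "a string"}

# Without quotes the expanded value is split on its whitespace.
unquoted = expand_and_split("echo.py {foo} bar", ns)
# With quotes around the placeholder the value stays one argument.
quoted = expand_and_split('echo.py "{foo}" bar', ns)

print(unquoted)  # ['echo.py', 'a', 'string', 'bar']
print(quoted)    # ['echo.py', 'a string', 'bar']
```

So a variable holding a path with spaces should be wrapped in quotes on the %run line.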

===

With another magic

    In [71]: astr="*.h5"
    In [72]: ls {astr}
    abc_copy.h5  string.h5          testdate.h5       test_str.h5...

===

$ also does this:

    In [79]: foo = "a string"
    In [80]: run echo.py $foo bar
    ['echo.py', 'a', 'string', 'bar']

How to pass a variable to magic ´run´ function in IPython

IPython.core.magic.no_var_expand(magic_func)

Mark a magic function as not needing variable expansion

By default, IPython interprets {a} or $a in the line passed to magics as variables that should be interpolated from the interactive namespace before passing the line to the magic function. This is not always desirable, e.g. when the magic executes Python code (%timeit, %time, etc.). Decorate magics with @no_var_expand to opt-out of variable expansion.

https://ipython.readthedocs.io/en/stable/api/generated/IPython.core.magic.html

hpaulj

So model_load in %run should be interpreted as a Python variable? That would be a little weird, don't you think? Try calling python directly from Python instead of going through IPython magic functions:

    In [18]: import subprocess

    In [19]: model_load = "gpt2"

    In [20]: subprocess.run(f"python file.py --model_name_or_path={model_load}", shell=True)
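A variant of the same idea, sketched under the assumption that you want to avoid shell quoting problems and also see the script's output: pass the arguments as a list and capture stdout. The inline `-c` snippet is only a stand-in for the real script, which is not reproduced here.

```python
import subprocess
import sys

model_load = "gpt2"

# Stand-in for the real script: it just echoes the arguments it received.
echo_args = "import sys; print(sys.argv[1:])"

result = subprocess.run(
    [sys.executable, "-c", echo_args, f"--model_name_or_path={model_load}"],
    capture_output=True,  # collect stdout/stderr instead of losing them
    text=True,            # decode bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
print(result.stdout)
```

Because no shell is involved, a value containing spaces or quotes is passed through as a single argument without extra escaping.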
RafalS
  • It's not weird, since model_load is actually the path to the weights: after I finish training on a single file with the default weights, I need to redirect the path. What I really need, though, is to iterate through the files; this was just an example. Your solution is great, but it doesn't print stdout in Jupyter, so I'm posting an improvement on your suggestion. – pyxai Dec 15 '19 at 13:10

RafalS's answer is great, but it doesn't print stdout. If you want to see the output, here is how:

Put the entire command in an f-string and assign it to a variable run:

    run = f"python examples/run_lm_finetuning.py\
                        --model_name_or_path={model_load}\
                        --train_data_file='/root/sharedfolder/omri/data/pieces/backup/{file}'\
                        --overwrite_output_dir\
                        --per_gpu_train_batch_size=1\
                        --per_gpu_eval_batch_size=1\
                        --save_total_limit=5"

Then run through Jupyter shell by simply:

!{run}
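Since the original goal was to call the script once per data file, the same command can also be built in a loop. The sketch below is one possible way to do that from plain Python while still printing the child's output line by line. `build_command` and `stream` are hypothetical helpers, the file names are placeholders, and only flags that appear in the question are used, so the list may be incomplete for a real run.

```python
import shlex
import subprocess
import sys

def build_command(script, model_load, data_file):
    """Return the argument list for one run (flags taken from the question)."""
    return [
        sys.executable, script,
        "--model_type=gpt2",
        f"--model_name_or_path={model_load}",
        "--output_dir=output_medium",
        "--do_train",
        f"--train_data_file={data_file}",
        "--overwrite_output_dir",
    ]

def stream(args):
    """Run args as a child process, echoing its output as it arrives."""
    with subprocess.Popen(args, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True) as proc:
        for line in proc.stdout:
            print(line, end="")
    return proc.returncode

# Placeholder file names; substitute the real training files.
commands = [build_command("examples/run_lm_finetuning.py", "gpt2", name)
            for name in ["part1.txt", "part2.txt"]]
for args in commands:
    print(shlex.join(args))  # show the command; call stream(args) to run it
```

`shlex.join` requires Python 3.8 or later; on older versions the list can be printed directly.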
pyxai