
It seems we have two different ways to activate the current opam switch environment. So my questions are:

  1. Which one is the preferred method?
  2. What is the difference between the two? eval $(opam env --switch=$SWITCH --set-switch) vs opam switch set $SWITCH

Thanks!


Context

I need to change opam envs from within Python because of my application (there is no way around this).

Usually I do:

eval $(opam env --switch={switch} --set-switch)

but this gives an issue (see end).

Thus, going to try:

opam switch set {switch}

Are these truly equivalent?

(Note: in Python, opam switch set {switch} seems to work, but I would still like to understand why there are two versions.)


For context error:

Traceback (most recent call last):
  File "/lfs/ampere4/0/brando9/iit-term-synthesis/iit-term-synthesis-src/data_pkg/data_gen.py", line 510, in <module>
    main()
  File "/lfs/ampere4/0/brando9/iit-term-synthesis/iit-term-synthesis-src/data_pkg/data_gen.py", line 497, in main
    asyncio.run(create_dataset(path_2_save_new_dataset_all_splits=args.path_to_save_new_dataset,
  File "/dfs/scratch0/brando9/anaconda/envs/iit_synthesis/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/dfs/scratch0/brando9/anaconda/envs/iit_synthesis/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/lfs/ampere4/0/brando9/iit-term-synthesis/iit-term-synthesis-src/data_pkg/data_gen.py", line 437, in create_dataset
    coq_proj_data: DataCoqProj = await get_coq_proj_data(coq_proj, split)
  File "/lfs/ampere4/0/brando9/iit-term-synthesis/iit-term-synthesis-src/data_pkg/data_gen.py", line 194, in get_coq_proj_data
    path2filenames_raw: list[str] = strace_build_coq_project_and_get_filenames(coq_proj)
  File "/afs/cs.stanford.edu/u/brando9/pycoq/pycoq/opam.py", line 706, in strace_build_coq_project_and_get_filenames
    activate_opam_switch(switch)
  File "/afs/cs.stanford.edu/u/brando9/pycoq/pycoq/opam.py", line 892, in activate_opam_switch
    raise e
  File "/afs/cs.stanford.edu/u/brando9/pycoq/pycoq/opam.py", line 886, in activate_opam_switch
    res = subprocess.run(command.split(), check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/dfs/scratch0/brando9/anaconda/envs/iit_synthesis/lib/python3.9/subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/dfs/scratch0/brando9/anaconda/envs/iit_synthesis/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/dfs/scratch0/brando9/anaconda/envs/iit_synthesis/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'eval'

I think it has something to do with calling subprocesses from within Python, which I don't fully understand.


Why is subprocess not inheriting the env vars from the main Python process, and why do later subprocess calls seem to see state changed by earlier ones?

[quote="Frederic_Loyer, post:22, topic:10957"] A process can’t change the environment of an other process. Then opam can’t change the parent process environment (bash or Python). [/quote]

I've confirmed this. What I do is run opam switch set coq-8.10 from a python subprocess:

        #     opam_set_switch_via_opam_switch('coq-8.10')
        result = subprocess.run(f'opam switch set {switch}'.split(), check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

which the docs say returns a CompletedProcess.

Then I get the environment variables seen by a subprocess by running opam env in another subprocess, and compare them with those of the main Python process in os.environ. The two indeed don't match:

    # opam_env_dict: dict = get_variables_from_opam_env_output(py_prints_on=py_prints_on)
    result = subprocess.run('opam env'.split(), capture_output=True, text=True)
    # ... compare with os.environ
    assert uutils.check_dict1_is_in_dict2(opam_env_dict, os.environ, verbose=True)

assert fails

--> k='OPAM_SWITCH_PREFIX' is in dict2 but with different value 
dict1[k]='/Users/brandomiranda/.opam/coq-8.10' 
dict2[k]='/Users/brandomiranda/.opam/test'

The only thing that confuses me is that subprocess seems to have its own process that does remember things. I say this because I would have expected the new subprocess that calls opam env not to be affected by the first opam switch set coq-8.10, but it seems it was. I expected the second subprocess to be spawned from the main Python process and to be independent from the process that called opam switch set coq-8.10.


refs:

Charlie Parker
  • Note that `eval $(something)` is categorically buggy and should as a rule be replaced with `eval "$(something)"`. The former word-splits the output of `something`, expands each word as a glob, and then concatenates the resulting words together with spaces before running the result through the parser. The latter runs the output of `something` _directly_ through the shell parser without preceding steps. – Charles Duffy Feb 17 '23 at 23:34
  • Bigger picture, though: _What is the output written to stdout by `opam env --switch="$SWITCH" --set-switch`?_ The only time `eval` is appropriate is if that command writes well-formed, safely-escaped shell commands to its stdout. _If this is in fact what it does_, then there's a good reason to use `eval`: Only using `eval` _or running a function_ can modify the state of your already-active shell. (In the `opam switch` case, if this command is expected to modify shell state, `opam` is presumably expected to be a function already source'd or eval'd into your current shell). – Charles Duffy Feb 17 '23 at 23:36
  • (As another aside, `$var` references should be quoted for similar reasons -- hence `"$SWITCH"` above instead of bare `$SWITCH`). – Charles Duffy Feb 17 '23 at 23:40
  • (Similarly, as a whole, all-caps variable names should be avoided except when you're referring to a name meaningful to the shell or other operating-system-defined tools; see [the relevant POSIX specification](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html) defining the namespace of variables with at least one lower-case character as reserved for application use; as an application, if your script sticks to that namespace for its own names it's guaranteed not to stomp on variables meaningful to the system by mistake). – Charles Duffy Feb 17 '23 at 23:40
  • ...and yes, `eval` isn't an external command you can `exec`, so it's completely normal that you can't run it as a subprocess. It's shell syntax, so it can only be run _inside a shell_. – Charles Duffy Feb 17 '23 at 23:42
  • (Moreover, note that because the purpose of `eval`ing content is to change the state of the current shell, eval'ing content in a short-lived shell -- like that you get from using `shell=True` in `subprocess.Popen` -- only has effect for that shell's short life; it doesn't change the calling process, or other shells later invoked by that caller). – Charles Duffy Feb 17 '23 at 23:43
  • very relevant: https://discuss.ocaml.org/t/is-eval-opam-env-switch-switch-set-switch-equivalent-to-opam-switch-set-switch/10957/25 – Charlie Parker Feb 21 '23 at 00:29
  • @CharlesDuffy is it really true that subprocess only has an effect in that open shell? Then why does it seem that when I keep calling `subprocess.run` the env vars keep changing, see the new bounty section in my original question (although the ones from the main python process did not, judging by `os.environ`). – Charlie Parker Feb 21 '23 at 00:31
  • I have trouble believing that, unless you're on a non-UNIXy operating system. `subprocess.run` can only change the variables of the new subprocess it starts, and it's always been so. – Charles Duffy Feb 21 '23 at 19:30
  • See f/e [Can a shell script set environment variables of a calling shell?](https://stackoverflow.com/questions/496702/can-a-shell-script-set-environment-variables-of-the-calling-shell) -- though the design limitations aren't specific to shell scripts; _no_ code written in _any_ language can modify its parent process's environment variables without that parent process's cooperation using only standard interfaces, and the nonstandard alternatives are highly irregular (think about things like connecting to your parent process with a debugger and forcing `setenv()` to be invoked). – Charles Duffy Feb 21 '23 at 19:31
  • (in `eval "$(something)"`, the `eval` is the "cooperation" referenced in the above comment; the shell that runs `eval` is running the stdout of `something` as code, which of course requires trusting `something` and its authors) – Charles Duffy Feb 21 '23 at 19:35
  • It could well be that `opam switch set coq-8.10` is adjusting filesystem content -- setting up symlinks or similar; _that_ wouldn't require adjusting environment variables. – Charles Duffy Feb 21 '23 at 20:40
  • @CharlesDuffy btw, thanks for all your feedback! Much more helpful than you think :) Btw, one final question, why can't we run `eval` in a python subprocess? I truly never understood that. – Charlie Parker Feb 22 '23 at 19:07

1 Answer


You seem to be using a Python function that tries to find an eval program. No such program exists: eval is a builtin command of the Bourne shell, not an executable on disk. This explains the error.

Given the way opam works, a useful approach is:

  1. Launch opam env --switch my_switch --set-switch,
  2. Parse its output (multiple lines in VAR=value; export VAR; syntax, where the ; separators and the export commands can be ignored)
  3. Change the environment (os.environ[var] = value)

Afterward, the right ocaml will be found through the modified PATH and will be set up properly.
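
The three steps above could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: the function names parse_opam_env and activate_switch_in_this_process are hypothetical, and it assumes opam prints sh-style lines of the form VAR='value'; export VAR; (other shells, such as fish or csh, get a different syntax that this parser does not handle).

```python
import os
import shlex
import subprocess


def parse_opam_env(output: str) -> dict:
    """Parse sh-style lines  VAR='value'; export VAR;  into a dict.

    Assumption: one assignment per line, followed by an export command
    that can be ignored.
    """
    env = {}
    for line in output.splitlines():
        # shlex removes the shell quoting around the value.
        tokens = shlex.split(line)
        if not tokens or '=' not in tokens[0]:
            continue
        assignment = tokens[0]
        # The first token still carries the trailing ';' separator.
        if assignment.endswith(';'):
            assignment = assignment[:-1]
        name, _, value = assignment.partition('=')
        env[name] = value
    return env


def activate_switch_in_this_process(switch: str) -> None:
    """Hypothetical helper: run opam env and copy the result into os.environ."""
    result = subprocess.run(
        ['opam', 'env', '--switch', switch, '--set-switch'],
        check=True, capture_output=True, text=True,
    )
    # Subprocesses started later inherit os.environ by default.
    os.environ.update(parse_opam_env(result.stdout))
```

Because os.environ is what later subprocess.run calls inherit by default, the builds launched afterwards from the same Python process should see the updated PATH.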


Back to the original question: what is the difference between eval $(opam env --switch={switch} --set-switch) and opam switch set SWITCH?

There are three ways to set the current opam switch: using a --switch option with all your opam commands, setting the OPAMSWITCH environment variable (eval $(opam env --switch={switch} --set-switch) does this), and setting a global state stored in your .opam directory with opam switch set.

All three ways work well with the opam command, but OCaml programs (ocamlc, ocamlopt, ...) are not selected automatically when you set a switch. You then have two ways to make your selected switch effective: using opam exec with every command, or using eval $(opam env) or eval $(opam env --switch SWITCH --set-switch) to set the PATH variable and some others. If you have an interactive shell with the opam hooks installed, eval $(opam env) will be implicit, and opam switch set will then be the handiest way of selecting another switch.

The eval command MUST be executed in a shell that is the parent of all the commands you want to execute in the selected switch. So from a Python program, either run every command through opam exec, or mimic eval as proposed in the first part of the answer.
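
For the opam exec route, a sketch along these lines may help. The helper names opam_exec_argv and run_in_switch are hypothetical; the argument list follows the opam exec --switch SWITCH -- COMMAND form and is passed as a list, so no shell is involved and nothing needs quoting.

```python
import subprocess


def opam_exec_argv(switch, command):
    """Build the argv that runs `command` (a list of strings) in `switch`."""
    return ['opam', 'exec', '--switch', str(switch), '--'] + list(command)


def run_in_switch(switch, command, cwd=None):
    """Hypothetical helper: run one build command inside the given switch."""
    return subprocess.run(
        opam_exec_argv(switch, command),
        cwd=cwd,            # e.g. the project directory to build in
        check=True,         # raise CalledProcessError on a non-zero exit
        capture_output=True,
        text=True,
    )
```

A call such as run_in_switch('coq-8.10', ['make'], cwd=project_dir), with project_dir standing in for the project's directory, would let opam build the environment for that single command without touching os.environ.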

Frédéric LOYER
  • I want to build another ocaml project/pkg from within the main python process (regardless of how I build it, with opam install, opam pin, make, make -C etc, even if I dispatch those from python). So that any of those builds inherits the right opam switch env, we always need to update os.environ. Right? – Charlie Parker Feb 20 '23 at 23:32
  • The environment of Python you have setup with `os.environ` is inherited. – Frédéric LOYER Feb 21 '23 at 10:11
  • I should have written `VAR=value; export VAR;` syntax... – Frédéric LOYER Feb 21 '23 at 16:32
  • @CharlieParker Of course, this is all moot if you `eval` opam's output **before** your Python process starts, instead of trying to fix things up _after_. – Charles Duffy Feb 21 '23 at 19:33
  • what I really want is whenever I run a new subprocess that it is on the right opam switch. I don't think I necessarily need the main python process to be on the right switch, only when I build future ocaml projs/pkgs that they have the right switch. – Charlie Parker Feb 21 '23 at 19:35
  • Right, but if you can have it set up right to begin with all the complexities go away: You have a shell interpreter, so you can `eval` instructions quoted as shell commands without needing to interpret them yourself as this answer instructs you to. – Charles Duffy Feb 21 '23 at 19:36
  • Not that there aren't other options -- you can have a subprocess start a new shell, eval opam output inside that new shell, and then run an arbitrary command of your choice _inside that same shell_, f/e -- but they're certainly more work. – Charles Duffy Feb 21 '23 at 19:36
  • An alternative: launching every command through `opam exec --switch my_switch my_command`, then OPAM will create the environment for its command. However, it will cause some delay. My proposed solution is more efficient if you have multiple commands to launch (but needs more programming). – Frédéric LOYER Feb 21 '23 at 20:47
  • @CharlesDuffy impossible to set it up before. By definition of the problem I need to build ocaml projects/pkgs from python. That is non-negotiable. Each one has its own opam switch. – Charlie Parker Feb 21 '23 at 23:59
  • Then the `opam exec` approach suggested above is the right one. – Charles Duffy Feb 22 '23 at 00:24
  • @FrédéricLOYER I will be going with the `f"opam exec --switch {switch} -- {ocaml_build_cmd}"` approach of running that from python. One also has to set up the cmd of the subprocess I believe. Once it works I will provide my code. I happened to also implement the thing that modifies the main python os.environ. Will share it too. Note the subprocesses created from the main python process don't necessarily inherit the os.environ as their env vars, so one needs to pass them explicitly or just use opam exec with the right switch. Note we can't use eval in the subprocess subshell for some reason. – Charlie Parker Feb 22 '23 at 19:04
  • tl;dr: I will use `f"opam exec --switch {switch} -- {ocaml_build_cmd}"` for building ocaml projects from python since that is the safest way to guarantee the right opam switch is set up. – Charlie Parker Feb 22 '23 at 19:04
  • one final question, why can't we run `eval` in a python subprocess? I truly never understood that. – Charlie Parker Feb 22 '23 at 19:06
  • @CharlieParker, you _can_ run `eval` if you have `shell=True`, just with a caveat that most of the reasons one uses `eval` (to set variables in the shell, change the shell's working directory, etc) are defeated by running it in a very-short-lived shell (where those variables / new directory / etc disappear when the shell exits) – Charles Duffy Feb 22 '23 at 19:23
  • @CharlieParker, btw, using f-strings to build code to execute with `shell=True` is a major security risk unless you use `shlex.quote()`. Much safer is to pass your parameters out-of-band from the code. `['opam', 'exec', '--switch', str(switch), '--'] + ocaml_build_args` with ocaml_build_args being another list/array is the Right Approach. – Charles Duffy Feb 22 '23 at 19:23
  • If you _don't_ have `shell=True`, then there _is no subshell at all_, which explains why you "can't use eval in the subprocess subshell". If you can't use eval even with `shell=True`... ask a new question about that with a [mre] and tag me in; I'll be able to explain what's going on when I can see it happen on my own machine, but not before. – Charles Duffy Feb 22 '23 at 19:25
  • (mind, technically, `subprocess` doesn't _ever_ start a subshell directly: a subshell is a shell created by `fork()`ing a prior shell without any `execve()`; when you `fork()` a Python interpreter you get another Python interpreter, not a shell at all, so to start a shell you need to use `execve()` to start it, so you get a subprocess that's a shell, but not a subshell) – Charles Duffy Feb 22 '23 at 19:26
  • If you create a subprocess with a shell which only executes eval, it will configure itself for a switch, and the configuration will be lost as soon as the shell exits. If you really want the eval option in a subprocess, you have to make this subprocess do something useful. Like execute: sh -c "eval $(…); your_command". But executing "opam exec …" will be handier and, as Charles Duffy said, safer. – Frédéric LOYER Feb 22 '23 at 19:54
  • I don’t know why you want to follow a Python way, but if you want to avoid a shell dependency, you should avoid the « eval » way and prefer other ways like the « opam exec ». – Frédéric LOYER Feb 22 '23 at 20:03
  • Also, note that `opam install` has a `--switch` option… then no need to use `opam exec` for this command. You can also do `os.environ['OPAMSWITCH']= my_switch`, then launch `opam install`. But this works only with opam. A direct ocaml compiler execution will not read this variable. – Frédéric LOYER Feb 22 '23 at 20:29
  • @FrédéricLoyer the reason I need to do it in python is because the machine learning (ML) interface for this is in python and ML is mostly in python (if that question was for me). – Charlie Parker Feb 22 '23 at 23:12
  • @FrédéricLoyer I wish it worked with `opam install` but my understanding is that some of the projects I have don't have `proj.opam` files so I can only build them with their pre-specified build command e.g. it's often just `make` but sometimes it's something else like `./configure.sh && make`. – Charlie Parker Feb 22 '23 at 23:13
  • You can have a Python script for ML stuff and shell for compilation/installation. Note that `./configure.sh && make` is a shell expression. – Frédéric LOYER Feb 23 '23 at 16:44
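
The "eval and the command inside the same short-lived shell" option mentioned in the comments above could look roughly like this. It is a sketch under assumptions: run_after_eval is a hypothetical helper, a POSIX sh is assumed to be on PATH, and the env-printing command (for example opam env) is trusted to emit well-formed, safely escaped shell code.

```python
import shlex
import subprocess


def run_after_eval(env_cmd, cmd):
    """Run `cmd` in a shell that first evals the stdout of `env_cmd`.

    Both arguments are lists of strings; shlex.join quotes each element
    so the generated script stays safe against word splitting and globs.
    """
    script = 'eval "$({})" && exec {}'.format(
        shlex.join(env_cmd), shlex.join(cmd)
    )
    # The variables set by eval live only in this sh process and in the
    # command it execs; the calling Python process is left unchanged.
    return subprocess.run(
        ['sh', '-c', script], check=True, capture_output=True, text=True
    )
```

With opam this might be called as run_after_eval(['opam', 'env', '--switch', switch, '--set-switch'], ['make']). A shell expression such as ./configure.sh && make would have to be passed as ['sh', '-c', './configure.sh && make'], since it is not a single program.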