7

I'm programming a CLI using clap to parse my arguments. I want to provide defaults for options, but if there's a config file, the config file should win against defaults.

It's easy to prioritize command line arguments over defaults, but I want the priority order of:

  1. command line arguments
  2. config file
  3. defaults

If the config file isn't set by the command line options, it's also easy to set that up, just by parsing the config file before running parse_args, and supplying the values from the parsed config file into default_value. The problem is that if you specify the config file in the command line, you can't change the defaults until after the parsing.

The only way I can think of doing it is by not setting a default_value and then manually match "" in value_of. The issue is that in that case, clap won't be able to build a useful --help.

Is there a way to get clap to read the config file itself?

Shepmaster
picked name

3 Answers

6

For users of clap v3 or v4 who want to use the derive macro: I solved this by making two structs. One is the target struct; the other is identical except that every field is optional. I parse the second struct from the config file with serde and from the command line with clap, then merge the results into the first struct. Fields that are None were not present in the config file or on the command line.
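To make the idea concrete, here is a minimal hand-written sketch of the two-struct merge using only the standard library. The names Settings, PartialSettings and merge are hypothetical and chosen for illustration; they are not the API of any crate, and the two partial values merely stand in for what serde and clap would produce.

```rust
/// Fully resolved settings (the "target" struct). Field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    name: String,
    retries: u32,
}

impl Default for Settings {
    fn default() -> Self {
        Settings { name: "anonymous".to_string(), retries: 3 }
    }
}

/// Same fields, all optional: `None` means "this layer did not set it".
#[derive(Debug, Clone, Default, PartialEq)]
struct PartialSettings {
    name: Option<String>,
    retries: Option<u32>,
}

impl Settings {
    /// Overwrite only the fields that the given layer actually provides.
    fn merge(mut self, layer: PartialSettings) -> Self {
        if let Some(name) = layer.name {
            self.name = name;
        }
        if let Some(retries) = layer.retries {
            self.retries = retries;
        }
        self
    }
}

fn main() {
    // Stand-ins for what serde (file) and clap (command line) would produce.
    let from_file = PartialSettings { name: Some("from-file".into()), retries: Some(5) };
    let from_cli = PartialSettings { name: None, retries: Some(9) };

    // Lowest priority first: defaults, then config file, then command line.
    let settings = Settings::default().merge(from_file).merge(from_cli);
    println!("{:?}", settings);
}
```

Applying the layers from lowest to highest priority means each later layer only overrides what it explicitly sets.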

To facilitate this I created a derive macro (ClapSerde) which automatically:

  • creates the structure with optional fields
  • derives clap Parser and serde Deserialize on it
  • provides methods to merge from clap and from the deserialized (with serde) struct with optional fields into the target struct; this can be used to create a layered config parser, i.e. for the case requested
// Priority:
// 1. command line arguments (clap)
// 2. config file (serde)
// 3. defaults
Args::from(serde_parsed)
    .merge_clap();
  • implements Default (with optional custom values) on the target struct, which is used when neither layer has a value for the field.

Example:

use clap_serde_derive::{
    clap::{self, ArgAction},
    serde::Serialize,
    ClapSerde,
};

#[derive(ClapSerde, Serialize)]
#[derive(Debug)]
#[command(author, version, about)]
pub struct Args {
    /// Input files
    pub input: Vec<std::path::PathBuf>,

    /// String argument
    #[arg(short, long)]
    name: String,

    /// Skip serde deserialize
    #[default(13)]
    #[serde(skip_deserializing)]
    #[arg(long = "num")]
    pub clap_num: u32,

    /// Skip clap
    #[serde(rename = "number")]
    #[arg(skip)]
    pub serde_num: u32,

    /// Recursive fields
    #[clap_serde]
    #[command(flatten)]
    pub suboptions: SubConfig,
}

#[derive(ClapSerde, Serialize)]
#[derive(Debug)]
pub struct SubConfig {
    #[default(true)]
    #[arg(long = "no-flag", action = ArgAction::SetFalse)]
    pub flag: bool,
}

fn main() {
    let args = Args::from(serde_yaml::from_str::<<Args as ClapSerde>::Opt>("number: 12").unwrap())
        .merge_clap();
    println!("{:?}", args);
}

Note the above needs the following in Cargo.toml:

[dependencies]
clap = "*"
serde = "*"
serde_yaml = "*"
clap-serde-derive = "*"

There are already many crates on crates.io that aim to achieve similar results (e.g. viperus, twelf, layeredconf), but they use old versions of clap without derive and/or have no way to define distinct defaults for clap and serde.
I hope this derive macro is useful.

UPDATE

You can also take the config file path from the command line, like this:

use std::{fs::File, io::BufReader};

use clap_serde_derive::{
    clap::{self, Parser},
    ClapSerde,
};

#[derive(Parser)]
#[clap(author, version, about)]
struct Args {
    /// Input files
    input: Vec<std::path::PathBuf>,

    /// Config file
    #[clap(short, long = "config", default_value = "config.yml")]
    config_path: std::path::PathBuf,

    /// Rest of arguments
    #[clap(flatten)]
    pub config: <Config as ClapSerde>::Opt,
}

#[derive(ClapSerde)]
struct Config {
    /// String argument
    #[clap(short, long)]
    name: String,
}

fn main() {
    // Parse whole args with clap
    let mut args = Args::parse();

    // Get config file
    let config = if let Ok(f) = File::open(&args.config_path) {
        // Parse config with serde
        match serde_yaml::from_reader::<_, <Config as ClapSerde>::Opt>(BufReader::new(f)) {
            // merge config already parsed from clap
            Ok(config) => Config::from(config).merge(&mut args.config),
            Err(err) => panic!("Error in configuration file:\n{}", err),
        }
    } else {
        // If there is no config file, use only the config parsed from clap
        Config::from(&mut args.config)
    };
}
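A different technique, not part of clap-serde-derive, is to pre-scan the raw arguments for the config path before the real parser is built, so that the file can be loaded first. The sketch below uses only the standard library; find_config_path is a hypothetical helper, and it assumes the option is spelled --config and ignores short flags and the `--` separator.

```rust
/// Return the value following `--config` (or given as `--config=PATH`), if any.
/// A deliberately small pre-scan; it does not understand short flags or `--`.
fn find_config_path(args: &[String]) -> Option<String> {
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if arg.as_str() == "--config" {
            // The path is the next argument, if there is one.
            return iter.next().cloned();
        }
        if let Some(path) = arg.strip_prefix("--config=") {
            return Some(path.to_string());
        }
    }
    None
}

fn main() {
    // Skip the program name, then look only for the config path.
    let args: Vec<String> = std::env::args().skip(1).collect();
    match find_config_path(&args) {
        Some(path) => println!("would load config from {}", path),
        None => println!("no --config given, using built-in defaults"),
    }
}
```

The trade-off is that the option's spelling is duplicated between the pre-scan and the real parser, so the two have to be kept in sync by hand.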
DPD-
  • I think this solution is missing the most important aspect of the question "The problem is that if you specify the config file in the command line ...". – John Vandenberg Dec 23 '22 at 03:19
  • @JohnVandenberg My bad I didn't explicitly write it, but taking advantage of the merge functionality of clap-serde-derive you can easily achieve it. There is an example on the documentation: https://crates.io/crates/clap-serde-derive#config-path-from-command-line. – DPD- Dec 23 '22 at 20:35
  • In practice you declare two structs: one deriving ClapSerde for the config readable from both the command line and the file, the other deriving only clap's Parser, containing a field for the config file path and flattening the first struct. Then you parse the second struct with clap and retrieve the config file path. If the file exists, you use clap-serde-derive to load it and merge the already clap-parsed config on top of it. – DPD- Dec 23 '22 at 20:42
  • To be noted that the crate is AGPL, making it a non-starter for most people. – lu_zero Apr 04 '23 at 13:39
4

From clap's documentation on default_value:

NOTE: If the user does not use this argument at runtime ArgMatches::is_present will still return true. If you wish to determine whether the argument was used at runtime or not, consider ArgMatches::occurrences_of which will return 0 if the argument was not used at runtime.

https://docs.rs/clap/2.32.0/clap/struct.Arg.html#method.default_value

This can be used to get the behavior you described (the code below targets clap 2.x, matching the linked documentation):

extern crate clap;
use clap::{App, Arg};
use std::fs::File;
use std::io::prelude::*;

fn main() {
    let matches = App::new("MyApp")
        .version("0.1.0")
        .about("Example for StackOverflow")
        .arg(
            Arg::with_name("config")
                .short("c")
                .long("config")
                .value_name("FILE")
                .help("Sets a custom config file"),
        )
        .arg(
            Arg::with_name("example")
                .short("e")
                .long("example")
                .help("Sets an example parameter")
                .default_value("default_value")
                .takes_value(true),
        )
        .get_matches();

    let mut value = String::new();

    if let Some(c) = matches.value_of("config") {
        let file = File::open(c);
        match file {
            Ok(mut f) => {
                // Note: I have a file `config.txt` that has contents `file_value`
                f.read_to_string(&mut value).expect("Error reading value");
            }
            Err(_) => println!("Error reading file"),
        }

        // Note: this lets us override the config file value with the
        // cli argument, if provided
        if matches.occurrences_of("example") > 0 {
            value = matches.value_of("example").unwrap().to_string();
        }
    } else {
        value = matches.value_of("example").unwrap().to_string();
    }

    println!("Value for config: {}", value);
}

// Code above licensed CC0
// https://creativecommons.org/share-your-work/public-domain/cc0/ 

Resulting in the behavior:

./target/debug/example
Value for config: default_value
./target/debug/example --example cli_value
Value for config: cli_value
./target/debug/example --config config.txt
Value for config: file_value
./target/debug/example --example cli_value --config config.txt
Value for config: cli_value
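
The decision behind those four runs can be isolated into a small pure function. This is a sketch using only the standard library: the Origin enum and resolve function are hypothetical names standing in for the information that occurrences_of provides in clap 2. In newer clap releases, where occurrences_of is no longer available, ArgMatches::value_source should carry the equivalent information, though that is not shown here.

```rust
/// How the command-line layer obtained its value.
/// Hypothetical stand-in for what `occurrences_of` reports in clap 2.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Origin {
    /// The user typed the option.
    CommandLine,
    /// The parser filled in its default.
    Default,
}

/// Pick the effective value: explicit CLI value, else config file, else default.
fn resolve(cli_value: &str, origin: Origin, file_value: Option<&str>) -> String {
    match (origin, file_value) {
        // An explicitly typed option always wins.
        (Origin::CommandLine, _) => cli_value.to_string(),
        // Otherwise the config file overrides the parser default.
        (Origin::Default, Some(from_file)) => from_file.to_string(),
        // No file value: fall back to the parser default.
        (Origin::Default, None) => cli_value.to_string(),
    }
}

fn main() {
    // The same four cases as the runs above, in the same order.
    println!("{}", resolve("default_value", Origin::Default, None));
    println!("{}", resolve("cli_value", Origin::CommandLine, None));
    println!("{}", resolve("default_value", Origin::Default, Some("file_value")));
    println!("{}", resolve("cli_value", Origin::CommandLine, Some("file_value")));
}
```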
Shepmaster
Walther
0

My solution was to use clap (version 4.2.1) + confy (version 0.5.1).
"confy takes care of figuring out operating system specific and environment paths before reading and writing a configuration."

This solution doesn't need to specify the config file on the command line.
The configuration file will be generated automatically and will have the same name as the main program.

I created a program called 'make_args' with the following files:

My Cargo.toml:

[package]
name = "make_args"
version = "0.1.0"
edition = "2021"

[dependencies]
confy = "0.5"
toml = "0.7"
serde_derive = "1"
serde = { version = "1", features = [ "derive" ] }
clap = { version = "4", features = [
    "derive",
    "color",
    "env",
    "help",
] }

The main.rs:

use std::error::Error;
mod args;
use args::Arguments;

fn main() -> Result<(), Box<dyn Error>> {

    let _args: Arguments = Arguments::build()?;

    Ok(())
}

And the module args.rs:

use serde::{Serialize, Deserialize};
use clap::{Parser, CommandFactory, Command};
use std::{
    default,
    error::Error,
    path::PathBuf,
};
 
/// Read command line arguments with priority order:
/// 1. command line arguments
/// 2. environment
/// 3. config file
/// 4. defaults
///
/// At the end add or update config file.
/// 
#[derive(Debug, Clone, PartialEq, Parser, Serialize, Deserialize)]
#[command(author, version, about, long_about = None, next_line_help = true)]
pub struct Arguments {
    /// The first file with CSV format.
    #[arg(short('1'), long, required = true)]
    pub file1: Option<PathBuf>,

    /// The second file with CSV format.
    #[arg(short('2'), long, required = true)]
    pub file2: Option<PathBuf>,

    /// Optionally, enter the delimiter for the first file.
    /// The default delimiter is ';'.
    #[arg(short('a'), long, env("DELIMITER_FILE1"), required = false)]
    pub delimiter1: Option<char>,

    /// Optionally, enter the delimiter for the second file.
    /// The default delimiter is ';'.
    #[arg(short('b'), long, env("DELIMITER_FILE2"), required = false)]
    pub delimiter2: Option<char>,

    /// Print additional information in the terminal
    #[arg(short('v'), long, required = false)]
    verbose: Option<bool>,
}

/// confy needs to implement the default Arguments.
impl default::Default for Arguments {
    fn default() -> Self {
        Arguments {
            file1: None, 
            file2: None, 
            delimiter1: Some(';'), 
            delimiter2: Some(';'),  
            verbose: Some(true),
        }
    }
}

impl Arguments {

    /// Build Arguments struct
    pub fn build() -> Result<Self, Box<dyn Error>> {

        let app: Command = Arguments::command();
        let app_name: &str = app.get_name();

        let args: Arguments = Arguments::parse()
            .get_config_file(app_name)?
            .set_config_file(app_name)?
            .print_config_file(app_name)?;

        Ok(args)
    }

    /// Get configuration file.
    /// A new configuration file is created with default values if none exists.
    fn get_config_file(mut self, app_name: &str) -> Result<Self, Box<dyn Error>> {

        let config_file: Arguments = confy::load(app_name, None)?;

        self.file1 = self.file1.or(config_file.file1);
        self.file2 = self.file2.or(config_file.file2);
        self.delimiter1 = self.delimiter1.or(config_file.delimiter1);
        self.delimiter2 = self.delimiter2.or(config_file.delimiter2);
        self.verbose = self.verbose.or(config_file.verbose);

        Ok(self)
    }

    /// Save changes made to a configuration object
    fn set_config_file(self, app_name: &str) -> Result<Self, Box<dyn Error>> {
        confy::store(app_name, None, self.clone())?;
        Ok(self)
    }

    /// Print configuration file path and its contents
    fn print_config_file (self, app_name: &str) -> Result<Self, Box<dyn Error>> {

        if self.verbose.unwrap_or(true) {

            let file_path: PathBuf = confy::get_configuration_file_path(app_name, None)?;
            println!("Configuration file: '{}'", file_path.display());

            let toml: String = toml::to_string_pretty(&self)?;
            println!("\t{}", toml.replace('\n', "\n\t"));
        }

        Ok(self)
    }
}

After running cargo without args, the output was:

cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/make_args`
error: the following required arguments were not provided:
  --file1 <FILE1>
  --file2 <FILE2>

Usage: make_args --file1 <FILE1> --file2 <FILE2>

For more information, try '--help'.

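Note that 'required' can be set to 'true' or 'false'. With required = true, clap rejects the invocation before the config file is read, so a value stored in the config file cannot stand in for a missing argument.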

#[arg(short('1'), long, required = true)]
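
If the config file should be able to supply these values, one option is to set required = false and check for missing fields only after the layers have been merged. The following is a minimal sketch of that check using only the standard library; Merged, Inputs and validate are hypothetical names and a trimmed-down stand-in for the Arguments struct above.

```rust
use std::path::PathBuf;

/// Merged result of command line and config file; `None` means no layer set it.
/// Hypothetical, trimmed-down version of the struct above.
#[derive(Debug, Default)]
struct Merged {
    file1: Option<PathBuf>,
    file2: Option<PathBuf>,
}

/// Validated inputs: both paths are guaranteed to be present.
#[derive(Debug, PartialEq)]
struct Inputs {
    file1: PathBuf,
    file2: PathBuf,
}

impl Merged {
    /// Check required fields only after all layers have been merged.
    fn validate(self) -> Result<Inputs, String> {
        let file1 = self
            .file1
            .ok_or_else(|| "file1 is missing from both command line and config file".to_string())?;
        let file2 = self
            .file2
            .ok_or_else(|| "file2 is missing from both command line and config file".to_string())?;
        Ok(Inputs { file1, file2 })
    }
}

fn main() {
    // Example: only the first file was provided by any layer.
    let merged = Merged { file1: Some(PathBuf::from("/tmp/file1.csv")), file2: None };
    match merged.validate() {
        Ok(inputs) => println!("{:?}", inputs),
        Err(message) => eprintln!("error: {}", message),
    }
}
```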

And running cargo with some arguments, the output was:

cargo run -- -1 /tmp/file1.csv -2 /tmp/file2.csv 
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/make_args -1 /tmp/file1.csv -2 /tmp/file2.csv`
Configuration file: '/home/claudio/.config/make_args/default-config.toml'
    file1 = "/tmp/file1.csv"
    file2 = "/tmp/file2.csv"
    delimiter1 = ";"
    delimiter2 = ";"
    verbose = true
Claudio Fsr