Scala: CLI args parser
In many CLI-based apps, one has to take one or more program arguments to drive the execution workflow.
Hand-wiring these requires quite a bit of boilerplate, so a lightweight, declarative CLI processor is a reasonable choice.
This article provides a light introduction to using scopt.
scopt is a command-line option parser. It can help with extracting:
- string, int, file, boolean,… arg params in a type-safe manner.
- tuple args
- comma separated args as list
- equals+comma separated args as map
Besides the extraction above, it also builds the help (usage) text for the args.
Let’s try to understand this using an example program and its output.
A CLI program would run successfully when the following args are provided.
-i /tmp/tar_input -o /tmp/output -x xml -c 2 -w overwrite
When no args are passed to the CLI program, it reports the missing option and prints the usage text:
Error: Missing option --output
tar-to-parquet-converter 1.0
Usage: tar-to-parquet-converter [options]
-i, --input <value> input directory containing tar.gz files
-f, --input-format <value>
input format (default: tar.gz)
-o, --output <value> output directory for parquet files
-s, --chunk-size <value>
num records per output (default: 2000)
-x, --file-extension <value>
file extension (default: xml)
-w, --write-mode <value>
write-mode to output (default: append-only)
-c, --cores <value> cores (default: Available cores / 10)
Let’s break this down into sections…
programName("tar-to-parquet-converter")
head("tar-to-parquet-converter", "1.0")
opt[String]('i', "input")
The scopt option parser uses a builder to take the above args.
Each opt value (such as input, output, …) is declared as follows:
// required field
opt[String]('i', "input")
  .required()
  .action((x, c) => c.copy(inputDir = x))
  .text(s"input directory containing ${default.inputFormat} files")

// optional field with a string arg that becomes a boolean value
opt[String]('w', "write-mode")
  .action {
    case ("overwrite", c)   => c.copy(overwrite = true)
    case ("append-only", c) => c.copy(overwrite = false)
    case (_, c)             => c // any other value leaves the config unchanged
  }
  .text("write-mode to output (default: append-only)")

// Int value captured
opt[Int]('c', "cores")
  .action((x, c) => c.copy(cores = x))
  .text(s"cores (default: Available cores / ${default.cores})")
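Note that the write-mode action above silently ignores any value other than overwrite or append-only. A minimal sketch of rejecting such values with scopt's validate follows. WriteConfig and WriteModeDemo are hypothetical names used only for this illustration, and the snippet assumes scopt 4.x is on the classpath.

```scala
import scopt.OParser

// Hypothetical, trimmed-down config used only for this sketch.
case class WriteConfig(overwrite: Boolean = false)

object WriteModeDemo {
  private val allowed = Set("overwrite", "append-only")

  private val parser: OParser[Unit, WriteConfig] = {
    val builder = OParser.builder[WriteConfig]
    import builder._
    OParser.sequence(
      programName("write-mode-demo"),
      opt[String]('w', "write-mode")
        // A failed validation is reported as an error and parse returns None.
        .validate(x =>
          if (allowed.contains(x)) success
          else failure(s"write-mode must be one of: ${allowed.mkString(", ")}"))
        .action((x, c) => c.copy(overwrite = x == "overwrite"))
        .text("write-mode to output (default: append-only)")
    )
  }

  // Some(config) when the args are valid, None otherwise.
  def parse(args: Seq[String]): Option[WriteConfig] =
    OParser.parse(parser, args, WriteConfig())
}
```

With this in place, a typo such as -w overwrit fails fast instead of quietly falling back to the default.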
The builder can be initialized using the following fragment.
import scopt.OParser

val builder = OParser.builder[Config]
val default = Config() // supplies the default values quoted in the help text

val parser = {
  import builder._
  OParser.sequence(
    programName("tar-to-parquet-converter"),
    head("tar-to-parquet-converter", "1.0"),
    opt[String]('i', "input")
      .required()
      .action((x, c) => c.copy(inputDir = x))
      .text(s"input directory containing ${default.inputFormat} files")
  )
}
Here is a complete example:
import scopt.OParser

case class Config(
  inputDir: String = "",
  outputDir: String = "",
  chunkSize: Int = 2000,
  inputFormat: String = "tar.gz",
  fileExtn: String = "xml",
  overwrite: Boolean = false,
  cores: Int = Runtime.getRuntime.availableProcessors()
)

object Config {
  def buildCLIParser: OParser[Unit, Config] = {
    val builder = OParser.builder[Config]
    val default = Config() // supplies the default values quoted in the help text
    import builder._
    OParser.sequence(
      programName("tar-to-parquet-converter"),
      head("tar-to-parquet-converter", "1.0"),
      opt[String]('i', "input")
        .required()
        .action((x, c) => c.copy(inputDir = x))
        .text(s"input directory containing ${default.inputFormat} files"),
      opt[String]('f', "input-format")
        .action((x, c) => c.copy(inputFormat = x))
        .text(s"input format (default: ${default.inputFormat})"),
      opt[String]('o', "output")
        .required()
        .action((x, c) => c.copy(outputDir = x))
        .text("output directory for parquet files"),
      opt[Int]('s', "chunk-size")
        .action((x, c) => c.copy(chunkSize = x))
        .text(s"num records per output (default: ${default.chunkSize})"),
      opt[String]('x', "file-extension")
        .action((x, c) => c.copy(fileExtn = x))
        .text(s"file extension (default: ${default.fileExtn})"),
      opt[String]('w', "write-mode")
        .action {
          case ("overwrite", c)   => c.copy(overwrite = true)
          case ("append-only", c) => c.copy(overwrite = false)
          case (_, c)             => c
        }
        .text("write-mode to output (default: append-only)"),
      opt[Int]('c', "cores")
        .action((x, c) => c.copy(cores = x))
        .text(s"cores (default: Available cores / ${default.cores})")
    )
  }
}
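The parser definition by itself does not parse anything; it has to be run against the program args. The sketch below shows one way to do that with OParser.parse, which returns an Option of the config. MiniConfig and MiniCli are hypothetical, reduced stand-ins for the article's Config so that the snippet stays self-contained, and scopt 4.x is assumed to be on the classpath.

```scala
import scopt.OParser

// Hypothetical, reduced config; the article's Config has more fields.
case class MiniConfig(inputDir: String = "", outputDir: String = "")

object MiniCli {
  val parser: OParser[Unit, MiniConfig] = {
    val builder = OParser.builder[MiniConfig]
    import builder._
    OParser.sequence(
      programName("mini-cli"),
      opt[String]('i', "input")
        .required()
        .action((x, c) => c.copy(inputDir = x)),
      opt[String]('o', "output")
        .required()
        .action((x, c) => c.copy(outputDir = x))
    )
  }

  // The third argument is the initial config that the actions copy from.
  def parse(args: Seq[String]): Option[MiniConfig] =
    OParser.parse(parser, args, MiniConfig())

  def main(args: Array[String]): Unit =
    parse(args.toSeq) match {
      case Some(config) =>
        println(s"Converting ${config.inputDir} into ${config.outputDir}")
      case None =>
        // Bad arguments: scopt has already printed the error message.
        sys.exit(1)
    }
}
```

Keeping parse separate from main makes the argument handling easy to unit test without starting the whole program.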
Other opt examples
Possible opt types:
// Mandatory String arg
opt[String]('i', "input")
  .required()
  .action((x, c) => c.copy(inputDir = x))
// -i value_1
// --input value_1

// Optional Int arg
opt[Int]('s', "chunk-size")
  .action((x, c) => c.copy(chunkSize = x))
// -s 300
// --chunk-size 300

// Optional List value args (needs: import java.io.File)
opt[Seq[File]]('j', "jars")
  .valueName("<jar1>,<jar2>...")
  .action((x, c) => c.copy(jars = x))
  .text("jars to include")
// --jars foo.jar,bar.jar

// Optional Map value args
opt[Map[String, String]]("kwargs")
  .valueName("k1=v1,k2=v2...")
  .action((x, c) => c.copy(kwargs = x))
  .text("other arguments")
// --kwargs key1=val1,key2=val2

// Arg without value
opt[Unit]("verbose")
  .action((_, c) => c.copy(verbose = true))
  .text("verbose is a flag")
// --verbose

// Hidden arg. Useful for options that should not be advertised in the help.
opt[Unit]("debug")
  .hidden()
  .action((_, c) => c.copy(debug = true))
  .text("this option is hidden in the usage text")

// opt using explicit abbr()
opt[Unit]("not-keepalive")
  .abbr("nk")
  .action((_, c) => c.copy(keepalive = false))
  .text("disable keepalive")
// -nk
// --not-keepalive
The supported opt types are listed at https://github.com/scopt/scopt?tab=readme-ov-file#options
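Tuple args are mentioned in the introduction but not shown above. The following is a minimal sketch of a key=value pair option, modelled on the style of the scopt README. LimitConfig and TupleDemo are hypothetical names, and the snippet assumes scopt 4.x, where the value is passed as a single key=value token.

```scala
import scopt.OParser

// Hypothetical config holding one key=value pair.
case class LimitConfig(libName: String = "", maxCount: Int = -1)

object TupleDemo {
  private val parser: OParser[Unit, LimitConfig] = {
    val builder = OParser.builder[LimitConfig]
    import builder._
    OParser.sequence(
      programName("tuple-demo"),
      opt[(String, Int)]("max")
        // Names shown for the key and the value in the usage text.
        .keyValueName("<libname>", "<max>")
        // Reject non-positive counts before the action is applied.
        .validate { case (_, v) =>
          if (v > 0) success else failure("Value <max> must be > 0")
        }
        .action { case ((k, v), c) => c.copy(libName = k, maxCount = v) }
        .text("maximum count for <libname>")
    )
  }

  def parse(args: Seq[String]): Option[LimitConfig] =
    OParser.parse(parser, args, LimitConfig())
}
// --max libfoo=5
```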
checkConfig
Once the opts are declared, combinations of their values can be validated using checkConfig:
checkConfig(c =>
  if (c.keepalive && c.xyz) failure("xyz cannot keep alive")
  else success
)
// Here 'c' is of type Config
case class Config(keepalive: Boolean, xyz: Boolean)
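To see the cross-field check in action, here is a small self-contained sketch. KeepaliveConfig and CheckConfigDemo are hypothetical names, the default values are assumptions made for the illustration, and scopt 4.x is assumed to be on the classpath.

```scala
import scopt.OParser

// Hypothetical config; the defaults are assumptions for this sketch.
case class KeepaliveConfig(keepalive: Boolean = true, xyz: Boolean = false)

object CheckConfigDemo {
  private val parser: OParser[Unit, KeepaliveConfig] = {
    val builder = OParser.builder[KeepaliveConfig]
    import builder._
    OParser.sequence(
      programName("check-config-demo"),
      opt[Unit]("not-keepalive")
        .action((_, c) => c.copy(keepalive = false)),
      opt[Boolean]("xyz")
        .action((x, c) => c.copy(xyz = x)),
      // Evaluated against the final config, after the options are applied.
      checkConfig(c =>
        if (c.keepalive && c.xyz) failure("xyz cannot keep alive")
        else success)
    )
  }

  def parse(args: Seq[String]): Option[KeepaliveConfig] =
    OParser.parse(parser, args, KeepaliveConfig())
}
```

Here --xyz true on its own is rejected because keepalive defaults to true, while --not-keepalive --xyz true is accepted.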
command
scopt also supports a feature called command. This is best understood through a popular CLI example.
In git, for instance, branch and checkout are commands of the CLI program, and the -r option is only applicable when the branch command is used.
A command can be declared in scopt as follows:
// single command
cmd("update")
  .action((_, c) => c.copy(mode = "update"))
  .text("update is a command.")
  .children(
    opt[Unit]("not-keepalive")
      .abbr("nk")
      .action((_, c) => c.copy(keepalive = false))
      .text("disable keepalive"),
    opt[Boolean]("xyz")
      .action((x, c) => c.copy(xyz = x))
      .text("xyz is a boolean property"),
    checkConfig(c =>
      if (c.keepalive && c.xyz) failure("xyz cannot keep alive")
      else success)
  )
// update -nk --xyz false

// nesting 'update' under the 'backend' command, using an unbound arg
cmd("backend")
  .text("commands to manipulate backends:\n")
  .action((_, c) => c.copy(flag = true))
  .children(
    cmd("update").children(
      arg[String]("<a>").action((x, c) => c.copy(a = x))
    )
  )
// [prg] backend update val1
// here a='val1'
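The command fragments above can be put together into a runnable sketch like the one below. CmdConfig, CommandDemo and the <target> arg are hypothetical names chosen for the illustration, and scopt 4.x is assumed to be on the classpath.

```scala
import scopt.OParser

// Hypothetical config recording which command ran and its inputs.
case class CmdConfig(mode: String = "", keepalive: Boolean = true, target: String = "")

object CommandDemo {
  private val parser: OParser[Unit, CmdConfig] = {
    val builder = OParser.builder[CmdConfig]
    import builder._
    OParser.sequence(
      programName("command-demo"),
      cmd("update")
        .action((_, c) => c.copy(mode = "update"))
        .text("update is a command.")
        .children(
          // Option and positional arg that belong to the update command.
          opt[Unit]("not-keepalive")
            .action((_, c) => c.copy(keepalive = false))
            .text("disable keepalive"),
          arg[String]("<target>")
            .action((x, c) => c.copy(target = x))
            .text("what to update")
        )
    )
  }

  def parse(args: Seq[String]): Option[CmdConfig] =
    OParser.parse(parser, args, CmdConfig())
}
// [prg] update --not-keepalive val1
```

Recording the command name in the config (mode here) lets the calling code dispatch on it after parsing.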
For a complete reference, please refer to https://github.com/scopt/scopt?tab=readme-ov-file#scopt