References:
http://www.gnu.org/software/libc/manual/html_node/Example-of-Getopt.html
http://en.wikipedia.org/wiki/Getopt
http://www.lemoda.net/c/getopt/
http://www.ibm.com/developerworks/aix/library/au-unix-getopt.html
http://stackoverflow.com/questions/16483119/example-of-how-to-use-getopt-in-bash
Example of Parsing Arguments with getopt
Here is an example showing how getopt is typically used. The key points to notice are:
- Normally, getopt is called in a loop. When getopt returns -1, indicating no more options are present, the loop terminates.
- A switch statement is used to dispatch on the return value from getopt. In typical use, each case just sets a variable that is used later in the program.
- A second loop is used to process the remaining non-option arguments.
    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main (int argc, char **argv)
    {
      int aflag = 0;
      int bflag = 0;
      char *cvalue = NULL;
      int index;
      int c;

      opterr = 0;

      while ((c = getopt (argc, argv, "abc:")) != -1)
        switch (c)
          {
          case 'a':
            aflag = 1;
            break;
          case 'b':
            bflag = 1;
            break;
          case 'c':
            cvalue = optarg;
            break;
          case '?':
            if (optopt == 'c')
              fprintf (stderr, "Option -%c requires an argument.\n", optopt);
            else if (isprint (optopt))
              fprintf (stderr, "Unknown option `-%c'.\n", optopt);
            else
              fprintf (stderr, "Unknown option character `\\x%x'.\n", optopt);
            return 1;
          default:
            abort ();
          }

      printf ("aflag = %d, bflag = %d, cvalue = %s\n", aflag, bflag, cvalue);

      for (index = optind; index < argc; index++)
        printf ("Non-option argument %s\n", argv[index]);
      return 0;
    }
Here are some examples showing what this program prints with different combinations of arguments:
    % testopt
    aflag = 0, bflag = 0, cvalue = (null)

    % testopt -a -b
    aflag = 1, bflag = 1, cvalue = (null)

    % testopt -ab
    aflag = 1, bflag = 1, cvalue = (null)

    % testopt -c foo
    aflag = 0, bflag = 0, cvalue = foo

    % testopt -cfoo
    aflag = 0, bflag = 0, cvalue = foo

    % testopt arg1
    aflag = 0, bflag = 0, cvalue = (null)
    Non-option argument arg1

    % testopt -a arg1
    aflag = 1, bflag = 0, cvalue = (null)
    Non-option argument arg1

    % testopt -c foo arg1
    aflag = 0, bflag = 0, cvalue = foo
    Non-option argument arg1

    % testopt -a -- -b
    aflag = 1, bflag = 0, cvalue = (null)
    Non-option argument -b

    % testopt -a -
    aflag = 1, bflag = 0, cvalue = (null)
    Non-option argument -
Example of getopt in C
    #include <stdio.h>
    /* getopt is defined in "unistd.h". */
    #include <unistd.h>

    int main (int argc, char ** argv)
    {
        int i;

        while (1) {
            /* getopt returns an int; storing it in a char breaks the
               comparison with -1 on platforms where char is unsigned. */
            int c;

            c = getopt (argc, argv, "ab:");
            if (c == -1) {
                /* We have finished processing all the arguments. */
                break;
            }
            switch (c) {
            case 'a':
                printf ("User has invoked with -a.\n");
                break;
            case 'b':
                printf ("User has invoked with -b %s.\n", optarg);
                break;
            case '?':
            default:
                printf ("Usage: %s [-a] [-b <something>].\n", argv[0]);
            }
        }

        /* Now set the values of "argc" and "argv" to the values after the
           options have been processed, above. */
        argc -= optind;
        argv += optind;

        /* Now do something with the remaining command-line arguments, if
           necessary. */
        if (argc > 0) {
            printf ("There are %d command-line arguments left to process:\n", argc);
            for (i = 0; i < argc; i++) {
                printf ("    Argument %d: '%s'\n", i + 1, argv[i]);
            }
        }

        return 0;
    }
Example of how to use getopts in bash
    #!/bin/bash

    usage() { echo "Usage: $0 [-s <45|90>] [-p <string>]" 1>&2; exit 1; }

    while getopts ":s:p:" o; do
        case "${o}" in
            s)
                s=${OPTARG}
                ((s == 45 || s == 90)) || usage
                ;;
            p)
                p=${OPTARG}
                ;;
            *)
                usage
                ;;
        esac
    done
    shift $((OPTIND-1))

    if [ -z "${s}" ] || [ -z "${p}" ]; then
        usage
    fi

    echo "s = ${s}"
    echo "p = ${p}"

Example runs:

    $ ./myscript.sh
    Usage: ./myscript.sh [-s <45|90>] [-p <string>]

    $ ./myscript.sh -h
    Usage: ./myscript.sh [-s <45|90>] [-p <string>]

    $ ./myscript.sh -s "" -p ""
    Usage: ./myscript.sh [-s <45|90>] [-p <string>]

    $ ./myscript.sh -s 10 -p foo
    Usage: ./myscript.sh [-s <45|90>] [-p <string>]

    $ ./myscript.sh -s 45 -p foo
    s = 45
    p = foo

    $ ./myscript.sh -s 90 -p bar
    s = 90
    p = bar
Command-line processing with getopt()
Introduction
Early in its evolution, the command-line environment of UNIX® (its only user interface back then) became dominated by dozens of small text-processing tools. These tools were small and generally did one thing well. The tools were chained together in longer command pipelines, one program passing its output to the next as input, and controlled by a variety of command-line options and arguments.
This is one aspect of UNIX that makes it a supremely powerful environment for processing text-based data, one of its first uses in a corporate environment. Dump some text in one end of a command pipeline and retrieve processed output from the other end.
Command-line options and arguments control UNIX programs and tell them how to behave. As a developer, it's your responsibility to discover the user's intentions from the command line passed to your program's main() function. This article shows you how to use the standard getopt() and getopt_long() functions to simplify your command-line processing, and it covers one technique for keeping track of your command-line options.
Before you start
The sample code included with this article (see Downloads) was written in Eclipse 3.1 using the C Development Tooling (CDT); the getopt_demo and getopt_long_demo projects are Managed Make projects, which are built using the CDT's program-generation rules. You won't find a Makefile in the project, but it's so trivial that you'll have no trouble generating one if you need to compile the code outside of Eclipse.
If you haven't tried using Eclipse yet (see Resources), you should really give it a go -- it's an excellent integrated development environment (IDE) that just gets better with each release. And that's coming from a die-hard EMACS- and Makefile-based developer.
Command lines
When you're working on a new program, one of the first obstacles you'll face is what to do about the command-line arguments that control its behavior. These are passed from the command line to your program's main() function as an integer count (traditionally named argc) and an array of pointers to strings (traditionally named argv). The standard main() function can be declared in two different ways that are essentially the same, as shown in Listing 1.
Listing 1. The main() function's double life
    int main( int argc, char *argv[] );
    int main( int argc, char **argv );
The first one, with its array of pointers to char, seems to be more fashionable these days, and slightly less confusing than the second version, with its pointer to pointers to char. For some reason, I tend to use the second form more often, possibly to represent my hard-won victory over the C pointer learning curve way back in high school. For all intents and purposes, these are identical, so use whichever one appeals to you the most.
When your main() is called by the C runtime library's program startup code, the command line has already been processed. The argc argument contains a count of arguments, and argv contains an array of pointers to those arguments. To the C runtime library, the arguments are the program's name followed by everything after it on the command line, split on whitespace.
For example, if you ran a program named foo with arguments of -v bar www.ibm.com, your argc would be set to 4, and argv would be set up as shown in Listing 2.
Listing 2. argv's contents
    argv[0] - foo
    argv[1] - -v
    argv[2] - bar
    argv[3] - www.ibm.com
A program has only one set of command-line arguments, so I'm going to store this information in a global structure that tracks options and settings. Anything that makes sense for the program to track globally can go in this structure, and I'm using a structure to help reduce the number of global variables. As I mentioned in my network services design article (see Resources), globals are bad for threaded programming, so it's a good idea to use them carefully.
The sample code is going to show command-line processing for an imaginary doc2html program. The doc2html program translates some sort of document into HTML, controlled by the command-line options specified by the user. It supports the following options:
- -I -- Don't create a keyword index.
- -l lang -- Translate into the language specified using the language code, lang.
- -o outfile.html -- Write the translated document to outfile.html instead of printing to standard output.
- -v -- Be verbose while translating; can be specified multiple times to increase the diagnostic level.
- Additional file names will be used as input documents.
You'll also support -h and -? to print a help message that gives the user a reminder about these options.
Simple command-line processing: getopt()
The getopt() function, which lives in the unistd.h system header file, is shown in Listing 3:
Listing 3. getopt() prototype
int getopt( int argc, char *const argv[], const char *optstring );
Given a number of command-line arguments (argc), an array of pointers to those arguments (argv), and an option string (optstring), getopt() returns the first option, and sets some global variables. When you call it again with the same arguments, it returns the next option, and sets the global variables. If no more recognized options are found, it returns -1 and you're done.
The global variables set by getopt() include:
- optarg -- A pointer to the current option argument, if there is one.
- optind -- An index of the next argv pointer to process when getopt() is called again.
- optopt -- This is the last known option.
The option string (optstring) is one character per option. Options that have arguments, such as the -l and -o options in the example, are followed by a : character. The optstring used by the example is Il:o:vh? (remember, you also want to support the last two options for printing the program's usage message).
You call getopt() repeatedly until it returns -1; any remaining command-line arguments are usually considered file names or something else appropriate for the program.
getopt() in action
Let's walk through the getopt_demo project's code; I've split it up here to make it easier to talk about, but you can see it in its full glory in the downloadable source code (see Downloads).
In Listing 4, you can see the system headers used by the demo program; standard fare with stdio.h for standard I/O function prototypes, stdlib.h for EXIT_SUCCESS and EXIT_FAILURE, and unistd.h for getopt().
Listing 4. System headers
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
Listing 5 shows the globalArgs structure I've created to store the command-line options in a sensible manner. Since it's a global variable, code anywhere in the program can access these variables to see whether to create a keyword index, which language to generate, and so on. It's a good idea for code outside of the main() function to treat this structure as a constant, read-only storage area, since any part of the program could depend on its contents.
There's one variable per command-line option, with extra variables to store the output file name, a pointer to the list of input files, and the number of input files.
Listing 5. Global argument storage and option string
    struct globalArgs_t {
        int noIndex;                /* -I option */
        char *langCode;             /* -l option */
        const char *outFileName;    /* -o option */
        FILE *outFile;
        int verbosity;              /* -v option */
        char **inputFiles;          /* input files */
        int numInputFiles;          /* # of input files */
    } globalArgs;

    static const char *optString = "Il:o:vh?";
The option string, optString, tells getopt() which options you can process, and which options require an argument. If other options are encountered during processing, getopt() displays an error message, and the program exits after displaying a usage message.
Listing 6 contains some small stubs for the usage message function and document conversion function referenced from main(), below. Feel free to make these do something more useful than nothing!
Listing 6. Stubs
    void display_usage( void )
    {
        puts( "doc2html - convert documents to HTML" );
        /* ... */
        exit( EXIT_FAILURE );
    }

    void convert_document( void )
    {
        /* ... */
    }
Finally, with Listing 7, you've made it to the main() function. Like good developers, you need to initialize the globalArgs structure before you begin processing the command-line arguments. In your programs, you can use this to set up reasonable defaults for your options in one place, which will make it easier to tweak later if more reasonable defaults come to light.
Listing 7. Initialization
    int main( int argc, char *argv[] )
    {
        int opt = 0;

        /* Initialize globalArgs before we get to work. */
        globalArgs.noIndex = 0;     /* false */
        globalArgs.langCode = NULL;
        globalArgs.outFileName = NULL;
        globalArgs.outFile = NULL;
        globalArgs.verbosity = 0;
        globalArgs.inputFiles = NULL;
        globalArgs.numInputFiles = 0;
The while loop and switch statement in Listing 8 are the meat of the command-line processing for this program. Whenever getopt() discovers an option, the switch statement decides which option was found, and you take note of that in the globalArgs structure. When getopt() finally returns -1, you're done processing options, and the remaining arguments are your input files.
Listing 8. Processing argc/argv with getopt()
        opt = getopt( argc, argv, optString );
        while( opt != -1 ) {
            switch( opt ) {
                case 'I':
                    globalArgs.noIndex = 1; /* true */
                    break;

                case 'l':
                    globalArgs.langCode = optarg;
                    break;

                case 'o':
                    globalArgs.outFileName = optarg;
                    break;

                case 'v':
                    globalArgs.verbosity++;
                    break;

                case 'h':   /* fall-through is intentional */
                case '?':
                    display_usage();
                    break;

                default:
                    /* You won't actually get here. */
                    break;
            }

            opt = getopt( argc, argv, optString );
        }

        globalArgs.inputFiles = argv + optind;
        globalArgs.numInputFiles = argc - optind;
Now that you're done collecting arguments and options, you can do whatever it is the program was built for (in this case, converting documents), and exit (Listing 9).
Listing 9. Go to work
        convert_document();

        return EXIT_SUCCESS;
    }
There, done. Perfect. You can stop reading now. Unless you want to bring your program up to the standards of the late '90s and support long options, popularized in GNU applications.
Complex command-line processing: getopt_long()
At some point in the 1990s (if memory serves), UNIX applications started supporting long options: a pair of dashes (instead of the single dash used for normal short options), a descriptive option name, and possibly an argument connected to the option with an equal sign.
Luckily, you can add support for long options to your program by using getopt_long(). As you might have already guessed, getopt_long() is a version of getopt() that supports long options in addition to the short options.
The getopt_long() function takes additional arguments, one of which is a pointer to an array of struct option objects. This structure is straightforward, as you can see from Listing 10.
Listing 10. option for getopt_long()
    struct option {
        const char *name;   /* declared const in glibc's getopt.h */
        int has_arg;
        int *flag;
        int val;
    };
The name member is a pointer to the long option's name without the double dashes. The has_arg member is set to one of no_argument, optional_argument, or required_argument (all defined in getopt.h) to indicate whether this option has an argument or not. If the flag member isn't set to NULL, the int it points to will be filled with the value in the val member when this option is encountered during processing, and getopt_long() returns 0. If the flag member is NULL, the value in val is returned by getopt_long() when it encounters this option; by setting val to the option's short equivalent, getopt_long() can be used without adding any additional code -- the existing while loop and switch that handle getopt() automatically handle this option.
Already this is more flexible, since options can now have optional arguments. More importantly, it's easy to drop into your existing code with very little work.
Let's see how using getopt_long() changes the sample program (the getopt_long_demo project can be found in Downloads).
Using getopt_long()
Since the getopt_long_demo is nearly the same as the getopt_demo code you already looked at, I'll just take you through the changed bits. Because you've got more flexibility now, you'll also add support for a --randomize option, without a corresponding short option.
The getopt_long() function resides in the getopt.h header instead of unistd.h, so you'll need to include that (see Listing 11). I've also included string.h, because you'll use strcmp() later to help figure out which long argument you're dealing with.
Listing 11. Additional headers
    #include <getopt.h>
    #include <string.h>
You've added a flag (see Listing 12) to the globalArgs for the --randomize option, and created the longOpts array to hold information about the long options supported by this program. Except for --randomize, all of the arguments correspond to existing short options (--no-index is the same as -I, for example). By including their short option equivalent as the last entry in the option structure, you can handle the equivalent long options without adding any extra code to the program.
Listing 12. Expanded arguments
    struct globalArgs_t {
        int noIndex;                /* -I option */
        char *langCode;             /* -l option */
        const char *outFileName;    /* -o option */
        FILE *outFile;
        int verbosity;              /* -v option */
        char **inputFiles;          /* input files */
        int numInputFiles;          /* # of input files */
        int randomized;             /* --randomize option */
    } globalArgs;

    static const char *optString = "Il:o:vh?";

    static const struct option longOpts[] = {
        { "no-index", no_argument, NULL, 'I' },
        { "language", required_argument, NULL, 'l' },
        { "output", required_argument, NULL, 'o' },
        { "verbose", no_argument, NULL, 'v' },
        { "randomize", no_argument, NULL, 0 },
        { "help", no_argument, NULL, 'h' },
        { NULL, no_argument, NULL, 0 }
    };
Listing 13 changes the getopt() calls to getopt_long(), which takes the longOpts array and an int pointer (longIndex) in addition to getopt()'s arguments. The integer pointed to by longIndex will be set to the index of the currently found long option when getopt_long() returns 0.
Listing 13. New and improved option handling
        /* longIndex is an int that must be declared alongside opt in main(). */
        opt = getopt_long( argc, argv, optString, longOpts, &longIndex );
        while( opt != -1 ) {
            switch( opt ) {
                case 'I':
                    globalArgs.noIndex = 1; /* true */
                    break;

                case 'l':
                    globalArgs.langCode = optarg;
                    break;

                case 'o':
                    globalArgs.outFileName = optarg;
                    break;

                case 'v':
                    globalArgs.verbosity++;
                    break;

                case 'h':   /* fall-through is intentional */
                case '?':
                    display_usage();
                    break;

                case 0:     /* long option without a short arg */
                    if( strcmp( "randomize", longOpts[longIndex].name ) == 0 ) {
                        globalArgs.randomized = 1;
                    }
                    break;

                default:
                    /* You won't actually get here. */
                    break;
            }

            opt = getopt_long( argc, argv, optString, longOpts, &longIndex );
        }
I've also added a case for 0 where you can handle any long options that don't map to existing short options. In this case, you have only one long option, but the code still uses strcmp() to make sure it's the one you're expecting.
That's all there is to it; the program now supports more verbose (and more casual user-friendly) long options.
Summary
UNIX users have always depended on command-line arguments to modify the behavior of programs, especially utilities designed to be used as part of the collection of small tools that is the UNIX shell environment. Programs need to be able to handle options and arguments quickly, and without wasting a lot of the developer's time. After all, few programs are designed to simply process command-line arguments, and the developer would rather be working on whatever the program really does.
The getopt() function is a standard library call that lets you loop over a program's command-line arguments and detect options (with or without arguments attached to them) easily using a straightforward while/switch idiom. Its cousin, getopt_long(), lets you handle the more descriptive long options with almost no additional work, which is something that makes developers very happy.
Now that you've seen how to easily handle command-line options, you can concentrate on improving your program's command line by adding support for long options, and by adding any additional options you might have been putting off because you didn't want to add additional command-line option handling to your program.
Don't forget to document all of your options and arguments somewhere, and to provide a built-in help function of some sort to help remind