1. Introduction and preliminaries

This first chapter explains what GNU m4 is, where m4 comes from, how to read and use this documentation, how to call the m4 program, and how to report bugs about it. It concludes by giving tips for reading the remainder of the manual.

The following chapters then detail all the features of the m4 language.


1.1 Introduction to m4

m4 is a macro processor: it copies its input to the output, expanding macros as it goes. Macros are either builtin or user-defined, and can take any number of arguments. Besides macro expansion, m4 has builtin functions for including named files, running shell commands, doing integer arithmetic, manipulating text in various ways, performing recursion, and so on. m4 can be used either as a front end to a compiler or as a macro processor in its own right.
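As a small illustration of copying input to output while expanding macros, the following sketch defines one user macro and calls one builtin. The macro name `greet' and the file name `intro.m4' are made up for this example, and the commands assume that m4 and a POSIX shell are installed.

```shell
# Work in a scratch directory so that no existing files are touched.
tmp=$(mktemp -d)

# intro.m4 (hypothetical name): one user-defined macro and one builtin call.
cat > "$tmp/intro.m4" <<'EOF'
define(`greet', `Hello, $1!')dnl
greet(`world')
The sum is eval(`2 + 3').
EOF

# Ordinary text is copied through; macro calls are replaced by their expansions.
out=$(m4 "$tmp/intro.m4")
printf '%s\n' "$out"
```

The first output line comes from the user-defined macro, and the second shows ordinary text copied around the result of the builtin `eval'.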

The m4 macro processor is widely available on all UNIXes, and has been standardized by POSIX. Usually, only a small percentage of users are aware of its existence. However, those who find it often become committed users. The popularity of GNU Autoconf, which requires GNU m4 for generating `configure' scripts, is an incentive for many to install it, even though these people will never program in m4 themselves. GNU m4 is mostly compatible with the System V, Release 3 version, except for some minor differences. See section Compatibility with other versions of m4, for more details.

Some people find m4 to be fairly addictive. They first use m4 for simple problems, then take on bigger and bigger challenges, learning how to write complex sets of m4 macros along the way. Once really addicted, users write sophisticated m4 applications even to solve simple problems, devoting more time to debugging their m4 scripts than to doing real work. Beware that m4 may be dangerous for the health of compulsive programmers.


1.2 Historical references

GPM was an important ancestor of m4. See C. Strachey: "A General Purpose Macro generator", Computer Journal 8,3 (1965), pp. 225 ff. GPM is also succinctly described in David Gries's classic "Compiler Construction for Digital Computers".

The classic B. Kernighan and P.J. Plauger: "Software Tools", Addison-Wesley, Inc. (1976) describes and implements a Unix macro-processor language, which inspired Dennis Ritchie to write m3, a macro processor for the AP-3 minicomputer.

Kernighan and Ritchie then joined forces to develop the original m4, as described in "The M4 Macro Processor", Bell Laboratories (1977). It had only 21 builtin macros.

While GPM was more pure, m4 is meant to deal with the true intricacies of real life: macros can be recognized without being pre-announced, skipping whitespace or end-of-lines is easier, more constructs are builtin instead of derived, etc.

Originally, the Kernighan and Plauger macro-processor, and then m3, formed the engine for the Rational FORTRAN preprocessor, that is, the Ratfor equivalent of cpp. Later, m4 was used as a frontend for Ratfor, C and Cobol.

René Seindal released his implementation of m4, GNU m4, in 1990, with the aim of removing the artificial limitations in many of the traditional m4 implementations, such as maximum line length, macro size, or number of macros.

The late Professor A. Dain Samples described and implemented a further evolution in the form of M5: "User's Guide to the M5 Macro Language: 2nd edition", Electronic Announcement on comp.compilers newsgroup (1992).

François Pinard took over maintenance of GNU m4 in 1992, until 1994 when he released GNU m4 1.4, which was the stable release for 10 years. It was at this time that GNU Autoconf decided to require GNU m4 as its underlying engine, since all other implementations of m4 had too many limitations.

More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2, which addressed some long-standing bugs in the venerable 1.4 release. In 2005, Gary V. Vaughan collected the many patches to GNU m4 1.4 that were floating around the net and released 1.4.3 and 1.4.4. In 2006, Eric Blake joined the team and prepared patches for the release of 1.4.5 and 1.4.6.

Meanwhile, development has continued on new features for m4, such as dynamic module loading and additional builtins. When complete, GNU m4 2.0 will start a new series of releases.


1.3 Invoking m4

The format of the m4 command is:

 
m4 [option…] [file…]

All options begin with `-', or if long option names are used, with `--'. A long option name need not be written completely; any unambiguous prefix is sufficient. Options may be intermixed with files; use `--' as a marker to denote the end of options. m4 understands the following options, grouped by functionality.
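The following sketch shows an abbreviated long option and the `--' marker. It assumes that `--def' is an unambiguous prefix of `--define' in the installed m4, which holds as long as no other long option starts with those letters; the macro name `NAME' is only a placeholder.

```shell
# A long option may be shortened to any unambiguous prefix:
# here --def is enough to select --define.
out_prefix=$(printf 'NAME\n' | m4 --def=NAME=value)

# '--' ends option processing; everything after it is a file name.
# The single file name '-' stands for standard input.
out_dashes=$(printf 'NAME\n' | m4 -D NAME=value -- -)

printf '%s\n%s\n' "$out_prefix" "$out_dashes"
```

Both commands replace the word `NAME' in the input with its command-line definition.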

Several options control the overall operation of m4:

--help

Print a help summary on standard output, then immediately exit m4 without reading any input files.

--version

Print the version number of the program on standard output, then immediately exit m4 without reading any input files.

-E
--fatal-warnings

Stop execution and exit m4 once the first warning has been issued, considering all of them to be fatal.

-e
--interactive

Make this invocation of m4 interactive. This means that all output will be unbuffered, and interrupts will be ignored.

-P
--prefix-builtins

Internally modify all builtin macro names so they all start with the prefix `m4_'. For example, using this option, one should write `m4_define' instead of `define', and `m4___file__' instead of `__file__'. This option has no effect if `-R' is also specified.

-Q
--quiet
--silent

Suppress warnings, such as missing or superfluous arguments in macro calls, or treating the empty string as zero.

-W REGEXP
--word-regexp=REGEXP

Use REGEXP as an alternative syntax for macro names. This experimental option will not be present on all GNU m4 implementations (see section Changing the lexical structure of words).
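As a sketch of how `-P', `-Q' and `-E' change the behavior of a run, consider the commands below. The file names and the macro `word' are invented for the example, and the exact wording of the warning is not relied upon.

```shell
tmp=$(mktemp -d)

# With -P every builtin needs the m4_ prefix; an unprefixed name is plain text.
cat > "$tmp/prefixed.m4" <<'EOF'
m4_define(`word', `abc')m4_dnl
len(word) m4_len(word)
EOF
out_prefixed=$(m4 -P "$tmp/prefixed.m4")
printf '%s\n' "$out_prefixed"

# len takes one argument, so the superfluous one triggers a warning.
cat > "$tmp/warn.m4" <<'EOF'
len(`abc', `extra')
EOF

# Capture only standard error: the warning appears by default...
warn_text=$(m4 "$tmp/warn.m4" 2>&1 >/dev/null)
# ...and is suppressed by -Q.
quiet_text=$(m4 -Q "$tmp/warn.m4" 2>&1 >/dev/null)

# With -E the same warning makes the run fail.
m4 -E "$tmp/warn.m4" >/dev/null 2>&1
fatal_status=$?
printf 'exit status with -E: %s\n' "$fatal_status"
```

Under `-P', the unprefixed `len' is copied through as text while `word' inside it is still expanded, since `word' is a user macro and not a builtin.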

Several options allow m4 to behave more like a preprocessor. Macro definitions and deletions can be made on the command line, the search path can be altered, and the output file can track where the input came from. These features occur with the following options:

-D NAME[=VALUE]
--define=NAME[=VALUE]

This enters NAME into the symbol table, before any input files are read. If `=VALUE' is missing, the value is taken to be the empty string. The VALUE can be any string, and the macro can be defined to take arguments, just as if it was defined from within the input. This option may be given more than once; order is significant, and redefining the same NAME loses the previous value.

-I DIRECTORY
--include=DIRECTORY

Make m4 search DIRECTORY for included files that are not found in the current working directory. See section Searching for include files, for more details. This option may be given more than once.

-s
--synclines

Generate synchronization lines, for use by the C preprocessor or other similar tools. This is useful, for example, when m4 is used as a front end to a compiler. Source file name and line number information is conveyed by directives of the form `#line linenum "file"', which are inserted as needed into the middle of the output. Such directives mean that the following line originated or was expanded from the contents of input file file at line linenum. The `"file"' part is often omitted when the file name did not change from the previous directive.

Synchronization directives are always given on complete lines by themselves. When a synchronization discrepancy occurs in the middle of an output line, the associated synchronization directive is delayed until the beginning of the next generated line.

-U NAME
--undefine=NAME

This deletes any predefined meaning NAME might have. Obviously, only predefined macros can be deleted in this way. This option may be given more than once; undefining a NAME that does not have a definition is silently ignored.
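The preprocessor-style options can be combined in one invocation. The sketch below uses invented names (`inc', `defs.m4', `main.m4', `twice', `NAME'); the exact placement of the `#line' directives is not shown because it depends on the input.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/inc"

# A definition kept in a separate directory.
cat > "$tmp/inc/defs.m4" <<'EOF'
define(`twice', `$1$1')dnl
EOF

cat > "$tmp/main.m4" <<'EOF'
include(`defs.m4')dnl
twice(`ab') NAME len(`abc')
EOF

# -I lets include find defs.m4, -D defines NAME, and -U removes the
# builtin len, so the last call is copied through as text.
out=$(cd "$tmp" && m4 -I inc -D NAME=value -U len main.m4)
printf '%s\n' "$out"

# -s adds '#line' directives that map output lines back to the input.
synced=$(cd "$tmp" && m4 -s -I inc main.m4)
printf '%s\n' "$synced"
```

A macro defined with `-D' can take arguments just like `twice' does here, for instance `-D 'twice=$1$1'', provided the shell is prevented from expanding the `$1'.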

There are some limits within m4 that can be tuned. For compatibility, m4 also accepts some options that control limits in other implementations, but which are automatically unbounded (limited only by your hardware constraints) in GNU m4.

-G
--traditional

Suppress all the extensions made in this implementation, compared to the System V version. See section Compatibility with other versions of m4, for a list of these.

-H NUM
--hashsize=NUM

Make the internal hash table for symbol lookup be NUM entries big. For better performance, the number should be prime, but this is not checked. The default is 509 entries. It should not be necessary to increase this value, unless you define an excessive number of macros.

-L NUM
--nesting-limit=NUM

Artificially limit the nesting of macro calls to NUM levels, stopping program execution if this limit is ever exceeded. When not specified, nesting is limited to 1024 levels.

The precise effect of this option might be more correctly associated with textual nesting than dynamic recursion. It has been useful when some complex m4 input was generated by mechanical means. Most users would never need this option. If shown to be obtrusive, this option (which is still experimental) might well disappear.

This option cannot break endless rescanning loops, since these do not necessarily consume much memory or stack space. Through clever usage of rescanning loops, one can request complex, time-consuming computations from m4 with useful results. Imposing limits in this area would cripple the power of m4. There are many pathological cases: `define(`a', `a')a' is only the simplest example (but see section Compatibility with other versions of m4). Expecting GNU m4 to detect these would be a little like expecting a compiler system to detect and diagnose endless loops: this is a very hard problem in general, if not undecidable!

-B NUM
-S NUM
-T NUM

These options are present for compatibility with System V m4, but do nothing in this implementation.

-N NUM
--diversions=NUM

These options are present only for compatibility with previous versions of GNU m4, where they controlled the number of diversions that could be used at the same time. They do nothing, because there is no longer a fixed limit.
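The textual nature of the `-L' limit can be seen with calls nested inside one another's arguments. This is a sketch only; the file name is invented and the error message printed by m4 is not reproduced here.

```shell
tmp=$(mktemp -d)

# Four macro calls, each nested inside the argument of the previous one.
cat > "$tmp/nested.m4" <<'EOF'
len(len(len(len(`abc'))))
EOF

# A limit above the nesting depth lets the input through.
out=$(m4 -L 10 "$tmp/nested.m4")
printf '%s\n' "$out"

# A limit below the nesting depth makes m4 stop with a non-zero exit status.
m4 -L 2 "$tmp/nested.m4" >/dev/null 2>&1
status=$?
printf 'exit status with -L 2: %s\n' "$status"
```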

GNU m4 comes with a feature of freezing internal state (see section Fast loading of frozen state). This can be used to speed up m4 execution when reusing a common initialization script.

-F FILE
--freeze-state=FILE

Once execution is finished, write out the frozen state on the specified FILE. It is conventional, but not required, for FILE to end in `.m4f'.

-R FILE
--reload-state=FILE

Before execution starts, recover the internal state from the specified frozen FILE. The options `-D', `-U', and `-t' take effect after state is reloaded, but before the input files are read.
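A typical use of the two options is a freeze run followed by any number of reload runs. In this sketch the file names `defs.m4' and `defs.m4f' and the macros `greet' and `NAME' are hypothetical.

```shell
tmp=$(mktemp -d)

cat > "$tmp/defs.m4" <<'EOF'
define(`greet', `Hello, $1!')dnl
EOF

# First run: process the definitions once and save the resulting state.
m4 -F "$tmp/defs.m4f" "$tmp/defs.m4"

# Later runs: reload the saved state instead of reading defs.m4 again.
# -D takes effect after the reload and before standard input is read.
out=$(printf 'greet(NAME)\n' | m4 -R "$tmp/defs.m4f" -D NAME=world)
printf '%s\n' "$out"
```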

Finally, there are several options for aiding in debugging m4 scripts.

-d[FLAGS]
--debug[=FLAGS]

Set the debug-level according to the flags FLAGS. The debug-level controls the format and amount of information presented by the debugging functions. See section Controlling debugging output, for more details on the format and meaning of FLAGS. If omitted, FLAGS defaults to `aeq'.

-l NUM
--arglength=NUM

Restrict the size of the output generated by macro tracing to NUM characters per trace line. If unspecified or zero, output is unlimited. See section Controlling debugging output, for more details.

-o FILE
--error-output=FILE

Redirect dumpdef output, debug messages, and trace output to the named FILE. Warnings, error messages, and errprint output are still printed to standard error. If unspecified, debug output goes to standard error; if empty, debug output is discarded. See section Saving debugging output, for more details.

-t NAME
--trace=NAME

This enables tracing for the macro NAME, at any point where it is defined. NAME need not be defined when this option is given. This option may be given more than once. See section Tracing macro calls, for more details.
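The sketch below traces a macro that is defined only inside the input, relying on the default debug flags. The names `greet', `trace.m4' and `trace.log' are made up, and the trace line itself is printed without being reproduced here, since its format depends on the debug flags in effect.

```shell
tmp=$(mktemp -d)

cat > "$tmp/trace.m4" <<'EOF'
define(`greet', `Hello, $1!')dnl
greet(`world')
EOF

# -t may name a macro that does not exist yet when m4 starts.
# Trace lines go to standard error; the shell redirects them to trace.log.
out=$(m4 -t greet "$tmp/trace.m4" 2> "$tmp/trace.log")
printf '%s\n' "$out"
cat "$tmp/trace.log"
```

The normal output is unaffected by tracing; only the extra trace lines are added to the error stream.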

The remaining arguments on the command line are taken to be input file names. If no names are present, the standard input is read. A file name of `-' is taken to mean the standard input. It is conventional, but not required, for input files to end in `.m4'.

The input files are read in the sequence given. The standard input can only be read once, so the file name `-' should only appear once on the command line. It is an error if an input file ends in the middle of argument collection, a comment, or a quoted string.

If none of the input files invoked m4exit (see section Exiting from m4), the exit status of m4 will be 0 for success, 1 for general failure (such as problems with reading an input file), and 63 for version mismatch (see section Using frozen files).

If you need to read a file whose name starts with a `-', you can specify it as `./-file', or use `--' to mark the end of options.
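The rules for file arguments and the exit status can be tried with a few throwaway files. All file names below are invented for the example.

```shell
tmp=$(mktemp -d)

printf 'first\n'  > "$tmp/a.m4"
printf 'dashed\n' > "$tmp/-b.m4"

# Files are read in the order given; '-' reads standard input in between.
out=$(cd "$tmp" && printf 'stdin\n' | m4 a.m4 - ./-b.m4)
printf '%s\n' "$out"

# After '--', a leading '-' no longer introduces an option.
out_dashes=$(cd "$tmp" && m4 -- -b.m4)
printf '%s\n' "$out_dashes"

# A missing input file is reported on standard error and yields exit status 1.
m4 "$tmp/no-such-file.m4" 2>/dev/null
status=$?
printf 'exit status: %s\n' "$status"
```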


1.4 Problems and bugs

If you have problems with GNU M4 or think you've found a bug, please report it. Before reporting a bug, make sure you've actually found a real bug. Carefully reread the documentation and see if it really says you can do what you're trying to do. If it's not clear whether you should be able to do something or not, report that too; it's a bug in the documentation!

Before reporting a bug or trying to fix it yourself, try to isolate it to the smallest possible input file that reproduces the problem. Then send us the input file and the exact results m4 gave you. Also say what you expected to occur; this will help us decide whether the problem was really in the documentation.

Once you've got a precise problem, send e-mail to (Internet) bug-m4@gnu.org. Please include the version number of m4 you are using. You can get this information with the command `m4 --version'. Also provide details about the platform you are executing on.

Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are just obscure features, please report them too.


1.5 Using this manual

This manual contains a number of examples of m4 input and output, and a simple notation is used to distinguish input, output and error messages from m4. Examples are set out from the normal text, and shown in a fixed width font, like this:

 
This is an example of an example!

To distinguish input from output, all output from m4 is prefixed by the string `⇒', and all error messages by the string `error-->'. Thus:

 
Example of input line
⇒Output line from m4
error-->and an error message

The sequence `^D' in an example indicates the end of the input file. The majority of these examples are self-contained, and you can run them with similar results by invoking m4 -d. In fact, the testsuite that is bundled in the GNU M4 package consists of the examples in this document!

As each of the predefined macros in m4 is described, a prototype call of the macro will be shown, giving descriptive names to the arguments, e.g.,

Composite: example (string, [count = `1'], [argument]…)

This is a sample prototype. There is not really a macro named example, but this documents that if there were, it would be a Composite macro, rather than a Builtin. It requires at least one argument, string. Remember that in m4, there must not be a space between the macro name and the opening parenthesis, unless it was intended to call the macro without any arguments. The brackets around count and argument show that these arguments are optional. If count is omitted, the macro behaves as if count were `1', whereas if argument is omitted, the macro behaves as if it were the empty string. A blank argument is not the same as an omitted argument. For example, `example(`a')', `example(`a',`1')', and `example(`a',`1',)' would behave identically with count set to `1'; while `example(`a',)' and `example(`a',`')' would explicitly pass the empty string for count. The ellipses (`…') show that the macro processes additional arguments after argument, rather than ignoring them.
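The difference between omitted and blank arguments, and the effect of a space before the parenthesis, can be observed with a throwaway macro that expands to its own argument count. The macro name `nargs' and the file name are hypothetical; `$#' is the m4 notation for the number of arguments in a call.

```shell
tmp=$(mktemp -d)

cat > "$tmp/args.m4" <<'EOF'
define(`nargs', `$#')dnl
nargs
nargs()
nargs(`a')
nargs(`a',)
nargs (`a')
EOF

# Each output line is the number of arguments m4 saw in the call.
out=$(m4 "$tmp/args.m4")
printf '%s\n' "$out"
```

The first five calls print 0, 1, 1 and 2 arguments respectively for the first four lines. On the last line the space makes m4 call `nargs' without arguments and then copy the parenthesized text through, giving `0 (a)'.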

All macro arguments in m4 are strings, but some are given special interpretation, e.g., as numbers, file names, regular expressions, etc. The documentation for each macro will state how the parameters are interpreted, and what happens if the argument cannot be parsed according to the desired interpretation. Unless specified otherwise, a parameter specified to be a number is parsed as a decimal, even if the argument has leading zeros; and parsing the empty string as a number results in 0 rather than an error, although a warning will be issued.
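The default numeric interpretation can be checked with the builtin `incr', which adds one to its argument. This is a sketch with invented file names; the warning text is written to a log file without being reproduced here.

```shell
tmp=$(mktemp -d)

cat > "$tmp/numbers.m4" <<'EOF'
incr(`010')
incr(`')
EOF

# Leading zeros do not select octal: 010 is read as decimal ten.
# The empty string counts as 0, and a warning goes to standard error.
out=$(m4 "$tmp/numbers.m4" 2> "$tmp/warnings.log")
printf '%s\n' "$out"
cat "$tmp/warnings.log"
```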

This document consistently writes and uses builtin, without a hyphen, as if it were an English word. This is how the builtin primitive is spelled within m4.


This document was generated by System Administrator on September, 23 2007 using texi2html 1.70.