.\" $NetBSD: yacc.1,v 1.7 2018/07/24 05:40:15 wiz Exp $ .\" .\" Copyright (c) 1989, 1990 The Regents of the University of California. .\" All rights reserved. .\" .\" This code is derived from software contributed to Berkeley by .\" Robert Paul Corbett. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" from: @(#)yacc.1 5.7 (Berkeley) 7/30/91 .\" from: Id: yacc.1,v 1.24 2014/10/06 00:03:48 tom Exp .\" $NetBSD: yacc.1,v 1.7 2018/07/24 05:40:15 wiz Exp $ .\" .Dd October 5, 2014 .Dt YACC 1 .Os .Sh NAME .Nm yacc .Nd an .Tn LALR(1) parser generator .Sh SYNOPSIS .Nm .Op Fl BdgilLPrtvVy .Op Fl b Ar file_prefix .Op Fl o Ar output_file .Op Fl p Ar symbol_prefix .Ar filename .Sh DESCRIPTION .Nm reads the grammar specification in the file .Ar filename and generates an .Tn LALR(1) parser for it. The parsers consist of a set of .Tn LALR(1) parsing tables and a driver routine written in the C programming language. .Nm normally writes the parse tables and the driver routine to the file .Pa y.tab.c . .Pp The following options are available: .Bl -tag -width symbol_prefixXXX .It Fl b Ar file_prefix The .Fl b option changes the prefix prepended to the output file names to the string denoted by .Ar file_prefix . The default prefix is the character .Ar y . .It Fl B Create a backtracking parser (compile-type configuration for .Nm ) . .It Fl d The .Fl d option causes the header file .Pa y.tab.h to be written. It contains #define's for the token identifiers. .It Fl g The .Fl g option causes a graphical description of the generated .Tn LALR(1) parser to be written to the file .Pa y.dot in graphviz format, ready to be processed by .Xr dot 1 . .It Fl i The .Fl i option causes a supplementary header file .Pa y.tab.i to be written. It contains extern declarations and supplementary #define's as needed to map the conventional .Nm yy-prefixed names to whatever the .Fl p option may specify. The code file, e.g., .Pa y.tab.c is modified to #include this file as well as the .Pa y.tab.h file, enforcing consistent usage of the symbols defined in those files. The supplementary header file makes it simpler to separate compilation of lex- and yacc-files. .It Fl l If the .Fl l option is not specified, .Nm will insert #line directives in the generated code. The #line directives let the C compiler relate errors in the generated code to the user's original code. If the .Fl l option is specified, .Nm will not insert the #line directives. #line directives specified by the user will be retained. .It Fl L Enable position processing, e.g., .Dq %locations (compile-type configuration for .Nm ) . .It Fl o Ar output_file Specify the filename for the parser file. If this option is not given, the output filename is the file prefix concatenated with the file suffix, e.g. .Pa y.tab.c . This overrides the .Fl b option. .It Fl P The .Fl P options instructs .Nm to create a reentrant parser, like .Dq %pure-parser does. .It Fl p Ar symbol_prefix The .Fl p option changes the prefix prepended to yacc-generated symbols to the string denoted by .Ar symbol_prefix . The default prefix is the string .Ar yy . .It Fl r The .Fl r option causes .Nm to produce separate files for code and tables. The code file is named .Pa y.code.c , and the tables file is named .Pa y.tab.c . The prefix .Dq y . can be overridden using the .Fl b option. .It Fl s Suppress .Dq #define statements generated for string literals in a .Dq %token statement, to more closely match original .Nm behavior. .Pp Normally when .Nm sees a line such as .Dq %token OP_ADD "ADD" it notices that the quoted .Dq ADD is a valid C identifier, and generates a #define not only for .Dv OP_ADD , but for .Dv ADD as well, e.g., .Bd -literal -offset indent #define OP_ADD 257 #define ADD 258 .Ed The original .Nm does not generate the second .Dq #define . The .Fl s option suppresses this .Dq #define . .Pp .St -p1003.1 documents only names and numbers for .Dq %token , though the original .Nm and .Xr bison 1 also accept string literals. .It Fl t The .Fl t option changes the preprocessor directives generated by .Nm so that debugging statements will be incorporated in the compiled code. .It Fl V The .Fl V option prints the version number to the standard output. .It Fl v The .Fl v option causes a human-readable description of the generated parser to be written to the file .Pa y.output . .It Fl y .Nm ignores this option, which .Xr bison 1 supports for ostensible POSIX compatibility. .El .Sh EXTENSIONS .Nm provides some extensions for compatibility with .Xr bison 1 and other implementations of yacc. The .Dq %destructor and .Dq %locations features are available only if .Nm yacc has been configured and compiled to support the back-tracking functionality. The remaining features are always available: .Pp .Dv %destructor { .Dv code } .Dv symbol+ .Pp Defines code that is invoked when a symbol is automatically discarded during error recovery. This code can be used to reclaim dynamically allocated memory associated with the corresponding semantic value for cases where user actions cannot manage the memory explicitly. .Pp On encountering a parse error, the generated parser discards symbols on the stack and input tokens until it reaches a state that will allow parsing to continue. This error recovery approach results in a memory leak if the .Dq YYSTYPE value is, or contains, pointers to dynamically allocated memory. .Pp The bracketed .Dv code is invoked whenever the parser discards one of the symbols. Within .Dv code , .Dq $$ or .Dq $\*[Lt]tag\*[Gt]$ designates the semantic value associated with the discarded symbol, and .Dq @$ designates its location (see .Dq %locations directive). .Pp A per-symbol destructor is defined by listing a grammar symbol in .Dv symbol+ . A per-type destructor is defined by listing a semantic type tag (e.g., .Dq \*[Lt]some_tag\*[Gt] ) in .Dv symbol+ ; in this case, the parser will invoke .Dv code whenever it discards any grammar symbol that has that semantic type tag, unless that symbol has its own per-symbol destructor. .Pp Two categories of default destructor are supported that are invoked when discarding any grammar symbol that has no per-symbol and no per-type destructor: .Pp The code for .Dq \*[Lt]*\*[Gt] is used for grammar symbols that have an explicitly declared semantic type tag (via .Dq %type ) ; .Pp the code for .Dq \*[Lt]\*[Gt] is used for grammar symbols that have no declared semantic type tag. .Pp .Bl -tag -width "%expect-rr number" -compact .It Dv %expect Ar number Tell .Nm the expected number of shift/reduce conflicts. That makes it only report the number if it differs. .It Dv %expect-rr Ar number Tell .Nm the expected number of reduce/reduce conflicts. That makes it only report the number if it differs. This is (unlike .Xr bison 1 ) allowable in .Tn LALR(1) parsers. .It Dv %locations Tell .Nm to enable management of position information associated with each token, provided by the lexer in the global variable .Dv yylloc , similar to management of semantic value information provided in .Dv yylval . .Pp As for semantic values, locations can be referenced within actions using .Dv @$ to refer to the location of the left hand side symbol, and .Dv @N .Dv ( N an integer) to refer to the location of one of the right hand side symbols. Also as for semantic values, when a rule is matched, a default action is used the compute the location represented by .Dv @$ as the beginning of the first symbol and the end of the last symbol in the right hand side of the rule. This default computation can be overridden by explicit assignment to .Dv @$ in a rule action. .Pp The type of .Dv yylloc is .Dv YYLTYPE , which is defined by default as: .Bd -literal -offset indent typedef struct YYLTYPE { int first_line; int first_column; int last_line; int last_column; } YYLTYPE; .Ed .Pp .Dv YYLTYPE can be redefined by the user .Dv ( YYLTYPE_IS_DEFINED must be defined, to inhibit the default) in the declarations section of the specification file. As in .Xr bison 1 , the macro .Dv YYLLOC_DEFAULT is invoked each time a rule is matched to calculate a position for the left hand side of the rule, before the associated action is executed; this macro can be redefined by the user. .Pp This directive adds a .Dv YYLTYPE parameter to .Fn yyerror . If the .Dq %pure-parser directive is present, a .Dv YYLTYPE parameter is added to .Fn yylex calls. .It Dv %lex-param Ar { Ar argument-declaration Ar } By default, the lexer accepts no parameters, e.g., .Fn yylex . Use this directive to add parameter declarations for your customized lexer. .It Dv %parse-param Ar { Ar argument-declaration Ar } By default, the parser accepts no parameters, e.g., .Fn yyparse . Use this directive to add parameter declarations for your customized parser. .It Dv %pure-parser Most variables (other than .Fa yydebug and .Fa yynerrs ) are allocated on the stack within .Fn yyparse , making the parser reasonably reentrant. .It Dv %token-table Make the parser's names for tokens available in the .Dv yytname array. However, .Nm yacc does not predefine .Dq $end , .Dq $error or .Dq $undefined in this array. .El .Sh PORTABILITY According to Robert Corbett: .Bd -offset indent Berkeley Yacc is an LALR(1) parser generator. Berkeley Yacc has been made as compatible as possible with AT\*[Am]T Yacc. Berkeley Yacc can accept any input specification that conforms to the AT\*[Am]T Yacc documentation. Specifications that take advantage of undocumented features of AT\*[Am]T Yacc will probably be rejected. .Ed .Pp The rationale in .%U http://pubs.opengroup.org/onlinepubs/9699919799/utilities/yacc.html documents some features of AT\*[Am]T yacc which are no longer required for POSIX compliance. .Pp That said, you may be interested in reusing grammar files with some other implementation which is not strictly compatible with AT\*[Am]T yacc. For instance, there is .Xr bison 1 . Here are a few differences: .Nm accepts an equals mark preceding the left curly brace of an action (as in the original grammar file .Dv ftp.y ) : .Bd -literal -offset indent | STAT CRLF = { statcmd(); } .Ed .Nm and .Xr bison 1 emit code in different order, and in particular .Xr bison 1 makes forward reference to common functions such as .Fn yylex , .Fn yyparse and .Fn yyerror without providing prototypes. .Pp .Xr bison 1 support for .Dq %expect is broken in more than one release. For best results using .Xr bison 1 , delete that directive. .Pp .Xr bison 1 no equivalent for some of .Nm 's command-line options, relying on directives embedded in the grammar file. .Pp .Xr bison 1 .Fl y option does not affect bison's lack of support for features of AT\*[Am]T yacc which were deemed obsolescent. .Pp .Nm accepts multiple parameters with .Dq %lex-param and .Dq %parse-param in two forms .Bd -literal -offset indent {type1 name1} {type2 name2} ... {type1 name1, type2 name2 ...} .Ed .Pp .Xr bison 1 accepts the latter (though undocumented), but depending on the release may generate bad code. .Pp Like .Xr bison 1 , .Nm will add parameters specified via .Dq %parse-param to .Fn yyparse , .Fn yyerror and (if configured for back-tracking) to the destructor declared using .Dq %destructor . .Pp .Xr bison 1 puts the additional parameters .Dv first for .Fn yyparse and .Fn yyerror but .Dv last for destructors. .Nm matches this behavior. .El .Sh ENVIRONMENT The following environment variable is referenced by .Nm : .Bl -tag -width TMPDIR .It Ev TMPDIR If the environment variable .Ev TMPDIR is set, the string denoted by .Ev TMPDIR will be used as the name of the directory where the temporary files are created. .El .Sh TABLES The names of the tables generated by this version of .Nm are .Dq yylhs , .Dq yylen , .Dq yydefred , .Dq yydgoto , .Dq yysindex , .Dq yyrindex , .Dq yygindex , .Dq yytable , and .Dq yycheck . Two additional tables, .Dq yyname and .Dq yyrule , are created if .Dv YYDEBUG is defined and non-zero. .Sh FILES .Bl -tag -width /tmp/yacc.uXXXXXXXX -compact .It Pa y.code.c .It Pa y.tab.c .It Pa y.tab.h .It Pa y.output .It Pa /tmp/yacc.aXXXXXX .It Pa /tmp/yacc.tXXXXXX .It Pa /tmp/yacc.uXXXXXX .El .Sh DIAGNOSTICS If there are rules that are never reduced, the number of such rules is written to the standard error. If there are any .Tn LALR(1) conflicts, the number of conflicts is also written to the standard error. .\" .Sh SEE ALSO .\" .Xr yyfix 1 .Sh STANDARDS The .Nm utility conforms to .St -p1003.2 .