What's New?
Hyacc 0.98 is released on January 01, 2017.
What is Hyacc, and why it is special
Many people have used Yacc. It is a LALR(1) parser generator, often used
with the lexical analyzer Lex to create compilers. There are many variations
of Yacc, most famous ones are like Bison and Byacc (Berkely yacc).
The problem with LALR(1) parser generators is that they contain "mysterious"
reduce/reduce conflicts even for deterministic and unambiguous grammars,
and these require the language designer to tweak
the grammar which is often a time-consuming and painful process, and the
resulted grammar may not be the same as before modification.
Hyacc is similar
to Yacc in that it is a parser generator. It is different from yacc in
that it is a LR(1)/LALR(1)/LR(0) parser generator, which means it can
accept all LR(1) grammars. This is more powerful than LALR(1) algorithm.
A LR(1) parser generator also does not have the
"mysterious reduce/reduce conflict" problem.
Hyacc also supports LR(k) for limited cases.
Hyacc is pronounced as "HiYacc", means Hawaii Yacc.
Specifically, based on the original Knuth LR(1) algorithm, three
optimizations are used:
- LR(1) parser generation by combining compatible states [1].
- Remove unit productions [2].
- Further remove repeated states after the previous step.
Besides, Hyacc also implemented:
- LR(1) parser generation by lane-tracing [3][4].
- LALR(1) based on lane-tracing.
- LR(0).
- A LR(k) algorithm that works for limited cases.
Hyacc is compatible with Yacc and Bison in the
format of input file and command line switches. So anyone used Yacc or
Bison before should be able to start using Hyacc immediately.
Hyacc is ANSI C compliant, and can be easily ported to other platforms.
Features
Current features:
- Implements the original Knuth LR(1) algorithm.
- Combines compatible states using the concept of weak compatibility [1].
- Removes unit productions [2].
- Removes repeated states after removing unit productions.
- Implements the LR(1) lane-tracing algorithm.
- Supports LALR(1) based on the lane-tracing algorithm [3][4].
- Supports LR(0).
- Supports LR(k) for limited cases.
- Allows empty production.
- Allows mid-production actions.
- Allows these directives: %token, %left, %right, %expect, %start, %prec, %union, %type.
- Allows terminal ' ' (space between apostrophes), which is used in, for example, Ruby grammar.
- In case of ambiguous grammars, uses precedence and associativity to resolve
conflicts. When unavoidable conflicts happen, in case of shift/reduce
conflicts the default action is to use shift, in case of reduce/reduce
conflicts the default is to use the production that appears first in a grammar.
- Is backward compatible to yacc and bison in the ways of input file format,
ambiguous grammar handling, error handling and output file format.
- If specified, can generate a graphviz input file for the parsing machine.
- If specified, the generated compiler can record the parsing steps in a file.
- Works together with Lex and Flex.
- Is ANSI C compliant.
- Rich information in debug output.
What's not working and to be implemented:
- Hyacc requires that all the "%%" and "%..." directives start from the first column of the line. This restriction should be removed later.
- Hyacc is not reentrant.
- Hyacc does not support Yacc directive: %nonassoc.
- Does not handle grammars which begin with actions, for example,
program: {ACTIONS} top_compstmt ... (see ruby.y) (Vladimir Lidovski, 10/10/2010)
Example
Here is a yacc input file calc.y for an infix notation calculator, modified from
an example from Bison's online document.
Type the command: hyacc calc.y -v -t -g -D0
-v asks for the generation of y.output
-D0 asks for printing all the relevant information into y.output
-t asks for generating y.parse that contains all the parsing details when the generated parser parses an input.
-g asks for generating y.gviz that can be used as input file to graphviz.
The generated files are: y.tab.c, y.output and y.gviz.
Using graphviz, based on the genereated file y.gviz, this is the graphviz-generated parsing machine.
Now compile y.tab.c with a C compiler, which gives calc.exe.
Use the input file input.txt (don't forget the newline at the end),
type: calc < input.txt
The answer is given in the standard output as:
# 25
#
and a y.parse file is generated, which shows the parsing process for the formular in input.txt.
Download
Hyacc can be compiled and used in Unix, Linux and Windows, or
any system that supports ANSI C.
Documentations (included in the source packages): readme.pdf,
hyaccmanpage
Source package:
Hyacc project page.
Or directly go to the download page.
License
Hyacc source code is released under the GPL license.
The only exception is the parser engine file "hyaccpar", which is released
under the BSD license. This guarantees that the generated parser can be used
in both open source and commercial softwares. A similar situation was
addressed by Richard Stallman in the release notes of Bison 1.23 and 1.24.
History
Releases of Hyacc so far:
Compared to version 0.97, version 0.98 added these changes:
- Added support for Yacc directives: %union, %type.
- Allows terminal ' ' (space between apostrophes), which is used in, for example, Ruby grammar.
- In hyaccpar, added "yyerrlab:" in line 609 to fix a potential bug.
- Various changes in hyaccpar and other files to solve compiler complaints.
- Previous version was compiled under Solaris and Cygwin. Now it compiles under OS X without complaint.
Compared to version 0.95, version 0.97 added these changes:
- Added LR(k) for limited cases.
Compared to version 0.9, version 0.95 added these changes:
- Allows mid-production action.
- Added the LR(1) lane-tracing algorithm.
- Added the LALR(1) algorithm based on lane-tracing.
- Added the LR(0) algorithm.
- Removed a bug in function getTHeads() in y.c.
An example grammar that is affected by the previous bug is
grammar G3 in Example 2 of [3] (page 29).
References
- David Pager, Xin Chen. A New Algorithm To Evaluate Terminal Heads Of Length K. The Second Asia-Pacific Programming Languages and Compilers Workshop (APPLC), Shenzhen, China, February 23, 2013 (In conjunction with CGO'13)
- Xin Chen, David Pager. Extension to the Unit Production Elimination Algorithm. Proceedings of the 2012 International Conference on Software Engineering Research and Practice, p.433-439. Las Vegas, July 2012
- David Pager, Xin Chen. The Lane Table Method Of Constructing LR(1) Parsers. The First Asia-Pacific Programming Languages and Compilers Workshop (APPLC) Beijing, China, June 14, 2012
- Xin Chen, David Pager. The Edge-Pushing LR(k) Algorithm. Proceedings of International Conference on Software Engineering Research and Practice, p.490-495. Las Vegas, July 18-21, 2011
- Xin Chen, David Pager. LR(1) Parser Generator Hyacc. Proceedings of International Conference on Software Engineering Research and Practice, p.471-477. Las Vegas, July 18-21, 2011
- Xin Chen, David Pager. Full LR(1) Parser Generator Hyacc And Study On The Performance of LR(1) Algorithms. Proceedings of The Fourth International C* Conference on Computer Science & Software Engineering, p.83-92. Montreal, Canada, May 16-18, 2011
- Xin Chen. Measuring and extending LR(1) parsing. PhD dissertation. University of Hawaii. August 2009
- Pager, D., A Practical General Method for Constructing LR(k) Parsers. Acta Informatica 7, 249 - 268 (1977)
- Pager, D., Eliminating Unit Productions from LR Parsers. Acta Informatica 9, 31 - 59 (1977)
- Pager, D., The Lane-Tracing Algorithm for Constructing LR(k) Parsers and Ways of Enhancing Its Efficiency. Information Sciences 12, 19 - 42 (1977)
- Pager, D., The lane tracing algorithm for constructing LR(k) parsers. Proceedings of the fifth annual ACM symposium on Theory of computing, 172 - 181 (1973)
Other implementations of [1].
- Implementation in Forthan: LR. By C. Wetherell and A. Shannon (1981)
- Implementation in Objective Caml: Menhir. By François Pottier and Yann Régis-Gianas (2004 ~ 2007)
- Implementation in Python. By Jason Evans (2007)
Contact
Please contact chenx AT hawaii DOT edu for any comments or bug report.
Copyrighted (2007 - 2017) @ X.C. Created on: 8/26/2007,
Last modified: 1/19/2017