
CSM CSCI-400

Intro to Flex/Bison

(Last Mod: 17 December 2013 13:16:52 )



Set-up

Unless you choose to work on a school computer on which Flex and Bison are already installed, you first need to install them on your own computer. The exact steps for installing and running them depend on which operating system you use. The instructions below are for a Windows machine.

First, download Flex and Bison. There are a number of other places you can get them in case either of the following links is not working:

Flex: http://flex.sourceforge.net/

Bison: http://gnuwin32.sourceforge.net/packages/bison.htm

Both Flex and Bison are command line tools. In short, each takes an input file (a *.l file in the case of Flex and a *.y file in the case of Bison) and produces C source code files (*.c and *.h). You then compile these with a compiler of your choice to get the final executable, which is your lexer, parser, interpreter, or compiler (as appropriate).

To run the programs from the directory in which your input files are located, you need to add the directories in which each program's executable is located to your system's PATH environment variable. The details of how you do this vary somewhat from one version of Windows to another and there are usually multiple ways to do it. In Win7, one way is the following:

  1. Bring up the Start Menu.

  2. Choose "Computer" from the right-hand pane.

  3. Select "System Properties" from the toolbar.

  4. Select "Advanced system settings" from the left-hand pane.

  5. Choose "Environment Variables" from the bottom of the dialog box.

  6. Scroll down the "System Variables" list (the bottom list) and select "Path".

  7. Either double-click the "Path" variable or highlight it and choose "Edit".

  8. Hit "End" (on your keyboard) to deselect the current Path string and move the cursor to the end of it.

  9. Add the full paths to the directories in which the Flex and Bison executables are located. Separate directories with a semicolon. Note that the Flex executable is in the "flex" directory while the Bison executable is in the "bison\bin" directory.

  10. A better alternative to the previous step is to add a single batch file directory, such as "C:\bat", to the Path and place batch files in that directory that invoke the programs using explicit paths.

  11. After saving the new Path environment variable, choose "OK" twice to close the dialog hierarchy.

  12. The new Path is available for Console windows that are launched AFTER you complete this process. Any Console windows already open will continue using their local copies of the environment variables.

Each program has many command line options, most of which we do not need. For this class, you should be able to get by with the following command line choices (but feel free to explore the documentation and experiment).

Flex:

flex -L -oFILENAME.c FILENAME.l

where FILENAME is the name of your input file (without the extension). The -L option tells Flex not to insert #line preprocessor directives into the generated code. While these directives can be useful because they let error messages refer to line numbers in your *.l file, some tools apparently have trouble dealing with them. The -o option tells Flex to use the provided output file name instead of the default lex.yy.c. Note that there cannot be a space between the -o and the file name.

Bison:

bison -d FILENAME.y

where FILENAME is the name of your input file. The -d option tells Bison to generate a separate *.h file with the token type definitions and the external variable declarations that will be needed by the lexer program.
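To make the role of that header more concrete, here is a rough hand-written sketch in C of the kind of declarations it contains and how a lexer action uses them. The token names NUMBER, PLUS, and MINUS and the helper make_number_token are hypothetical, chosen only for illustration; the real header is produced by Bison from the %token declarations in your *.y file, and its exact contents depend on the Bison version.

```c
#include <stdlib.h>

/* Roughly what "bison -d" writes into the generated header.
   NUMBER, PLUS and MINUS are hypothetical token names that would come
   from %token declarations in the *.y file. */
typedef int YYSTYPE;          /* Bison's default semantic value type */

enum yytokentype {
    NUMBER = 258,             /* user tokens are typically numbered from 258 */
    PLUS   = 259,
    MINUS  = 260
};

extern YYSTYPE yylval;        /* declared in the header, shared with the lexer */

/* In a real project the Bison-generated *.c file defines yylval. It is
   defined here only so that this sketch links on its own. */
YYSTYPE yylval;

/* What a Flex action for a number would do: store the semantic value in
   yylval and return the token code. make_number_token is a hypothetical
   helper, not part of Flex or Bison. */
int make_number_token(const char *text)
{
    yylval = atoi(text);
    return NUMBER;
}
```

In an actual *.l file you would include the generated header and write the body of make_number_token directly in the action for your number pattern, passing yytext as the text.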

Batch file:

A useful batch file (Windows) that can run both of the above programs on your input files is simply:

flex -L -o%1.c %1.l

bison -d %1.y

 

If these two lines are in a batch file named, for instance, biflex.bat, then you can run both Flex and Bison on your input files by simply typing:

biflex FILENAME

You do not need to worry if one of the files (the *.y, for instance) is missing. You will simply get a benign error message from the program.

 

Somewhat useful documentation for Flex and pretty decent documentation for Bison can be found here:

http://dinosaur.compilertools.net/flex/index.html

http://dinosaur.compilertools.net/bison/index.html

In particular, you should read the following section, which reinforces and expands on the material presented in class.

http://dinosaur.compilertools.net/bison/bison_4.html

Next, you'll work through some fairly simple warm-up exercises for both Flex and Bison and then implement a calculator using the grammar that you have been working with in your C programs.


Flex Exercise Files

Download the Flex Exercise Zip file.


Flex Exercise: Pascal

This exercise is primarily to let you verify that your installation of Flex is working properly and to let you play around a bit with a working *.l file and the resulting lexer. Run flex on pascal.l and compile the resulting C file.

There is nothing to turn in from this exercise.


Flex Exercise: Words

This exercise lets you get some easy practice modifying a *.l file to extend its functionality. The basic exercise is based on lexing sentences into parts of speech using a very small vocabulary. You will extend this slightly and also add some things that are more directly related to the next assignment.

Run flex on words.l and compile the resulting C file. Play around with it enough to be comfortable with what it is doing and with how the contents of the *.l file produce that behavior. Next, extend the vocabulary by adding three new parts of speech, namely adjective, conjunction, and adverb. Include five words for each part of speech.
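If it helps to see the idea outside of Flex, the following is a minimal hand-written C sketch of classifying a word by looking it up in a small vocabulary. It is not the contents of words.l; the vocabulary, the type PartOfSpeech, and the functions classify and pos_name are all assumptions made up for this illustration. In a *.l file, each part of speech would instead be a pattern listing its words, with an action that reports the category.

```c
#include <string.h>

/* Hypothetical categories; the exercise adds conjunction and adverb too. */
typedef enum { UNKNOWN, VERB, NOUN, ADJECTIVE } PartOfSpeech;

struct entry {
    const char  *word;
    PartOfSpeech pos;
};

/* A made-up vocabulary, analogous to the word lists in a *.l pattern. */
static const struct entry vocabulary[] = {
    { "is",    VERB },      { "run",  VERB },
    { "dog",   NOUN },      { "cat",  NOUN },
    { "quick", ADJECTIVE }, { "lazy", ADJECTIVE }
};

/* Return the part of speech of word, or UNKNOWN if it is not listed. */
PartOfSpeech classify(const char *word)
{
    size_t n = sizeof vocabulary / sizeof vocabulary[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(word, vocabulary[i].word) == 0)
            return vocabulary[i].pos;
    }
    return UNKNOWN;
}

/* Human-readable name for a category, for printing results. */
const char *pos_name(PartOfSpeech pos)
{
    switch (pos) {
    case VERB:      return "verb";
    case NOUN:      return "noun";
    case ADJECTIVE: return "adjective";
    default:        return "unknown";
    }
}
```

Adding a new part of speech in this sketch means adding an enumerator, a few table entries, and a name. In Flex the corresponding change is one new pattern and action.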

There is nothing to turn in from this exercise.


Flex Homework: Complex (REVISED)

Develop a Flex input file named complex.l that recognizes real, imaginary, and complex numbers and the addition and subtraction operators applied to them (i.e., the two operators + and -). This is not a trivial undertaking. For instance, should -4+5i be recognized as a single complex number, as the addition of a real number and an imaginary number, or perhaps as a unary minus sign operating on a real number that is then added to an imaginary number? You have some flexibility and can choose to do this a number of ways as long as the results are mathematically valid. You should learn some of the subtleties of regular expressions in the process.

Your program should continue processing lines of input from the console until Ctrl-Z is entered (the same way that pascal.l did).

Remember that, at this point, you are only writing a lexer, not a parser. So your output should be a list of tokens and lexemes in a manner similar to the Pascal exercise. You also do not need to distinguish between unary and binary operators as this is something best left to a parser.
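As one illustration of the "leave it to the parser" end of the spectrum, here is a hand-written C sketch (not Flex input, and not the required solution) that splits input into sign, real, and imaginary tokens and never tries to recognize a whole complex number. The token names, the function next_token, and the choice to write imaginary numbers with a trailing i are assumptions for this sketch; exponents and a bare i are deliberately not handled.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes for this sketch. */
typedef enum { TOK_END, TOK_REAL, TOK_IMAG, TOK_PLUS, TOK_MINUS, TOK_ERROR } Token;

/* Append c to lexeme if there is room, always leaving space for '\0'. */
static void put(char *lexeme, size_t cap, size_t *n, char c)
{
    if (*n + 1 < cap)
        lexeme[(*n)++] = c;
}

/* Scan one token starting at *src, copy its text into lexeme (capacity
   cap, which must be at least 1), and advance *src past the token. */
Token next_token(const char **src, char *lexeme, size_t cap)
{
    const char *p = *src;
    size_t n = 0;
    Token tok;

    while (*p == ' ' || *p == '\t')          /* skip blanks */
        p++;

    if (*p == '\0' || *p == '\n') {
        tok = TOK_END;
    } else if (*p == '+' || *p == '-') {     /* a sign is always its own token */
        tok = (*p == '+') ? TOK_PLUS : TOK_MINUS;
        put(lexeme, cap, &n, *p++);
    } else if (isdigit((unsigned char)*p) || *p == '.') {
        int digits = 0;
        while (isdigit((unsigned char)*p)) { put(lexeme, cap, &n, *p++); digits++; }
        if (*p == '.') {
            put(lexeme, cap, &n, *p++);
            while (isdigit((unsigned char)*p)) { put(lexeme, cap, &n, *p++); digits++; }
        }
        if (digits == 0) {
            tok = TOK_ERROR;                 /* a lone '.' is not a number */
        } else if (*p == 'i') {              /* trailing i marks an imaginary */
            put(lexeme, cap, &n, *p++);
            tok = TOK_IMAG;
        } else {
            tok = TOK_REAL;
        }
    } else {
        put(lexeme, cap, &n, *p++);          /* anything else is an error */
        tok = TOK_ERROR;
    }

    lexeme[n] = '\0';
    *src = p;
    return tok;
}
```

Calling next_token repeatedly on the input -4+5i yields TOK_MINUS, TOK_REAL (lexeme 4), TOK_PLUS, TOK_IMAG (lexeme 5i), and then TOK_END. Whether the leading minus is unary, and whether the four tokens together form one complex number, is left entirely to parser rules under this design.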

ADDED:

HOWEVER, if you are shifting some of the task of identifying a complex number itself onto the parser, be sure to describe the parser rules that would be needed to work on your tokens to finish this off.

You will probably discover that this task has been far more complicated than it looked at first glance. In fact, it is possible that you have been unable to come up with a workable solution. Describe why this task is more difficult than it first appears.

Few, if any, languages that support complex numbers do so using the syntax you have been asked to work with. Choose a syntax -- and feel free to use the syntax of a language that intrinsically supports complex numbers -- that makes the task easier. Be sure to think about the complicating issues; for instance, if you choose to identify a complex number by requiring that it be surrounded by angle brackets, will this cause problems when your lexer tries to deal with relational operators?
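As one point of comparison, C99 has no special token for a complex number at all. The following small sketch shows how a value such as -4+5i is written there; the function name example_value is made up for this illustration. The name I is a macro supplied by <complex.h>, so a lexer for this syntax only sees real constants, an identifier, and ordinary operators, and all of the "complex" work happens in expression evaluation.

```c
#include <complex.h>

/* In C99, I is a macro from <complex.h>, not a literal suffix.
   The expression below is lexed as: - 4.0 + 5.0 * I
   i.e. two real constants, one identifier, and three operators. */
double complex example_value(void)
{
    return -4.0 + 5.0 * I;
}
```

The real and imaginary parts of the result can be read back with creal and cimag from the same header. This is only one possible answer to the question above; other languages use a suffix on the imaginary literal or a constructor-style notation, each with its own consequences for the lexer.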

As before, you have the flexibility to partition the task between the lexer and the parser, but if you rely on the parser, describe the parser rules that would be needed.

 

 

For due date, point value, submission procedures, and grading rubric, see Blackboard.