FAQ: PP 1: Lexer
Questions:
- What are these "profiling: ... Merge mismatch for summaries" messages?
- Do comments require any special flex directives?
- Do I have to update yylloc in each rule?
- What are start conditions? Should I use them for this assignment?
- What happens if a newline occurs inside a string constant?
- What does the T_Dims token represent?
- What are nested comments and do we support them in Decaf?
- Is it okay to compare dcc outputs with another programming team to see if we are lexing correctly / missed any errors?
- Should I pass a preceding '-' to strtol/strtod?
Answers:
- What are these "profiling: ... Merge mismatch for summaries" messages?
This seems to occur when the coverage instrumentation changes (e.g. you modify and recompile a file) while the data generated by past test runs, which still refers to the old line numbers, is present. The profiler then complains about "mismatches."
The fix? Run make clean, or try deleting the data files with rm -f src/*.gcda
- Do comments require any special flex directives?
No, multiline comments can be implemented with a simple regular expression. Our solution looks something like:
{CCommentStart}{CCommentMiddle}{CCommentEnd}
with the three parts above representing defined regular expressions. Although you can use fancier flex functionality, you don't need to, and it is probably more trouble than it's worth.
- Do I have to update yylloc in each rule?
Yes, you do, in order to maintain location information for your lexer. However, you should check out do_before_each_action to make your life easier.
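To give a rough idea of what a per-token location update involves, here is a minimal sketch in C. The Location and Cursor types and the advance_location function are hypothetical names made up for this illustration, not part of the starter code. In a real flex scanner, a hook of this kind would be handed yytext and yyleng and would write into yylloc; flex runs the YY_USER_ACTION macro before each rule's action, which is one way such a hook can be wired in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the scanner's location record (yylloc). */
typedef struct {
    int first_line, first_column;
    int last_line, last_column;
} Location;

/* Position of the next unread character, 1-based (an assumption). */
typedef struct {
    int line, column;
} Cursor;

/* Record where the matched text starts and ends, then move the cursor
 * past it. Newlines inside the match bump the line and reset the column. */
void advance_location(Location *loc, Cursor *cur, const char *text, size_t len)
{
    assert(loc != NULL && cur != NULL && text != NULL);

    loc->first_line = cur->line;
    loc->first_column = cur->column;
    loc->last_line = cur->line;
    loc->last_column = cur->column;

    for (size_t i = 0; i < len; i++) {
        /* The last position is that of the final character consumed. */
        loc->last_line = cur->line;
        loc->last_column = cur->column;
        if (text[i] == '\n') {
            cur->line++;
            cur->column = 1;
        } else {
            cur->column++;
        }
    }
}
```

Because the update happens in one place, individual rules do not need to touch the location at all.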
- What are start conditions? Should I use them for this assignment?
Start conditions are a way of preserving state in flex. They are not necessary for the first assignment, and we encourage you to use only regular expressions.
- What happens if a newline occurs inside a string constant? What is the expected behavior after we generate an error?
You should treat a string constant that contains a newline or EOF as if the programmer forgot to close the string constant. In other words, report the error, consume the input up to the newline, and resume lexing as if the string had been properly closed. You should not return malformed string constants. The lexer should know that it is at the end of a string constant when it encounters a '"', '\n', or EOF, and it should report an error in the case of the latter two.
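The stopping rule described above can be sketched in plain C. This is only an illustration of the logic, not the recommended implementation (in the assignment you would express it with regular expressions). The names scan_string and StringStatus are hypothetical, a terminating '\0' stands in for EOF, and the sketch assumes the newline itself is left unconsumed.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { STRING_OK, STRING_UNTERMINATED } StringStatus;

/* s must point at the opening double quote. On return, *consumed holds
 * the number of characters belonging to the (possibly malformed) string. */
StringStatus scan_string(const char *s, size_t *consumed)
{
    assert(s != NULL && consumed != NULL && s[0] == '"');

    size_t i = 1;
    /* Stop at the closing quote, a newline, or end of input. */
    while (s[i] != '"' && s[i] != '\n' && s[i] != '\0')
        i++;

    if (s[i] == '"') {
        *consumed = i + 1;          /* include the closing quote */
        return STRING_OK;
    }
    *consumed = i;                  /* newline or end of input is left unread */
    return STRING_UNTERMINATED;     /* caller reports the error, returns no token */
}
```

In the unterminated case the caller would report the error and continue scanning from the newline, without handing a string token to the parser.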
- What does the T_Dims token represent?
T_Dims represents the lexeme "[]" (but not the "[.+]" lexeme).
- What are nested comments and do we support them in Decaf?
Nested comments consist of placing one /**/ comment inside another, as in:
/* this /* is /* a */ nested */ comment */
Decaf does not support them. The above comment would terminate with the first "*/" the lexer encountered, and lex each subsequent "*/" as the operator tokens "*" and "/".
- Is it okay to compare dcc outputs with another programming team to see if we are lexing correctly / missed any errors?
Please see the assignment guidelines handout for details. To summarize, yes, you may share test cases. However, you may not discuss how to solve problems presented by any test cases, nor may you share any source code beyond the test cases.
- Should I pass a preceding '-' to strtol/strtod?
No. According to the Decaf specification provided in the assignment 1 handout, integer constants consist of strings of digits (hexadecimal ones prefixed by "0x"). Double constants may additionally contain a decimal point and an exponent. They cannot contain the '-' character. Rather, the '-' unary operator negates the value of the number constant.
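As a small sketch of what the conversion might look like, the helpers below take an unsigned lexeme and call strtol or strtod on it. The function names are hypothetical, and accepting an uppercase "0X" prefix is an assumption made here for illustration. The base is chosen explicitly because passing base 0 to strtol would read a leading "0" as octal.

```c
#include <assert.h>
#include <stdlib.h>

/* Convert an integer lexeme (decimal digits, or hex with a 0x prefix). */
long int_constant_value(const char *lexeme)
{
    assert(lexeme != NULL && lexeme[0] != '-');   /* sign is never part of the lexeme */

    int is_hex = lexeme[0] == '0' && (lexeme[1] == 'x' || lexeme[1] == 'X');
    /* With base 16, strtol accepts the optional 0x/0X prefix itself. */
    return strtol(lexeme, NULL, is_hex ? 16 : 10);
}

/* Convert a double lexeme (digits, decimal point, optional exponent). */
double double_constant_value(const char *lexeme)
{
    assert(lexeme != NULL && lexeme[0] != '-');
    return strtod(lexeme, NULL);
}
```

For an input such as -7, the lexer would produce a '-' operator token followed by the integer constant 7, and the negation is applied later to the constant's value.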