This page is an accumulation of (we hope) helpful notes, hints, etc., that might be useful to you in doing Project #2.
The static types of the expressions E1 and E2, E1 or E2, and E1 if B else E2 are all the least upper bounds of the types of E1 and E2. The full definition of this is a little tedious, especially when function types get involved, so we'll test only a simplified version of T=lub(T1,T2):
Consider an assignment with multiple targets, such as
x::int, y::bool = 2, 3You could, of course, notice that 3 will be assigned to a bool variable, and issue an error message. On the other hand, consider
L = [2, 3] x::int, y::bool = Lwhich is really the same case, semantically. But here, you would generally not expect a compiler to catch the error.
In the view of the fact that you can only do a partial job anyway, I'm not going to worry about these cases (feel free to check if you want, just don't expect us to test for it). To be specific, when faced with multiple targets, just pretend that the right-hand side has type any. You still have to check that the targets are all assignable.
AST_Node* afunc (...) { char name[10] = "__init__"; AST_Node* token = make_token(IDENTIFIER, init_name); ... token eventually returned from afunc, possibly in a tree. }The problem is that name here is a local variable, whose lifetime is the duration of one call to afunc. Thus, the token created will be stuck pointing to garbage when afunc returns.
The usual way to initialize instance variables to a simple value is with an initializer, as in
class Foo : public Bar { public: Foo (int x) : Bar (x), foo_var (NULL) { } ... private: Foo* foo_var; };
aNode->append (MAKE_TREE (...));(see the section Useful macros (3) for a description of MAKE_TREE). If I replace this a child of aNode, the framework will release aNode's claim on the old child and take over the claim on the new one:
aNode->replace (i, MAKE_TREE (...));The previous value of aNode->child(i) is unref'ed if it differs from the new value. When the number of claims on a node reaches 0, the node is released. Thus, the following code is dangerous:
return MAKE_TREE (NEW, child(0), etc.);where my caller is doing something like:
replace (i, child (i)->convert_new ());Here, I am the old value of child(i) in my caller. The replace method will unref me, which will most likely cause me to be deleted. The destructor for AST_Node will then release my children, including the one I just inserted into the node I returned. Thus, this new node will contain garbage. Typically, the dangling reference thus created does not show itself until much later, in the form of a segfault, a "bad object" message, an abort, or just plain inexplicable spewing of garbage output. The proper way to do this is:
return MAKE_TREE (NEW, ref (child(0)), etc.);thus making sure that when I am deleted, my former, and still useful, child is not.
(gdb) set print object onwill cause GDB thenceforth to print the dynamic type of pointers that you print out, so that, e.g.,
(gdb) p declwill print out
$3 = (InstanceDecl *) 0x8102c30rather than
$3 = (Decl *) 0x8102c30and printing out *$3 will show the fields of an InstanceDecl
I suggest also using the command
(gdb) set print static offto avoid seeing static fields in every object you print.
(gdb) p tree->child (0)Some things don't work too well, such as passing reference parameters (you have to remember to pass their addresses), but integer and pointer arguments should work.
One good way to exploit this feature is to add functions to your program that are never used, except by the debugger. You might have a method for dumping information about a Decl or AST_Node on the standard error, so that you can simply write
(gdb) call dbg_decl(aDecl)and get a readable printout. You might provide functions to print C++ strings, vectors, or dictionaries. Template methods or functions can't be called directly in the debugger, but you can provide some debugging functions that call useful instances.
It is best not to rely on overloading for these debugging methods. Give each a distinct name.
for_each_child (c, aTree) { aTree->replace (c_i_, new_value); } end_for;(c_i_ gets defined by the macro as the index of the current child.) However, if you are adventurous, you might play around with my extension:
/** Control structure: * For each child, VAR, of AST_Node* VAR, ... * replacing the child with the value of VAR at the end of each iteration. * * Usage: * for_each_child_var (C, aTree) { * <operations involving C (an AST_Node*), including assignments> * } end_for; */ #define for_each_child_var(Var, Node) \ do { \ AST_Node* Var ## _n_ = Node; \ AST_Node* Var = NULL; \ for (int Var ## _i_ = 0; \ Var ## _i_ < (int) Var ## _n_->arity (); \ _replace_and_incr (Var ## _n_, Var ## _i_, Var)) { \ Var = Var ## _n_->child (Var ## _i_); /** Auxiliary function used by the for_each_child_var */ static inline int _replace_and_incr (AST_Node* node, int& k, AST_Node* new_child) { int n = new_child == NULL ? 0 : new_child->is_list () ? new_child->arity () : 1; node->replace (k, new_child); return k+=n; }As always, don't use it if you don't understand it.
class Child_Node : public Parent_Node { ... protected: /** Create Child_Node with operator OPER and initial capacity N. */ Child_Node (AST_Node* oper, size_t n) : Parent_Node (oper, n) { } /** Create and register exemplar. */ Child_Node (int syntax) : Parent_Node (syntax) { } /** Override make (in the exemplars) to create a Child_Node. */ Child_Node* make (AST_Node* oper, size_t n); static const Child_Node exemplar; }; Child_Node Child_Node::exemplar (some syntactic category);If you're defining a type that's intended only as a common supertype for a bunch of nodes (and will never be explicitly allocated itself), then you don't need the the definition of make and of exemplar.
Well, this is all pretty tedious, and it looks the same for each type. So why not refactor as a macro? For example, I use the following:
#define STD_BASE_CONSTRUCTORS(Type, Supertype) \ protected: \ \ Type (AST_Node* oper, size_t n) : Supertype (oper, n) { \ } \ \ Type (int syntax) : Supertype (syntax) { } #define STD_CONSTRUCTORS(Type, Supertype) \ STD_BASE_CONSTRUCTORS (Type, Supertype) \ \ Type* make (AST_Node* oper, size_t n) { \ return new Type (oper, n); \ } \ \ static const Type exemplar #define STD_EXEMPLAR(Type, Syntax) const Type Type::exemplar (Syntax)So that now I can write
class Child_Node : public Parent_Node { ... protected: STD_CONSTRUCTORS (Child_Node, Parent_Node); }; STD_EXEMPLAR (Child_Node, some syntactic category);
Of course, this definition doesn't allow for creating constructors with initializers (see the section " No default initialization of local variables and instance variables"), as in
Child_Node (AST_Node* oper, size_t n) : Parent_Node (oper, n), my_info (NULL) { } private: Info* my_info;You'll have to work out appropriate generalizations yourselves.
MAKE_TREE (DEF, name, params, return_type, body)and have it do the right thing. The problem is in allowing a call in which the number of parameters is arbitrary, because in C++ and C, there is no way to tell how many parameters you've been passed (if your signature has a trailing "..."). Basically, I copied stuff from the Horn framework to do this, and declared a function in ast.h:
/** make_tree(SYNTAX, c1, ..., cn, NULL) creates a tree whose operator * has syntax SYNTAX and whose children are C1 to Cn. This is * essentially taken from the Horn parser framework. */ extern AST_Tree* make_tree (int syntax, ...);But how to get the NULL at the end? I can ask the programmer to do it, but that's a little error-prone and unsightly. A variadic macro serves to do the job:
/** MAKE_TREE(SYNTAX, c1, ..., cn) creates a tree whose operator * has syntax SYNTAX and whose children are C1 to Cn. This is * essentially taken from the Horn parser framework. */ #define MAKE_TREE(...) ::make_tree (__VA_ARGS__, NULL)
If you have code like this in your program:
if (c->oper ()->as_string () == "lambda")) { do something to c... }you are on the wrong track, at least stylistically. One almost never does direct type tests (instanceof in Java) or their moral equivalent (as above) in properly constructed object-oriented code. Instead, behavior that depends on the type of an object corresponds to a virtual method. For example, you might define in the AST_Node class:
virtual void munge () { for_each_child (c, this) { c->munge (); } end_for; }—so that, by default, munging an AST node simply munges its children— and then define
class Lambda_Node : public AST_Tree { ... void munge () { Special munging for lambda nodes AST_Tree::munge (); /* C++ version of super.munge (). */ } ... };so that munging a Lambda_Node does some special work first and then does the default of munging its children. What if you want an effect like this:
void munge2 () { AST_Node* myParent = get my parent node somehow; if (myParent->oper ()->as_string () == "def")) { mungify me... } else { glorpify me... } }? In that case, your code is most likely calling out to be structured differently, such as:
void munge2 () { mungify me... } void glorp () { glorpify me... }Where munge2 and glorp are both virtual functions. It is then your parent node's job to decide whether it really wants to munge or glorp you. If there is normally no difference between the two, your top-level definitions (in AST_Node) might read:
virtual void munge2 (); virtual void glorp ();with implementations
void AST_Node::munge2 () { default definition; } void AST_Node::glorp () { munge2 (); }
void AST_Tree::print (ostream& out, int indent) { out << "(" << ast_name () << " " << line_number (); for_each_child (c, this) { out << endl << setw(indent+2) << ""; c->print (out, indent+2); } end_for; out << ")"; }and later, in class Id_Node:
void print (ostream& out, int indent) { out << "(id " << line_number () << " "; child(0)->print (out, indent+2); if (get_decl () != NULL) out << " " << get_decl ()->get_index (); out << ")"; }and then, in class Type_Node:
void print (ostream& out, int indent) { out << "(type " << line_number () << " "; child(0)->print (out, indent+2); out << ")"; }But these are all variations on a theme. So I defined a (non-virtual) protected method, AST_Tree::print_prefix, to catch the common beginning parts:
void AST_Tree::print_prefix (ostream& out, int indent, bool compact) { out << "(" << ast_name () << " " << line_number (); for_each_child (c, this) { if (compact) out << " "; else out << endl << setw(indent+2) << ""; c->print (out, indent+2); } end_for; }so that now, my print methods read:
void AST_Tree::print (ostream& out, int indent) { print_prefix (out, indent, false); out << ")"; } // In Id_Node: void print (ostream& out, int indent) { print_prefix (out, indent, true); if (get_decl () != NULL) out << " " << get_decl ()->get_index (); out << ")"; } // In Type_Node: void print (ostream& out, int indent) { print_prefix (out, indent, true); out << ")"; }
string name = an_id_node->child(0)->as_string ();—in other words, I wanted the name of an identifier stored in an Id_Node. This code is inside a method other than Id_Node, and yet it has to know something about the object structure of Id_Nodes. Of course, Id_Nodes can occur in places that also allow Typed_Id_Nodes (as in (typed_id N (id N NAME))), in which case, I would have to write
string name = an_id_node->child(0)->child(0)->as_string ();and (shudder) choose between one or the other depending on the dynamic type of an_id_node. I had violated the Law of Demeter, which tells me that I should have limited knowledge of how objects not closely related to me represent the information they contain.
In this case, we can fix the problem by overriding as_string on Id_Node and Typed_Id_Node:
// In Id_Node: string as_string () { return child (0)->as_string (); } // In Type_Id_Node: string as_string () { return child (0)->as_string (); } // my child (0) is an Id_Node!Now I can rewrite my original code can handle both cases clearly:
string name = an_id_node->as_string ();
//add a new decl current_index_list->push_back(global_decls.size()); ... } else { //add a new ref instead of decl (if we have "x=5\nx=6") the second x is a ref ... } } //check right side to see if its assign. // if not assign, send to frame reference createrThere are actually three problems here: