GENCPP Project
Final Report
by
Bo Hu
Project Course 97su591
Instructor: Dr. R J Lechner
Computer Science Department
University of Massachusetts Lowell
Aug 28, 1997
Rev. 2, Nov. 11 1997 - RJL
Contents
1.2 Advantages of CHGEN and GENDB
1.3 Disadvantages of CHGEN and GENDB
4.1. class designs and function specifications:
4.1.1 RowRef<TT> template class
4.1.2 Table<TT> template class
4.1.3 AA class ---- student class
4.1.4 BB class ---- course class
4.2.1 void clear_mem(char*, init)
4.2.2 void hcg_parse(char *s, char *w, int * index)
4.2.3 void hcg_read_next(char * hcg_buffer, FILE * fp)
4.2.5 template <class TT> TT * lookup( Table<TT> * mtable, hcg_key fkey)
4.3. Current State of the target code:
4.4. Suggestions for future enhancement:
5.1 Overview of the C++ code generator
5.2 The schema file for chgen v10
5.4 gencpp functions and their specification:
5.4.1 FILE* gen_open(struct TT *, int )
5.4.2 char * trim_space(char *)
5.4.3 void gen_TTdatamember(FILE*, struct TA *);
5.4.4 void gen_TTinterface(FILE* , struct TA *, char *, char * );
I. Introduction
A short history of CHGEN
CHGEN was developed over several years at the University of Massachusetts - Lowell. The name CHGEN stands for "C" and "H" code generator. Its purpose was to aid programmers in developing database code for an ongoing CASE tool project within the Computer Science Department's Software Engineering course sequence. It has been enhanced and reused in many follow-on projects. Its data modeling conventions enhance the readability and consistency of analysis models and code, motivate project teams to adopt standardized names, and provide semi-automatic code generation to bridge the gap from requirements and design to implementation.
CHGEN parses an Extended Entity-Relationship Model or relational schema in First Normal Form, and generates a library of C code and a header file that declares a memory-resident single-user database. This provides the means to make data persistent and to maintain and navigate its linked-list representations of associations among its records.
During 1992, Craig and Steve Smith developed the first complete version (5) of the CHGEN code generator. Follow-on projects enhanced CHGEN in various ways. The current CHGEN version (10) includes compressed keys and set/get macros to encapsulate data attributes and support logging and replay of update histories as an aid to application debugging, rollback and recovery. Several recent projects tried to extend CHGEN to generate C++ classes to support object-oriented databases with methods as well as data attributes.
In the Spring of 1997, Jiang and Hu began a complete redesign of CHGEN. The purpose of this project was to provide a solid base for further implementation. The goal of this project was to convert an arbitrary Extended Entity-Relationship-Attribute (EERA) database schema into an OODB (Object Oriented DataBase) with a software support library of automatically generated code in the C++ language. They tried to provide compatibility with CHGEN's API (application programmer interface) but eliminate its workarounds due to CHGEN's lack of inheritance and polymorphism.
Hu continued this project in the summer of 1997. The goal was to write a demonstration test case that can be compiled and executed and, based on this example, write a simple code generator that generates the required class definitions and access methods.
1.2 Advantages of CHGEN and GENDB
The biggest advantage of CHGEN is that it supports both a persistent relational database and an internal Virtual Memory-resident object-oriented database. CHGEN translates a normalized relational data model (schema.sch) into (1) a schema-specific header file (schema.h) which declares all data structures and macros and (2) a program library (pr_util.a) that can be linked to any schema-compatible database application. The application's runtime environment is a virtual-memory-resident object-oriented (network) database called VMNetDB. The persistent database can be saved by the routine pr_dump, and later reloaded into virtual memory by the routine pr_load.
The programmer has direct access to memory-resident data after loading into VMNetDB. No intermediate routines need to be called to work with the data, so performance is not sacrificed. Table-loop and child-loop macros support traversal of 1-to-many relations (e.g., from aggregates to components). Pr_set and pr_get macros are provided to encapsulate table attributes and support update logging and playback or transaction rollback facilities. C-language 'struct' data types represent records in each table type.
The data manipulation routines are customized (by CHGEN itself) to manage the internal dynamic links between related data and update all the list pointers. For example, when a record is deleted, the pr_delete routine invokes a custom block of code to delete that record and update all related records. This code block specifies exactly how to unlink this record from all the linked lists that represent its relationships.
CHGEN creates super- and sub-class records on demand. It simulates data inheritance through the use of "access macros". Access macros are generated for all unique and acyclic paths through the schema diagram. CHGEN enumerates these paths upward from child to parent tables, including component to container links and subclass to superclass links. This provides a form of attribute inheritance. A programmer who holds a reference to a descendant record instance or object can refer to an attribute of an ancestor object by simply placing the access macro between the descendant record reference and the ancestor's attribute name. The macro expands to a chain of pointers which traverses the path up to that ancestor.
The advantage of using access macros is that they provide the programmer with a limited form of schema independence. If tables are inserted or deleted along any path, CHGEN can be rerun to redefine the access macros; this reduces the fraction of application code that has to be changed.
CHGEN provides the convenience of the relational model along with the performance advantages of object-oriented (memory-resident network) databases. The programmer can modify the data model or schema on-the-fly and recompile the application. Child tables can access attributes from parent tables either implicitly or explicitly over specified paths. Consistent naming conventions simplify the maintenance of diverse database applications that use a CHGEN-based library because these conventions apply to all data models.
1.3 Disadvantages of CHGEN and GENDB
There is a significant learning curve before programmers can use CHGEN. Programmers need to learn the general rules governing the source code and data structures defined by CHGEN before they can write useful and efficient programs. The overall effect is that developing and debugging applications becomes difficult until the programmers gain experience with the use of CHGEN libraries.
A minor problem stems from the fact that programmers are required to learn and conform to CHGEN's systematic way of declaring a complex memory-resident network equivalent to a relational data model, rather than invent their own. But in multi-programmer projects, this problem turns into an advantage: Rather than remembering many different uses of arbitrarily-named variables, the programmer can refer to a detailed ERA diagram to identify data attributes and access paths. The data model shows all the ways by which data should be accessed from anywhere in the program, either directly or indirectly through access macros.
As with any generic support library, CHGEN's ability to support any application data model makes it difficult to enhance and maintain CHGEN itself. A future goal is to boot-strap CHGEN itself, as a translator application based on the meta-database implicit in the schema definition file.
Maintenance and enhancement of CHGEN and the code it generates require detailed knowledge of its data representation. Any changes in this data representation tend to affect other parts of a program, leading to cascading changes that increase the probability of introducing new bugs. For example, a naive programmer could simply change primary keys of records behind the system's back, rather than respect safe programming rules.
II Overview of GENCPP Project
GENCPP represents an attempt to design a completely new C++ implementation of the functionality originally provided by CHGEN.
The goal of GENCPP is to convert any normalized relational data model into an object-oriented database support library whose functionality is based on classes, patterns and templates. C++ was chosen as the target language; other candidates will be considered later. Relational tables will be encapsulated into generic template-based containers and record-type-specific classes. GENCPP's class-based interface will simplify application programming since C++ inheritance will replace (some) access macros and iterators will replace looping macros. To the extent possible, GENCPP will try to maintain consistency with the look and feel of the CHGEN API naming and navigation conventions and maintain or improve CHGEN performance.
This project was divided into two phases:
The first phase is to design and demonstrate a way to represent the relational database and its network of relationships as classes and objects. This phase includes writing typical target code as a test case for the GENCPP code generator. The target code completed so far compiles and works. It declares container and item classes for two table types and one 1-to-many parent-to-child linked list, but without key compression, iterators or persistence (pr_load and pr_dump) methods.
The second phase is guided by the first phase. If the first phase works as a test case, it should be easy to write a code generator which will parse the schema file and generate the target code, including all class data members and methods. This phase has begun: a simple code generator was written which generates class definitions and attribute access methods. These provide a basis for implementing further target code examples.
3.1 Functionality
a) GENCPP shall maintain data compatibility with persistent I/O data and meta-data files and the internal network architecture supported by CHGEN. This means that each schema-defined table will give rise to a table container class and a table row class with private data members. The row class with its data members replaces the struct type in which CHGEN declares a table row. The 'container class' replaces CHGEN's table or list structure type which links all rows or instances belonging to that table.
b) An item class corresponding to each relational tuple or 'flat' record type shall provide a standardized interface for the programmer to create objects (i.e., instances of the class), to access and update its data members, and to follow its is-part-of relationship links. (The C++ language itself supports inheritance paths for is-a links.)
c) A template-based container class corresponding to each record type shall provide methods to add or delete any of its component object instances, and to save its entire content to, and reload it from, a persistent database. Initially, all containers will be simple lists. Other container types will be added later.
d) Each component item class shall provide a method to load a persistent object of its type from a data file into its proper container and link it into the appropriate relational linked lists in VMNetDB.
e) Each component class shall provide a method to dump any object instance of its type from VMNetDB into a persistent data file.
f) The dump capability will support two options:
g) A table class shall provide an efficient and type-safe method to search through all of its component items and find one or more instances that match in value any of its data members.
h) Each class with one or more parent roles shall provide a method for an object instance to sequentially access its set of children along each downward 1 to M relation link. This will replace the child-loop macros of CHGEN. A template-based iterator class is the desired method to accomplish this.
i) Each class which has one or more child roles shall provide a method for any object instance to access its unique parent along each upward M to 1 relation link, which will replace the parent-pointer TTid_pp of CHGEN.
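Requirement h) asks for a template-based iterator to replace CHGEN's child-loop macros. The following is only a minimal sketch of what such an iterator could look like. The names RowRefNode and ChildIter are hypothetical stand-ins and do not appear in the target code; the node is a simplified version of the RowRef<TT> class described in section 4.1.1.

```cpp
#include <cstddef>

// Hypothetical, simplified list node modeled on RowRef<TT>.
template <class TT>
struct RowRefNode {
    TT *currObj;                 // the child object referenced by this node
    RowRefNode<TT> *nextRowRef;  // next node in the child list, or NULL
};

// Hypothetical iterator over the child list of one parent object.
template <class TT>
class ChildIter {
public:
    explicit ChildIter(RowRefNode<TT> *first) : pos_(first) {}
    bool done() const { return pos_ == NULL; }               // past the last child?
    TT *curr() const { return pos_ ? pos_->currObj : NULL; } // current child or NULL
    void next() { if (pos_) pos_ = pos_->nextRowRef; }       // advance one child
private:
    RowRefNode<TT> *pos_;
};
```

A loop such as `for (ChildIter<BB> it(first); !it.done(); it.next())` would then play the role of a child-loop macro, with `it.curr()` giving the current BB child.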
3.2 Deliverables from project 97sugencpp:
A working example of GENCPP target code is located in $CASE/97s523/gencpp/bhu/demobak.
A simple code generator is located in $CASE/97s523/gencpp/bhu/codegenerator. It generates class definition and data member access functions but not parent-to-child iterators.
IV Target code
4.1. Class Designs and Function Specifications:
This chapter describes the target code for classes which should be generated by GENCPP. It also provides the specifications of their member functions. The implementation and application of each member function will be discussed in detail in later sections.
A student registration example is used to illustrate the functionality and applicability of GENCPP. Its data model contains one student table (type AA) and one roster table (type BB), each of whose records contains a course name and a reference to a student enrolled in that course. There is one 1-to-many parent-to-child relation from AA to BB indicating enrollments of students for courses.
For this schema, GENCPP will generate a Table class for table AA and for table BB, one RowRef<AA> class for student records, and another RowRef<BB> class for course enrollment instances. It will also declare one parent-to-child template class with type parameters <AA,BB> that captures the 1-to-many association from a student to all of that student's course enrollments.
To test this example, a test driver program can pr_load a test data file containing instances of AA and BB table rows from persistent storage, modify the data by calling GENCPP-declared methods, and then pr_dump it to a new version of the data file.
This example was chosen because it is the simplest possible one and most versions of chgen have used it as a first test case. This test case is too simple to cover all the features desired in GENCPP for general application data models. It will be expanded gradually along with GENCPP.
4.1.1 RowRef<TT> template class
Class RowRef<TT> is a building block that holds backward and forward list pointers to sibling instances of its template argument class. RowRef instances are collected in table (or child-set) container class instances. RowRef instances are transient and cannot be saved to persistent storage. They are used to traverse tables or parent-child relations during transactions and when the database is saved, and they must be reconstructed when the database is reloaded.
Each instantiation of class RowRef includes a linked list of references to items of its Row subtype <TT>. This class has two possible uses: (1) a single long list that holds the entire table<TT> template class, and (2) many short lists, each associating one parent object with its particular child list. Currently, the latter are nested inside their parent instances, and must be updated when a child is added or deleted.
The RowRef class definition is as follows:
<pre>
template <class TT> class RowRef
{
private:
TT *currObj; // pointer to current TT object
RowRef<TT> *prevRowRef; // pointer to previous RowRef object or NULL
RowRef<TT> *nextRowRef; // pointer to next RowRef object or NULL
public:
RowRef ( TT * obj=NULL, RowRef<TT> * tbl=NULL); // constructor
// class RowRef interface
TT *curr(); // get current object pointer
RowRef *next(); // get next RowRef<TT> object pointer
RowRef *prev(); // get previous RowRef<TT> object pointer
int dump_RowRef(); // dump the current object row (return type int assumed, as in AA::dump) [not dump_TT_RowRef - RJL]
void setcurrObj( TT *); // set current object pointer
//pointer set functions below should be hidden, not in API - RJL
void setnextRowRef( RowRef<TT>* ); // set next RowRef<TT> pointer
void setprevRowRef( RowRef<TT>* ); // set previous RowRef<TT> pointer
RowRef<TT>* deleteARowRef( TT * ); // delete a RowRef<TT> object
void load_RowRef(TT*, RowRef<TT> * ); // load a RowRef<TT>
};
</pre>
4.1.2 Table<TT> template class
This container class provides sequential access to its contained object instances of type TT. For our test case, Table<TT> will be instantiated twice to hold two item types or classes: Table<AA> and Table<BB> are containers for their respective AA and BB Row class instances.
The FirstRowRef pointer points to the first object of the RowRef<TT> linked list. The LastRowRef pointer points to the last RowRef<TT> object. We could add another data member of integer type, such as "int NumberofObject", to count the number of objects in the table. This would expedite checking for an empty table. Because this template class has nothing special, it is possible to declare a table of references to objects of a particular child class inside a parent class like AA or BB. This design entangles the overall data structure, i.e. makes it less object oriented. Using a design pattern is a possible way to avoid this.
The addRowRef(TT *) function will append a new RowRef<TT> object to the end of the table. Another version with an additional TT * parameter could insert the new RowRef after the specified one.
The Table target class definition is as follows:
<pre>
template <class TT>
class Table
{
private:
RowRef<TT> *FirstRowRef; // first RowRef of the table
RowRef<TT> *LastRowRef; // last RowRef of the table
public:
Table (); // constructor
Table (RowRef <TT> *aRowRef); // constructor
~Table(); // destructor
int loadTable(char * ); // not implemented - refer to load.cpp
RowRef<TT> * getfirstRowRef(); // get first RowRef pointer
void addRowRef(RowRef<TT> ); // append a RowRef to the table
void deleteARowRef( TT *); // delete a RowRef from a table
void dump_table(); /* traverse the linked list and call RowRef<TT>::dump */
};
</pre>
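The container behaviour described above (append at the end, delete by object pointer, and an object count for a cheap empty-table check) can be sketched as follows. This is an illustration under assumptions, not the project's Table<TT> code: Node and SimpleTable are hypothetical names, and the count member corresponds to the suggested "int NumberofObject".

```cpp
#include <cstddef>

// Hypothetical stand-in for RowRef<TT>: one doubly linked list node.
template <class TT>
struct Node {
    TT *currObj;
    Node *prev;
    Node *next;
};

// Hypothetical, simplified version of the Table<TT> container.
template <class TT>
class SimpleTable {
public:
    SimpleTable() : first_(NULL), last_(NULL), count_(0) {}
    ~SimpleTable() {                       // frees the nodes, not the row objects
        while (first_) { Node<TT> *n = first_; first_ = n->next; delete n; }
    }
    void add(TT *obj) {                    // append a reference at the end
        Node<TT> *n = new Node<TT>;
        n->currObj = obj; n->prev = last_; n->next = NULL;
        if (last_) last_->next = n; else first_ = n;
        last_ = n;
        ++count_;
    }
    bool remove(TT *obj) {                 // unlink the first node that refers to obj
        for (Node<TT> *n = first_; n != NULL; n = n->next) {
            if (n->currObj != obj) continue;
            if (n->prev) n->prev->next = n->next; else first_ = n->next;
            if (n->next) n->next->prev = n->prev; else last_ = n->prev;
            delete n;
            --count_;
            return true;
        }
        return false;                      // obj is not in this table
    }
    bool empty() const { return count_ == 0; }
    int size() const { return count_; }
    Node<TT> *first() const { return first_; }
private:
    SimpleTable(const SimpleTable &);            // copying is disabled
    SimpleTable &operator=(const SimpleTable &);
    Node<TT> *first_;
    Node<TT> *last_;
    int count_;
};
```

Keeping the count in the container makes the empty test a single comparison, at the cost of updating it in every add and remove.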
4.1.3 AA class ---- student class
This class is customized by GENCPP to a variable number of attributes, so it cannot be template-based. The last data member Table<BB> * ChildTablePtr, is a template instantiation (Table<BB>) reference to a list of the BB-children of this AA-instance. In this version of target code, because AA has only one parent-role (one AA to BB relation), such an implementation is ok. If a type like AA is a parent (or superclass) in more than one relation, ChildTablePtr needs to be qualified with the concrete type BB.
<pre>
class AA{
private:
//The private data members of an AA object.
hcg_key AAid; // primary (surrogate) key field
char fname[31]; // first name of student
char lname[31]; // last name of student
int age; // age of student
float gpa; // GPA of student
Table <BB> * ChildTablePtr; // ptr to linked list of children of this object
public:
AA (); //default constructor
AA (hcg_key , char *, char *, int, float); //constructor
~AA(); //default destructor
// class AA interface
char* getfname(); // get first name of the student
char* getlname(); // get last name of the student
hcg_key getid(); // get AAid of the student
float getgpa(); // get gpa value of the student
int getage(); // get age of the student
Table<BB> * getchildtableptr(); // get child table pointer
void setfname(char *); // set student first name field
void setlname( char *); // set student last name field
void setAAid(hcg_key ); // set AAid
void setgpa(float); // set gpa value
void setage(int); // set the age of the student
void setchildtableptr(Table<BB>* ); // set the child table pointer
int dump(); // dump each field of the object
void parse(char * ); /* parse an input record of type AA,
* construct a class object and set each field */
};
</PRE>
4.1.4 BB class ------- course enrollment instances
<PRE>
class BB
{
private:
hcg_key BBid; // primary (surrogate) key field
hcg_key AAid; // foreign key field
char cname[MAX_COURSE_NAME]; // name of course
AA * p_ptr; // parent pointer to an AA object
public:
BB (); // default constructor
BB (hcg_key , hcg_key , char *); // constructor
// class BB interface
char* getcname(); // get the course name
hcg_key getfkey_id(); // get the foreign key id
hcg_key getid(); // get the primary key id
void setcname(char *); // set course name field
void setfkey_id( hcg_key ); // set foreign key id field
void setBBid(hcg_key ); // set BBid field
int dump(); // dump all fields of the object
void parse(char *); // parse an input row and set each field
};
</PRE>
Table BB needs only one foreign key field AAid because table BB has only one child role to the AA parent role. In a more complicated data model, a BB object can have more than one parent of the same or different types. Then we need one foreign key id for each different parent type.
The parse(char *) function in class AA and class BB takes a string and calls the global function hcg_parse to skip whitespace separators and extract each field, so parse can set each field of an AA object or BB object respectively. (Hcg_parse is also used by pr_load to read actual data into an AA or BB row from the persistent external data file.)
4.2 Utility functions:
4.2.1 void clear_mem(char * , int )
This function simply sets a region of the memory to zero.
4.2.2 void hcg_parse(char *s, char *w, int * index)
This routine extracts a field from string (s) into (w). The parameter index serves two purposes. On the way in, it marks the starting point for the parse. From here, white space is skipped. Once a word has begun, it is considered finished when another white space or the end of the string is encountered. On the way out, index is returned to the caller so that the caller can decide whether to continue or not.
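The behaviour described for hcg_parse can be sketched as below. This is a reconstruction from the description only, not the project's source; the name parse_field is hypothetical, and the caller is assumed to supply an output buffer large enough for the longest field plus the terminating '\0'.

```cpp
#include <cctype>

// Sketch of the described behaviour: starting at *index, skip white space,
// copy one word from s into w, and report where the scan stopped.
void parse_field(const char *s, char *w, int *index) {
    int i = *index;
    // Skip leading white space.
    while (s[i] != '\0' && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
    // Copy characters until the next white space or the end of the string.
    int n = 0;
    while (s[i] != '\0' && !std::isspace(static_cast<unsigned char>(s[i]))) {
        w[n++] = s[i++];
    }
    w[n] = '\0';
    *index = i;  // the caller resumes from here on the next call
}
```

Calling the function repeatedly with the same index variable walks through a row field by field; an empty word in w signals that the end of the string was reached.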
4.2.3 void hcg_read_next(char * hcg_buffer, FILE * fp)
This function is used by various file read routines. It flushes the read buffer, reads the next line of the file, and converts any control characters in the buffer into spaces.
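A minimal sketch of this behaviour follows. It is not the project's implementation: the name read_next_line, the explicit buffer size parameter and the bool result are assumptions added for safety and testability, since the original signature passes only the buffer and the file pointer.

```cpp
#include <cctype>
#include <cstdio>
#include <cstring>

// Clear the buffer, read the next line, and turn control characters
// (tabs, the trailing newline, ...) into spaces.
// Returns false at end of file or on a read error.
bool read_next_line(char *buf, std::size_t size, std::FILE *fp) {
    std::memset(buf, 0, size);  // "flush" the read buffer
    if (std::fgets(buf, static_cast<int>(size), fp) == NULL) return false;
    for (char *p = buf; *p != '\0'; ++p) {
        if (std::iscntrl(static_cast<unsigned char>(*p))) *p = ' ';
    }
    return true;
}
```

Because every control character becomes a space, the resulting buffer can be handed directly to a whitespace-driven field parser such as hcg_parse.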
4.2.4 void load(char *file_name, Table<AA> * AA_table, Table<BB> * BB_table)
This function is used to load the AA and BB tables from a database. It assumes that the AA table has been loaded before it begins to load the BB table. It should be possible to move this function into the Table template class as a member function, which would call the lookup template function to resolve the relation. In that case we would no longer need the assumption that the parent table is loaded before the child table. In addition, this would make the program more object oriented.
4.2.5 template <class TT> TT * lookup( Table<TT> * mtable, hcg_key fkey)
This is a template function, which is used to look up an object with fkey as primary key. (It is analogous to find_pkey in CHGEN, and could be renamed as such.-RJL)
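The idea of a primary-key lookup over a table's linked list can be sketched as below. The sketch makes several assumptions that are not taken from the target code: the key type hcg_key_t is a plain integer typedef, RowNode is a simplified stand-in for RowRef<TT>, the function takes the first list node instead of a Table<TT> pointer, and each row class is assumed to offer a getid() method as AA and BB do.

```cpp
#include <cstddef>

typedef unsigned long hcg_key_t;  // assumption: keys modeled as integers

// Hypothetical, simplified list node (stand-in for RowRef<TT>).
template <class TT>
struct RowNode {
    TT *currObj;
    RowNode *next;
};

// Walk the list and return the first object whose primary key matches,
// or NULL when no row has that key.
template <class TT>
TT *lookup_row(RowNode<TT> *first, hcg_key_t key) {
    for (RowNode<TT> *n = first; n != NULL; n = n->next) {
        if (n->currObj != NULL && n->currObj->getid() == key) return n->currObj;
    }
    return NULL;
}

// Minimal row type for illustration only.
struct Student {
    hcg_key_t id;
    hcg_key_t getid() const { return id; }
};
```

A loader could use such a function to turn the foreign key of a child row into a parent pointer: look up the AA object whose key equals the child's AAid and store the result as the child's parent pointer.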
4.3. Current status of the target code:
4.3.1 Unix vs. PC platforms:
The code is written in C++ and worked when compiled with Microsoft Visual C++ ver 4.2. When all declarations were in the program file (no separate header files) it compiled with g++ and worked on remus (a decstation), but not on Jupiter (an alpha). After dividing AA.cpp into AA.h and AA.cpp, and BB.cpp into BB.h and BB.cpp, respectively, it still works with Visual C++ ver 4.2, but doesn't work on remus.
4.3.2 Functions implemented:
a) All class objects now have a specific interface for the programmer, including methods to access and modify member data.
b) All class objects now have methods to modify embedded class objects.
c) All class objects have delete functionality. The object reference is not checked for a NULL value before deletion is attempted.
d) The method to load data from data files into the network database is defined and works, but modification is needed - please refer to the code.
e) Both Row classes have a dump method to dump the network database into a data file. This only writes the network database in breadth-first order, i.e. it dumps each entire table in turn.
f) A template-based lookup function is defined to search for a table element by its (unique) pkey value.
g) All tables (including AA and BB tables and AA's BB-child-list) are implemented as linked lists of RowRef<TT> objects.
4.4. Suggestions for future enhancement:
Besides the suggestions for future enhancement in the chapters above, the following enhancements are recommended:
The encode/decode functions provided by chgen have not been implemented. These are desirable because compressed integer keys have space and time advantages over char strings. GENCPP ought to use CHGEN's 32-bit integer hcg_key class with three data fields: table abbreviation, version number, and row index number.
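To illustrate the space argument, the sketch below packs a character key of the form seen in roster.msdat (for example TT010001: a two-letter table abbreviation, a two-digit version and a four-digit row index) into one 32-bit integer. The bit layout (10 bits for the abbreviation, 7 for the version, 14 for the row index) and the names encode_key and decode_key are assumptions made for this example; CHGEN's actual encoding is not described in this report and may differ.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Pack "TTvvrrrr" (2 upper-case letters + 6 digits) into 32 bits.
// Assumed layout: abbreviation in bits 21..30, version in 14..20, row in 0..13.
bool encode_key(const std::string &text, std::uint32_t *out) {
    if (text.size() != 8) return false;
    for (int i = 0; i < 8; ++i) {
        const char c = text[i];
        const bool ok = (i < 2) ? (c >= 'A' && c <= 'Z') : (c >= '0' && c <= '9');
        if (!ok) return false;
    }
    const std::uint32_t abbr = (text[0] - 'A') * 26 + (text[1] - 'A');     // 0..675
    const std::uint32_t version = (text[2] - '0') * 10 + (text[3] - '0');  // 0..99
    std::uint32_t row = 0;                                                 // 0..9999
    for (int i = 4; i < 8; ++i) row = row * 10 + (text[i] - '0');
    *out = (abbr << 21) | (version << 14) | row;
    return true;
}

// Inverse of encode_key for keys produced by it.
std::string decode_key(std::uint32_t key) {
    const unsigned abbr = key >> 21;
    const unsigned version = (key >> 14) & 0x7Fu;
    const unsigned row = key & 0x3FFFu;
    char buf[16];
    std::snprintf(buf, sizeof buf, "%c%c%02u%04u",
                  static_cast<int>('A' + abbr / 26),
                  static_cast<int>('A' + abbr % 26), version, row);
    return std::string(buf);
}
```

With such an encoding a key comparison becomes a single integer comparison instead of a string comparison, and each key field shrinks from eight characters to four bytes.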
Another field in the schema or view can be defined to support depth first dump of all records.
A generic superclass Row could be defined over all row classes (AA and BB) with data member primary key id and corresponding interface to access and update the data member.
In many applications such as searching or sorting, we need to compare two values. A template function int cmp(T a, T b) is suggested for comparing values a and b. If a==b the function returns 0; if a>b it returns 1; and if a<b it returns -1. Different data types need different treatment: for integers, the comparison is easy; for floats, use a default precision or add a third parameter to pass the precision into the function; for strings, we need to use strcmp() or strncmp().
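One possible shape for this suggestion is sketched below. Instead of a switch on the data type, the sketch uses one generic template plus two separately named helpers; the names cmp_float and cmp_cstr are hypothetical. Separate names are used on purpose, because an overload taking const char * can silently lose overload resolution to the generic template when the arguments are plain char * variables, which would compare addresses instead of contents.

```cpp
#include <cmath>
#include <cstring>

// Generic three-way comparison for types that support operator<.
template <class T>
int cmp(const T &a, const T &b) {
    return (a < b) ? -1 : ((b < a) ? 1 : 0);
}

// Floats: values closer than eps are treated as equal.
inline int cmp_float(float a, float b, float eps) {
    if (std::fabs(a - b) <= eps) return 0;
    return (a < b) ? -1 : 1;
}

// C strings: normalize the result of strcmp to -1, 0 or 1.
inline int cmp_cstr(const char *a, const char *b) {
    const int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

For fixed-width character fields that are not NUL-terminated, strncmp with the field width would take the place of strcmp.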
A template searching function is recommended to search a table according to a specific field. This function will be more general than the implemented lookup template function, which only searches a table by the primary key id, but it must provide for multiple values, and for resuming the search when multiple matches can occur over the scope of the search.
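A resumable search by field could look roughly like the sketch below. It is an illustration under assumptions: RowNode is a simplified stand-in for RowRef<TT>, find_next is a hypothetical name, and the field is selected with a pointer to a data member, which in the real classes would require access to the private members or a getter-based variant. The sketch only handles fields that can be compared with ==; character-array fields would need a string comparison instead.

```cpp
#include <cstddef>

// Hypothetical, simplified list node (stand-in for RowRef<TT>).
template <class TT>
struct RowNode {
    TT *currObj;
    RowNode *next;
};

// Return the first node at or after 'from' whose field equals 'value',
// or NULL if there is none. Passing the successor of a previous hit
// resumes the search after that hit.
template <class TT, class F>
RowNode<TT> *find_next(RowNode<TT> *from, F TT::*field, const F &value) {
    for (RowNode<TT> *n = from; n != NULL; n = n->next) {
        if (n->currObj != NULL && n->currObj->*field == value) return n;
    }
    return NULL;
}

// Minimal row type with public fields, for illustration only.
struct Student {
    int age;
    float gpa;
};
```

Returning the node rather than the object is what makes resumption possible: the caller keeps the node of the last match and continues from its successor.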
There are some more suggestions on the code I wrote. Please refer to it.
5.1 Overview of the C++ code generator GENCPP
There are two ways to write the code generator. One is to write the code from scratch and design its internal data structures. The other is to use chgen version 10 to generate the support code and write an application program that generates the C++ code. The first way gives more freedom, but it means much more coding, although some code could be borrowed from the chgen project. Using chgen to generate the C code means we can use the pr_utilities code and the handy macros without writing the utility code ourselves, so we can focus on the problem we want to solve and save a lot of time. The price is that we have to live with the code generated by chgen and its data structures. In addition, chgen has been developed over many years; writing a new code generator with the same functionality without using chgen would be a waste. Thus the second way was chosen for the simple code generator program.
There are two choices for the input file format as well. One is to read and parse the schema file as chgen does. The other is to read the meta-data file schema.msdat, which defines the schema as [meta-]tables TT and TA. (This file is generated by chgen version 10 with switch -metafile.) We chose the second because it avoids the schema file parsing functions of chgen. Instead we can write gencpp as a chgen application that pr_loads schema.msdat into VMNetDB as tables TT and TA. Each row of TT identifies a class to be generated, and each TA_child of this TT_row specifies a data member to be generated for this class. With suitable schema extensions, method names can also be added to the schema.sch and schema.msdat files and pr_loaded into gencpp. (Adding method parameters to schema.sch will require additional parsing to generate another meta-table for method parameter names and types with TA as its parent.)
The code generator is written in C, based on the code generated by chgen version 10. It uses pr_load to load the meta-schema file (generated by running chgen version 10 with switch -metafile). It then does a table loop on the TT table and, inside this loop, a child loop over each TA child of the current TT row. For our example, this generates the AA and BB class declarations and the get and set member functions for each data member. This is implemented in the current version of gencpp.
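The table-loop and child-loop structure can be illustrated with the sketch below. It is written in C++ with in-memory stand-ins rather than in C against the chgen-generated macros, so it is not the gencpp source: TTRow, TARow, member_decl and gen_class are hypothetical names. The type mapping (key fields to hcg_key, cN to a char array of N+1, i4 to int, f4 to float) is inferred from comparing roster.sch with the AA class shown earlier and is an assumption. Only data members are generated here; get and set functions would be emitted in the same child loop.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// In-memory stand-ins for one TA row and one TT row with its TA children.
struct TARow { std::string fldAbbr; std::string dataType; bool isKey; };
struct TTRow { std::string tblAbbr; std::vector<TARow> attrs; };

// Map one TA row to a C++ data member declaration (assumed mapping).
std::string member_decl(const TARow &ta) {
    if (ta.isKey) return "hcg_key " + ta.fldAbbr + ";";
    if (ta.dataType == "i4") return "int " + ta.fldAbbr + ";";
    if (ta.dataType == "f4") return "float " + ta.fldAbbr + ";";
    if (!ta.dataType.empty() && ta.dataType[0] == 'c') {
        const int width = std::atoi(ta.dataType.c_str() + 1);
        std::ostringstream os;
        os << "char " << ta.fldAbbr << "[" << (width + 1) << "];";
        return os.str();
    }
    return "/* unsupported type " + ta.dataType + " */";
}

// Child loop: one class declaration per TT row, one member per TA child.
std::string gen_class(const TTRow &tt) {
    std::ostringstream os;
    os << "class " << tt.tblAbbr << " {\nprivate:\n";
    for (std::size_t i = 0; i < tt.attrs.size(); ++i) {
        os << "    " << member_decl(tt.attrs[i]) << "\n";
    }
    os << "};\n";
    return os.str();
}

// Table loop: generate the text for every table in the meta-schema.
std::string gen_all(const std::vector<TTRow> &tables) {
    std::string out;
    for (std::size_t i = 0; i < tables.size(); ++i) out += gen_class(tables[i]);
    return out;
}
```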
5.2 The metaschema file for chgen v10
The metaschema file format for tables TT and TA is shown below. These tables are populated by CHGEN version 10 after reading the application's schema file schema.sch. Each schema table becomes one row of parent table TT; one row of child table TA is generated for each column or attribute of the row in table TT. For our AA--->BB test case, table TT will have rows for AA and BB, and table TA will have rows for each attribute of AA or BB.
The surrogate primary key AAid or BBid is always the first field because it identifies the Table and Row class type of a record in persistent storage.
The defVal field of table TA is currently ignored by Row class constructors in gencpp. Its purpose is to tell a future version of GENCPP how to initialize data members in each Row class constructor.
If you run chgen10 on this schema file with -ansi switch, you will get ansi-compatible 'C' code for maintaining TT and TA data as a [meta-]database. Then we can write the gencpp program as a chgen application.
TT_Table TT /* Table of TT - indexed by TTid */
{
TTid TT_id c8 1 /* primary key field */
TblAbbr Table_abbr c2 0 /* abbreviation of the table */
TTname Table_name c30 0 /* table full name*/
comment Comments c120 0 /* comments */
}
TA_Table TA /* Table of TA - indexed by TAid */
{
TAid TA_id c8 1 /* primary key field */
TTid TT_id c8 1 /* foreign key field to parent table TT*/
fldAbbr field_abbr_name c8 0 /* field abbreviation name */
fldName field_name c30 0 /* field name (to be renamed defVal) */
DataType data_type c4 0 /* data type */
iskey is_key i4 0 /* is key (should be c2 not i4) */
defVal DefaultValue c12 0 /* default value for this field */
comment comments c120 0 /* comments */
}
5.3. The meta-schema file roster.msdat
If you run chgen with switch -metafile on the following schema file, the result will be roster.msdat.
Here is the schema or table definition file roster.sch:
students AA /* Table of student data */
{
AAid students_id c8 1 /* primary key field */
fname fname c30 0 /* first name of student*/
lname lname c30 0 /* last name of student */
age age i4 0 /* age of the student */
gpa gpa f4 0 /* gpa of the student */
}
roster BB /* Table associating students with courses */
{
BBid roster_id c8 1 /* primary key field */
AAid student_id c8 1 /* foreign key field */
cname cname c30 0 /* name of course */
}
Here is the meta-schema data file roster.msdat produced by CHGEN. GENCPP can call pr_load to load roster.msdat into tables TT and TA. The first or SV row contains the schema name and version identification. The first five TA rows define fields of table AA (their fkey value is TT010001); the last three belong to table BB (fkey = TT010002).
SV010001 PJ010001 roster.sch CHGEN v10 CHGEN
TT010001 SV010001 AA students /*1 Table of student records */
TT010002 SV010001 BB roster /*1 Table associating students with courses */
TA010001 TT010001 AAid students_id c8 1 /* primary key field */
TA010002 TT010001 fname fname c30 0 /* first name of student */
TA010003 TT010001 lname lname c30 0 /* last name of student */
TA010004 TT010001 age age i4 0
TA010005 TT010001 gpa gpa f4 0
TA010006 TT010002 BBid roster_id c8 1 /* primary key field */
TA010007 TT010002 AAid student_id c8 1 /* foreign key field */
TA010008 TT010002 cname cname c30 0 /* name of course */
5.4. gencpp functions and their specification:
5.4.1 FILE* gen_open(struct TT *, int )
This function opens a file in write mode. The filename is determined by the abbreviated name of the struct TT object. If the passed integer is 1, the file is a header file, i.e. the filename will be AA.h for AA. If the passed integer is 0, the file is a cpp file; for AA, the filename will be AA.cpp. The function returns the FILE pointer.
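The naming rule can be sketched as follows. The sketch takes the table abbreviation as a string instead of a struct TT pointer, and the names gen_filename and open_for_write are hypothetical. Treating every flag value other than 1 as a request for a .cpp file is an assumption; the description only defines the values 1 and 0.

```cpp
#include <cstdio>
#include <string>

// Build the output filename from the table abbreviation and the header flag.
std::string gen_filename(const std::string &tblAbbr, int isHeader) {
    return tblAbbr + (isHeader == 1 ? ".h" : ".cpp");
}

// Open the file for writing; returns NULL if the file cannot be opened.
std::FILE *open_for_write(const std::string &tblAbbr, int isHeader) {
    return std::fopen(gen_filename(tblAbbr, isHeader).c_str(), "w");
}
```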
5.4.2 char * trim_space(char *)
This function trims the spaces padded by chgen when it loads the metaschema file. This makes the generated files look better.
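A minimal sketch of such a trimming function is shown below. It assumes that the padding consists of trailing blanks only and that trimming in place is acceptable; the name trim_trailing_spaces is hypothetical and the real trim_space may behave differently.

```cpp
#include <cstring>

// Remove trailing blanks in place and return the same buffer.
char *trim_trailing_spaces(char *s) {
    std::size_t n = std::strlen(s);
    while (n > 0 && s[n - 1] == ' ') {
        s[--n] = '\0';
    }
    return s;
}
```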
5.4.3 void gen_TTdatamember(FILE*, struct TA *)
This function is used to generate a data member of a class based on its TA-row value.
5.4.4 void gen_TTinterface(FILE* , struct TA *, char *, char * )
This function is used to generate get and set interfaces for a data member.
5.4.5 These functions are TBD (to be defined):
a) A function to generate the constructor and destructor. This can be done with another child loop that collects all data members and their default values and writes them into the cpp file.
b) A function to generate the other member functions of a class like AA. This can be done by copying the target code and replacing the class-specific names with the names passed in.
c) A function to generate the template functions. This seems to be fairly easy because the template functions do not depend on the data members of those classes. They can be copied into the generated code.
d) A function to generate the common utility functions. This depends on the target code. The current target code does not lend itself easily to automatic code generation. More work is needed here.
5.5. Current status of executable demo:
The current gencpp.c can generate class definitions as described above. Because the implementation of gencpp depends on the design of the target code, the current target code should be easy to generate because it is template based. This part of the code generator has been tested and it works.
5.6. Future enhancement
Because the code generator depends on what target code we want to generate, it is important to make the target code more complete first, so that when we write the code generator we have a clear goal for what code should be generated. Another question for future versions is whether we should generate template-based C++ code. Compilers still differ in how well they support templates. The template-based code is shorter and relatively easier to generate, but the generated code won't be as portable. Thus there is a tradeoff.