Version numbering problems and how to solve them in SIS.dat.
RJLRef: ~/public_html/COOL/GEN/VersionControlUpdate.031005

To all 03f522 students:
Please read this along with the chgen User Manual section on views and
versions, which explains its (now obsolete) mechanism for pkey version
numbering, before next Wed, when we will discuss the problem.
R Lechner

> From edklein@comcast.net Fri Oct 3 20:33:48 2003
> Subject: 91.522-SIS-student records failing
> Date: Sat, 04 Oct 2003 00:33:35 +0000
>
> As you suggested, I ran some more tests on my program for assignment
> #3. 0 students and 1 student tests work but with >1 student I have a
> problem that I need help with.
>
> The student records STxxxxxx are loading fine, but Term Section
> records, TExxxxxx will only load for one student. If there is a second
> student's TE defined it doesn't load.
>
> For example, with the below records, ST000027, ST000252 both load.
> But, only TE252031, TE252032, TE252033 load. TE027031 and TE027032
> only load if TE252xxx records are deleted.
>
> ST000027 DE000091 Brown,Abe 2.00 1
> ST000252 DE000091 Smith,John 4.00 1
>
> TE027031 ST000027 TM020031 04 04 2.00
> TE027032 ST000027 TM020032 04 04 2.00
> TE252031 ST000252 TM020031 03 03 4.00
> TE252032 ST000252 TM020032 03 03 4.00
> TE252033 ST000252 TM020033 03 03 4.00
>
> Attached is my schema SIS.sch, the data SIS_students.dat and a dump
> from pr_load, pr_dump; without any processing. I've also attached
> main.c which has only these commands. I'm hoping you can build with
> these if you need to.
>
> Do you have any ideas? I don't see any errors from the
> pr_load(view,file) call.
>
> -Ed Klein
> edklein@comcast.net

1. Reason for not loading all table rows:
-----------------------------------------
Attached are Ed's .sch file and main.c test input and output .dat files.
I saw immediately the symptom that explains his input or pr_load
failure: Ed's TE row#'s start with a 3-digit student#; all of his
TE-rows would be loaded OK if all student numbers began with 00
(000 thru 009).

Chgen restricts the first two digits of each table's 6-digit ASCII row
number to one version number in each View name. This rule is enforced
by load_data (see below) when pr_loaded, and by function meets_view()
when pr_dumped, based on a ViewName argument. The ViewName selects a
list of table types and a version number for all of them, from a
ViewDefs (type .vdf) file (see chgen/ver_10log User Manual handout).

Unfortunately I didn't remember this when giving my advice on 6-digit
row numbering based on user-defined keys (semester#, section#, etc.;
see any chgen User Manual since ver_8).

2. First work-around: Compile chgen with -DNEW_VERSION
-----------------------------------------
To avoid version_numbering complications, compile chgen with
NEW_VERSION #defined in its Makefile. E.g., see
$CASE/gen/ver_12/sjaganat/src/Makefile line 3:

    CFLAGS = -g -DDEBUG -DNEW_VERSION

This disables meets_view so all rows are searched by find_*_loop
macros, and saved by pr_dump. It also disables one of two tests of
version numbers in load_data from gen_load_data.c, one test in
do_pr_add from gen_pr_load.c, and one in meets_view from
gen_pr_utils.c.

However, gen_load_data.c does not suppress a second check based on
version#, generated at line 118 of gen_load_data.c. This test remains
in bde/pr_util/pr_load.c (line 2522). (This may be a chgen bug - and
Ed's problem even if he used genv12's Makefile.)

3. chgen/src/gen_pr_*.c use of NEW_VERSION flag:
-----------------------------------------------
#ifdef NEW_VERSION is used in chgen code generators.
It affects load_data() at line 89 of gen_load_data.c:
-----------
#ifndef NEW_VERSION
fprintf(prload_fp,"\
if (hcg_view_list.view_list[hcg_view_idx].version_list[hcg_tbl_idx] == '\\0')\n\
{\n\
hcg_read_next();\n\
continue;\n\
}\n\
\n\
");
#endif
--------------
but not at line 118 (which may be an untested bug :-( ):
------------
if (((char) hcg_version) != hcg_view_list.view_list[hcg_view_idx].version_list[hcg_tbl_idx])\n\
{\n\
hcg_read_next();\n\
continue;\n\
}\n\
-------------
NEW_VERSION also affects do_pr_add() in gen_pr_load.c at line 169,
with the same if-condition as above. It affects a version check in
function pr_dump() generated by gen_pr_dump.c at line 132.

NEW_VERSION also suppresses the version check in function
hcg_update_version() that is generated in gen_pr_utils.c at line 284
and called by pr_dump_row.

Meets_view is a function used in find_..._loop macros, that are
declared in schema.h from chgen. With meets_view enabled, find_*_loop
searches are restricted and so is the content of any file that is
saved by pr_dump: All saved rows in one table type must have pkeys in
the range nn0000 to nn9999, where nn is the version number specified
in the View for that table type.

NEW_VERSION permanently disables meets_view() by trivializing it:
gen_pr_utils.c at line 192 generates a meets_view function that
returns the constant 1 instead of a version checking predicate. The
generated meets_view code can be seen in lines 145-148 of
~lechner/bde2alpha_rl/sandbox/bdecheckout/bde/pr_util/pr_load.c :

int meets_view(int tbl_idx, int view_idx, hcg_key pkey)
{
return(1);
}

I suspect Ed did not use the NEW_VERSION flag when compiling chgen.

3. How genv12's pr_load function works in bde:
----------------------------------------------
Function pr_load in bde/pr_util/pr_load.c calls load_data to parse
input file records.
Function int load_data( char *viewname) begins at line 2461.
At line 2522, a pkey's (1-byte) version number is checked:
-----------
if (((char) hcg_version) != hcg_view_list.view_list[hcg_view_idx].version_list[hcg_tbl_idx])
/* skip this row; */
/* otherwise call pr_parse to read each field; */
...
/* if all goes well, call pr_link to add this row to its table.*/
...
-----------
Since all bde.dat files are saved with version number unchanged
(hcg_update_version is never called in pr_dump), version numbers would
not fail this constraint when next pr_loaded.

For output, pr_dump is called; it calls pr_dump_row for the
appropriate table type. Pr_dump has int new_version for its 3rd
parameter. This is passed to pr_dump_row. If new_version == 0, pr_dump
will NOT update the version number part of each pkey (and related
fkeys?). (Value 1 will cause the pkey's version number to be
incremented.) Bde always sets new_version = 0 or FALSE, and
'#define FALSE (bool) 0' is in schema.h.

4. Alternate work-around (sub-optimal, not recommended)
-------------------------------------------------------
If meets_view is not disabled, another solution is applicable to
term-specific tables (TM, SE and EN). It requires calling pr_load
multiple times to load each TerM's data sequentially, one at a time.
(Ed's table TM uses pkeys TM2003xx only. If he had also used TM1999xx
these would not be accepted, but TM99xxxx and TM03xxxx could be loaded
during separate views. Unfortunately the year 2000 problem then rears
its ugly head! :-) )

One version of the tables is needed for each term for which data is
processed. This requires matching version numbers related to TM's pkey
in tables TE and EN; this is not a bad idea, since each term's
sections and enrollments could correctly be considered a different
version of the database.

However, it would not solve similar problems if version digits depend
on STudent or other table row#'s, as I suggested for SEction and
ENrollment pkeys. So to get acceptable data you need to renumber table
row#'s so their first 2 digits are invariant within a table.
That wrecks my global numbering scheme across all student .dat files,
but table type abbrev. differences already did that.

5. Long-range solution: Use RCS/CVS for revision control
-------------------------------------------------------
The long-range solution (TBD for BDE and GEN - two 03f522 projects?)
is to completely abandon chgen's pkey version numbering and checks,
while retaining the list of table types in View Definition files.
This frees all six pkey digits for row# values.

This would permit version control by standard source code control
systems like SCCS, RCS, CVS and PCVS. It enables RCS/CVS to detect and
ignore rows that do not change between checkout and checkin. (Version
number incrementing would get in the way.) RCS/CVS revision control
saves space by storing only differences due to diagram database
changes between checkout and checkin.

I would also add one new table type called FileSpec (FS) to each
schema and view. This table would have one or a few rows. FS-rows
specify version, update and author (revision history) info for all
tables in the view used for loading or dumping this file.

For bde, we would like to include RCS/CVS revision history tag info in
FS-rows; these tags may permit RCS/CVS to document versions of bde
diagrams in table FS and visually display the revision history in a
separate FS-table window.

Several FS version rows may be needed when doing database conversion
from one schema version to another. Chgen anticipated this case by
allowing pr_load and pr_dump to have two different views, so that
input and output databases can have two different data model schemas.

In SIS, multiple Term-specific versions of the database may need to be
memory-resident simultaneously for historical processing, such as
calculating GPAs or printing transcripts.

6. More details based on Ed's data:
-----------------------------------
The view definition file *.vdf has options for a specific version or
the most recent version for each table type.
Version number is encoded into the most significant two decimal digits
of the ASCII pkey (3 digits if keys are 12 bytes long).

Ed's TE records span two version numbers - only the highest one is
accepted. I presume Ed deleted the higher numbered term from input
data and re-ran. If Ed had deleted these rows in main.c and called
pr_load again, this would also fail, unless he called pr_init first.
Pr_init would clear the database, defeating the purpose of
multi-version processing.

I noticed the same behavior for Ed's table EN: Rows EN027* are not
loaded (not in pr_dumped output) because the higher version rows take
priority. (pr_init scans the tables first and saves the highest
version number for each table before pr_load is called).

Table TM uses pkeys TM2003xx only. If he had data for TM1999xx also,
it would be rejected for the same reason.

-----------------------------------------------------------------
Ed Klein's schema.sch file:
>
> Department DE /* Department Table */
> {
>   DEid NA c8 1 /*Department ID Primary Key*/
>   DEhead NA c30 0 /* Department Name */
> }
>
> Student ST /* Student Attribute Table */
> {
>   STid NA c8 1 /* Student ID Primary Key */
>   DEid NA c8 1 /*Created via linked parent*/
>   STsnm NA c32 0 /* Student Name */
>   STgpa NA f4 0 /* Acumulate GPA */
>   STsta NA c1 0 /* Student Status */
> }
>
> Trm TM /* Term or Semester Table */
> {
>   TMid NA c8 1 /* Semester ID Primary Key */
>   TMley NA c8 0 /* Semester Name */
> }
>
> Faculty FA /* Faculty Table */
> {
>   FAid NA c8 1 /* Faculty ID Primary Key */
>   DEid NA c8 1 /*Created via linked parent*/
>   FAnm NA c32 0 /* Faculty Name */
>   FAsta NA c1 0 /* Faculty Status */
> }
>
> Course CO /* Course Table */
> {
>   COid NA c8 1 /* Course ID Primary Key */
>   DEid NA c8 1 /*Created via linked parent*/
>   COtit NA c24 0 /* Course Title */
> }
>
> Address AD /* Student Address Table */
> {
>   ADid NA c8 1 /* Address ID Primary Key*/
>   STid NA c8 1 /*Created via linked parent*/
>   ADstree NA c30 0 /* Number & Street */
>   ADcity NA c13 0 /* City */
>   ADstat NA c2 0 /* State */
>   ADzip NA c9 0 /* Zip Code */
>   ADctry NA c2 0 /* Country */
> }
>
> StudentTer TE /* Student-Term Table */
> {
>   TEid NA c8 1 /*Student-term ID Primary Key*/
>   STid NA c8 1 /*Created via linked parent */
>   TMid NA c8 1 /*Created via linked parent */
>   TEatt NA i2 0 /* credits attempted */
>   TEear NA i2 0 /* credits earned */
>   TEgpa NA f4 0 /* Current GPA */
> }
>
> StudTermSe RC /* Student-Term-Section Table*/
> {
>   RCid NA c8 1 /* StudTermSe ID Primary Key*/
>   TMid NA c8 1 /*Created via linked parent*/
>   FAid NA c8 1 /*Created via linked parent*/
>   COid NA c8 1 /*Created via linked parent*/
>   RCmax NA i4 0 /* Maximum Enrollment */
>   RCmin NA i4 0 /* Minimum Enrollment */
>   RCcur NA i4 0 /* Current Enrollment */
>   RCsta NA c1 0 /* Section Status */
> }
>
> PreReqLinK PQ /* Prerequisite Table */
> {
>   PQid NA c8 1 /*PreReqlink ID Primary Key */
>   COid NA c8 1 /*Created via linked parent */
>   PQcod NA c11 0 /* Prerequisite Code */
> }
>
>
> Enrollment EN /* Enrollment Table */
> {
>   ENid NA c8 1 /* Enrollment ID Primary Key*/
>   TEid NA c8 1 /*Created via linked parent */
>   RCid NA c8 1 /*Created via linked parent */
>   SPcred NA i2 0 /* Credits */
>   SPgrad NA c3 0 /* Grade*/
> }
>
> CourseSche SE /* Course-Schedule Term*/
> {
>   SEid NA c8 1 /*Course- Sched ID Primary Key*/
>   RCid NA c8 1 /*Created via linked parent */
>   SDdays SDdays c3 0 /* Lecture Day like MWF */
>   SDtime SDtime c6 0 /* Lecture Time like 430PM */
>   SDbldg SDbldg c3 0 /* Building */
>   SDroom SDroom c3 0 /* Room */
> }
>
>
> Content-Type: application/octet-stream; name="SIS_students.dat"
>
> /* Ed Klein CS91.522 Fall 2003 */
> /* Assignment 2 September 16, 2003 */
>
> /* Department: ID Number Name */
> DE000091 91 ComputerScience
>
> /* Student: ID Dept Name GPA Status */
>
> ST000027 DE000091 Brown,Abe 2.00 1
> ST000252 DE000091 Smith,John 4.00 1
>
> /* Term: ID Name */
>
> TM020031 Fall03
> TM020032 Spring03
> TM020033 Summer03
>
> /* Faculty: ID Dept-ID Name Status */
>
> FA000490 DE000091 Wright,Shirley 1
>
> /* Course: ID Dept-ID Number Title */
>
> CO091101 DE000091 101 IntroComputerScience
> CO091103 DE000091 103 ComputerLanguages
>
> /* Student Address: */
> /* ID (Street&City) ST-ID Number&Street City State Zip Country */
>
> AD000317 ST000252 55MainStreet Fireworks MA 023390000 US
> AD000027 ST000027 1AppleStreet Bedford MA 017300000 US
>
> /* Student-Term: ID ST-ID TM-ID Credits-Att Credits-Ear GPA */
>
> TE027031 ST000027 TM020031 04 04 2.00
> TE027032 ST000027 TM020032 04 04 2.00
> TE252031 ST000252 TM020031 03 03 4.00
> TE252032 ST000252 TM020032 03 03 4.00
> TE252033 ST000252 TM020033 03 03 4.00
> /* Course Section: ID CO-ID FA-ID TM-ID Number Location Timeslot */
>
> SE101201 CO000101 FA000490 TM020032 201 OLSEN311 W1730
> SE103201 CO000103 FA000490 TM020031 201 OLSEN414 M1730
> SE104201 CO000103 FA000490 TM020031 201 OLSEN414 M1730
>
> /* Enrollment: ID ST-TERM-ID SE-ID Credits Grade */
>
> EN027031 TE027031 SE104201 03 C
> EN252032 TE252032 SE103201 03 A
> EN252031 TE252031 SE101201 03 A
>
> /* Student-Term-Section: */
> /* ID (dept+co+se) TM-ID FA-ID CO-ID Max Min Cur Status */
> RC911011 TM020031 FA000490 CO091101 025 005 020 3
> RC911031 TM020032 FA000490 CO091103 025 005 010 3
>
> /* Prerequisite: ID Course code */
> PQ091101 CO091101 none
> pQ091103 CO091103 91101
>
>
> --NextPart_Webmail_9m3u9jl4l_1799_1065227615
> Content-Type: application/octet-stream; name="dump.dat"
> Content-Transfer-Encoding: 7bit
>
> DE000091 91 ComputerScience
> ST000027 DE000091 Brown,Abe 2.0000 1
> ST000252 DE000091 Smith,John 4.0000 1
> TM020031 Fall03
> TM020032 Spring03
> TM020033 Summer03
> FA000490 DE000091 Wright,Shirley 1
> CO091101 DE000091 101 IntroComputerScience
> CO091103 DE000091 103 ComputerLanguages
> AD000027 ST000027 1AppleStreet Bedford MA 017300000 US
> AD000317 ST000252 55MainStreet Fireworks MA 023390000 US
> TE252031 ST000252 TM020031 3 3 4.0000
> TE252032 ST000252 TM020032 3 3 4.0000
> TE252033 ST000252 TM020033 3 3 4.0000
> RC911011 TM020031 FA000490 CO091101 25 5 20 3
> RC911031 TM020032 FA000490 CO091103 25 5 10 3
> EN252031 TE252031 SE101201 3 A
> EN252032 TE252032 SE103201 3 A
> SE101201 CO000101 FA0 TM0200 201 OLS
> SE103201 CO000103 FA0 TM0200 201 OLS
> SE104201 CO000103 FA0 TM0200 201 OLS
>
> --NextPart_Webmail_9m3u9jl4l_1799_1065227615
> Content-Type: text/plain; name="main.c"
>
> #define MAIN
> #include
> #include "SIS.h"
>
> /**********************************
> UML CS 91.522 OOAD Fall 2003
> Ed Klein October 1, 2003
> Assignment #3, SIS report generator
> ***********************************/
>
> void main ()
> {
>   char* view_name; /* string for view name */
>
>   /* define view */
>   view_name="SISView";
>
>   pr_init ("SIS.viewdefs", "SIS_students.dat");
>
>   /* load the database */
>   pr_load(view_name,"SIS_students.dat");
>
>   /* dump the database for verification */
>   pr_dump(view_name,"dump.dat",0,"w");
>
> }
>
> --NextPart_Webmail_9m3u9jl4l_1799_1065227615--
>
>
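
-----------------------------------------------------------------
Sketch: the version rule applied to Ed's TE keys

The rule from sections 1 and 6 (the first two digits of the row number
act as a version, and only the highest version per table is accepted)
can be illustrated with a small C sketch. This is NOT chgen code and
does not reproduce load_data or meets_view. The helper names
pkey_version, highest_version and row_accepted are hypothetical, and
the sketch assumes 8-character keys (2-letter table abbreviation plus
6 digits), with the version modeled as a plain integer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of chgen): return the version encoded
 * in the two digits that follow the 2-letter table abbreviation of an
 * 8-character pkey such as "TE252031". Returns -1 if the key is
 * malformed. */
int pkey_version(const char *pkey)
{
    assert(pkey != NULL);
    if (strlen(pkey) != 8)
        return -1;
    if (!isdigit((unsigned char)pkey[2]) || !isdigit((unsigned char)pkey[3]))
        return -1;
    return (pkey[2] - '0') * 10 + (pkey[3] - '0');
}

/* Hypothetical helper: highest version found among the keys whose
 * first two letters match tbl (e.g. "TE"). Returns -1 if the table has
 * no well-formed keys. This models the "most recent version" choice
 * described in section 6; it is an assumption about the behavior, not
 * a copy of pr_init. */
int highest_version(const char *tbl, const char *const keys[], size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (strncmp(keys[i], tbl, 2) != 0)
            continue;               /* key belongs to another table */
        int v = pkey_version(keys[i]);
        if (v > best)
            best = v;
    }
    return best;
}

/* Hypothetical helper: nonzero if a row with this pkey passes the
 * version check for a view whose version for that table is
 * view_version. */
int row_accepted(const char *pkey, int view_version)
{
    return pkey_version(pkey) == view_version;
}
```

With Ed's five TE keys, pkey_version gives 2 for TE027031 and TE027032
and 25 for the three TE252xxx keys, so highest_version returns 25 and
row_accepted rejects the two TE027xxx rows. That matches dump.dat
above, where only the TE252xxx rows appear.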