RJLRef: $PH/COOL-GEN/CandidateAndSurrogateKeys.{htm,doc,txt}

 

This note has links to chgen meta-model and surrogate key topics.

It is very relevant to asgnt3.

 

1. Candidate Key Generalization and Navigation:

-----------------------------------------------

[CK = CandidateKey, CA = CK<--->TA Associative Entity]

This topic does not solve 06f522/asgnt3 but it shows that CK's can be generalized using the same metaschema content; it gives processing  examples based on child_loop (foreach) iteration principles  that can implement asgnt3. These principles can be used after pr_loading tables SV, TT, TA, CK and CA from your augmented metaschema.msdat.

 

Note that chgen is 'boot-strapped' by running it twice:

 

      Pass1 converts the metaschema.sch to .msdat form.

 

      Pass 2 tests your candidate key methods

 

Pass 2 is an application of user-provided CK and CA tables. If only SOME real applications need to test candidate key  uniqueness and existence, then these CK and CA-based methods could be auto-generated under control of a new chgen runtime option.

 

http://www.cs.uml.edu/~lechner/Obj-RelDBv2/ (slides 3 -7) gives an example of a user-defined candidate key with components 91,522,291.

 

Only the sectionNo component is in the CourseSection table, whose instance row a user may be seeking (or entering).  Yet for query purposes, <Dept,Course,Section> certainly should be a candidate key.

 

Does this affect the metaschema?

 

If you  generalize a CK for one table type TT by permitting its  component fields to include attributes of any superclass or composite ancestor as well as local TA-attributes in its TT-->TA attribute set, then the meta-data model doesn't need to be changed. However, the same-TT-parent constraint needs to be relaxed to allow super-class or container ancestors of a TT to be components of

a candidate key CK.

 

Note that the two paths from CA to TT (via TAid or via CKid) do not necessarily locate the same TT-ancestor. How do you suppose the participating TT instances

must be related to insure the CK's uniqueness property?

 

 

We can inspect the TAcurr->TTid parents of FK-type TA's  that are associated with a CK via its child CAcurr->TAid fkeys. This way, we can check whether all of the CK's component fields are children of the SAME TT, or are spread over its ancestors.

 

Testing the values of CKcurr->CAid_fcp->[CKid_fpp->...]TAid_pp->TTid (meta-data contained in TT-->TA at the row or instance-level) can show if they are all identical to CKcurr->TTid and hence whether all CA_children of a CK refer to TA's of the same TT to which CK belongs (because it has that TTid as an fkey).

 

These sibling CA-children of one CK are iterated over by the macro call child_loop(CK,CA,CAid,CKid) which expands to:

 

for (CAcurr=CKcurr->CAid_fcp;  CAcurr!=CKcurr; CAcurrr=CAptr->CKid_fpp)

{.../*put your code in these braces*/}

 

[You can inspect macro call expansions into their bodies by running

      g++ -E f.cc. 

This invokes cpp to produce f.E in which macro calls are expanded.

The child_loop macro is #defined in schema.h, and customized by its ttabbrev and key arguments for any related table pair:

 

E.g., at $PH/COOL-BDE/bdegen13/bde/pr_util_nolog/94sbde_schema.h:676:

 

--------------------------

#define child_loop(parent,child,pkey,ckey) \

  assert(parent##curr!= NULL);  /* RJL060228 */\

  for (child##curr = parent##curr->pkey##_fcp; \

       ((child##curr != NULL) && ((struct parent *) child##curr) != (parent##curr)); \

       child##curr = (struct child *) child##curr->ckey##_fpp)

--------------------

 

========================================================

 

 

 

2. Links to data modeling concepts, metadata and chgen:

-----------------------------------------------------

 

For more on surrogate keys in the Dept/Course/Section example,  

see slides 3 - 7 in

 

      $PH/Obj-RelDBv2/    (slides 3 - 7)

 

[$PH is http://www.cs.uml.edu/~lechner to your browser.]

 

 

For more details on metaschema.sch in relation to any_application.sch

see this memo:

     

      $PH/COOL-GEN/NamespaceAndSchemaIntegration.htm

       

This suggests how to split from chgen the code to parse a .sch file and write its .msdat equivalent. This code defines a s2t converter from .sch to .msdat.  Then, chgen can pr_load its data model as .msdat instead of parsing the .sch file directly.

 

[Bde plus b2t can also define the .msdat file and all pfkey attributes. However, bde is not the best way to define non-key attributes. They can best be declared textually in the .sch file from t2s, after which s2t can regenerate table TA before running chgen.]

 

 

NamespaceAndSchemaIntegration.htm also contains links to these references:

 

      $PH/COOL-GEN/TTTA_metadata_jk.ppt   (genv10+ internals)

 

      $PH/COOL-GEN/hcg_struct_migration.ppt  (proposed genv14 internals)

 

      $PH/DataModels05fr1.ppt (data: slides 39-50; metadata: slides 51-62)

 

      $PH/COOL-FAQ/COOL_FAQv6.PPT (slides 11-19: chgen vs.gencpp)

 

 

[$PH is http://www.cs.uml.edu/~lechner in your browser.]

 

========================================================

 

 

 

 

 

 

3. Surrogate key definitions:

---------------------------------

 

Surrogate key has a specific meaning in DB literature. Please do not ignore it,

 

E.g., Date and Kroenke define surrogate key as follows::

 

C J Date: The DB Relational Model, A-W 2001 (pp.124-125):

--------------------

"[Codd's] RM/T paper is also the first in which Codd embraced the idea

of surrogates - meaning, essentially, system-assigned identifiers."

...

"Hiding surrogates constitutes a clear violation of Codd's own  

Information Principle, which states that all information in the

database must be cast explicitly in terms of values in relations

and in no other way."

------------------

 

D M Kroenke: Database Concepts (2ed), P-H 2005 (p. 33):

-----------------------

"a surrogate key ... is a unique numeric value that is appended to

the relation to serve as the primary key.  Surrogate key values have no

meaning to the user and are normally hidden from them."

...

"With SQL Server, the developer can specify the starting value of the

surrogate key as well as the amount to increment the key for each new row."

----------------------