Table is a Myth, It's Just a Relationship
- Ajay Kedia
- Jan 28, 2016
- 3 min read
"A mental condition found in professional data modelers, that is, thinking about tables too early in the process of database design. Some people popularly refer to this condition as 'Table Think'."
Table (n.) – a collection of information (data?) describing a population of entities which possess some common characteristics, called attributes.
Relation – The building blocks of relational databases are, of course, relations--database abstract representations of sets of facts about classes of property-sharing entities.
"The problem with 'logical' modeling ... is that it is too biased towards relational database design ... My preference is for what I've come to call 'essential' data modeling ... in terms of the 'things of significance' to the [business] domain ... Just Person, Activity, Contract, etc. ... relationship names are very important. They portray the facts that link these 'things' together. In each direction you have an assertion, such as 'each ORDER must be composed of one or more LINE ITEMs'."
An irony that goes unnoticed is that, at a time when "data science" and "machine learning" are busily discovering insightful patterns (or so we are constantly assured), criticism of the relational data model (RDM) is rampant. It is ironic because "patterns" is just another name for relationships and, if the RDM is about anything, it is about relationships.

When E. F. Codd introduced the RDM in the late 1960s, there was no mention of tables. The R in RDM comes from relations. A mathematical relation is a kind of set -- more specifically, a set of sets called tuples, each of which is a set of attribute values, attributes also being sets. A relation is, in essence, a collection of relationships among its attributes and among its tuples, and there are relationships among relations as well.
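As a rough illustration of that set-based definition, here is a minimal Python sketch. The helper `make_tuple` and the `employees` data are hypothetical names invented for this example, and modelling a tuple as a frozenset of (attribute, value) pairs is only one convenient approximation, not Codd's formal definition.

```python
def make_tuple(**attrs):
    # A tuple is modelled as a set of (attribute, value) pairs,
    # so the order in which attributes are written does not matter.
    return frozenset(attrs.items())

# A relation is modelled as a set of such tuples: no duplicates, no row order.
employees = {
    make_tuple(name="Ann", dept="Sales"),
    make_tuple(dept="Sales", name="Ann"),  # the same fact, attributes reordered
    make_tuple(name="Bob", dept="IT"),
}

print(len(employees))  # 2: the reordered duplicate collapses into one tuple
```

Nothing in this structure has rows, columns or a left-to-right layout; those only appear once the relation is drawn as a table.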
If relations are used to represent sets of facts about the real world, an RDBMS could, in response to user queries, make logical inferences, i.e., apply set-mathematical operations -- the relational algebra (RA) -- to database relations to derive new sets of facts (relations). In other words, this is analytics, with the advantage of mathematically guaranteed, logically correct results.
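To make the idea of deriving new facts concrete, the following is a small sketch of three relational algebra operations written as plain set operations in Python. The function names (`row`, `restrict`, `project`, `natural_join`) and the sample `orders` and `line_items` relations are assumptions made for illustration; a real RDBMS implements these operations very differently.

```python
def row(**attrs):
    # A tuple as a frozenset of (attribute, value) pairs.
    return frozenset(attrs.items())

def restrict(relation, predicate):
    # Keep only the tuples for which the predicate is true.
    return {t for t in relation if predicate(dict(t))}

def project(relation, attributes):
    # Keep only the named attributes; the set removes any duplicates.
    return {frozenset((a, v) for a, v in t if a in attributes) for t in relation}

def natural_join(left, right):
    # Combine tuples that agree on every attribute they share.
    result = set()
    for l in left:
        for r in right:
            merged = dict(l)
            if all(merged.get(a, v) == v for a, v in r):
                merged.update(dict(r))
                result.add(frozenset(merged.items()))
    return result

orders = {row(order_id=1, customer="Ann"), row(order_id=2, customer="Bob")}
line_items = {
    row(order_id=1, item="pen"),
    row(order_id=1, item="ink"),
    row(order_id=2, item="pad"),
}

# Derived fact: which customers ordered a pen?
joined = natural_join(orders, line_items)
pen_buyers = project(restrict(joined, lambda t: t["item"] == "pen"), {"customer"})
print(pen_buyers == {row(customer="Ann")})  # True
```

Every operation takes relations and returns a relation, which is what allows queries to be composed freely while the result remains a set of facts.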
However, relations -- database and results -- must be presented to users in some way. Initially Codd considered the array, but then realized that a special kind of table is simpler and more familiar. The table has been an important factor in the adoption of relational technology, so much so that, somewhere along the way, in an industry devoid of education, relations got lost in the shuffle and, with them, the importance of the underlying relationships and the guarantee of correct query results. SQL tables, for example, are not necessarily R-tables.
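One well-known way in which a SQL table can differ from a relation is that SQL permits duplicate rows when no key is declared. The sketch below uses Python's standard `sqlite3` module with a hypothetical `employee` table to show the difference; it illustrates only this single point, not every way SQL departs from the RDM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No key is declared, so the table accepts the same row twice.
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT)")
rows = [("Ann", "Sales"), ("Ann", "Sales"), ("Bob", "IT")]
conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)

# The SQL table holds three rows...
table_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

# ...but as a set of facts (a relation) there are only two.
relation = set(conn.execute("SELECT name, dept FROM employee").fetchall())

print(table_count, len(relation))  # 3 2
conn.close()
```

A count over the table and a count over the set of facts disagree, which is the kind of discrepancy that undermines the guarantee of correct query results.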
"Logical refers to the relationships among the components of the relation, not to any arrangement of the components of a relation. Any presentation that preserves those relationships and adds no extra ones is acceptable. An R-table is one possible such presentation. The problem is that people fixate on this one presentation, identifying it with relation. They then go even further and force the physical implementation of a relation to be table-like."
And then blame the RDM for poor performance too!
For Everest there are no relations, only tables. Similarly, when Hay deplores “logical modeling that is too biased towards relational database design”, he is focused on tables, not on "logical" as in logic.
Even more ironic is that what Hay calls "essential modeling" -- believing, like so many, that he invented something new and fundamental -- is nothing but relational thinking about the world. The assertion "Each ORDER is composed of one or more LINE ITEMs" is an informal formulation in English of a formal predicate in logic. And it so happens that, together with set theory, first-order predicate logic (FOPL) is the other half of the dual theoretical foundation of the RDM. Every expression of the relational algebra, e.g., a join, describes a relation and is equivalent to a FOPL predicate that describes the relation in terms of its membership function. An R-table is only a way to present such a relation visually to users. If you conceptualize the world as predicates, what logical design is better than relations and what presentation of relations is better than R-tables?
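The link between an English assertion, a predicate and a relation can be sketched in a few lines of Python. The names `is_composed_of`, `composed_of` and `each_order_has_line_item`, along with the sample data, are hypothetical and chosen only to mirror the ORDER and LINE ITEM example; this is an informal illustration rather than a rigorous treatment of FOPL.

```python
orders = {1, 2}
line_items = {(1, "pen"), (1, "ink"), (2, "pad")}  # (order_id, item)

def is_composed_of(order_id, line_item):
    # Predicate: "ORDER order_id is composed of LINE ITEM line_item".
    return line_item[0] == order_id

# The relation is exactly the set of pairs for which the predicate holds,
# i.e. the predicate acts as the relation's membership function.
composed_of = {(o, li) for o in orders for li in line_items if is_composed_of(o, li)}

def each_order_has_line_item(orders, line_items):
    # "Each ORDER is composed of one or more LINE ITEMs":
    # for all o in orders, there exists li in line_items with is_composed_of(o, li).
    return all(any(is_composed_of(o, li) for li in line_items) for o in orders)

print(each_order_has_line_item(orders, line_items))        # True
print(each_order_has_line_item(orders | {3}, line_items))  # False: order 3 has no items
```

The quantified assertion is a constraint over the relations, and the relation itself is determined by the predicate; neither depends on any tabular layout.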
The RDM was devised for essential modeling, and that is what is essential about it, not tables. Criticism such as in the above examples is indisputable evidence of ignorance of data and relational fundamentals. Don't allow yourself to be misled. Learn the fundamentals and, when you use SQL tables, focus on the underlying relationships rather than on the table format, which is exactly what a good analyst should always do.