Introduction to Data Modeling:
Data modeling is a Conceptual Representation of the Data Structures i.e., tables required for a Database. The Date structure provides a description for the information that flows in a data model. In simple it could be said as means of representing and communicating the business requirements.
Each data model has two components - logical and physical.
(i) The logical component establishes the data requirements to implement business rules. It also represents business rules using entities, attributes, and the relationships between them.
(ii) The physical component maps the data values with the physical databases.
Entities and Attributes:
Attributes are used to apply the business rules to a data model. The values stored in the attributes identify the characteristics of an entity.
The key attributes are:
- Primary Key: Defines a single attribute or a set of attributes to identify the instances of entities. All relationships in a data model are based on primary keys. Domain experts help identify a primary key and discover its attributes. Each key attribute is called a candidate key.
- Simple: Defines an attribute that has been reduced to its smallest possible granularity and is completely dependent on the primary key for values. Each simple attribute must have a value.
- Derived: Defines an attribute derived from the values of one or more attributes. These attributes are dependent on other attributes in the data model for their values.
- Foreign Key (FK): Defines a single attribute or a set of attributes that are primary keys for another entity. These are the key attributes that migrate to another entity in a data model.
Entities represent the identities, physical objects, and other elements, about which Information is stored in the data models. Each entity contains a name, description, and Associated attributes. The values in the attributes identify each element of the entity class.
The different types of entities are:
- Core: Describes the entities in a data model on which the business rules are applied. The core entities are also called the primal entities. They can be dependent or independent. The core entities identify the entity classes that can be reused in a data model. All core entities are extensively reused to implement standardization, reduce implementation time, and minimize changes or errors.
- Code: Describes independent entities. They are also called reference entities. The values assigned to the attributes of the code entities identify unique instances of the record values in a database.
- Subtype: Describes dependent entities. They are used to store a different set of attributes for the same entity. The siblings of the subtype entity are related to their parents using an inclusive or exclusive relationship.
- Associative: Describes the entities that have primary keys that are composite and consist of values from two or more entities. These entities help resolve a many-to-many relationship in which many instances of one entity are related to many instances of another entity.
- Redundant: Describes the entities that have different names but contain information on the same object. These entities are used when a role is modeled for an entity.
- Overloaded: Describes the entities that contain information on multiple objects in a data model. These entities do not have values for each of their attributes.
Relationships in a Database:
Relationship is the number of child and parent entities related by the Relationship. The degree of a relationship is the number of attributes related to each other in any two entities. When the degree of a relationship is unary, an instance of the entity is related to itself. This is also called a recursive relationship. The direction of a relationship Indicates the originating entity in a binary relationship. The originating entity is called the parent entity and the terminating entity is called the child entity.
The different types of relationships are:
- Identifying: Defines a relationship that identifies instances of child entities based on the values of the attributes in a parent entity. The child entity is dependent on the parent entity.
- Subtype: Defines a relationship that relates a parent entity to one or more child entities. This type of relationship identifies one or more instances of an attribute set in the child entities. If there is only one child entity, the relationship is called exclusive. In case there is more than one child entity, the relationship is called inclusive.
- One-to-One: Defines a relationship that relates a single instance of one entity with a single instance of a second entity. The parent and child entities are related using a primary key to identify instances of the attribute values.
- One-to-Many: Defines a relationship that relates a single instance of one entity to many instances in the second entity. The instances of the second entity are identified using composite keys.
- Many-to-Many: Defines a relationship that relates a single instance of one entity to many instances in the second entity, and a single instance of the second entity to many instances of the first entity. These relationships are resolved using the composite keys and the associate keys of the parent entity.
Q & A:
1. When does the Question of Developing a New Data model Arise?
- When a New Application for OLTP, ODS, Data warehouse or data marts are planned.
- When we need to re write the Existing Data Models to a whole new data model due to the existence of incorrect data models.
2. What are the advantages of the Data Modeling?
- Uses all information in the data (let the data do the talking) while estimating parameters of interest.
- Characterize more fully the behavior of a system.
- Simulate experiments before performing them.
3. Roles of an Data Modeler?
- Business Requirements Analysis, Interact with Business Analyst to get the Functional Requirements.
- Interact with End Users and Find out the Reporting Needs.
- Gather Accurate Data by Data Analysis and Functional Analysis.