MS Computer Science Virtual University of Pakistan

CS712 Midterm Solved Questions – Distributed Database Management System

CHAPTER NO.1

A Distributed Database Management System builds on two basic concepts:

  1. Database Management Systems
  2. Networking
  • Data processing applications are, in computer terminology, referred to as “File Processing Systems”.

Distributed Computing Systems:

“A system consisting of a number of autonomous processing elements that are connected by a computer network and that cooperate in performing their assigned tasks.”

CHAPTER NO.2

Distributed Database System:

A collection of logically interrelated databases that are spread physically across multiple locations connected by data communication links.

Types of Access of DDBS:

Local Access: access by users connected to a site to data stored at that same site.

Remote Access: a user connected to one site, let’s say site 1, accesses data stored at site 2.

Global Access: no matter from where the access is made, the data is displayed after being collected from all locations.

Distributed Databases: where to apply?

  1. A large number of users
  2. Users physically spread across a large geographical area

Following are some of the database applications that are strong candidates for a DDBS.

  • Banking Application
  • Air Ticketing
  • Business at multiple locations

Distributed Database Management System:

A software system that permits the management of a distributed database and makes the distribution transparent to the users.

Decentralized Database:

A collection of independent databases on non-networked computers.

LECTURE NO.3

RAM sharing               =          Tight Coupling

Hard disk sharing         =          Loose Coupling

  • The individual local DBMSs are called local nodes.

The DDBMS contains the global schema, which is basically the merger of all the local schemas. The users connected to the DDBMS layer are called global users.

Database Administrator:      A person who maintains the data at a local site.

Local User:    A user working at the local site is called a local user.

Promises of DDBS:

  1. Transparency
  2. Data independence
      • Logical data independence: if we change the conceptual schema, there is little or no effect on the external level.
      • Physical data independence: if we change the physical (lower) level, there is little or no effect on the conceptual level.
  3. Network transparency
  4. Replication transparency

LECTURE NO.4

Fragmentation Transparency:

A file or table broken down into smaller parts/sections; these parts are called fragments. There are two major types of fragmentation, i.e. vertical and horizontal.

Fragmentation transparency means that a user does not need to know that the database is fragmented.

Responsibility of Transparency:

  1. Language/Compiler

The language or compiler used to develop the front-end application programs of a database system provides the transparency of the linking and manipulation of data.

  2. Operating System

The OS provides network transparency.

  3. DDBMS

The DDBMS provides fragmentation and replication transparency.

Performance improvement by DDBS:

  1. Data localization
  2. Query parallelism
  3. Inter-query parallelism:

It is the form of parallelism in which many different queries or transactions are executed in parallel with one another on many processors. Examples are Oracle 8 and Oracle Rdb. It increases transaction throughput.

  4. Intra-query parallelism:

It is the form of parallelism in which a single query is executed in parallel on many processors. Examples are Informix and Teradata. It speeds up single, complex, long-running queries and is used for complex scientific calculations.

Complicating Factors of DDBS:

  1. Selection of the copy.
  2. Failure recovery
  3. Complexity
  4. Cost
  5. Distribution of control

The problem Areas of DDBS:

  1. Database design
  2. Query processing
  3. Concurrency control
  4. Operating system
  5. Heterogeneity – composition from dissimilar parts. Any type of difference can occur across sites and has to be managed, which adds to the complexity of the system.

CHAPTER NO.5

Data Model:

A set of tools that are used to design a database. A database and its database management system are based on a data model.

  1. Record-based (legacy) data models
      • Hierarchical
      • Network
      • Relational
  2. Semantic data models

Examples of semantic data models are the Entity-Relationship and Object-Oriented data models.

Relational Data Model:

A data model based on relation or table.

Key:

  1. Super Key.
  2. Candidate Key
  3. Primary Key
  4. Alternate Key
  5. Secondary Key
  6. Foreign Key

Normalization:

It is a step-by-step process to produce an efficient database design that makes it easier to maintain the consistency of the database. The normalization process is used to remove the database’s anomalies.

Anomalies:

These make the database inconsistent or incorrect. These anomalies are: –

  • Duplication
  • Insertion anomaly
  • Update anomaly
  • Deletion anomaly

Normal Forms:

  • First Normal Form
  • Second Normal Form
  • Third Normal Form
  • BCNF
  • Fourth Normal Form
  • Fifth Normal Form

Dependency:

Normalization is based on dependencies. Dependencies are identified not designed.

CHAPTER NO.6

  1. Functional Dependency:
  2. Multi-valued Dependency:
  3. Project Join Dependency:

Functional Dependency:

Functional Dependency exists when the value of one or more attributes can be determined from the value of one or more other attributes.

  1. Full functional dependency
  2. Partial functional dependency
  3. Transitive dependency

Determinant:       An attribute or set of attributes whose values determine the values of other attributes.

Normalization Guidelines:

Transfer the attributes into table format              →   Unnormalized form

Remove repeating groups                                →   First normal form

Remove partial dependencies                            →   Second normal form

Remove transitive dependencies                         →   Third normal form

Make sure that every determinant is a candidate key    →   Boyce-Codd normal form

Remove multi-valued dependencies                       →   Fourth normal form

Remove join dependencies                               →   Fifth normal form

Question No.1:

a). Describe the difference between De-normalization and normalization with the help of examples.

Answer:

Normalization and de-normalization are processes used to optimize the performance of the database.

Normalization:

Normalization is a process carried out to minimize/remove anomalies in order to maintain the database’s consistency. An example of normalization is given below: –

Company ID   Company Name         Product ID   Product Name   Product Quantity
1            Iqra Enterprises     101          Beauty Soap    120
                                  102          Tooth Brush    100
2            Shahid Enterprises   103          Polish         140
3            Khalid Enterprises   103          Polish         250

First Normal Form:

Company ID   Company Name         Product ID   Product Name   Product Quantity
1            Iqra Enterprises     101          Beauty Soap    120
1            Iqra Enterprises     102          Tooth Brush    100
2            Shahid Enterprises   103          Polish         140
3            Khalid Enterprises   103          Polish         250

 Second Normal Form:

Company Table:

Company ID   Company Name
1            Iqra Enterprises
1            Iqra Enterprises
2            Shahid Enterprises
3            Khalid Enterprises

Product Table:

Product ID   Product Name   Product Quantity
101          Beauty Soap    120
102          Tooth Brush    100
103          Polish         140
103          Polish         250

Company – Product Table

Company ID   Product ID
1            101
1            102
2            103
3            103

Third Normal Form:

Company Table:

Company ID   Company Name
1            Iqra Enterprises
2            Shahid Enterprises
3            Khalid Enterprises

Product Table:

Product ID   Product Name   Product Quantity
101          Beauty Soap    120
102          Tooth Brush    100
103          Polish         140
103          Polish         250

Company – Product Table

Company ID   Product ID
1            101
1            102
2            103
3            103
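
As a rough sketch, the 3NF design above could be declared in SQL as follows (column names and types are assumptions, not from the lecture):

CREATE TABLE Company (
    CompanyID    INT PRIMARY KEY,
    CompanyName  VARCHAR(50)
);

CREATE TABLE Product (
    ProductID        INT,
    ProductName      VARCHAR(50),
    ProductQuantity  INT           -- ProductID 103 appears twice above, so it is not declared a key in this sketch
);

CREATE TABLE CompanyProduct (
    CompanyID  INT REFERENCES Company(CompanyID),
    ProductID  INT
);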

De-normalization:

De-normalization is the reverse process of normalization. It reduces the workload and optimizes performance. An example of de-normalization is given below: –

Employee Table:

Employee ID   Employee Name
1             Shahid
2             Rizwan
3             Khalid

Course Table:

Employee ID   Course Qualified
1             MS Office
2             Web Designing
3             C++ Programming

After De-normalization,

Employee ID   Employee Name   Course Qualified
1             Shahid          MS Office
2             Rizwan          Web Designing
3             Khalid          C++ Programming
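
The denormalized table above can be produced by joining the two tables (a sketch; SQL names are assumed):

SELECT e.EmployeeID, e.EmployeeName, c.CourseQualified
FROM   Employee e
JOIN   Course c ON c.EmployeeID = e.EmployeeID;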

 Integrity Rules:

Integrity Rules are used to maintain the consistency of database. There are two types of integrity Rules:

  • Structural Integrity Constraint.
  • Behavioral Integrity Constraint.

CHAPTER NO.7

Structural Integrity Constraint:

  • Entity Integrity Constraint:

The primary key cannot be null, even if it consists of more than one attribute; none of its attributes can have a null value. Primary key values cannot be repeated.

  • Referential Integrity Constraint:

It involves the concept of a foreign key. A foreign key value can either be null or match an existing primary key value in the referenced relation.
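
A minimal sketch of both constraints in SQL (table and column names are illustrative assumptions):

CREATE TABLE Dept (
    DeptNo    INT PRIMARY KEY,            -- entity integrity: the key can neither be NULL nor repeat
    DeptName  VARCHAR(40)
);

CREATE TABLE Emp (
    EmpNo   INT PRIMARY KEY,
    DeptNo  INT REFERENCES Dept(DeptNo)   -- referential integrity: NULL or an existing DeptNo
);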

Data manipulation languages can be used to carry out different operations, like insertion into, deletion from, or creation of a database.

Data Relation Languages:

  • Procedural Languages
  • Non-procedural languages

Relational Algebra:

  1. The Select Operation:

It is used to select specific tuples (rows) of a table.

  2. The Project Operation:

It is used to select specific column(s) of a table.

  3. The Union Operation:

Combines the tuples of both tables.

  4. The Intersection Operation:

Selects the tuples which are present in both tables.

  5. Set Difference:

Selects all those tuples which are in the first table but not in the second.

  6. Cartesian Product:

Multiplies the tuples of both tables: every tuple of the first is paired with every tuple of the second.

JOIN OPERATION:

  1. Theta Join:

Rows of the cross product of the two relations are selected on the basis of a comparison condition (θ).

  2. Equi Join:

Rows are joined on the basis of equality of common attributes between the relations.

  3. Natural Join:

It is the same as the equi join, except that the common attributes appear only once.

  4. Semi Join:

First we take the equi join and then project the attributes of the first table only.

  5. Outer Join (see the SQL sketch after this list)
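
As a sketch, the operations above correspond roughly to the following SQL (relation and attribute names are invented for illustration):

SELECT * FROM Emp WHERE salary > 30000;                              -- selection (σ)
SELECT empNo, empName FROM Emp;                                      -- projection (π)
SELECT city FROM Emp UNION SELECT city FROM Dept;                    -- union
SELECT city FROM Emp INTERSECT SELECT city FROM Dept;                -- intersection
SELECT city FROM Emp EXCEPT SELECT city FROM Dept;                   -- set difference
SELECT * FROM Emp CROSS JOIN Dept;                                   -- Cartesian product
SELECT * FROM Emp JOIN Dept ON Emp.deptNo = Dept.deptNo;             -- equi join
SELECT * FROM Emp NATURAL JOIN Dept;                                 -- natural join (common attributes appear once)
SELECT * FROM Emp LEFT OUTER JOIN Dept ON Emp.deptNo = Dept.deptNo;  -- outer join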

LECTURE NO.8

Types of Network:

Some common topologies are given below:-

  1. Star topology.
  2. Ring topology.
  3. Mesh topology.
  4. Bus topology.

Transmission mode:

Point-to-point transmission

Broadcast (multi-points)

LECTURE NO.9

Geographically distribution of Network:

  • LAN (Local Area Network)
  • MAN (Metropolitan Area Network)
  • WAN (Wide Area Network)

Circuit Switching:

Establishes a dedicated link between the nodes.

Packet Switching:

The message is divided into packets. Each packet contains a unique ID. Packets may follow different paths to reach the same destination.

ISO/OSI (International Organization for Standardization / Open Systems Interconnection)

It consists of seven layers which are given below: –

  1. Physical Layer
  2. Data Link Layer
  3. Network Layer
  4. Transport Layer
  5. Session Layer
  6. Presentation Layer
  7. Application Layer

Architecture of DDBMS:

  1. Peer to Peer
  2. Client/Server
  3. Multi-databases

DDBMS Standardization:

  1. Component based
  2. Function based
  3. Data based

LECTURE NO.10

  • Database Administrator is responsible for internal schema.
  • Application Administrator is responsible for external schema.

Architectural models for DDBMS:

  1. Autonomy
  2. Distribution
  3. Heterogeneity

 

  1. Autonomy:

The degree to which member databases can operate independently.

  • Design autonomy
  • Communication autonomy
  • Execution autonomy

Classification of DDBMS (by autonomy):

  • Tight integration
  • Semi-autonomous
  • Total isolation

  2. Distribution:

A system cannot truly be called a DDBMS if the data is not physically distributed.

  3. Heterogeneity:

Simply means variation/difference.

  • Homogeneous
  • Heterogeneous

LECTURE NO.11

Components of DDBMS:

One component handles the interaction with users and the other deals with the data.

The first component is called the user processor; it consists of four elements: –

  • The User interface handler
  • The Semantic data controller.
  • The global query optimizer and decomposer
  • The distributed execution monitor. The distributed execution monitor is also called distributed transaction manager.

The second component is called the data processor; it consists of three elements: –

  • The local query optimizer
  • The local recovery manager
  • The run-time support processor.

LECTURE NO.12

Global Data Dictionary:

A data dictionary is a database that contains data about data (metadata). In the case of a DDBS it is called the global data dictionary.

Distributed Database Design Strategies:

  • Top-down approach

It is used when a database is being designed from scratch.

Issues: Fragmentation & allocation

  • Bottom-up approach

Integration of existing databases.

Issues: Design of the export and global schemas.

Reasons for Fragmentation:

  • Easier to maintain
  • Increases system throughput
  • Reduces the chance of system failure
  • Parallel execution of queries
  • Suitable for large organizations

LECTURE NO.13

Fragmentation

  1. Horizontal
  2. Vertical

FRAGMENTATION ALTERNATIVES

JNO   JNAME           BUDGET    LOC
J1    Instrumental    150,000   Montreal
J2    Database Dev.   135,000   New York
J3    CAD/CAM         250,000   New York
J4    Maintenance     350,000   Paris

HORIZONTAL PARTITIONING

JNO   JNAME           BUDGET    LOC
J1    Instrumental    150,000   Montreal
J2    Database Dev.   135,000   New York

JNO   JNAME           BUDGET    LOC
J3    CAD/CAM         250,000   New York
J4    Maintenance     350,000   Paris

VERTICAL PARTITIONING

JNO   BUDGET
J1    150,000
J2    135,000
J3    250,000
J4    350,000

JNO   JNAME           LOC
J1    Instrumental    Montreal
J2    Database Dev.   New York
J3    CAD/CAM         New York
J4    Maintenance     Paris
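
As a sketch, the partitions above could be expressed as queries on PROJ (the horizontal split appears to be by budget; that predicate is an assumption, since only the resulting rows are shown):

-- Horizontal fragments
SELECT * FROM PROJ WHERE BUDGET <= 200000;   -- J1, J2
SELECT * FROM PROJ WHERE BUDGET >  200000;   -- J3, J4

-- Vertical fragments (the key JNO is kept in both so the original relation can be rebuilt by a join)
SELECT JNO, BUDGET FROM PROJ;
SELECT JNO, JNAME, LOC FROM PROJ;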

LECTURE NO.14

Primary Horizontal Fragmentation:

Here we will determine:

  • Simple Predicates
  • Min-term Predicates
  • Implications
  • Final result of PHF.

Let’s see this example

Simple Example

Pay1 = σ salary ≤ 3000 (Pay)

Pay2 = σ salary > 3000 (Pay)
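
In SQL terms, this simple fragmentation could be sketched as:

SELECT * FROM Pay WHERE salary <= 3000;   -- Pay1
SELECT * FROM Pay WHERE salary >  3000;   -- Pay2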

Example

Consider the following relation

EmpId   Name       jobTitle                Location    Salary
115     Ali        Data Entry Operator     Lahore      20,000
116     Zohaib     Programmer              Islamabad   35,000
223     Amir       Network Administrator   Lahore      38,000
133     Iftikhar   Data Entry Operator     Karachi     22,000
405     Salman     Network Administrator   Karachi     42,000
506     Habib      Project Manager         Islamabad   60,000
148     Zain       Network Manager         Lahore      55,000

A company has sites in three cities: Islamabad, Lahore and Karachi. Most queries from the Lahore and Karachi sites are about local employees. At the Islamabad site, around 40% of queries are about local employees and a similar proportion of queries is about managerial-level employees. In addition, data entry operators are not kept at the Islamabad site.

Perform PHF on the relation “Emp”. Determine:

  • The set of simple predicates Pr
  • The set of minterm predicates M
  • The set I of implications among pi ∈ Pr
  • The resultant fragments

ANSWER:

  • Simple Predicate Pr:

p1 : jobTitle = “Data Entry Operator”

p2 : jobTitle = “Programmer”

p3 : jobTitle = “Network Administrator”

p4 : jobTitle = “Project Manager”

p5 : jobTitle = “Network Manager”

p6 : Location = “Islamabad”

p7 : Location = “Lahore”

p8 : Location = “Karachi”

  • Minterm Predicates M:

m1 :  (jobTitle = “Programmer”) ^ (Location = “Islamabad”)

m2 :  (jobTitle = “Project Manager”) ^ (Location = “Islamabad”)

m3 :  (jobTitle = “Data Entry Operator”) ^ (Location = “Lahore”)

m4 :  (jobTitle = “Network Administrator”) ^ (Location = “Lahore”)

m5 :  (jobTitle = “Network Manager”) ^ (Location = “Lahore”)

m6 :  (jobTitle = “Data Entry Operator”) ^ (Location = “Karachi”)

m7 :  (jobTitle = “Network Administrator”) ^ (Location = “Karachi”)

  • Implications among pi ∈ Pr:

Pr = Pr’ = {p1, p2, p3, p4, p5, p6, p7, p8}

Implications are

p1 ⇒ ¬p2 ∧ ¬p3 ∧ ¬p4 ∧ ¬p5

p2 ⇒ ¬p1 ∧ ¬p3 ∧ ¬p4 ∧ ¬p5

p3 ⇒ ¬p1 ∧ ¬p2 ∧ ¬p4 ∧ ¬p5

p4 ⇒ ¬p1 ∧ ¬p2 ∧ ¬p3 ∧ ¬p5

p5 ⇒ ¬p1 ∧ ¬p2 ∧ ¬p3 ∧ ¬p4

p6 ⇒ ¬p7 ∧ ¬p8

p7 ⇒ ¬p6 ∧ ¬p8

p8 ⇒ ¬p6 ∧ ¬p7

  • Resultant Fragments:

Emp1

EmpId   Name       jobTitle              Location   Salary
115     Ali        Data Entry Operator   Lahore     20,000
133     Iftikhar   Data Entry Operator   Karachi    22,000

Emp2

EmpId   Name     jobTitle     Location    Salary
116     Zohaib   Programmer   Islamabad   35,000

Emp3

EmpId   Name     jobTitle                Location   Salary
223     Amir     Network Administrator   Lahore     38,000
405     Salman   Network Administrator   Karachi    42,000

 Emp4

EmpId   Name    jobTitle          Location    Salary
506     Habib   Project Manager   Islamabad   60,000

Emp5

EmpId   Name   jobTitle          Location   Salary
148     Zain   Network Manager   Lahore     55,000
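
In the result above the minterm fragments collapse into one fragment per job title; a sketch of equivalent SQL selections:

SELECT * FROM Emp WHERE jobTitle = 'Data Entry Operator';    -- Emp1
SELECT * FROM Emp WHERE jobTitle = 'Programmer';             -- Emp2
SELECT * FROM Emp WHERE jobTitle = 'Network Administrator';  -- Emp3
SELECT * FROM Emp WHERE jobTitle = 'Project Manager';        -- Emp4
SELECT * FROM Emp WHERE jobTitle = 'Network Manager';        -- Emp5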

LECTURE NO.15

Derived Horizontal Fragmentation:

In DHF, two important points are:-

  • Each link between owner and member is defined as an equi-join.
  • Equi-join can be implemented by means of semi-join.

Example of DHF:

Two relations, Emp and Pay, are given as under: –

Emp

Emp No.   Emp Name     Title
E1        Tahir Khan   Electrical Engineer
E2        Waris Shah   System Analyst
E3        Rizwan       Mechanical Engineer
E4        Ali          Programmer
E5        Shahid       System Analyst
E6        Noor         Electrical Engineer
E7        Wali         Mechanical Engineer
E8        Khalid       System Analyst

 Pay

Title                 Salary
Electrical Engineer   40000
System Analyst        34000
Mechanical Engineer   27000
Programmer            24000

The two fragments Pay1 and Pay2 of Pay, and Emp1 and Emp2 of Emp, are defined as follows: –

Pay1= σ salary > 30000(Pay)

Pay2=σ salary ≤ 30000(Pay)

Emp1 = Emp ⋉ Pay1

Emp2 = Emp ⋉ Pay2

So, the result of DHF is given below: –

Emp1

Emp No.   Emp Name     Title
E1        Tahir Khan   Electrical Engineer
E2        Waris Shah   System Analyst
E5        Shahid       System Analyst
E6        Noor         Electrical Engineer
E8        Khalid       System Analyst

Emp2

Emp No.   Emp Name   Title
E3        Rizwan     Mechanical Engineer
E4        Ali        Programmer
E7        Wali       Mechanical Engineer
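
The semi-joins used above can be sketched in SQL with subqueries (names taken from the tables above):

-- Emp1 = Emp ⋉ Pay1
SELECT Emp.* FROM Emp
WHERE  Title IN (SELECT Title FROM Pay WHERE Salary > 30000);

-- Emp2 = Emp ⋉ Pay2
SELECT Emp.* FROM Emp
WHERE  Title IN (SELECT Title FROM Pay WHERE Salary <= 30000);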

LECTURE NO.16

Vertical Fragmentation:

Two approaches are used for vertical fragmentation which are given below: –

Grouping: start with each individual attribute as one fragment; at each step, join some of the fragments until some criterion is satisfied.

  • Attributes to fragments

Splitting: start with the global relation and generate beneficial partitions based on the access behaviour of the applications.

  • Relations to fragments

LECTURE NO.17

Example:

Consider the relation Proj(jNo, jName, budget, loc). Assume that the following applications are defined to run on this relation. In each case we also give the SQL specification.

q1:       SELECT BUDGET FROM PROJ WHERE JNO=Value

q2:       SELECT JNAME, BUDGET FROM PROJ

q3:       SELECT JNAME FROM PROJ WHERE LOC=Value

q4:       SELECT SUM(BUDGET) FROM PROJ WHERE LOC=Value

Solution:

Let A1=jNo, A2=jName, A3=budget, A4=loc

So, the Attribute Usage Matrix will be

\begin{array}{l}\,\,\,\,\,\,\,{{A}_{1}}\,\,\,\,{{A}_{2}}\,\,{{A}_{3}}\,\,{{A}_{4}}\\\begin{matrix}  {{q}_{1}} \\  {{q}_{2}} \\  {{q}_{3}} \\  {{q}_{4}} \\  \end{matrix}\left[ \begin{matrix}  1 & 0 & 1 & 0 \\  0 & 1 & 1 & 0 \\  0 & 1 & 0 & 1 \\  0 & 0 & 1 & 1 \\  \end{matrix} \right]\end{array}

Example:

Let us assume that refl(qk) = 1 for all qk and Sl, and that the application access frequencies are:

Acc1(q1) = 15             Acc2(q1) = 20             Acc3(q1) = 10

Acc1(q2) = 5               Acc2(q2) = 0               Acc3(q2) = 0

Acc1(q3) = 25             Acc2(q3) = 25             Acc3(q3) = 25

Acc1(q4) = 3               Acc2(q4) = 0               Acc3(q4) = 0

Solution:

The Frequency Matrix will be

\displaystyle \begin{array}{l}\begin{matrix}  {} & {{\text{S}}_{\text{1}}} & {{\text{S}}_{\text{2}}} & {{\text{S}}_{\text{3}}} \\  \end{matrix}\\\begin{matrix}  {{\text{q}}_{\text{1}}} \\  {{\text{q}}_{\text{2}}} \\  {{\text{q}}_{\text{3}}} \\  {{\text{q}}_{\text{4}}} \\  \end{matrix}\left[ \begin{matrix}  \text{15} & \text{20} & \text{10} \\  \text{5} & \text{0} & \text{0} \\  \text{25} & \text{25} & \text{25} \\  \text{3} & \text{0} & \text{0} \\  \end{matrix} \right]\end{array}

And the Attribute Usage Matrix is also given, such that

\displaystyle \begin{array}{l}\,\,\,\,\,\,\,{{A}_{1}}\,\,\,\,{{A}_{2}}\,\,{{A}_{3}}\,\,{{A}_{4}}\\\begin{matrix}  {{q}_{1}} \\  {{q}_{2}} \\  {{q}_{3}} \\  {{q}_{4}} \\  \end{matrix}\left[ \begin{matrix}  1 & 0 & 1 & 0 \\  0 & 1 & 1 & 0 \\  0 & 1 & 0 & 1 \\  0 & 0 & 1 & 1 \\  \end{matrix} \right]\end{array}

Now we find out the Attribute Affinity Matrix
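
The values below follow the usual attribute-affinity formula (notation assumed from the standard treatment: use(qk, Ai) is 1 when query qk uses attribute Ai, and accl(qk) is the access frequency of qk at site Sl):

\displaystyle aff({{A}_{i}},{{A}_{j}})=\sum\limits_{k|use({{q}_{k}},{{A}_{i}})=1\wedge use({{q}_{k}},{{A}_{j}})=1}{\,\sum\limits_{l}{ac{{c}_{l}}({{q}_{k}})}}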

Aff(A1,A1) = 15 + 20 + 10 = 45
Aff(A1,A2) = 0

Aff(A1,A3) = 15 + 20 + 10 = 45
Aff(A1,A4) = 0

Aff(A2,A2) = 5 + 25 + 25 + 25 = 80

Aff(A2,A3) = 5

Aff(A2,A4) = 25 + 25 + 25 = 75

Aff(A3,A3) = 15 + 20 + 10 + 5 + 3 = 53

Aff(A3,A4) = 3

Aff(A4,A4) = 25 + 25 + 25 + 3 = 78

So, the Attribute Affinity Matrix will be

\displaystyle \begin{array}{l}\,\,\,\,\,\,\,{{A}_{1}}\,\,\,\,\,\,{{A}_{2}}\,\,\,\,\,{{A}_{3}}\,\,\,\,{{A}_{4}}\\\begin{matrix}  {{A}_{1}} \\  {{A}_{2}} \\  {{A}_{3}} \\  {{A}_{4}} \\  \end{matrix}\left[ \begin{matrix}  45 & 0 & 45 & 0 \\  0 & 80 & 5 & 75 \\  45 & 5 & 53 & 3 \\  0 & 75 & 3 & 78 \\  \end{matrix} \right]\end{array}

Global Affinity Measures (AM):

The affinity measure is a single value that is calculated on the basis of the positions of elements in the attribute affinity matrix and their surrounding elements.

\displaystyle AM=\sum\limits_{j=1}^{n}{\left[ bond({{A}_{j}},{{A}_{j-1}})+bond({{A}_{j}},{{A}_{j+1}}) \right]}

LECTURE NO.18

Example:

Let us consider the AA matrix given below and study the contribution of placing attribute A4 between A1 and A2.

Solution:

Given Attribute Affinity is

\displaystyle \begin{array}{l}\,\,\,\,\,\,\,{{A}_{1}}\,\,\,\,\,\,{{A}_{2}}\,\,\,\,\,{{A}_{3}}\,\,\,\,{{A}_{4}}\\\begin{matrix}  {{A}_{1}} \\  {{A}_{2}} \\  {{A}_{3}} \\  {{A}_{4}} \\  \end{matrix}\left[ \begin{matrix}  45 & 0 & 45 & 0 \\  0 & 80 & 5 & 75 \\  45 & 5 & 53 & 3 \\  0 & 75 & 3 & 78 \\  \end{matrix} \right]\\Cont({{A}_{i}},{{A}_{k}},{{A}_{j}})=2\left\{ bond({{A}_{i}},{{A}_{k}})+bond({{A}_{k}},{{A}_{j}})-bond({{A}_{i}},{{A}_{j}}) \right\}\\Cont({{A}_{1}},{{A}_{4}},{{A}_{2}})=2\left\{ bond({{A}_{1}},{{A}_{4}})+bond({{A}_{4}},{{A}_{2}})-bond({{A}_{1}},{{A}_{2}}) \right\}\end{array}
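
The bond values computed next use the standard bond definition (assumed here; it is consistent with the arithmetic that follows):

\displaystyle bond({{A}_{x}},{{A}_{y}})=\sum\limits_{z=1}^{n}{aff({{A}_{z}},{{A}_{x}})\,aff({{A}_{z}},{{A}_{y}})}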

Bond(A1,A4)=45*0 + 0*75 + 45*3 + 0* 78=135

Bond(A4,A2)=0*0 + 75*80 + 3*5 + 78* 75=6000+15+5850=11865

Bond(A1,A2)=45*0 + 0*80 + 45*5 + 0* 75=225

So,

Cont(A1,A4,A2) = 2(135 + 11865 − 225) = 2(11775) = 23550

Bond Energy Algorithm:

\displaystyle \begin{array}{l}Cont({{A}_{i}},{{A}_{k}},{{A}_{j}})=A{{M}_{ne{{w}_{{}}}}}-A{{M}_{ol{{d}_{{}}}}}\\=2\left\{ bond({{A}_{i}},{{A}_{k}})+bond({{A}_{k}},{{A}_{j}})-bond({{A}_{i}},{{A}_{j}}) \right\}\end{array}

Steps in Bond Energy Algorithm:

Input: The AA Matrix

Output: Clustered Affinity matrix which is a permutation of AA
Initialization: Place and fix one of the columns of AA in CA

Iteration:

Row Order: order of rows according to the column ordering.

LECTURE NO.19

Clustered Affinity Matrix:

First we copy the first two columns of the Attribute Affinity Matrix to the Clustered Affinity Matrix and start from the 3rd column, i.e. attribute A3.

First we compute the contribution of ordering (0-3-1), i.e. placing A3 before A1: it comes to 8820

Ordering (1-3-2) = 10150

Ordering (2-3-4) = 1780

Here we choose ordering (1-3-2) because it gives the highest value.

Now we find the orderings (0-4-1), (1-4-3), (3-4-2) and (2-4-5) in the same way; A4 is placed last, after A2.

So, the final Clustered Affinity Matrix will be

\displaystyle \begin{array}{l}\,\,\,\,\,\,\,{{A}_{1}}\,\,\,\,\,\,{{A}_{3}}\,\,\,\,\,{{A}_{2}}\,\,\,\,{{A}_{4}}\\\begin{matrix}  {{A}_{1}} \\  {{A}_{3}} \\  {{A}_{2}} \\  {{A}_{4}} \\  \end{matrix}\left[ \begin{matrix}  45 & 45 & 0 & 0 \\  45 & 53 & 5 & 3 \\  0 & 5 & 80 & 75 \\  0 & 3 & 75 & 78 \\  \end{matrix} \right]\end{array}

Question:

Find the point along the diagonal that maximizes z.

The Clustered Affinity Matrix, Attribute Usage Matrix and Access Frequency Matrix are given below: –

\displaystyle \begin{array}{l}\,\,\,\,\,\,\,{{A}_{1}}\,\,\,\,\,\,{{A}_{3}}\,\,\,\,\,{{A}_{2}}\,\,\,\,{{A}_{4}}\\\begin{matrix}  {{A}_{1}} \\  {{A}_{3}} \\  {{A}_{2}} \\  {{A}_{4}} \\  \end{matrix}\left[ \begin{matrix}  45 & 45 & 0 & 0 \\  45 & 53 & 5 & 3 \\  0 & 5 & 80 & 75 \\  0 & 3 & 75 & 78 \\  \end{matrix} \right]\end{array}

\displaystyle \begin{array}{l}\,\,\,\,\,\,\,{{A}_{1}}\,\,\,\,{{A}_{3}}\,\,{{A}_{2}}\,\,{{A}_{4}}\\\begin{matrix}  {{q}_{1}} \\  {{q}_{2}} \\  {{q}_{3}} \\  {{q}_{4}} \\  \end{matrix}\left[ \begin{matrix}  1 & 1 & 0 & 0 \\  0 & 1 & 1 & 0 \\  0 & 0 & 1 & 1 \\  0 & 1 & 0 & 1 \\  \end{matrix} \right]\end{array}

\displaystyle \begin{array}{l}\begin{matrix}  {} & {{S}_{1}} & {{S}_{2}} & {{S}_{3}} \\  \end{matrix}\\\begin{matrix}  {{q}_{1}} \\  {{q}_{2}} \\  {{q}_{3}} \\  {{q}_{4}} \\  \end{matrix}\left[ \begin{matrix}  15 & 20 & 10 \\  5 & 0 & 0 \\  25 & 25 & 25 \\  3 & 0 & 0 \\  \end{matrix} \right]\end{array}

Solution:
we know that

z = CTQ * CBQ − COQ²

CTQ = q1 = 15 + 20 + 10 = 45

CBQ = q3 = 25 + 25 + 25 = 75

COQ = (q2, q4) = 5 + 3 = 8

Therefore,

z = 45 * 75 − 8² = 3375 − 64 = 3311

LECTURE NO.21

Replication:

Storing a separate copy of the database at each of two or more sites.

Advantages:

  • Fast access of data
  • Reliability
  • Helpful in case of network failure

Disadvantages:

  • Duplication of data may create ambiguity.
  • Extra storage is required.
  • Difficulty in maintenance of database.

Software components used in Replication (Types of SQL Server Replication Agents):

  1. Snapshot
  2. Merge
  3. Distribution
  4. Log reader

Replication Model:

See from book.

Replication Method:

  1. Snapshot Replication
  2. Transaction Replication
  3. Merge Replication