SEEM3550A-无代写-Assignment3
时间:2024-04-18
The Chinese University of Hong Kong Department of Systems Engineering and Engineering Management SEEM3550A Fundamentals of Information Systems 2023-24 Assignment 3: Query Processing
This is an individual assignment. Everyone must submit it individually. The due date for the assignment 3 is 17:00, Apr 25th, 2024. The late penalty will be 10% per day. A submission will not be accepted after the solution release. Q1. [20 marks] Answer the following questions related to query cost. (a) [5 marks] “A relational algebra expression specifies a unique evaluation algorithm”. Is this statement true? (b) [5 marks] What are the differences between the primary index and the secondary index? (c) [5 marks] “A secondary index must be dense for the index to be useful”. Is this statement true? (d) [5 marks] When performing a merge join operation, what is the minimum number of memory buffers required? Explain briefly.
Q2. [59 marks] You are given the following three relations in a university database as below (with primary keys underlined): INSTRUCTOR (Id, Name, DeptName, Salary) COURSE (Cid, Name) TEACHES (Id, Cid, Semester, Year) The relation INSTRUCTOR has 500 tuples, and 20 tuples fit into one block; the relation COURSE has 200 tuples, and 20 tuples fit into one block; the relation TEACHES has 500 tuples, and 50 tuples fit into one block. All records are sorted in ascending order of the primary keys of the table on disk (TEACHES is first sorted on Id, and then on Cid if two records have the same Id). Assume that the INSTRUCTOR Id ranges from 1 to 500, COURSE Cid from 1 to 200. It is assumed that a relation is stored on disk consecutively. Please answer the following questions and show your calculations clearly. (a) [8 marks] Answer the following questions and elaborate on your answers. i. [4 marks] Assume that a relation is stored on disk consecutively, how many disk seeks do we need if we read the entire relation, and read the relation from a certain block, respectively? ii. [4 marks] Assume that relation INSTRUCTOR has a primary index A and a secondary index B. On what possible search keys are A and B built, respectively? (b) [30 marks] Answer the following questions and show your calculations clearly. i. [10 marks] Consider the following SQL query: select * from COURSE where Name=’Data Structure’ or Cid=50 Assume that relation COURSE has a B+-tree index of height 5 on the Cid attribute. Further assume that the Cid of COURSE are integers between 1 to 200. What are the numbers of disk seeks and block transfers associated with the SQL query if we use the B+-tree index? What are the numbers if we do not use the B+-tree index? Should we use the B+-tree index? Explain briefly. ii. [10 marks] Consider the following SQL query: select * from INSTRUCTOR where Name = ‘Jack’ If we perform linear search, what are the numbers of disk seeks and block transfers associated with the SQL query? iii. [10 marks] Consider the following SQL query. select * from INSTRUCTOR where Id>100 Assume that the Id of relation INSTRUCTOR are integers between 1 and 500. Further assume that INSTRUCTOR has a B+-tree index of height 10 on the Id attribute. If we use the B+-tree for the query, what are the numbers of disk seeks and block transfers associated with the SQL query? (c) [8 marks] Assume that relation INSTRCUTOR has a B+-tree index of height 3 on the
DeptName attribute, and a B+-tree index of height 4 on the Name attribute. Further assume that there are 30 instructors from the Computer Science Department. i. [4 marks] Is the B+-tree on Name a primary or secondary index? Explain briefly. ii. [4 marks] Consider the following SQL query: select * from INSTRUCTOR where DeptName = ‘Computer Science’ If we use the B+-tree on the DeptName attribute for the query, what are the numbers of disk seeks and block transfers associated with the SQL query? The number of accesses to the blocks that contain pointers to data records in the data file is ignored: (d) [13 marks] Consider the following SQL query performing the join operation: select * from INSTRUCTOR, TEACHES where INSTRUCTOR.Id = TEACHES.Id i. [9 marks] Assume that 12 blocks are allocated to the memory buffer. The query is implemented by nested-loop join. Which of the relations should be the inner relation that leads to a smaller number of seeks and block transfers? Explain your answer by comparing the two cases with calculating the number of seeks and block transfers, when different relations are chosen as the inner relation. ii. [4 marks] Assume that INSTRUCTOR has a B+-tree index of height 3 on the Id attribute. The query is implemented by indexed nested-loop join with TEACHES as the outer relation. What are the numbers of disk seeks and block transfers?
Q3. [21 marks] Consider the following two relations R1 and R2: a b b c 2 P 3 Q 3 R R2 R1 We would like to merge-join R1 and R2 on attribute b. However, R1 is unsorted on b, so we have to sort R1 before joining the relations. The records R2 are stored on attribute b in ascending order sequentially from top to bottom. Assume 1 record fits into 1 block of memory. Please show how an external sort-merge on R1 on the attribute b is conducted, by illustrating the following steps: (a) [6 marks] Create sorted runs, assuming 4 blocks are allocated to the memory buffer.
• Draw figures as in p.35 of Ch. 12 lecture slides
• There should be 2 runs. Label them sequentially as r1, r2. (b) [12 marks] Merge the sorted runs created above via 2-way merge (p.36 of Ch. 12 lecture slides). The memory buffer has 3 blocks denoted as M1, M2 and O, respectively. Memory buffer (blocks):
A 1 B 8 C 2 D 3 E 9 F 7 G 10 H 6
M1 M2 O
• Show this process by stating the read/write operations in the form of “[Read/Write] [X] to [Y]” For instance, “Read r1 to M1” stands for reading a block of r1 from disk to the memory block M1, “Write M1 to O” stands for writing the record in M1 to the output buffer, and “Write O to Disk” stands for writing the record in O to disk. The first seven operations are shown as a reference: Read r1 to M1 Read r2 to M2 Write M1 to O Write O to Disk Read r1 to M1 Write M1 to O Write O to Disk (c) [3 marks] Show the resulting joined relation.
essay、essay代写