Date: 2023-12-04
COMP 2501 Introduction to Data Science and Engineering [Section 1A, 2023]
Assignment 2
Deadline: 4 December 2023 at 23:59.
Problem 1. Based on past experience, the lifetime of a certain type of electrical appliance
follows an exponential distribution [1] with a mean of 100 hours. We randomly select 16
such appliances and assume that their lifetimes are independent of each other.
(a) Prove that Var(X + Y) = Var(X) + Var(Y) when X and Y are independent and
identically distributed.¹
(b) Find the probability that the sum of the lifetimes of these 16 electrical appliances is
greater than 1920 hours.
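The quantity asked for in part (b) can be sanity-checked numerically. The following is a minimal sketch in Python using only the standard library. It assumes the intended route is the central limit theorem, with part (a) giving the variance of the sum, and it also evaluates the exact tail of the sum, which follows a gamma (Erlang) distribution. The function names clt_tail and exact_tail are illustrative and not part of the assignment.

```python
import math
from statistics import NormalDist

MEAN_LIFE = 100.0   # mean lifetime in hours (given in Problem 1)
N = 16              # number of appliances sampled
THRESHOLD = 1920.0  # total lifetime threshold in hours


def clt_tail(n, mean, threshold):
    """Approximate P(S > threshold) for S = sum of n i.i.d. exponentials.

    For an exponential distribution the standard deviation equals the mean,
    so E[S] = n * mean and, by additivity of variance, sd(S) = sqrt(n) * mean.
    """
    mu_sum = n * mean
    sd_sum = math.sqrt(n) * mean
    z = (threshold - mu_sum) / sd_sum
    return 1.0 - NormalDist().cdf(z)


def exact_tail(n, mean, threshold):
    """Exact P(S > threshold) using the Erlang/Poisson identity.

    P(S > t) = sum_{k=0}^{n-1} exp(-t/mean) * (t/mean)^k / k!
    """
    x = threshold / mean
    return sum(math.exp(-x) * x**k / math.factorial(k) for k in range(n))


p_clt = clt_tail(N, MEAN_LIFE, THRESHOLD)
p_exact = exact_tail(N, MEAN_LIFE, THRESHOLD)
print(f"CLT approximation: {p_clt:.4f}")
print(f"Exact (Erlang):    {p_exact:.4f}")
```

The two values differ slightly because 16 terms is a modest sample size for the normal approximation; which one is expected depends on the method the course intends.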
Problem 2. The lengths X of a batch of parts are known to follow a normal distribution
N(µ, 1) [2]. A random sample of 16 parts is taken from the batch and the mean value of the
lengths is obtained as 40. What is the confidence interval for µ with a confidence level of
0.95?
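The interval in Problem 2 can be checked with a short computation. This is a minimal sketch in Python, assuming the standard two-sided z-interval for a normal mean with known standard deviation 1 (for N(µ, 1) the variance and the standard deviation coincide). The helper name z_interval is illustrative.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, level):
    """Two-sided confidence interval for a normal mean with known sigma."""
    # Quantile z such that P(-z < Z < z) = level for a standard normal Z.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Values taken from Problem 2: sample mean 40, sigma 1, n = 16, level 0.95.
low, high = z_interval(40.0, 1.0, 16, 0.95)
print(f"({low:.2f}, {high:.2f})")
```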
Problem 3. The CS department requires us to obtain department information for further
analysis.
(a) Submit an R program that finds all professors (academic staff only) and their
corresponding personal pages at the link
https://www.cs.hku.hk/people/academic-staff.
Present your results in the format “url:name”, one per line.
(b) Submit an R program that finds all PhD students at the link
https://www.cs.hku.hk/people/research-student
(Note that an MPhil student is not a PhD student).
Present your results in the format “name”, one per line.
Problem 4. A PhD candidate, Wu, wants to investigate the accepted papers published in the
ICSE 2023 Technical Track. Help him obtain the title of each accepted paper and its
LAST author using ONLY one regular expression to parse the raw HTML [3].
Present your results in the format “paper title:last author”.
(The corresponding link: https://conf.researchr.org/track/icse-2023/icse-2023-technical-track)
(Hint: check the curl package and search for “regex group”.)
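The “regex group” idea in the hint can be illustrated on a toy input. The sketch below is in Python rather than R, purely to show how a single pattern with two capture groups can return a title and the last name in a list. The HTML fragment, its tag and class names, and the paper and author names are all made up for illustration; they are not the markup of the conference page, so the pattern would have to be adapted to the real raw HTML.

```python
import re

# Hypothetical HTML fragment (NOT the real conference page markup).
html = (
    '<tr><td><a class="title">Paper A</a>'
    '<div class="authors"><a>Alice</a>, <a>Bob</a></div></td></tr>'
    '<tr><td><a class="title">Paper B</a>'
    '<div class="authors"><a>Carol</a></div></td></tr>'
)

# One regular expression with two capture groups:
#   group 1 captures the title;
#   the non-capturing group (?:...)* skips every author followed by ", ";
#   group 2 captures the final author before the closing </div>.
pattern = re.compile(
    r'<a class="title">([^<]+)</a>'
    r'<div class="authors">(?:<a>[^<]+</a>, )*<a>([^<]+)</a></div>'
)

# findall returns one (title, last_author) tuple per match.
results = [f"{title}:{last}" for title, last in pattern.findall(html)]
for line in results:
    print(line)
```

The same structure (one capture group for the title, a skipped repetition for the leading authors, one capture group for the final author) carries over to R's regular expression functions, which also support capture groups.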
References
[1] “Exponential distribution,” https://en.wikipedia.org/wiki/Exponential_distribution,
2023.
[2] “Normal distribution,” https://en.wikipedia.org/wiki/Normal_distribution, 2023.
[3] “Html representation,” https://en.wikipedia.org/wiki/HTML, 2023.
¹ Note that Var refers to variance: https://en.wikipedia.org/wiki/Variance