Date: 2021-06-19
Dr. Ying Zhou
School of Computer Science
COMP5349 – Cloud Computing
Week 13: Course Review and Exam Info
COMP5349 Schedule in 2021
COMP5349 "Cloud Computing" - 2021 (Y. Zhou) 13-2
Week  Topic
1     Cloud Computing Overview and Service Models
2     Virtualization Technology
3     Container Technology
4     Map/Reduce Framework
5     Spark Framework
6     Distributed Execution: GFS
7     Distributed Execution: YARN
8     Spark DataFrame
9     Spark Machine Learning Library
10    Cloud Storage and Databases Services
11    Paxos Consensus Algorithm
12    Distributed System Management and Kubernetes
13    Course Review
The Big Picture
- Cloud Computing
  - Shared IT services for clients to rent
  - Offered at different levels (IaaS, PaaS, SaaS, FaaS, ...)
  - Made possible by web and data center technology
- Enabling Technologies
  - Virtualization
    - Used by all IaaS providers
  - Container
    - Allows individual applications/services to be deployed on a VM
    - Container orchestration software such as Kubernetes makes it easy to deploy and manage distributed systems built as container-based services
- Key features
  - Illusion of a whole system for every client
  - Performance isolation
  - Security and others
Analytics and BigData Services
- Basic Computational Model
  - Storage: distributed file systems (GFS, HDFS)
  - Programming paradigm: MapReduce
  - Hadoop MapReduce as a specific (open source) example
  - Map and Reduce phases
    - Each phase allows multiple tasks to run in parallel
    - Synchronization and shuffling happen between the map and reduce phases
    - The map output key is used to regroup intermediate results for the reduce phase
  - An analytic workload may need several map/reduce phases
  - Simple, localized fault-tolerance mechanism that depends on storage and I/O
- More advanced computation model
  - Spark
    - RDD-based API
    - DataFrame-based API
  - Main-memory based, in contrast to the disk/batch-based approach of MapReduce
- All based on the functional programming paradigm
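The map, shuffle and reduce steps summarized on this slide can be illustrated with a small single-process sketch. It is plain Python rather than Hadoop or Spark code, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this illustration. It only mimics the data flow; a real framework would run the map and reduce tasks in parallel on different machines.

```python
from collections import defaultdict
from functools import reduce
from itertools import chain


def map_phase(line):
    """Map task: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group intermediate values by the map output key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce task: fold all values that share one key into a single result."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    # Each line could be processed by an independent map task.
    intermediate = chain.from_iterable(map(map_phase, lines))
    # Synchronization point: reducing starts only after all map output is grouped.
    grouped = shuffle(intermediate)
    # Each key could be processed by an independent reduce task.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["the cloud is the computer", "the cloud scales"])
```

Both phases are expressed as pure functions passed to higher-order operations (`map`, `reduce`), which is the functional style the slide refers to.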
Cloud Storage and Database Services
- Cloud Storage Services
  - The cloud version of a file system: GFS/HDFS, S3, EBS, etc.
- Cloud Database Services
  - The cloud version of a database: Bigtable, WAS, Dynamo, AWS Aurora
- Common features
  - Replication
  - Partitioning
  - Fault tolerance
  - Various consistency levels
  - Various ways of handling reads and writes of the data
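As a rough illustration of how partitioning and replication combine to give fault tolerance, the sketch below hashes a key to a partition and places that partition on several nodes. This is a generic toy scheme, not the placement algorithm of any system named above; the node names, the constants and the helper functions are all assumptions made for the example.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
NUM_PARTITIONS = 8          # assumed value for the example
REPLICATION_FACTOR = 3      # assumed value; must not exceed len(NODES)


def partition_of(key, num_partitions):
    """Map a key to a partition with a stable hash (Python's hash() is salted per run)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def replicas_for(partition, nodes, replication_factor):
    """Place a partition on consecutive nodes, wrapping around the node list."""
    return [nodes[(partition + i) % len(nodes)] for i in range(replication_factor)]


def locate(key):
    """Return the partition of a key and the nodes holding its replicas."""
    partition = partition_of(key, NUM_PARTITIONS)
    return partition, replicas_for(partition, NODES, REPLICATION_FACTOR)


def surviving_replicas(key, failed_nodes):
    """Replicas still reachable for a key when some nodes have failed."""
    _, replicas = locate(key)
    return [node for node in replicas if node not in failed_nodes]


partition, replicas = locate("user:42")
```

With three replicas per partition, a key stays readable after any single node failure; which replica serves a read, and how many must acknowledge a write, is where the different consistency levels come in.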
Cloud Storage/DB Services Consistency
- Many systems use customized algorithms for handling read/write requests
- Classic distributed system consensus algorithm: Paxos
  - Tolerates message loss but not message corruption
  - Two-phase design to satisfy safety and liveness requirements
  - The first phase only asks participants to make a promise with respect to a proposal number; the actual value is proposed in the second phase
  - A leader is necessary to maintain the progress of the algorithm
- Paxos can be used in a replicated environment to reach consensus
  - Run multiple Paxos instances; each is numbered, and the value to be chosen represents an update command
    - Paxos instance number, proposal sequence number, value
  - Efficient mechanism for running an unbounded number of Paxos instances
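The two-phase structure described above can be sketched for a single Paxos instance as follows. This is a simplified, in-process illustration under several assumptions: messages are ordinary method calls that are never lost, there is no leader election, and the names `Acceptor` and `propose` are hypothetical. It is meant to show the promise and accept rules, not to serve as a production implementation.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest proposal number promised so far
        self.accepted_n = -1        # number of the last accepted proposal
        self.accepted_value = None  # value of the last accepted proposal

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run both phases; return the chosen value, or None without a majority."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: only the proposal number is sent, no value yet.
    replies = [a.prepare(n) for a in acceptors]
    promises = [(acc_n, acc_v) for ok, acc_n, acc_v in replies if ok]
    if len(promises) < majority:
        return None

    # If some acceptor already accepted a value, adopt the highest-numbered one.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    if prev_n >= 0:
        value = prev_value

    # Phase 2: the value is proposed.
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "set x=1")   # chosen: "set x=1"
second = propose(acceptors, 2, "set x=2")  # must adopt the already chosen value
stale = propose(acceptors, 1, "set x=3")   # rejected: number is below the promise
```

In a replicated log, one such instance would be run per log position, so each message would carry the triple listed above: the Paxos instance number, the proposal sequence number and the value.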
Final Exam Format
- Format: take-home exam (3 hours)
  - Hosted on a separate Canvas site: Final exam for: COMP5349
  - All instructions will be included on the exam site
  - You will be added to the site no later than a week before the exam
- Two files you should download
  - Exam script as a PDF file
  - Answer template as a Word file (as a guideline only)
    - OK to use your own Word document
    - OK to use LaTeX or another typesetting tool
    - Include your name and SID
    - Label each answer with its question number/part number
- Upload your answers as a single PDF file
Final Exam Questions
- Conceptual short-answer questions
  - E.g. describe/compare some concept(s) or technology(ies), or describe the advantages/disadvantages of a certain technology
- Scenario-based short-answer questions
  - Questions based on a given scenario or code snippet
- Open-ended short-answer questions
  - Come up with a scenario that satisfies the question description, e.g. show a message sequence that ends with an acceptor accepting two proposals in Paxos
- Integrated or long-answer question
  - Needs to combine content from different lectures
- The exam is worth 100 points in total
- The exam has a 40% barrier
  - You need to get at least 40 of 100 points in the final exam to pass this subject
Final Exam Content
- Assessable:
  - Lecture content
    - All weeks
  - Tutorial material
    - All except week 1
  - Assignment
- Spark programming may be assessed in various ways
  - Questions based on a short program
  - Designing a workload by writing code/pseudocode
Thank You!