BIA B450F
Unit 4
Data warehouse planning and management
Dr Franklin Lam
Traditional system development life cycle (SDLC)
• All systems have a life cycle, or a series of stages they naturally undergo.
– The number and names of the stages vary, but the primary stages are conception, development, maturity and decline.
– The systems development life cycle (SDLC) therefore refers to the development stage of the system’s life cycle.
SDLC
• Plan the project
• Gather business and technical requirements
• Determine how the system will be designed and built
• Design the solution and write the code
• Execute various tests to ensure that everything works as planned
• Make the solution available to users
• Address issues if needed
The data warehouse life cycle
Kimball Lifecycle diagram
Program/project planning and management
• DW/BI initiative begins with a series of program and project planning
activities
– Assessing readiness
– Scoping and justification
– Staffing
– Developing and maintaining the plan
The data warehouse project challenge
• Data warehouses are highly integrated with many other business systems. This
creates more dependencies to account for than in more modular or discrete projects.
• The integrated nature of a data warehouse means that communication across
the organization — with business users, with the heads of various business
units, with senior management and many others — is critical to success.
• Data warehouse development really does not have a beginning and an end.
Warehouses are always ‘works in progress’. A single development iteration (or a
single run through the lifecycle map) might take several years.
• Data warehouses are expensive developments that cross departmental
boundaries. If organizational politics are to intervene in any project, they will
make themselves felt in a data warehouse development.
Assessing readiness
• Before moving ahead with a DW/BI effort, it is prudent to take a moment to
assess the organization’s readiness to proceed. Factors to be considered in
assessing the readiness of organizations include:
1. A strong executive business sponsor (*most critical) – Business sponsors should have a
clear vision for the DW/BI system’s potential impact on the organization. Optimally,
business sponsors have a track record of success with other internal initiatives. They
should be politically astute leaders who can convince their peers to support the effort.
2. A strong, compelling business motivation for tackling the DW/BI initiative – a compelling
motivation typically creates a sense of urgency that enables tight cooperation within
the organization.
3. Feasibility – there are several aspects of feasibility, including technical and resource
feasibility, but data feasibility is the most critical.
Data warehouse readiness litmus test
Factor: low readiness | high readiness
Strong business management sponsor
Not well respected | Considerable organizational clout
Can take weeks for team to gain access | Readily available to team
‘I’ll get back to you on that’ | Quick, decisive resolution to issues
Hope ‘you’ get it done | Active, vocal and visible supporter, willing to put own neck on the line
‘You can deliver this to 250 users next month, right?’ | Realistic expectations
‘A data whatta?’ | Data warehouse savvy
Compelling business motivation
‘And your point is?’ | Survival dependent on data warehouse
Funding is a big problem | Cost is not an issue: we can’t afford not to do this!
‘Shifting sands’ vision | Clearly articulated vision
Ten different views of the solution | Consistent view of the solution
Tactical issue | Strategic issue
Cost savings opportunity | Incremental revenue opportunity
Unable to quantify the payback | Huge payback
IS/Business partnership
Business engages outside consultant without IS knowledge | Business and IS work hand-in-hand
Business unit creates own pseudo IS team to build | IS actively engaged with business unit
‘We can’t trust any numbers from our systems’ | Strong confidence in existing reporting environment
It takes ‘years’ to get a new ad hoc request turned around | Quick IS response to ad hoc requests
Users don’t even submit requests anymore | Short existing user request backlog
Current analytic culture
‘Gut feel’ decision making | Decision making relies on facts and figures
Users don’t ask for data | Business users clamour for access to data: ‘Just get me the data and I’ll figure it out’
Users don’t look at current reports | Current reports are consistently re-keyed into spreadsheets for analysis and historical trending
Current reports used as doorstops until the recycling bin comes by | Current reports are dog-eared, highlighted and filled with yellow self-adhesive notes
Users have secretaries log on and print off email to read it | Users are very computer literate
Finance is extremely possessive of bottom line performance figures | Information shared openly throughout the organization
Feasibility
Data warehouse would require purchase of all new technology | Robust technical infrastructure in place
Everyone and their uncle is committed to existing projects | Experienced resources available
Reliable data won’t be available until after the enterprise resource planning (ERP) implementation | Quality data available
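The litmus test above can be turned into a rough checklist score. The sketch below is illustrative only: the five factor names come from the table, but the 1-to-5 rating scale, the weights (sponsorship is weighted highest because the slides flag it as the most critical factor) and the function name `readiness_score` are assumptions, not part of the source.

```python
# Hypothetical weights; only the factor names come from the litmus test.
# Sponsorship is weighted highest because it is flagged as most critical.
WEIGHTS = {
    "business_sponsor": 0.30,
    "business_motivation": 0.20,
    "is_business_partnership": 0.15,
    "analytic_culture": 0.15,
    "feasibility": 0.20,
}


def readiness_score(ratings: dict[str, int]) -> float:
    """Return a weighted readiness score between 1 (low) and 5 (high)."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for factor in WEIGHTS:
        if not 1 <= ratings[factor] <= 5:
            raise ValueError(f"{factor} must be rated 1-5, got {ratings[factor]}")
    return sum(WEIGHTS[f] * ratings[f] for f in WEIGHTS)


# Made-up ratings for an imaginary organization.
example = {
    "business_sponsor": 4,
    "business_motivation": 5,
    "is_business_partnership": 3,
    "analytic_culture": 2,
    "feasibility": 3,
}
print(f"Readiness: {readiness_score(example):.2f} out of 5")
```

A single number hides which factor is weak, so in practice the individual ratings are at least as informative as the total.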
Scoping and justification
• Project scoping is the task of defining what will be accomplished by the project. Without a clear sense
of a project’s scope, it is not possible to determine when the project is finished or to manage conflicting
demands from different segments of the user community.
• Defining the project scope is an opportunity to get together all the major stakeholders and thrash out
the primary objectives of the project – this process may engender a good deal of debate and
controversy.
• The project scope should be summarized in a document that everyone involved in the project has
access to and signs off on.
• Justification is the process of demonstrating, as best as possible, that the project will generate a return
on its investment (ROI). In other words, it shows that the project is worth doing.
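As a worked illustration of justification, a simple ROI figure is net benefit divided by cost. The sketch below uses that standard simple-ROI formula; the function name `simple_roi` and all monetary figures are made-up assumptions for illustration.

```python
def simple_roi(total_benefit: float, total_cost: float) -> float:
    """Simple ROI = (benefit - cost) / cost, expressed as a fraction."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


# Hypothetical three-year figures for a DW/BI project.
cost = 800_000       # hardware, software, staff, training
benefit = 1_200_000  # added revenue plus cost savings
print(f"ROI: {simple_roi(benefit, cost):.0%}")  # prints "ROI: 50%"
```

This ignores the time value of money; a fuller justification might discount the cash flows, which is outside the scope of this sketch.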
Data warehouse project team
Different project types call for different roles. The following list presents typical roles and their
responsibilities in data warehouse projects, based on Linstedt and Olschimke (2015).
Business sponsor
• create alignment between the project and business and cultural goals
• communicate on behalf of the project, especially towards senior management
• be the key advocate of the project and gain commitment among other key stakeholders
• arrange resources to ensure the success of the project
• facilitate problem solving by ensuring that issues are escalated and solved effectively at the organizational level
• support the project manager by offering mentoring, coaching and leadership; and build durability to make sure that project outputs are sustainable
Technical business analyst
• establish standards and access control lists
• prioritize change requests
• establish new requirements
• create new reports for the general business user audience
• help the team debug alpha releases
• participate in the development and design of information marts
• create user training material
Project Manager
• make sure that the project team completes the project
• develop the project plan, manage the team’s performance of project tasks and secure acceptance and approval of deliverables from the project sponsor and other stakeholders
• handle communication, such as status reporting, risk management and escalation of issues that cannot be solved within the project team
IT Manager
• ensure the business continuity and success of the business
• oversee projects and make sure that they use resources effectively
• advise the management team objectively on where IT might make a difference to the business
• agree on costs, timescales and standards to be met, and monitor them throughout the project
• help the organization to transition smoothly from legacy systems to new systems
• keep management updated on the progress of current projects
ETL Developer (Extract, Transform, Load)
• implement the data or control flows that load the data from source systems to staging, from staging to the EDW and from there to information marts
• create virtual marts, or implement soft business rules in ETL as requested by the business
Report Developer
• implement business-driven reports based on information marts, Business Vault tables, or directly on the Raw Data Vault (in rare circumstances)
Data Architect / Information Architect
• responsible for the information architecture and data integration
Metadata Manager
• responsible for the planning of metadata design
• facilitates a framework for metadata development
• coordinates activities and communication with other roles and projects
• administers access levels to metadata for all members and external staff who need to work with the metadata
Change Manager
• ensures that new functionality will not disrupt other IT or business services on roll-out
• responsible for making sure that roll-outs are possible in the environment and not hindered by other projects
The project plan
• A document that describes the tasks required at each step in the data warehouse lifecycle.
• For each task, responsible team members are identified and the time required to complete the task is estimated.
• To succeed:
– project planning should be done early and thoroughly;
– the business sponsor and business users should be part of the team, involved early and continuously throughout the project;
– all team members should be involved in user acceptance/project review steps, which are scheduled at the end of key stages in the project.
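A project plan of this kind is essentially a list of tasks, each with a responsible person and a time estimate. The following is a minimal sketch of that structure; the `Task` fields, the task names, owners and day estimates are all illustrative assumptions rather than figures from the course material.

```python
from dataclasses import dataclass


@dataclass
class Task:
    stage: str            # lifecycle stage the task belongs to
    name: str
    owner: str            # responsible team member or role
    estimate_days: float  # estimated effort


# Hypothetical plan entries; names, owners and estimates are made up.
plan = [
    Task("Project definition", "Assess DW/BI readiness", "Project manager", 5),
    Task("Project definition", "Develop preliminary scope", "Project manager", 8),
    Task("Requirements", "Interview business users", "Business analyst", 15),
    Task("Requirements", "User acceptance/project review", "Business sponsor", 2),
]


def days_by(tasks: list[Task], key: str) -> dict[str, float]:
    """Total estimated days grouped by a Task attribute ('stage' or 'owner')."""
    totals: dict[str, float] = {}
    for task in tasks:
        group = getattr(task, key)
        totals[group] = totals.get(group, 0) + task.estimate_days
    return totals


print(days_by(plan, "stage"))
print(days_by(plan, "owner"))
```

Grouping by owner gives a quick view of who is overloaded, while grouping by stage shows where the schedule is concentrated.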
Project task responsibility matrix (figure)
Role groups: Fans, Front Office, Coaches, Regular Line-Up, Special Teams
Roles: Business Users; Business Sponsor / Business Driver; DW/BI Director / Program Manager; Project Manager; Business Project Lead; Business Analyst; Data Steward / QA Analyst; Data Architect / Data Modeler / DBA; Metadata Manager; ETL Architect / ETL Developer; BI Architect / App Developer / Portal Developer; Technical Architect / Tech Support Specialist; Security Manager; Lead Tester; Data Mining / Stats Specialist; Educator
Legend: primary responsibility; involved; provides input; informed of results
PROJECT/PROGRAM LAUNCH AND MANAGEMENT
PROJECT DEFINITION
1 Assess DW/BI readiness
2 Develop preliminary project scope/charter
3 Build business justification
PROJECT PLANNING & MANAGEMENT
1 Establish project identity
2 Identify project resources
3 Prepare project plan
4 Develop project communication plan
5 Conduct project team kick-off & planning
6 Develop process to manage scope/control changes
7 Develop process to measure success
8 User acceptance/project review
9 Ongoing project management
PROGRAM PLANNING & MANAGEMENT
BUSINESS REQUIREMENTS DEFINITION
DW/BI TECHNICAL ARCHITECTURE
APPLICATION ARCHITECTURE DESIGN
1 Create architecture task force
2 Gather & document technical requirements
3 Review current technical environment
4 Develop architecture implications document
5 Create architecture model
6 Determine phased implementation approach
7 Define and specify subsystems
8 Create the architecture plan
9 Develop configuration recommendations
10 User acceptance/project review
PRODUCT SELECTION
MANAGE METADATA
IMPLEMENT TACTICAL SECURITY MEASURES
DEVELOP STRATEGIC SECURITY PLAN
CREATE INFRASTRUCTURE PLAN
PRODUCT INSTALLATION
IMPLEMENTATION
DIMENSIONAL DATA MODEL DESIGN
PHYSICAL DATABASE DESIGN
Determining requirements
• Collaborating with business users to
understand their requirements and
ensure their buy-in is absolutely
essential to successful data warehousing
and business intelligence.
• Approaches:
– Interviews – encourage individual
participation and are also easier to schedule.
– Facilitated sessions – reduce the elapsed
time to gather requirements but require
more time commitment from each
participant.
• The purpose of defining requirements
is to identify, formalize and validate all
requirements that serve as a basis for
activities such as:
– Project planning
– Scope setting and change control
– Gap analysis
– Prioritization and consensus of system
objectives and requirements
– Product evaluation (if purchasing systems)
– Integration with existing systems and
infrastructure
– Warehouse testing.
Requirements gathering process
Plan
• Choose forums
• Identify and prepare requirement team
• Select, schedule and prepare business requirements
Gather
• Interview
• Workshop
• Manage expectations
• Collect samples
Document
• Executive summary
• Requirements findings
• Requirements matrix
• Describe benefits
• Present back for validation
Prioritize
• Review findings
• Discuss and set priorities
Prioritization grid
• A prioritization grid can be used to prioritize the business processes identified in the findings by impact and feasibility.
• Each business process theme from the findings is placed on the grid based on the representatives’ composite agreement regarding impact and feasibility.
• Projects with high potential business impact and high feasibility warrant immediate attention.
Grid quadrants:
– High impact, high feasibility: warrant immediate attention from the DW/BI project team
– High impact, low feasibility: IT teams should address the feasibility limitations
– Low impact, high feasibility: do not justify short-term attention
– Low impact, low feasibility: should be avoided
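The grid logic can be sketched as a small classification function. This is only an illustration: the 0-to-10 scoring scale, the threshold of 5, the function name `quadrant` and the example themes are assumptions, and the short return labels paraphrase the four quadrant descriptions above.

```python
def quadrant(impact: float, feasibility: float, threshold: float = 5.0) -> str:
    """Place a business process theme on the grid (scores assumed 0-10)."""
    high_impact = impact >= threshold
    high_feasibility = feasibility >= threshold
    if high_impact and high_feasibility:
        return "immediate attention"
    if high_impact:
        return "address feasibility limitations"
    if high_feasibility:
        return "no short-term attention"
    return "avoid"


# Hypothetical themes with composite (impact, feasibility) scores.
themes = {
    "Sales orders": (9, 8),
    "Customer profitability": (9, 3),
    "Facilities usage": (2, 8),
    "Legacy archive": (2, 2),
}
for name, (impact, feasibility) in themes.items():
    print(f"{name}: {quadrant(impact, feasibility)}")
```

In a real prioritization session the scores come from group consensus rather than a formula, so the function only records where the group placed each theme.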
Technical architecture design
• The technical architecture is the blueprint for the DW/BI environment’s
technical services and infrastructure. Major tasks include:
– Establish an architecture task force – It is useful to create a small task force of two
or more people focused on architecture including the ETL developer, Data
architect etc.
– Collect and document architecture-related requirements – The architecture
design is driven by business requirements but also considers the current
standards and technology of DW.
– Create the architecture model – The architecture requirements are grouped into
major components, such as ETL, BI, metadata, and infrastructure. From there the
team drafts and refines the high-level architectural model.
– Determine architecture implementation phases – Establish architecture priorities
similar to that for business requirements.
– Design and specify the subsystems – Identify and define subsystems that may not
be found in off-the-shelf products.
– Create the architecture plan – The technical architecture plan document should
include adequate detail so that skilled professionals can proceed with construction of
the framework.
– Review and finalize the technical architecture.
Product selection and installation
• The following six tasks associated with DW/BI product selection are quite similar to
those of any technology selection.
– Understand the corporate purchasing process
– Develop a product evaluation matrix
• Define use cases and evaluation criteria for evaluating the infrastructure, data management, analysis and
content creation, and sharing of findings capabilities of products (Reading 4.1 – Gartner (2016) Magic
Quadrant for Business Intelligence and Analytics Platforms.)
• A spreadsheet-based evaluation matrix should be developed that identifies the evaluation criteria, along
with weighting factors to indicate importance.
– Conduct market research
– Evaluate a short list of options
– Conduct a prototype (if necessary)
– Select product, install on trial and negotiate
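The weighted evaluation matrix described in the second task can be sketched in code rather than a spreadsheet. The four criteria names follow the capability areas listed above; the weights, the 1-to-5 scores and the product names are invented for illustration, and `weighted_total` is a hypothetical helper.

```python
# Hypothetical criteria weights (importance) and 1-5 scores per product.
weights = {
    "infrastructure": 3,
    "data management": 4,
    "analysis and content creation": 5,
    "sharing of findings": 2,
}
scores = {
    "Product A": {"infrastructure": 4, "data management": 3,
                  "analysis and content creation": 5, "sharing of findings": 2},
    "Product B": {"infrastructure": 3, "data management": 5,
                  "analysis and content creation": 3, "sharing of findings": 4},
}


def weighted_total(product_scores: dict[str, int]) -> float:
    """Weighted average score for one product across all criteria."""
    total_weight = sum(weights.values())
    return sum(weights[c] * product_scores[c] for c in weights) / total_weight


# Rank products from highest to lowest weighted score.
ranking = sorted(scores, key=lambda p: weighted_total(scores[p]), reverse=True)
for product in ranking:
    print(f"{product}: {weighted_total(scores[product]):.2f}")
```

Agreeing on the weights before scoring any product helps keep the evaluation from being bent towards a favourite vendor.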
Physical design
• The dimensional models developed and documented via a preliminary source-to-
target mapping need to be translated into a physical database, including:
– Develop naming and database standards
– Develop physical database model
– Develop initial index plan
– Design aggregations, including OLAP database
– Finalize physical storage details: blocks, files, disks, partitions, and table spaces or databases
• The aggregation, indexing and other performance tuning strategies will evolve as
actual usage patterns are better understood, so be prepared for the inevitable
ongoing modifications.
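To make the first three steps concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The naming standard (`dim_`/`fact_` prefixes, `_key` surrogate keys, `ix_` index names), the tables and the columns are all assumptions chosen for illustration; aggregations and storage details are not covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed naming standard: dim_/fact_ prefixes, _key surrogate keys.
    CREATE TABLE dim_date (
        date_key   INTEGER PRIMARY KEY,
        full_date  TEXT NOT NULL,
        month_name TEXT NOT NULL
    );
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
        product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
        sales_amount REAL NOT NULL
    );
    -- Initial index plan: index each fact-table foreign key.
    CREATE INDEX ix_fact_sales_date ON fact_sales(date_key);
    CREATE INDEX ix_fact_sales_product ON fact_sales(product_key);
""")

# List the indexes that were created, as a check on the index plan.
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
).fetchall()
index_names = [r[0] for r in rows]
print(index_names)
```

A production warehouse would use its own DBMS and standards; the point is only that naming conventions and the index plan are written down and applied consistently.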
BI application specification
• Review the findings of business requirements definition and collected sample reports
to identify a starter set of BI reports and analytic applications.
• Before designing the initial application, it is helpful to establish standards, such as
common pull-down menus and consistent output look and feel.
• Use the standards to specify each application template and capture sufficient
information about the layout, input variables, calculations, and breaks, so both the
application developer and business representatives share a common understanding
of the applications.
• Identify structured navigational paths to access the applications, reflecting the way
users think about their business. For example, leverage customizable information
portals or dashboards for disseminating access.
Magic quadrant for BI and analytics platforms
(Gartner)
Reading 4.1 - Magic Quadrant for Analytics and Business Intelligence Platforms
(Source: https://b2bsalescafe.files.wordpress.com/2018/03/magic-quadrant-for-analytics-and-business-intelligence-platforms.pdf)
BI application development
• Focus on standards: naming conventions, calculations, libraries, and coding standards
should be established to minimize future rework.
• The application development activity can begin when the database design is
complete, the BI tools and metadata are installed, and a subset of historical data has
been loaded.
• Revisit the BI application template specifications to account for the inevitable
changes to the model since the specifications were completed.
• Provide appropriate BI tool-specific education or supplemental resources for the
development team.
• Test query response time and review the preliminary performance-tuning strategies
of the DW/BI system.
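Query response time can be measured with a small timing harness. The sketch below uses Python's built-in `sqlite3` and `time.perf_counter`; the table, the synthetic data, the query and the helper name `time_query` are assumptions for illustration, and actual timings will vary by machine.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (product_key INTEGER, sales_amount REAL)")
# Load synthetic rows: 100,000 sales spread over 100 products.
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(100_000)],
)


def time_query(sql: str, runs: int = 5) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best


sql = "SELECT product_key, SUM(sales_amount) FROM fact_sales GROUP BY product_key"
before = time_query(sql)
conn.execute("CREATE INDEX ix_fact_sales_product ON fact_sales(product_key)")
after = time_query(sql)
print(f"before index: {before:.4f}s, after index: {after:.4f}s")
```

Taking the best of several runs reduces noise from caching and other processes; whether the index helps depends on the query and the engine, which is exactly what such a test is meant to reveal.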
Deployment
• The technology, data, and BI application tracks converge at deployment.
• Deployment requires substantial preplanning but successful deployment demands
the courage and willpower to honestly assess the project’s preparedness to deploy.
• Perform end-to-end system testing, including data quality assurance, operations
processing, performance, and usability testing.
• Provide education and support to the user.
• Support can be organized into a tiered structure. The first tier is website and self-
service support; the second tier is provided by the power users residing in the
business area; centralized support from the DW/BI team provides the final line of
defense.
Maintenance and growth
• Continue to manage the existing environment by investing resources in
the following areas:
– Support
• User support is crucial immediately
following deployment to ensure the
business community gets hooked.
• If the DW/BI deliverable is not of high
quality, the unanticipated support
demands for data reconciliation and
application rework can be overwhelming.
– Education
• Continuing education program for the
DW/BI system must be provided to users,
developers and power users.
– Technical support
• Technical support should proactively
monitor performance and system capacity
trends.
– Program support
• Communication with the varied DW/BI
constituencies must continue to ensure that
existing implementations continue to
address the needs of the business.
• Ongoing checkpoint reviews are a key tool
to assess and identify opportunities for
improvement.
Common pitfalls of data warehouse projects
• Ten common pitfalls (Kimball and Ross, 2013)
1. Become overly enamored with technology and data rather than focusing on the
business’s requirements and goals.
2. Fail to embrace or recruit an influential, accessible, and reasonable senior
management visionary as the business sponsor of the DW/BI effort.
3. Tackle a galactic multiyear project rather than pursuing more manageable,
although still compelling, iterative development efforts.
4. Allocate energy to construct a normalized data structure, yet run out of budget
before building a viable presentation area based on dimensional models.
5. Pay more attention to back room operational performance and ease-of-
development than to front room query performance and ease of use.
6. Make the supposedly queryable data in the presentation area overly complex.
7. Populate dimensional models on a standalone basis without regard to a data
architecture that ties them together using shared, conformed dimensions.
8. Load only summarized data into the presentation area’s dimensional structures.
9. Presume the business, its requirements and analytics, and the underlying data
and the supporting technology are static.
10. Neglect to acknowledge that DW/BI success is tied directly to business
acceptance.
Critical success factors (CSF) of data warehouse and BI systems
Yeoh and Koronios (2010) (Reading 4.2) and Yeoh and Popovic (2016)
CSFs by dimension:
Organization
• Committed management support and sponsorship
• A clear vision and well-established business case
Process
• Business-centric championship and balanced team composition
• Business-driven and iterative development approach
• User-oriented change management
Technology
• Business-driven, scalable and flexible technical framework
• Sustainable data quality and integrity