IS6126/IS6146/IS6156
Databases for Management Information Systems
Group Assignment
Assignment Type: Group Assignment
Percentage of Module Mark: 30% (i.e., 70% Group and 30% Individual)
Due Date: on or before 26th April 2021, 4pm
Upload to Canvas
Prepare a report analysing data in an organisation you are familiar with, using the specified
platforms to process your raw data files. The assignment submission requires the following
sections:
• Section 1: Raw Data File Creation
– Create three separate CSV files, i.e., File_A.csv, File_B.csv and File_C.csv, each of which
contains raw data from a different data source related to the organisation of your choice
(e.g., customer satisfaction metrics, smart car sensor readings, database(s), social media
metrics/readings, etc.). Feel free to use fictitious data, as long as it is realistic for the
data source and represents the type of data your organisation would use.
Each file needs to contain at least 60 rows and 5 columns of data per row. [Output: 3 CSV
files.] [5%].
• Section 2: MS Excel Data Ingestion and Data Aggregation
– Import File_A.csv into an MS Excel worksheet and aggregate the data such that it provides
value to your organisation. Use Pivot Table(s) to process the data and generate two Pivot
Charts from a single Pivot Table. [Output(s): 1 MS Excel worksheet with Pivot Table(s) and
two Pivot Charts. One max 5-minute video demonstrating the data import and Pivot Table
and Chart generation.] [15%].
• Section 3: MS Excel CSV Generation
– Generate one new CSV file (File_D.csv) based on the aggregated results from section 2.
[Output(s): One CSV file containing the aggregated results. One max 2-minute video
demonstrating the data export.] [5%].
• Section 4: RStudio Data Ingestion
– Import three CSV files i.e., File_B.csv, File_C.csv and File_D.csv. Use any data structures in
the R Language you wish to import and store the data. [Output(s): Three R Language data
structures (using at least two different types of data structures) to contain all the imported data.
One max 5-minute video demonstrating the data import and data structure usage.] [20%].
• Section 5: RStudio Data Analysis
– Using the R Language and relevant supporting functions, analyse the data in your three data
structures to generate three meaningful summary reports (both Text and Graphical) which
will add value to your organisation. Each summary report will have a maximum word count
of 250 and at most two graphical aids generated using the R Language. [Output(s):
Three summary reports, max 250-words per report, each containing both text and graphical
aids, all within one PDF file. Please add the cover page details to this PDF file.] [25%].
• Section 6: Individual Report
– Provide a max 500-word report on the strengths and weaknesses of MS Excel vs the R
Language in one specific area of data analysis. As part of your individual report, make a
recommendation to use one of these platforms over the other for a very specific reporting
technique in a real-world setting. [Output(s): A max 500-word report in PDF format. Please
add the cover page details to this PDF file.] [30%].
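As an optional self-check for the Section 1 size requirement (at least 60 rows and 5 columns per file), the following is a minimal Python sketch and not part of the required deliverables. The function name `meets_minimum_size` is hypothetical, and the sketch assumes that the first row of each file is a header that does not count towards the 60 rows; if your files have no header row, adjust accordingly.

```python
import csv
import os


def meets_minimum_size(path, min_rows=60, min_cols=5):
    """Return True if the CSV at `path` has at least `min_rows` data rows,
    each with at least `min_cols` columns.

    Assumption: the first row is a header and is not counted as data.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    data_rows = rows[1:]  # skip the assumed header row
    if len(data_rows) < min_rows:
        return False
    # Blank lines are read as empty rows and therefore fail the column check.
    return all(len(row) >= min_cols for row in data_rows)


if __name__ == "__main__":
    # File names taken from Section 1 of the brief.
    for name in ("File_A.csv", "File_B.csv", "File_C.csv"):
        if os.path.exists(name):
            print(name, "OK" if meets_minimum_size(name) else "too small")
        else:
            print(name, "not found")
```

Running the script from the folder that holds the three files reports, for each file, whether it satisfies the minimum size under the header assumption stated above.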
Required Format:
- All external sources of information must be referenced using the Harvard referencing style
- Font: Times New Roman, size 12
- Spacing: 1.5
- Cover Page to include Student Name, Student Number, Lecturer Name
- Provide a table of contents in your report
- Add all of the individuals’ files, including the individual reports, to one single ZIP file.
Name the ZIP file after your group name, e.g., Group 01, Group 09 or Group 11.
Grade Descriptions:
- Fail: Very poor, clearly inadequate coverage of material; badly prepared, poor-quality
and confusing report. Very poor coverage of the subject matter. Demonstrates a lack of
understanding of MS Excel or the R Language when it comes to processing raw data in a
real-world setting.
- Pass: A subject has been clearly identified and defined. However, no clear overview or
analysis is provided. Provides basic coverage of the subject area with a limited set of
data sets. Functions within MS Excel and/or the R Language are applied, but only in a
very limited manner.
- 3H: A subject has been identified and clearly explained in the report. The subject area is
covered; however, only limited supporting data models and a basic analysis are provided
using MS Excel and/or the R Language.
- 2H2: Subject area clearly explained. Reasonable analysis of the problems posed by the
domain through the use of MS Excel and/or the R Language. However, some issues may
have been left unaddressed in the data models.
- 2H1: Excellent coverage. Clear exposition of the domain. Issues around representation
problems and the recommended system have been thoroughly reviewed. Report is of a high
standard. Very strong application of the functions within MS Excel and the R Language.
Very strong supporting summary reports generated using the two platforms.
- 1H: Outstanding. Thorough coverage of the subject. Perceptive analysis and assessment
accompanied by interesting application of the MS Excel and R Language functions.
Excellent understanding of the two platforms and logical application of the relevant
functions, reflecting a real-world setting.