MATLAB ghostwriting - ELEC2103/9103
Date: 2021-10-03

ELEC2103/9103: Simulation and Numerical Solutions in Engineering
Dr Mahyar Shirvanimoghaddam
School of Electrical and Information Engineering, The University of Sydney
2021

Assignment Description

Modelling, predicting, and verifying the accuracy of models are vital skills in engineering and other fields. This assignment will assess your ability to develop and validate statistical and machine learning models using MATLAB, and in particular, your use of the Statistics and Machine Learning Toolbox.

1 Key information

For this assignment, you need to complete two steps:

1. Complete the MATLAB Machine Learning Onramp and upload your certificate to the assignment box. This is an individual assignment and each person needs to complete the training individually. This is worth 5% of your total mark [1].
2. Perform statistical analysis and machine learning on the given dataset, write a report and submit it to the assignment box. This is group work and you will work with the same group member as in your lab to complete it. You only need to submit one report per group. This is worth 20% of your total mark.

Due Sunday 10 October 2021 at 23:59.

2 Background

This assignment asks you to explore and analyse a publicly available dataset of your choice from the following list.

1. Ausgrid distribution zone substation data: Ausgrid operates a network with over 180 zone substations. These substations form the boundary between the sub-transmission network and the distribution (11kV) network. Ausgrid is making available historical interval demand data (in megawatts) for all zone substations not subject to third party privacy concerns. The dataset is available here: https://www.ausgrid.com.au/Industry/Our-Research/Data-to-share/Distribution-zone-substation-data
2. World Bank Education Statistics: The World Bank EdStats Query holds around 2,500 internationally comparable education indicators for access, progression, completion, literacy, teachers, population, and expenditures.
The indicators cover the education cycle from pre-primary to tertiary education. The query also holds learning outcome data from international learning assessments (PISA, TIMSS, etc.), equity data from household surveys, and projection data to 2050. The dataset is available here: https://databank.worldbank.org/source/education-statistics-%5e-all-indicators
3. World Bank Health Nutrition and Population Statistics: World Bank key health, nutrition and population statistics gathered from a variety of international sources. The dataset is available here: https://databank.worldbank.org/source/health-nutrition-and-population-statistics
4. Australian Bureau of Statistics: Causes of Death, Australia. Statistics on the number of deaths, by sex, selected age groups, and cause of death classified to the International Classification of Diseases (ICD). The dataset is available here: https://www.abs.gov.au/statistics/health/causes-death/causes-death-australia/2019#data-download
5. Kaggle: Retail Analysis with Walmart Sales Data. Historical sales data for 45 Walmart stores located in different regions are available. Certain events and holidays affect sales on each day. The business faces a challenge from unforeseen demand and sometimes runs out of stock because of an inappropriate machine learning algorithm. Walmart would like to predict sales and demand accurately. An ideal ML algorithm will predict demand accurately and ingest factors like economic conditions including CPI, Unemployment Index, etc. The dataset is available here: https://www.kaggle.com/rutuspatel/retail-analysis-with-walmart-sales-data
6. Any other large dataset: If you decide to choose another dataset, you need to discuss your choice with me before you start working on it.

[1] Please follow this link to access the MATLAB Machine Learning Onramp course: https://matlabacademy.mathworks.com/details/machine-learning-onramp/machinelearning

3 The assignment task

You are to explore and analyse some or all of the data files in one of the datasets above. You are to complete your analysis using MATLAB, and present your analysis as a report contained in a script and other files that can be published to a report in HTML using MATLAB’s Publish features. You are encouraged to share ideas, but your submitted assignment must be uniquely your own.

3.1 The data

The data is mostly contained in csv files and might be separated by financial year or month. You need to fully understand the attributes in each dataset and be able to explain them in your report. You can also draw on other data sources to inform your analysis (see the section on higher grades below). If you have something particular in mind, I can advise you whether it is freely available and where to find it, but the Australian Bureau of Statistics (ABS) and the Bureau of Meteorology (BOM) are good places to consider.

3.2 Submission requirements

Your assignment will be submitted via Canvas in the form of a .zip file named in the following format: Group_Group Number.zip. Your .zip file must contain:

1. Your main file, called elec2103a.m (regardless of whether you are undergrad or postgrad). You are provided with a MATLAB script stub to get you started, which is available on Canvas.
2. Any custom functions that you write.
3. The data that is needed to complete your analysis.
4. A PDF file including your answer to Part 1, the published version of your main file (Part 2), and your answers to Part 3.

4 Assignment criteria and grades

The assignment will be given a grade out of 20. Marks will be allocated in three tranches, as follows:

4.1 Part 1

Here, you need to clearly explain the dataset and the problem that you want to solve.

1. Problem Statement and Background: A high-level statement of the problem you intend to address/business case study. Give a clear and complete statement of the problem.
2.
Resources: Where do the data come from, and what are their characteristics?
   (a) The data source(s), and
   (b) characteristics of the data you intend to use (e.g. attributes, data types, etc.)

Marks for part 1: Completing this part reasonably well will earn you 3 marks. Here, “reasonably” means more than copying and pasting information already available on the website you are downloading the dataset from. I want to see evidence that you have understood the dataset you are working on and the problem you are going to address.

4.2 Part 2

The minimum requirements of this assignment are to:

1. Write a sub-routine to load some or all of the data (from one of the datasets) into a useable format in MATLAB.
2. Write a sub-routine to analyse the data in MATLAB by modelling/fitting it, using regression, classification, ANOVA or other machine learning methods. You may wish to pre-process the data in order to extract some interesting values or variables of merit. Briefly explain your model.
3. Write a sub-routine that makes some assessment of the statistical errors or goodness-of-fit of your model and returns or prints them in your report. Explain these figures.
4. Make appropriate use of plots and/or charts in your report.
5. In your main script, include a call to at least one custom function that you have written in a separate m-file.
6. Put your analysis in a publishable MATLAB script that runs without errors. Build on the provided m-file stub elec2103a.m.

Marks for part 2: Satisfying each of the minimum requirements 1-5 above will earn you 1 mark, while requirement 6 is worth 2 marks. Here, “satisfying” means more than joining two points at different times with a straight line, and will be satisfied if you make proper use of a tool available in the Statistics and Machine Learning Toolbox. In other words, I want to see evidence that you have learnt how to use some new MATLAB tools.
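
As a rough illustration of what requirements 1-4 might look like, the sketch below fits a linear regression with fitlm from the Statistics and Machine Learning Toolbox, prints two goodness-of-fit figures, and plots the fit. It is not taken from the provided stub. The synthetic data and the variable names temperature and demand are assumptions made purely for illustration; with a real dataset the table would typically come from readtable applied to one of the csv files.

```matlab
% Sketch only: synthetic data stands in for a real dataset (assumption).
% With real data, something like data = readtable('yourfile.csv') would
% replace this block ('yourfile.csv' is a hypothetical file name).
rng(1);                                        % reproducible random numbers
temperature = 10 + 25*rand(200, 1);            % hypothetical predictor
demand = 5 + 0.8*temperature + randn(200, 1);  % hypothetical response
data = table(temperature, demand);

% Fit a linear regression model using Wilkinson notation.
mdl = fitlm(data, 'demand ~ temperature');

% Goodness-of-fit figures to report and explain.
fprintf('R-squared: %.3f\n', mdl.Rsquared.Ordinary);
fprintf('RMSE:      %.3f\n', mdl.RMSE);
disp(mdl.Coefficients);   % estimates, standard errors, t-statistics, p-values

% Plot the observations together with the fitted line.
plot(mdl);
xlabel('Temperature (hypothetical units)');
ylabel('Demand (hypothetical units)');
```

To address requirement 5, the fitting or error-reporting step could be moved into a function saved in its own m-file and called from the main script.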

A caution: if the main MATLAB script you submit doesn’t run, I will spend a very small amount of time trying to assess the remaining requirements. On the other hand, if you submit a very basic script that does run without errors, you’ll get the marks for requirement 6.

4.3 Part 3

To earn higher grades, you either need to do the first part extremely well (which might get you up to a credit), or you will need to add one or two additional advanced forms of statistical analysis and/or prediction, performed on the same dataset, with justification for your choice. You can choose, but these could include the following sub-routines:

1. Make a prediction using your model, perhaps into the future (for time series data) or across a new subset. Discuss your prediction, including making an assessment of the reliability of the prediction.
2. Complete a formal statistical comparison of more than one model or method of analysis.
3. Make use of advanced statistical analysis testing the assumptions of your modelling choice, such as tests of heteroscedasticity, multicollinearity, etc.
4. Bootstrapping, jackknifing, k-folds or some other resampling-based validation of the predictive ability of your model.
5. Sophisticated use of more than one dataset (i.e. incorporating additional data beyond the dataset you chose into your analysis).
6. Use of an advanced statistical estimation or machine learning technique, with justification.
This could include:
   (a) Using MATLAB’s neural network tools (which doesn’t take much effort);
   (b) Advanced time series analysis, such as ARIMA or GARCH models;
   (c) Estimating a stochastic volatility or hidden Markov model;
   (d) Using Bayesian models;
   (e) Advanced clustering and/or hierarchical analysis;
   (f) If you have an interest in signal processing, you could investigate non-parametric kernel estimators (akin to kernel smoothing techniques), principal component analysis, or apply a series of bandpass filters over a time series and see what you get.

Or come up with something else, after discussing with me.

(Note: In contrast to previous years’ assignments, making use of MATLAB’s advanced visualisation or GUI tools will not attract marks for higher grades. That is, do not submit code that generates a GUI or only returns advanced visualisation objects and expect to get any marks for it.)

I reinforce that simply applying a fancy method is insufficient for the purposes of this assessment. Instead, you will need to justify your choice. For example, writing “I used an Ornstein-Uhlenbeck process to model the variations in X” is not sufficient justification, while writing “An Ornstein-Uhlenbeck process is used because mean-reverting processes are appropriate for the setting of X” is much better. Another good justification is choosing an advanced model based on insights drawn from the output of a simpler one.

Please limit yourself to three forms of analysis or investigation in total, including your first approach that satisfies the minimum requirements 1-6. If you include more I will only assess the first three. If you are unsure where the boundaries of an analysis are, contact me for clarification.

Marks for part 3: In addition to the 10 marks available for requirements 1-6, each additional piece of analysis will be worth at most 6 marks, at my discretion. This means for the assignment, you can score up to 22/20!
This is not a mistake, but is included to encourage you to try new things. For example, trying two advanced methods and scoring only 4/6 for each gets you 18/20. However, note that any score >20 marks will be given only the full 20% weighting in your final unit grade. Curb your enthusiasm accordingly.

For the additional analyses, the more sophisticated the techniques you use, the higher the mark you will score. Examples of these advanced techniques will not be covered in the lectures or the labs. You are required to discover them for yourself, but do feel free to discuss your proposed approaches with me (via the Canvas discussion board).

4.4 Assignment length

There are no minimum or maximum lengths to the submission, but treat this like you are trying to convince a busy person that you have something important to say. Being terse and direct is not a bad thing in engineering and business communication.

4.5 Late submission penalties

Late assignments will be penalised by deducting 2 marks and by reducing the maximum grade achievable by 2 marks for each 24 hours overdue, including weekends. Don’t be late!

5 Useful Resources

There are several tutorials and resources available online to learn how to use the MATLAB Machine Learning toolbox.

1. Introducing Machine Learning
   https://www.mathworks.com/content/dam/mathworks/ebook/gated/machine-Learning-ebook.pdf
2. MATLAB for Machine Learning
   https://au.mathworks.com/solutions/machine-learning.html
3. Mastering Machine Learning: A Step-by-Step Guide with MATLAB
   https://au.mathworks.com/content/dam/mathworks/ebook/gated/machine-learning-workflow-ebook.pdf
4. Applied Machine Learning
   https://au.mathworks.com/videos/series/applied-machine-learning.html
5. What Is Deep Learning? 3 things you need to know
   https://au.mathworks.com/discovery/deep-learning.html
6. Predictive Analytics: 3 Things You Need to Know
   https://au.mathworks.com/discovery/predictive-analytics.html
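
As an illustration of the resampling-based validation mentioned under Part 3 (k-folds), the sketch below estimates out-of-sample error for a simple linear regression using cvpartition and fitlm. It is a minimal example under stated assumptions, not a required approach: the data are synthetic, and the variable names and the choice of five folds are hypothetical.

```matlab
% Sketch only: synthetic data stands in for a real dataset (assumption).
rng(1);                          % reproducible random numbers
n = 200;
x = 10 + 25*rand(n, 1);          % hypothetical predictor
y = 5 + 0.8*x + randn(n, 1);     % hypothetical response

k = 5;                           % number of folds (illustrative choice)
cv = cvpartition(n, 'KFold', k); % random split of the n observations
rmseFold = zeros(k, 1);

for i = 1:k
    trainIdx = training(cv, i);  % logical index of training rows, fold i
    testIdx = test(cv, i);       % logical index of held-out rows, fold i

    % Fit on the training rows only, then predict the held-out rows.
    mdl = fitlm(x(trainIdx), y(trainIdx));
    yPred = predict(mdl, x(testIdx));

    % Root-mean-square prediction error on the held-out fold.
    rmseFold(i) = sqrt(mean((y(testIdx) - yPred).^2));
end

fprintf('Mean out-of-sample RMSE over %d folds: %.3f\n', k, mean(rmseFold));
```

Comparing this cross-validated RMSE with the in-sample RMSE reported by the fitted model gives one indication of whether the model is overfitting.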