Date: 2021-10-15

David Kahle
STA 4373 - Computational Methods in Statistics
Project 1

Goals

The purpose of this project is to give you an opportunity to flex some of your EDA skills on a dataset interesting to you and to practice working in a group (each group will have 2-3 people). As part of this analysis, you will analyze a single dataset, asking at least three questions, and generate a report. Your analysis should primarily be inquiry/hypothesis driven and answered using EDA techniques from class and the text.

Rules

While your data is obtained from the internet, you may not use any analyses found on the internet. Please provide sources for your data and any background information you obtained (e.g. Wikipedia or background sources explaining a topic). If you have questions about what is in-bounds/out-of-bounds, please contact me directly.

Data

You may use a dataset of your choosing for this assignment, but it should not be a dataset used in the text.

Deadlines and deliverables

• Contact me by Wednesday, October 6 via Slack to communicate your dataset, source, and the problems you intend to investigate, as well as any preliminary analyses. This should be in a group DM.
• Friday, October 15, 5:00pm - Submit all files via Canvas.

Grading

On Canvas you will find a detailed rubric that I will use to assess your grade on this assignment. As an overall guide, here is a list of the five criteria on which your submission will be evaluated.

Introduction - 5 points

The purpose of the introduction is to introduce the dataset, provide some context, and guide me as to what to expect from the rest of the report. You may find it easiest to write the introduction last, after you write the rest of the report.

Questions and findings - 60 points

You should have three main questions and associated findings, each of which may be broken down further with more specific minor questions.
Some of these questions will occur to you immediately upon looking at the data, and some will require considerable exploration before they occur to you. To get to the three questions that you report on, I’d expect you to have considered 10 or more questions. A lot of the time you will run into a dead end, or the answer to your question will turn out to be uninteresting or obvious. It is always disappointing not to report on something that you spent time working on, but it does make for a better report. (Of course, negative results are still important! Just because the answer to a question you expected to be “yes” turns out to be “no” doesn’t mean that you shouldn’t communicate it to others.)

You might want to briefly mention some of the dead ends you went down to demonstrate that you’ve done more than just the obvious. Put commented code for these questions in an appendix.

I will assess the questions and findings based on the three criteria of curiosity, skepticism, and organization.

In all real data sets you will need to spend a lot of time cleaning up the data - fixing incorrect values, dealing with missing values, etc. Some of this has been done for you, but there may be more lurking. If you perform additional cleaning steps, don’t forget to give a brief description of what you did. If the dataset is particularly messy, the cleaning process itself can account for one of your three findings, but this should be cleared with me first, since the techniques you would use would be outside of the EDA scope.

Conclusion - 5 points

The conclusion should summarize your findings. Rather than just repeating what you’ve already said, try to weave your findings together into a consistent story. You should also reflect a little on other questions that the exploration raised, and what you would do next. Do you need to collect more data, or collect data in a different way?

Presentation - 15 points

I’ll also mark the general presentation of the project.
This is divided into two parts: text and graphics.

The text should be consistent (i.e., not obviously written by three people and then glued together) and written in the first person plural. Each section should be clearly outlined and the overall document should have a professional look. For example: how big are the resulting graphics on the printed page? How long is the document: are you spamming me, or is it concisely written and to the point? Do I have to dig around for your questions and conclusions? As inspiration:

“I have made this letter longer than usual, because I lack the time to make it short.” - Blaise Pascal

“If I am to speak ten minutes, I need a week for preparation; if fifteen minutes, three days; if half an hour, two days; if an hour, I am ready now.” - Woodrow Wilson

Graphics should follow the guidelines we have discussed in class. Are they concise and to the point? Are they distilled to being as simple as possible to understand the intended point? Have they been polished, with a clean title that explains the point I am to take away from the graphic and clearly labeled axes/legends?

Code/reproducibility - 15 points

Last, but not least, your report should include an appendix which allows the reader to reproduce your findings. Since all of the findings are presented in the main body of the report, the appendix should be used to show intermediate hypotheses/ideas/findings. The main body code and appendix code will be graded according to the code rubric.

What you’ll turn in

Each team will elect a team representative who will submit the assignment via Canvas. Your project submission should include your compiled PDF file (.pdf), the corresponding R Markdown file (.Rmd), and any datasets used.