sql代写|数据库代写-SQL programming
时间:2020-11-26
_ Rename the file to "al_perf.csv" for easier handling. 3. Create a schema Create a new schema called 'FAA' for your project in the Workbench 4. Load the data You will use different methods to load the data into your database. 1) Create and Load fact table using MySQL provides a utility mysqlimport to load large data sets into tables. Follow these steps to load the fact table data: - Create table 'al_perf' in schema FAA using CreateFactTable.sql script - Create EC2 Instance on AWS. Document "Create_EC2_Instance_on_AWS_instructions.docx" contains instructions. - Secure copy(scp) your csv file from your laptop to your home directory of the EC2 instance. For example: $scp ~/al_perf.csv ec2-user@:/home/ec2- user - Run the following command to install mysql on your EC2 instance: $sudo yum install mysql - Run mysqlimport utility to move the file AWS RDS. Notice options that are used below. You can read about their meaning in MySQL documentation. Example: $ mysqlimport --local \ --compress \ --user=admin --password= \ --host=.rds.amazonaws.com \ --fields-terminated-by=',' \ --fields-optionally-enclosed-by=’”’ \ al_perf.csv 2) Create and Load dimension tables L_AIRLINE_ID.csv L_AIRPORT.csv L_AIRPORT_ID.csv L_DISTANCE_GROUP_250.csv L_WEEKDAYS.csv using Table Data Import Wizard on the Workbench. The Wizard does not require to create tables in advance, it creates a table if it does not exist. However, if it takes too long you can load the tables using mysqlinport. In that case you will need to create the dimension tables first. 3) Create dimension table L_CANCELATION using CREATE TABLE statement. Load data into dimension tables using INSERT statements. 5. Analyze the data Create and run SQL queries to do the following. 1) Find maximal departure delay in minutes for each airline. Sort results from smallest to largest maximum delay. Output airline names and values of the delay. 2) Find maximal early departures in minutes for each airline. Sort results from largest to smallest. Output airline names. 3)Rank days of the week by the number of flights performed by all airlines on that day ( 1 is the busiest). Output the day of the week names, number of flights and ranks in the rank increasing order. 4) Find the airport that has the highest average departure delay among all airports. Consider 0 minutes delay for flights that departed early. Output one line of results: the airport name, code, and average delay. 5) For each airline find an airport where it has the highest average departure delay. Output an airline name, a name of the airport that has the highest average delay, and the value of that average delay. 6) a) Check if your dataset has any canceled flights. b) If it does, what was the most frequent reason for each departure airport? Output airport name, the most frequent reason, and the number of cancelations for that reason. 7) Build a report that for each day output average number of flights over the preceding 3 days.


essay、essay代写