Date: 2023-05-16
Tutorial 4 – Introduction to Project
Topic – Indexing and Query
• Spatial
– 2D coordinates for each data point
• Spatio-temporal
– 2D coordinates
– Timestamps for each data point
• More complicated
– Sequence of data points (e.g., trajectory)
– More relationships between data points (e.g., friendship)
Datasets Selection
Example Datasets      Size       Attributes                              Difficulty  Marks Capped
Chipotle Locations    2,629      Coordinates                             Easy        15
Satellite Data        419,438    Coordinates                             Easy        15
Traffic Accident      2,845,342  Coordinates, Timestamps                 Moderate    17
FourSquare            38,333     Coordinates, Timestamps                 Moderate    17
Taxi Trajectory Data  1,703,650  Coordinate Sequences, Timestamps        Hard        20
Gowalla               6,442,890  Coordinates, Timestamps, Relationships  Hard        20
Estimate Difficulty for Custom Datasets
• Moderate datasets
– size ≥ 10,000
– attributes contain at least coordinates and timestamps.
• Hard datasets
– size ≥ 100,000
– attributes should be more complicated and informative (e.g., coordinate sequences or relationships).
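The thresholds above can be turned into a quick self-check for a custom dataset. The following is only a sketch: the function name, the boolean flags, and the decision to treat "more complicated" attributes as a single flag are assumptions made for illustration, and the mark caps are copied from the example table.

```python
# Mark caps taken from the example datasets table.
MARK_CAPS = {"Easy": 15, "Moderate": 17, "Hard": 20}


def estimate_difficulty(size, has_timestamps, has_complex_attributes):
    """Classify a custom dataset using the size/attribute thresholds.

    `has_complex_attributes` is a hypothetical flag for attributes such as
    coordinate sequences (trajectories) or relationships (friendships).
    """
    if size >= 100_000 and has_complex_attributes:
        return "Hard"
    if size >= 10_000 and has_timestamps:
        return "Moderate"
    return "Easy"


def marks_cap(size, has_timestamps, has_complex_attributes):
    """Return the mark cap that corresponds to the estimated difficulty."""
    level = estimate_difficulty(size, has_timestamps, has_complex_attributes)
    return MARK_CAPS[level]
```

Applied to the example table, this reproduces the listed levels: Satellite Data is large but has coordinates only, so it stays Easy, while Gowalla is large and has relationships, so it is Hard.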
Project Implementation
• Task Description: you are required to implement indexing and querying
algorithms for the selected datasets in a programming language (e.g.,
Python, Java) to access, manipulate, or retrieve data from large-scale
spatial/spatio-temporal datasets.
• What to Implement: at least 3 algorithms taught in this course, such as the
R-tree, KD-tree, etc.
– You may include at most one basic baseline algorithm, such as linear
search.
• Mark Bonus: improve the taught methods with your own ideas and/or try
novel methods proposed in recent research.
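To make the implementation task concrete, here is a minimal sketch of one taught index (a 2D KD-tree) next to a linear-search baseline, both answering a rectangular range query. It assumes points are plain (x, y) tuples held in memory; the class and function names are illustrative, and a project implementation would also need to attach record IDs, timestamps, and cost counters.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


class KDNode:
    def __init__(self, point: Point, axis: int,
                 left: "Optional[KDNode]", right: "Optional[KDNode]"):
        self.point = point
        self.axis = axis      # 0 splits on x, 1 splits on y
        self.left = left      # points with coordinate <= split value
        self.right = right    # points with coordinate >= split value


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[KDNode]:
    """Build a KD-tree by splitting on the median, alternating x and y."""
    if not points:
        return None
    axis = depth % 2
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KDNode(ordered[mid], axis,
                  build_kdtree(ordered[:mid], depth + 1),
                  build_kdtree(ordered[mid + 1:], depth + 1))


def range_query(node: Optional[KDNode], rect: Rect,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Collect all points inside rect (boundaries inclusive)."""
    if out is None:
        out = []
    if node is None:
        return out
    xmin, ymin, xmax, ymax = rect
    x, y = node.point
    if xmin <= x <= xmax and ymin <= y <= ymax:
        out.append(node.point)
    lo, hi = (xmin, xmax) if node.axis == 0 else (ymin, ymax)
    split = node.point[node.axis]
    if lo <= split:   # the left subtree may still overlap the rectangle
        range_query(node.left, rect, out)
    if split <= hi:   # the right subtree may still overlap the rectangle
        range_query(node.right, rect, out)
    return out


def linear_range_query(points: List[Point], rect: Rect) -> List[Point]:
    """Baseline: test every point against the rectangle."""
    xmin, ymin, xmax, ymax = rect
    return [p for p in points
            if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]
```

Running both functions on the same rectangle and comparing the sorted outputs is a cheap sanity check before moving on to larger datasets.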
Query Task Examples
You need to propose at least 5 real-world query tasks for your implemented
algorithms.
Example queries:
a. Find all data points in a given rectangular area and within a certain time window. (easy)
b. Find all data points within a certain distance of a trajectory that appear on the same day.
(moderate)
c. Find the k nearest neighbors (data points) of a given trajectory on a given date. (moderate)
d. Find the skyline data points. (hard)
e. Find the shortest trajectory from one given data point to another. (moderate)
f. Find the trajectory that is most similar to a given trajectory. (hard)
g. Design your own custom queries!
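Several of the example queries depend on the distance between a data point and a trajectory. The sketch below shows one way to compute it without an index, assuming a trajectory is a list of (x, y) vertices joined by straight segments and that coordinates are planar, so Euclidean distance is meaningful. Raw latitude/longitude would need a projection or a geodesic distance. The function names are illustrative, and the same-day filter from query (b) is left out.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_trajectory(p, trajectory):
    """Smallest distance from p to any segment of the trajectory."""
    if not trajectory:
        raise ValueError("trajectory must contain at least one vertex")
    if len(trajectory) == 1:
        return math.hypot(p[0] - trajectory[0][0], p[1] - trajectory[0][1])
    return min(point_segment_distance(p, a, b)
               for a, b in zip(trajectory, trajectory[1:]))


def points_near_trajectory(points, trajectory, max_dist):
    """Linear scan: keep the points within max_dist of the trajectory."""
    return [p for p in points
            if distance_to_trajectory(p, trajectory) <= max_dist]
```

An indexed version would first use the index to discard points far from the trajectory's bounding box and only then apply the exact distance test.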
Evaluation Metrics
For both building the index and executing the queries, you need to report:
1) runtime cost
2) memory cost
3) I/O cost
Find ways to measure these metrics that suit your operating
system and programming language.
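In Python, runtime and peak memory can be measured with the standard library alone. The helpers below are a minimal sketch, and their names are illustrative. They use `time.perf_counter` for wall-clock time and `tracemalloc` for peak Python-level allocations. The two are measured in separate runs because memory tracing slows execution. I/O cost is not covered here; one common approximation, offered as an assumption rather than a requirement, is to count index node or page accesses inside your own data structure.

```python
import time
import tracemalloc


def measure_runtime(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def measure_peak_memory(fn, *args, **kwargs):
    """Return (result, peak_bytes) allocated by Python during the call."""
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Repeating each measurement several times and reporting the average helps smooth out noise from the operating system.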
Correctness of Returned Results
You need to ensure that your implementations return the same results as a
DBMS.
1. Import the datasets into PostgreSQL. (how)
2. Write SQL code for your proposed tasks.
3. Compare the results returned by PostgreSQL with the results returned by your
algorithms.
All SQL code used for database construction, data import, and query operations
should be put in a single file (.sql), then compressed together with your
algorithm implementation code into a single zip file.
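The comparison step can be automated once both result sets are available in Python. The sketch below assumes each result row is identified by a unique, sortable ID, and that the PostgreSQL rows have already been fetched by whatever means you use. The function name and the returned dictionary layout are illustrative.

```python
def compare_results(db_ids, algo_ids):
    """Compare two result sets by record ID, ignoring order.

    db_ids:   IDs returned by the SQL query (assumed already fetched).
    algo_ids: IDs returned by your own index/query implementation.
    """
    db_set, algo_set = set(db_ids), set(algo_ids)
    return {
        "match": db_set == algo_set,
        "missing": sorted(db_set - algo_set),      # in the DBMS result only
        "unexpected": sorted(algo_set - db_set),   # in your result only
    }
```

This check is order-insensitive, which suits range queries. For ranked queries such as k nearest neighbors, comparing the two ID lists directly also verifies the ordering.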
Report Writing
Your report must have a clear structure. The following structure is an
example:
1. Introduction
2. Methodology
   1. Describe each algorithm in detail
   2. Explain how each algorithm solves each query task
3. Experiments
   1. Experimental setup & implementation details
   2. Presentation of experimental results (table, plot, map visualization, etc.)
   3. Analysis
4. Conclusion
Report Writing
• Weight: 10%
• Use the IEEE template (either Word or LaTeX)
• No more than 4 pages, excluding the reference list.
• Submit to Turnitin in PDF format.
PostgreSQL
1. How to write SQL for K-Nearest Neighbors?
2. How to show objects on a map?
3. How to analyze the query results?
– dictionary for EXPLAIN ANALYZE
Implementation Example
• find the K nearest sites (Point) of a given location (Point)
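As a reference to check the SQL output against, the same K-nearest-sites task can be solved by brute force in Python. This is a stand-in sketch, not the PostgreSQL demo itself. It assumes sites are stored in a dictionary mapping a site name to planar (x, y) coordinates and uses Euclidean distance; both the data layout and the function name are assumptions.

```python
import heapq
import math


def k_nearest_sites(sites, location, k):
    """Return the names of the k sites closest to location, nearest first.

    sites:    dict mapping site name -> (x, y)
    location: (x, y) query point
    """
    def distance(name):
        sx, sy = sites[name]
        return math.hypot(sx - location[0], sy - location[1])

    # nsmallest scans all sites once and keeps the k best candidates.
    return heapq.nsmallest(k, sites, key=distance)
```

Because every site is examined, this also serves as the linear-search baseline when comparing against an indexed K-nearest-neighbor implementation.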
Demo