DATA7001
Date: 2023-09-07
Prac-2
Getting the Data I need
Part 1 - Assessing the Ethical Use of Data
SQL Syntax
Useful links: https://sqlbolt.com/
Related course: INFS7901
SELECT column1, column2, column3, ...
FROM table_name
WHERE condition1 AND/OR condition2
GROUP BY column1, column2, ...
HAVING condition3
ORDER BY column1, column2, ... ASC/DESC
LIMIT offset, row_count;
Actual execution order: ① FROM ② WHERE ③ GROUP BY ④ HAVING ⑤ SELECT ⑥ ORDER BY ⑦ LIMIT
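To try the clause template on a concrete case, here is a minimal sketch that runs one query through Python's built-in sqlite3 module. The `orders` table, its columns and its rows are made up for illustration, and SQLite is only an assumption (the prac may use a different database). The query is written in the order the template requires, but the database logically evaluates FROM and WHERE first, then GROUP BY and HAVING, then SELECT, ORDER BY and finally LIMIT.

```python
import sqlite3

# In-memory database with a hypothetical table, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("ann", 50), ("ann", 70),
        ("bob", 5), ("bob", 200),
        ("cat", 30),
        ("dan", 90), ("dan", 40),
    ],
)

query = """
SELECT customer, SUM(amount) AS total   -- evaluated after grouping and HAVING
FROM orders                             -- evaluated first
WHERE amount >= 10                      -- filters rows before grouping
GROUP BY customer                       -- one group per customer
HAVING SUM(amount) > 35                 -- filters whole groups
ORDER BY total DESC                     -- sorts the surviving groups
LIMIT 1, 2;                             -- skip 1 row, then return 2 rows
"""
rows = conn.execute(query).fetchall()
print(rows)
```

WHERE drops bob's order of 5 before the totals are computed, HAVING then drops cat (total 30), and LIMIT 1, 2 skips the top-ranked customer, so the result contains dan (130) followed by ann (120).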
k-Anonymity: A privacy-preserving technique.
It ensures that each record is indistinguishable from at least (k-1) other records within the dataset, with respect to its quasi-identifying attributes.
Example, k = 2: each record is indistinguishable from at least one other record.
From lecture slides: Module2-Managing Data Privacy
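The definition can be checked mechanically: group the records by their quasi-identifier values and look at the smallest group. The sketch below does this under stated assumptions. The function name `k_anonymity`, the field names and the sample records are hypothetical and not taken from the lecture slides.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the largest k for which `records` is k-anonymous.

    Records sharing the same quasi-identifier values form one group;
    k is the size of the smallest such group (0 for an empty dataset).
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0

# Hypothetical, already-generalised records (illustration only).
records = [
    {"age_range": "20-29", "postcode": "40**", "condition": "flu"},
    {"age_range": "20-29", "postcode": "40**", "condition": "asthma"},
    {"age_range": "30-39", "postcode": "41**", "condition": "flu"},
    {"age_range": "30-39", "postcode": "41**", "condition": "diabetes"},
]

k = k_anonymity(records, ["age_range", "postcode"])
print(k)
```

Here every combination of `age_range` and `postcode` appears twice, so the sample is 2-anonymous on those two attributes. Treating `condition` as a quasi-identifier as well would make every record unique and reduce k to 1.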
Part 2 - Reasoning with sampling strategies
Sampling Strategies
Simple Random Sampling: Equal chance for all members.
Systematic Sampling: Members selected at regular intervals.
Weighted Sampling: Different probabilities based on weights.
Stratified Sampling: Samples drawn from predefined groups.
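The four strategies can be compared side by side with Python's standard random module. This is a minimal sketch, not the course's reference implementation: the function names, the toy population and the choice of drawing the weighted sample with replacement are all assumptions made for illustration.

```python
import random

def simple_random(population, n, rng):
    """Every member has the same chance; drawn without replacement."""
    return rng.sample(population, n)

def systematic(population, n, rng):
    """Pick a random start, then take every step-th member (needs 0 < n <= len)."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def weighted(population, weights, n, rng):
    """Selection probability proportional to weight; drawn with replacement."""
    return rng.choices(population, weights=weights, k=n)

def stratified(population, key, per_group, rng):
    """Split into predefined groups via `key`, then sample within each group."""
    groups = {}
    for member in population:
        groups.setdefault(key(member), []).append(member)
    return {g: rng.sample(members, per_group) for g, members in groups.items()}

rng = random.Random(42)          # seeded only to make runs repeatable
population = list(range(100))    # toy population: the integers 0..99

srs = simple_random(population, 10, rng)
sys_sample = systematic(population, 10, rng)
wtd = weighted(population, [m + 1 for m in population], 10, rng)
strat = stratified(population, lambda m: "even" if m % 2 == 0 else "odd", 5, rng)
print(srs, sys_sample, wtd, strat, sep="\n")
```

In this toy setup the systematic sample always has a gap of 10 between consecutive members, the weighted sample favours larger numbers, and the stratified sample guarantees exactly five even and five odd members, which a simple random sample does not.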