Date: 2023-04-17
CRICOS code 00025B
Advanced (Business) Data Analytics
Semester 1, 2023
Dr Morteza Namvar
Module 1.3 Lecture
Introduction to Python 2
Outline
Review of Module 1.2
Data types & structures in Python
Functions in Python
For loops in Python
Module 1.2 review
Tan, Steinbach and Kumar (2006)
▪ Importing libraries
▪ Importing data
▪ Exploring data
Data structures: Python lists, NumPy arrays & Pandas DataFrames
Python data types
▪ Integers
• An integer is a whole number, such as -10, -5, 0, 1 or 4.
▪ Floats
• A float is a number stored in floating-point form and written with a decimal point, e.g., -2.5, 4.899 or 2.0.
▪ Strings
• Strings in Python are surrounded by either single or double quotation marks.
▪ Booleans
• A Boolean is a logical truth value: True or False.
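The four data types can be checked with the built-in type() function. The following is a minimal sketch; the variable names are made up for illustration.

```python
# Inspecting the four basic data types with type()
whole = -10            # integer
ratio = 4.899          # float
single = 'analytics'   # string in single quotes
double = "analytics"   # the same string in double quotes
flag = True            # Boolean

kinds = [type(v).__name__ for v in (whole, ratio, single, flag)]
print(kinds)               # ['int', 'float', 'str', 'bool']
print(single == double)    # True: the quote style does not matter
print(type(2.0).__name__)  # float: the decimal point makes it a float
```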
Python containers
▪ Lists
• Python lists can hold elements of different data types.
▪ Tuples
• A tuple is similar to a list, except that it is immutable: once a tuple is defined, its elements cannot be changed. To change the elements, you need to define the tuple again.
• Lists are defined using square brackets, whereas tuples are defined using parentheses.
▪ Dictionaries
• A dictionary stores (key, value) pairs, e.g., (person, age).
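The behaviour of the three containers can be sketched as follows. The names record, point and ages are hypothetical and only serve the illustration.

```python
# A list is mutable and can mix data types
record = ['Alice', 31, 1.68, True]
record[1] = 32                      # changing an element in place is allowed

# A tuple cannot be changed after it is defined
point = (3, 4)
try:
    point[0] = 10
except TypeError as err:
    tuple_error = str(err)          # item assignment is rejected
point = (10, 4)                     # defining the tuple again replaces it

# A dictionary maps keys to values (here: person -> age)
ages = {'Alice': 32, 'Bob': 27}
ages['Carol'] = 45                  # add a new (key, value) pair
print(record, point, ages['Bob'])
```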
Python lists
▪ The list is one of the most versatile data types in Python. It is written as comma-separated values (items) between square brackets.
• An important property of a list is that its items can be of different types.
▪ Creating a list is as simple as putting comma-separated values between square brackets.
Example:
list1 = ['physics', 'chemistry', 1997, 2000]
list2 = [1, 2, 3, 4, 5]
list3 = ["a", "b", "c", "d"]
NumPy arrays
▪ NumPy is the core library for scientific computing in Python; it provides a high-performance multidimensional array object.
▪ The library is imported into the Python working environment by typing:
import numpy as np
▪ A NumPy array is a grid of values, all of the same type.
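As a small sketch of what "a grid of values, all of the same type" means in practice, the example below builds a two-dimensional array and inspects it. The names grid and mixed are illustrative, and NumPy is assumed to be installed.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])      # a 2 x 3 grid of integers

print(grid.ndim)     # number of dimensions: 2
print(grid.shape)    # (2, 3): two rows, three columns
print(grid.dtype)    # one integer dtype shared by every element

mixed = np.array([1, 2.5])        # an int and a float in one array
print(mixed.dtype)   # float64: both values are stored as floats
```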
NumPy array vs. Python list
▪ At the implementation level, the array essentially contains a single pointer to one contiguous block of data. A Python list, by contrast, holds a pointer to each of its elements, and each element is a separate Python object.
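One visible consequence of the different designs is how arithmetic behaves. The sketch below, with illustrative variable names and NumPy assumed to be installed, applies the same operator to a list and to an array.

```python
import numpy as np

values_list = [1, 2, 3]
values_array = np.array(values_list)

doubled_list = values_list * 2      # repeats the list: [1, 2, 3, 1, 2, 3]
doubled_array = values_array * 2    # multiplies each element: [2 4 6]

print(doubled_list)
print(doubled_array)
print(values_array.flags['C_CONTIGUOUS'])   # True: one contiguous block of data
```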
Pandas DataFrames
▪ At the most basic level, Pandas objects can be thought of as enhanced versions of NumPy structured arrays in which the rows and columns are identified with labels rather than simple integer indices.
▪ Pandas provides a host of useful tools, methods, and functionality on top of the basic data structures.
▪ A Pandas Series is a one-dimensional array of indexed data.
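A minimal sketch of labelled data in Pandas is shown below. The people and their ages are invented for illustration, and pandas is assumed to be installed.

```python
import pandas as pd

# A Series: one-dimensional data with an index label for each value
ages = pd.Series([31, 27, 45], index=['Alice', 'Bob', 'Carol'])
print(ages['Bob'])            # 27, looked up by label

# A DataFrame: rows and columns both carry labels
people = pd.DataFrame(
    {'age': [31, 27, 45], 'city': ['Brisbane', 'Sydney', 'Perth']},
    index=['Alice', 'Bob', 'Carol'],
)
print(list(people.index))     # row labels
print(list(people.columns))   # column labels
print(people.describe())      # one of the many built-in methods
```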
Pandas Series vs. single-column DataFrame
▪ A Pandas Series is one-dimensional, whereas a DataFrame is two-dimensional.
▪ Therefore, a single-column DataFrame has a label for its single column, whereas a Series has no columns; it can only carry a single name.
▪ In fact, each column of a DataFrame can be converted to a Series.
▪ The pandas functions for DataFrames can be used on a single-column DataFrame, and the pandas functions for Series can be used on a Series.
▪ Many functions can be used for both Series and DataFrames; however, not all functions can be used interchangeably.
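The difference can be checked directly. The sketch below uses a made-up single-column DataFrame called df with a column named score; pandas is assumed to be installed.

```python
import pandas as pd

df = pd.DataFrame({'score': [70, 85, 90]})   # single-column DataFrame

column = df['score']        # selecting one column gives a Series
print(type(column).__name__, column.ndim, column.name)   # Series 1 score
print(type(df).__name__, df.ndim, list(df.columns))      # DataFrame 2 ['score']

back = column.to_frame()    # a Series can be turned back into a DataFrame
print(back.shape)           # (3, 1)

print(column.mean(), df.mean()['score'])   # mean() works on both
# column.columns would raise AttributeError: a Series has no columns
```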
Main Differences
Lists, NumPy arrays, and Pandas DataFrames can all be used to hold a sequence of data, but these data structures are built for different purposes.
▪ Lists are simple Python built-in data structures, which can easily be used as a container to hold a dynamically changing sequence of data of different types.
▪ NumPy provides N-dimensional array objects to allow fast scientific computing.
▪ Pandas is more like an Excel spreadsheet, as it provides tabular data structures that consist of rows and columns.

Structure   Homogeneity     Accessibility               Others
List        Heterogeneous   Integer position            Python built-in data structure
Array       Homogeneous     Integer position            High-performance array calculation
DataFrame   Heterogeneous   Integer position or index   Tabular data structure
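The accessibility column of the comparison can be illustrated with a short sketch. All names and values are hypothetical, and NumPy and pandas are assumed to be installed.

```python
import numpy as np
import pandas as pd

items = ['a', 1, 2.5]                     # list: mixed types, integer position
arr = np.array([10, 20, 30])              # array: one type, integer position
df = pd.DataFrame({'name': ['Ann', 'Ben'], 'age': [31, 27]},
                  index=['r1', 'r2'])     # DataFrame: a type per column

print(items[0])                 # 'a'
print(arr[0])                   # 10
print(df.iloc[0, 1])            # 31, chosen by integer position
print(df.loc['r2', 'age'])      # 27, chosen by row and column label
```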
Transformation
[Diagram: conversions between List, Array and Series, with arrows labelled pd.DataFrame(df) and np.array(df)]
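One possible round trip between the structures is sketched below. The column name x and the variable names are assumptions made for the example, and the route shown is only one of several ways to convert.

```python
import numpy as np
import pandas as pd

values = [1, 2, 3]                       # list

arr = np.array(values)                   # list -> array
ser = pd.Series(arr)                     # array -> Series
df = pd.DataFrame({'x': ser})            # Series -> DataFrame

arr_back = np.array(df)                  # DataFrame -> 2-D array
list_back = arr_back.ravel().tolist()    # array -> list

print(arr_back.shape)    # (3, 1)
print(list_back)         # [1, 2, 3]
```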
How To Choose the Data Structure
▪ Lists
• A list is a handy and flexible Python solution for dealing with a small amount of data, and it is easy to create one quickly in Python code.
• Lists are naturally suited to a dynamic sequence of data: you can create a list and simply append each new value to it.
• In addition, a list allows a mixture of data types, which is useful when the upcoming data types are not known in advance.
▪ NumPy arrays
• NumPy arrays are designed for performance. They are optimized for fast scientific computation and come with built-in mathematical functions and array operations.
• The trade-off is a loss of flexibility in dealing with dynamic data sequences and mixed data types.
• Generally, a NumPy array is a good choice for a large amount of data or for high-dimensional data.
▪ Pandas DataFrames
• Pandas DataFrames are built on NumPy arrays and inherit their capabilities for high-performance mathematical computation and array operations.
• Similar to lists, a Pandas DataFrame allows mixed data types.
• For tabular data with a row index and a column index, a Pandas DataFrame is a good choice, as it allows flexible access to values using integer position or index.
Functions in Python
Defining Functions
▪ A function definition begins with the keyword “def”, followed by the function name, the function argument(s) in parentheses, and a colon.
def to_do_something(X, y, …):
    line1
    line2

    return the_result
▪ The keyword ‘return’ indicates the value to be sent back to the caller.
▪ The indentation matters: the first line with less indentation is considered to be outside of the function definition.
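A runnable sketch of these rules follows. The function rectangle_area is a hypothetical example, not part of the lecture material.

```python
def rectangle_area(width, height):
    # These indented lines form the function body
    area = width * height
    return area            # value sent back to the caller

# This line is less indented, so it is outside the function definition
result = rectangle_area(3, 4)
print(result)              # 12
```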
Calling a Function
The syntax for a function call is:
>>> def myfun(x, y):
...     z = x * y
...     return z
...
>>> myfun(3, 4)
12
For Loops in Python
Basic programming skills for unstructured data manipulation
▪ List creation and looping are essential requirements for preliminary text pre-processing.
▪ List comprehension is a Pythonic feature and is usually more readable than the equivalent for loop or lambda function.
List comprehension:
[i**2 for i in range(2, 10)]
For loop:
sqr = []
for i in range(2, 10):
    sqr.append(i**2)
sqr
Lambda + map:
list(map(lambda i: i**2, range(2, 10)))
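To connect the three forms to text pre-processing, the sketch below lower-cases the words of a sentence in each style. The sentence and the variable names are invented for illustration.

```python
review = "The Service was GREAT and the staff were friendly"

# For loop: build the list of lower-cased words step by step
tokens_loop = []
for word in review.split():
    tokens_loop.append(word.lower())

# List comprehension: the same result in one line
tokens = [word.lower() for word in review.split()]

# Lambda + map: again the same result
tokens_map = list(map(lambda word: word.lower(), review.split()))

print(tokens)
print(tokens == tokens_loop == tokens_map)   # True
```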
for loop
[Flowchart: take the next item from the sequence; execute the statement(s); repeat until there are no more items in the sequence, then exit the loop]
for iterating_var in sequence:
    statement(s)
• Leave an indent before each statement in the loop body
• Put ‘:’ immediately after the sequence expression
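The steps of the flowchart can be written out by hand with the built-in iter() and next() functions. This is only a sketch of what a for loop does, with illustrative names, and not something you would normally write.

```python
sequence = ['a', 'b', 'c']

# Standard for loop
visited = []
for item in sequence:
    visited.append(item)

# The same steps written out by hand, following the flowchart
manual = []
iterator = iter(sequence)
while True:
    try:
        item = next(iterator)     # next item from the sequence
    except StopIteration:         # no more items: leave the loop
        break
    manual.append(item)           # execute the statement(s)

print(visited == manual)          # True
```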
For loop: example
▪ for loop over a range of numbers, accumulating a running total:
total = 0
for i in range(100):
    total += i              # add each number from 0 to 99 to the total
print(total)
▪ for loop over a range of numbers with the append command:
squares = []
for x in range(100):
    squares.append(x**2)    # collect the square of each number
▪ for loop over a phrase:
for letter in 'Advanced Analytics':
    print('current letter :', letter)
List comprehension
▪ General form: [expression for var in iterable]
▪ The expression is evaluated once for each item in the iterable.
▪ Var takes items from the iterable one by one.
▪ The iterable is a collection of objects (such as a list or a tuple).
▪ A for loop that creates a list (list comprehension):
squared = [x**2 for x in range(10)]
▪ Two for loops (a nested loop) that combine the lists of letters and numbers (list comprehension):
list1 = ["a", "b", "c", "d"]
list2 = ["1", "2", "3", "4"]
list3 = [x + y for x in list1 for y in list2]
▪ List comprehensions provide a simple way to create a list based on some iterable. During the creation, elements from the iterable can be conditionally included in the new list and transformed as needed.
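Conditional inclusion is done with an if clause at the end of the comprehension. The sketch below also uses a smaller nested example so that the order of the results is easy to see; the variable names are illustrative.

```python
# Keep only the even numbers and square them
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(even_squares)        # [0, 4, 16, 36, 64]

# Nested loops: the first 'for' is the outer loop
letters = ["a", "b"]
digits = ["1", "2"]
pairs = [x + y for x in letters for y in digits]
print(pairs)               # ['a1', 'a2', 'b1', 'b2']
```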
Introduction to Lambda function
▪ Python’s lambda creates anonymous functions.
▪ A lambda function always begins with the lambda keyword.
▪ A lambda can have multiple arguments, separated by commas.
▪ The arguments and the expression are separated by a colon.
▪ The expression is evaluated and its value is returned.
▪ Lambda with two arguments:
raise_to_power = lambda x, y: x ** y
raise_to_power(2, 3)
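For comparison, the sketch below places the lambda next to an equivalent function written with def. The name raise_to_power_def is made up for the example.

```python
# A lambda and the equivalent named function
raise_to_power = lambda x, y: x ** y

def raise_to_power_def(x, y):
    return x ** y

print(raise_to_power(2, 3))          # 8
print(raise_to_power_def(2, 3))      # 8

# A lambda can also be called straight away, without a name
print((lambda x, y: x ** y)(2, 3))   # 8
```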
Introduction to map function
▪ Python’s map() function applies the specified function to each item of one or more iterables:
[Diagram: map applies f to the list [x, y, z] and gives the modified list [f(x), f(y), f(z)]]
▪ Lambda with one argument:
numbers = [1, 2, 3, 4, 5]
list(map(lambda x: x * x, numbers))
▪ Lambda with multiple arguments:
numbers_1 = [1, 2, 3]
numbers_2 = [5, 6, 7]
list(map(lambda x, y: x * y, numbers_1, numbers_2))
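Two details of map() in Python 3 are worth seeing in code: it returns an iterator rather than a list, which is why the examples wrap it in list(), and with several iterables it stops at the shortest one. The sketch below uses illustrative values.

```python
numbers = [1, 2, 3, 4, 5]

squares = map(lambda x: x * x, numbers)
print(type(squares).__name__)    # map: an iterator, not yet a list
squares_list = list(squares)     # list() collects the results
print(squares_list)              # [1, 4, 9, 16, 25]

# With several iterables, map stops at the shortest one
products = list(map(lambda x, y: x * y, [1, 2, 3], [5, 6, 7, 8]))
print(products)                  # [5, 12, 21]
```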
Questions
Thank you
Dr Morteza Namvar | Lecturer
Business School
m.namvar@business.uq.edu.au
facebook.com/uniofqld
Instagram.com/uniofqld
https://twitter.com/namvar_morteza
https://www.linkedin.com/in/mnamvar/
