Notes Chapter 11 Python Pandas II Dataframes And Other Operations

Introduction

  • In the last chapter, we covered pandas, its data structures, and the Series object. A Series cannot hold the 2D or multidimensional data that real-world problems often involve.
  • For such tasks, pandas provides other data structures such as DataFrames and Panels.
  • DataFrame objects of pandas can store 2D heterogeneous data.
  • Panel objects of pandas can store 3D heterogeneous data. (Panel has since been removed from pandas, so it is available only in older versions.)
  • In this chapter, we will discuss them.

DataFrame Data Structure

  • A DataFrame is a pandas data structure that stores data in 2D form.
  • It is a two-dimensional labeled array: an ordered collection of columns, where each column can store a different kind of data.
  • A 2D array is a collection of rows and columns in which every row and every column has a definite index starting from 0.
  • In the given diagram, there are 5 rows and 5 columns, so both the row index and the column index run from 0 to 4.
  • Each cell has an address such as A[2][1] or A[1][4], as shown in the diagram.

Characteristics of DataFrame

Characteristics of a DataFrame are as follows-

  • It has two indexes, or two axes.
  • It is somewhat like a spreadsheet, where the row index is called the index and the column index is called the column name.
  • Index labels can be numbers, letters or strings.
  • Columns can hold any kind of data.
  • Its values are mutable and can be changed at any time.
  • The size of a DataFrame is also mutable, i.e. the number of rows and columns can be increased or decreased at any time.

Creation and presentation of DataFrame

  • A DataFrame object can be created by passing data in 2D format.
  • import pandas as pd
  • <dataFrameObject> = pd.DataFrame(<a 2D data structure>, [columns=<column sequence>], [index=<index sequence>])
  • You can create a DataFrame in various ways, depending on how the data values are passed, for example from:
  • 2D dictionaries
  • 2D ndarrays
  • Series type objects
  • Another DataFrame object

Creation of DataFrame from 2D Dictionary
A. Creation of DataFrame from a dictionary of lists or ndarrays-
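
A minimal sketch of this case, using made-up names and marks (the data and the row labels r1 to r3 are assumptions for illustration). Each dictionary key becomes a column label and each list supplies that column's values.

```python
import pandas as pd

# Hypothetical sample data: keys become column labels,
# lists (all of the same length) become column values.
data = {
    "Name": ["Asha", "Ravi", "Meena"],
    "Marks": [82, 67, 91],
}

df = pd.DataFrame(data)                                     # default row index 0, 1, 2
df_labeled = pd.DataFrame(data, index=["r1", "r2", "r3"])   # custom row labels

print(df)
print(df_labeled)
```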

B. Creation of DataFrame from a dictionary of dictionaries-
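
A small sketch of this case with invented sales figures (the keys and numbers are assumptions). The outer keys become column labels, the inner keys become row labels, and a row label missing from one inner dictionary shows up as NaN.

```python
import pandas as pd

# Hypothetical data: outer keys -> columns, inner keys -> row labels.
sales = {
    "2019": {"Q1": 100, "Q2": 120},
    "2020": {"Q1": 90, "Q3": 150},
}

df = pd.DataFrame(sales)
print(df)   # cells with no value in the source dictionaries are NaN
```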

Creation of Dataframe from 2D ndarray
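
As an illustration, a 2D NumPy array can be passed directly to pd.DataFrame. The array values and the labels below are assumptions chosen for the sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical 2 x 3 array; each inner list becomes one row.
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

df = pd.DataFrame(arr, columns=["a", "b", "c"], index=["x", "y"])
print(df)
```

If columns and index are left out, pandas numbers both axes from 0.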

Creation of DataFrame from a 2D Dictionary of Series Objects
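
A sketch with two invented Series that share the same index (names and values are assumptions). The dictionary keys become column labels and the shared Series index becomes the row index.

```python
import pandas as pd

# Two hypothetical Series with the same index labels.
population = pd.Series([10, 20, 30], index=["A", "B", "C"])
area = pd.Series([1.5, 2.5, 3.5], index=["A", "B", "C"])

df = pd.DataFrame({"Population": population, "Area": area})
print(df)
```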

Creation of DataFrame from object of other DataFrame
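
One possible illustration, with made-up values: an existing DataFrame can be passed to pd.DataFrame, and copy() is a common way to get a frame that can be changed without touching the original.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2], "y": [3, 4]})   # hypothetical source frame

df2 = pd.DataFrame(df1)   # new DataFrame object built from df1
df3 = df1.copy()          # independent copy of the data

df3.loc[0, "x"] = 99      # changing the copy leaves df1 as it was
print(df1)
print(df3)
```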

Displaying DataFrame Object
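
A brief sketch, assuming a small invented DataFrame: print() shows the table with its index, and to_string() returns the same table as text.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi"], "Marks": [82, 67]})  # sample data

print(df)                          # table with the row index on the left
text = df.to_string(index=False)   # same table as a string, without the index
print(text)
```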

DataFrame Attributes

  • When a DataFrame object is created, information about it such as its size and data types can be accessed through attributes.
  • <DataFrame object>.<attribute name>
  • Some attributes are –
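
A few commonly used attributes, shown on a small invented DataFrame (the data is an assumption; the attribute names are standard pandas ones):

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi"], "Marks": [82, 67]})  # sample data

print(df.index)     # row labels
print(df.columns)   # column labels
print(df.dtypes)    # data type of each column
print(df.shape)     # (rows, columns)
print(df.size)      # total number of cells
print(df.ndim)      # number of axes
print(df.empty)     # True only if the DataFrame has no elements
print(df.T)         # transpose: rows and columns swapped
```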

Selecting and Accessing from DataFrame
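
A sketch of the usual ways to pick out a column or a row, using invented data and row labels:

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 67, 91]},
    index=["r1", "r2", "r3"],          # hypothetical row labels
)

names = df["Name"]              # one column -> Series
both = df[["Name", "Marks"]]    # list of columns -> DataFrame
row = df.loc["r2"]              # one row by label -> Series
first = df.iloc[0]              # one row by position -> Series

print(names, both, row, first, sep="\n\n")
```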

Selection of subset from DataFrame
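
A subset can be taken by label, by position, or by a condition. The sketch below uses invented data; note that label slices with loc include both end points, while position slices with iloc exclude the end.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"],
     "Marks": [82, 67, 91],
     "City": ["Pune", "Delhi", "Kochi"]},
    index=["r1", "r2", "r3"],          # hypothetical row labels
)

by_label = df.loc["r1":"r2", "Name":"Marks"]   # rows r1..r2, columns Name..Marks
by_pos = df.iloc[0:2, 0:2]                     # first two rows, first two columns
high = df[df["Marks"] > 80]                    # rows that satisfy a condition

print(by_label, by_pos, high, sep="\n\n")
```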

Accessing and modifying values in DataFrame

a) Syntax to add or change a column-
<DFObject>[<ColName>] = <new value>
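
For example, on a small invented DataFrame, assigning to a column label adds the column if it does not exist and overwrites it if it does:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi"], "Marks": [82, 67]})  # sample data

df["Grade"] = ["A", "B"]   # new column, one value per row
df["Marks"] = [85, 70]     # existing column is overwritten
df["School"] = "KV"        # a single value is repeated for every row

print(df)
```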

b) Syntax to add or change a row-
<DFObject>.at[<RowName>, :] = <new value>
<DFObject>.loc[<RowName>, :] = <new value>
(.at is designed for single cells, so .loc is the safer choice for a whole row.)
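
A sketch of the loc form on invented data: an existing row label changes that row, and a new row label adds a row.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi"], "Marks": [82, 67]},
    index=["r1", "r2"],                # hypothetical row labels
)

df.loc["r2", :] = ["Ravi", 70]     # r2 exists: the row is changed
df.loc["r3", :] = ["Meena", 91]    # r3 is new: a row is added

print(df)
```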

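c) Syntax to change a single value-
<DFObject>.<ColName>[<RowName/Label>] = <new value> or
<DFObject>.at[<RowName/Label>, <ColName>] = <new value>
(The first, chained form may not update the DataFrame in recent pandas versions that use copy-on-write, so the .at form is preferred.)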
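
A short sketch on invented data, using at and loc to read and change one cell:

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi"], "Marks": [82, 67]},
    index=["r1", "r2"],                # hypothetical row labels
)

df.at["r1", "Marks"] = 90     # change one cell by row label and column name
df.loc["r2", "Marks"] = 75    # loc also accepts a single cell
value = df.at["r1", "Marks"]  # read one cell

print(df)
print(value)
```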

d) Syntax for column deletion-
del <DFObject>[<ColName>] or
<DFObject>.drop([<Col1Name>, <Col2Name>, ...], axis=1)
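
The two forms behave differently, as this sketch on invented data shows: del changes the DataFrame in place, while drop returns a new DataFrame and leaves the original alone unless the result is assigned back.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})  # sample data

del df["c"]                         # removes column c from df itself
trimmed = df.drop(["a"], axis=1)    # new DataFrame without a; df still has a

print(df)
print(trimmed)
```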

Iteration in DataFrame

  • Sometimes we need to iterate over a complete DataFrame, and writing code to access each value separately would be tedious. In such cases, we can use the iteration methods of the DataFrame:
  • <DFObject>.iterrows() yields the DataFrame row by row, as (row label, row Series) pairs.
  • <DFObject>.iteritems() yields the DataFrame column by column, as (column label, column Series) pairs. This method was removed in pandas 2.0; use <DFObject>.items() in current versions.

Use of the iterrows() function
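
A minimal sketch with invented data. Each pass of the loop receives the row label and the row as a Series whose index is the column names.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi"], "Marks": [82, 67]},
    index=["r1", "r2"],                # hypothetical row labels
)

seen = []
for row_label, row in df.iterrows():
    print(row_label, row["Name"], row["Marks"])
    seen.append((row_label, row["Name"]))
```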

Use of the iteritems() function (items() in current pandas)
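
A minimal sketch with invented data, written with items() so that it also runs on current pandas. Each pass of the loop receives the column label and the column as a Series.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi"], "Marks": [82, 67]})  # sample data

labels = []
for col_label, column in df.items():
    print(col_label, list(column))
    labels.append(col_label)
```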

Program for iteration

  • Write a program that iterates over a DataFrame containing names and marks, calculates a grade for each student as per the guideline below, and adds it to a Grade column.
    Marks >= 90     Grade A+
    Marks 70 – 90   Grade A
    Marks 60 – 70   Grade B
    Marks 50 – 60   Grade C
    Marks 40 – 50   Grade D
    Marks < 40      Grade F
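
One possible solution is sketched below. The names and marks are invented, the helper grade_for is hypothetical, and the guideline's ranges overlap at their boundaries, so this sketch assumes each range includes its lower limit and excludes its upper limit (for example, 70 <= marks < 90 gives grade A).

```python
import pandas as pd

def grade_for(marks):
    """Return the grade for one marks value (lower limit included)."""
    if marks >= 90:
        return "A+"
    if marks >= 70:
        return "A"
    if marks >= 60:
        return "B"
    if marks >= 50:
        return "C"
    if marks >= 40:
        return "D"
    return "F"

# Hypothetical sample data.
df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "Kiran", "Zoya", "Dev"],
    "Marks": [95, 72, 65, 55, 45, 30],
})

grades = []
for _, row in df.iterrows():          # visit the DataFrame row by row
    grades.append(grade_for(row["Marks"]))

df["Grade"] = grades                  # add the results as a new column
print(df)
```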

Binary Operations in a DataFrame
It is possible to perform addition, subtraction, multiplication and division operations on DataFrames.
To Add – ( +, add or radd )
To Subtract – ( -, sub or rsub )
To Multiply – ( *, mul or rmul )
To Divide – ( /, div or rdiv )
We will perform these operations on the following DataFrames –

Addition
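
A sketch of addition on small invented DataFrames (df1, df2 and df3 are assumptions, not the chapter's sample frames). Values are matched by row and column label, and a cell present in only one frame gives NaN unless fill_value is supplied.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [10, 20], "B": [30, 40]})
df3 = pd.DataFrame({"A": [100]})           # smaller frame: only row 0, column A

total = df1 + df2                          # same as df1.add(df2)
reverse = df1.radd(df2)                    # computes df2 + df1
partial = df1 + df3                        # unmatched cells become NaN
padded = df1.add(df3, fill_value=0)        # missing cells treated as 0

print(total, partial, padded, sep="\n\n")
```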

Subtraction
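
A sketch of subtraction on two invented DataFrames. sub() matches the - operator, and rsub() swaps the operands.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})        # sample data
df2 = pd.DataFrame({"A": [10, 20], "B": [30, 40]})    # sample data

diff = df1 - df2            # same as df1.sub(df2)
swapped = df1.rsub(df2)     # computes df2 - df1

print(diff)
print(swapped)
```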

Multiplication
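
A sketch of multiplication on two invented DataFrames. The operation is element by element (it is not matrix multiplication), and a single number is applied to every cell.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})        # sample data
df2 = pd.DataFrame({"A": [10, 20], "B": [30, 40]})    # sample data

prod = df1 * df2        # same as df1.mul(df2)
doubled = df1 * 2       # scalar applied to every cell

print(prod)
print(doubled)
```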

Division
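
A sketch of division on two invented DataFrames. The result holds floating-point values even when the inputs are integers, and rdiv() swaps the operands.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})        # sample data
df2 = pd.DataFrame({"A": [10, 20], "B": [30, 40]})    # sample data

quot = df2 / df1            # same as df2.div(df1)
swapped = df1.rdiv(df2)     # also computes df2 / df1

print(quot)
```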

Other important functions

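Other important functions of DataFrame are as under-
<DF>.info()  prints a concise summary: index range, column names, non-null counts, data types and memory usage.
<DF>.describe()  returns summary statistics (count, mean, std, min, quartiles, max) for the numeric columns.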
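
A short sketch on invented data. info() prints its summary directly, while describe() returns a new DataFrame that can be inspected like any other.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena"],     # sample data
    "Marks": [80, 60, 70],
})

df.info()                  # prints the summary; returns None
stats = df.describe()      # statistics for the numeric column(s) only
print(stats)
```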

Functions to view the first or last rows are as under-
<DF>.head([n=<n>])  returns the first n rows; here, the default value of n is 5.
<DF>.tail([n=<n>])  returns the last n rows; the default value of n is also 5.
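
For instance, on an invented ten-row DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"n": range(10)})   # sample data: rows 0..9

top = df.head()        # first 5 rows (default)
top3 = df.head(3)      # first 3 rows
last2 = df.tail(2)     # last 2 rows

print(top, top3, last2, sep="\n\n")
```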

Cumulative Calculations Functions

In DataFrame, the function for cumulative sum is as under-
<DF>.cumsum([axis=None])  here, the axis argument is optional; if it is omitted, the sum runs down each column.
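
A sketch on invented data showing both directions: the default accumulates down each column, and axis=1 accumulates across each row.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})   # sample data

down = df.cumsum()           # running total down each column
across = df.cumsum(axis=1)   # running total across each row

print(down)
print(across)
```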

Index of Maximum and Minimum Values
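
A sketch using the idxmax() and idxmin() methods on invented marks. For each column they return the row label at which the largest or smallest value occurs.

```python
import pandas as pd

df = pd.DataFrame(
    {"Maths": [70, 95, 60], "Science": [88, 72, 91]},
    index=["Asha", "Ravi", "Meena"],       # hypothetical row labels
)

best = df.idxmax()     # row label of the maximum in each column
worst = df.idxmin()    # row label of the minimum in each column

print(best)
print(worst)
```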

Handling of Missing Data

  • Values that carry no computational significance are called missing values. pandas usually represents them as NaN (or None).
  • Methods for handling missing values-
  • Dropping missing data, with <DF>.dropna()
  • Filling missing data (imputation), with <DF>.fillna(<value>)
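
Both approaches are sketched below on invented data. Filling with the column mean is only one possible imputation choice, shown here as an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],     # sample data with one gap per column
    "b": [4.0, 5.0, np.nan],
})

missing = df.isnull()              # True where a value is missing
dropped = df.dropna()              # keep only rows with no missing value
filled = df.fillna(0)              # replace every missing value with 0
imputed = df.fillna(df.mean())     # replace with each column's mean

print(missing, dropped, filled, imputed, sep="\n\n")
```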

Comparison of Pandas Objects
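
A sketch of two ways to compare DataFrames, on invented data. The == operator compares cell by cell and treats NaN as unequal to NaN, whereas equals() compares the whole objects and treats NaN values in the same position as equal.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, np.nan]})   # sample data
df2 = pd.DataFrame({"a": [1.0, np.nan]})   # identical sample data

mask = df1 == df2        # element-wise result; the NaN cell compares False
same = df1.equals(df2)   # single True/False for the whole DataFrame

print(mask)
print(same)
```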
