Data Selection and Filtering in Pandas
Pandas offers versatile methods to select and filter data from Series and DataFrames, enabling you to work efficiently with your datasets.
Selecting Rows and Columns
1. Selecting Columns
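Access columns using bracket notation or dot notation. Dot notation works only when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method.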
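As a minimal sketch of how the two notations can differ, the snippet below uses a hypothetical variant of the sample data with a column named count (an assumption made for illustration, not part of the original example). It also shows that passing a list of column names returns a DataFrame rather than a Series.

```python
import pandas as pd

# Hypothetical data: the 'count' column is an assumption chosen because
# it collides with the built-in DataFrame.count method.
df = pd.DataFrame({
    'Name': ['Anika', 'Rahul', 'Sneha'],
    'Age': [25, 30, 22],
    'count': [3, 1, 2],
})

single = df['Name']            # one column name -> Series
subset = df[['Name', 'Age']]   # list of column names -> DataFrame
counts = df['count']           # bracket notation returns the column

print(type(single).__name__)   # Series
print(type(subset).__name__)   # DataFrame
print(callable(df.count))      # True: dot notation resolved to the method
```

Bracket notation is the safer default; dot notation is a convenience for simple column names.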
import pandas as pd
# Sample DataFrame
data = {
    'Name': ['Anika', 'Rahul', 'Sneha'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Bangalore']
}
df = pd.DataFrame(data)
# Selecting a column
print(df['Name']) # Bracket notation
print(df.Name)  # Dot notation
Output (each print statement produces the same result):
0    Anika
1    Rahul
2    Sneha
Name: Name, dtype: object
2. Selecting Rows
Select rows by slicing, by integer position with .iloc, or by index label with .loc.
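The difference between position and label is easiest to see when the index is not the default 0, 1, 2. The sketch below assumes hypothetical string labels 'a', 'b', 'c' (not part of the original sample data) and shows that .iloc slices exclude the end position while .loc slices include the end label.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Anika', 'Rahul', 'Sneha'], 'Age': [25, 30, 22]},
    index=['a', 'b', 'c'],  # hypothetical labels, chosen for illustration
)

first_by_position = df.iloc[0:1]   # positions: end (1) is excluded -> 1 row
first_by_label = df.loc['a':'b']   # labels: end ('b') is included -> 2 rows
rahul = df.loc['b']                # single label -> Series for that row
rahul_age = df.loc['b', 'Age']     # row label + column label -> scalar

print(len(first_by_position), len(first_by_label))  # 1 2
print(rahul_age)                                    # 30
```

With the default integer index used in the examples that follow, positions and labels happen to coincide, which is why .iloc[1] and .loc[1] would return the same row there.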
Using Slicing
# Select rows by slicing
print(df[0:2])
Output:
    Name  Age    City
0  Anika   25   Delhi
1  Rahul   30  Mumbai
Using .iloc (Position-Based)
# Select rows by position
print(df.iloc[1])
Output:
Name     Rahul
Age         30
City    Mumbai
Name: 1, dtype: object
Using .loc (Label-Based)
# Select rows by label
print(df.loc[0])
Output:
Name    Anika
Age        25
City    Delhi
Name: 0, dtype: object
Filtering Data with Conditions
Single Condition
# Filter rows where Age > 25
filtered = df[df['Age'] > 25]
print(filtered)
Output:
    Name  Age    City
1  Rahul   30  Mumbai
Multiple Conditions
Use & for AND and | for OR. Enclose each condition in parentheses, because & and | bind more tightly than comparison operators such as > and ==.
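The same pattern extends to OR and to negation with ~. The sketch below reuses the sample data (variable names are illustrative) and also shows that Python's and keyword is not a substitute for &, since it raises a ValueError when applied to a Series.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anika', 'Rahul', 'Sneha'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Bangalore'],
})

# OR: keep rows that satisfy at least one condition
older_or_delhi = df[(df['Age'] > 25) | (df['City'] == 'Delhi')]

# NOT: ~ inverts a boolean mask
not_mumbai = df[~(df['City'] == 'Mumbai')]

# The `and` keyword does not work element-wise on a Series
try:
    df[(df['Age'] > 25) and (df['City'] == 'Mumbai')]
    keyword_and_failed = False
except ValueError:
    keyword_and_failed = True

print(older_or_delhi['Name'].tolist())  # ['Anika', 'Rahul']
print(not_mumbai['Name'].tolist())      # ['Anika', 'Sneha']
print(keyword_and_failed)               # True
```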
# Filter rows where Age > 25 and City is 'Mumbai'
filtered = df[(df['Age'] > 25) & (df['City'] == 'Mumbai')]
print(filtered)
Output:
    Name  Age    City
1  Rahul   30  Mumbai
Boolean Indexing
Boolean indexing selects rows with a Series of True/False values (a mask): rows where the mask is True are kept, and the rest are dropped.
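A mask can also be passed to .loc to pick columns at the same time, and because True counts as 1, summing a mask gives the number of matching rows. A minimal sketch using the sample data (variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anika', 'Rahul', 'Sneha'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Bangalore'],
})

mask = df['Age'] > 25

# Rows where the mask is True, restricted to the 'Name' column
matching_names = df.loc[mask, 'Name']

# True is treated as 1, so the sum is the number of matching rows
match_count = int(mask.sum())

print(matching_names.tolist())  # ['Rahul']
print(match_count)              # 1
```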
# Create a boolean mask
mask = df['Age'] > 25
print(mask)
# Use the mask to filter data
filtered = df[mask]
print(filtered)
Output:
0    False
1     True
2    False
Name: Age, dtype: bool
    Name  Age    City
1  Rahul   30  Mumbai
Try It Yourself
Problem 1: Select Specific Data
Given the following DataFrame:
import pandas as pd
data = {
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [80000, 30000, 20000],
    'Stock': [50, 150, 100]
}
df = pd.DataFrame(data)
- Select the Price column.
- Filter products with a price greater than 25000.
Solution:
# Select the Price column
print(df['Price'])
# Filter products with price > 25000
filtered = df[df['Price'] > 25000]
print(filtered)
Problem 2: Use Boolean Indexing
Given the same DataFrame:
- Create a mask for products with stock greater than 100.
- Use the mask to filter and display the result.
Solution:
# Create a mask
mask = df['Stock'] > 100
print(mask)
# Filter using the mask
filtered = df[mask]
print(filtered)