Data Structures in Pandas

Pandas is built around two primary data structures, Series and DataFrame. A third, Panel, existed in older releases but has since been removed. Each structure is designed for a specific shape of data, which keeps common tasks flexible and easy to express.


1. Series: One-Dimensional Data

A Series is a one-dimensional labeled array capable of holding data of any type (integer, float, string, etc.). The labels (known as the index) provide easy access to data.

Creating a Series

    import pandas as pd

    # Create a Series from a list
    data = [10, 20, 30, 40]
    series = pd.Series(data)
    print(series)

Output:

    0    10
    1    20
    2    30
    3    40
    dtype: int64

Adding an Index

    # Create a Series with a custom index
    series = pd.Series(data, index=['a', 'b', 'c', 'd'])
    print(series)

Output:

    a    10
    b    20
    c    30
    d    40
    dtype: int64
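As a small sketch of how labels and values fit together, the example below builds the same Series from a dictionary (the keys become the index) and reads the two parts back separately. The variable names marks, labels, and values are illustrative only.

```python
import pandas as pd

# Dictionary keys become the index labels, values become the data
marks = pd.Series({'a': 10, 'b': 20, 'c': 30, 'd': 40})

labels = list(marks.index)   # the index labels, in insertion order
values = marks.to_numpy()    # the underlying data as a NumPy array

print(labels)
print(values)
```

Keeping the index separate from the values is what allows lookups such as marks['b'] to work by label rather than by position.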

2. DataFrame: Two-Dimensional Data

A DataFrame is a two-dimensional table-like data structure with labeled axes (rows and columns). It is the most commonly used structure in Pandas.

Creating a DataFrame

    # Create a DataFrame from a dictionary
    data = {
        'Name': ['Anika', 'Rahul', 'Sneha'],
        'Age': [25, 30, 22],
        'City': ['Delhi', 'Mumbai', 'Bangalore']
    }
    df = pd.DataFrame(data)
    print(df)

Output:

        Name  Age       City
    0  Anika   25      Delhi
    1  Rahul   30     Mumbai
    2  Sneha   22  Bangalore
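To make the "labeled axes" idea concrete, the following sketch inspects the same DataFrame's shape and its row and column labels. It also shows that a single column is itself a Series. The variable name age is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anika', 'Rahul', 'Sneha'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Bangalore'],
})

print(df.shape)           # (number of rows, number of columns)
print(list(df.columns))   # column labels
print(list(df.index))     # row labels (default integer index)

# Each column of a DataFrame is a Series sharing the same row index
age = df['Age']
print(type(age).__name__)
```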

3. Panel: Three-Dimensional Data (Removed)

A Panel was a three-dimensional data structure in Pandas. It was deprecated in version 0.20.0 and removed entirely in version 0.25.0, so it is not available in current releases. Multi-dimensional data is now handled with hierarchical indexing (MultiIndex) or with libraries such as NumPy and xarray.

Alternative: Using MultiIndex DataFrames

    # Multi-dimensional data using MultiIndex columns (subject, term)
    data = {
        ('Math', 'Term1'): [90, 85, 80],
        ('Math', 'Term2'): [88, 89, 84],
        ('Science', 'Term1'): [92, 87, 85],
        ('Science', 'Term2'): [90, 91, 86]
    }
    df = pd.DataFrame(data, index=['Anika', 'Rahul', 'Sneha'])
    print(df)

Output:

           Math       Science
          Term1 Term2   Term1 Term2
    Anika    90    88      92    90
    Rahul    85    89      87    91
    Sneha    80    84      85    86
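Once the columns form a MultiIndex, each level can be selected on its own. The sketch below shows a few common selections on the same data. The variable names (math, math_t1, term1, rahul_sci_t2) are illustrative, and xs is only one of several ways to take a cross-section.

```python
import pandas as pd

data = {
    ('Math', 'Term1'): [90, 85, 80],
    ('Math', 'Term2'): [88, 89, 84],
    ('Science', 'Term1'): [92, 87, 85],
    ('Science', 'Term2'): [90, 91, 86],
}
df = pd.DataFrame(data, index=['Anika', 'Rahul', 'Sneha'])

# Select every column under the top-level label 'Math'
math = df['Math']

# Select one column with a full (subject, term) tuple
math_t1 = df[('Math', 'Term1')]

# Cross-section: 'Term1' from the second column level, across all subjects
term1 = df.xs('Term1', axis=1, level=1)

# A single cell: row label plus column tuple
rahul_sci_t2 = df.loc['Rahul', ('Science', 'Term2')]

print(math)
print(term1)
print(rahul_sci_t2)
```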

Indexing and Slicing in Pandas

Pandas provides robust indexing and slicing capabilities for Series and DataFrames.

1. Indexing in a Series

    # Access an element by its index label
    series = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
    print(series['b'])  # Output: 20

2. Slicing in a Series

    # Slice elements by index label (both endpoints are included)
    print(series['b':'d'])

Output:

    b    20
    c    30
    d    40
    dtype: int64
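Label-based slices include the end label, while position-based slices follow the usual Python rule and exclude the end position. The sketch below contrasts the two on the same Series. The names by_label and by_position are illustrative.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Label slice: the end label 'd' is included
by_label = series.loc['b':'d']

# Position slice: the end position 3 is excluded
by_position = series.iloc[1:3]

print(by_label.tolist())     # [20, 30, 40]
print(by_position.tolist())  # [20, 30]
```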

3. Indexing in a DataFrame

    # Access a column by name
    data = {'Name': ['Anika', 'Rahul'], 'Age': [25, 30]}
    df = pd.DataFrame(data)
    print(df['Name'])

Output:

    0    Anika
    1    Rahul
    Name: Name, dtype: object

4. Slicing Rows in a DataFrame

    # Slice rows by position (the end position is excluded)
    print(df[0:1])

Output:

        Name  Age
    0  Anika   25

5. Using .loc and .iloc

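  • .loc: Access by labels.
  • .iloc: Access by integer positions.

    # Using loc for label-based indexing
    print(df.loc[0])

    # Using iloc for position-based indexing
    print(df.iloc[0])

Both calls print the first row here, because the default index labels (0, 1) happen to match the row positions.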
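The difference between .loc and .iloc becomes visible once the index labels no longer match the positions. The sketch below assumes a hypothetical ID-style index (101, 102) that is not part of the original example.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Anika', 'Rahul'], 'Age': [25, 30]},
    index=[101, 102],  # hypothetical ID labels, chosen for illustration
)

# Position-based: the first row, whatever its label is
first_by_position = df.iloc[0]

# Label-based: the row whose index label is 102
row_by_label = df.loc[102]

# Label 0 does not exist in this index, so .loc raises KeyError
try:
    df.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

# Both accessors also take a column selector
age_by_label = df.loc[101, 'Age']   # row label, column label
age_by_position = df.iloc[0, 1]     # row position, column position

print(first_by_position['Name'], row_by_label['Name'], label_zero_found)
```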

Try It Yourself

Problem 1: Create and Index a Series

Create a Pandas Series for the marks of 3 students in Math (Anika: 90, Rahul: 85, Sneha: 88). Display the marks of Rahul.

Solution:

    import pandas as pd

    marks = pd.Series([90, 85, 88], index=['Anika', 'Rahul', 'Sneha'])
    print("Rahul's Marks:", marks['Rahul'])

Problem 2: Create and Slice a DataFrame

Create a DataFrame for 3 products with columns Product, Price, and Stock. Display the details of the second product.

Solution:

    import pandas as pd

    data = {
        'Product': ['Laptop', 'Phone', 'Tablet'],
        'Price': [80000, 30000, 20000],
        'Stock': [50, 150, 100]
    }
    df = pd.DataFrame(data)

    # The second product is at position 1
    print("Second Product Details:\n", df.iloc[1])
