Skip to Content
ModulesPandas TutorialMerging, Joining, and Concatenation

Merging, Joining, and Concatenation in Pandas

Pandas provides multiple methods to combine datasets, each suited to specific use cases. This page covers the concepts of concatenation, merging, and joining.


1. Concatenation

Concatenation combines multiple DataFrames or Series along a specific axis (rows or columns).

Example: Concatenating DataFrames

import pandas as pd # Sample DataFrames df1 = pd.DataFrame({ 'Name': ['Anika', 'Rahul'], 'Age': [25, 30] }) df2 = pd.DataFrame({ 'Name': ['Sneha', 'Amit'], 'Age': [22, 28] }) # Concatenate along rows (default axis=0) result = pd.concat([df1, df2]) print(result)

Output:

Name Age 0 Anika 25 1 Rahul 30 0 Sneha 22 1 Amit 28

Concatenate Along Columns

# Concatenate along columns (axis=1) result = pd.concat([df1, df2], axis=1) print(result)

Output:

Name Age Name Age 0 Anika 25 Sneha 22 1 Rahul 30 Amit 28

2. Merging

The merge() function combines DataFrames based on a common column or index. It is similar to SQL joins.

Example: Merging on a Common Column

# Sample DataFrames df1 = pd.DataFrame({ 'ID': [1, 2], 'Name': ['Anika', 'Rahul'] }) df2 = pd.DataFrame({ 'ID': [1, 2], 'Age': [25, 30] }) # Merge on 'ID' result = pd.merge(df1, df2, on='ID') print(result)

Output:

ID Name Age 0 1 Anika 25 1 2 Rahul 30

Types of Merges

  • Inner Join: Default. Includes rows with matching keys in both DataFrames.
  • Outer Join: Includes all rows, filling with NaN where necessary.
  • Left Join: Includes all rows from the left DataFrame.
  • Right Join: Includes all rows from the right DataFrame.
# Outer Join result = pd.merge(df1, df2, on='ID', how='outer') print(result)

Output:

ID Name Age 0 1 Anika 25.0 1 2 Rahul 30.0

3. Joining

The join() method is used to combine DataFrames based on their index.

Example: Joining on Index

# Sample DataFrames df1 = pd.DataFrame({ 'Age': [25, 30] }, index=['Anika', 'Rahul']) df2 = pd.DataFrame({ 'City': ['Delhi', 'Mumbai'] }, index=['Anika', 'Rahul']) # Join on index result = df1.join(df2) print(result)

Output:

Age City Anika 25 Delhi Rahul 30 Mumbai

Try It Yourself

Problem 1: Concatenate DataFrames

Given the following DataFrames:

import pandas as pd data1 = { 'Name': ['Sneha', 'Amit'], 'Score': [85, 90] } data2 = { 'Name': ['Anika', 'Rahul'], 'Score': [88, 92] } df1 = pd.DataFrame(data1) df2 = pd.DataFrame(data2)
  1. Concatenate df1 and df2 along rows.
  2. Concatenate them along columns.

Show Code

# Concatenate along rows result_rows = pd.concat([df1, df2]) print(result_rows) # Concatenate along columns result_columns = pd.concat([df1, df2], axis=1) print(result_columns)

Problem 2: Merge DataFrames

Given the following DataFrames:

left = pd.DataFrame({ 'ID': [1, 2, 3], 'Name': ['Anika', 'Rahul', 'Sneha'] }) right = pd.DataFrame({ 'ID': [2, 3, 4], 'Score': [90, 85, 88] })
  1. Perform an inner join.
  2. Perform an outer join.

Show Code

# Inner Join result_inner = pd.merge(left, right, on='ID', how='inner') print(result_inner) # Outer Join result_outer = pd.merge(left, right, on='ID', how='outer') print(result_outer)

Problem 3: Join DataFrames on Index

Given the following DataFrames:

left = pd.DataFrame({ 'Age': [25, 30] }, index=['Anika', 'Rahul']) right = pd.DataFrame({ 'City': ['Delhi', 'Mumbai'] }, index=['Anika', 'Rahul'])

Join the two DataFrames and display the result.

Show Code

result = left.join(right) print(result)

Pyground

Play with Python!

Output:

Last updated on