Histograms in Matplotlib
A histogram shows the frequency distribution of a dataset: values are grouped into intervals (bins), and each bar's height is the number of values that fall in that bin. This makes the shape, spread, and central tendency of the data easy to see.
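To make the idea of a frequency distribution concrete, here is a minimal sketch that counts values into bins by hand using only the standard library. The helper name `frequency_table` and the bin width of 20 are assumptions made for illustration; Matplotlib chooses its bin edges differently (from the minimum to the maximum of the data), so the counts will not necessarily match a plotted histogram.

```python
from collections import Counter

# Sample values reused from the basic example in this tutorial; the bin
# width of 20 is an arbitrary choice made for this illustration.
values = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]
BIN_WIDTH = 20


def frequency_table(values, bin_width):
    """Map each bin's lower edge to the number of values in [edge, edge + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


table = frequency_table(values, BIN_WIDTH)
for lower, count in table.items():
    # One '#' per value, so the printed rows form a sideways histogram
    print(f"{lower:>3}-{lower + BIN_WIDTH - 1:<3} {'#' * count}")
```

Each printed row is one bin, and the number of `#` characters is that bin's frequency, which is the quantity a histogram bar represents.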
Creating Histograms
The hist() function in Matplotlib is used to create histograms.
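Besides drawing the bars, `plt.hist()` also returns the numbers behind them. The sketch below, which assumes the same sample values used in the basic example, captures the three return values: the per-bin counts, the bin edges, and the bar objects. The variable names are arbitrary.

```python
import matplotlib.pyplot as plt

data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]

# hist() returns (counts per bin, bin edges, container of bar patches)
counts, edges, patches = plt.hist(data, bins=5, color='blue', edgecolor='black')

print(counts)        # → [2. 1. 4. 1. 5.]
print(edges)         # → [ 5. 23. 41. 59. 77. 95.]
print(len(patches))  # → 5

# Call plt.show() here to display the figure
```

With 5 bins there are 6 edges, and the counts add up to the number of data points.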
Example: Basic Histogram
import matplotlib.pyplot as plt
import numpy as np
# Data
data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]
# Create histogram
plt.hist(data, bins=5, color='blue', edgecolor='black')
# Add title and labels
plt.title("Basic Histogram")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
# Display the plot
plt.show()
Customizing Histograms
Matplotlib provides several parameters to customize histograms:
| Parameter | Description | Example Value |
|---|---|---|
| bins | Number of equal-width bins (integer) or a list of bin edges | 10, [0, 20, 40] |
| color | Color of the bars | 'blue', 'green' |
| edgecolor | Color of the edges of the bars | 'black' |
| alpha | Transparency level (0 = fully transparent, 1 = fully opaque) | 0.5 |
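The two forms of `bins` behave differently, which can be checked without plotting. The sketch below uses `numpy.histogram`, which applies the same binning rules as `plt.hist()`; the small dataset is chosen for illustration.

```python
import numpy as np

data = [22, 87, 5, 42, 88, 30, 56]

# An integer asks for that many equal-width bins between min(data) and max(data)
counts_int, edges_int = np.histogram(data, bins=4)
print(edges_int)   # → [ 5.   25.75 46.5  67.25 88.  ]
print(counts_int)  # → [2 2 1 2]

# A list gives the bin edges explicitly, so the intervals are round numbers
counts_seq, edges_seq = np.histogram(data, bins=[0, 20, 40, 60, 80, 100])
print(counts_seq)  # → [1 2 2 0 2]

# Each bin includes its left edge and excludes its right edge,
# except the last bin, which includes both edges
counts_edge, _ = np.histogram([20, 40], bins=[0, 20, 40])
print(counts_edge)  # → [0 2]
```

Explicit edges are often the better choice when the intervals should be meaningful to the reader, such as score bands or age groups.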
Example: Customized Histogram
# Data
data = np.random.normal(50, 10, 1000)  # 1000 samples from a normal distribution (mean 50, std 10)
# Create customized histogram
plt.hist(data, bins=20, color='green', edgecolor='black', alpha=0.7)
# Add title and labels
plt.title("Customized Histogram")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
# Display the plot
plt.show()
Comparing Multiple Histograms
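When two histograms are drawn on the same axes with an integer `bins` value, each call derives its own bin edges from its own data range, so the bars of the two datasets may not line up. One possible approach, sketched below, is to compute a single set of edges from the combined data and pass it to both calls. The seeded random generator is an assumption added so the sketch is reproducible.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator (an assumption, used only for reproducibility)
rng = np.random.default_rng(0)
data1 = rng.normal(60, 10, 1000)
data2 = rng.normal(50, 15, 1000)

# One set of 20 equal-width bins spanning both datasets
shared_edges = np.histogram_bin_edges(np.concatenate([data1, data2]), bins=20)

# Both histograms now use identical intervals, so bar heights are directly comparable
counts1, _, _ = plt.hist(data1, bins=shared_edges, alpha=0.5, label="Dataset 1")
counts2, _, _ = plt.hist(data2, bins=shared_edges, alpha=0.5, label="Dataset 2")
plt.legend()

# Call plt.show() here to display the figure
```

Because the edges cover the full range of both datasets, every value is counted in each histogram.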
Example: Overlapping Histograms
# Data
data1 = np.random.normal(60, 10, 1000)
data2 = np.random.normal(50, 15, 1000)
# Create overlapping histograms
plt.hist(data1, bins=20, alpha=0.5, label="Dataset 1", color='blue', edgecolor='black')
plt.hist(data2, bins=20, alpha=0.5, label="Dataset 2", color='orange', edgecolor='black')
# Add title, labels, and legend
plt.title("Overlapping Histograms")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
plt.legend()
# Display the plot
plt.show()
Example: Side-by-Side Histograms
# Data
data1 = [22, 87, 5, 42, 88, 30, 56]
data2 = [32, 57, 15, 72, 48, 50, 66]
# Define bin edges
bins = [0, 20, 40, 60, 80, 100]
# Create side-by-side histograms
plt.hist([data1, data2], bins=bins, label=["Dataset 1", "Dataset 2"], color=['blue', 'green'], edgecolor='black')
# Add title, labels, and legend
plt.title("Side-by-Side Histograms")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
plt.legend()
# Display the plot
plt.show()
Practical Examples
Example 1: Student Test Scores
# Data
scores = [56, 78, 45, 89, 90, 65, 76, 88, 92, 55, 69, 80, 77]
# Create histogram
plt.hist(scores, bins=5, color='purple', edgecolor='black')
# Add title and labels
plt.title("Student Test Scores")
plt.xlabel("Score Range")
plt.ylabel("Frequency")
# Display the plot
plt.show()
Example 2: Monthly Rainfall Data
# Data
rainfall = [100, 120, 85, 90, 150, 130, 110, 140, 95, 105, 125, 115]
# Create histogram
plt.hist(rainfall, bins=6, color='cyan', edgecolor='black', alpha=0.6)
# Add title and labels
plt.title("Monthly Rainfall Distribution")
plt.xlabel("Rainfall (mm)")
plt.ylabel("Frequency")
# Display the plot
plt.show()
Try It Yourself
Problem 1: Analyze Heights of Students
Input the heights of students in your class and create a histogram to analyze the height distribution.
# Data
heights = [150, 160, 165, 170, 155, 180, 175, 165, 158, 162]
# Create histogram
plt.hist(heights, bins=5, color='orange', edgecolor='black')
# Add title and labels
plt.title("Height Distribution")
plt.xlabel("Height (cm)")
plt.ylabel("Frequency")
# Display the plot
plt.show()
Problem 2: Analyze Product Sales
Visualize the sales data of 10 products using a histogram. Group the data into 4 intervals.
# Data
sales = [200, 300, 400, 150, 250, 350, 450, 300, 220, 310]
# Create histogram
plt.hist(sales, bins=4, color='blue', edgecolor='black')
# Add title and labels
plt.title("Product Sales Distribution")
plt.xlabel("Sales (Units)")
plt.ylabel("Frequency")
# Display the plot
plt.show()
Histograms are essential for understanding data distribution. Experiment with different customization options to create insightful visualizations.