Histograms in Matplotlib

A histogram shows the frequency distribution of a dataset by grouping values into intervals (bins) and counting how many values fall into each one. Histograms are helpful for understanding the shape, spread, and central tendency of data.


Creating Histograms

The hist() function in Matplotlib creates a histogram from a sequence of values.
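Besides drawing the bars, hist() also returns the values it computed. The sketch below reuses the sample data from the basic example that follows and unpacks the three return values; the variable names are illustrative only.

```python
import matplotlib.pyplot as plt

# Same sample data as the basic example
data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]

# hist() returns the per-bin counts, the bin edges, and the drawn bar artists
counts, bin_edges, patches = plt.hist(data, bins=5, color='blue', edgecolor='black')

print(counts)     # how many values fell into each of the 5 bins
print(bin_edges)  # 6 edges are needed to delimit 5 bins

# Call plt.show() to display the figure
```

Inspecting the counts and edges is a quick way to check that the bars on screen match what you expect from the data.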

Example: Basic Histogram

import matplotlib.pyplot as plt
import numpy as np

# Data
data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]

# Create histogram
plt.hist(data, bins=5, color='blue', edgecolor='black')

# Add title and labels
plt.title("Basic Histogram")
plt.xlabel("Value Range")
plt.ylabel("Frequency")

# Display the plot
plt.show()

Customizing Histograms

Matplotlib provides several parameters to customize histograms:

Parameter    Description                               Example Value
bins         Number of bins, or a list of bin edges    10, [0, 20, 40]
color        Color of the bars                         'blue', 'green'
edgecolor    Color of the bar edges                    'black'
alpha        Transparency level (0 to 1)               0.5
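The bins parameter accepts either form shown in the table. As a minimal sketch, the code below draws the same data twice: once with an integer (equal-width bins between the minimum and maximum) and once with an explicit list of edges. The subplot layout and variable names are choices made for this illustration.

```python
import matplotlib.pyplot as plt

data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]

# Two axes side by side so both binning styles can be compared
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Integer: 4 equal-width bins spanning min(data) to max(data)
counts_equal, edges_equal, _ = ax_left.hist(
    data, bins=4, color='green', edgecolor='black', alpha=0.5)
ax_left.set_title("bins=4")

# List: each pair of neighbouring values defines one bin
custom_edges = [0, 20, 40, 60, 80, 100]
counts_custom, edges_custom, _ = ax_right.hist(
    data, bins=custom_edges, color='blue', edgecolor='black', alpha=0.5)
ax_right.set_title("bins=[0, 20, 40, 60, 80, 100]")

# Call plt.show() to display the figure
```

Explicit edges are useful when the intervals have a meaning of their own, such as grade bands or age groups.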

Example: Customized Histogram

import matplotlib.pyplot as plt
import numpy as np

# Data: 1000 random samples from a normal distribution (mean 50, standard deviation 10)
data = np.random.normal(50, 10, 1000)

# Create customized histogram
plt.hist(data, bins=20, color='green', edgecolor='black', alpha=0.7)

# Add title and labels
plt.title("Customized Histogram")
plt.xlabel("Value Range")
plt.ylabel("Frequency")

# Display the plot
plt.show()
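Two further hist() options that are not listed in the table above can also be worth trying: density=True rescales the bars so that their total area is 1, and histtype='step' draws only the outline. The following is a sketch under the assumption that a seeded random generator is acceptable for reproducibility; the seed value is arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the figure is reproducible (seed chosen arbitrarily)
rng = np.random.default_rng(seed=0)
data = rng.normal(50, 10, 1000)

# density=True normalizes bar heights; histtype='step' draws an unfilled outline
density, edges, _ = plt.hist(
    data, bins=20, density=True, histtype='step', color='green')

plt.title("Normalized Histogram")
plt.xlabel("Value Range")
plt.ylabel("Density")

# With density=True, bar height times bar width sums to 1
total_area = np.sum(density * np.diff(edges))
print(total_area)

# Call plt.show() to display the figure
```

Normalized histograms make it easier to compare datasets of different sizes, because the bar heights no longer depend on the number of samples.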

Comparing Multiple Histograms

Example: Overlapping Histograms

import matplotlib.pyplot as plt
import numpy as np

# Data: two normally distributed datasets with different means and spreads
data1 = np.random.normal(60, 10, 1000)
data2 = np.random.normal(50, 15, 1000)

# Create overlapping histograms (alpha keeps both visible where they overlap)
plt.hist(data1, bins=20, alpha=0.5, label="Dataset 1", color='blue', edgecolor='black')
plt.hist(data2, bins=20, alpha=0.5, label="Dataset 2", color='orange', edgecolor='black')

# Add title, labels, and legend
plt.title("Overlapping Histograms")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
plt.legend()

# Display the plot
plt.show()
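When two datasets are each drawn with bins=20, Matplotlib computes the bin edges separately for each one, so the bars of the two histograms generally do not line up. One possible refinement, sketched here, is to compute a single set of edges from the combined data with NumPy's histogram_bin_edges() and pass it to both calls. The seeded generator and variable names are assumptions made for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the figure is reproducible (seed chosen arbitrarily)
rng = np.random.default_rng(seed=0)
data1 = rng.normal(60, 10, 1000)
data2 = rng.normal(50, 15, 1000)

# One set of 20 equal-width bins covering both datasets
shared_edges = np.histogram_bin_edges(np.concatenate([data1, data2]), bins=20)

# Passing the same edges to both calls aligns the bars
_, edges1, _ = plt.hist(data1, bins=shared_edges, alpha=0.5,
                        label="Dataset 1", color='blue', edgecolor='black')
_, edges2, _ = plt.hist(data2, bins=shared_edges, alpha=0.5,
                        label="Dataset 2", color='orange', edgecolor='black')

plt.title("Overlapping Histograms with Shared Bins")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
plt.legend()

# Call plt.show() to display the figure
```

With shared edges, bar heights in the same position describe the same value range, which makes the comparison between the two datasets direct.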

Example: Side-by-Side Histograms

import matplotlib.pyplot as plt

# Data
data1 = [22, 87, 5, 42, 88, 30, 56]
data2 = [32, 57, 15, 72, 48, 50, 66]

# Define bin edges
bins = [0, 20, 40, 60, 80, 100]

# Create side-by-side histograms (a list of datasets is drawn as grouped bars)
plt.hist([data1, data2], bins=bins, label=["Dataset 1", "Dataset 2"],
         color=['blue', 'green'], edgecolor='black')

# Add title, labels, and legend
plt.title("Side-by-Side Histograms")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
plt.legend()

# Display the plot
plt.show()

Practical Examples

Example 1: Student Test Scores

import matplotlib.pyplot as plt

# Data
scores = [56, 78, 45, 89, 90, 65, 76, 88, 92, 55, 69, 80, 77]

# Create histogram
plt.hist(scores, bins=5, color='purple', edgecolor='black')

# Add title and labels
plt.title("Student Test Scores")
plt.xlabel("Score Range")
plt.ylabel("Frequency")

# Display the plot
plt.show()

Example 2: Monthly Rainfall Data

import matplotlib.pyplot as plt

# Data: monthly rainfall in mm
rainfall = [100, 120, 85, 90, 150, 130, 110, 140, 95, 105, 125, 115]

# Create histogram
plt.hist(rainfall, bins=6, color='cyan', edgecolor='black', alpha=0.6)

# Add title and labels
plt.title("Monthly Rainfall Distribution")
plt.xlabel("Rainfall (mm)")
plt.ylabel("Frequency")

# Display the plot
plt.show()

Try It Yourself

Problem 1: Analyze Heights of Students

Input the heights of students in your class and create a histogram to analyze the height distribution.

Solution:

import matplotlib.pyplot as plt

# Data: heights in cm
heights = [150, 160, 165, 170, 155, 180, 175, 165, 158, 162]

# Create histogram
plt.hist(heights, bins=5, color='orange', edgecolor='black')

# Add title and labels
plt.title("Height Distribution")
plt.xlabel("Height (cm)")
plt.ylabel("Frequency")

# Display the plot
plt.show()

Problem 2: Analyze Product Sales

Visualize the sales data of 10 products using a histogram. Group the data into 4 intervals.

Solution:

import matplotlib.pyplot as plt

# Data: units sold for each of the 10 products
sales = [200, 300, 400, 150, 250, 350, 450, 300, 220, 310]

# Create histogram with 4 intervals
plt.hist(sales, bins=4, color='blue', edgecolor='black')

# Add title and labels
plt.title("Product Sales Distribution")
plt.xlabel("Sales (Units)")
plt.ylabel("Frequency")

# Display the plot
plt.show()
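To see exactly which 4 intervals the sales data is grouped into, the counts can also be computed without plotting. This is a small sketch using NumPy's histogram() function, which follows the same equal-width binning rules; the loop and output format are illustrative.

```python
import numpy as np

sales = [200, 300, 400, 150, 250, 350, 450, 300, 220, 310]

# histogram() returns the counts and the bin edges without drawing anything
counts, edges = np.histogram(sales, bins=4)

# Print each interval with the number of products that fall into it
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count} products")
```

Each interval includes its left edge and excludes its right edge, except the last one, which includes both. That is why the two products with sales of 300 are counted in the third interval rather than the second.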

Histograms are essential for understanding data distribution. Experiment with different customization options to create insightful visualizations.

