Histograms in Matplotlib

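A histogram shows the frequency distribution of a dataset: values are grouped into intervals (bins), and the height of each bar is the number of values that fall in that bin. Histograms are helpful for understanding the shape, spread, and central tendency of data.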


Creating Histograms

The hist() function in Matplotlib is used to create histograms.

Example: Basic Histogram

    import matplotlib.pyplot as plt
    import numpy as np

    # Data
    data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]

    # Create histogram
    plt.hist(data, bins=5, color='blue', edgecolor='black')

    # Add title and labels
    plt.title("Basic Histogram")
    plt.xlabel("Value Range")
    plt.ylabel("Frequency")

    # Display the plot
    plt.show()
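Besides drawing the bars, plt.hist() returns the numbers behind them, which is useful for checking what a plot actually shows. The following is a minimal sketch that reuses the data from the basic example; the variable names counts, bin_edges, and patches are chosen here for illustration.

```python
import matplotlib.pyplot as plt

data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]

# hist() returns the bar heights, the bin edges, and the drawn bar objects
counts, bin_edges, patches = plt.hist(data, bins=5, color='blue', edgecolor='black')

# With bins=5, the range from min(data) to max(data) is split into 5 equal-width bins
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {int(count)} values")
```

For this data the bins are 18 units wide (5 to 23, 23 to 41, and so on up to 95). Add plt.show() at the end to display the figure as in the example above.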

Customizing Histograms

Matplotlib provides several parameters to customize histograms:

Parameter    Description                               Example Value
bins         Number of bins, or a list of bin edges    10, [0, 20, 40]
color        Color of the bars                         'blue', 'green'
edgecolor    Color of the edges of the bars            'black'
alpha        Transparency level (0 to 1)               0.5
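The two forms of the bins parameter behave differently, so a side-by-side comparison may help. This is a sketch under a few assumptions: it reuses the data from the basic example, and it uses plt.subplots() with the Axes methods hist() and set_title(), which the rest of this page does not use.

```python
import matplotlib.pyplot as plt

data = [22, 87, 5, 42, 88, 30, 56, 78, 95, 42, 67, 89, 42]

# Two plots next to each other (plt.subplots is an addition for this sketch)
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Integer: 5 equal-width bins between min(data) and max(data)
counts_auto, edges_auto, _ = ax_left.hist(data, bins=5, color='blue', edgecolor='black')
ax_left.set_title("bins=5")

# List: explicit bin edges, here 20-unit intervals from 0 to 100
edges = [0, 20, 40, 60, 80, 100]
counts_fixed, edges_fixed, _ = ax_right.hist(
    data, bins=edges, color='green', edgecolor='black', alpha=0.5
)
ax_right.set_title("bins=[0, 20, 40, 60, 80, 100]")
```

An integer lets Matplotlib derive the edges from the data, while a list pins the edges to values you choose, which tends to give more readable axis intervals. Call plt.show() afterwards to display the figure.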

Example: Customized Histogram

    import matplotlib.pyplot as plt
    import numpy as np

    # Data: 1000 random samples from a normal distribution (mean 50, standard deviation 10)
    data = np.random.normal(50, 10, 1000)

    # Create customized histogram
    plt.hist(data, bins=20, color='green', edgecolor='black', alpha=0.7)

    # Add title and labels
    plt.title("Customized Histogram")
    plt.xlabel("Value Range")
    plt.ylabel("Frequency")

    # Display the plot
    plt.show()

Comparing Multiple Histograms

Example: Overlapping Histograms

    import matplotlib.pyplot as plt
    import numpy as np

    # Data: two normally distributed datasets with different means and spreads
    data1 = np.random.normal(60, 10, 1000)
    data2 = np.random.normal(50, 15, 1000)

    # Create overlapping histograms (alpha=0.5 keeps both visible where they overlap)
    plt.hist(data1, bins=20, alpha=0.5, label="Dataset 1", color='blue', edgecolor='black')
    plt.hist(data2, bins=20, alpha=0.5, label="Dataset 2", color='orange', edgecolor='black')

    # Add title, labels, and legend
    plt.title("Overlapping Histograms")
    plt.xlabel("Value Range")
    plt.ylabel("Frequency")
    plt.legend()

    # Display the plot
    plt.show()
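In the overlapping example, each hist() call with bins=20 derives its bin edges from its own dataset, so the two sets of bars generally have different widths and positions. One possible way to make the bars directly comparable is to compute a single set of edges and pass it to both calls. The sketch below assumes np.histogram_bin_edges() for that step and a seeded random generator for reproducibility; neither appears in the original example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the figure is reproducible (an assumption of this sketch)
rng = np.random.default_rng(0)
data1 = rng.normal(60, 10, 1000)
data2 = rng.normal(50, 15, 1000)

# 20 equal-width bins spanning the combined range of both datasets
shared_edges = np.histogram_bin_edges(np.concatenate([data1, data2]), bins=20)

# Both histograms now use identical bins
counts1, _, _ = plt.hist(data1, bins=shared_edges, alpha=0.5,
                         label="Dataset 1", color='blue', edgecolor='black')
counts2, _, _ = plt.hist(data2, bins=shared_edges, alpha=0.5,
                         label="Dataset 2", color='orange', edgecolor='black')
plt.legend()
```

With shared edges, a taller bar at a given position means more values in exactly the same interval. Call plt.show() afterwards to display the figure.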

Example: Side-by-Side Histograms

    import matplotlib.pyplot as plt

    # Data
    data1 = [22, 87, 5, 42, 88, 30, 56]
    data2 = [32, 57, 15, 72, 48, 50, 66]

    # Define bin edges
    bins = [0, 20, 40, 60, 80, 100]

    # Create side-by-side histograms: passing a list of datasets draws grouped bars per bin
    plt.hist([data1, data2], bins=bins, label=["Dataset 1", "Dataset 2"],
             color=['blue', 'green'], edgecolor='black')

    # Add title, labels, and legend
    plt.title("Side-by-Side Histograms")
    plt.xlabel("Value Range")
    plt.ylabel("Frequency")
    plt.legend()

    # Display the plot
    plt.show()
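When several datasets are passed in one call, the counts returned by hist() contain one row per dataset, so the grouped bars can be read off as numbers. A minimal sketch using the side-by-side data above; wrapping the result in np.asarray() is a precaution added here, not something the original example does.

```python
import matplotlib.pyplot as plt
import numpy as np

data1 = [22, 87, 5, 42, 88, 30, 56]
data2 = [32, 57, 15, 72, 48, 50, 66]
bins = [0, 20, 40, 60, 80, 100]
labels = ["Dataset 1", "Dataset 2"]

counts, bin_edges, _ = plt.hist([data1, data2], bins=bins, label=labels,
                                color=['blue', 'green'], edgecolor='black')

# One row per dataset, one column per bin
counts = np.asarray(counts)

for label, row in zip(labels, counts):
    print(label, row.astype(int).tolist())
```

Each column corresponds to one interval from bins, so the two numbers in a column are the heights of the neighboring blue and green bars.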

Practical Examples

Example 1: Student Test Scores

    import matplotlib.pyplot as plt

    # Data
    scores = [56, 78, 45, 89, 90, 65, 76, 88, 92, 55, 69, 80, 77]

    # Create histogram
    plt.hist(scores, bins=5, color='purple', edgecolor='black')

    # Add title and labels
    plt.title("Student Test Scores")
    plt.xlabel("Score Range")
    plt.ylabel("Frequency")

    # Display the plot
    plt.show()

Example 2: Monthly Rainfall Data

    import matplotlib.pyplot as plt

    # Data: rainfall in mm, one value per month
    rainfall = [100, 120, 85, 90, 150, 130, 110, 140, 95, 105, 125, 115]

    # Create histogram
    plt.hist(rainfall, bins=6, color='cyan', edgecolor='black', alpha=0.6)

    # Add title and labels
    plt.title("Monthly Rainfall Distribution")
    plt.xlabel("Rainfall (mm)")
    plt.ylabel("Frequency")

    # Display the plot
    plt.show()

Try It Yourself

Problem 1: Analyze Heights of Students

Input the heights of students in your class and create a histogram to analyze the height distribution.

    import matplotlib.pyplot as plt

    # Data: heights in cm
    heights = [150, 160, 165, 170, 155, 180, 175, 165, 158, 162]

    # Create histogram
    plt.hist(heights, bins=5, color='orange', edgecolor='black')

    # Add title and labels
    plt.title("Height Distribution")
    plt.xlabel("Height (cm)")
    plt.ylabel("Frequency")

    # Display the plot
    plt.show()

Problem 2: Analyze Product Sales

Visualize the sales data of 10 products using a histogram. Group the data into 4 intervals.

    import matplotlib.pyplot as plt

    # Data: units sold for each of the 10 products
    sales = [200, 300, 400, 150, 250, 350, 450, 300, 220, 310]

    # Create histogram with 4 intervals
    plt.hist(sales, bins=4, color='blue', edgecolor='black')

    # Add title and labels
    plt.title("Product Sales Distribution")
    plt.xlabel("Sales (Units)")
    plt.ylabel("Frequency")

    # Display the plot
    plt.show()

Histograms are essential for understanding data distribution. Experiment with different customization options to create insightful visualizations.

