AtomLearn
Mathematics for Data Science

What is a Distribution?

Understanding how data is spread — the shape behind the numbers

Explanation

A distribution describes how values in a dataset are spread across possible outcomes.

Think of it as: "if I pick a random value from this dataset, how likely is it to be near X?"
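
The question above can be answered directly from data by counting. The sketch below is illustrative only: `fraction_near`, the `width` parameter, and the ten exam scores are hypothetical and not part of the lesson.

```python
import numpy as np

def fraction_near(values, x, width):
    """Share of values within +/- width of x (an empirical probability)."""
    values = np.asarray(values, dtype=float)
    return float(np.mean(np.abs(values - x) <= width))

# Hypothetical toy dataset: ten exam scores
scores = [52, 61, 67, 70, 71, 73, 74, 78, 85, 93]

print(fraction_near(scores, x=72, width=5))  # share of scores in [67, 77]
print(fraction_near(scores, x=95, width=5))  # share of scores in [90, 100]
```

Five of the ten scores fall in [67, 77] and only one falls in [90, 100], so a randomly picked score is much more likely to be near 72 than near 95. Repeating this count for every X traces out the shape of the distribution.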

Key shapes:

  • Symmetric / bell-shaped: values cluster around the center and thin out equally on both sides
  • Right-skewed (positive skew): the tail extends to the right; the mean is typically greater than the median (e.g. income)
  • Left-skewed (negative skew): the tail extends to the left; the mean is typically less than the median (e.g. exam scores where most students score high)
  • Uniform: all values are equally likely (e.g. a roll of a fair die)
  • Bimodal: two peaks (e.g. heights of a mixed male/female group)
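
Each of these shapes can be simulated with NumPy. The following is a minimal sketch rather than a canonical recipe: the seed, the sample size, and the parameters of each generator are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed so the numbers are reproducible
n = 6_000

samples = {
    "bell-shaped": rng.normal(loc=0, scale=1, size=n),
    "right-skewed": rng.exponential(scale=2, size=n),
    "left-skewed": -rng.exponential(scale=2, size=n),  # mirrored right skew
    "uniform": rng.integers(1, 7, size=n),  # fair six-sided die: 1..6
    "bimodal": np.concatenate([
        rng.normal(loc=-3, scale=1, size=n // 2),  # first peak near -3
        rng.normal(loc=3, scale=1, size=n // 2),   # second peak near +3
    ]),
}

# Print a coarse six-bin histogram (counts per bin) for each shape
for name, values in samples.items():
    counts, _ = np.histogram(values, bins=6)
    print(f"{name:>12}: {counts}")
```

Reading the counts from left to right gives a rough text version of each histogram: the uniform row is nearly flat, the skewed rows pile up at one end, and the bimodal row dips in the middle.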

Skewness rule of thumb (a heuristic for single-peaked data, not a guarantee):

  • Mean > Median → right-skewed
  • Mean < Median → left-skewed
  • Mean ≈ Median → roughly symmetric
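
The rule of thumb can be checked on simulated data. This is a sketch under stated assumptions: `skew_hint` and its `tol` threshold are hypothetical helpers, and the three samples use arbitrary parameters.

```python
import numpy as np

def skew_hint(values, tol=0.05):
    """Rough skew label from comparing mean and median.

    tol is a hypothetical tolerance: the gap, measured in standard
    deviations, below which mean and median count as "about equal".
    Assumes the values are not all identical (nonzero spread).
    """
    values = np.asarray(values, dtype=float)
    gap = (values.mean() - np.median(values)) / values.std()
    if gap > tol:
        return "right-skewed"
    if gap < -tol:
        return "left-skewed"
    return "roughly symmetric"

rng = np.random.default_rng(seed=0)
right = rng.exponential(scale=2, size=10_000)         # long right tail
left = 100 - rng.exponential(scale=10, size=10_000)   # mirrored: long left tail
bell = rng.normal(loc=50, scale=10, size=10_000)      # bell-shaped

for name, sample in [("right", right), ("left", left), ("bell", bell)]:
    mean, median = sample.mean(), float(np.median(sample))
    print(f"{name:>5}: mean={mean:.2f} median={median:.2f} -> {skew_hint(sample)}")
```

Dividing the gap by the standard deviation makes the threshold independent of the data's units, so the same `tol` works for incomes and exam scores alike.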

Examples

Visualizing skew with a histogram

Samples from an exponential distribution serve as a rough stand-in for income-like data.

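import matplotlib.pyplot as plt
import numpy as np

# Right-skewed: income-like data drawn from an exponential distribution
rng = np.random.default_rng(seed=0)  # fixed seed so the plot is reproducible
data = rng.exponential(scale=2, size=1000)

# Most values sit near zero; a long tail stretches to the right
plt.hist(data, bins=50)
plt.xlabel('Value')
plt.ylabel('Count')
plt.title('Right-skewed distribution')
plt.show()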
