
Reading a CSV File

Load real-world data from CSV, Excel, and other formats



Explanation

The most common way to get data into pandas is pd.read_csv().

```python
import pandas as pd

df = pd.read_csv('data.csv')
```

Useful parameters:

```python
# Specify which column to use as index
df = pd.read_csv('data.csv', index_col='id')

# Parse dates automatically
df = pd.read_csv('data.csv', parse_dates=['date'])

# Read only certain columns
df = pd.read_csv('data.csv', usecols=['name', 'age', 'salary'])

# Handle missing value markers
df = pd.read_csv('data.csv', na_values=['N/A', 'none', '-'])

# Limit rows read (useful for large files)
df = pd.read_csv('data.csv', nrows=1000)
```
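These parameters can also be combined in a single call. The sketch below is one way to try that without a file on disk: it reads from an in-memory string through `io.StringIO`. The sample rows, the `notes` column, and the variable names are made up for illustration and are not part of the lesson's `data.csv`.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for data.csv
csv_text = """id,name,age,salary,date,notes
1,Ana,34,52000,2023-01-15,ok
2,Ben,N/A,48000,2023-02-20,ok
3,Cara,29,-,2023-03-05,ok
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    index_col="id",                                   # use the id column as the index
    parse_dates=["date"],                             # convert the date column to datetimes
    usecols=["id", "name", "age", "salary", "date"],  # skip the notes column
    na_values=["N/A", "none", "-"],                   # treat these markers as missing
)
print(df.dtypes)
print(df.isna().sum())  # missing values per column

# nrows stops reading after the given number of data rows
preview = pd.read_csv(io.StringIO(csv_text), nrows=2)
print(len(preview))  # 2
```

Note that when `usecols` and `index_col` are used together, the index column has to be listed in `usecols` as well.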

Other formats:

  • pd.read_excel('file.xlsx')
  • pd.read_json('file.json')
  • pd.read_sql(query, connection)
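As a rough illustration of two of these readers, the sketch below loads JSON from an in-memory string and runs a query against a throwaway SQLite database from the standard library. The `people` table, its rows, and the variable names are assumptions made up for the example. `pd.read_excel` is left out because it typically needs an extra engine package such as `openpyxl`.

```python
import io
import sqlite3
import pandas as pd

# JSON: a list of records becomes one row per record
json_text = '[{"name": "Ana", "age": 34}, {"name": "Ben", "age": 41}]'
df_json = pd.read_json(io.StringIO(json_text))
print(df_json)

# SQL: build a small in-memory SQLite table (hypothetical data), then query it
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (name TEXT, age INTEGER)")
connection.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ana", 34), ("Ben", 41)],
)
connection.commit()

query = "SELECT name, age FROM people WHERE age > 35"
df_sql = pd.read_sql(query, connection)
connection.close()
print(df_sql)
```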

Saving back:

```python
df.to_csv('output.csv', index=False)  # index=False avoids writing the row numbers
```
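To see what `index=False` changes, the following sketch compares the two outputs as strings (calling `to_csv` without a path returns the CSV text) and then reads each one back. The small DataFrame is invented for the example.

```python
import io
import pandas as pd

# Hypothetical two-row frame used only for this comparison
df = pd.DataFrame({"name": ["Ana", "Ben"], "age": [34, 41]})

with_index = df.to_csv()                # index written as an unnamed first column
without_index = df.to_csv(index=False)  # only the data columns are written

print(with_index.splitlines()[0])     # ,name,age
print(without_index.splitlines()[0])  # name,age

# Reading the indexed version back creates an extra "Unnamed: 0" column
reloaded = pd.read_csv(io.StringIO(with_index))
clean = pd.read_csv(io.StringIO(without_index))
print(list(reloaded.columns))  # ['Unnamed: 0', 'name', 'age']
print(list(clean.columns))     # ['name', 'age']
```

If the index carries meaningful labels (for example an `id` column set with `index_col`), keeping it in the output may be the better choice.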

Examples

Load, peek, and check

Always check shape and info() after loading

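```python
import pandas as pd

df = pd.read_csv('titanic.csv')
print(df.shape)    # (891, 12)
print(df.head())   # first five rows
df.info()          # prints dtypes and non-null counts per column; returns None, so no print() needed

# Save a copy (after any cleaning) without the row index
df.to_csv('titanic_clean.csv', index=False)
```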
