Chapter 14 of 14

Modules & Packages

Organise your code with modules, packages, imports, and explore Python's powerful standard library.

Meritshot · 53 min read

What Are Modules?

As your programs grow beyond a few dozen lines, keeping everything in a single file becomes unmanageable. Python solves this with modules — a way to split your code into separate, reusable files.

A module is simply any file with a .py extension. Every Python file you have ever written is already a module.

Why Modules Exist

Modules address several fundamental programming challenges:

| Benefit | Description |
| --- | --- |
| Code Organisation | Break large programs into logical, manageable files |
| Reusability | Write a function once, use it in many programs |
| Namespace Management | Each module has its own namespace, preventing name collisions |
| Collaboration | Team members can work on different modules simultaneously |
| Maintenance | Easier to find, fix, and update code in smaller files |
| Testing | Test individual modules in isolation |

How Python Sees Modules

When you write a file called greetings.py, Python treats it as a module named greetings (without the .py extension). Any functions, classes, variables, or constants you define inside that file become attributes of that module.

# greetings.py — this file IS a module named "greetings"

DEFAULT_GREETING = "Hello"

def say_hello(name):
    """Return a greeting string."""
    return f"{DEFAULT_GREETING}, {name}!"

def say_goodbye(name):
    """Return a farewell string."""
    return f"Goodbye, {name}. See you soon!"

class Greeter:
    """A class that manages personalised greetings."""
    def __init__(self, greeting="Hi"):
        self.greeting = greeting

    def greet(self, name):
        return f"{self.greeting}, {name}!"

Everything in this file — the constant DEFAULT_GREETING, the functions say_hello and say_goodbye, and the class Greeter — can now be imported and used by other Python files.
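The same attribute model applies to every module, including those in the standard library. The sketch below uses the built-in string module (chosen only for illustration) to show that a module is an ordinary object whose contents can be read with dot access, getattr, or dir:

```python
import string
import types

# A module is an ordinary object of type ModuleType.
is_module = isinstance(string, types.ModuleType)

# The module's name is the file name without the .py extension.
module_name = string.__name__

# Attribute access and getattr() are equivalent ways to reach a member.
digits = getattr(string, "digits")

# dir() lists the attributes; here we keep only the non-underscore names.
public_names = [name for name in dir(string) if not name.startswith("_")]

print(is_module, module_name, digits)
print(public_names[:3])
```

A module you write yourself, such as greetings.py, can be inspected in exactly the same way once it has been imported.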


Creating Your Own Modules

Creating a module is as simple as creating a .py file. There are no special declarations needed.

What Can Go in a Module

A module can contain any valid Python code:

# math_utils.py — A utility module for math operations

# ---- Constants ----
PI = 3.14159265358979
E = 2.71828182845905
GOLDEN_RATIO = 1.61803398874989

# ---- Variables ----
_calculation_count = 0  # leading underscore = "private by convention"

# ---- Functions ----
def add(a, b):
    """Return the sum of two numbers."""
    global _calculation_count
    _calculation_count += 1
    return a + b

def multiply(a, b):
    """Return the product of two numbers."""
    global _calculation_count
    _calculation_count += 1
    return a * b

def circle_area(radius):
    """Calculate the area of a circle."""
    return PI * radius ** 2

def factorial(n):
    """Return n! using recursion."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)

def get_calculation_count():
    """Return how many calculations have been performed."""
    return _calculation_count

# ---- Classes ----
class Vector:
    """A simple 2D vector class."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def magnitude(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

Module-Level Code Execution

Any code at the top level of a module runs when the module is first imported. This is important to understand:

# config.py
print("Loading config module...")  # This runs on import!

DATABASE_URL = "postgresql://localhost/mydb"
DEBUG = True

def get_config():
    return {"db": DATABASE_URL, "debug": DEBUG}

print("Config module loaded!")  # This also runs on import!

# main.py
import config  # This triggers the print statements in config.py
# Output:
# Loading config module...
# Config module loaded!

print(config.DATABASE_URL)  # postgresql://localhost/mydb

Python caches imported modules, so the top-level code only runs once, even if you import the module multiple times in different files.
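The cache is the dictionary sys.modules. As a small sketch (using the standard json module purely as an example), a second import of the same module returns the object that is already cached rather than running the file again:

```python
import sys
import json
import json as json_again  # second import: served from the cache

# Both names refer to the very same module object.
same_object = json is json_again

# The cache maps module names to module objects.
cached = sys.modules["json"] is json

print(same_object, cached)
```

If you need a module's top-level code to run again during development, importlib.reload(module) re-executes it explicitly.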


Importing Modules

Python provides several ways to import modules, each suited to different situations.

import module

The most straightforward approach — import the entire module:

import math_utils

result = math_utils.add(5, 3)        # 8
area = math_utils.circle_area(10)    # 314.159...
v = math_utils.Vector(3, 4)
print(v.magnitude())                  # 5.0
print(math_utils.PI)                  # 3.14159265358979

Pros: Clear where each name comes from. No name collisions.
Cons: Verbose if you use many items from the module.

from module import item

Import specific items directly into your namespace:

from math_utils import add, multiply, PI

result = add(5, 3)       # 8 — no prefix needed
product = multiply(4, 2)  # 8
print(PI)                  # 3.14159265358979

Pros: Concise. Only import what you need.
Cons: Can cause name collisions if two modules export the same name.

import module as alias

Give a module a shorter name:

import math_utils as mu
import datetime as dt

result = mu.add(5, 3)
now = dt.datetime.now()

This is extremely common in the Python ecosystem. Many libraries have conventional aliases:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

from module import item as alias

Rename specific imports:

from math_utils import circle_area as area
from math_utils import factorial as fact

print(area(5))    # 78.539...
print(fact(10))   # 3628800

This is useful when two modules export items with the same name:

from math_utils import add as math_add
from string_utils import add as string_add  # hypothetical

math_add(5, 3)          # numeric addition
string_add("hi", "!")   # string concatenation

from module import * (Avoid This)

Import everything from a module:

from math_utils import *

# Now add, multiply, PI, Vector, etc. are all in your namespace
print(add(5, 3))

Why you should avoid this:

  1. Name collisions — You might overwrite existing names without realising
  2. Unclear origin — Hard to tell where a function came from when reading code
  3. Maintenance headache — Adding new names to the module can silently break your code

# Dangerous example
from math import *
from cmath import *   # Overwrites sqrt, log, etc. from math!

# Which sqrt is this? The real or complex version?
print(sqrt(4))  # cmath.sqrt — returns (2+0j), not 2.0!

The one acceptable use is in interactive sessions (Python REPL) for quick exploration.

Import Search Path (sys.path)

When you write import something, Python first checks its cache of already-imported modules (sys.modules) and the built-in modules, then searches the directories listed in sys.path, in order:

  1. The directory of the script being run (or the current directory in an interactive session)
  2. PYTHONPATH environment variable directories (if set)
  3. Standard library directories
  4. Site-packages (where pip installs third-party packages)

You can inspect and modify this search path:

import sys

# View the search path
for path in sys.path:
    print(path)
# Output (example):
# /home/user/my_project        (directory of the script being run)
# /usr/lib/python3.12
# /usr/lib/python3.12/lib-dynload
# /home/user/.local/lib/python3.12/site-packages

# Add a custom directory to the search path
sys.path.append("/home/user/my_libraries")

# Now Python will also look in /home/user/my_libraries
import my_custom_module  # Found in the appended path
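To make the effect of sys.path visible end to end, here is a self-contained sketch that writes a throwaway module into a temporary directory, adds that directory to the search path, and imports it. The module name demo_helper_mod and its ANSWER constant are made up for this example:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create a tiny module file in a directory Python does not know about yet.
    Path(tmp, "demo_helper_mod.py").write_text("ANSWER = 42\n")

    sys.path.append(tmp)  # extend the search path
    try:
        importlib.invalidate_caches()  # make sure the new file is noticed
        demo_helper_mod = importlib.import_module("demo_helper_mod")
        answer = demo_helper_mod.ANSWER
        # __file__ records where the module was found.
        found_in = Path(demo_helper_mod.__file__).parent.resolve()
        same_dir = found_in == Path(tmp).resolve()
    finally:
        sys.path.remove(tmp)  # undo the change so later imports are unaffected

print(answer, same_dir)
```

Editing sys.path at runtime is handy for experiments; for real projects, installing the code as a package or setting PYTHONPATH is usually easier to maintain.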

The if __name__ == "__main__" Pattern

This is one of the most important patterns in Python. Every Python developer must understand it.

What Is __name__?

Every module has a built-in attribute called __name__. Its value depends on how the module is being used:

  • If the file is run directly (e.g., python my_script.py), __name__ is set to "__main__"
  • If the file is imported by another file, __name__ is set to the module name (e.g., "my_script")

# demo.py
print(f"__name__ is: {__name__}")

# Run directly
$ python demo.py
__name__ is: __main__

# other.py
import demo
# Output: __name__ is: demo
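Both cases can be reproduced from a single script. The sketch below assumes a throwaway file called name_demo_mod.py (a made-up name) that simply records its own __name__; runpy.run_path with run_name="__main__" stands in for running the file directly:

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "name_demo_mod.py")
    script.write_text("SEEN_NAME = __name__\n")

    # Case 1: execute the file as if it were run with `python name_demo_mod.py`.
    as_script = runpy.run_path(str(script), run_name="__main__")["SEEN_NAME"]

    # Case 2: import the same file as a module.
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        as_module = importlib.import_module("name_demo_mod").SEEN_NAME
    finally:
        sys.path.remove(tmp)

print(as_script)  # __main__
print(as_module)  # name_demo_mod
```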

The Guard Pattern

This lets you write code that only runs when the file is executed directly:

# calculator.py
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

def subtract(a, b):
    """Return the difference of two numbers."""
    return a - b

def multiply(a, b):
    """Return the product of two numbers."""
    return a * b

def divide(a, b):
    """Return the quotient. Raises ValueError if b is zero."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

# This block ONLY runs when you do: python calculator.py
# It does NOT run when another file does: import calculator
if __name__ == "__main__":
    # Test our functions
    print("Testing calculator functions:")
    print(f"add(10, 5) = {add(10, 5)}")           # 15
    print(f"subtract(10, 5) = {subtract(10, 5)}")  # 5
    print(f"multiply(10, 5) = {multiply(10, 5)}")  # 50
    print(f"divide(10, 5) = {divide(10, 5)}")      # 2.0

    # Test error handling
    try:
        divide(10, 0)
    except ValueError as e:
        print(f"Caught error: {e}")  # Caught error: Cannot divide by zero

Now when another file imports calculator, only the functions are available — the test code does not execute:

# main.py
from calculator import add, divide

print(add(100, 200))    # 300
print(divide(100, 4))   # 25.0
# No test output appears!

Practical Patterns

Pattern 1: Module with a CLI interface

# word_counter.py
import sys

def count_words(text):
    """Count words in a string."""
    return len(text.split())

def count_lines(text):
    """Count lines in a string."""
    return len(text.splitlines())

def analyze_text(text):
    """Return a dictionary of text statistics."""
    return {
        "words": count_words(text),
        "lines": count_lines(text),
        "characters": len(text),
        "characters_no_spaces": len(text.replace(" ", "")),
    }

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python word_counter.py <filename>")
        sys.exit(1)

    filename = sys.argv[1]
    with open(filename, "r") as f:
        content = f.read()

    stats = analyze_text(content)
    for key, value in stats.items():
        print(f"{key}: {value}")

Pattern 2: Quick demo / documentation

# shapes.py
import math

def circle_area(radius):
    return math.pi * radius ** 2

def rectangle_area(width, height):
    return width * height

def triangle_area(base, height):
    return 0.5 * base * height

if __name__ == "__main__":
    # Serve as live documentation / examples
    print("=== Shape Area Calculator ===")
    print(f"Circle (r=5): {circle_area(5):.2f}")
    print(f"Rectangle (4x6): {rectangle_area(4, 6):.2f}")
    print(f"Triangle (b=3, h=8): {triangle_area(3, 8):.2f}")

Packages

As your project grows, you need to organise modules into directories. A package is a directory that contains modules and a special __init__.py file.

Directory Structure

my_project/
├── main.py
└── utils/                  # This is a package
    ├── __init__.py         # Makes "utils" a package
    ├── math_helpers.py     # A module in the package
    ├── string_helpers.py   # Another module
    └── file_helpers.py     # Another module

The __init__.py File

The __init__.py file serves several purposes:

  1. Marks the directory as a regular package (required before Python 3.3; since then a directory without it is treated as a namespace package, so including it is still recommended)
  2. Runs when the package is imported — initialisation code goes here
  3. Controls the package's public API — define what from package import * exports

# utils/__init__.py

# Import key items so users can access them directly from the package
from .math_helpers import add, multiply, circle_area
from .string_helpers import capitalize_words, slugify
from .file_helpers import read_file, write_file

# Define what "from utils import *" exports
__all__ = [
    "add", "multiply", "circle_area",
    "capitalize_words", "slugify",
    "read_file", "write_file",
]

# Package metadata
__version__ = "1.0.0"
__author__ = "Your Name"

Now users can import cleanly:

# Thanks to __init__.py, these all work:
from utils import add, capitalize_words
from utils import __version__

# Instead of the longer:
from utils.math_helpers import add
from utils.string_helpers import capitalize_words

An empty __init__.py is also perfectly valid — it simply marks the directory as a package without any extra setup.

Nested Packages (Sub-packages)

Packages can contain other packages:

my_project/
├── main.py
└── mylib/
    ├── __init__.py
    ├── core/
    │   ├── __init__.py
    │   ├── engine.py
    │   └── config.py
    ├── utils/
    │   ├── __init__.py
    │   ├── math_helpers.py
    │   └── string_helpers.py
    └── io/
        ├── __init__.py
        ├── readers.py
        └── writers.py

# Importing from nested packages
from mylib.core.engine import Engine
from mylib.utils.math_helpers import add
from mylib.io.readers import read_csv

Relative Imports

Inside a package, you can use relative imports to refer to sibling modules or parent packages:

# mylib/utils/string_helpers.py

# Relative import from the same package (utils/)
from .math_helpers import add            # same directory
from . import math_helpers               # same directory (imports the whole sibling module)

# Relative import from parent package (mylib/)
from ..core.config import DATABASE_URL   # up one level, then into core/
from ..core import engine                # up one level, then into core/

Relative import syntax:

  • . means "current package"
  • .. means "parent package"
  • ... means "grandparent package"

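Important: Relative imports only work inside packages. They fail in a file that is run directly as a script (for example, python string_helpers.py); run the file as a module instead, e.g. python -m mylib.utils.string_helpers from the project root.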
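The pieces of this section (an __init__.py that re-exports names, __all__, and a relative import between sibling modules) can be tried out without creating files by hand. This sketch builds a tiny package in a temporary directory; demo_pkg, total_length, and the file contents are invented for the example:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "demo_pkg")
    pkg.mkdir()

    # A plain module inside the package.
    (pkg / "math_helpers.py").write_text(
        "def add(a, b):\n"
        "    return a + b\n"
    )
    # A sibling module that uses a relative import.
    (pkg / "string_helpers.py").write_text(
        "from .math_helpers import add\n"
        "def total_length(first, second):\n"
        "    return add(len(first), len(second))\n"
    )
    # __init__.py re-exports the public names and declares __all__.
    (pkg / "__init__.py").write_text(
        "from .math_helpers import add\n"
        "from .string_helpers import total_length\n"
        "__all__ = ['add', 'total_length']\n"
    )

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo_pkg = importlib.import_module("demo_pkg")
        total = demo_pkg.add(2, 3)
        length = demo_pkg.total_length("hi", "there")
        exported = sorted(demo_pkg.__all__)
    finally:
        sys.path.remove(tmp)

print(total, length, exported)
```

Because string_helpers.py is loaded as part of demo_pkg, its relative import resolves; executing that file on its own as a script would raise an ImportError.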


The Python Standard Library

Python's motto is "batteries included" — it ships with a massive standard library covering everything from math to networking to file compression. No installation needed.

Overview by Category

| Category | Key Modules |
| --- | --- |
| Math & Numbers | math, decimal, fractions, statistics |
| Data Structures | collections, heapq, bisect, array |
| Text Processing | string, re, textwrap, difflib |
| Date & Time | datetime, time, calendar |
| File & I/O | os, pathlib, shutil, glob, tempfile |
| Data Formats | json, csv, xml, configparser |
| Functional | itertools, functools, operator |
| System | sys, os, platform, subprocess |
| Concurrency | threading, multiprocessing, asyncio |
| Networking | urllib, http, socket, email |
| Type System | typing, dataclasses, abc |
| Debugging | logging, pdb, traceback, warnings |
| Testing | unittest, doctest |
| Compression | zipfile, gzip, tarfile |
| Cryptography | hashlib, secrets, hmac |
| Copying | copy |

Let us now explore the most important modules in depth.


math — Mathematical Functions

The math module provides access to mathematical functions defined by the C standard.

import math

Constants:

print(math.pi)     # 3.141592653589793
print(math.e)      # 2.718281828459045
print(math.tau)    # 6.283185307179586 (2 * pi)
print(math.inf)    # inf (positive infinity)
print(math.nan)    # nan (not a number)

Rounding and Absolute Value:

print(math.ceil(4.2))     # 5  — round up
print(math.ceil(-4.2))    # -4
print(math.floor(4.8))    # 4  — round down
print(math.floor(-4.8))   # -5
print(math.trunc(4.8))    # 4  — truncate toward zero
print(math.trunc(-4.8))   # -4
print(math.fabs(-5.5))    # 5.5 — absolute value (always float)

Powers, Roots, and Logarithms:

print(math.sqrt(16))       # 4.0
print(math.sqrt(2))        # 1.4142135623730951
print(math.pow(2, 10))     # 1024.0 (always returns float)
print(math.log(100, 10))   # 2.0 (log base 10 of 100)
print(math.log(math.e))    # 1.0 (natural log)
print(math.log2(1024))     # 10.0
print(math.log10(1000))    # 3.0
print(math.isqrt(10))      # 3 (integer square root)

Factorials and Combinatorics:

print(math.factorial(5))   # 120 (5! = 5 * 4 * 3 * 2 * 1)
print(math.factorial(10))  # 3628800
print(math.comb(10, 3))    # 120 (10 choose 3)
print(math.perm(10, 3))    # 720 (permutations of 3 from 10)
print(math.gcd(48, 18))    # 6 (greatest common divisor)
print(math.lcm(12, 18))    # 36 (least common multiple)

Trigonometric Functions (angles in radians):

# Convert degrees to radians and back
print(math.radians(180))   # 3.141592653589793
print(math.degrees(math.pi))  # 180.0

# Trigonometric functions
print(math.sin(math.pi / 2))   # 1.0
print(math.cos(0))              # 1.0
print(math.tan(math.pi / 4))   # 0.9999999999999999 (approx 1)

# Inverse trigonometric functions
print(math.asin(1))    # 1.5707963... (pi/2)
print(math.acos(0))    # 1.5707963... (pi/2)
print(math.atan(1))    # 0.7853981... (pi/4)
print(math.atan2(1, 1))  # 0.7853981... (pi/4) — handles quadrants

Special Value Checks:

print(math.isnan(float("nan")))   # True
print(math.isnan(42))              # False
print(math.isinf(float("inf")))   # True
print(math.isinf(42))              # False
print(math.isfinite(42))           # True
print(math.isfinite(float("inf")))  # False
print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9))  # True

random — Random Number Generation

The random module generates pseudo-random numbers for various distributions.

import random

Basic Random Numbers:

# Random float in the half-open range [0.0, 1.0)
print(random.random())        # e.g., 0.7431448254356782

# Random float in a range
print(random.uniform(1.0, 10.0))  # e.g., 6.234...

# Random integer in a range (inclusive on both ends)
print(random.randint(1, 100))     # e.g., 42

# Random integer in a range (exclusive upper bound)
print(random.randrange(0, 100, 5))  # Random multiple of 5: 0, 5, 10, ... 95

Choosing from Sequences:

colors = ["red", "green", "blue", "yellow", "purple"]

# Pick one random item
print(random.choice(colors))   # e.g., "blue"

# Pick multiple WITH replacement (items can repeat)
print(random.choices(colors, k=3))  # e.g., ["red", "red", "blue"]

# Pick multiple WITHOUT replacement (no repeats)
print(random.sample(colors, k=3))   # e.g., ["green", "purple", "red"]

# Weighted random choices
fruits = ["apple", "banana", "cherry"]
weights = [50, 30, 20]  # apple is most likely
picks = random.choices(fruits, weights=weights, k=10)
print(picks)  # Mostly apples, some bananas, few cherries

Shuffling:

deck = list(range(1, 53))  # A deck of 52 cards
random.shuffle(deck)        # Shuffle in place
print(deck[:5])             # e.g., [34, 7, 51, 22, 3]

Reproducible Results with Seeds:

random.seed(42)
print(random.randint(1, 100))  # Always 82 with seed 42
print(random.randint(1, 100))  # Always 15

random.seed(42)  # Reset the seed
print(random.randint(1, 100))  # 82 again — same sequence!
print(random.randint(1, 100))  # 15 again
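random.seed changes the shared, module-wide generator. When only one part of a program needs reproducibility, a separate random.Random instance keeps the seeded state isolated. A minimal sketch (the seed value 42 is arbitrary):

```python
import random

# Two independent generators created with the same seed.
rng_a = random.Random(42)
rng_b = random.Random(42)

seq_a = [rng_a.randint(1, 100) for _ in range(5)]
seq_b = [rng_b.randint(1, 100) for _ in range(5)]

# Same seed, same sequence; the global random functions are not affected.
print(seq_a == seq_b)  # True
```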

Statistical Distributions:

# Gaussian (normal) distribution — mean=0, std_dev=1
print(random.gauss(0, 1))     # e.g., -0.234...

# Generate 1000 samples to see the distribution
samples = [random.gauss(100, 15) for _ in range(1000)]
avg = sum(samples) / len(samples)
print(f"Average: {avg:.1f}")  # Close to 100

datetime — Dates and Times

The datetime module supplies classes for manipulating dates and times.

from datetime import datetime, date, time, timedelta

Getting the Current Date and Time:

now = datetime.now()
print(now)                # 2026-03-24 14:30:45.123456
print(now.year)           # 2026
print(now.month)          # 3
print(now.day)            # 24
print(now.hour)           # 14
print(now.minute)         # 30
print(now.second)         # 45
print(now.weekday())      # 1 (0=Monday, 6=Sunday)

today = date.today()
print(today)              # 2026-03-24

Creating Specific Dates and Times:

# Create a date
birthday = date(1995, 8, 15)
print(birthday)           # 1995-08-15

# Create a time
alarm = time(7, 30, 0)
print(alarm)              # 07:30:00

# Create a full datetime
event = datetime(2026, 12, 31, 23, 59, 59)
print(event)              # 2026-12-31 23:59:59

Formatting Dates (strftime):

now = datetime.now()

print(now.strftime("%Y-%m-%d"))           # 2026-03-24
print(now.strftime("%d/%m/%Y"))           # 24/03/2026
print(now.strftime("%B %d, %Y"))          # March 24, 2026
print(now.strftime("%I:%M %p"))           # 02:30 PM
print(now.strftime("%A, %B %d, %Y"))      # Tuesday, March 24, 2026
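print(now.strftime("%d %b %Y %H:%M"))     # 24 Mar 2026 14:30

| Code | Meaning | Example |
| --- | --- | --- |
| %Y | 4-digit year | 2026 |
| %m | Month (zero-padded) | 03 |
| %d | Day (zero-padded) | 24 |
| %H | Hour (24-hour) | 14 |
| %I | Hour (12-hour) | 02 |
| %M | Minute | 30 |
| %S | Second | 45 |
| %p | AM/PM | PM |
| %A | Full weekday | Tuesday |
| %a | Short weekday | Tue |
| %B | Full month | March |
| %b | Short month | Mar |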

Parsing Date Strings (strptime):

date_str = "24-03-2026"
parsed = datetime.strptime(date_str, "%d-%m-%Y")
print(parsed)             # 2026-03-24 00:00:00

date_str2 = "March 24, 2026 02:30 PM"
parsed2 = datetime.strptime(date_str2, "%B %d, %Y %I:%M %p")
print(parsed2)            # 2026-03-24 14:30:00

Date Arithmetic with timedelta:

now = datetime.now()

# Add or subtract time
tomorrow = now + timedelta(days=1)
next_week = now + timedelta(weeks=1)
two_hours_later = now + timedelta(hours=2)
last_month_approx = now - timedelta(days=30)

print(f"Tomorrow: {tomorrow.strftime('%Y-%m-%d')}")
print(f"Next week: {next_week.strftime('%Y-%m-%d')}")

# Difference between dates
new_years_eve = datetime(2026, 12, 31)
diff = new_years_eve - now
print(f"{diff.days} days until New Year's Eve")
print(f"That's about {diff.days // 7} weeks")
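Because datetime.now() changes on every run, the arithmetic is easier to verify with fixed dates. This sketch reuses the example date from this chapter (24 March 2026) as an assumed starting point:

```python
from datetime import date, timedelta

start = date(2026, 3, 24)
end = date(2026, 12, 31)

# Subtracting two dates gives a timedelta.
gap = end - start
print(gap.days)        # 282
print(gap.days // 7)   # 40 full weeks

# Adding the timedelta back lands exactly on the end date.
round_trip = start + timedelta(days=gap.days)
print(round_trip == end)  # True
```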

Timezone Basics:

from datetime import timezone

# UTC time
utc_now = datetime.now(timezone.utc)
print(utc_now)  # 2026-03-24 09:30:45.123456+00:00

# Create a timezone offset
ist = timezone(timedelta(hours=5, minutes=30))  # India Standard Time
ist_now = datetime.now(ist)
print(ist_now)  # 2026-03-24 15:00:45.123456+05:30
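An aware datetime can also be converted from one offset to another with astimezone. The sketch below uses a fixed UTC moment (chosen only so the output is predictable) and the same +05:30 offset as above:

```python
from datetime import datetime, timedelta, timezone

ist = timezone(timedelta(hours=5, minutes=30))

# A fixed, timezone-aware moment in UTC.
utc_moment = datetime(2026, 3, 24, 9, 30, tzinfo=timezone.utc)

# Convert the same instant to the +05:30 offset.
ist_moment = utc_moment.astimezone(ist)

print(ist_moment)                # 2026-03-24 15:00:00+05:30
print(ist_moment == utc_moment)  # True: same instant, different wall-clock time
```

Fixed offsets like this do not follow daylight-saving rules; for named zones, the standard zoneinfo module (Python 3.9+) is the usual choice.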

os — Operating System Interface

The os module provides functions for interacting with the operating system.

import os

Working Directory:

# Get current working directory
print(os.getcwd())  # /home/user/my_project

# Change directory (use sparingly — prefer absolute paths)
os.chdir("/tmp")
print(os.getcwd())  # /tmp

Listing and Creating Directories:

# List files and folders in a directory
entries = os.listdir(".")
print(entries)  # ['file1.py', 'folder1', 'file2.txt']

# List a specific directory
entries = os.listdir("/home/user/documents")

# Create a single directory
os.mkdir("new_folder")

# Create nested directories (like mkdir -p)
os.makedirs("output/reports/2026", exist_ok=True)
# exist_ok=True prevents error if directory already exists

File Operations:

# Rename a file or directory
os.rename("old_name.txt", "new_name.txt")

# Remove a file
os.remove("unwanted_file.txt")

# Remove an empty directory
os.rmdir("empty_folder")

# Remove nested empty directories
os.removedirs("output/reports/2026")

Path Operations (os.path):

# Join path components (handles OS separators automatically)
path = os.path.join("home", "user", "documents", "file.txt")
print(path)  # home/user/documents/file.txt (on Unix)

# Check if path exists
print(os.path.exists("myfile.txt"))    # True or False

# Check if it's a file or directory
print(os.path.isfile("myfile.txt"))    # True
print(os.path.isdir("my_folder"))      # True

# Get file name and directory from a path
print(os.path.basename("/home/user/doc.txt"))  # doc.txt
print(os.path.dirname("/home/user/doc.txt"))   # /home/user

# Split into directory and filename
print(os.path.split("/home/user/doc.txt"))     # ('/home/user', 'doc.txt')

# Get file extension
print(os.path.splitext("report.pdf"))  # ('report', '.pdf')

# Get file size in bytes
print(os.path.getsize("myfile.txt"))   # 1024

Environment Variables:

# Get an environment variable
home = os.environ.get("HOME")
print(home)  # /home/user

# Get with a default value
db_url = os.environ.get("DATABASE_URL", "sqlite:///default.db")

# Set an environment variable (for current process only)
os.environ["MY_APP_MODE"] = "development"

Walking a Directory Tree:

# os.walk traverses all files and subdirectories
for dirpath, dirnames, filenames in os.walk("/home/user/project"):
    print(f"Directory: {dirpath}")
    for filename in filenames:
        full_path = os.path.join(dirpath, filename)
        print(f"  File: {full_path}")
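The pathlib module listed in the overview table offers an object-oriented alternative to many os and os.path calls. As a rough sketch, the following builds a small tree in a temporary directory (the file and folder names are invented) and walks it with Path.rglob:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create nested directories, like os.makedirs(..., exist_ok=True).
    (root / "reports" / "2026").mkdir(parents=True, exist_ok=True)
    (root / "notes.txt").write_text("hello")
    (root / "reports" / "2026" / "summary.txt").write_text("totals")

    # rglob("*") visits every entry below root, similar to os.walk.
    found = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )

    # stat().st_size corresponds to os.path.getsize.
    size = (root / "notes.txt").stat().st_size

print(found)  # ['notes.txt', 'reports/2026/summary.txt']
print(size)   # 5
```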

sys — System-Specific Parameters

The sys module provides access to system-specific parameters and functions.

import sys

Command-Line Arguments:

# script.py
# Run: python script.py hello world 42
print(sys.argv)
# ['script.py', 'hello', 'world', '42']

print(sys.argv[0])  # 'script.py' — the script name
print(sys.argv[1])  # 'hello'     — first argument
print(len(sys.argv))  # 4         — total count including script name

Python Version and Platform:

print(sys.version)
# 3.12.0 (main, Oct 2 2024, 00:00:00) [GCC 12.2.0]

print(sys.version_info)
# sys.version_info(major=3, minor=12, micro=0, ...)

print(sys.platform)    # 'linux', 'darwin' (macOS), or 'win32'
print(sys.executable)  # /usr/bin/python3

Module Search Path:

# View where Python looks for modules
for p in sys.path:
    print(p)

# Add a custom directory
sys.path.insert(0, "/my/custom/modules")

Standard Streams:

# Write to stdout
sys.stdout.write("Hello from stdout\n")

# Write to stderr (for error messages)
sys.stderr.write("This is an error message\n")

# Read from stdin
# line = sys.stdin.readline()

Memory and Exit:

# Size of an object in bytes (typical 64-bit CPython values; exact numbers vary by version)
print(sys.getsizeof(42))           # 28
print(sys.getsizeof("hello"))      # 54
print(sys.getsizeof([1, 2, 3]))    # 88
print(sys.getsizeof({}))           # 64

# Largest container size/index the platform supports (Python ints themselves are unbounded)
print(sys.maxsize)          # 9223372036854775807 (on 64-bit)
print(sys.getrecursionlimit())  # 1000 (default)

# Exit the program
# sys.exit(0)   # Exit with success code
# sys.exit(1)   # Exit with error code
# sys.exit("Something went wrong")  # Exit with error message

collections — Specialised Container Types

The collections module provides alternatives to Python's built-in containers.

from collections import Counter, defaultdict, OrderedDict, namedtuple, deque, ChainMap

Counter — Count Occurrences:

from collections import Counter

# Count items in a list
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple", "date"]
count = Counter(fruits)
print(count)
# Counter({'apple': 3, 'banana': 2, 'cherry': 1, 'date': 1})

# Count characters in a string
char_count = Counter("mississippi")
print(char_count)
# Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})

# Most common items
print(count.most_common(2))   # [('apple', 3), ('banana', 2)]

# Arithmetic with Counters
inventory = Counter(apples=5, bananas=3)
sold = Counter(apples=2, bananas=1)
remaining = inventory - sold
print(remaining)  # Counter({'apples': 3, 'bananas': 2})

# Total count (Counter.total() requires Python 3.10+)
print(count.total())  # 7

defaultdict — Dict with Default Values:

from collections import defaultdict

# List as default — great for grouping
students_by_grade = defaultdict(list)
students_by_grade["A"].append("Alice")
students_by_grade["B"].append("Bob")
students_by_grade["A"].append("Arjun")
students_by_grade["C"].append("Charlie")
print(dict(students_by_grade))
# {'A': ['Alice', 'Arjun'], 'B': ['Bob'], 'C': ['Charlie']}

# Int as default — great for counting
word_count = defaultdict(int)
for word in ["the", "cat", "sat", "on", "the", "mat"]:
    word_count[word] += 1
print(dict(word_count))
# {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}

# Set as default — great for unique collections
tags = defaultdict(set)
tags["python"].add("programming")
tags["python"].add("scripting")
tags["python"].add("programming")  # Duplicate ignored
print(dict(tags))
# {'python': {'programming', 'scripting'}}  (set display order may vary)

OrderedDict — Dictionary that Remembers Insertion Order:

from collections import OrderedDict

# In Python 3.7+, regular dicts maintain insertion order.
# OrderedDict is still useful for its extra methods.

od = OrderedDict()
od["first"] = 1
od["second"] = 2
od["third"] = 3

# Move an item to the end
od.move_to_end("first")
print(list(od.keys()))  # ['second', 'third', 'first']

# Move an item to the beginning
od.move_to_end("third", last=False)
print(list(od.keys()))  # ['third', 'second', 'first']

# Pop the last item
print(od.popitem())      # ('first', 1)

# Pop the first item
print(od.popitem(last=False))  # ('third', 3)

namedtuple — Lightweight Class:

from collections import namedtuple

# Define a named tuple
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)      # 3 4
print(p[0], p[1])     # 3 4 — also supports indexing
print(p)              # Point(x=3, y=4)

# Fields are read by name; instances are hashable, so they can also serve as dictionary keys
Student = namedtuple("Student", ["name", "grade", "age"])
s = Student("Alice", "A", 20)
print(f"{s.name} got grade {s.grade}")  # Alice got grade A

# Convert to dictionary
print(s._asdict())  # {'name': 'Alice', 'grade': 'A', 'age': 20}

# Create a modified copy
s2 = s._replace(grade="A+")
print(s2)  # Student(name='Alice', grade='A+', age=20)

deque — Double-Ended Queue:

from collections import deque

# Create a deque
dq = deque([1, 2, 3, 4, 5])

# Add to both ends (O(1) — much faster than list for this)
dq.append(6)        # Add to right: [1, 2, 3, 4, 5, 6]
dq.appendleft(0)    # Add to left:  [0, 1, 2, 3, 4, 5, 6]

# Remove from both ends
dq.pop()             # Remove from right: returns 6
dq.popleft()         # Remove from left:  returns 0
print(dq)            # deque([1, 2, 3, 4, 5])

# Rotate elements
dq.rotate(2)         # Rotate right by 2
print(dq)            # deque([4, 5, 1, 2, 3])

dq.rotate(-2)        # Rotate left by 2
print(dq)            # deque([1, 2, 3, 4, 5])

# Fixed-size deque (oldest items are dropped)
recent = deque(maxlen=3)
recent.append("a")
recent.append("b")
recent.append("c")
recent.append("d")  # "a" is dropped
print(recent)        # deque(['b', 'c', 'd'], maxlen=3)

ChainMap — Merge Multiple Dictionaries:

from collections import ChainMap

defaults = {"color": "blue", "size": "medium", "font": "Arial"}
user_prefs = {"color": "green", "font": "Helvetica"}
cli_args = {"color": "red"}

# ChainMap searches in order: cli_args -> user_prefs -> defaults
config = ChainMap(cli_args, user_prefs, defaults)
print(config["color"])   # red (from cli_args)
print(config["font"])    # Helvetica (from user_prefs)
print(config["size"])    # medium (from defaults)
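A ChainMap does not copy its dictionaries; it layers them. Lookups fall through the layers, while writes only touch the first one. A short sketch with made-up settings:

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "medium"}
overrides = {}
settings = ChainMap(overrides, defaults)

# Writes go to the first mapping only; defaults stay untouched.
settings["color"] = "red"

print(settings["color"])  # red
print(defaults["color"])  # blue
print(overrides)          # {'color': 'red'}

# dict() flattens the layers into one ordinary dictionary.
merged = dict(settings)
print(merged == {"color": "red", "size": "medium"})  # True
```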

itertools — Efficient Iteration Tools

The itertools module provides fast, memory-efficient tools for working with iterators.

import itertools

Combining Iterables:

from itertools import chain

# Flatten multiple lists into one
combined = list(chain([1, 2], [3, 4], [5, 6]))
print(combined)  # [1, 2, 3, 4, 5, 6]

# Flatten a list of lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]

Combinatorics:

from itertools import product, combinations, permutations

# Cartesian product (all pairs)
pairs = list(product("AB", [1, 2]))
print(pairs)  # [('A', 1), ('A', 2), ('B', 1), ('B', 2)]

# Product with repeat
dice_rolls = list(product(range(1, 7), repeat=2))
print(f"Two dice: {len(dice_rolls)} combinations")  # 36

# Combinations (order doesn't matter, no replacement)
combos = list(combinations("ABCD", 2))
print(combos)
# [('A','B'), ('A','C'), ('A','D'), ('B','C'), ('B','D'), ('C','D')]

# Permutations (order matters)
perms = list(permutations("ABC", 2))
print(perms)
# [('A','B'), ('A','C'), ('B','A'), ('B','C'), ('C','A'), ('C','B')]

Infinite Iterators:

from itertools import count, cycle, repeat

# count: infinite counter
for i in count(start=10, step=3):
    if i > 25:
        break
    print(i, end=" ")  # 10 13 16 19 22 25

# cycle: repeat an iterable forever
colors = cycle(["red", "green", "blue"])
for _, color in zip(range(7), colors):
    print(color, end=" ")
# red green blue red green blue red

# repeat: repeat a value
zeros = list(repeat(0, 5))
print(zeros)  # [0, 0, 0, 0, 0]

Slicing and Grouping:

from itertools import islice, groupby, accumulate

# islice: slice an iterator without converting to a list
squares = (x**2 for x in range(100))
first_five = list(islice(squares, 5))
print(first_five)  # [0, 1, 4, 9, 16]

# Skip first 3, take next 4
items = list(islice(range(100), 3, 7))
print(items)  # [3, 4, 5, 6]

# groupby: group consecutive items (data must be sorted by key)
data = [
    ("fruit", "apple"), ("fruit", "banana"),
    ("veggie", "carrot"), ("veggie", "pea"),
    ("fruit", "cherry"),
]
data.sort(key=lambda x: x[0])  # Must sort first!
for key, group in groupby(data, key=lambda x: x[0]):
    items = [item[1] for item in group]
    print(f"{key}: {items}")
# fruit: ['apple', 'banana', 'cherry']
# veggie: ['carrot', 'pea']

# accumulate: running totals
nums = [1, 2, 3, 4, 5]
running_sum = list(accumulate(nums))
print(running_sum)  # [1, 3, 6, 10, 15]

# Running maximum (any two-argument function can replace addition)
running_max = list(accumulate([3, 1, 4, 1, 5, 9], func=max))
print(running_max)  # [3, 3, 4, 4, 5, 9]
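
The sort step in the groupby example matters because groupby only merges items that sit next to each other. A minimal sketch of what happens when the sort is skipped, using made-up sample data and a hypothetical helper named group_names:

```python
from itertools import groupby

# Made-up sample data, deliberately left unsorted
records = [("fruit", "apple"), ("veggie", "carrot"), ("fruit", "cherry")]

def group_names(pairs):
    """Return (key, [names]) for each run of consecutive equal keys."""
    return [
        (key, [name for _, name in group])
        for key, group in groupby(pairs, key=lambda pair: pair[0])
    ]

unsorted_groups = group_names(records)
sorted_groups = group_names(sorted(records, key=lambda pair: pair[0]))

print(unsorted_groups)
# [('fruit', ['apple']), ('veggie', ['carrot']), ('fruit', ['cherry'])]
print(sorted_groups)
# [('fruit', ['apple', 'cherry']), ('veggie', ['carrot'])]
```

In the unsorted case "fruit" shows up twice, once for each separate run.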

functools — Higher-Order Functions

The functools module provides tools for working with functions and callable objects.

from functools import reduce, lru_cache, partial, wraps, total_ordering

reduce — Accumulate a Sequence to a Single Value:

from functools import reduce

# Sum of numbers (same as built-in sum)
total = reduce(lambda a, b: a + b, [1, 2, 3, 4, 5])
print(total)  # 15

# Product of numbers
product = reduce(lambda a, b: a * b, [1, 2, 3, 4, 5])
print(product)  # 120

# Find maximum (same as built-in max)
largest = reduce(lambda a, b: a if a > b else b, [3, 1, 4, 1, 5, 9])
print(largest)  # 9

# Flatten nested list
nested = [[1, 2], [3, 4], [5, 6]]
flat = reduce(lambda a, b: a + b, nested)
print(flat)  # [1, 2, 3, 4, 5, 6]
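
One detail the examples above skip: reduce accepts an optional third argument, a starting value. Without it, an empty sequence raises an error. A small sketch, where safe_sum is a hypothetical helper name:

```python
from functools import reduce

def safe_sum(values):
    """Sum values with reduce; the third argument (0) is the starting value."""
    return reduce(lambda a, b: a + b, values, 0)

print(safe_sum([1, 2, 3]))  # 6
print(safe_sum([]))         # 0

try:
    reduce(lambda a, b: a + b, [])  # no starting value supplied
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Supplying a starting value is a reasonable default whenever the input might be empty.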

lru_cache — Memoisation (Cache Results):

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    """Calculate nth Fibonacci number with caching."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Without cache: extremely slow for large n
# With cache: near-instant!
print(fibonacci(50))   # 12586269025
print(fibonacci(100))  # 354224848179261915075

# Check cache statistics (totals after both calls above)
print(fibonacci.cache_info())
# CacheInfo(hits=99, misses=101, maxsize=128, currsize=101)

# Clear the cache
fibonacci.cache_clear()
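
To see what a hit and a miss mean in practice, it can help to count how often the function body really runs. A minimal sketch, where slow_square and call_count are illustrative names rather than anything from the examples above:

```python
from functools import lru_cache

call_count = 0  # how many times the function body actually executes

@lru_cache(maxsize=None)  # maxsize=None: the cache never evicts entries
def slow_square(n):
    global call_count
    call_count += 1
    return n * n

results = [slow_square(4), slow_square(4), slow_square(5)]
info = slow_square.cache_info()

print(results)                 # [16, 16, 25]
print(call_count)              # 2 (the second slow_square(4) came from the cache)
print(info.hits, info.misses)  # 1 2

# Arguments are used as dictionary keys, so they must be hashable
try:
    slow_square([1, 2])
except TypeError as exc:
    print(f"TypeError: {exc}")
```

The last part shows a common surprise: passing a list (or any other unhashable value) to a cached function fails before the function runs.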

partial — Pre-fill Function Arguments:

from functools import partial

def power(base, exponent):
    return base ** exponent

# Create specialised versions
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))  # 25
print(cube(3))    # 27

# Practical: create a custom print function
debug_print = partial(print, "[DEBUG]")
debug_print("Starting process")   # [DEBUG] Starting process
debug_print("Value is", 42)       # [DEBUG] Value is 42

wraps — Preserve Function Metadata in Decorators:

from functools import wraps
import time

def timer(func):
    @wraps(func)  # Preserves the name, docstring, etc. of the original function
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # clock intended for measuring short durations
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

@timer
def slow_function():
    """This function is intentionally slow."""
    time.sleep(0.1)
    return "done"

slow_function()  # slow_function took 0.1003s
print(slow_function.__name__)  # slow_function (not "wrapper"!)
print(slow_function.__doc__)   # This function is intentionally slow.

total_ordering — Complete Comparison Methods:

from functools import total_ordering

@total_ordering
class Temperature:
    """You only need to define __eq__ and one of __lt__, __gt__, etc."""
    def __init__(self, celsius):
        self.celsius = celsius

    def __eq__(self, other):
        return self.celsius == other.celsius

    def __lt__(self, other):
        return self.celsius < other.celsius

    def __repr__(self):
        return f"Temperature({self.celsius}C)"

t1 = Temperature(20)
t2 = Temperature(30)
print(t1 < t2)    # True
print(t1 > t2)    # False   (auto-generated!)
print(t1 <= t2)   # True    (auto-generated!)
print(t1 >= t2)   # False   (auto-generated!)
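
Once a class has a full set of comparisons, it also works with sorted(), min(), and max(). The sketch below uses a hypothetical Version class (not part of the example above) and returns NotImplemented for unrelated types, an optional refinement that stops comparisons with other objects from raising AttributeError:

```python
from functools import total_ordering

@total_ordering
class Version:
    """Hypothetical example class, ordered by (major, minor)."""
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented  # let Python fall back to its default handling
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"

versions = [Version(2, 0), Version(1, 4), Version(1, 10)]
print(sorted(versions))                # [Version(1, 4), Version(1, 10), Version(2, 0)]
print(max(versions))                   # Version(2, 0)
print(Version(1, 4) >= Version(1, 4))  # True
print(Version(1, 4) == "1.4")          # False
```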

json — JSON Encoding and Decoding

The json module handles reading and writing JSON data.

import json

Converting Python to JSON (dumps):

data = {
    "name": "Alice",
    "age": 30,
    "hobbies": ["reading", "coding", "hiking"],
    "address": {
        "city": "Mumbai",
        "country": "India"
    },
    "active": True,
    "score": None
}

# Convert to JSON string
json_str = json.dumps(data)
print(json_str)
# {"name": "Alice", "age": 30, ...}

# Pretty-printed JSON
json_pretty = json.dumps(data, indent=2)
print(json_pretty)
# {
#   "name": "Alice",
#   "age": 30,
#   "hobbies": [
#     "reading",
#     "coding",
#     "hiking"
#   ],
#   ...
# }

# Sort keys alphabetically
json_sorted = json.dumps(data, indent=2, sort_keys=True)

Converting JSON to Python (loads):

json_string = '{"name": "Bob", "age": 25, "active": true}'
parsed = json.loads(json_string)
print(parsed["name"])     # Bob
print(parsed["active"])   # True (Python bool, not JSON true)
print(type(parsed))       # <class 'dict'>

Reading and Writing JSON Files (load / dump):

# Write JSON to a file
data = {"users": ["Alice", "Bob", "Charlie"], "count": 3}
with open("data.json", "w") as f:
    json.dump(data, f, indent=2)

# Read JSON from a file
with open("data.json", "r") as f:
    loaded = json.load(f)
print(loaded)  # {'users': ['Alice', 'Bob', 'Charlie'], 'count': 3}

Custom Serialisation:

from datetime import datetime

class DateTimeEncoder(json.JSONEncoder):
    """Custom encoder that handles datetime objects."""
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

event = {
    "name": "Meeting",
    "time": datetime(2026, 3, 24, 14, 30),
}

# Without custom encoder: TypeError
# With custom encoder: works!
json_str = json.dumps(event, cls=DateTimeEncoder, indent=2)
print(json_str)
# {
#   "name": "Meeting",
#   "time": "2026-03-24T14:30:00"
# }
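
A subclass is not the only option. json.dumps also accepts a default function, and json.loads accepts an object_hook for converting values back. A minimal round-trip sketch, assuming the datetime always lives under a key named "time"; both helper names are made up for illustration:

```python
import json
from datetime import datetime

def encode_datetime(obj):
    """Fallback for json.dumps: called only for objects it cannot serialise."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Cannot serialise {type(obj).__name__}")

def decode_event(d):
    """object_hook: turn the 'time' field (an assumed key) back into a datetime."""
    if "time" in d:
        d["time"] = datetime.fromisoformat(d["time"])
    return d

event = {"name": "Meeting", "time": datetime(2026, 3, 24, 14, 30)}

encoded = json.dumps(event, default=encode_datetime)
decoded = json.loads(encoded, object_hook=decode_event)

print(encoded)           # {"name": "Meeting", "time": "2026-03-24T14:30:00"}
print(decoded == event)  # True
```

JSON itself has no date type, so the decoding side always needs to know which fields to convert.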

re — Regular Expressions

The re module provides pattern matching for strings.

import re

Basic Functions:

text = "My phone number is 123-456-7890 and email is alice@example.com"

# search: find first match anywhere in string
match = re.search(r"\d{3}-\d{3}-\d{4}", text)
if match:
    print(match.group())  # 123-456-7890

# match: match at the BEGINNING of string only
result = re.match(r"My", text)
print(result.group())  # My

# findall: find ALL matches (returns list of strings)
numbers = re.findall(r"\d+", text)
print(numbers)  # ['123', '456', '7890']

# findall with emails
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
print(emails)  # ['alice@example.com']

Substitution:

# sub: replace matches
text = "I have 2 cats and 3 dogs"
result = re.sub(r"\d+", "many", text)
print(result)  # I have many cats and many dogs

# Replace with a function
def double_number(match):
    return str(int(match.group()) * 2)

result = re.sub(r"\d+", double_number, text)
print(result)  # I have 4 cats and 6 dogs

Compiling Patterns (for repeated use):

# Compile a pattern for better performance when used multiple times
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

texts = [
    "Contact alice@example.com for info",
    "Send to bob@company.org please",
    "No email here",
]

for t in texts:
    found = email_pattern.findall(t)
    if found:
        print(f"Found: {found}")
# Found: ['alice@example.com']
# Found: ['bob@company.org']

Common Patterns:

| Pattern | Matches | Example |
| --- | --- | --- |
| \d | Any digit | 0, 9 |
| \w | Word character (letter, digit, _) | a, Z, _ |
| \s | Whitespace | space, tab, newline |
| . | Any character (except newline) | anything |
| ^ | Start of string | ^Hello |
| $ | End of string | world$ |
| * | 0 or more | ab* matches a, ab, abb |
| + | 1 or more | ab+ matches ab, abb |
| ? | 0 or 1 | ab? matches a, ab |
| {n} | Exactly n | \d{3} matches 123 |
| {n,m} | Between n and m | \d{2,4} matches 12, 1234 |
| [abc] | Any of a, b, c | character set |
| [^abc] | Not a, b, or c | negated set |
| (...) | Capture group | group matches |
| \| | Or | cat\|dog |
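
Capture groups appear in the table but not in the examples so far. The following sketch shows named groups pulling the parts out of a date; the pattern and sample text are made up for illustration:

```python
import re

# Named groups: (?P<name>...) labels each captured part
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

text = "Released on 2026-03-24, patched on 2026-04-02."

first = date_pattern.search(text)
print(first.group("year"))  # 2026
print(first.groups())       # ('2026', '03', '24')

# When the pattern contains groups, findall returns tuples of the groups
all_dates = date_pattern.findall(text)
print(all_dates)  # [('2026', '03', '24'), ('2026', '04', '02')]

# Groups can be referenced in the replacement string
reordered = date_pattern.sub(r"\g<day>/\g<month>/\g<year>", text)
print(reordered)  # Released on 24/03/2026, patched on 02/04/2026.
```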

pathlib — Object-Oriented Filesystem Paths

pathlib is the modern, recommended way to work with file paths in Python (preferred over os.path).

from pathlib import Path

Creating Paths:

# Current directory
cwd = Path.cwd()
print(cwd)  # /home/user/my_project

# Home directory
home = Path.home()
print(home)  # /home/user

# Create a path from a string
p = Path("/home/user/documents/report.txt")

# Join paths with /
data_dir = Path("project") / "data" / "raw"
print(data_dir)  # project/data/raw

config_file = Path.home() / ".config" / "myapp" / "settings.json"
print(config_file)  # /home/user/.config/myapp/settings.json

Path Properties:

p = Path("/home/user/documents/report.pdf")

print(p.name)       # report.pdf
print(p.stem)       # report (name without extension)
print(p.suffix)     # .pdf
print(p.parent)     # /home/user/documents
print(p.parts)      # ('/', 'home', 'user', 'documents', 'report.pdf')
print(p.anchor)     # /
print(p.is_absolute())  # True

Checking Existence:

p = Path("myfile.txt")
print(p.exists())      # True or False
print(p.is_file())     # True if it is a file
print(p.is_dir())      # True if it is a directory

Reading and Writing Files:

# Write text to a file
p = Path("output.txt")
p.write_text("Hello, World!\nLine 2\n")

# Read text from a file
content = p.read_text()
print(repr(content))  # 'Hello, World!\nLine 2\n'

# Write bytes
p.write_bytes(b"\x00\x01\x02")

# Read bytes
data = p.read_bytes()

Globbing (Finding Files by Pattern):

project = Path(".")

# Find all Python files in current directory
for py_file in project.glob("*.py"):
    print(py_file)

# Find all Python files recursively
for py_file in project.rglob("*.py"):
    print(py_file)

# Find all image files
for img in project.rglob("*.png"):
    print(img)

Creating Directories:

new_dir = Path("output") / "reports" / "2026"
new_dir.mkdir(parents=True, exist_ok=True)
# parents=True creates intermediate directories
# exist_ok=True does not raise an error if the directory already exists
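
The pieces above combine naturally. Here is a self-contained sketch that works inside a temporary directory, so it leaves nothing behind; the file and folder names are invented for the demo:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create a nested directory and a few files
    reports = root / "reports" / "2026"
    reports.mkdir(parents=True, exist_ok=True)
    (reports / "summary.txt").write_text("Q1 results\n", encoding="utf-8")
    (root / "notes.txt").write_text("todo\n", encoding="utf-8")
    (root / "script.py").write_text("print('hi')\n", encoding="utf-8")

    # rglob searches recursively; relative_to strips the temporary prefix
    txt_files = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))
    print(txt_files)  # ['notes.txt', 'reports/2026/summary.txt']

    content = (reports / "summary.txt").read_text(encoding="utf-8")
    print(content.strip())  # Q1 results
```

Passing encoding="utf-8" explicitly is a design choice here: it keeps the result the same regardless of the platform's default encoding.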

string — String Constants and Templates

The string module provides useful string constants and a template class.

import string

String Constants:

print(string.ascii_letters)    # abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.ascii_lowercase)  # abcdefghijklmnopqrstuvwxyz
print(string.ascii_uppercase)  # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.digits)           # 0123456789
print(string.hexdigits)        # 0123456789abcdefABCDEF
print(string.punctuation)      # !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
print(repr(string.whitespace)) # ' \t\n\r\x0b\x0c'
print(string.printable)        # All printable characters

Template Strings (Safe Substitution):

from string import Template

# Create a template
t = Template("Hello, $name! You have $count new messages.")
result = t.substitute(name="Alice", count=5)
print(result)  # Hello, Alice! You have 5 new messages.

# safe_substitute: does not raise error for missing keys
t2 = Template("$greeting, $name!")
result = t2.safe_substitute(greeting="Hi")
print(result)  # Hi, $name!  (missing key left as-is)

Practical Use: Generating Random Strings

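import string
import secrets

def generate_password(length=12):
    """Generate a random password."""
    chars = string.ascii_letters + string.digits + string.punctuation
    # secrets draws from the operating system's secure random source;
    # the random module is not suitable for passwords or tokens
    return "".join(secrets.choice(chars) for _ in range(length))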

print(generate_password())     # e.g., k9$Tz!mP@2xR
print(generate_password(20))   # e.g., aB3$kLm!Pq9@rT5&wX2z

copy — Shallow and Deep Copy

The copy module provides functions to duplicate objects.

import copy

Why Copying Matters:

# Assignment does NOT copy — both variables point to the same object
original = [1, 2, [3, 4]]
reference = original

reference[0] = 99
print(original)   # [99, 2, [3, 4]] — original changed too!

Shallow Copy (copy.copy):

import copy

original = [1, 2, [3, 4]]
shallow = copy.copy(original)

# Top-level changes are independent
shallow[0] = 99
print(original)   # [1, 2, [3, 4]] — not affected

# BUT nested objects are still shared!
shallow[2][0] = 99
print(original)   # [1, 2, [99, 4]] — nested list WAS affected!

Deep Copy (copy.deepcopy):

import copy

original = [1, 2, [3, 4]]
deep = copy.deepcopy(original)

# Everything is fully independent
deep[2][0] = 99
print(original)   # [1, 2, [3, 4]] — completely unaffected!
print(deep)       # [1, 2, [99, 4]]

When to Use Each:

| Scenario | Use |
| --- | --- |
| Simple flat list/dict | copy.copy() (shallow) |
| Nested structures | copy.deepcopy() (deep) |
| Immutable data (strings, tuples of ints) | No copy needed |
| Performance-critical, large data | copy.copy() if possible |
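
One way to check which kind of copy you have is the is operator, which tests whether two names refer to the very same object. A short sketch with made-up data:

```python
import copy

config = {"name": "demo", "tags": ["a", "b"]}  # hypothetical nested data

shallow = copy.copy(config)
deep = copy.deepcopy(config)

print(shallow is config)                  # False: a new outer dict
print(shallow["tags"] is config["tags"])  # True: the inner list is shared
print(deep["tags"] is config["tags"])     # False: the inner list was duplicated

config["tags"].append("c")
print(shallow["tags"])  # ['a', 'b', 'c']
print(deep["tags"])     # ['a', 'b']
```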

typing — Type Hints

The typing module provides tools for type annotations, which help with code clarity and IDE support.

from typing import List, Dict, Tuple, Optional, Union, Any, Callable

Basic Type Hints:

# Variable annotations
name: str = "Alice"
age: int = 30
score: float = 95.5
active: bool = True

# Function annotations
def greet(name: str) -> str:
    return f"Hello, {name}!"

def add(a: int, b: int) -> int:
    return a + b

Collection Types:

from typing import List, Dict, Tuple, Set

# List of integers
scores: List[int] = [95, 87, 92, 78]

# Dictionary with string keys and int values
age_map: Dict[str, int] = {"Alice": 30, "Bob": 25}

# Tuple with specific types
coordinate: Tuple[float, float] = (3.5, 7.2)

# Set of strings
tags: Set[str] = {"python", "coding", "tutorial"}

# Note: In Python 3.9+, you can use built-in types directly:
# scores: list[int] = [95, 87, 92, 78]
# age_map: dict[str, int] = {"Alice": 30, "Bob": 25}

Optional and Union:

from typing import Optional, Union

# Optional: value can be the given type or None
def find_user(user_id: int) -> Optional[str]:
    """Return username or None if not found."""
    users = {1: "Alice", 2: "Bob"}
    return users.get(user_id)

# Union: value can be one of several types
def process(value: Union[int, str]) -> str:
    return str(value)

# Python 3.10+ allows: int | str instead of Union[int, str]

Callable:

from typing import Callable

# A function that takes a function as an argument
def apply_operation(x: int, y: int, func: Callable[[int, int], int]) -> int:
    return func(x, y)

result = apply_operation(5, 3, lambda a, b: a + b)
print(result)  # 8

Any:

from typing import Any

def log_value(value: Any) -> None:
    """Accept any type of value."""
    print(f"Value: {value}, Type: {type(value).__name__}")

Type hints are not enforced at runtime — they are documentation and tooling aids. Use tools like mypy to check types statically.
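
A quick way to convince yourself of this is to call an annotated function with the "wrong" types. The sketch below also uses typing.get_type_hints to show that the hints are simply stored as metadata:

```python
from typing import get_type_hints

def add(a: int, b: int) -> int:
    return a + b

# Python runs this without complaint, even though the hints say int
joined = add("type ", "hints")
print(joined)  # type hints

# The hints are metadata that tools (and your own code) can inspect
hints = get_type_hints(add)
print(hints)  # {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}
```

A static checker such as mypy would flag the add("type ", "hints") call before the program ever runs.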


Installing Third-Party Packages

Python's standard library is extensive, but the real power lies in the hundreds of thousands of third-party packages available on PyPI (Python Package Index).

pip — The Package Installer

pip is Python's standard package installer and ships with most Python installations.

# Install a package
pip install requests

# Install a specific version
pip install pandas==2.1.0

# Install minimum version (quote it so the shell does not treat > as redirection)
pip install "numpy>=1.24.0"

# Install multiple packages at once
pip install requests pandas numpy matplotlib

# Upgrade a package to the latest version
pip install --upgrade requests

# Uninstall a package
pip uninstall requests

Managing Dependencies

# List all installed packages
pip list

# Show details about a specific package
pip show requests

# Save current environment to requirements.txt
pip freeze > requirements.txt

# Install all packages from requirements.txt
pip install -r requirements.txt
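
The same information is available from inside Python. A minimal sketch using the standard library's importlib.metadata (Python 3.8+); installed_version is a hypothetical helper and the last package name is a deliberately fake placeholder:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

for name in ["pip", "requests", "surely-not-a-real-package-xyz"]:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

The output depends on your environment, which is the point: it reports what is installed for the interpreter that runs the script.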

requirements.txt Format

# requirements.txt
requests==2.31.0
pandas>=2.1.0,<3.0.0
numpy~=1.24.0
flask
python-dotenv>=1.0

| Syntax | Meaning |
| --- | --- |
| ==2.31.0 | Exact version |
| >=2.1.0 | Minimum version |
| <=3.0.0 | Maximum version |
| >=2.1.0,<3.0.0 | Version range |
| ~=1.24.0 | Compatible release (>=1.24.0, <1.25.0) |
| flask | Any version (latest) |

Editable Install

For developing your own packages:

# Install in "editable" mode — changes to source are immediately reflected
pip install -e .

# Install with optional development dependencies
pip install -e ".[dev]"

Virtual Environments

Every Python project should use a virtual environment to isolate its dependencies.

Why Virtual Environments?

Without virtual environments:

  • Project A needs requests==2.25.0
  • Project B needs requests==2.31.0
  • Only one version can exist system-wide — one project breaks.

A virtual environment gives each project its own isolated package directory, along with its own python and pip commands, so each project can pin the versions it needs.

Creating and Using a Virtual Environment

# Create a virtual environment named "venv"
python -m venv venv

# Activate it
# macOS / Linux:
source venv/bin/activate

# Windows:
venv\Scripts\activate

# Your prompt changes to show the active env:
# (venv) $

# Now pip installs go into this venv only
pip install requests pandas

# Verify: packages are isolated
pip list
# Shows only packages installed in this venv

# When done, deactivate
deactivate
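
If you are ever unsure whether a script is running inside a virtual environment, Python can tell you. A small sketch, assuming the environment was created with venv; in_virtualenv is an illustrative name:

```python
import sys

def in_virtualenv():
    """Return True when running inside an environment created by venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtualenv()}")
```

Printing sys.executable is also a handy way to confirm which interpreter your editor or terminal is really using.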

Best Practices for Virtual Environments

Always add venv/ to .gitignore:

# .gitignore
venv/
.venv/
env/
__pycache__/
*.pyc

Use requirements.txt to share dependencies (not the venv itself):

# Developer A creates the requirements file
pip freeze > requirements.txt
git add requirements.txt
git commit -m "Add project dependencies"

# Developer B recreates the environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Naming conventions:

  • venv or .venv are the most common names
  • .venv (with dot) keeps it hidden on Unix systems

Here is a curated list of widely-used Python packages every developer should know about:

| Package | Category | Description | Install |
| --- | --- | --- | --- |
| requests | HTTP | Simple HTTP requests | pip install requests |
| httpx | HTTP | Async-capable HTTP client | pip install httpx |
| pandas | Data | Data manipulation and analysis | pip install pandas |
| numpy | Math | Numerical computing, arrays | pip install numpy |
| matplotlib | Visualisation | Plotting and charts | pip install matplotlib |
| flask | Web | Lightweight web framework | pip install flask |
| fastapi | Web | Modern async web framework | pip install fastapi |
| django | Web | Full-featured web framework | pip install django |
| sqlalchemy | Database | SQL toolkit and ORM | pip install sqlalchemy |
| pytest | Testing | Testing framework | pip install pytest |
| beautifulsoup4 | Scraping | HTML/XML parsing | pip install beautifulsoup4 |
| selenium | Automation | Browser automation | pip install selenium |
| pillow | Images | Image processing | pip install pillow |
| click | CLI | Command-line interfaces | pip install click |
| rich | CLI | Rich terminal formatting | pip install rich |

Quick Examples

requests — Making HTTP Requests:

import requests

response = requests.get("https://api.github.com/users/octocat")
print(response.status_code)   # 200
data = response.json()
print(data["name"])            # The Octocat
print(data["public_repos"])    # 8

pytest — Writing Tests:

# test_calculator.py
def add(a, b):
    return a + b

def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
    assert add(0, 0) == 0

# Run with: pytest test_calculator.py

Package Distribution Basics

When you want to share your Python code as an installable package, you need a project configuration file.

pyproject.toml (Modern Standard)

The modern way to define a Python project:

# pyproject.toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "my-awesome-package"
version = "0.1.0"
description = "A short description of your package"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.8"
authors = [
    {name = "Your Name", email = "you@example.com"}
]
dependencies = [
    "requests>=2.25.0",
    "click>=8.0",
]

[project.optional-dependencies]
dev = ["pytest", "mypy", "black"]

setup.py (Legacy but Still Common)

# setup.py
from setuptools import setup, find_packages

setup(
    name="my-awesome-package",
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "requests>=2.25.0",
        "click>=8.0",
    ],
)

Version Management

A common pattern is to define the version in one place and read it from __init__.py:

# my_package/__init__.py
__version__ = "0.1.0"

Publishing to PyPI (Brief Overview)

# 1. Build the package
pip install build
python -m build

# 2. Upload to PyPI (requires an account at pypi.org)
pip install twine
twine upload dist/*

# 3. Now anyone can install it!
# pip install my-awesome-package

Practical Examples

Let us put everything together with practical, real-world examples.

Example 1: Building a Utility Package

Create a package with string, math, and file helpers:

Project structure:

my_utils/
├── __init__.py
├── string_helpers.py
├── math_helpers.py
└── file_helpers.py

string_helpers.py:

# my_utils/string_helpers.py

import re

def slugify(text):
    """Convert text to URL-friendly slug.

    >>> slugify("Hello World!")
    'hello-world'
    """
    text = text.lower().strip()
    text = re.sub(r"[^\w\s-]", "", text)
    text = re.sub(r"[\s_]+", "-", text)
    text = re.sub(r"-+", "-", text)
    return text.strip("-")

def truncate(text, max_length=50, suffix="..."):
    """Truncate text to max_length, adding suffix if truncated.

    >>> truncate("Hello, World!", max_length=8)
    'Hello...'
    """
    if len(text) <= max_length:
        return text
    return text[:max_length - len(suffix)] + suffix

def word_count(text):
    """Count words in text.

    >>> word_count("Hello beautiful world")
    3
    """
    return len(text.split())

def title_case(text):
    """Convert text to title case, handling small words.

    >>> title_case("the quick brown fox")
    'The Quick Brown Fox'
    """
    small_words = {"a", "an", "the", "and", "but", "or", "for", "in", "on", "at", "to"}
    words = text.split()
    result = []
    for i, word in enumerate(words):
        if i == 0 or word.lower() not in small_words:
            result.append(word.capitalize())
        else:
            result.append(word.lower())
    return " ".join(result)

math_helpers.py:

# my_utils/math_helpers.py

def clamp(value, min_val, max_val):
    """Restrict value to the range [min_val, max_val].

    >>> clamp(15, 0, 10)
    10
    >>> clamp(-5, 0, 10)
    0
    """
    return max(min_val, min(value, max_val))

def percentage(part, whole):
    """Calculate percentage.

    >>> percentage(25, 200)
    12.5
    """
    if whole == 0:
        return 0.0
    return (part / whole) * 100

def average(numbers):
    """Calculate the arithmetic mean.

    >>> average([10, 20, 30])
    20.0
    """
    if not numbers:
        return 0.0
    return sum(numbers) / len(numbers)

def is_prime(n):
    """Check if a number is prime.

    >>> is_prime(17)
    True
    >>> is_prime(4)
    False
    """
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

file_helpers.py:

# my_utils/file_helpers.py

from pathlib import Path

def read_lines(filepath):
    """Read a file and return a list of stripped lines."""
    return Path(filepath).read_text().strip().splitlines()

def write_lines(filepath, lines):
    """Write a list of strings to a file, one per line."""
    Path(filepath).write_text("\n".join(lines) + "\n")

def file_size_human(filepath):
    """Return human-readable file size.

    >>> file_size_human("small_file.txt")  # If file is 1536 bytes
    '1.50 KB'
    """
    size = Path(filepath).stat().st_size
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if size < 1024:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} PB"

def ensure_directory(path):
    """Create a directory (and parents) if it does not exist."""
    Path(path).mkdir(parents=True, exist_ok=True)

__init__.py:

# my_utils/__init__.py

from .string_helpers import slugify, truncate, word_count, title_case
from .math_helpers import clamp, percentage, average, is_prime
from .file_helpers import read_lines, write_lines, file_size_human, ensure_directory

__version__ = "1.0.0"
__all__ = [
    "slugify", "truncate", "word_count", "title_case",
    "clamp", "percentage", "average", "is_prime",
    "read_lines", "write_lines", "file_size_human", "ensure_directory",
]

Using the Package:

# main.py
from my_utils import slugify, is_prime, average, ensure_directory

print(slugify("Hello World! This is Great"))  # hello-world-this-is-great
print(is_prime(17))                            # True
print(average([85, 90, 78, 92, 88]))          # 86.6
ensure_directory("output/reports")

Example 2: Password Generator (Using Standard Library)

# password_generator.py
import random
import string
import math

def generate_password(
    length=16,
    use_uppercase=True,
    use_digits=True,
    use_symbols=True,
    exclude_chars="",
):
    """Generate a secure random password."""
    chars = string.ascii_lowercase

    if use_uppercase:
        chars += string.ascii_uppercase
    if use_digits:
        chars += string.digits
    if use_symbols:
        chars += string.punctuation

    # Remove excluded characters
    for ch in exclude_chars:
        chars = chars.replace(ch, "")

    if not chars:
        raise ValueError("No characters available for password generation")

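    # SystemRandom draws from the operating system's secure random source,
    # which is what password generation calls for
    rng = random.SystemRandom()
    password = "".join(rng.choice(chars) for _ in range(length))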
    return password

def password_strength(password):
    """Estimate password strength."""
    charset_size = 0
    if any(c in string.ascii_lowercase for c in password):
        charset_size += 26
    if any(c in string.ascii_uppercase for c in password):
        charset_size += 26
    if any(c in string.digits for c in password):
        charset_size += 10
    if any(c in string.punctuation for c in password):
        charset_size += 32

    if charset_size == 0:
        return "Empty", 0

    entropy = len(password) * math.log2(charset_size)

    if entropy < 28:
        strength = "Very Weak"
    elif entropy < 36:
        strength = "Weak"
    elif entropy < 60:
        strength = "Moderate"
    elif entropy < 80:
        strength = "Strong"
    else:
        strength = "Very Strong"

    return strength, round(entropy, 1)

if __name__ == "__main__":
    print("=== Password Generator ===\n")

    for length in [8, 12, 16, 24]:
        pw = generate_password(length=length)
        strength, entropy = password_strength(pw)
        print(f"Length {length:2d}: {pw}")
        print(f"          Strength: {strength} ({entropy} bits of entropy)\n")

    # Generate passwords without confusing characters
    pw = generate_password(length=16, exclude_chars="0OIl1|")
    print(f"Easy-to-read: {pw}")

Example 3: Date Calculator

# date_calculator.py
from datetime import datetime, date, timedelta
import calendar

def days_between(date1_str, date2_str, fmt="%Y-%m-%d"):
    """Calculate the number of days between two date strings."""
    d1 = datetime.strptime(date1_str, fmt)
    d2 = datetime.strptime(date2_str, fmt)
    diff = abs((d2 - d1).days)
    return diff

def days_until(target_str, fmt="%Y-%m-%d"):
    """Calculate days from today until a target date."""
    target = datetime.strptime(target_str, fmt).date()
    today = date.today()
    diff = (target - today).days
    return diff

def add_business_days(start_date, num_days):
    """Add business days (skipping weekends) to a date."""
    current = start_date
    added = 0
    while added < num_days:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 to Friday=4
            added += 1
    return current

def age_calculator(birth_date_str, fmt="%Y-%m-%d"):
    """Calculate age in years, months, and days."""
    birth = datetime.strptime(birth_date_str, fmt).date()
    today = date.today()

    years = today.year - birth.year
    months = today.month - birth.month
    days = today.day - birth.day

    if days < 0:
        months -= 1
        # Get days in the previous month
        prev_month = today.month - 1 if today.month > 1 else 12
        prev_year = today.year if today.month > 1 else today.year - 1
        days += calendar.monthrange(prev_year, prev_month)[1]

    if months < 0:
        years -= 1
        months += 12

    return years, months, days

if __name__ == "__main__":
    print("=== Date Calculator ===\n")

    # Days between two dates
    d = days_between("2026-01-01", "2026-12-31")
    print(f"Days from Jan 1 to Dec 31, 2026: {d}")  # 364

    # Days until the end of 2026
    until_ny = days_until("2026-12-31")
    print(f"Days until Dec 31, 2026: {until_ny}")

    # Business days
    start = date.today()
    deadline = add_business_days(start, 10)
    print(f"10 business days from today: {deadline.strftime('%Y-%m-%d (%A)')}")

    # Age calculator
    years, months, days = age_calculator("1995-08-15")
    print(f"Age (born 1995-08-15): {years} years, {months} months, {days} days")

Example 4: File Organiser Script

# file_organiser.py
"""Organise files in a directory by extension."""

import os
import shutil
from pathlib import Path
from collections import defaultdict, Counter
from datetime import datetime

# Map extensions to folder names
EXTENSION_MAP = {
    # Images
    ".jpg": "Images", ".jpeg": "Images", ".png": "Images",
    ".gif": "Images", ".bmp": "Images", ".svg": "Images", ".webp": "Images",
    # Documents
    ".pdf": "Documents", ".doc": "Documents", ".docx": "Documents",
    ".txt": "Documents", ".rtf": "Documents", ".odt": "Documents",
    # Spreadsheets
    ".xls": "Spreadsheets", ".xlsx": "Spreadsheets", ".csv": "Spreadsheets",
    # Videos
    ".mp4": "Videos", ".avi": "Videos", ".mkv": "Videos", ".mov": "Videos",
    # Audio
    ".mp3": "Audio", ".wav": "Audio", ".flac": "Audio", ".aac": "Audio",
    # Archives
    ".zip": "Archives", ".tar": "Archives", ".gz": "Archives", ".rar": "Archives",
    # Code
    ".py": "Code", ".js": "Code", ".html": "Code", ".css": "Code",
    ".java": "Code", ".cpp": "Code", ".c": "Code", ".rs": "Code",
}

def organise_directory(source_dir, dry_run=True):
    """Organise files in a directory by their extension.

    Args:
        source_dir: Path to the directory to organise.
        dry_run: If True, only prints what would happen without moving files.

    Returns:
        Dictionary mapping category to list of moved files.
    """
    source = Path(source_dir)
    if not source.is_dir():
        print(f"Error: '{source_dir}' is not a valid directory")
        return {}

    moved = defaultdict(list)
    stats = Counter()

    for filepath in source.iterdir():
        # Skip directories and hidden files
        if filepath.is_dir() or filepath.name.startswith("."):
            continue

        ext = filepath.suffix.lower()
        category = EXTENSION_MAP.get(ext, "Other")

        dest_dir = source / category
        dest_file = dest_dir / filepath.name

        # Handle name conflicts
        if dest_file.exists():
            stem = filepath.stem
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            dest_file = dest_dir / f"{stem}_{timestamp}{ext}"

        if dry_run:
            print(f"  [DRY RUN] {filepath.name} -> {category}/")
        else:
            dest_dir.mkdir(exist_ok=True)
            shutil.move(str(filepath), str(dest_file))
            print(f"  Moved: {filepath.name} -> {category}/")

        moved[category].append(filepath.name)
        stats[category] += 1

    return dict(moved)

if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    dry = "--execute" not in sys.argv

    print(f"Organising: {os.path.abspath(target)}")
    if dry:
        print("(Dry run — add --execute to actually move files)\n")
    else:
        print("(EXECUTING — files will be moved)\n")

    results = organise_directory(target, dry_run=dry)

    print("\nSummary:")
    total = 0
    for category, files in sorted(results.items()):
        print(f"  {category}: {len(files)} files")
        total += len(files)
    print(f"  Total: {total} files")

Best Practices

1. Organise Imports Properly

Follow the standard import ordering convention (also enforced by tools like isort):

# 1. Standard library imports
import os
import sys
from datetime import datetime
from pathlib import Path

# 2. Third-party imports
import requests
import pandas as pd
from flask import Flask, jsonify

# 3. Local / project imports
from myapp.models import User
from myapp.utils import format_date

Each group should be separated by a blank line, and imports within each group should be alphabetically sorted.

2. Use __all__ to Control Exports

Define __all__ to explicitly declare what your module exports:

# mymodule.py

__all__ = ["public_function", "PublicClass"]

def public_function():
    """This will be exported."""
    pass

def _private_helper():
    """This will NOT be exported (leading underscore convention)."""
    pass

class PublicClass:
    """This will be exported."""
    pass

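When someone does from mymodule import *, only names listed in __all__ are imported. __all__ affects only the star form: a name left out of the list can still be imported explicitly with from mymodule import name.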
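
To see the effect of __all__ without creating files, the sketch below builds a small module in memory and star-imports it. The module name demo_module and its two functions are hypothetical and exist only for this illustration.

```python
import sys
import types

# Source for a throwaway module; only public_function is listed in __all__.
source = '''
__all__ = ["public_function"]

def public_function():
    return "public"

def unlisted_function():
    return "unlisted"
'''

# Build the module in memory and register it so import statements can find it.
demo = types.ModuleType("demo_module")  # hypothetical module name
exec(source, demo.__dict__)
sys.modules["demo_module"] = demo

# Run a star import into an empty namespace and see which names arrive.
namespace = {}
exec("from demo_module import *", namespace)
star_names = sorted(name for name in namespace if not name.startswith("__"))
print(star_names)  # ['public_function']

# An explicit import still works for a name that __all__ leaves out.
from demo_module import unlisted_function
print(unlisted_function())  # unlisted
```

In other words, __all__ documents the public interface and limits star imports, but it is not an access-control mechanism.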

3. Avoid Circular Imports

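Circular imports happen when two modules import each other. With module-level from ... import statements, whichever module is loaded second finds the first only partially initialised, and Python raises ImportError: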

# BAD: Circular import
# module_a.py
from module_b import function_b   # module_b imports module_a!

def function_a():
    return function_b()

# module_b.py
from module_a import function_a   # module_a imports module_b!

def function_b():
    return function_a()

Solutions:

  • Move the shared code into a third module
  • Use late imports (import inside the function)
  • Restructure your code

# Solution: late import
# module_a.py
def function_a():
    from module_b import function_b  # Import only when needed
    return function_b()
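
The first option in the list, moving shared code into a third module, might look like the sketch below. The names shared_demo, module_a_demo, module_b_demo, and format_name are hypothetical, and the files are written to a temporary directory only so the example can run on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Both modules depend on shared_demo, and neither imports the other.
files = {
    "shared_demo.py": (
        "def format_name(name):\n"
        "    return name.strip().title()\n"
    ),
    "module_a_demo.py": (
        "from shared_demo import format_name\n\n"
        "def function_a(name):\n"
        "    return 'A:' + format_name(name)\n"
    ),
    "module_b_demo.py": (
        "from shared_demo import format_name\n\n"
        "def function_b(name):\n"
        "    return 'B:' + format_name(name)\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for filename, source in files.items():
        Path(tmp, filename).write_text(source, encoding="utf-8")

    sys.path.insert(0, tmp)          # make the temporary modules importable
    importlib.invalidate_caches()
    try:
        module_a = importlib.import_module("module_a_demo")
        module_b = importlib.import_module("module_b_demo")
        result_a = module_a.function_a(" ada ")
        result_b = module_b.function_b("grace")
    finally:
        sys.path.remove(tmp)

print(result_a)  # A:Ada
print(result_b)  # B:Grace
```

Because the dependency now runs in one direction only, the order in which the modules are imported no longer matters.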

4. Prefer Absolute Imports

# GOOD: Absolute imports — clear and unambiguous
from myproject.utils.math_helpers import add
from myproject.models.user import User

# OK: Relative imports within a package (keep them short)
from .math_helpers import add      # Same package
from ..models.user import User     # Parent package

# AVOID: Deep relative imports
from ...core.base.mixins import LogMixin  # Hard to follow

Common Mistakes

1. Naming Files the Same as Standard Library Modules

This is one of the most common beginner mistakes:

# You create a file called "random.py"
# random.py
import random   # Imports THIS file (the script's directory is searched first), not the standard library!

print(random.randint(1, 10))  # AttributeError: your random.py defines no randint

Solution: Never name your files random.py, math.py, os.py, json.py, string.py, email.py, test.py, or any other standard library module name.

2. Circular Imports

As covered above, this happens when module A imports module B, and module B imports module A. The fix is to restructure your code or use late (inside-function) imports.

3. Forgetting __init__.py

While Python 3.3+ supports "namespace packages" without __init__.py, it is best practice to include one in every regular package:

# Without __init__.py, the directory is treated as a namespace package, and some tools may not discover it
my_package/
├── module_a.py
└── module_b.py

# With __init__.py — explicit, clear, reliable
my_package/
├── __init__.py
├── module_a.py
└── module_b.py
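
The difference is visible at runtime. The sketch below creates one directory without __init__.py and one with it, then imports both. The package names ns_demo_pkg and reg_demo_pkg are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A directory without __init__.py: importable, but only as a namespace package.
    Path(tmp, "ns_demo_pkg").mkdir()
    Path(tmp, "ns_demo_pkg", "module_a.py").write_text("VALUE = 1\n", encoding="utf-8")

    # A directory with __init__.py: a regular package.
    Path(tmp, "reg_demo_pkg").mkdir()
    Path(tmp, "reg_demo_pkg", "__init__.py").write_text("", encoding="utf-8")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        ns_pkg = importlib.import_module("ns_demo_pkg")
        reg_pkg = importlib.import_module("reg_demo_pkg")
        ns_file = getattr(ns_pkg, "__file__", None)
        reg_file = reg_pkg.__file__
    finally:
        sys.path.remove(tmp)

print(ns_file)                  # None: a namespace package has no file of its own
print(Path(reg_file).name)      # __init__.py
```

Code or tooling that relies on a package's __file__, or that looks for __init__.py to find packages, will treat the two directories differently.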

4. Import Side Effects

Avoid running significant code at module level. It executes the first time the module is imported in a process, whether or not the caller needs the result:

# BAD: Side effects on import
# config.py
import requests

# This HTTP request runs as soon as anything imports config, even if CONFIG is never used!
response = requests.get("https://api.example.com/config")
CONFIG = response.json()

# GOOD: Wrap side effects in functions
# config.py
import requests

_config_cache = None

def get_config():
    """Load config on first call, then cache it."""
    global _config_cache
    if _config_cache is None:
        response = requests.get("https://api.example.com/config", timeout=10)
        response.raise_for_status()  # Fail loudly on an HTTP error instead of caching it
        _config_cache = response.json()
    return _config_cache
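
The same load-once behaviour can also be written with functools.lru_cache from the standard library. In this sketch the network call is replaced by a hypothetical _fetch_config function that returns made-up values, so the example runs offline and the number of loads can be counted.

```python
from functools import lru_cache

fetch_count = 0  # counts how often the expensive load actually runs

def _fetch_config():
    """Hypothetical stand-in for the HTTP request in the example above."""
    global fetch_count
    fetch_count += 1
    return {"debug": False, "retries": 3}

@lru_cache(maxsize=None)
def get_config():
    """Load the config on the first call; later calls return the cached result."""
    return _fetch_config()

first = get_config()
second = get_config()

print(first is second)  # True
print(fetch_count)      # 1
```

Importing a module that defines get_config this way does no work until the first call, and the decorator provides get_config.cache_clear() if the config ever needs reloading.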

5. Using from module import * in Production Code

# BAD: Unclear where names come from, risk of collisions
from os import *     # e.g. os.open now shadows the built-in open()
from sys import *
from json import *

# GOOD: Explicit imports
from os import path, getcwd, listdir
from sys import argv, exit
from json import loads, dumps
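
The collision risk is easy to demonstrate. The sketch below runs the star import inside a separate namespace, so it does not disturb the rest of the script, and checks which open the name now refers to.

```python
import builtins
import os

# Run the star import in an isolated namespace.
namespace = {}
exec("from os import *", namespace)

# The name "open" now refers to os.open, not the built-in open().
shadowed = namespace["open"]
print(shadowed is os.open)        # True
print(shadowed is builtins.open)  # False
```

os.open takes flags and returns a raw file descriptor, so code that expects the built-in open() would break after such an import.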

Practice Exercises

Exercise 1: Module Creation

Create a module called temperature.py with functions:

  • celsius_to_fahrenheit(c) — convert Celsius to Fahrenheit
  • fahrenheit_to_celsius(f) — convert Fahrenheit to Celsius
  • celsius_to_kelvin(c) — convert Celsius to Kelvin
  • is_boiling(celsius) — return True if water boils at this temperature

Add a __name__ == "__main__" guard that tests each function.

# temperature.py
def celsius_to_fahrenheit(c):
    return (c * 9 / 5) + 32

def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9

def celsius_to_kelvin(c):
    return c + 273.15

def is_boiling(celsius):
    return celsius >= 100

if __name__ == "__main__":
    print(celsius_to_fahrenheit(100))   # 212.0
    print(fahrenheit_to_celsius(32))    # 0.0
    print(celsius_to_kelvin(0))         # 273.15
    print(is_boiling(100))              # True
    print(is_boiling(99))               # False

Exercise 2: Standard Library Exploration

Write a script that uses at least 5 standard library modules to:

  1. Generate 10 random integers between 1 and 100 (random)
  2. Calculate their mean and standard deviation (math)
  3. Save the results with a timestamp to a JSON file (json, datetime)
  4. Print the file size (os)

import random
import math
import json
from datetime import datetime
import os

# 1. Generate random numbers
numbers = [random.randint(1, 100) for _ in range(10)]

# 2. Calculate statistics (population standard deviation: divide by n, not n - 1)
mean = sum(numbers) / len(numbers)
variance = sum((x - mean) ** 2 for x in numbers) / len(numbers)
std_dev = math.sqrt(variance)

# 3. Save to JSON with timestamp
result = {
    "timestamp": datetime.now().isoformat(),
    "numbers": numbers,
    "mean": round(mean, 2),
    "std_dev": round(std_dev, 2),
}
filename = "stats_output.json"
with open(filename, "w") as f:
    json.dump(result, f, indent=2)

# 4. Print file size
size = os.path.getsize(filename)
print(f"Numbers: {numbers}")
print(f"Mean: {mean:.2f}, Std Dev: {std_dev:.2f}")
print(f"Saved to {filename} ({size} bytes)")
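
As a cross-check, the standard library's statistics module offers the same calculation ready-made. The sketch below uses a fixed sample list (an assumption, chosen so the result is repeatable) and compares the manual formula with statistics.pstdev, the population standard deviation.

```python
import math
import statistics

numbers = [4, 8, 15, 16, 23, 42]  # fixed sample instead of random values

# Manual calculation, as in the solution above
mean = sum(numbers) / len(numbers)
manual_std = math.sqrt(sum((x - mean) ** 2 for x in numbers) / len(numbers))

# Library calculation: pstdev divides by n, matching the manual formula
library_std = statistics.pstdev(numbers)

print(f"Mean: {mean:.2f}")
print(math.isclose(manual_std, library_std))  # True
```

statistics.stdev, by contrast, computes the sample standard deviation (dividing by n - 1) and gives a slightly larger value.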

Exercise 3: Package Builder

Create a package called texttools with the following structure:

texttools/
├── __init__.py
├── analysis.py   (word_count, char_count, sentence_count)
├── transform.py  (reverse, to_snake_case, to_camel_case)
└── validate.py   (is_email, is_url, is_phone)

Write the __init__.py to export all functions, then write a main.py that uses them.

Exercise 4: Collections Challenge

Given a list of student records, use collections to:

  1. Count how many students got each grade (use Counter)
  2. Group students by grade (use defaultdict)
  3. Find the top 3 most common grades (use Counter.most_common)

from collections import Counter, defaultdict

students = [
    ("Alice", "A"), ("Bob", "B"), ("Charlie", "A"),
    ("Diana", "C"), ("Eve", "A"), ("Frank", "B"),
    ("Grace", "A"), ("Hank", "B"), ("Ivy", "C"),
    ("Jack", "A"), ("Kate", "D"), ("Leo", "B"),
]

# 1. Count grades
grade_counts = Counter(grade for _, grade in students)
print(grade_counts)  # Counter({'A': 5, 'B': 4, 'C': 2, 'D': 1})

# 2. Group by grade
by_grade = defaultdict(list)
for name, grade in students:
    by_grade[grade].append(name)
for grade in sorted(by_grade):
    print(f"Grade {grade}: {by_grade[grade]}")

# 3. Top 3 grades
print(grade_counts.most_common(3))
# [('A', 5), ('B', 4), ('C', 2)]

Exercise 5: File Organiser Enhancement

Extend the file organiser example from this chapter:

  1. Add a --log flag that writes all moves to a log file
  2. Add support for organising by date (files modified this month, last month, older)
  3. Add a summary that shows total size moved per category

Exercise 6: Build a CLI Tool

Use sys.argv (or the argparse standard library module) to build a command-line tool that:

  1. Accepts a filename and an operation (--count-words, --count-lines, --find PATTERN)
  2. Reads the file and performs the operation
  3. Prints the result

# text_tool.py
import sys
import re

def count_words(text):
    return len(text.split())

def count_lines(text):
    return len(text.splitlines())

def find_pattern(text, pattern):
    matches = re.findall(pattern, text, re.IGNORECASE)
    return matches

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python text_tool.py <file> <--count-words|--count-lines|--find PATTERN>")
        sys.exit(1)

    filename = sys.argv[1]
    operation = sys.argv[2]

    with open(filename, "r", encoding="utf-8") as f:
        content = f.read()

    if operation == "--count-words":
        print(f"Words: {count_words(content)}")
    elif operation == "--count-lines":
        print(f"Lines: {count_lines(content)}")
    elif operation == "--find":
        if len(sys.argv) < 4:
            print("Error: --find requires a pattern")
            sys.exit(1)
        pattern = sys.argv[3]
        matches = find_pattern(content, pattern)
        print(f"Found {len(matches)} matches for '{pattern}':")
        for m in matches:
            print(f"  {m}")
    else:
        print(f"Unknown operation: {operation}")
        sys.exit(1)
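
The exercise also mentions argparse. A possible variant using it is sketched below; the function names build_parser and run are hypothetical. To keep the example self-contained it parses a hard-coded argument list and works on a sample string rather than reading a file.

```python
import argparse
import re

def build_parser():
    """Create a parser with one filename and exactly one required operation."""
    parser = argparse.ArgumentParser(description="Simple text tool")
    parser.add_argument("filename", help="file to read")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--count-words", action="store_true")
    group.add_argument("--count-lines", action="store_true")
    group.add_argument("--find", metavar="PATTERN")
    return parser

def run(args, text):
    """Apply the selected operation to text and return the result line."""
    if args.count_words:
        return f"Words: {len(text.split())}"
    if args.count_lines:
        return f"Lines: {len(text.splitlines())}"
    matches = re.findall(args.find, text, re.IGNORECASE)
    return f"Found {len(matches)} matches for '{args.find}'"

parser = build_parser()
sample = "Python is fun.\nI like python modules."

# In a real tool, call parser.parse_args() with no arguments and read args.filename.
args = parser.parse_args(["notes.txt", "--find", "py"])
print(run(args, sample))  # Found 2 matches for 'py'
```

argparse generates the usage message and the error for a missing or conflicting operation, which replaces the manual length checks on sys.argv.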

Summary

In this final chapter, you learned how Python organises and distributes code:

  • Modules are .py files that group related functions, classes, and variables
  • Importing gives you access to code from other modules using import, from ... import, and aliases
  • The __name__ guard lets you write code that runs only when a file is executed directly
  • Packages are directories of modules with an __init__.py file, supporting nested sub-packages
  • The standard library provides a vast collection of ready-to-use modules — from math and random to json, collections, itertools, pathlib, and many more
  • pip installs third-party packages from PyPI, and virtual environments keep project dependencies isolated
  • Best practices include organising imports, using __all__, and avoiding circular imports

Congratulations!

You have completed the entire Python tutorial series. You now have a solid foundation covering:

  1. Variables, data types, and operators
  2. Control flow (if/elif/else, loops)
  3. Data structures (lists, tuples, dictionaries, sets)
  4. Functions and scope
  5. String manipulation
  6. File handling
  7. Error handling and exceptions
  8. Object-oriented programming (classes, inheritance)
  9. List comprehensions and generators
  10. Decorators and closures
  11. Modules, packages, and the standard library

What to Build Next

The best way to solidify your knowledge is to build projects. Here are some ideas:

Project | Skills Practised
To-do list CLI app | File I/O, JSON, argparse, OOP
Web scraper | requests, beautifulsoup4, csv, file handling
Personal budget tracker | Classes, file I/O, datetime, collections
Quiz game | random, dictionaries, loops, file I/O
Weather app | requests, json, API interaction
URL shortener | flask/fastapi, hashlib, databases
Markdown to HTML converter | re, file I/O, pathlib
Chat bot | String processing, random, json

Advanced Topics to Explore

Once you are comfortable building projects, dive deeper into:

  • Asynchronous programming: asyncio, async/await
  • Testing: pytest, test-driven development (TDD)
  • Web development: Flask, FastAPI, Django
  • Data science: pandas, NumPy, matplotlib
  • Databases: SQLAlchemy, SQLite, PostgreSQL
  • APIs: Building and consuming REST APIs
  • Type checking: mypy for static analysis
  • Packaging: Publishing your own packages on PyPI
  • Design patterns: Singleton, Factory, Observer, Strategy
  • Concurrency: threading, multiprocessing, asyncio

Happy coding!