
Rich Console Output & Progress Bars for GIS CLIs

Adding terminal feedback to geospatial batch CLIs turns opaque multi-minute raster runs into observable, debuggable pipelines — this page is part of the CLI Architecture & Design Patterns guide.

Prerequisites

  • Python 3.9+ (required for stable concurrent.futures behavior and modern type hints)
  • rich>=13.0.0 — progress tracking, themed tables, ANSI color management, auto terminal detection
  • A CLI dispatch layer: use Argument Parsing with Typer for type-safe entry points that accept Path and EPSG arguments
  • Geospatial stack: rasterio, geopandas, or pyproj for real data operations
  • Terminal with ANSI support (Windows Terminal, iTerm2, GNOME Terminal, modern CI runners with TERM=xterm-256color)
pip install "rich>=13.0.0" "typer[all]" geopandas pyproj rasterio

Problem framing

Silent batch jobs fail operators. A reprojection loop that quietly skips 400 files due to a missing EPSG:3857 override looks identical to a successful run — until the downstream mosaic has holes. Without progress feedback you also lose throughput visibility: there is no way to know whether 10 000 GeoTIFFs will finish in two minutes or two hours. Rich Console Output solves both problems by exposing per-file state, elapsed time, and error counts in the terminal, while keeping stdout clean for machine-readable data.

Architecture overview

The diagram below shows how a Rich console layer sits between the CLI entry point and the core geoprocessing routines. Presentation logic is fully decoupled; the processing functions yield (Path, bool) tuples and never import rich.

[Figure: Rich console architecture for geospatial CLIs. Three layers, top to bottom: CLI entry point (Typer / Click; @app.command() clip(input_dir, output_dir, quiet)); Rich console layer (stderr, themed; Console(theme=…), Progress(transient=True), render_summary()); geoprocessing core (no rich imports; process_raster_batch() → Iterator[(Path, bool)]). Results flow into the summary table and then the exit code 0/1.]

Step-by-step implementation

Step 1 — Initialize the Console on stderr with a GIS theme

Rich’s Console object auto-detects terminal width, color depth, and encoding. Binding it to stderr separates diagnostic output from the data stream, so stdout remains pipe-safe.

from rich.console import Console
from rich.theme import Theme

gis_theme = Theme({
    "info":    "cyan",
    "warning": "yellow bold",
    "error":   "red bold",
    "crs":     "green",
    "path":    "dim",
})

console = Console(theme=gis_theme, stderr=True)

stderr=True is the single most important setting for GIS tooling. Pipelines like my_tool clip . output/ | ogr2ogr … will break if Rich writes progress text, tables, or ANSI escape sequences into the data stream on stdout.
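The split between the two streams can be illustrated with a minimal sketch. The function name emit_record and the JSON record layout are assumptions made for illustration only; the point is that the Rich console writes to stderr while stdout carries nothing but data.

```python
import json
import sys

from rich.console import Console

# Diagnostics go to stderr; stdout carries only machine-readable records.
console = Console(stderr=True)


def emit_record(path: str, epsg: int) -> None:
    """Write a human-readable note to stderr and one JSON line to stdout."""
    console.print(f"processed {path}")  # lands on stderr
    sys.stdout.write(json.dumps({"path": path, "epsg": epsg}) + "\n")  # lands on stdout
```

Calling emit_record("tile_001.tif", 3857) and piping the tool into another program should hand the consumer only the JSON line, while the note stays visible in the terminal.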

Step 2 — Build a Progress context with geospatial-aware columns

Pre-calculate file counts before entering the Progress context so the bar shows a real total rather than an indeterminate spinner. transient=True clears the progress display when the Progress context exits, which keeps the scrollback tidy when you chain multiple reprojection, clip, and validation stages.

from rich.progress import (
    Progress, SpinnerColumn, TextColumn,
    BarColumn, TaskProgressColumn, TimeRemainingColumn,
    MofNCompleteColumn,
)

def make_progress() -> Progress:
    return Progress(
        SpinnerColumn(),
        TextColumn("[progress.description]{task.description}"),
        BarColumn(bar_width=40),
        MofNCompleteColumn(),       # shows "42/1000" alongside the bar
        TaskProgressColumn(),
        TimeRemainingColumn(),
        console=console,
        transient=True,
    )

MofNCompleteColumn is particularly useful for raster batches where each file may take a different amount of time — seeing “42/1000 files” is more actionable than a percentage alone.
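The pre-counting advice at the start of this step can be sketched with the standard library alone. The helper name discover_rasters and the default pattern tuple are assumptions, not part of the original tool; the returned list gives both the work items and the total for add_task.

```python
from pathlib import Path
from typing import Iterable


def discover_rasters(
    input_dir: Path,
    patterns: Iterable[str] = ("*.tif", "*.tiff"),
) -> list[Path]:
    """Return a sorted, de-duplicated list of files matching any pattern."""
    found: set[Path] = set()
    for pattern in patterns:
        # Skip directories that happen to match the pattern
        found.update(p for p in input_dir.glob(pattern) if p.is_file())
    return sorted(found)
```

With files = discover_rasters(Path("raw")), the value len(files) can be passed as total= when the task is registered, so the bar and the M-of-N column start from a known count.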

Step 3 — Bind progress tasks to raster I/O iterators

Keep the processing function ignorant of Rich. Accept a Progress instance as a parameter and call progress.advance() inside the loop. This makes unit-testing the geoprocessing logic straightforward — pass a no-op progress object in tests.

from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING, Iterator

import rasterio
import rasterio.warp as warp
from rasterio.crs import CRS

if TYPE_CHECKING:
    from rich.progress import Progress  # annotation only, so Rich is not imported at runtime


def process_raster_batch(
    input_dir: Path,
    output_dir: Path,
    target_epsg: int,
    progress: Progress,
) -> Iterator[tuple[Path, bool]]:
    """Reproject every GeoTIFF in input_dir to target_epsg, yield (path, success)."""
    files = sorted(input_dir.glob("*.tif"))
    task_id = progress.add_task(
        description=f"[crs]Reprojecting → EPSG:{target_epsg}",
        total=len(files),
    )
    target_crs = CRS.from_epsg(target_epsg)

    for src_path in files:
        dest_path = output_dir / f"epsg{target_epsg}_{src_path.name}"
        ok = True
        try:
            with rasterio.open(src_path) as src:
                if src.crs is None:
                    # Fail early with a readable message instead of a PROJ error
                    raise ValueError("source has no CRS")
                if src.crs == target_crs:
                    # CRS already matches: copy without transformation
                    dest_path.write_bytes(src_path.read_bytes())
                else:
                    transform, width, height = warp.calculate_default_transform(
                        src.crs, target_crs, src.width, src.height, *src.bounds
                    )
                    profile = src.profile.copy()
                    profile.update(crs=target_crs, transform=transform,
                                   width=width, height=height)
                    with rasterio.open(dest_path, "w", **profile) as dst:
                        for band_idx in src.indexes:
                            warp.reproject(
                                source=rasterio.band(src, band_idx),
                                destination=rasterio.band(dst, band_idx),
                                src_crs=src.crs,
                                dst_crs=target_crs,
                                resampling=warp.Resampling.lanczos,
                            )
        except Exception as exc:
            ok = False
            # Log through the injected Progress, not a module-level console
            progress.console.log(f"[error]FAIL[/error] [path]{src_path.name}[/path]: {exc}")
        # Advance and yield outside the try block so an exception thrown into
        # the generator by its consumer is not recorded as a raster failure
        progress.advance(task_id)
        yield src_path, ok
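For unit tests, the no-op progress object mentioned above might look like the sketch below. NullProgress and NullConsole are hypothetical names; they only mimic the handful of methods the batch function touches and make no claim to cover the full Rich API.

```python
from typing import Any, Optional


class NullConsole:
    """Hypothetical stand-in that swallows log and print calls."""

    def log(self, *args: Any, **kwargs: Any) -> None:
        pass

    def print(self, *args: Any, **kwargs: Any) -> None:
        pass


class NullProgress:
    """Hypothetical no-op object with the Progress methods a batch loop uses."""

    def __init__(self) -> None:
        self.console = NullConsole()
        self.advanced = 0.0   # lets a test assert how many steps were reported
        self._next_id = 0

    def add_task(self, description: str, total: Optional[float] = None, **kwargs: Any) -> int:
        task_id = self._next_id
        self._next_id += 1
        return task_id

    def advance(self, task_id: int, advance: float = 1) -> None:
        self.advanced += advance

    def update(self, task_id: int, **kwargs: Any) -> None:
        pass
```

A test can then pass NullProgress() as the progress argument and assert on the yielded (path, success) tuples and on the advanced counter, without any terminal rendering involved.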

progress.advance() is thread-safe, so you can move the per-file work into a concurrent.futures.ThreadPoolExecutor and call it from worker threads without adding locks around the Rich calls.
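One way to apply this is sketched below, under the assumption that the per-file work has been factored into a worker(path) callable that returns True on success. The names run_threaded and worker are illustrative, not part of the code above.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable

from rich.progress import Progress


def run_threaded(
    files: list[Path],
    worker: Callable[[Path], bool],
    progress: Progress,
    max_workers: int = 4,
) -> list[tuple[Path, bool]]:
    """Run worker(path) in a thread pool, advancing one shared task per file."""
    task_id = progress.add_task("Processing", total=len(files))

    def _tracked(path: Path) -> bool:
        try:
            return worker(path)
        except Exception:
            return False  # a raised exception counts as a failed file
        finally:
            # Called from the worker thread; no extra lock is added here
            progress.advance(task_id)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(_tracked, files))  # map preserves input order
    return list(zip(files, outcomes))
```

Because pool.map returns results in input order, the output lines up with the sorted file list, which keeps the later summary table stable between runs.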

Step 4 — Render a structured summary table after the batch

Never halt the pipeline on the first error. Collect (Path, bool) results and render a rich.table.Table when the batch completes. This gives operators a single actionable view of what succeeded and what needs investigation.

from rich.table import Table
from rich.panel import Panel

def render_summary(results: list[tuple[Path, bool]]) -> int:
    """Print a summary table and return exit code (0 = all OK, 1 = some failed)."""
    table = Table(title="Batch Reprojection Summary", show_lines=False)
    table.add_column("File",   style="path",  no_wrap=True, max_width=50)
    table.add_column("Status", justify="center", width=8)

    failures = 0
    for path, ok in results:
        table.add_row(path.name, "[info]OK[/info]" if ok else "[error]FAIL[/error]")
        if not ok:
            failures += 1

    console.print(table)
    console.print(Panel(
        f"[info]Total:[/info] {len(results)}  "
        f"[crs]OK:[/crs] {len(results) - failures}  "
        f"[error]Failed:[/error] {failures}",
        border_style="cyan",
    ))
    return 1 if failures else 0
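To check a summary renderer in a unit test, one option is to point a Console at an in-memory buffer instead of the terminal. The sketch below uses a trimmed, hypothetical variant named summary_text rather than render_summary itself, and it drops the theme so it stays self-contained.

```python
import io
from pathlib import Path

from rich.console import Console
from rich.table import Table


def summary_text(results: list[tuple[Path, bool]], width: int = 80) -> str:
    """Render a two-column status table to plain text in memory."""
    buffer = io.StringIO()
    # color_system=None and force_terminal=False keep the output free of ANSI codes
    console = Console(file=buffer, width=width, force_terminal=False, color_system=None)
    table = Table(title="Batch Reprojection Summary")
    table.add_column("File", no_wrap=True)
    table.add_column("Status", justify="center")
    for path, ok in results:
        table.add_row(path.name, "OK" if ok else "FAIL")
    console.print(table)
    return buffer.getvalue()
```

A test can then assert that a failed file name and the word FAIL both appear in the returned string, without depending on terminal width or color support.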

For richer CRS-aware column formatting — displaying EPSG authority codes, axis order badges, and projection type labels — see Customizing Rich tables for coordinate system outputs.

Step 5 — Wire the console layer into the Typer command

Pass Console and Progress instances from the command handler down into the processing functions. Never import console as a module-level global inside geoprocessing modules.

import typer
from pathlib import Path
from rich.progress import Progress

app = typer.Typer()

@app.command()
def reproject(
    input_dir:   Path = typer.Argument(..., help="Directory of source GeoTIFFs"),
    output_dir:  Path = typer.Argument(..., help="Destination directory"),
    epsg:        int  = typer.Option(3857,  "--epsg",  help="Target EPSG code"),
    quiet:       bool = typer.Option(False, "--quiet", "-q", help="Suppress progress"),
) -> None:
    output_dir.mkdir(parents=True, exist_ok=True)

    # console, make_progress, process_raster_batch and render_summary come from Steps 1-4.
    # disable=True hides the bar but leaves progress.console usable for error logs.
    progress = Progress(console=console, disable=True) if quiet else make_progress()
    with progress:
        results = list(process_raster_batch(input_dir, output_dir, epsg, progress))

    exit_code = render_summary(results)
    raise typer.Exit(code=exit_code)

Exit code 0 means all files succeeded; exit code 1 means at least one file failed. This follows POSIX conventions and allows CI pipelines to treat partial failures as build errors. For command dispatch patterns that structure multiple subcommands with shared option groups, see CLI Subcommand Organization.

Configuration integration

Rich console behavior should be controllable through the same layered config stack used for the rest of the CLI: environment variables override file-based config, and explicit flags override both.

import os
from typing import Optional

def build_console(quiet_flag: bool = False) -> Console:
    """
    Quiet precedence: --quiet flag > QUIET_MODE=1 env var > default (not quiet).
    Set FORCE_COLOR=1 to force terminal mode on CI runners that support ANSI
    (e.g. GitHub Actions); otherwise Rich's own TTY detection decides.
    """
    force_terminal: Optional[bool] = (
        True
        if os.environ.get("FORCE_COLOR") == "1"
        or os.environ.get("TERM_PROGRAM") in {"iTerm.app", "vscode"}
        else None  # None means auto-detect; False would switch terminal output off
    )
    quiet = quiet_flag or os.environ.get("QUIET_MODE") == "1"
    return Console(
        theme=gis_theme,
        stderr=True,
        force_terminal=force_terminal,
        quiet=quiet,
    )
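The layered precedence described above can also be written as a small pure function, which is easier to test than a Console factory. This is a sketch only: resolve_quiet and the quiet key in the file config are hypothetical names, and the flag is modeled as None when the user did not pass it.

```python
from typing import Mapping, Optional


def resolve_quiet(
    cli_flag: Optional[bool],
    environ: Mapping[str, str],
    file_config: Mapping[str, object],
) -> bool:
    """Return the effective quiet setting: flag > environment > config file > False."""
    if cli_flag is not None:
        return cli_flag                       # explicit flag wins over everything
    if "QUIET_MODE" in environ:
        return environ["QUIET_MODE"] == "1"   # environment overrides file config
    return bool(file_config.get("quiet", False))
```

For example, resolve_quiet(None, os.environ, config) falls through to the environment and then the file, while resolve_quiet(False, os.environ, config) keeps progress visible even when QUIET_MODE=1 is set.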

For layered YAML/TOML configuration that governs defaults like default_epsg, log_level, and output_format, Managing YAML configs for geospatial CLI workflows shows how to merge file, environment, and flag layers into a single config object that your console layer can read at startup.

Data-flow diagram: the progress lifecycle

[Figure: Rich Progress lifecycle for a geospatial batch job. Five stages: (1) file discovery, files = list(dir.glob("*.tif")); (2) register task, progress.add_task(total=len(files)); (3) processing loop, rasterio.open → warp.reproject, with progress.advance(task_id) once per file and yield (path, True) on success; (4) exception branch, console.log then yield (path, False); (5) render_summary(results) builds a rich.table.Table and Panel, then raise typer.Exit(code=0 | 1).]

Error handling and gotchas

CRS mismatch not caught at open time. rasterio.open() succeeds even when the on-disk CRS is missing or inconsistent. Check src.crs is not None before passing to warp.calculate_default_transform, and emit a [warning] console message rather than letting the warp call raise a cryptic PROJ exception.
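A guard for the missing-CRS case could be as small as the sketch below. has_usable_crs is a hypothetical helper; it takes the logging callable as a parameter so the geoprocessing code does not need its own console, and it assumes the surrounding theme defines a warning style.

```python
from typing import Any, Callable, Optional


def has_usable_crs(crs: Optional[Any], name: str, warn: Callable[[str], None]) -> bool:
    """Return False and emit a warning when a dataset carries no CRS."""
    if crs is None:
        warn(f"[warning]SKIP[/warning] {name}: dataset has no CRS")
        return False
    return True
```

Inside the processing loop this might be called as has_usable_crs(src.crs, src_path.name, progress.console.log) before any transform is calculated, with the file recorded as failed when it returns False.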

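Thread safety in Progress. Progress.advance() and Progress.update() are safe to call from threads. They cannot be called from subprocesses: multiprocessing.Pool workers run in separate memory spaces and have no access to the parent’s Progress. Send (task_id, increment) tuples back to the main process through a queue, and have a listener thread there call progress.advance(). With a Pool, use a multiprocessing.Manager().Queue() or hand a multiprocessing.Queue to the workers through the pool initializer, because a plain multiprocessing.Queue cannot be passed as a task argument.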
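The listener side of that arrangement can be sketched as follows. start_progress_listener is a hypothetical helper, and using None as the end-of-batch sentinel is an assumption of this sketch; the queue may be any object with a blocking get(), such as a Manager queue.

```python
import threading
from typing import Any, Callable


def start_progress_listener(
    events: Any,  # queue.Queue, or a multiprocessing / Manager queue
    advance: Callable[[Any, float], None],
) -> threading.Thread:
    """Start a thread that applies (task_id, increment) events until it sees None."""

    def _drain() -> None:
        while True:
            event = events.get()       # blocks until a worker reports something
            if event is None:          # sentinel: the batch is finished
                break
            task_id, increment = event
            advance(task_id, increment)

    thread = threading.Thread(target=_drain, daemon=True)
    thread.start()
    return thread
```

In the main process this might be wired up as listener = start_progress_listener(events, progress.advance), with each worker calling events.put((task_id, 1)) after a file and the parent calling events.put(None) followed by listener.join() once the pool has finished.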

GDAL driver availability. On minimal Docker images, GDAL may lack cloud-optimized GeoTIFF (COG) or NetCDF drivers. Catch rasterio.errors.RasterioIOError and inspect the message for “no driver” before logging; surface the missing driver name explicitly so the operator knows which GDAL plugin to install.

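Transient progress and scrollback. With transient=True, the progress display is erased from the terminal when the Progress context exits. Rich normally detects that stderr is not a TTY and stops emitting cursor-control sequences, but if terminal mode is forced (for example with force_terminal=True) while stderr is redirected to a file for auditing, the file will contain ANSI escape sequences that erase lines, and they appear as garbled output in plain text viewers. Leave force_terminal unset, or pass force_terminal=False, when stderr is not a TTY.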

Windows conhost color depth. The legacy Windows Console Host (conhost.exe) reports limited color support. Rich auto-detects this and falls back to a reduced color palette, but if colors are critical (e.g., pass/fail badges), force Console(color_system="256") on Windows or instruct users to use Windows Terminal.

Verification

After running a batch, verify the pipeline behaved correctly:

# Count successfully reprojected files
ls output/ | wc -l

# Confirm all outputs carry the correct CRS
python - <<'PY'
from pathlib import Path
import rasterio

for p in sorted(Path("output").glob("*.tif")):
    with rasterio.open(p) as src:
        epsg = src.crs.to_epsg() if src.crs else None
        status = "OK" if epsg == 3857 else f"UNEXPECTED epsg={epsg}"
        print(f"{p.name}: {status}")
PY

Exit code 1 from the CLI means at least one file failed. Wrap the command in your CI pipeline:

python -m mypackage reproject ./raw ./reprojected --epsg 3857
echo "Exit: $?"   # 0 = all good, 1 = partial failure

Performance notes

Rich’s progress rendering adds negligible CPU overhead (microseconds per advance() call) compared to rasterio I/O. The dominant bottlenecks are disk throughput and GDAL warping.

  • For batches under ~500 files, single-threaded sequential I/O with transient=True is typically fastest — no thread-pool overhead.
  • For larger batches, use concurrent.futures.ThreadPoolExecutor with max_workers=4–8. rasterio releases the GIL during reads, so threads help when files are spread across an NFS or S3-backed filesystem.
  • For CPU-bound warping (e.g., Lanczos resampling on large rasters), ProcessPoolExecutor outperforms threads. Collect results via a multiprocessing.Queue and aggregate progress in the main process.
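  • Memory: transient=True does not reduce memory. Rich keeps every registered task in its internal table until the task is removed or the Progress object is discarded. This only matters if you register one task per file: for extremely long batches (100 000+ files), call progress.remove_task(task_id) once that file's task has finished to free its record. A single batch-level task, as in Step 3, needs no cleanup.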

FAQ

Is Rich's Progress object thread-safe for concurrent raster processing?

Yes. Progress.advance() and Progress.update() acquire an internal lock before modifying the render state, so multiple threads can call them concurrently without corruption. For multiprocessing workers you need an IPC channel (Queue or Manager proxy) because worker processes cannot share the parent’s memory.

Why should console output go to stderr rather than stdout?

Geospatial CLIs frequently pipe their output into other programs — ogr2ogr, gdal_translate, jq, or custom consumers. Mixing ANSI progress bars into stdout breaks these pipes. Binding the Console to stderr (Console(stderr=True)) keeps stdout clean for GeoJSON, CSV, or binary raster data.

How do I disable progress bars inside GitHub Actions or GitLab CI?

GitHub Actions and GitLab CI both set CI=true. Detect this and fall back to plain log lines:

import os, sys

def is_interactive() -> bool:
    return sys.stderr.isatty() and not os.environ.get("CI")

console = Console(stderr=True, force_terminal=is_interactive())

Alternatively, pass --quiet on the CI command line — the Typer entry point shown above honours this flag.
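The plain log lines mentioned in this answer could come from a small wrapper such as the sketch below. log_every is a hypothetical helper and the step size is an arbitrary choice; it reports a count every few items instead of redrawing a bar.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def log_every(
    items: Iterable[T],
    total: int,
    log: Callable[[str], None],
    step: int = 100,
) -> Iterator[T]:
    """Yield items unchanged, emitting one plain line every `step` items."""
    for count, item in enumerate(items, start=1):
        yield item
        # Runs when the consumer asks for the next item, i.e. after this one is done
        if count % step == 0 or count == total:
            log(f"processed {count}/{total}")
```

In a non-interactive run, the loop could iterate over log_every(files, len(files), console.log) so the CI log grows by one line per hundred files rather than by thousands of bar refreshes.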

What does transient=True do, and when should I turn it off?

transient=True removes the progress display from the terminal when the Progress context exits. This keeps multi-stage pipelines tidy. Turn it off (transient=False) when you want the finished bars, with their final counts, to stay in the terminal as a record of each stage, which is useful during debugging or live demos.

Can I render CRS metadata tables alongside the progress output?

Yes. Because Console is passed explicitly, you can call console.print(table) at any point inside the Progress context — Rich will render the table above the live progress bar. For full EPSG column configuration, axis order badges, and overflow handling, see Customizing Rich tables for coordinate system outputs.
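A minimal sketch of printing a table while a bar is live is shown below. crs_table and inspect_with_table are hypothetical names and the table layout is only illustrative; the key detail is that the table goes through the Progress object's own console.

```python
from rich.console import Console
from rich.progress import Progress
from rich.table import Table


def crs_table(rows: list[tuple[str, int]]) -> Table:
    """Build a small table of (file name, EPSG code) pairs."""
    table = Table(title="CRS")
    table.add_column("File")
    table.add_column("EPSG", justify="right")
    for name, epsg in rows:
        table.add_row(name, str(epsg))
    return table


def inspect_with_table(console: Console) -> None:
    """Advance a task and print a table through the same console mid-run."""
    with Progress(console=console) as progress:
        task_id = progress.add_task("Inspecting", total=2)
        progress.advance(task_id)
        # Printing through progress.console keeps the live bar intact
        progress.console.print(crs_table([("a.tif", 4326)]))
        progress.advance(task_id)
```

Calling inspect_with_table(Console(stderr=True)) in a terminal should show the table scrolling above the bar while the task continues to advance.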