Skip to content

Data Analysis with the SDK

The Data Analysis section is the entry point for data analysts who want to explore UN Comtrade trade data interactively. The SDK plays well with pandas, Jupyter notebooks, DuckDB, Parquet, and Markdown report generators.

Purpose

This page covers:

  1. Loading a CanonicalDataset into a pandas DataFrame.
  2. Querying a stored dataset with DuckDB SQL.
  3. Generating Markdown reports from analytics output.
  4. Iterative exploration patterns in Jupyter.

Prerequisites

  • un-comtrade-sdk[all] installed (includes Parquet + DuckDB).
  • Familiarity with pandas, Jupyter notebooks, and SQL basics.

Walkthrough

Load into pandas

from un_comtrade import ComtradeClient

with ComtradeClient() as client:
    dataset = client.trade.get_exports(reporter_code=699, period="2022")
    df = dataset.to_pandas()

print(df.head())

CanonicalDataset.to_pandas() returns a pandas.DataFrame with typed columns (Decimalobject, ISO-8601 → string, frozensetobject).

Query with DuckDB

import duckdb

con = duckdb.connect("india_exports.duckdb")
result = con.execute("""
    SELECT partner_code, SUM(primary_value) AS total
    FROM exports
    GROUP BY partner_code
    ORDER BY total DESC
    LIMIT 10
""").fetchall()

Generate a Markdown report

from un_comtrade import ComtradeClient

with ComtradeClient() as client:
    exports = client.trade.get_exports(reporter_code=699, period="2022")
    top = client.analytics.top_partners(exports, by="exports", limit=10)

print("# Top 10 export partners — India, 2022\n")
print("| Partner | Total (USD) | Records |")
print("| ------- | -----------: | ------: |")
for row in top:
    print(f"| {row.partner_label} | ${row.value:,.2f} | {row.record_count} |")

Jupyter iteration

import matplotlib.pyplot as plt
from un_comtrade import ComtradeClient

years, totals = [], []
with ComtradeClient() as client:
    for year in range(2010, 2024):
        exports = client.trade.get_exports(reporter_code=699, period=str(year))
        years.append(year)
        totals.append(float(exports.aggregate_total()))

plt.plot(years, totals)
plt.xlabel("Year")
plt.ylabel("Total exports (USD)")
plt.title("India exports 2010–2023")
plt.show()

Examples

A pandas-only workflow:

from un_comtrade import ComtradeClient
import pandas as pd

with ComtradeClient() as client:
    df = pd.concat([
        client.trade.get_exports(reporter_code=699, period=str(y)).to_pandas()
        for y in range(2020, 2024)
    ])

print(df.groupby(df.ref_period_id).primary_value.sum())

A DuckDB-only workflow:

import duckdb
from un_comtrade import ComtradeClient

with ComtradeClient() as client:
    exports = client.trade.get_exports(reporter_code=699, period="2022")
    client.storage.open("india_exports.duckdb").write(exports)

con = duckdb.connect("india_exports.duckdb")
print(con.execute("SELECT COUNT(*) FROM exports").fetchone())

A Jupyter cell that produces a Markdown table:

from un_comtrade import ComtradeClient
from IPython.display import Markdown, display

with ComtradeClient() as client:
    exports = client.trade.get_exports(reporter_code=699, period="2022")
    top = client.analytics.top_partners(exports, by="exports", limit=5)

md = "| Partner | Total |\n| --- | ---: |\n"
for row in top:
    md += f"| {row.partner_label} | ${row.value:,.2f} |\n"
display(Markdown(md))
  • RECIPE-021Compute a country trade balance.
  • RECIPE-111India exports to report — full Markdown report from a single CLI / Python call.
  • RECIPE-113HS explorer to Markdown.

Next steps

  • Exploration — drill into the iterative workflow.
  • Reporting — produce shareable Markdown reports.