{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Fast factorized backprojection (FFBP)\n",
    "\n",
    "Direct backprojection forms a SAR image by summing, for every output pixel, the\n",
    "contribution of every pulse in the aperture. It asymptotic scaling is therefore $\\mathcal{O}(NRA)$, where $N$ is the number of pulses, $R$ is the number of range pixels, and $A$ is the number of azimuth pixels. The number of azimuth pixels also scales linearly with number of pulses for linear track, so we can say that the algoritm is $\\mathcal{O}(N^2)$.\n",
    "\n",
    "Fast factorized backprojection (FFBP) reduces this cost. The idea is:\n",
    "\n",
    "1. Split the aperture into short subapertures. A short subaperture has a small\n",
    "   angular (Doppler) bandwidth, so its image can be formed on a coarse polar grid\n",
    "2. Coherently merge pairs of subaperture images onto a finer grid repeating in a\n",
    "   binary tree. Each merge doubles the azimuth resolution (and doubles the image azimuth size)\n",
    "\n",
    "At the limit of many recursion stages, the asymptotic cost is $\\mathcal{O}(N \\log N)$.\n",
    "\n",
    "The library exposes this as a single call, `torchbp.ops.ffbp`. In this notebook\n",
    "the algorithm is built by hand from the low-level ops to demonstrate how it works.\n",
    "`backprojection_polar_2d` is used to form the lowest-level subapertures and the `ffbp_merge2` op is use to recursively\n",
    "merge the images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import torch\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from numpy import hamming\n",
    "\n",
    "import torchbp\n",
    "from torchbp.util import center_pos\n",
    "from torchbp.ops import backprojection_polar_2d, ffbp_merge2\n",
    "\n",
    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
    "print(\"Device:\", device)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2",
   "metadata": {},
   "source": [
    "## Scene and radar parameters\n",
    "\n",
    "A single point target, imaged on a heavily oversampled polar grid zoomed around the\n",
    "target so the point-spread function (PSF) is smooth and easy to inspect. The aperture\n",
    "is split into a power-of-two number of subapertures, so we use `nsweeps = 512`. `ffbp` can also handle non power-of-two `nsweeps` efficiently, but this choice simplifies the example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "fc = 6e9          # RF center frequency (Hz)\n",
    "bw = 200e6        # RF bandwidth (Hz), positive for rising sweep\n",
    "tsweep = 100e-6   # Sweep length (s)\n",
    "fs = 2e6          # Sampling frequency (Hz)\n",
    "nsweeps = 512     # Number of pulses in the aperture\n",
    "oversample = 2    # Range FFT oversampling (reduces interpolation error)\n",
    "nsamples = int(fs * tsweep)\n",
    "\n",
    "# Heavily oversampled polar grid, zoomed on the target. theta is sin(angle).\n",
    "nr, ntheta = 256, 512\n",
    "grid = {\"r\": (96.0, 104.0), \"theta\": (-0.06, 0.06), \"nr\": nr, \"ntheta\": ntheta}\n",
    "\n",
    "# Single point target at 100 m, broadside.\n",
    "target_pos = torch.tensor([[100.0, 0.0, 0.0]], dtype=torch.float32, device=device)\n",
    "target_rcs = torch.ones((1, 1), dtype=torch.float32, device=device)\n",
    "\n",
    "# Straight platform track along y at 50 m height. Element spacing = lambda/4.\n",
    "pos = torch.zeros((nsweeps, 3), dtype=torch.float32, device=device)\n",
    "pos[:, 1] = torch.linspace(-nsweeps / 2, nsweeps / 2, nsweeps, device=device) * 0.25 * 3e8 / fc\n",
    "pos[:, 2] = 50.0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "## Simulate and range-compress the data\n",
    "\n",
    "Identical to the backprojection example. Generate raw data, apply a range\n",
    "window, and FFT to range-compress. `data_fmod` shifts the data spectrum to DC\n",
    "and `alias_fmod` centers the image spectrum. Both reduce interpolation error and must\n",
    "be passed consistently to backprojection and the FFBP merge."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {},
   "outputs": [],
   "source": [
    "raw = torchbp.util.generate_fmcw_data(target_pos, target_rcs, pos, fc, bw, tsweep, fs)\n",
    "w = torch.tensor(hamming(raw.shape[-1])[None, :], dtype=torch.float32, device=device)\n",
    "data = torch.fft.ifft(raw * w, dim=-1, n=nsamples * oversample)\n",
    "\n",
    "data_fmod = -math.pi * (1 - (oversample - 1) / oversample)\n",
    "data = data * torch.exp(1j * data_fmod * torch.arange(data.shape[-1], device=device))[None, :]\n",
    "\n",
    "r_res = 3e8 / (2 * abs(bw) * oversample)           # range bin size in the data\n",
    "dr = (grid[\"r\"][1] - grid[\"r\"][0]) / grid[\"nr\"]    # range bin size in the image\n",
    "im_margin = oversample * r_res / dr - 1\n",
    "alias_fmod = -math.pi * (1 - im_margin / (1 + im_margin))\n",
    "print(f\"r_res={r_res:.3f} m, image dr={dr:.3f} m, data_fmod={data_fmod:.3f}, alias_fmod={alias_fmod:.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6",
   "metadata": {},
   "source": [
    "A small helper to display a polar image in dB."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7",
   "metadata": {},
   "outputs": [],
   "source": [
    "extent = [*grid[\"r\"], *grid[\"theta\"]]\n",
    "\n",
    "def show_polar(img, title, ax=None, dyn=30, ref_max=None):\n",
    "    img = img.detach().cpu()\n",
    "    db = 20 * torch.log10(torch.abs(img) + 1e-12)\n",
    "    m = db.max() if ref_max is None else ref_max\n",
    "    if ax is None:\n",
    "        _, ax = plt.subplots()\n",
    "    im = ax.imshow(db.numpy().T, origin=\"lower\", vmin=m - dyn, vmax=m,\n",
    "                   extent=extent, aspect=\"auto\")\n",
    "    ax.set_title(title)\n",
    "    ax.set_xlabel(\"Range (m)\")\n",
    "    ax.set_ylabel(\"Angle (sin)\")\n",
    "    return m\n",
    "\n",
    "def relerr(a, b):\n",
    "    return (torch.linalg.norm(a - b) / torch.linalg.norm(a)).item()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8",
   "metadata": {},
   "source": [
    "## Reference: Direct backprojection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "img_bp = backprojection_polar_2d(\n",
    "    data, grid, fc, r_res, pos, dealias=True, data_fmod=data_fmod, alias_fmod=alias_fmod\n",
    ").squeeze()\n",
    "\n",
    "m = show_polar(img_bp, \"Direct backprojection (reference)\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10",
   "metadata": {},
   "source": [
    "## Step 1: Backproject the subapertures (coarse azimuth)\n",
    "\n",
    "Split the aperture in two halves. Each half is backprojected on its own local\n",
    "polar frame, centered on that half's mean antenna phase center (APC) with\n",
    "`center_pos`. Because each half spans only half the aperture, its azimuth bandwidth is\n",
    "halved, so half as many azimuth samples suffice. Since backprojection scales quadratically with number of sweeps, generating two coarser images from both halves of the data is 2x faster than generating the full resolution image. We keep a small internal oversampling factor to limit interpolation errors at the edges during the later merge."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11",
   "metadata": {},
   "outputs": [],
   "source": [
    "OSR = 1.4  # internal oversampling of subaperture grids (limits interpolation error)\n",
    "\n",
    "def backproject_subaperture(sl, ntheta_sub):\n",
    "    \"\"\"Backproject the pulses in slice `sl` onto a local coarse-azimuth polar grid.\"\"\"\n",
    "    p = pos[sl]\n",
    "    p_local, origin = center_pos(p)\n",
    "    g = {\"r\": grid[\"r\"], \"theta\": grid[\"theta\"],\n",
    "         \"nr\": int(OSR * nr), \"ntheta\": int(OSR * ntheta_sub)}\n",
    "    img = backprojection_polar_2d(\n",
    "        data[sl], g, fc, r_res, p_local,\n",
    "        dealias=True, data_fmod=data_fmod, alias_fmod=alias_fmod,\n",
    "    ).squeeze()\n",
    "    return {\"img\": img, \"grid\": g, \"origin\": origin[0], \"z\": p[:, 2].mean()}\n",
    "\n",
    "half = nsweeps // 2\n",
    "subA = backproject_subaperture(slice(0, half), ntheta // 2)\n",
    "subB = backproject_subaperture(slice(half, nsweeps), ntheta // 2)\n",
    "\n",
    "print(\"subaperture image shape:\", tuple(subA[\"img\"].shape), \"(full grid is\", (nr, ntheta), \")\")\n",
    "fig, axs = plt.subplots(1, 2, figsize=(11, 4))\n",
    "show_polar(subA[\"img\"], \"Subaperture A (first half)\", ax=axs[0])\n",
    "show_polar(subB[\"img\"], \"Subaperture B (second half)\", ax=axs[1])\n",
    "plt.tight_layout(); plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12",
   "metadata": {},
   "source": [
    "Each subaperture image is correctly focused but resolution is not as good as in the reference, since only half of the data is used. The main lobe is\n",
    "about twice as wide as in the reference. The two images also sit at different coordinate frames, each referenced to its own APC."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13",
   "metadata": {},
   "source": [
    "## Step 2: Merge\n",
    "\n",
    "`ffbp_merge2` interpolates the two subaperture images onto a common, finer output grid\n",
    "referenced to the combined APC, and sums them coherently. To do that it needs, for\n",
    "each input, the vector from its origin to the new origin (`dorigin`, with the height\n",
    "component handled separately) and the carrier frequency `fc` to keep the phases\n",
    "consistent. Combining the two halves restores the full azimuth bandwidth, hence full\n",
    "resolution.\n",
    "\n",
    "`ffbp_merge2` is essentially the same operation as `polar_interp`, but merges the two interpolations in one kernel for faster execution. It uses Knab pulse interpolation that is almost optimal finite interpolation kernel, when the oversampling rate is known. (J. Knab, \"The sampling window,\" in IEEE Transactions on Information Theory, vol. 29, no. 1, pp. 157-159, January 1983.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14",
   "metadata": {},
   "outputs": [],
   "source": [
    "def merge(a, b, grid_new):\n",
    "    new_origin = 0.5 * (a[\"origin\"] + b[\"origin\"])\n",
    "    new_z = 0.5 * (a[\"z\"] + b[\"z\"])\n",
    "    da = (new_origin - a[\"origin\"]).clone(); da[2] = -(new_z - a[\"z\"])\n",
    "    db = (new_origin - b[\"origin\"]).clone(); db[2] = -(new_z - b[\"z\"])\n",
    "    img = ffbp_merge2(\n",
    "        a[\"img\"], b[\"img\"], da, db, [a[\"grid\"], b[\"grid\"]], fc, grid_new,\n",
    "        z0=float(new_z), method=(\"knab\", 6, 1.5),\n",
    "        alias=False, alias_fmod=alias_fmod, output_alias=True,\n",
    "    )\n",
    "    return {\"img\": img.squeeze(), \"grid\": grid_new, \"origin\": new_origin, \"z\": new_z}\n",
    "\n",
    "img_ffbp2 = merge(subA, subB, grid)[\"img\"]\n",
    "\n",
    "fig, axs = plt.subplots(1, 2, figsize=(11, 4))\n",
    "show_polar(img_bp, \"Direct backprojection\", ax=axs[0], ref_max=m)\n",
    "show_polar(img_ffbp2, \"FFBP, 2 subapertures (manual)\", ax=axs[1], ref_max=m)\n",
    "plt.tight_layout(); plt.show()\n",
    "\n",
    "print(f\"relative error vs direct BP: {relerr(img_bp, img_ffbp2):.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15",
   "metadata": {},
   "source": [
    "The merged image matches the reference. Azimuth cut overlays perfectly. The largest error is near the edges caused by edge effects from interpolation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16",
   "metadata": {},
   "outputs": [],
   "source": [
    "pk = np.unravel_index(torch.abs(img_bp).cpu().argmax(), img_bp.shape)\n",
    "theta_axis = np.linspace(grid[\"theta\"][0], grid[\"theta\"][1], ntheta)\n",
    "cut_bp = 20 * torch.log10(torch.abs(img_bp[pk[0]]).cpu() + 1e-12)\n",
    "cut_ff = 20 * torch.log10(torch.abs(img_ffbp2[pk[0]]).cpu() + 1e-12)\n",
    "\n",
    "fig, axs = plt.subplots(1, 2, figsize=(12, 4))\n",
    "axs[0].plot(theta_axis, cut_bp - cut_bp.max(), label=\"direct BP\")\n",
    "axs[0].plot(theta_axis, cut_ff - cut_bp.max(), \"--\", label=\"FFBP (manual)\")\n",
    "axs[0].set_xlim(-0.02, 0.02); axs[0].set_ylim(-40, 1)\n",
    "axs[0].set_xlabel(\"Angle (sin)\"); axs[0].set_ylabel(\"dB\")\n",
    "axs[0].set_title(\"Azimuth cut through peak\"); axs[0].legend()\n",
    "show_polar(img_bp - img_ffbp2, \"Difference (FFBP - direct)\", ax=axs[1], ref_max=m, dyn=50)\n",
    "plt.tight_layout(); plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17",
   "metadata": {},
   "source": [
    "Spectrum of both images. FFBP interpolation error spreads the spectrum a little bit in azimuth. Range aliasing comes from linear interpolation used in backprojection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "18",
   "metadata": {},
   "outputs": [],
   "source": [
    "spec_bp_db = 20*np.log10(np.abs(np.fft.fftshift(np.fft.fft2(img_bp)))).T\n",
    "ms = np.max(spec_bp_db)\n",
    "\n",
    "fig, axes = plt.subplots(1, 2, figsize=(11, 4))\n",
    "axes[0].imshow(spec_bp_db, origin=\"lower\", aspect=\"auto\", vmax=ms, vmin=ms - 80, extent=[-0.5, 0.5, -0.5, 0.5])\n",
    "axes[0].set_xlabel(\"Range freq (cycles/sample)\")\n",
    "axes[0].set_ylabel(\"Azimuth freq (cycles/sample)\")\n",
    "axes[0].set_title(\"BP spectrum\");\n",
    "\n",
    "spec_ff_db = 20*np.log10(np.abs(np.fft.fftshift(np.fft.fft2(img_ffbp2)))).T\n",
    "\n",
    "axes[1].imshow(spec_ff_db, origin=\"lower\", aspect=\"auto\", vmax=ms, vmin=ms - 80, extent=[-0.5, 0.5, -0.5, 0.5])\n",
    "axes[1].set_xlabel(\"Range freq (cycles/sample)\")\n",
    "axes[1].set_ylabel(\"Azimuth freq (cycles/sample)\")\n",
    "axes[1].set_title(\"FFBP spectrum\")\n",
    "plt.tight_layout();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19",
   "metadata": {},
   "source": [
    "## Recursion\n",
    "\n",
    "One merge halves the work of the azimuth dimension, the real speedup comes from\n",
    "recursing. With four subapertures we backproject four coarse images and merge them in\n",
    "a two-level binary tree: merge A+B and C+D to half-resolution intermediates, then merge\n",
    "those two to the full image. In general $N_\\text{sub}$ subapertures need $\\log_2 N_\\text{sub}$ merge stages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "20",
   "metadata": {},
   "outputs": [],
   "source": [
    "NSUB = 4\n",
    "n = nsweeps // NSUB\n",
    "leaves = [backproject_subaperture(slice(i * n, (i + 1) * n), ntheta // NSUB)\n",
    "          for i in range(NSUB)]\n",
    "\n",
    "# Level 1: merge pairs onto half-resolution intermediate grids.\n",
    "grid_mid = {\"r\": grid[\"r\"], \"theta\": grid[\"theta\"],\n",
    "            \"nr\": int(OSR * nr), \"ntheta\": int(OSR * (ntheta // 2))}\n",
    "mid0 = merge(leaves[0], leaves[1], grid_mid)\n",
    "mid1 = merge(leaves[2], leaves[3], grid_mid)\n",
    "\n",
    "# Level 2: merge the intermediates onto the full output grid.\n",
    "img_ffbp4 = merge(mid0, mid1, grid)[\"img\"]\n",
    "\n",
    "fig, axs = plt.subplots(1, 3, figsize=(15, 4))\n",
    "show_polar(leaves[0][\"img\"], \"Leaf 0 (1/4 aperture)\", ax=axs[0])\n",
    "show_polar(mid0[\"img\"], \"After level-1 merge (1/2)\", ax=axs[1])\n",
    "show_polar(img_ffbp4, \"After level-2 merge (full)\", ax=axs[2], ref_max=m)\n",
    "plt.tight_layout(); plt.show()\n",
    "\n",
    "print(f\"4-subaperture FFBP, relative error vs direct BP: {relerr(img_bp, img_ffbp4):.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21",
   "metadata": {},
   "source": [
    "The azimuth resolution sharpens at each merge. The four-leaf result is still\n",
    "visually focused, but its residual is larger than the two-leaf case, because every merge\n",
    "interpolates and those errors accumulate with tree depth. This is the central trade-off of FFBP.\n",
    "\n",
    "The point target is at different azimuth angles in different images, since the image origin is at the center of the data that is processed."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22",
   "metadata": {},
   "source": [
    "## Library op `torchbp.ops.ffbp`\n",
    "\n",
    "Everything above is what `torchbp.ops.ffbp` does internally. `stages` is the number of\n",
    "recursion levels. It should match the hand-built results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "23",
   "metadata": {},
   "outputs": [],
   "source": [
    "img_lib1 = torchbp.ops.ffbp(data, grid, fc, r_res, pos, stages=1, divisions=2,\n",
    "                            data_fmod=data_fmod, alias_fmod=alias_fmod, dealias=True).squeeze()\n",
    "img_lib2 = torchbp.ops.ffbp(data, grid, fc, r_res, pos, stages=2, divisions=2,\n",
    "                            data_fmod=data_fmod, alias_fmod=alias_fmod, dealias=True).squeeze()\n",
    "\n",
    "print(f\"stages=1 vs manual 2-subaperture: {relerr(img_ffbp2, img_lib1):.4f}\")\n",
    "print(f\"stages=2 vs manual 4-subaperture: {relerr(img_ffbp4, img_lib2):.4f}\")\n",
    "print(f\"stages=1 vs direct BP:            {relerr(img_bp, img_lib1):.4f}\")\n",
    "print(f\"stages=2 vs direct BP:            {relerr(img_bp, img_lib2):.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24",
   "metadata": {},
   "source": [
    "## Benefit and downside\n",
    "\n",
    "Benefit: FFBP is asymptotically faster and the difference can be huge for large image (10 to 100x faster for realistic size image).\n",
    "\n",
    "Downside: interpolation error. Each merge resamples the images with a finite interpolation kernel and the error accumulates with the number of stages. The error can be reduced with higher-order interpolation kernel (`interp_method=(\"knab\", order, oversample)`) and larger internal oversampling (`oversample_r`, `oversample_theta`), but both of these increase the amount of work. In practice, number of stages, interpolation kernel size, and oversampling factor needs to be chosen based on the maximum allowed error in the final result."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}