Multi-GPU Joins

Treat this page as design-state documentation. Do not promise distributed join execution, cross-device partitioning, or multi-GPU WCOJ in current release artifacts.

What Exists

xlog-cuda contains a MultiGpuMemoryManager substrate. It wraps a GpuDevicePool, builds one GpuMemoryManager per device, and supports:

device-count inspection;

allocation on a specified device;

round-robin allocation on the next device;

per-device remaining-byte reporting;

access to the underlying device pool.
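To make the listed operations concrete, here is a minimal CPU-only model in Python. It is a sketch, not the xlog-cuda API: the class names `DeviceBudget` and `MultiDeviceModel`, every method name, and the byte-budget bookkeeping are assumptions made for illustration. It shows only how per-device managers, explicit-device allocation, round-robin allocation, and remaining-byte reporting relate to each other.

```python
from dataclasses import dataclass


@dataclass
class DeviceBudget:
    """Hypothetical stand-in for one per-device memory manager."""
    capacity_bytes: int
    used_bytes: int = 0

    def allocate(self, size: int) -> None:
        # Refuse requests that exceed what is left on this device.
        if size < 0 or size > self.remaining():
            raise MemoryError("device budget exhausted")
        self.used_bytes += size

    def remaining(self) -> int:
        return self.capacity_bytes - self.used_bytes


class MultiDeviceModel:
    """Toy model of the listed operations; names are illustrative only."""

    def __init__(self, capacities):
        if not capacities:
            raise ValueError("need at least one device")
        # One budget object per device, mirroring "one manager per device".
        self._devices = [DeviceBudget(c) for c in capacities]
        self._next = 0  # round-robin cursor

    def device_count(self) -> int:
        return len(self._devices)

    def allocate_on(self, device: int, size: int) -> int:
        """Allocate on an explicitly chosen device and return its index."""
        if not 0 <= device < len(self._devices):
            raise IndexError("no such device")
        self._devices[device].allocate(size)
        return device

    def allocate_next(self, size: int) -> int:
        """Allocate on the next device in round-robin order."""
        device = self._next
        self._devices[device].allocate(size)
        # Advance the cursor only after a successful allocation.
        self._next = (self._next + 1) % len(self._devices)
        return device

    def remaining_bytes(self):
        """Per-device remaining bytes, indexed by device."""
        return [d.remaining() for d in self._devices]
```

Nothing in this model moves rows or runs a join. Like the substrate it imitates, it only answers "where can this buffer go, and how much room is left".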

CUDA certification also checks basic multi-GPU detection consistency when the test environment exposes multiple devices.

What Does Not Exist Yet

The repository does not currently ship:

a distributed relation buffer type for query execution;

partitioning kernels that route rows by hash key across devices;

peer-to-peer shuffle orchestration for joins;

distributed hash-join execution;

cross-device WCOJ or Free Join;

optimizer costing for multi-GPU partition plans.

Those pieces are future architecture work.

Design Direction

A future distributed hash join would likely use hash partitioning:

compute a partition for each row from the join key;

move the left and right partitions that share a partition index to the same device;

run the normal local join kernel per device;

concatenate or expose the partitioned result as a distributed relation.
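The four steps above can be sketched on the CPU in Python. This is an illustration of hash-partitioned joining in general, not code from the repository: the function names, the tuple row format, and the use of `zlib.crc32` as the partition hash are all assumptions, and each "device" is just a Python list.

```python
import zlib
from collections import defaultdict


def partition_of(key, num_devices: int) -> int:
    # Stable hash (assumed choice) so both inputs route equal keys identically.
    return zlib.crc32(repr(key).encode("utf-8")) % num_devices


def partition(rows, key_index: int, num_devices: int):
    """Step 1: assign each row to a partition derived from its join key."""
    parts = [[] for _ in range(num_devices)]
    for row in rows:
        parts[partition_of(row[key_index], num_devices)].append(row)
    return parts


def local_hash_join(left, right, left_key: int, right_key: int):
    """Step 3: an ordinary single-device hash join (build left, probe right)."""
    table = defaultdict(list)
    for l_row in left:
        table[l_row[left_key]].append(l_row)
    out = []
    for r_row in right:
        for l_row in table.get(r_row[right_key], ()):
            out.append(l_row + r_row)
    return out


def partitioned_join(left, right, left_key, right_key, num_devices):
    # Step 2: partition i of each side is paired, standing in for
    # "moved to the same device".
    l_parts = partition(left, left_key, num_devices)
    r_parts = partition(right, right_key, num_devices)
    per_device = [
        local_hash_join(lp, rp, left_key, right_key)
        for lp, rp in zip(l_parts, r_parts)
    ]
    # Step 4: concatenate per-device results; row order is not guaranteed.
    return [row for part in per_device for row in part]
```

Because equal keys always hash to the same partition, the union of the per-partition joins equals the full join. With `num_devices=1` the sketch degenerates to a single local join, which is one possible shape for a single-GPU fallback.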

That design still has unresolved production requirements:

skew handling for hot keys;

memory budgeting across devices;

peer-to-peer versus host-mediated copies;

deterministic result ordering or explicit unordered semantics;

relation-generation and cache invalidation across devices;

fallback behavior when only one GPU is present.
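The first item, skew, is easy to demonstrate. The Python sketch below counts how many rows a hash partitioner would send to each device and reports the ratio of the largest partition to the mean. The function names, the `zlib.crc32` hash, and the max-over-mean metric are assumptions chosen for illustration. No skew policy exists in the repository.

```python
import zlib


def partition_of(key, num_devices: int) -> int:
    # Same assumed stable hash a partitioner might use.
    return zlib.crc32(repr(key).encode("utf-8")) % num_devices


def partition_sizes(keys, num_devices: int):
    """Count how many rows each device would receive."""
    sizes = [0] * num_devices
    for key in keys:
        sizes[partition_of(key, num_devices)] += 1
    return sizes


def skew_ratio(sizes) -> float:
    """Largest partition divided by the mean; 1.0 means perfectly balanced."""
    total = sum(sizes)
    if total == 0:
        return 1.0  # no rows, so nothing is imbalanced
    mean = total / len(sizes)
    return max(sizes) / mean


# A hot key: 90 of 100 rows share the join key 7.
keys = [7] * 90 + list(range(10))
sizes = partition_sizes(keys, 4)
print(sizes, skew_ratio(sizes))
```

Every row carrying the hot key lands on one device no matter which hash function is used, so that device holds at least 90 of the 100 rows. A production design would need some remedy for this, such as splitting or replicating hot keys, and the memory budget on the overloaded device is the constraint that forces the issue.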

User Guidance

For current workloads, plan around one CUDA device per executor. Use the single-GPU execution, WCOJ, adaptive-indexing, and factorized-execution pages for the routes that actually run today.

If you see multi-GPU allocation APIs in source, treat them as substrate, not as evidence that distributed joins are available.
