| 1 | +# POSIX SHM DMA Transport |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Two independent features are at play in the POSIX SHM transport port, the transport itself and DMA, plus a thin layer of glue that joins them. Understanding which is which is key. |
| 6 | + |
| 7 | +## 1. The Transport: POSIX Shared Memory (`posix_transport_shm`) |
| 8 | + |
| 9 | +This is purely a **transport layer** -- it moves request/response messages between client and server processes. It works like this: |
| 10 | + |
| 11 | +- **Server** creates a POSIX shared memory object (`shm_open`) with a layout of: |
| 12 | + ``` |
| 13 | + [ 64-byte header | request buffer | response buffer | optional DMA section ] |
| 14 | + ``` |
| 15 | +- **Client** opens the same named object and `mmap`s it into its address space |
| 16 | +- Both sides then delegate to `wh_transport_mem` (the generic memory-based transport) for actual message passing via control/status registers (CSRs) in the request/response buffers |
| 17 | +- The header contains PIDs for RT-signal-based async notification |
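
The layout above can be sketched as plain offset arithmetic. This is only an illustration: the 64-byte header size comes from the description, while `shm_layout`, `shm_layout_compute`, and the buffer sizes passed in are hypothetical names and values, not the port's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Only the 64-byte header size is taken from the text; all names are hypothetical. */
#define SHM_HEADER_SIZE 64u

typedef struct {
    size_t req_off;  /* start of the request buffer */
    size_t resp_off; /* start of the response buffer */
    size_t dma_off;  /* start of the optional DMA section */
    size_t total;    /* total size of the shared memory object */
} shm_layout;

/* Lay the regions out back to back: header, request, response, DMA. */
static shm_layout shm_layout_compute(size_t req_size, size_t resp_size,
                                     size_t dma_size)
{
    shm_layout l;
    l.req_off  = SHM_HEADER_SIZE;
    l.resp_off = l.req_off + req_size;
    l.dma_off  = l.resp_off + resp_size;
    l.total    = l.dma_off + dma_size;
    assert(l.total >= l.dma_off); /* sanity check against size_t wrap-around */
    return l;
}
```

With `dma_size` set to 0 the DMA section is simply empty and `total` equals `dma_off`, which matches the "optional" wording above.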
| 18 | + |
| 19 | +The transport's job is **only** to shuttle serialized request/response packets. It knows nothing about crypto, keys, or DMA semantics. |
| 20 | + |
| 21 | +The optional **DMA section** at the end of the shared memory region is a chunk of shared address space that *both* processes can access. It is just raw shared memory: the transport allocates it but doesn't use it itself. It's plumbing for the DMA feature. |
| 22 | + |
| 23 | +## 2. The Feature: DMA (`WOLFHSM_CFG_DMA`) |
| 24 | + |
| 25 | +DMA is a **separate, transport-agnostic feature** in wolfHSM core (`wh_dma.h`, `wh_server_dma.c`, `wh_client_dma.c`). It allows crypto operations to reference client memory **by address** rather than copying data into the transport's request/response buffers. This matters because: |
| 26 | + |
| 27 | +- Standard messages are limited by `WOLFHSM_CFG_COMM_DATA_LEN` (typically ~4KB) |
| 28 | +- DMA messages send *addresses* in the request, and the server reads/writes client memory directly |
| 29 | + |
| 30 | +The DMA feature has a callback-based architecture: |
| 31 | +- `wh_Server_DmaProcessClientAddress()` -- server calls this with a client address, the registered callback transforms it to something the server can dereference |
| 32 | +- `wh_Client_DmaProcessClientAddress()` -- client calls this to transform its local address into whatever the server will receive in the message |
| 33 | +- PRE/POST operations handle setup and teardown (cache flush/invalidate, temporary buffer allocation, etc.) |
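
A rough sketch of what a callback-based address hook can look like follows. It is not wolfHSM's API: the `dma_op` enum, the `dma_cb` signature, and `dma_process_client_address` are hypothetical stand-ins that only mirror the shape described above (a registered callback transforms an address, with separate PRE and POST phases).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical phases: PRE runs before the peer touches the memory, POST after. */
typedef enum {
    DMA_OP_READ_PRE,
    DMA_OP_READ_POST,
    DMA_OP_WRITE_PRE,
    DMA_OP_WRITE_POST
} dma_op;

/* Hypothetical callback type: turns client_addr into *out_addr. */
typedef int (*dma_cb)(void *ctx, uintptr_t client_addr, void **out_addr,
                      size_t len, dma_op op);

typedef struct {
    dma_cb cb;  /* registered callback, may be NULL */
    void  *ctx; /* opaque state handed back to the callback */
} dma_config;

/* Dispatcher: identity mapping unless a callback is registered. */
static int dma_process_client_address(const dma_config *cfg,
                                      uintptr_t client_addr, void **out_addr,
                                      size_t len, dma_op op)
{
    assert(cfg != NULL && out_addr != NULL);
    *out_addr = (void *)client_addr;
    if (cfg->cb != NULL) {
        return cfg->cb(cfg->ctx, client_addr, out_addr, len, op);
    }
    return 0;
}

/* Example callback: adds a fixed delta on PRE, leaves POST untouched. */
static int offset_cb(void *ctx, uintptr_t client_addr, void **out_addr,
                     size_t len, dma_op op)
{
    (void)len;
    if (op == DMA_OP_READ_PRE || op == DMA_OP_WRITE_PRE) {
        *out_addr = (void *)(client_addr + *(const uintptr_t *)ctx);
    }
    return 0;
}
```

The point of the indirection is that the core code never needs to know how addresses are translated; a port swaps in its own callback.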
| 34 | + |
| 35 | +On real hardware (e.g. Infineon TC3xx), this is literal hardware DMA -- client and server are on different cores with different address maps, and the callbacks handle the MMU/bus address translation. |
| 36 | + |
| 37 | +## 3. The Glue: Static Memory Pool Allocator in the SHM DMA Callbacks |
| 38 | + |
| 39 | +The `posixTransportShm_ClientStaticMemDmaCallback` and `posixTransportShm_ServerStaticMemDmaCallback` in `posix_transport_shm.c` are the **port-specific DMA callbacks** that bridge the POSIX SHM transport with the DMA feature. Here's the clever part: |
| 40 | + |
| 41 | +**Problem:** On POSIX, client and server are separate processes with separate virtual address spaces. A raw client pointer like `0x7fff12345000` means nothing to the server. But the DMA section in shared memory is mapped into *both* processes (at potentially different virtual addresses). |
| 42 | + |
| 43 | +**Solution using the pool allocator:** |
| 44 | + |
| 45 | +1. wolfCrypt's `WOLFSSL_STATIC_MEMORY` pool allocator (`wc_LoadStaticMemory_ex`) is initialized with the DMA section as its backing memory pool |
| 46 | +2. When the client DMA callback gets a PRE operation with a client address that's **not** already in the DMA area, it: |
| 47 | + - Allocates a temporary buffer from the pool (`XMALLOC` with the heap hint) |
| 48 | + - Copies client data into it |
| 49 | + - Returns an **offset** from the DMA base (not a pointer) -- this is what gets sent to the server |
| 50 | +3. The server DMA callback simply takes that offset, validates it's in bounds, and returns `dma_base + offset` |
| 51 | +4. On POST, the client callback copies results back into the original client buffer (for operations where the server wrote to it) and frees the temporary buffer |
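
The four steps above can be sketched in a few functions. This is a simplified illustration, not the code in `posix_transport_shm.c`: all names are hypothetical, and a single reusable slot at offset 0 stands in for the wolfCrypt pool allocator so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned char *base; /* where this process mapped the DMA section */
    size_t size;         /* size of the DMA section in bytes */
    int slot_busy;       /* stand-in allocator: one temporary buffer at a time */
} dma_region;

/* True if [p, p + len) lies entirely inside the region. */
static int region_contains(const dma_region *r, const void *p, size_t len)
{
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)r->base;
    return a >= b && len <= r->size && (a - b) <= r->size - len;
}

/* Client PRE: produce the offset that is sent to the server. */
static int client_pre(dma_region *r, const void *buf, size_t len, size_t *offset)
{
    assert(r != NULL && offset != NULL);
    if (region_contains(r, buf, len)) { /* already shared: zero-copy */
        *offset = (size_t)((const unsigned char *)buf - r->base);
        return 0;
    }
    if (r->slot_busy || len > r->size) {
        return -1;                      /* stand-in pool exhausted */
    }
    r->slot_busy = 1;
    memcpy(r->base, buf, len);          /* copy client data into shared memory */
    *offset = 0;
    return 0;
}

/* Server: validate the offset, then rebase it onto the server's own mapping. */
static void *server_translate(const dma_region *r, size_t offset, size_t len)
{
    if (len > r->size || offset > r->size - len) {
        return NULL;
    }
    return r->base + offset;
}

/* Client POST: copy results back if the server wrote, then release the slot. */
static void client_post(dma_region *r, void *buf, size_t len, size_t offset,
                        int server_wrote)
{
    if (region_contains(r, buf, len)) {
        return;                         /* zero-copy path: nothing to undo */
    }
    if (server_wrote) {
        memcpy(buf, r->base + offset, len);
    }
    r->slot_busy = 0;
}
```

Sending an offset rather than a pointer is what makes the scheme work when the two processes map the region at different virtual addresses: each side adds the offset to its own `base`.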
| 52 | + |
| 53 | +If the client address **is already** in the DMA section (the client allocated directly from the pool), it skips the copy and just computes the offset -- zero-copy. |
| 54 | + |
| 55 | +The pool allocator here is used as a **general-purpose allocator for the shared DMA region**. It has nothing to do with the transport itself -- it's the DMA callback's strategy for managing the shared buffer. wolfHSM could use a different allocator; the pool allocator was chosen because it's already available in wolfCrypt and works without `malloc`. |
| 56 | + |
| 57 | +## Summary Table |
| 58 | + |
| 59 | +| Aspect | Transport (SHM) | DMA Feature | Pool Allocator | |
| 60 | +|--------|-----------------|-------------|----------------| |
| 61 | +| **Layer** | Communication | Application/Crypto | Memory management | |
| 62 | +| **Scope** | Port-specific (POSIX) | Core wolfHSM | DMA callback impl detail | |
| 63 | +| **Purpose** | Move request/response packets | Let server access client memory by address | Manage temporary buffers in shared DMA area | |
| 64 | +| **Config** | `posixTransportShmConfig` | `WOLFHSM_CFG_DMA` | `WOLFSSL_STATIC_MEMORY` | |
| 65 | +| **Without it** | No communication | Data must fit in request/response buffers | Would need a different allocator for DMA region | |
| 66 | + |
| 67 | +The DMA section is **allocated by the transport** but **used by the DMA callbacks**. The pool allocator is **used by the DMA callbacks** to subdivide that DMA section. Three layers, three concerns. |