Commit 8bff1fa

Peer review fixes + ZynqMP robustness improvements
Squashes previous two work-in-progress commits and adds further hardening
discovered during large (>128 MB) signed FIT boot bring-up:

hal/zynq.c:
- Route IOU_TAPDLY_BYPASS writes through pmu_request when running at
  EL <= 2 in the <=40 MHz and <=100 MHz branches; the upstream code only
  did this in the <=150 MHz branch, but the register is equally
  unwritable from EL2/EL1 at lower clocks.
- Add qspi_flash_reset() (0x66 RESET_ENABLE + 0x99 RESET_MEMORY), called
  once per chip in dual-parallel during qspi_init so the flash starts
  from a known state regardless of what FSBL/BootROM left behind (e.g.
  XIP, 4-byte address, or auto-boot probing).
- Remove unused 'reg' in csu_aes and 'ms' in csu_init so
  -Werror=unused-variable builds (OPTIMIZATION_LEVEL=0 / DEBUG=1)
  succeed.

hal/zynq.ld:
- Move wolfBoot ORIGIN from 0x08000000 to 0x10000000. For FIT images
  whose kernel load address is 0x00200000 and whose payload approaches
  or exceeds ~126 MB, the kernel memcpy at handoff would sweep across
  0x08000000 and overwrite wolfBoot's own running code. 0x10000000
  leaves headroom below WOLFBOOT_LOAD_ADDRESS (0x18000000).

tools/scripts/zcu102/zcu102-ca53-qspi.cmm:
- Full rewrite based on Lauterbach TRACE32 ZCU102 QSPI demo: adds
  PREPAREONLY entry mode, &dualqspi single/dual toggle, READ_ID_TEST
  subroutine for single-flash variants, and separate dialogs to flash
  BOOT.BIN at offset 0 and test-app/image_v1_signed.bin at the
  partition boot address.
- Document TRACE32 temp-memory ceiling (~128 MB) near FLASHFILE.Create:
  Load buffers the entire source file into temp memory, so files larger
  than that must be split externally (e.g. via dd) and each chunk loaded
  in its own FLASHFILE.ReProgram ALL / off bracket.

config/examples/zynqmp_sdcard.config, docs/Targets.md, include/sdhci.h,
src/boot_aarch64.c: peer review feedback from PR662.
1 parent ee943dd commit 8bff1fa

10 files changed

Lines changed: 360 additions & 248 deletions

config/examples/zynqmp_sdcard.config

Lines changed: 5 additions & 2 deletions
@@ -84,8 +84,11 @@ CFLAGS_EXTRA+=-DBOOT_PART_A=1
 CFLAGS_EXTRA+=-DBOOT_PART_B=2
 
 # Disk read chunk size for firmware loading (update_disk.c). 512KB gives the
-# best throughput (~1.4s for 32MB). The SDMA engine handles boundary crossings
-# every 4KB (SDHCI_DMA_THRESHOLD default) within each 512KB chunk.
+# best throughput (~1.4s for 32MB). The SDMA engine handles SDMA buffer
+# boundary crossings within each 512KB chunk; this boundary is 4KB by default
+# (auto-derived from SDHCI_DMA_THRESHOLD). To reduce boundary IRQs, override
+# SDHCI_DMA_BUFF_BOUNDARY independently, e.g.:
+# CFLAGS_EXTRA+=-DSDHCI_DMA_BUFF_BOUNDARY=SDHCI_SRS01_DMA_BUFF_512KB
 CFLAGS_EXTRA+=-DDISK_BLOCK_SIZE=0x80000
 
 # Linux rootfs is on partition 4. Device naming depends on whether both
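
Note on the boundary arithmetic in the comment above: for a buffer that starts on a boundary, a 512KB read with the default 4KB SDMA boundary contains 127 interior boundaries, while a 512KB boundary leaves none. The following is a minimal sketch of that count, not wolfBoot code. The function name `sdma_interior_boundaries` and its parameters are hypothetical, and the sketch does not model whether the controller also signals the boundary that coincides with the end of the transfer.

```c
#include <stdint.h>

/* Hypothetical helper (not part of wolfBoot): number of SDMA buffer
 * boundaries that fall strictly inside [addr, addr + len).
 * Returns 0 for an empty transfer or a zero boundary. */
uint32_t sdma_interior_boundaries(uint64_t addr, uint64_t len,
                                  uint64_t boundary)
{
    if (len == 0 || boundary == 0)
        return 0;
    /* Index of the boundary-sized window holding the last byte, minus the
     * index of the window holding the first byte. */
    return (uint32_t)(((addr + len - 1) / boundary) - (addr / boundary));
}
```

Calling it with `(0, 0x80000, 4096)` models the DISK_BLOCK_SIZE=0x80000 setting with the default boundary; passing `0x80000` as the boundary models the SDHCI_SRS01_DMA_BUFF_512KB override.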

docs/Targets.md

Lines changed: 6 additions & 3 deletions
@@ -2599,7 +2599,7 @@ qemu-system-aarch64 -machine xlnx-zcu102 -cpu cortex-a53 -serial stdio -display
 
 Use `config/examples/zynqmp_sdcard.config`. This uses the Arasan SDHCI controller (SD1 - external SD card slot on ZCU102) and an **MBR** partitioned SD card.
 
-wolfBoot unconditionally flushes the EL2 D-cache/I-cache and disables the EL2 MMU before handoff (see `el2_flush_and_disable_mmu` in `src/boot_aarch64_start.S`), satisfying the ARM64 Linux boot protocol with no extra config flag required.
+On the direct-jump handoff path, wolfBoot flushes the EL2 D-cache/I-cache and disables the EL2 MMU via `el2_flush_and_disable_mmu` in `src/boot_aarch64_start.S` when `BOOT_EL1` is not enabled and the current exception level is EL2. The ERET-to-EL1 handoff path is different, so this cleanup is not unconditional.
 
 **Partition layout**
 | Partition | Name | Size | Type | Contents |
@@ -2705,8 +2705,11 @@ The ZynqMP uses an Arasan SDHCI v3.0 controller. Key considerations:
   level. `SDHCI_FORCE_CARD_DETECT` is set in the config since FSBL already booted from
   the same SD card.
 - **`DISK_BLOCK_SIZE`**: Controls the firmware read chunk size in `update_disk.c` (default
-  64KB). This determines the per-read size passed to the SDHCI driver. Must be less than
-  the SDMA buffer boundary (4KB with the default threshold).
+  64KB). This determines the per-read size passed to the SDHCI driver. It does not need
+  to be smaller than `SDHCI_DMA_BUFF_BOUNDARY`; if a read crosses one or more SDMA buffer
+  boundaries, the SDHCI driver handles that via the normal SDMA boundary interrupt path.
+  In practice, this setting is a tradeoff: larger reads may trigger boundary IRQs more
+  often, while smaller reads reduce crossings but increase request overhead.
 
 **Debug**

hal/zynq.c

Lines changed: 75 additions & 5 deletions
@@ -504,7 +504,6 @@ static int csu_dma_config(int ch, int doSwap)
 int csu_aes(int enc, const uint8_t* iv, const uint8_t* in, uint8_t* out, uint32_t sz)
 {
     int ret;
-    uint32_t reg;
 
     /* Flush data cache for variables used */
     flush_dcache_range((unsigned long)iv, (unsigned long)iv + AES_GCM_TAG_SZ);
@@ -601,7 +600,6 @@ int csu_init(void)
 #endif
     uint32_t reg1 = pmu_mmio_read(CSU_IDCODE);
     uint32_t reg2 = pmu_mmio_read(CSU_VERSION);
-    uint64_t ms;
 
     wolfBoot_printf("CSU ID 0x%08x, Ver 0x%08x\n",
         reg1, reg2 & CSU_VERSION_MASK);
@@ -1361,6 +1359,36 @@ static int qspi_exit_4byte_addr(QspiDev_t* dev)
 }
 #endif
 
+/* Soft-reset the flash to a known idle state.
+ * FSBL / BootROM may leave the flash in an unexpected mode (XIP enabled,
+ * 4-byte addr set, auto-boot probing, etc.). Issue RESET_ENABLE (0x66) +
+ * RESET_MEMORY (0x99) to bring it back to defaults before first transaction.
+ * Per Micron MT25Q datasheet: t_SHSL2 ~ 40 us max after RESET_MEMORY. */
+static int qspi_flash_reset(QspiDev_t* dev)
+{
+    int ret;
+    uint8_t cmd[4]; /* size multiple of uint32_t */
+
+    memset(cmd, 0, sizeof(cmd));
+    cmd[0] = RESET_ENABLE_CMD;
+    ret = qspi_transfer(dev, cmd, 1, NULL, 0, NULL, 0, 0,
+        GQSPI_GEN_FIFO_MODE_SPI);
+#if defined(DEBUG_ZYNQ) && DEBUG_ZYNQ >= 2
+    wolfBoot_printf("Flash Reset Enable: Ret %d\n", ret);
+#endif
+    if (ret == GQSPI_CODE_SUCCESS) {
+        cmd[0] = RESET_MEMORY_CMD;
+        ret = qspi_transfer(dev, cmd, 1, NULL, 0, NULL, 0, 0,
+            GQSPI_GEN_FIFO_MODE_SPI);
+#if defined(DEBUG_ZYNQ) && DEBUG_ZYNQ >= 2
+        wolfBoot_printf("Flash Reset Memory: Ret %d\n", ret);
+#endif
+    }
+    /* Allow flash time to complete the reset and become ready. */
+    hal_delay_ms(1);
+    return ret;
+}
+
 /* QSPI functions */
 void qspi_init(void)
 {
@@ -1444,13 +1472,29 @@ void qspi_init(void)
 #if (GQSPI_CLK_REF / (2 << GQSPI_CLK_DIV)) <= 40000000 /* 40MHz */
     /* At <40 MHz, the Quad-SPI controller should be in non-loopback mode with
      * the clock and data tap delays bypassed. */
-    IOU_TAPDLY_BYPASS |= IOU_TAPDLY_BYPASS_LQSPI_RX;
+    /* IOU_TAPDLY_BYPASS is not writable from EL2/EL1 without going through PMU. */
+    if (current_el() <= 2) {
+        pmu_request(PM_MMIO_WRITE, IOU_TAPDLY_BYPASS_ADDR,
+            IOU_TAPDLY_BYPASS_LQSPI_RX, IOU_TAPDLY_BYPASS_LQSPI_RX,
+            0, NULL);
+    }
+    else {
+        IOU_TAPDLY_BYPASS |= IOU_TAPDLY_BYPASS_LQSPI_RX;
+    }
     GQSPI_LPBK_DLY_ADJ = 0;
     GQSPI_DATA_DLY_ADJ = 0;
 #elif (GQSPI_CLK_REF / (2 << GQSPI_CLK_DIV)) <= 100000000 /* 100MHz */
     /* At <100 MHz, the Quad-SPI controller should be in clock loopback mode
      * with the clock tap delay bypassed, but the data tap delay enabled. */
-    IOU_TAPDLY_BYPASS |= IOU_TAPDLY_BYPASS_LQSPI_RX;
+    /* IOU_TAPDLY_BYPASS is not writable from EL2/EL1 without going through PMU. */
+    if (current_el() <= 2) {
+        pmu_request(PM_MMIO_WRITE, IOU_TAPDLY_BYPASS_ADDR,
+            IOU_TAPDLY_BYPASS_LQSPI_RX, IOU_TAPDLY_BYPASS_LQSPI_RX,
+            0, NULL);
+    }
+    else {
+        IOU_TAPDLY_BYPASS |= IOU_TAPDLY_BYPASS_LQSPI_RX;
+    }
     GQSPI_LPBK_DLY_ADJ = GQSPI_LPBK_DLY_ADJ_USE_LPBK;
     GQSPI_DATA_DLY_ADJ = (GQSPI_DATA_DLY_ADJ_USE_DATA_DLY |
         GQSPI_DATA_DLY_ADJ_DATA_DLY_ADJ(2));
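
Note on the PMU path in the hunk above: the call passes IOU_TAPDLY_BYPASS_LQSPI_RX as both the mask and the value. Assuming PM_MMIO_WRITE performs a conventional masked read-modify-write (an assumption about the PMU firmware that this diff does not itself show), the result matches the direct `|=` in the else branch. The sketch below models only that bit arithmetic; `masked_write` is a hypothetical name and touches no hardware.

```c
#include <stdint.h>

/* Hypothetical model of a masked register write (not wolfBoot or PMU
 * firmware code): bits set in mask are taken from value, all other bits
 * keep their previous contents. */
uint32_t masked_write(uint32_t old, uint32_t mask, uint32_t value)
{
    return (old & ~mask) | (value & mask);
}
```

With mask and value set to the same single bit, `masked_write(old, bit, bit)` equals `old | bit`, which is why the two branches are expected to leave the register in the same state.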
@@ -1485,6 +1529,19 @@ void qspi_init(void)
     (void)reg_cfg;
     (void)reg_isr;
 
+    /* Issue flash soft reset so we start from a known state regardless of
+     * whatever mode FSBL/BootROM left the device in. Send to each chip in
+     * dual-parallel configurations by targeting both chip selects. */
+    mDev.mode = GQSPI_GEN_FIFO_MODE_SPI;
+    mDev.bus = GQSPI_GEN_FIFO_BUS_LOW;
+    mDev.cs = GQSPI_GEN_FIFO_CS_LOWER;
+    (void)qspi_flash_reset(&mDev);
+#if GQPI_USE_DUAL_PARALLEL == 1
+    mDev.bus = GQSPI_GEN_FIFO_BUS_UP;
+    mDev.cs = GQSPI_GEN_FIFO_CS_UPPER;
+    (void)qspi_flash_reset(&mDev);
+#endif
+
     /* ------ Flash Read ID (retry) ------ */
     timeout = 0;
     while (++timeout < QSPI_FLASH_READY_TRIES) {
@@ -1577,6 +1634,10 @@ void hal_init(void)
     wolfBoot_printf(bootMsg);
     wolfBoot_printf("Current EL: %d\n", current_el());
 
+#ifndef WOLFBOOT_REPRODUCIBLE_BUILD
+    wolfBoot_printf("Build: %s %s\n", __DATE__, __TIME__);
+#endif
+
 #if defined(EXT_FLASH) && (EXT_FLASH == 1)
     qspi_init();
 #endif
@@ -1810,14 +1871,23 @@ void RAMFUNCTION ext_flash_unlock(void)
 }
 
 #if defined(MMU) && defined(__WOLFBOOT)
+/* Fallback timer frequency if CNTFRQ_EL0 is not configured (e.g. boot path
+ * that did not run ATF/BL31). ZynqMP system counter is 100 MHz. */
+#ifndef ZYNQMP_TIMER_CLK_FREQ
+#define ZYNQMP_TIMER_CLK_FREQ 100000000ULL
+#endif
+
 /* Get current time in microseconds using ARMv8 generic timer */
 uint64_t hal_get_timer_us(void)
 {
     uint64_t count, freq;
     __asm__ volatile("mrs %0, CNTPCT_EL0" : "=r"(count));
     __asm__ volatile("mrs %0, CNTFRQ_EL0" : "=r"(freq));
+    /* Fall back to a known frequency rather than returning 0, so udelay()
+     * callers that spin on hal_get_timer_us() advancing remain monotonic
+     * (matches hal/versal.c). */
     if (freq == 0)
-        return 0;
+        freq = ZYNQMP_TIMER_CLK_FREQ;
     /* Use __uint128_t to avoid overflow of (count * 1e6) at long uptimes
      * (would overflow uint64_t after ~51h at 100MHz). */
     return (uint64_t)(((__uint128_t)count * 1000000ULL) / freq);
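
Note on the "~51h" figure in the comment above: a 64-bit product `count * 1000000` stops fitting once the counter has run for about UINT64_MAX / (1000000 * freq) seconds, which is roughly 184467 s (about 51.2 hours) at 100 MHz. The sketch below reproduces that arithmetic and the fallback conversion on a host machine. It is an illustration, not the wolfBoot implementation: `us_mul_overflow_seconds`, `ticks_to_us`, and `FALLBACK_TIMER_HZ` are hypothetical names, and the counter value is passed in rather than read from CNTPCT_EL0.

```c
#include <stdint.h>

/* Assumed fallback of 100 MHz, mirroring the value used in the diff. */
#define FALLBACK_TIMER_HZ 100000000ULL

/* Whole seconds of uptime before count * 1000000 overflows uint64_t.
 * freq_hz must be non-zero. */
uint64_t us_mul_overflow_seconds(uint64_t freq_hz)
{
    return (UINT64_MAX / 1000000ULL) / freq_hz;
}

/* Tick-to-microsecond conversion using a 128-bit intermediate product.
 * __uint128_t is a GCC/Clang extension, as in the original code. */
uint64_t ticks_to_us(uint64_t count, uint64_t freq_hz)
{
    if (freq_hz == 0)
        freq_hz = FALLBACK_TIMER_HZ; /* never divide by zero */
    return (uint64_t)(((__uint128_t)count * 1000000ULL) / freq_hz);
}
```

Because the frequency falls back instead of the function returning 0, a caller that waits for the returned value to advance still sees time move forward when CNTFRQ_EL0 was never programmed.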

hal/zynq.ld

Lines changed: 9 additions & 3 deletions
@@ -13,10 +13,16 @@ MEMORY
 {
     /* psu_ddr_0_MEM_0 : ORIGIN = 0x0, LENGTH = 0x80000000 */
     /* wolfBoot DDR location (2MB reserved):
-     * Loaded by FSBL/BL31 to DDR at 0x8000000 (128MB)
-     * Same address used for both QSPI and SD card boot
+     * Loaded by FSBL/BL31 to DDR at 0x10000000 (256MB).
+     * Must be above the OS kernel load region. For FIT images whose "load"
+     * address is 0x00200000 (typical for AArch64 kernels) and whose payload
+     * approaches or exceeds ~126MB, the kernel memcpy into 0x00200000..
+     * sweeps across 0x08000000 and would clobber wolfBoot's own running code
+     * if linked there. 0x10000000 leaves headroom below
+     * WOLFBOOT_LOAD_ADDRESS (0x18000000) where the signed image is staged
+     * for verification. Same address used for QSPI and SD card boot.
      */
-    psu_ddr_0_MEM_0 : ORIGIN = 0x8000000, LENGTH = 0x200000
+    psu_ddr_0_MEM_0 : ORIGIN = 0x10000000, LENGTH = 0x200000
     psu_ddr_1_MEM_0 : ORIGIN = 0x800000000, LENGTH = 0x80000000
     psu_ocm_ram_0_MEM_0 : ORIGIN = 0xFFFC0000, LENGTH = 0x40000
     psu_qspi_linear_0_MEM_0 : ORIGIN = 0xC0000000, LENGTH = 0x20000000
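
Note on the "~126MB" threshold in the comment above: with a kernel load address of 0x00200000, the copy first touches the old wolfBoot region once the payload exceeds 0x08000000 - 0x00200000 = 0x07E00000 bytes (126 MiB). With the new origin, the same limit is 0x10000000 - 0x00200000 = 0x0FE00000 bytes (254 MiB). The check can be sketched as a half-open interval overlap test; `ranges_overlap` is a hypothetical helper, not a wolfBoot function, and the addresses are the ones quoted in this diff.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the half-open byte ranges
 * [a, a + a_len) and [b, b + b_len) share at least one byte. */
int ranges_overlap(uint64_t a, uint64_t a_len, uint64_t b, uint64_t b_len)
{
    return (a < b + b_len) && (b < a + a_len);
}
```

For example, `ranges_overlap(0x00200000, payload_size, 0x08000000, 0x200000)` models the old layout, where `payload_size` is a placeholder for the size of the copied kernel image and 0x200000 is the LENGTH reserved for wolfBoot.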

include/sdhci.h

Lines changed: 8 additions & 1 deletion
@@ -57,7 +57,13 @@
 #define DISK_TEST_BLOCK_ADDR 149504 /* ~76MB offset */
 #endif
 
-/* Auto-select DMA buffer boundary based on threshold */
+/* DMA buffer boundary: how often the SDMA engine pauses to refresh its
+ * address pointer (handled by sdhci_irq_handler() via SDHCI_SRS12_DMAINT).
+ * This is a throughput knob and is independent of SDHCI_DMA_THRESHOLD
+ * (which controls when to switch from PIO to SDMA). Override in target
+ * .config to match the largest expected single transfer for fewer
+ * boundary IRQs; otherwise auto-select based on the threshold. */
+#ifndef SDHCI_DMA_BUFF_BOUNDARY
 #if (SDHCI_DMA_THRESHOLD > (256U * 1024U))
 #define SDHCI_DMA_BUFF_BOUNDARY SDHCI_SRS01_DMA_BUFF_512KB
 #if (SDHCI_DMA_THRESHOLD != (512U * 1024U))
@@ -99,6 +105,7 @@
 #warning "SDHCI_DMA_THRESHOLD rounded up to 4KB (minimum)"
 #endif
 #endif
+#endif /* !SDHCI_DMA_BUFF_BOUNDARY */
 
 /* Timeouts */
 #ifndef SDHCI_INIT_TIMEOUT_US

src/boot_aarch64.c

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@
 #include "hal/versal.h"
 #elif defined(TARGET_zynq)
 #include "hal/zynq.h"
-#elif defined(TARGET_ls1028a)
+#elif defined(TARGET_nxp_ls1028a)
 #include "hal/nxp_ls1028a.h"
 #endif

tools/scripts/nxp_t1040/t1040_debug.cmm

Lines changed: 4 additions & 2 deletions
@@ -18,8 +18,10 @@
 ; 3. wolfBoot runs from DDR (0x7FF00000)
 ; ------------------------------------------------------------------------------
 
-; Base directory for wolfBoot build output (adjust to match your build path)
-&basedir="/home/davidgarske/GitHub/wolfboot-alt"
+; Base directory for wolfBoot build output. "." means the TRACE32 current
+; working directory - set this to your wolfBoot checkout path if running
+; TRACE32 from elsewhere (e.g. "C:/src/wolfBoot" or "/home/user/wolfBoot").
+&basedir="."
 
 PRINT "========================================"
 PRINT "T1040 wolfBoot Debug Session"

tools/scripts/nxp_t1040/t1040_flash.cmm

Lines changed: 9 additions & 5 deletions
@@ -25,11 +25,15 @@
 ; 0xEFFFC000: Stage 1 loader (16 KB, includes reset vector)
 ; ------------------------------------------------------------------------------
 
-; Base directory for wolfBoot build output (adjust to match your build path)
-&basedir="/home/davidgarske/GitHub/wolfboot-alt"
-
-; Persistent backup directory (survives make clean)
-&backupdir="/home/davidgarske/Projects/NXP/t1040rdb"
+; Base directory for wolfBoot build output. "." means the TRACE32 current
+; working directory - set this to your wolfBoot checkout path if running
+; TRACE32 from elsewhere (e.g. "C:/src/wolfBoot" or "/home/user/wolfBoot").
+&basedir="."
+
+; Persistent backup directory (survives make clean). Leave as "." to keep
+; artifacts alongside the build tree, or point elsewhere to preserve
+; signed/backup images across source cleans.
+&backupdir="."
 
 ; FLASH Number of banks
 ; The JS28F00AM29EWHA is a single 128MB NOR chip. CPLD virtual banking
