OpenSBI Firmware Source Code Analysis (2)

冷善

2023-12-01

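0. Preface

Picking up from the previous post, this time we start the analysis from the source of sbi_init.

1. Current state

On entry to sbi_init, the memory layout is: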

| end of link address | hart 0 stack | hart 0 scratch | hart 1 stack | hart 1 scratch | ...

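Each CPU enters sbi_init with its own scratch structure as the argument. Before entering sbi_init, each CPU has already set mtvec to the address of _trap_handler. fw_base.S does not set mstatus or mie, but QEMU initializes both to 0, so all interrupts are disabled at this point. The mscratch register holds the address of the scratch structure belonging to the current hart.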

2. coldboot VS warmboot

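The first CPU to set the global variable coldboot_lottery to 1 runs init_coldboot; every other CPU runs init_warmboot. A CPU in init_warmboot first enables the M-mode software interrupt, sets the bit for its own hartid in the global coldboot_wait_hmask, and then loops on the wfi instruction until the cold-boot CPU has finished initialization.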
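The election itself can be sketched with a single atomic exchange. This is a minimal host-side illustration using C11 atomics, not the OpenSBI code: the names lottery and wins_coldboot_lottery are made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the coldboot_lottery global. */
static atomic_int lottery = 0;

/*
 * Returns true for exactly one caller: the one that swaps 0 -> 1 first.
 * atomic_exchange returns the previous value, so only the first caller
 * ever observes 0, no matter how many harts race here.
 */
static bool wins_coldboot_lottery(void)
{
	return atomic_exchange(&lottery, 1) == 0;
}
```

The winner would take the cold-boot path and every later caller the warm-boot path; no lock is needed because the exchange is a single atomic step.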

static void wait_for_coldboot(struct sbi_scratch *scratch, u32 hartid)
{
	unsigned long saved_mie, cmip;
	const struct sbi_platform *plat = sbi_platform_ptr(scratch);

	/* Save MIE CSR */
	saved_mie = csr_read(CSR_MIE);

	/* Set MSIE bit to receive IPI */
	csr_set(CSR_MIE, MIP_MSIP);

	/* Acquire coldboot lock */
	spin_lock(&coldboot_lock);

	/* Mark current HART as waiting */
	sbi_hartmask_set_hart(hartid, &coldboot_wait_hmask);

	/* Wait for coldboot to finish using WFI */
	while (!coldboot_done) {
		spin_unlock(&coldboot_lock);
		do {
			wfi();
			cmip = csr_read(CSR_MIP);
		 } while (!(cmip & MIP_MSIP));
		spin_lock(&coldboot_lock);
	};
	....

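The interesting part is that the global interrupt bit mstatus.MIE is not turned on while the hart waits in wfi. The RISC-V privileged specification describes the wfi instruction as follows: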

The WFI instruction can also be executed when interrupts are disabled. The operation of WFI must be unaffected by the global interrupt bits in mstatus (MIE/SIE/UIE) and the delegation registers [m|s]ideleg (i.e., the hart must resume if a locally enabled interrupt becomes pending, even if it has been delegated to a less-privileged mode), but should honor the individual interrupt enables (e.g., MTIE) (i.e., implementations should avoid resuming the hart if the interrupt is pending but not individually enabled). WFI is also required to resume execution for locally enabled interrupts pending at any privilege level, regardless of the global interrupt enable at each privilege level.

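This means that as long as mie.MSIE is set, the CPU returns from wfi even when global interrupts are off, but no trap is taken. (The code passes MIP_MSIP to csr_set because MSIE in mie and MSIP in mip occupy the same bit position.) After returning from wfi, the code above checks the mip register to see whether the cold-boot CPU has sent an inter-processor software interrupt.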
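The difference between "wfi resumes" and "a trap is taken" can be written down as two predicates. This is an idealized model for a hart running in M-mode, ignoring delegation and the fact that implementations are allowed to resume from wfi spuriously; the function names are invented for the sketch, while the bit positions come from the privileged specification.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSTATUS_MIE (UINT64_C(1) << 3) /* global M-mode interrupt enable */
#define MIP_MSIP    (UINT64_C(1) << 3) /* machine software interrupt bit in mip and mie */

/* wfi resumes once some interrupt is both pending and individually enabled. */
static bool wfi_resumes(uint64_t mip, uint64_t mie)
{
	return (mip & mie) != 0;
}

/* Taking the trap in M-mode additionally requires mstatus.MIE to be set. */
static bool m_mode_trap_taken(uint64_t mstatus, uint64_t mip, uint64_t mie)
{
	return (mstatus & MSTATUS_MIE) != 0 && (mip & mie) != 0;
}
```

With mstatus.MIE clear and the software interrupt both enabled and pending, wfi_resumes is true while m_mode_trap_taken is false, which is exactly the situation wait_for_coldboot relies on.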

3. Initialization function 1: sbi_scratch_init

int sbi_scratch_init(struct sbi_scratch *scratch)
{
	u32 i;
	const struct sbi_platform *plat = sbi_platform_ptr(scratch);

	for (i = 0; i < SBI_HARTMASK_MAX_BITS; i++) {
		if (sbi_platform_hart_invalid(plat, i))
			continue;
		hartid_to_scratch_table[i] =
			((hartid2scratch)scratch->hartid_to_scratch)(i,
					sbi_platform_hart_index(plat, i));
		if (hartid_to_scratch_table[i])
			last_hartid_having_scratch = i;
	}

	return 0;
}

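This function is executed only by the cold-boot CPU. It enumerates every possible hartid, builds the hartid -> struct sbi_scratch mapping from the memory layout described earlier, and records the last valid hartid in last_hartid_having_scratch. From then on, any loop over hartids only has to go up to last_hartid_having_scratch.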
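The table-building loop can be reproduced on a host with a stub lookup function. Everything here is hypothetical scaffolding for illustration: MAX_HARTS stands in for SBI_HARTMASK_MAX_BITS, struct scratch is a one-field placeholder, and demo_lookup pretends that only harts 0, 1 and 3 exist.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_HARTS 8 /* hypothetical stand-in for SBI_HARTMASK_MAX_BITS */

struct scratch { uint32_t hartid; }; /* placeholder, not the real layout */

static struct scratch *table[MAX_HARTS];
static uint32_t last_hartid;

/* lookup returns NULL for hart IDs that have no scratch area. */
static void build_table(struct scratch *(*lookup)(uint32_t hartid))
{
	for (uint32_t i = 0; i < MAX_HARTS; i++) {
		table[i] = lookup(i);
		if (table[i])
			last_hartid = i; /* remember the highest populated slot */
	}
}

/* Example platform: only harts 0, 1 and 3 have a scratch area. */
static struct scratch pool[3] = { {0}, {1}, {3} };

static struct scratch *demo_lookup(uint32_t hartid)
{
	for (size_t i = 0; i < 3; i++)
		if (pool[i].hartid == hartid)
			return &pool[i];
	return NULL;
}
```

After build_table(demo_lookup), the slot for hart 2 stays NULL and last_hartid is 3, so later loops can stop at index 3 and skip empty slots with a NULL check.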

4. The sbi_scratch_alloc_offset function

unsigned long sbi_scratch_alloc_offset(unsigned long size, const char *owner);
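// Allocates size bytes and returns the offset of the allocated space within the scratch area; the owner argument has no practical effect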

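Each CPU's scratch space is 4096 bytes. struct sbi_scratch occupies only the first 80 bytes; the remaining bytes are handed out by this function. The allocation logic is simple: the global variable extra_offset marks the start of the unallocated region, and each allocation advances extra_offset by the number of bytes requested (so nothing is ever freed). An allocation reserves the same offset in every CPU's scratch space at once, which is why a single extra_offset variable is enough.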
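A bump allocator of this kind fits in a few lines. The sketch below uses the sizes quoted in the text (4096 and 80); rounding each request up to pointer size and returning 0 on failure are assumptions of the sketch, and scratch_alloc_offset is a simplified stand-in without the owner argument.

```c
#include <stddef.h>

#define SCRATCH_SIZE   4096UL /* per-hart scratch area, per the text */
#define SCRATCH_HEADER 80UL   /* bytes used by the fixed structure, per the text */

/* One global cursor: the same offset is valid in every hart's scratch area. */
static unsigned long extra_offset = SCRATCH_HEADER;

/* Returns an offset from the start of a scratch area, or 0 on failure. */
static unsigned long scratch_alloc_offset(unsigned long size)
{
	unsigned long align = sizeof(void *);
	unsigned long ret;

	/* Round the request up to pointer size (assumption of this sketch). */
	size = (size + align - 1) & ~(align - 1);
	if (size == 0 || size > SCRATCH_SIZE - extra_offset)
		return 0;

	ret = extra_offset;
	extra_offset += size; /* bump the cursor; nothing is ever freed */
	return ret;
}
```

Because 0 can never be a valid offset (the header sits there), it doubles as the error value, which matches the `if (!hart_data_offset)` check seen later in sbi_hsm_init.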

5. Allocating space for the init_count variable

init_count_offset = sbi_scratch_alloc_offset(__SIZEOF_POINTER__,
						     "INIT_COUNT");

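Dynamic allocations like this one are performed only by the cold-boot CPU. These 8 bytes (4 bytes on a 32-bit target) store the number of times the CPU has booted: the counter is incremented each time the CPU completes a cold boot or a warm boot. The value can exceed 1, because OpenSBI implements the sbi_hart_stop SBI call in a rather interesting way.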

void __noreturn sbi_hsm_exit(struct sbi_scratch *scratch)
{
	//.....
	void (*jump_warmboot)(void) = (void (*)(void))scratch->warmboot_addr;

	// .....
	if (sbi_platform_has_hart_hotplug(plat)) {
		sbi_platform_hart_stop(plat);
		/* It should never reach here */
		goto fail_exit;
	}

	/**
	 * As platform is lacking support for hotplug, directly jump to warmboot
	 * and wait for interrupts in warmboot. We do it preemptively in order
	 * preserve the hart states and reuse the code path for hotplug.
	 */
	jump_warmboot();

fail_exit:
	/* It should never reach here */
	sbi_printf("ERR: Failed stop hart [%u]\n", current_hartid());
	sbi_hart_hang();
}

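S-mode software calls sbi_hart_stop to ask the SBI to stop the calling hart. If the platform provides no explicit way to stop a hart, OpenSBI eventually jumps to _warm_start and enters sbi_init again, then stays in sbi_hsm_hart_wait, looping until S-mode software calls sbi_hart_start to wake the hart up (note that coldboot_done is already 1 at this point, so the hart does not spin in wait_for_coldboot).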

static void sbi_hsm_hart_wait(struct sbi_scratch *scratch, u32 hartid)
{
	// .....
	/* Wait for hart_add call*/
	while (atomic_read(&hdata->state) != SBI_HART_STARTING) {
		wfi();
	};
	// ....
}

Once the hart is woken up again, it runs through the warm-boot path once more and init_count is incremented by 1.
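How a per-hart counter lives at a shared offset can be sketched as follows. The helper scratch_offset_ptr is written in the spirit of sbi_scratch_offset_ptr but is a simplified function, not the real macro, and bump_init_count is a hypothetical name.

```c
#include <stddef.h>

/* Turn (scratch base, offset) into a pointer into that hart's scratch area. */
static void *scratch_offset_ptr(void *scratch, unsigned long offset)
{
	return (char *)scratch + offset;
}

/*
 * Increment the boot counter stored at init_count_offset inside one
 * hart's scratch area and return the new value. The offset is shared by
 * all harts; the base pointer selects whose counter is touched.
 */
static unsigned long bump_init_count(void *scratch,
				     unsigned long init_count_offset)
{
	unsigned long *count = scratch_offset_ptr(scratch, init_count_offset);

	return ++(*count);
}
```

A hart that is stopped and later restarted passes its own scratch base again, so its counter goes from 1 to 2 while the counters of other harts are untouched.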

6. Initialization function 2: sbi_hsm_init

int sbi_hsm_init(struct sbi_scratch *scratch, u32 hartid, bool cold_boot)
{
	u32 i;
	struct sbi_scratch *rscratch;
	struct sbi_hsm_data *hdata;

	if (cold_boot) {
		hart_data_offset = sbi_scratch_alloc_offset(sizeof(*hdata),
							    "HART_DATA");
		if (!hart_data_offset)
			return SBI_ENOMEM;

		/* Initialize hart state data for every hart */
		for (i = 0; i <= sbi_scratch_last_hartid(); i++) {
			rscratch = sbi_hartid_to_scratch(i);
			if (!rscratch)
				continue;

			hdata = sbi_scratch_offset_ptr(rscratch,
						       hart_data_offset);
			ATOMIC_INIT(&hdata->state,
			(i == hartid) ? SBI_HART_STARTING : SBI_HART_STOPPED);
		}
	} else {
		sbi_hsm_hart_wait(scratch, hartid);
	}
	return 0;
}

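As shown, the cold-boot CPU is responsible for calling sbi_scratch_alloc_offset to reserve space for a struct sbi_hsm_data for every CPU. The structure has a single member, state, which records the state of that CPU. The cold-boot CPU's state is initialized to SBI_HART_STARTING and every other CPU's to SBI_HART_STOPPED.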

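A warm-boot CPU instead calls sbi_hsm_hart_wait. As we saw in the previous section, the CPU loops in that function, waiting for the cold-boot CPU to change its state from SBI_HART_STOPPED to SBI_HART_STARTING (note that a warm-boot CPU only reaches this function after coldboot_done has been set to 1, which happens much later).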
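The handshake on the state field can be modelled with a compare-and-exchange. This is a simplified sketch with C11 atomics and only three states; the enum values and function names are invented for the example and do not mirror the OpenSBI definitions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified state set for the sketch. */
enum hart_state { HART_STOPPED, HART_STARTING, HART_STARTED };

/*
 * Performed on behalf of another hart: only a stopped hart may be moved
 * to STARTING, so a second request for the same hart fails.
 */
static bool request_hart_start(atomic_int *state)
{
	int expected = HART_STOPPED;

	return atomic_compare_exchange_strong(state, &expected, HART_STARTING);
}

/* Condition the waiting hart re-checks after every wfi wake-up. */
static bool may_leave_wait(atomic_int *state)
{
	return atomic_load(state) == HART_STARTING;
}
```

Using compare-and-exchange rather than a plain store means two concurrent start requests cannot both succeed.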

7. Initialization function 3: sbi_platform_early_init

On the generic platform, this function directly calls generic_early_init.

static int generic_early_init(bool cold_boot)
{
	int rc;

	if (generic_plat && generic_plat->early_init) {
		rc = generic_plat->early_init(cold_boot, generic_plat_match);
		if (rc)
			return rc;
	}

	if (!cold_boot)
		return 0;

	return fdt_reset_init();
}

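As shown, by default this function calls fdt_reset_init on cold boot and is a no-op on warm boot (unless it is intercepted by a generic_plat that was set up earlier by fw_platform_lookup_special).

fdt_reset_init initializes the platform-level hardware that handles system shutdown and reboot. Hardware initialization in OpenSBI always follows the same pattern: OpenSBI ships several drivers capable of driving a given class of hardware, and each driver has its own match_table. When the hardware needs to be initialized, OpenSBI walks the device tree and compares each driver's match_table with the compatible property of the device-tree nodes. If a match is found, that driver is used for the hardware and the global variable current_driver is set to point to it.

Taking fdt_reset_init as an example, fdt_reset.c contains the following declarations: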

extern struct fdt_reset fdt_reset_sifive;
extern struct fdt_reset fdt_reset_htif;

And in fdt_reset_sifive.c:

static const struct fdt_match sifive_test_reset_match[] = {
	{ .compatible = "sifive,test1" },
	{ },
};

struct fdt_reset fdt_reset_sifive = {
	.match_table = sifive_test_reset_match,
	.init = sifive_test_reset_init,   // function this driver uses to initialize the hardware
	.system_reset = sifive_test_system_reset // function this driver uses to reset the system
};

The device tree that QEMU passes to OpenSBI contains the following node:

test@100000 {
			phandle = <0x0a>;
			reg = <0x00 0x100000 0x00 0x1000>;
			compatible = "sifive,test1\0sifive,test0\0syscon";
		};

The source of fdt_reset_init makes the description above more concrete:

int fdt_reset_init(void)
{
	int pos, noff, rc;
	struct fdt_reset *drv;
	const struct fdt_match *match;
	void *fdt = sbi_scratch_thishart_arg1_ptr();

	for (pos = 0; pos < array_size(reset_drivers); pos++) {
		drv = reset_drivers[pos];

		noff = fdt_find_match(fdt, -1, drv->match_table, &match);
		if (noff < 0)
			continue;

		if (drv->init) {
			rc = drv->init(fdt, noff, match); // enters sifive_test_reset_init
			if (rc)
				return rc;
		}
		current_driver = drv;
		break;
	}

	return 0;
}

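So fdt_reset_sifive becomes the driver for the hardware that handles platform reboot and shutdown. Its initialization function, sifive_test_reset_init, is simple: it just records the device's physical address in the global variable sifive_test_base.

The purpose of initializing this driver is to support the sbi_shutdown SBI call.

8. Initialization function 4: sbi_hart_init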
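The matching step can be sketched without libfdt. A compatible property is a sequence of NUL-separated strings, and a driver matches if any entry of its table appears in that sequence. The struct match type and both functions are simplified stand-ins written for this example, not the OpenSBI or libfdt API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified match entry; a NULL compatible terminates the table. */
struct match { const char *compatible; };

/*
 * compat/len describe a device-tree "compatible" property: several
 * strings stored back to back, each terminated by a NUL byte.
 */
static bool compatible_contains(const char *compat, size_t len,
				const char *want)
{
	size_t pos = 0;

	while (pos < len) {
		if (strcmp(compat + pos, want) == 0)
			return true;
		pos += strlen(compat + pos) + 1; /* skip to the next string */
	}
	return false;
}

/* Returns the first table entry found in the property, or NULL. */
static const struct match *find_match(const struct match *table,
				      const char *compat, size_t len)
{
	for (; table->compatible; table++)
		if (compatible_contains(compat, len, table->compatible))
			return table;
	return NULL;
}
```

For the QEMU node shown above, a table containing "sifive,test1" matches the first string of the property, so the search stops at that driver.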

int sbi_hart_init(struct sbi_scratch *scratch, u32 hartid, bool cold_boot)
{
	// some non-essential checks have been removed
	int rc;

	if (cold_boot) {
		hart_features_offset = sbi_scratch_alloc_offset(
						sizeof(struct hart_features),
						"HART_FEATURES");
	}

	hart_detect_features(scratch);

	mstatus_init(scratch, hartid);

	rc = fp_init(hartid);

	rc = delegate_traps(scratch, hartid);

	return pmp_init(scratch, hartid);
}

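Apart from the cold boot having to call sbi_scratch_alloc_offset to reserve space for hart_features, every hart calls five initialization functions. Let's look at what each of them does.

sbi_hart.h defines the following features: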

enum sbi_hart_features {
	/** Hart has PMP support */
	SBI_HART_HAS_PMP = (1 << 0),
	/** Hart has S-mode counter enable */
	SBI_HART_HAS_SCOUNTEREN = (1 << 1),
	/** Hart has M-mode counter enable */
	SBI_HART_HAS_MCOUNTEREN = (1 << 2),
	/** HART has timer csr implementation in hardware */
	SBI_HART_HAS_TIME = (1 << 3),

	/** Last index of Hart features*/
	SBI_HART_HAS_LAST_FEATURE = SBI_HART_HAS_TIME,
};

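The idea behind hart_detect_features is simple: read the corresponding control register and see whether the access raises an exception, which tells us whether the platform has that feature. Once hart_detect_features has finished, the hart_features structure for this hart is fully initialized.

mstatus_init initializes registers such as mstatus, mie, and satp according to the RISC-V extensions the platform supports and, where possible, enables the hardware counters the platform provides.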
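The probe-and-check pattern can be simulated on a host. In this sketch the hardware is replaced by a bitmask of "implemented CSRs" and the trap by a flag in a small struct; csr_read_guarded, implemented_csrs and the CSR indices are all hypothetical and exist only to show the control flow.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEATURE_PMP  (1u << 0)
#define FEATURE_TIME (1u << 3)

struct trap_info { bool taken; };

/* Simulated hart: bit n set means "CSR number n is implemented". */
static uint32_t implemented_csrs;

/* Simulated guarded read: records a trap instead of crashing. */
static unsigned long csr_read_guarded(unsigned csr, struct trap_info *trap)
{
	trap->taken = (implemented_csrs & (1u << csr)) == 0;
	return 0;
}

/* Probe each register once and collect the features that did not trap. */
static uint32_t detect_features(unsigned pmp_csr, unsigned time_csr)
{
	struct trap_info trap;
	uint32_t features = 0;

	csr_read_guarded(pmp_csr, &trap);
	if (!trap.taken)
		features |= FEATURE_PMP;

	csr_read_guarded(time_csr, &trap);
	if (!trap.taken)
		features |= FEATURE_TIME;

	return features;
}
```

On real hardware the guarded read has to cooperate with the trap handler so that an illegal-instruction trap is recorded and skipped rather than treated as fatal.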

static void mstatus_init(struct sbi_scratch *scratch, u32 hartid)
{
	unsigned long mstatus_val = 0;

	/* Enable FPU */
	if (misa_extension('D') || misa_extension('F'))
		mstatus_val |=  MSTATUS_FS;

	/* Enable Vector context */
	if (misa_extension('V'))
		mstatus_val |=  MSTATUS_VS;

	csr_write(CSR_MSTATUS, mstatus_val);

	/* Enable user/supervisor use of perf counters */
	if (misa_extension('S') &&
	    sbi_hart_has_feature(scratch, SBI_HART_HAS_SCOUNTEREN))
		csr_write(CSR_SCOUNTEREN, -1);
	if (sbi_hart_has_feature(scratch, SBI_HART_HAS_MCOUNTEREN))
		csr_write(CSR_MCOUNTEREN, -1);

	/* Disable all interrupts */
	csr_write(CSR_MIE, 0);

	/* Disable S-mode paging */
	if (misa_extension('S'))
		csr_write(CSR_SATP, 0);
}

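fp_init sets the floating-point registers and the fcsr register to 0.

delegate_traps initializes the MIDELEG and MEDELEG registers, delegating a subset of interrupts and exceptions to S-mode. One important detail is that the illegal instruction exception is not delegated to S-mode.

pmp_init initializes the PMP registers: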
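The two delegation values are just bitmasks indexed by cause number. The cause numbers and interrupt bit positions below come from the RISC-V privileged specification; which causes get delegated is an assumption of this sketch (supervisor interrupts plus a few common exceptions), not a copy of the delegate_traps source.

```c
#include <stdint.h>

/* Exception cause numbers from the RISC-V privileged specification. */
#define CAUSE_MISALIGNED_FETCH    0
#define CAUSE_ILLEGAL_INSTRUCTION 2
#define CAUSE_BREAKPOINT          3
#define CAUSE_USER_ECALL          8
#define CAUSE_FETCH_PAGE_FAULT    12
#define CAUSE_LOAD_PAGE_FAULT     13
#define CAUSE_STORE_PAGE_FAULT    15

/* Supervisor interrupt bits shared by mip, mie and mideleg. */
#define MIP_SSIP (1u << 1)
#define MIP_STIP (1u << 5)
#define MIP_SEIP (1u << 9)

/* Interrupts handed to S-mode: supervisor software, timer, external. */
static uint32_t mideleg_value(void)
{
	return MIP_SSIP | MIP_STIP | MIP_SEIP;
}

/* Exceptions handed to S-mode; illegal instruction is deliberately absent. */
static uint32_t medeleg_value(void)
{
	return (1u << CAUSE_MISALIGNED_FETCH) |
	       (1u << CAUSE_BREAKPOINT) |
	       (1u << CAUSE_USER_ECALL) |
	       (1u << CAUSE_FETCH_PAGE_FAULT) |
	       (1u << CAUSE_LOAD_PAGE_FAULT) |
	       (1u << CAUSE_STORE_PAGE_FAULT);
}
```

Leaving bit CAUSE_ILLEGAL_INSTRUCTION clear means such exceptions always trap to M-mode first, giving the firmware a chance to handle them before S-mode sees anything.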

static int pmp_init(struct sbi_scratch *scratch, u32 hartid)
{
	// unimportant code omitted
	/* Firmware PMP region to protect OpenSBI firmware */
	pmp_set(pmp_idx++, 0, fw_start, fw_size_log2);

	for (i = 0; i < count && pmp_idx < (pmp_count - 1); i++) {
		if (sbi_platform_pmp_region_info(plat, hartid, i, &prot, &addr,
						 &log2size))
			continue;
		pmp_set(pmp_idx++, prot, addr, log2size);
	}
	/*
	 * Default PMP region for allowing S-mode and U-mode access to
	 * memory not covered by:
	 * 1) Firmware PMP region
	 * 2) Platform specific PMP regions
	 */
	pmp_set(pmp_idx++, PMP_R | PMP_W | PMP_X, 0, __riscv_xlen);

	return 0;
}

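The PMP regions fall into three groups. The first protects the OpenSBI firmware itself and denies S-mode any access. The second depends on platform-defined hooks (for any addresses the platform does not want S-mode to access; qemu-virt has none). The third is the default entry, which lets S-mode access all remaining physical addresses. The third entry is needed because of the following statement in the RISC-V specification: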

If no PMP entry matches an M-mode access, the access succeeds. If no PMP entry matches an S-mode or U-mode access, but at least one PMP entry is implemented, the access fails.
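The interaction between the firmware entry and the catch-all entry follows from PMP's priority rule: the lowest-numbered matching entry decides. The sketch below models that rule for S/U-mode accesses with naturally aligned power-of-two regions; struct pmp_entry and the function names are invented for the example, and details such as locking or TOR mode are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define PMP_R 1u
#define PMP_W 2u
#define PMP_X 4u

/* Simplified entry: naturally aligned region of 2^log2size bytes. */
struct pmp_entry { uint64_t base; unsigned log2size; unsigned prot; };

static bool entry_matches(const struct pmp_entry *e, uint64_t addr)
{
	if (e->log2size >= 64)
		return true; /* covers the whole address space */
	return (addr >> e->log2size) == (e->base >> e->log2size);
}

/*
 * S/U-mode check: the lowest-numbered matching entry decides, and an
 * access that matches no entry is denied.
 */
static bool s_mode_access_ok(const struct pmp_entry *pmp, int n,
			     uint64_t addr, unsigned need)
{
	for (int i = 0; i < n; i++)
		if (entry_matches(&pmp[i], addr))
			return (pmp[i].prot & need) == need;
	return false;
}
```

With a no-permission firmware entry at index 0 and an RWX catch-all after it, addresses inside the firmware are denied and everything else is allowed; without the catch-all, every S-mode access outside the listed regions would fail.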

In summary, sbi_hart_init detects hart features and initializes the important control registers.

9. Initialization function 5: sbi_platform_irqchip_init

On the generic platform, this function calls fdt_irqchip_init.

int fdt_irqchip_init(bool cold_boot)
{
	int rc;

	if (cold_boot) {
		rc = fdt_irqchip_cold_init();
		if (rc)
			return rc;
	}

	return fdt_irqchip_warm_init();
}

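fdt_irqchip_cold_init follows logic very similar to fdt_reset_init, and the driver it ends up selecting is fdt_irqchip_plic. On cold boot, control then enters irqchip_plic_cold_init.

irqchip_plic_cold_init initializes two important data structures, plic_hartid2data and plic_hartid2context, which map a hartid to the corresponding plic_data and plic_context: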

struct plic_data {
	unsigned long addr;  // physical address of the PLIC
	unsigned long num_src; // number of interrupt sources the PLIC supports
};
static struct plic_data *plic_hartid2data[SBI_HARTMASK_MAX_BITS];
static int plic_hartid2context[SBI_HARTMASK_MAX_BITS][2];

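Usually there is only one PLIC, so every hart points to the same plic_data. A context determines the addresses at which parameters such as priority and threshold are configured; see the PLIC spec for details.
Be careful to distinguish hart from context: each privilege mode of a hart has its own context, which is why the second dimension of plic_hartid2context has length 2, for the hart's M context and S context.

Every hart runs fdt_irqchip_warm_init(), which sets the enable bits of its contexts to 0 and sets a threshold value. At this point, therefore, interrupt requests from the PLIC are disabled.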
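How a context number turns into register addresses can be sketched from the PLIC memory map: enable bits start at offset 0x2000 with a stride of 0x80 per context, and the threshold register of each context sits at offset 0x200000 with a stride of 0x1000. The mapping of hart h to contexts 2h (M) and 2h+1 (S) is an assumption that holds for a layout like qemu-virt's; in general it has to be read from the device tree.

```c
#include <stdint.h>

/* Offsets from the PLIC memory map. */
#define PLIC_ENABLE_BASE    0x2000u
#define PLIC_ENABLE_STRIDE  0x80u
#define PLIC_CONTEXT_BASE   0x200000u
#define PLIC_CONTEXT_STRIDE 0x1000u

/* Address of the first enable word of a context. */
static uint64_t plic_enable_addr(uint64_t plic_base, unsigned context)
{
	return plic_base + PLIC_ENABLE_BASE +
	       (uint64_t)PLIC_ENABLE_STRIDE * context;
}

/* Address of the priority threshold register of a context. */
static uint64_t plic_threshold_addr(uint64_t plic_base, unsigned context)
{
	return plic_base + PLIC_CONTEXT_BASE +
	       (uint64_t)PLIC_CONTEXT_STRIDE * context;
}

/* Assumed hart-to-context mapping: M context 2h, S context 2h + 1. */
static unsigned m_context(unsigned hartid) { return 2 * hartid; }
static unsigned s_context(unsigned hartid) { return 2 * hartid + 1; }
```

Clearing every enable word of a context and programming its threshold, as the warm init does, keeps that context from receiving any external interrupt until a later stage enables sources explicitly.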

10. Conclusion

OK, that's it for the second part of the analysis.