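1. The TLP’s size limits are set at the peripheral’s configuration stage, but typical numbers are a maximum of 128, 256 or 512 bytes per TLP. Note that the PCIe TLP header expresses lengths in DWs, so every byte count must be converted to double words (32 bits): take bits [MSB:2] of the byte count, then check whether bits [1:0] equal 2'd0; if they do not, use [MSB:2]+1.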
2. When using DMA to read or write a large amount of data, split the transfer into several TLPs according to the TLP’s size limits.
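The two rules above can be sketched in a few lines of Python. This is only an illustration, not taken from any driver or IP core: the function names `bytes_to_dw` and `split_transfer` and the default `max_payload` of 128 bytes are assumptions made for the example. It enforces only the size limit; the 4K boundary rule in section 5 is a separate constraint.

```python
def bytes_to_dw(byte_count: int) -> int:
    """Round a byte count up to whole DWs (32-bit words)."""
    dw = byte_count >> 2      # bits [MSB:2] of the byte count
    if byte_count & 0x3:      # bits [1:0] are not 2'd0
        dw += 1               # a partial DW still occupies a full DW
    return dw


def split_transfer(start: int, byte_count: int, max_payload: int = 128):
    """Yield (address, length_in_bytes) pieces no larger than max_payload.

    max_payload is an assumed value; the real limit is whatever was
    negotiated at configuration time.
    """
    addr, remaining = start, byte_count
    while remaining > 0:
        chunk = min(remaining, max_payload)
        yield addr, chunk
        addr += chunk
        remaining -= chunk


if __name__ == "__main__":
    for addr, length in split_transfer(0x1000, 300):
        print(hex(addr), length, "bytes =", bytes_to_dw(length), "DW")
```

Each piece then becomes one TLP, with `bytes_to_dw(length)` going into the header's length field.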
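3. When a TLP read or write fails, check the following:
1) Signals are not initialized correctly. For example, if a FIFO’s empty signal is left unconnected, ISE ties empty to 0 by default. Since the signals connected to the PCIe IP core are controlled by empty, the interface never reaches a valid initial state.
2) On a write, the length in the header does not match the length of the data actually written. Capture the signals with ChipScope to check this.
3) If a read or write did not succeed, check whether the frames before it were correct. For write frames, you can use kdb to check whether the data was actually written.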
4) Make sure the payload matches the length field in the TLP. Make sure trn_trem_n is correct.
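As a rough illustration of the header/payload check in items 2) and 4), here is a small Python sketch that could be run over frames dumped from a capture. It relies on the Length field occupying bits [9:0] of the first header DW, with the value 0 encoding 1024 DW. The helper names `header_length_dw` and `payload_matches_header` are made up for this example, and the payload is assumed to be available as a list of DWs.

```python
def header_length_dw(dw0: int) -> int:
    """Extract the Length field (bits [9:0]) from the first header DW."""
    length = dw0 & 0x3FF
    # A Length field of 0 encodes the maximum, 1024 DW.
    return 1024 if length == 0 else length


def payload_matches_header(dw0: int, payload_dws: list) -> bool:
    """Return True if the number of payload DWs equals the header length."""
    return header_length_dw(dw0) == len(payload_dws)


if __name__ == "__main__":
    # 0x40000004: 3DW header with data (memory write), Length = 4 DW
    print(payload_matches_header(0x40000004, [0, 1, 2, 3]))  # matching frame
    print(payload_matches_header(0x40000004, [0, 1, 2]))     # one DW short
```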
4. Tips
- Posted writes and MSIs arrive in the order they were sent. Now, all memory writes are posted, and MSIs are in fact (posted) memory writes. So we know for sure that memory writes are executed in order, and that if we issued an MSI after filling a buffer (writes…) it will arrive after the buffer was actually written to.
- A read request will never arrive before a write request or MSI sent before it. As a matter of fact, performing a Read Request is a safe way to wait for a write to complete.
- Write requests may very well come before read requests sent before them. This mechanism prevents deadlock in certain exotic scenarios. Don’t write to a certain memory area while waiting for the read completion to come in.
- Read completions for a certain request (i.e. with the same Tag and Requester ID) arrive in the order they were sent (so they arrive in order with rising addresses). Read completions of different requests may be reordered (but who cares).
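5. 4K boundary
1) Requests must not specify an Address/Length combination which causes a Memory Space access to cross a 4-KB boundary. In other words, when PCIe is used with DMA, the offset of the last byte, base address [11:0] + len - 1 (len in bytes), must not exceed 12'hfff, otherwise reads go wrong (in actual tests, when an access crossed the boundary, the completion sometimes arrived and sometimes did not). For example, a base address of xxxxxxffc with a length of 20 DW will not work. Either split the access into two reads, or allocate the buffer so that address plus length stays within one 4K range.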
2) Why the limit exists is something one should ask those who wrote the standards. Anyhow, a lot of things in computer hardware are bound to 4 kB. Maybe it's because DDR memory rows are 4 kB in size, so crossing such a boundary would force the memory controller to run two row fetch operations.
3) If a bus request starts at address START and has length LENGTH in bytes (LENGTH=1 is one byte), then we require, for 32-bit addressing, that (START & 0xfffff000) == ((START + LENGTH - 1) & 0xfffff000).
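4) Sometimes a read stays within 4K, but because of the RCB (read completion boundary) limit a single read still comes back as several completions. Pay special attention to this if you are using a Loongson CPU.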
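The boundary rules above can be sketched in Python as follows. The function names are invented for this illustration. `max_completions` is only an upper bound under the assumption that the completer splits at every RCB-aligned address, and the default RCB of 64 bytes is an assumed value (128 bytes is the other common setting).

```python
BOUNDARY = 0x1000  # 4 KB


def crosses_4k(start: int, length: int) -> bool:
    """True if [start, start + length - 1] spans more than one 4 KB page."""
    return (start & ~0xFFF) != ((start + length - 1) & ~0xFFF)


def split_at_4k(start: int, length: int) -> list:
    """Split a request into (address, length) pieces that never cross 4 KB."""
    pieces = []
    addr, remaining = start, length
    while remaining > 0:
        room = BOUNDARY - (addr & 0xFFF)   # bytes left in the current page
        chunk = min(remaining, room)
        pieces.append((addr, chunk))
        addr += chunk
        remaining -= chunk
    return pieces


def max_completions(start: int, length: int, rcb: int = 64) -> int:
    """Upper bound on completions: one per RCB-aligned block touched."""
    return (start + length - 1) // rcb - start // rcb + 1


if __name__ == "__main__":
    # The example from item 1): address ending in ffc, 20 DW = 80 bytes.
    start, length = 0x12345FFC, 20 * 4
    print(crosses_4k(start, length))
    print([(hex(a), n) for a, n in split_at_4k(start, length)])
```

For the example address the request splits into a 4-byte piece up to the page end and a 76-byte piece starting on the next page, which matches the "split it into two reads" advice.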
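6. Requester ID: pay close attention to this one. As we all know, PCIe is switch-based. When a completion comes back, the data travels up through the physical layer to the link layer, but if the Requester ID does not match, the link layer drops the frame. You will see the data frames being written into the block RAM while the address never changes, i.e. there are writes but no reads. ChipScope shows this very clearly.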