As one of the conclusion of the last log, in order to use the LPDDR memory in 32-bit mode on the Pano Logic G1, a cache is almost a must. Sure I can just use 16-bit mode, half the capacity (16 MB) isn't really an issue for me... But I still decided to just implement a cache, it shouldn't be that hard.
So, as a result, I have got cache working on Pano Logic G1. It is a 8-KBytes 2-way set-associative cache. Replacement policy is LRU and write policy is write back. (The whole point of having a cache is because write through is almost impossible given the data mask cannot be used.) It is connected between the PicoRV32 CPU and the MIG memory controller, so all read and writes to the LPDDR is cached. I won't go into details about the cache since I feel like there is nothing special worth talking about except being slow and inefficient. I will add an bus master arbiter between the PicoRV32 and the cache in the future, so the GameBoy CPU could access the LPDDR as well. Though one need to keep in mind this is only a 2-way set-associative cache, having multiple masters would lead to very questionable performance.
So, what about the performance? Currently:
- Read hit: 2 cycles
- Write hit: 2 cycles
- Read miss: 4 cycles + memory read latency
- Write miss: 5 cycles + memory read latency
- Read miss + flush: 12 cycles + memory write latency + memory read latency
- Write miss + flush: 13 cycles + memory write latency + memory read latency
So you can see.. The cost of missing is high, and the cost of flush is very high.
Also, due to my bad coding and the limitation of Spartan-3E's block RAM (it does not support byte enable, which is important for a cache that allows byte enable), compared to 16-bit non-cached version, the whole design uses 1500 more LUTs. I assume mostly comes from the cache, and some from the 32-bit memory controller.
But anyway... It Works™.