Paper Title
Early Estimation Of Cache Properties For Multicore Embedded Processors

State-of-the-art embedded systems are expected to use multicore processors because multicore architectures provide a high performance-to-power ratio. Although caches improve overall performance, designing multicore embedded processors with multilevel caches is a great challenge: caches make thermal constraints more critical, parallel thread execution more difficult, and timing unpredictability worse. An effective early estimation technique can therefore be very valuable when designing complex systems such as multicore embedded systems. In this paper, we propose a simulation methodology to determine the impact of caches on performance, power consumption, and predictability, in order to facilitate the design of future embedded multicore systems. Using this method, effective cache parameters (such as cache size), organizations (such as a shared level-2 cache, CL2), and techniques (such as cache locking) can be determined in advance for the target applications. We model a quad-core embedded system with two levels of cache, where CL2 is shared. Varying the total CL2 size and the locked CL2 size, we run the simulation using popular FFT, GIF, JPEG, MPEG-3, and MPEG-4 workloads. Simulation results indicate that, for the selected applications in this experiment, the optimal CL2 size is 128 KB with respect to average memory access time (i.e., delay) per task and total power consumption. Experimental results also indicate that locking up to 25% of CL2 is helpful for the simulated system. Both mean delay per task and total power consumption decrease when cache size is increased and/or 25% cache locking is applied; however, the impact of the shared CL2 on power consumption is more significant than its impact on mean delay.

Index Terms: Average Memory Access Time (Delay), Cache Memory, Embedded Systems, Multicore Processors, Power Consumption.
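For readers unfamiliar with the delay metric, the following minimal sketch shows the textbook formula for average memory access time in a two-level cache hierarchy. It is not the simulation program used in this work. The function name, parameter names, and the numeric values in the usage note are hypothetical and chosen only for illustration; the CL2 miss rate is assumed to be the local miss rate (misses per CL2 access).

```python
def amat_two_level(cl1_hit_time, cl1_miss_rate,
                   cl2_hit_time, cl2_miss_rate, memory_penalty):
    """Textbook average memory access time for a CL1 + CL2 hierarchy.

    All times are in the same unit (e.g., cycles). cl2_miss_rate is
    assumed to be the local CL2 miss rate. Names are hypothetical.
    """
    for rate in (cl1_miss_rate, cl2_miss_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("miss rates must lie in [0, 1]")
    # Cost of a CL1 miss: access CL2, and on a CL2 miss go to main memory.
    cl1_miss_penalty = cl2_hit_time + cl2_miss_rate * memory_penalty
    return cl1_hit_time + cl1_miss_rate * cl1_miss_penalty
```

With illustrative (not measured) inputs, `amat_two_level(1, 0.25, 8, 0.5, 64)` evaluates to `11.0` cycles; lowering the CL2 miss rate, for example through a larger CL2, reduces the result accordingly.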