Computers have evolved from big, bulky and slow machines into fast, smart and miniature ones. Our everyday work and life have become heavily dependent on computers, and one of the things that makes their speed possible is the CPU cache. CPU caches have brought a dynamic change to the world of computing and taken technological advancement to the next level. The main purpose of a CPU cache is to speed up the execution of code: data from main memory that is used very frequently is copied into cache memory, so the processor does not have to search the much larger main memory every time it needs that data. Initially cache memory had just one level; this later grew to 2, 3 and even 4 levels in large processors. L1 is the primary cache and is typically split into two sub-levels, L1i for instructions and L1d for data. The higher levels, i.e. L2, L3 and so on, generally do not have sub-levels.
The above graph clearly shows how a processor with a cache takes less time and performs better than one without a cache.
Why are multiple cache levels like L1 and L2 required?
The main reason for having more than one level of cache memory is to reduce the cost of cache misses and increase the hit rate, which in turn reduces latency. The hit rate is the fraction of memory accesses the cache can serve directly: the higher the hit rate, the better the performance and the faster the effective speed of the processor. The processor first tries to fetch data from L1; if the data is not in L1, it searches L2, followed by L3, L4 and so on. L1 is the smallest and hence the fastest level, L2 is slightly larger and slower than L1, and L3 is the largest of the three and therefore has the highest latency. Most processors have only three levels of cache; higher levels are used in special cases. What gives cache memory an edge over main memory is that cache is built from SRAM (static RAM), whereas main memory is DRAM (dynamic RAM). The drawback of DRAM is that it must periodically refresh its contents, which is a significant source of its latency.
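The lookup order described above (L1, then L2, then L3, then main memory) can be sketched as a simple function. The per-level latencies and the fill-on-miss policy here are illustrative assumptions, not the behaviour of any particular processor:

```python
# Sketch of a multi-level cache lookup: try L1 first, then L2, then L3,
# and fall back to main memory on a miss in every level.
# Each "level" is modelled as a dict mapping address -> value.
# Latencies (4, 12, 40, 200 cycles) are hypothetical.
def lookup(address, l1, l2, l3, memory):
    for name, level, cycles in (("L1", l1, 4), ("L2", l2, 12), ("L3", l3, 40)):
        if address in level:
            return level[address], name, cycles
    # Missed in every cache level: fetch from memory and fill the caches
    # so the next access to this address hits in L1.
    value = memory[address]
    l1[address] = l2[address] = l3[address] = value
    return value, "memory", 200
```

A first access to an address pays the full memory latency, but because the miss fills all three levels, a repeated access is served from L1, which is exactly why frequently used data becomes cheap to reach.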
How cache size and associativity affect miss rates
The above graph shows how cache size influences miss rates. The miss rates for small cache sizes like 1K to 4K are very high compared to larger caches. Small caches come with a greater chance of conflicts, which adds latency. Another factor that affects the miss rate is block size. Data in a cache is stored in blocks: the cache is divided into a number of blocks, and the way data is mapped onto those blocks also affects performance and latency. There are three ways data can be mapped into the cache: direct-mapped, set-associative and fully associative.
Direct-mapped – every block has exactly one place it can go in the cache.
Set-associative – the cache is divided into small sets, and a block can be placed in any slot of its set. For instance, in a 2-way set-associative cache each set can hold 2 blocks; similarly for 4-way and 8-way.
Fully associative – any block can be placed anywhere in the cache; there is no fixed position, so every block must be searched when fetching a particular piece of data.
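The three placement policies above differ only in how many sets the cache is divided into, which can be shown with the usual set-index calculation. The geometry here (64-byte blocks, 64 blocks total) is an illustrative assumption:

```python
# Which set a block lands in under each placement policy.
# A cache with num_blocks blocks and `ways` blocks per set has
# num_blocks // ways sets; the set index is the block number mod the
# number of sets. Sizes below are hypothetical examples.
def set_index(address, num_blocks, ways, block_size=64):
    """ways=1 -> direct-mapped; ways=num_blocks -> fully associative."""
    num_sets = num_blocks // ways
    block_number = address // block_size
    return block_number % num_sets

addr = 0x1040                       # block number 65 with 64-byte blocks
direct  = set_index(addr, 64, 1)    # direct-mapped: 64 sets -> set 1
two_way = set_index(addr, 64, 2)    # 2-way: 32 sets -> set 1
full    = set_index(addr, 64, 64)   # fully associative: 1 set -> set 0
```

Direct-mapped is the degenerate case of one block per set (a block has exactly one home), while fully associative is the other extreme of one set holding every block, which is why a fully associative lookup must compare against every block.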
The above graph also depicts how associativity affects miss rates: miss rates are highest for direct-mapped caches and lowest for fully associative ones.
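The reason direct-mapped caches miss more often is conflict misses: two hot blocks that map to the same set evict each other repeatedly. A toy LRU cache simulation makes this concrete; the geometry (8 blocks, 64-byte lines) and the access trace are hypothetical:

```python
# Count misses for an address trace on a small LRU cache.
# ways=1 models a direct-mapped cache; larger `ways` adds associativity.
# Cache geometry is an illustrative assumption.
def miss_count(addresses, num_blocks=8, ways=1, block_size=64):
    num_sets = num_blocks // ways
    sets = [[] for _ in range(num_sets)]  # each set: tags in LRU order
    misses = 0
    for addr in addresses:
        block = addr // block_size
        s = sets[block % num_sets]
        if block in s:
            s.remove(block)        # hit: refresh to most-recently-used
        else:
            misses += 1
            if len(s) == ways:     # set full: evict least-recently-used
                s.pop(0)
        s.append(block)
    return misses

# Two blocks that collide in the same set, accessed alternately.
trace = [0x000, 0x200, 0x000, 0x200, 0x000, 0x200]
direct_misses = miss_count(trace, ways=1)  # thrashing: every access misses
two_way_misses = miss_count(trace, ways=2) # only the two cold misses remain
```

With one way, the two blocks keep evicting each other and all six accesses miss; with two ways they coexist in the same set and only the initial cold misses remain, matching the trend in the graph.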
Researchers continue to work toward even faster processors, aiming for near-zero miss rates with smaller chip sizes. Companies such as Intel and AMD keep improving processor performance, and at present Intel has made its mark and sits at the top.