A Common Framework for Memory Hierarchy

Zhao Cong

Question 1: Where Can a Block Be Placed?

  1. Block placement schemes: direct mapped, set associative, and fully associative.
     Scheme name        Number of sets                                 Blocks per set
     Direct mapped      Number of blocks in the cache                  1
     Set associative    Number of blocks in the cache / Associativity  Associativity (typically 2–16)
     Fully associative  1                                              Number of blocks in the cache
  2. The advantage of increasing the degree of associativity is that it usually decreases the miss rate. The improvement in miss rate comes from reducing misses that compete for the same location.
  3. As cache sizes grow, the relative improvement from associativity increases only slightly.
  4. The potential disadvantages of associativity are increased cost and slower access time.
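
The relationship between associativity, number of sets, and candidate locations can be sketched in a few lines of Python. This is a minimal illustration, not a model of any particular machine: the function names, the 8-block cache, and block address 12 are assumptions chosen for the example, and the set index is taken to be the block address modulo the number of sets.

```python
def num_sets(num_blocks: int, associativity: int) -> int:
    """Number of sets = blocks in the cache / associativity."""
    if num_blocks % associativity != 0:
        raise ValueError("associativity must divide the number of blocks")
    return num_blocks // associativity


def candidate_slots(block_address: int, num_blocks: int, associativity: int) -> list:
    """Return the cache slots (0..num_blocks-1) where a block may be placed."""
    sets = num_sets(num_blocks, associativity)
    set_index = block_address % sets          # which set the block maps to
    first = set_index * associativity         # first slot of that set
    return list(range(first, first + associativity))


# Hypothetical 8-block cache and memory block address 12.
print(candidate_slots(12, 8, 1))  # direct mapped: 8 sets, one slot -> [4]
print(candidate_slots(12, 8, 2))  # 2-way set associative: 4 sets -> [0, 1]
print(candidate_slots(12, 8, 8))  # fully associative: 1 set -> [0, 1, 2, 3, 4, 5, 6, 7]
```

Direct mapped and fully associative fall out as the two extremes of the same formula: associativity 1 and associativity equal to the number of blocks.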

Question 2: How Is a Block Found?

  1. The choice of how we locate a block depends on the block placement scheme.
     Associativity      Location method                       Comparisons required
     Direct mapped      Index                                 1
     Set associative    Index the set, search among elements  Degree of associativity
     Fully associative  Search all cache entries              Size of the cache
     Fully associative  Separate lookup table                 0
  2. Fully associative caches are prohibitively expensive except for small sizes.
  3. Virtual memory systems almost always use fully associative placement.
    • Full associativity is beneficial, since misses are very expensive.
    • Full associativity allows software to use sophisticated replacement schemes that are designed to reduce the miss rate.
    • The full map can be easily indexed with no extra hardware and no searching required.
  4. Set-associative placement is often used for caches and TLBs.
  5. A few systems have used direct-mapped caches because of their advantage in access time and simplicity. The advantage in access time occurs because finding the requested block does not depend on a comparison.
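
The lookup methods above can be sketched as follows. This is a simplified software model under stated assumptions: the helper names `split_address` and `cache_lookup`, the 8-block 2-way cache, and the page-table contents are all hypothetical, and the comparison count reflects that hardware checks every way of the selected set in parallel.

```python
def split_address(block_address, num_blocks, associativity):
    """Split a block address into (set index, tag)."""
    num_sets = num_blocks // associativity
    return block_address % num_sets, block_address // num_sets


def cache_lookup(sets, block_address, num_blocks, associativity):
    """Return (hit, tag_comparisons) for a cache stored as a list of sets.

    Each set is a list of `associativity` entries holding a tag or None.
    Every way of the selected set is compared, so the number of
    comparisons equals the associativity.
    """
    index, tag = split_address(block_address, num_blocks, associativity)
    ways = sets[index]
    hit = any(stored == tag for stored in ways)
    return hit, len(ways)


# Illustrative 8-block, 2-way cache (4 sets) that already holds block 12.
NUM_BLOCKS, WAYS = 8, 2
sets = [[None] * WAYS for _ in range(NUM_BLOCKS // WAYS)]
index, tag = split_address(12, NUM_BLOCKS, WAYS)   # index 0, tag 3
sets[index][0] = tag

print(cache_lookup(sets, 12, NUM_BLOCKS, WAYS))    # (True, 2)
print(cache_lookup(sets, 4, NUM_BLOCKS, WAYS))     # (False, 2): same set, different tag

# Page-table style lookup: the virtual page number indexes the table
# directly, so no tag comparison is needed.
page_table = [7, 3, 9, 1]        # hypothetical physical page numbers
print(page_table[2])             # 9
```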

Question 3: Which Block Should Be Replaced on a Cache Miss?

  1. In a fully associative cache, all blocks are candidates for replacement. If the cache is set associative, we must choose among the blocks in the set. Replacement is easy in a direct-mapped cache because there is only one candidate.
  2. Random: Candidate blocks are randomly selected, possibly using some hardware assistance. For example, MIPS supports random replacement for TLB misses.
  3. Least recently used (LRU): The block replaced is the one that has been unused for the longest time.
  4. In practice, LRU is too costly to implement for hierarchies with more than a small degree of associativity (two to four, typically), since tracking the usage information is costly.
  5. In caches, the replacement algorithm is in hardware, which means that the scheme should be easy to implement. Random replacement is simple to build in hardware. In fact, random replacement can sometimes be better than the simple LRU approximations that are easily implemented in hardware.
  6. In virtual memory, some form of LRU is always approximated (for example, with reference bits), since even a tiny reduction in the miss rate can be important when the cost of a miss is enormous.
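
To make the two policies concrete, here is a small sketch that counts misses for a single set under exact LRU and under random replacement. It is an illustrative software model only: the function names, the traces, and the seed are assumptions, and real hardware would approximate LRU rather than track it exactly.

```python
import random
from collections import OrderedDict


def simulate_lru(trace, ways):
    """Count misses in one set of `ways` blocks under exact LRU."""
    resident = OrderedDict()   # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in resident:
            resident.move_to_end(block)        # mark as most recently used
        else:
            misses += 1
            if len(resident) == ways:
                resident.popitem(last=False)   # evict the least recently used
            resident[block] = True
    return misses


def simulate_random(trace, ways, seed=0):
    """Count misses in one set of `ways` blocks under random replacement."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    resident = []
    misses = 0
    for block in trace:
        if block not in resident:
            misses += 1
            if len(resident) == ways:
                resident[rng.randrange(ways)] = block   # overwrite a random way
            else:
                resident.append(block)
    return misses


# A trace with reuse: LRU keeps the hot blocks 0 and 1 resident.
print(simulate_lru([0, 1, 2, 0, 1, 3, 0, 1, 4], ways=3))   # 5

# A cyclic trace one block larger than the set: LRU misses on every access.
cyclic = [0, 1, 2, 3, 4] * 4
print(simulate_lru(cyclic, ways=4))                        # 20
print(simulate_random(cyclic, ways=4))                     # depends on the seed, at most 20
```

The cyclic trace is a worst case for LRU, which always evicts the block that is needed next. Random replacement cannot do worse than missing on every access here and may do better.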

Question 4: What Happens on a Write?

  1. There are two basic options:
    • Write-through: The information is written to both the block in the cache and the block in the lower level of the memory hierarchy (main memory for a cache).
    • Write-back: The information is written only to the block in the cache. The modified block is written to the lower level of the hierarchy only when it is replaced. Virtual memory systems always use write-back.
  2. The key advantages of write-back are the following:
    • Individual words can be written by the processor at the rate that the cache, rather than the memory, can accept them.
    • Multiple writes within a block require only one write to the lower level in the hierarchy.
    • When blocks are written back, the system can make effective use of a high-bandwidth transfer, since the entire block is written.
  3. Write-through has these advantages:
    • Misses are simpler and cheaper because they never require a block to be written back to the lower level.
    • Write-through is easier to implement than write-back, although to be practical, a write-through cache will still need to use a write buffer.
  4. In virtual memory systems, only a write-back policy is practical because of the long latency of a write to the lower level of the hierarchy.
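
The difference in traffic to the lower level can be sketched with a toy model. The assumptions are mine, not the source's: a direct-mapped, write-allocate cache, a trace that contains only stores (so every resident block is dirty), and a final flush counted for write-back. The function name and the trace are hypothetical.

```python
def count_lower_level_writes(store_trace, num_blocks, policy):
    """Count writes that reach the lower level for a trace of stores.

    `store_trace` lists the block address touched by each store.
    Write-through counts one (word) write per store; write-back counts
    one (block) write per dirty block evicted or flushed at the end.
    """
    if policy == "write-through":
        return len(store_trace)
    if policy != "write-back":
        raise ValueError("policy must be 'write-through' or 'write-back'")

    resident = {}   # cache index -> block address held there (always dirty)
    writes = 0
    for block in store_trace:
        index = block % num_blocks
        if index in resident and resident[index] != block:
            writes += 1              # a dirty block is replaced: write it back
        resident[index] = block
    return writes + len(resident)    # flush the remaining dirty blocks


# Seven stores that touch only three distinct blocks (illustrative trace).
trace = [0, 0, 0, 1, 1, 4, 4]
print(count_lower_level_writes(trace, 4, "write-through"))  # 7
print(count_lower_level_writes(trace, 4, "write-back"))     # 3
```

The counts are not in the same units: each write-through write moves a word, while each write-back write moves a whole block, which is where the high-bandwidth block transfer pays off.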

The Three Cs: An Intuitive Model for Understanding the Behavior of Memory Hierarchies

  1. Compulsory misses: These are cache misses caused by the first access to a block that has never been in the cache. These are also called cold-start misses.
  2. Capacity misses: These are cache misses caused when the cache cannot contain all the blocks needed during execution of a program. Capacity misses occur when blocks are replaced and then later retrieved.
  3. Conflict misses: These are cache misses that occur in set-associative or direct-mapped caches when multiple blocks compete for the same set. They are the misses that would be eliminated in a fully associative cache of the same size. These cache misses are also called collision misses.
  4. Increasing associativity reduces conflict misses. Associativity, however, may slow access time, leading to lower overall performance.
  5. Capacity misses can easily be reduced by enlarging the cache. Of course, when we make the cache larger, we must also be careful about increasing the access time, which could lead to lower overall performance. Thus, first-level caches have been growing slowly, if at all.
  6. The main way to reduce the number of compulsory misses is to increase the block size. This will reduce the number of references required to touch each block of the program once, because the program will consist of fewer cache blocks. However, increasing the block size too much can have a negative effect on performance because of the increase in the miss penalty.
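
One common way to apply the three Cs is to run a trace through the real cache and, in parallel, through a fully associative cache of the same size. The sketch below follows that idea under stated assumptions: LRU replacement in both caches, a set index equal to the block address modulo the number of sets, and a hypothetical function name and traces.

```python
from collections import OrderedDict


def classify_misses(trace, num_blocks, associativity):
    """Classify each miss of a set-associative LRU cache as one of the three Cs."""
    num_sets = num_blocks // associativity
    sets = [OrderedDict() for _ in range(num_sets)]   # the cache being studied
    full = OrderedDict()      # fully associative LRU cache of the same size
    seen = set()              # blocks referenced at least once
    counts = {"compulsory": 0, "capacity": 0, "conflict": 0}

    for block in trace:
        # Reference model: would a fully associative cache have hit?
        full_hit = block in full
        if full_hit:
            full.move_to_end(block)
        else:
            if len(full) == num_blocks:
                full.popitem(last=False)
            full[block] = True

        # Cache under study.
        ways = sets[block % num_sets]
        if block in ways:
            ways.move_to_end(block)
        else:
            if block not in seen:
                counts["compulsory"] += 1    # first access to this block
            elif not full_hit:
                counts["capacity"] += 1      # misses even with full associativity
            else:
                counts["conflict"] += 1      # only the set mapping caused the miss
            if len(ways) == associativity:
                ways.popitem(last=False)
            ways[block] = True
        seen.add(block)
    return counts


# Blocks 0 and 4 collide in a 4-block direct-mapped cache.
print(classify_misses([0, 4, 0, 4], 4, 1))   # {'compulsory': 2, 'capacity': 0, 'conflict': 2}
# Two-way associativity removes the conflict misses.
print(classify_misses([0, 4, 0, 4], 4, 2))   # {'compulsory': 2, 'capacity': 0, 'conflict': 0}
# Five blocks do not fit in four: the re-reference to block 0 is a capacity miss.
print(classify_misses([0, 1, 2, 3, 4, 0], 4, 4))   # {'compulsory': 5, 'capacity': 1, 'conflict': 0}
```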

Additional Information

  1. In real cache designs, many of the design choices interact, and changing one cache characteristic will often affect several components of the miss rate.
  2. The challenge in designing memory hierarchies is that every change that potentially improves the miss rate can also negatively affect overall performance.
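
A standard way to quantify this trade-off is the average memory access time, AMAT = hit time + miss rate × miss penalty. The sketch below uses made-up numbers purely for illustration: the two designs, their hit times, miss counts, and the 100-cycle penalty are all hypothetical.

```python
def amat(hit_time_cycles, misses_per_1000, miss_penalty_cycles):
    """Average memory access time in cycles.

    The miss rate is given as misses per 1000 accesses to keep the
    arithmetic exact for the example values below.
    """
    return hit_time_cycles + misses_per_1000 * miss_penalty_cycles / 1000


# Hypothetical design A: fast hit, higher miss rate.
print(amat(hit_time_cycles=1, misses_per_1000=50, miss_penalty_cycles=100))   # 6.0
# Hypothetical design B: lower miss rate, but a slower hit.
print(amat(hit_time_cycles=2, misses_per_1000=45, miss_penalty_cycles=100))   # 6.5
```

With these invented numbers, design B has the better miss rate yet the worse average access time, because the gain in misses does not cover the extra hit latency.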

Summary

  1. Question 1: Where can a block be placed? Answer: One place (direct mapped), a few places (set associative), or any place (fully associative).
  2. Question 2: How is a block found? Answer: There are four methods: indexing (as in a direct-mapped cache), limited search (as in a set-associative cache), full search (as in a fully associative cache), and a separate lookup table (as in a page table, where no comparisons are required).
  3. Question 3: What block is replaced on a miss? Answer: Typically, either the least recently used or a random block.
  4. Question 4: How are writes handled? Answer: Each level in the hierarchy can use either write-through or write-back.