what happens on a write

reads are more frequent than writes.

instruction fetches are reads

most instructions do not write to memory, whichever write policy (writeback or writethrough) the cache uses.

make the common case fast: optimize caches for reads.

easy to make cache fast for read:

the block can be read at the same time as the tag is compared (so the block read starts as soon as the block address is available).
if the read is a hit, the requested part of the block is sent to the CPU immediately.
if it is a miss, there is no benefit but no harm either: the value read is simply ignored.
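
a minimal sketch of the read path described above, assuming a hypothetical direct-mapped cache (the sizes, the dict layout and the function names are illustrative, not from these notes). in hardware the block read and the tag compare run in parallel; here they are simply two consecutive statements.

```python
BLOCK_SIZE = 8   # bytes per block (illustrative)
NUM_LINES = 4    # lines in the cache (illustrative)

def split_address(addr):
    """Split a byte address into (tag, index, offset)."""
    offset = addr % BLOCK_SIZE
    block_addr = addr // BLOCK_SIZE
    return block_addr // NUM_LINES, block_addr % NUM_LINES, offset

def read(cache, addr):
    """Return (hit, byte); the byte is None on a miss."""
    tag, index, offset = split_address(addr)
    line = cache[index]
    data = line["data"][offset]                 # block read starts as soon as the address is known
    hit = line["valid"] and line["tag"] == tag  # tag compare (parallel in hardware, sequential here)
    return (True, data) if hit else (False, None)  # miss: the speculatively read value is ignored

cache = [{"valid": False, "tag": 0, "data": bytes(BLOCK_SIZE)} for _ in range(NUM_LINES)]
cache[1] = {"valid": True, "tag": 1, "data": bytes(range(8))}

print(read(cache, 42))   # address 42 -> tag 1, index 1, offset 2: hit
print(read(cache, 74))   # address 74 -> tag 2, index 1, offset 2: miss
```

reading the data before the hit is known is safe because a read does not change any state.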

writes are much harder to optimize:

the block cannot be modified until the tag has been checked and the access is known to be a hit.
because the tag check cannot be overlapped with the modification, writes normally take longer than reads.
a write specifies its size (1 byte or a small number of bytes) and only that portion of the block may be modified, whereas a read can fetch more bytes than it needs without harm.
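
a small sketch of these two constraints, assuming a hypothetical cache line stored as a dict with a `bytearray` for the block (names are illustrative): the tag check comes first, and only the bytes named by the write are changed.

```python
def write_bytes(line, tag, offset, payload):
    """Write payload into the block, but only on a hit and only at the given bytes."""
    assert offset + len(payload) <= len(line["data"])  # the write must fit inside the block
    hit = line["valid"] and line["tag"] == tag         # step 1: tag check must finish first
    if not hit:
        return False                                   # miss: the block is left untouched
    line["data"][offset:offset + len(payload)] = payload  # step 2: modify only those bytes
    return True

line = {"valid": True, "tag": 7, "data": bytearray(8)}
print(write_bytes(line, 7, 2, b"\xaa\xbb"), line["data"].hex())  # hit: bytes 2 and 3 change
print(write_bytes(line, 9, 0, b"\xff"), line["data"].hex())      # wrong tag: nothing changes
```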

there are 2 basic designs for writing to cache:

writethrough: information is written to both the block in the cache and the block in the next lower level of memory (e.g. main memory).
writeback: information is written only to the block in the cache; the modified block is written to the next lower level of memory only when it gets replaced.
to reduce the frequency of writes to main memory on block replacement, a "dirty bit" is often used: a status bit indicating whether the block has been modified while in the cache (dirty) or not (clean). if the block is clean, it is not written back on a miss, since the next lower level of memory (main memory) holds the same information as the block in the cache.
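
the two designs and the dirty bit can be sketched as follows. this is a toy model under stated assumptions, not a real cache: a hypothetical one-block cache (`OneLineCache`) sits in front of a dict standing in for main memory, and write allocate is assumed for both policies to keep the code short.

```python
class OneLineCache:
    """Hypothetical single-block cache; self.memory stands in for the next lower level."""

    def __init__(self, policy):
        assert policy in ("writethrough", "writeback")
        self.policy = policy
        self.memory = {}         # block address -> value
        self.block = None        # block address currently cached, or None
        self.value = None
        self.dirty = False       # only ever set under writeback
        self.memory_writes = 0   # write traffic to the lower level

    def _write_memory(self, block, value):
        self.memory[block] = value
        self.memory_writes += 1

    def _replace(self, new_block):
        # writeback: a dirty victim is written to memory, a clean one is just dropped
        if self.policy == "writeback" and self.dirty:
            self._write_memory(self.block, self.value)
        self.block = new_block
        self.value = self.memory.get(new_block, 0)
        self.dirty = False

    def read(self, block):
        if self.block != block:
            self._replace(block)
        return self.value

    def write(self, block, value):
        if self.block != block:
            self._replace(block)              # write allocate (an assumption of this sketch)
        self.value = value
        if self.policy == "writethrough":
            self._write_memory(block, value)  # cache and memory are updated together
        else:
            self.dirty = True                 # memory is updated later, on replacement

def run(policy):
    c = OneLineCache(policy)
    for v in range(5):
        c.write(0, v)   # five writes to the same block
    c.read(1)           # read miss: block 0 is replaced
    c.read(2)           # read miss: block 1 (clean) is replaced
    return c.memory_writes, c.memory.get(0)

print(run("writethrough"))  # (memory writes, final value of block 0 in memory)
print(run("writeback"))
```

in this run writethrough sends all five writes to memory, while writeback sends one (when the dirty block 0 is replaced) and none for the clean block 1; both leave the same final value in memory.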

comparison of writeback and writethrough

with writeback, writes happen at the speed of the cache.

with writeback, multiple writes to the same block need only one write to main memory; this uses less memory bandwidth, which makes writeback attractive in multiprocessors.

with writethrough, a read miss never results in a write to the next lower level (there is no dirty block to write back).
writethrough much easier to implement in hardware.
writethrough ensures next lower level of memory always has most recent copy of data. this is important for i/o and multiprocessors.

difficulties of multiprocessors and i/o

multiprocessors and i/o are difficult to combine with caches: they want the reduced memory bandwidth of writeback, but they also want the writethrough guarantee that the lower level of memory is up to date, so that the data seen by all of the CPUs stays consistent.

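when the CPU must wait for a write to complete (as can happen with writethrough), this is known as a write stall.

this problem can be alleviated by a write buffer: the CPU places the write in the buffer and continues executing while memory is updated, so the two operations overlap.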
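
a rough timing sketch of the effect of a write buffer. everything here is an assumption made for illustration: the function `total_stall`, the idea that memory handles one write at a time with a fixed latency, and the cycle counts. `gaps` lists how many cycles of other work the CPU does before each write.

```python
from collections import deque

def total_stall(gaps, mem_latency, buffer_slots):
    """Cycles the CPU spends stalled on writes; buffer_slots=0 means no write buffer."""
    now = 0             # current CPU cycle
    mem_free_at = 0     # cycle at which memory finishes its current write
    pending = deque()   # completion times of writes still held in the buffer
    stalled = 0
    for gap in gaps:
        now += gap                            # other work, then a write is issued
        while pending and pending[0] <= now:  # retire writes memory has completed
            pending.popleft()
        if buffer_slots == 0:
            done = max(now, mem_free_at) + mem_latency
            stalled += done - now             # no buffer: wait for the write itself
            now = mem_free_at = done
            continue
        if len(pending) == buffer_slots:
            wait_until = pending.popleft()    # buffer full: wait for the oldest entry
            stalled += wait_until - now
            now = wait_until
        mem_free_at = max(now, mem_free_at) + mem_latency
        pending.append(mem_free_at)           # CPU continues immediately
    return stalled

print(total_stall([10, 10, 10, 10], mem_latency=8, buffer_slots=0))  # every write stalls
print(total_stall([10, 10, 10, 10], mem_latency=8, buffer_slots=2))  # writes fully overlapped
print(total_stall([1, 1, 1, 1], mem_latency=8, buffer_slots=2))      # burst fills the buffer
```

the last call shows the limit of the idea: if writes arrive faster than memory can drain them, the buffer fills and the CPU stalls anyway, just less often.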

data is not needed on a write, so we have 2 options on a write miss:

write allocate: the block is loaded into the cache on a write miss, followed by the write hit actions; this is similar to a read miss.
no-write allocate: the block is modified in the lower level of memory and is not loaded into the cache.

writeback generally uses write allocate (hoping that subsequent writes to that block are captured by the cache).

writethrough generally uses no-write allocate (since subsequent writes to that block still have to go to the lower level of memory).
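
the two write miss options can be compared on a short access sequence. the sketch below is a simplification under stated assumptions: the cache is a hypothetical set of block addresses that starts empty and is large enough that nothing is ever replaced, and the function name and the sequence are made up for illustration.

```python
def count_hits_misses(ops, write_allocate):
    """Return (hits, misses) for a list of (kind, block) accesses."""
    cached = set()   # blocks present in the cache; no replacement is modeled
    hits = misses = 0
    for kind, block in ops:
        if block in cached:
            hits += 1
        else:
            misses += 1
            # a read miss always loads the block; a write miss only with write allocate
            if kind == "read" or write_allocate:
                cached.add(block)
    return hits, misses

ops = [("write", 100), ("write", 100), ("read", 200), ("write", 200), ("write", 100)]
print(count_hits_misses(ops, write_allocate=True))    # repeated writes to block 100 hit
print(count_hits_misses(ops, write_allocate=False))   # writes to block 100 keep missing
```

with no-write allocate, block 100 is never brought into the cache because it is only ever written, so every access to it misses; block 200 is loaded by the read and the later write to it hits under either option.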

improving performance of caches