### **<u>1- Basic Concepts:</u>**

In this section, we introduce a number of fundamental concepts that relate to the memory hierarchy of a computer.

#### Memory Hierarchy

A typical memory hierarchy starts with a small, expensive, and relatively fast unit, called the cache, followed by a larger, less expensive, and relatively slow main memory unit. Cache and main memory are built using solid-state semiconductor material. It is customary to call the fast memory level the primary memory.

The solid-state memory is followed by larger, less expensive, and far slower magnetic memories that consist typically of the (hard) disk and the tape. It is customary to call the disk the secondary memory, while the tape is conventionally called the tertiary memory.

The objective behind designing a memory hierarchy is to have a memory system that performs as if it consists entirely of the fastest unit and whose cost is dominated by the cost of the slowest unit. The memory hierarchy can be characterized by a number of parameters.

<u>Access</u>: The term access refers to the action that physically takes place during a *read* or *write* operation.

<u>Capacity</u>: The capacity of a memory level is usually measured in bytes.

<u>Cycle time</u>: The cycle time is defined as the time elapsed from the start of a read operation to the start of a subsequent read.

**Latency:** The latency is defined as the time interval between the request for information and the access to the first bit of that information.

**Bandwidth**: The bandwidth provides a measure of the number of bits per second that can be accessed.

<u>Cost:</u> The cost of a memory level is usually specified in dollars per megabyte.



<u>Random access</u> refers to the fact that any access to any memory location takes the same fixed amount of time regardless of the actual memory location and/or the sequence of accesses that takes place.

#### Example:

If a write operation to <u>memory location 100</u> takes 15 ns and is followed by a read operation to <u>memory location 3000</u>, then the latter operation will also take 15 ns.

The effectiveness of a <u>memory hierarchy</u> depends on the principle of <u>moving</u> <u>information</u> into the fast memory <u>infrequently</u> and <u>accessing</u> it many times before replacing it with new information. This principle is possible due to a phenomenon called **locality of reference**. There exist two forms of locality: **spatial** and **temporal** locality.

<u>Spatial locality</u> refers to the phenomenon that <u>when a given address</u> has been <u>referenced</u>, it is most likely that addresses near it will be referenced within a <u>short period of time</u>, for example, consecutive instructions in a straight-line program.

<u>Temporal locality</u> refers to the phenomenon that <u>once a particular memory item</u> has been <u>referenced</u>, it is most likely that it will be referenced again within a short period of time, for example, an instruction in a program loop.
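The effect of the two forms of locality can be illustrated with a small simulation. The sketch below is not taken from the text: it assumes a hypothetical fast memory that holds a few fixed-size blocks and replaces the least recently used block, and the names `hit_ratio`, `num_blocks`, and `block_size` are illustrative choices only.

```python
from collections import OrderedDict


def hit_ratio(addresses, num_blocks=4, block_size=4):
    """Fraction of accesses found in a small fast memory.

    Assumptions (illustrative, not from the text): the fast memory holds
    `num_blocks` blocks of `block_size` consecutive addresses each, and the
    least recently used block is replaced on a miss.
    """
    cache = OrderedDict()  # block number -> True, ordered by recency of use
    hits = 0
    for addr in addresses:
        block = addr // block_size
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True            # bring the whole block in
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict least recently used block
    return hits / len(addresses)


# Spatial locality: consecutive addresses, as in a straight-line program.
sequential = list(range(64))
# Temporal locality: the same eight addresses repeated, as in a program loop.
loop = list(range(8)) * 10
# No locality: every access falls in a different block.
scattered = [i * 64 for i in range(64)]

print(hit_ratio(sequential))  # 0.75
print(hit_ratio(loop))        # 0.975
print(hit_ratio(scattered))   # 0.0
```

Under these assumptions, the sequential trace misses only on the first address of each block, the loop misses only while its two blocks are first loaded, and the scattered trace never finds its item in the fast memory.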

The <u>sequence of events</u> that takes place when the processor makes a request for an item is as follows.

**<u>First</u>**: the item is sought in the first memory level of the memory hierarchy.

- The probability of <u>finding</u> the requested item in the first level is called the *hit ratio*, $h_1$.
- The probability of <u>not finding</u> (missing) the requested item in the first level of the memory hierarchy is called the *miss ratio*, $(1 - h_1)$.

When the requested item causes a "miss", it is sought in the next memory level.

- The probability of <u>finding</u> the requested item in the second memory level, the hit ratio of the second level, is $h_2$.
- The miss ratio of the second memory level is $(1 - h_2)$.

The process is <u>repeated until the item is found</u>. Upon finding the requested item, it is brought and sent to the processor.
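The sequence of events above can be turned into a rough estimate of the average access time. The sketch below assumes that the levels are searched strictly one after another, that the last level always contains the item, and that each level has a known access time. The access times and hit ratios used here are made-up illustrative values, and the function name `average_access_time` is hypothetical.

```python
def average_access_time(hit_ratios, access_times):
    """Expected access time when levels are searched one after another.

    hit_ratios   -- hit ratio of every level except the last
    access_times -- access time of every level, fastest first

    Assumption: the last level always holds the requested item.
    """
    if len(access_times) != len(hit_ratios) + 1:
        raise ValueError("need one more access time than hit ratios")
    expected = 0.0
    p_reach = 1.0  # probability that the search reaches the current level
    for level, t in enumerate(access_times):
        expected += p_reach * t
        if level < len(hit_ratios):
            p_reach *= 1.0 - hit_ratios[level]  # miss ratio of this level
    return expected


# Illustrative values only: hit ratios 0.9 and 0.8, access times 1, 10, 100 ns.
t_avg = average_access_time([0.9, 0.8], [1, 10, 100])
print(round(t_avg, 3))  # 4.0
```

With these numbers, every request pays 1 ns for the first level, 10% of requests pay a further 10 ns for the second level, and 2% pay a further 100 ns for the third, which gives about 4 ns on average.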

### 2- Main Memory:

The <u>main memory provides the main storage</u> for a <u>computer</u>. <u>Two CPU registers</u> are used to interface the CPU to the main memory. These are the **memory address register** (MAR) and the **memory data register** (MDR).



A typical CPU and main memory interface

It is possible to visualize a typical internal main memory structure as consisting of rows and columns of basic cells. Each cell is capable of storing one bit of information.



In the figure above, cells belonging to a given row can be assumed to form the bits of a given memory word.

Address lines $A_{n-1}, A_{n-2}, \dots, A_1, A_0$ are used as inputs to the address decoder in order to generate the word select lines $W_{2^n-1}, \dots, W_1, W_0$.

A <u>given word</u> select line is common to <u>all memory cells</u> in the <u>same row</u>. At any given time, the <u>address decoder activates</u> only one word select line while <u>deactivating</u> the remaining lines.

A word select line is <u>used to enable all cells in a row</u> for read or write.

Data (bit) lines are used to input or output the contents of cells.

Each memory cell is <u>connected</u> to two data lines. A given data line is common to all cells in a given column.
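The behaviour of the address decoder described above can be sketched in a few lines. This is a functional model only, not a description of the hardware: the function name `decode_address` and the list representation of the word select lines are assumptions made for illustration.

```python
def decode_address(address, n):
    """Model an n-to-2^n address decoder.

    Returns a list of 2**n word select values; index i stands for line W_i.
    Exactly one line is active (1) and all the others are inactive (0).
    """
    if not 0 <= address < 2 ** n:
        raise ValueError("address does not fit in n address lines")
    return [1 if word == address else 0 for word in range(2 ** n)]


# Three address lines give eight word select lines; address 5 activates W_5.
lines = decode_address(5, 3)
print(lines)       # [0, 0, 0, 0, 0, 1, 0, 0]
print(sum(lines))  # 1
```

The single active entry corresponds to the one row of cells that is enabled for a read or a write.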

**Example:** Consider a $1K \times 4$ memory chip. The memory array could be organized as 1K rows, each consisting of four cells. The chip would then need 10 pins for the address and four pins for the data.

However, this may not lead to the best utilization of the chip area.

Another possible organization of the memory cell array is as a $64 \times 64$ array, that is, 64 rows, each consisting of 64 cells. In this case, six address lines (forming what is called the row address) are needed in order to select one of the 64 rows. The remaining four address lines (called the column address) are used to select the appropriate 4 bits among the 64 bits constituting a row, that is, one of the 16 groups of 4 bits.



Efficient internal organization of a $1K \times 4$ memory chip
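The split of the 10-bit address into a row address and a column address can be sketched as follows. The text does not say which address bits form the row address, so this sketch assumes the upper six bits select the row and the lower four bits select the column group. The name `split_address` is hypothetical.

```python
ROW_BITS = 6     # selects one of 64 rows
COLUMN_BITS = 4  # selects one of 16 four-bit groups within a row


def split_address(address):
    """Split a 10-bit address into (row, column group).

    Assumption: the upper six bits are the row address and the lower
    four bits are the column address.
    """
    if not 0 <= address < 2 ** (ROW_BITS + COLUMN_BITS):
        raise ValueError("address must fit in 10 bits")
    row = address >> COLUMN_BITS
    column = address & (2 ** COLUMN_BITS - 1)
    return row, column


row, column = split_address(0b1010110011)
print(row, column)  # 43 3
# Under this assumption, the selected word occupies cells
# 4 * column .. 4 * column + 3 of the chosen row.
print(list(range(4 * column, 4 * column + 4)))  # [12, 13, 14, 15]
```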

**Example:** Consider the design of a 4M-byte main memory subsystem using 1M-bit chips. The number of required chips is 32, since 4M bytes is 32M bits and each chip holds 1M bits. It should be noted that the number of address lines required for the 4M system is 22, while the number of data lines is 8. The figure below shows a block diagram for both the intended memory subsystem and the basic building block to be used to construct such a subsystem.



Block diagram of a required memory system and its basic building block
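The figures quoted in the 4M-byte example can be checked with a short calculation. The sketch below assumes a byte-addressable memory, so there is one address per byte and eight data lines. The function name `memory_subsystem` is illustrative only.

```python
def memory_subsystem(total_bytes, chip_bits):
    """Return (chips needed, address lines, data lines).

    Assumption: the memory is byte-addressable, so the address must
    distinguish every byte and the data path is 8 bits wide.
    """
    total_bits = total_bytes * 8
    chips = -(-total_bits // chip_bits)               # ceiling division
    address_lines = (total_bytes - 1).bit_length()    # bits needed for the highest address
    data_lines = 8
    return chips, address_lines, data_lines


M = 2 ** 20
print(memory_subsystem(4 * M, 1 * M))  # (32, 22, 8)
```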