# EECS150 - Digital Design

Lecture 13 - Project Description, Part 2: Memory Blocks

> Mar 2, 2010 John Wawrzynek

Spring 2010

EECS150 - Lec13-proj2


# **Project Overview**

- A. MIPS150 pipeline structure
- B. Serial Interface
- C. Memories, project memories and FPGAs
- D. Video subsystem
- E. Ethernet Interface
- F. Project specification and grading standard

### **Memory-Block Basics**

#### • Uses:

Whenever a large collection of state elements is required.

- data & program storage
- general purpose registers
- data buffering
- table lookups
- CL (combinational logic) implementation

#### • Basic Types:

- RAM: random access memory
- ROM: read only memory
- EPROM, FLASH: electrically programmable read only memory


# Memory Component Types:

- Volatile:
  - Random Access Memory (RAM):
    - DRAM "dynamic"
    - SRAM "static" (focus today)

- Non-volatile:
  - Read Only Memory (ROM):
    - Mask ROM "mask programmable"
    - EPROM "electrically programmable"
    - EEPROM "electrically erasable programmable"
    - FLASH memory similar to EEPROM with programmer integrated on chip

All these types are available as stand alone chips or as blocks in other chips.



## **Standard Internal Memory Organization**



2-D array of bit cells. Each cell stores one bit of data.

Special circuit tricks are used for the cell array to improve storage density.

- RAM/ROM naming convention:
  - examples:
    - 32 X 8, "32 by 8" => 32 8-bit words
    - 1M X 1, "1 meg by 1" => 1M 1-bit words
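
The naming convention maps directly onto a Verilog array declaration. The following is a minimal sketch, assuming hypothetical module and parameter names (`mem_32x8`, `DEPTH`, `WIDTH`, `ADDR_BITS`) that do not come from the lecture. A 32 X 8 memory has 32 words of 8 bits each, so it needs log2(32) = 5 address bits.

```verilog
// Hypothetical illustration of the "DEPTH x WIDTH" naming convention.
module mem_32x8 #(
    parameter DEPTH     = 32,  // number of words
    parameter WIDTH     = 8,   // bits per word
    parameter ADDR_BITS = 5    // log2(DEPTH)
) (
    input                  clk,
    input                  we,
    input  [ADDR_BITS-1:0] addr,
    input  [WIDTH-1:0]     din,
    output [WIDTH-1:0]     dout
);
    // DEPTH words, each WIDTH bits wide
    reg [WIDTH-1:0] mem [0:DEPTH-1];

    // Synchronous write
    always @(posedge clk)
        if (we) mem[addr] <= din;

    // Asynchronous read
    assign dout = mem[addr];
endmodule
```

Under the same assumptions, a 1M X 1 memory would use `DEPTH = 1 << 20`, `WIDTH = 1`, and `ADDR_BITS = 20`.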






## Address Decoding
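
As a rough behavioral sketch of address decoding, the row decoder turns an n-bit address into 2^n one-hot word lines, exactly one of which is asserted. The module name, port names, and the enable input are assumptions made for illustration; the slide's figure is not reproduced here.

```verilog
// Hypothetical one-hot row decoder: ADDR_BITS address bits select 1 of ROWS word lines.
module row_decoder #(
    parameter ADDR_BITS = 5,
    parameter ROWS      = 1 << ADDR_BITS
) (
    input  [ADDR_BITS-1:0] addr,
    input                  en,        // assumed enable; all word lines low when 0
    output [ROWS-1:0]      word_line
);
    // Shift a single 1 into the bit position selected by addr
    assign word_line = en ? ({{(ROWS-1){1'b0}}, 1'b1} << addr)
                          : {ROWS{1'b0}};
endmodule
```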




## **Memory Block Internals**



SRAM cell array details: the most common is the 6-transistor (6T) cell array.

For a read operation, the column bit lines are equalized (set to the same voltage), then released. The cell pulls down one bit line or the other.

# Column MUX in ROMs and RAMs:

- Permits input/output data widths different from row width.
- Controls physical aspect ratio.
  - Important for physical layout and to control delay on wires.
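
The column MUX can be sketched behaviorally as a part-select on the row that the decoder has activated. The example below assumes a 32-bit row and an 8-bit data width, so a 2-bit column address picks one of four words per row; all names and sizes are illustrative assumptions. With these numbers, a 1K x 8 memory (8192 bits) could be laid out as 256 rows of 32 bits, using 8 row address bits and 2 column address bits.

```verilog
// Hypothetical column MUX: selects one DATA_BITS-wide word out of a ROW_BITS-wide row.
module column_mux #(
    parameter ROW_BITS      = 32,
    parameter DATA_BITS     = 8,
    parameter COL_ADDR_BITS = 2   // log2(ROW_BITS / DATA_BITS)
) (
    input  [ROW_BITS-1:0]      row_data,  // bits read from the selected row
    input  [COL_ADDR_BITS-1:0] col_addr,  // low-order address bits
    output [DATA_BITS-1:0]     data_out
);
    // Indexed part-select: DATA_BITS bits starting at col_addr * DATA_BITS
    assign data_out = row_data[col_addr*DATA_BITS +: DATA_BITS];
endmodule
```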



# **Cascading Memory-Blocks**

How to make larger memory blocks out of smaller ones.

Increasing the width. Example: given 1Kx8, want 1Kx16
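
A minimal sketch of the width case, assuming a hypothetical `ram_1kx8` building block with synchronous write and asynchronous read (the lecture does not specify the primitive's interface). Both blocks see the same address and write enable; each stores one byte of the 16-bit word.

```verilog
// Assumed building block: 1K x 8 RAM, synchronous write, asynchronous read.
module ram_1kx8 (
    input        clk,
    input        we,
    input  [9:0] addr,
    input  [7:0] din,
    output [7:0] dout
);
    reg [7:0] mem [0:1023];
    always @(posedge clk)
        if (we) mem[addr] <= din;
    assign dout = mem[addr];
endmodule

// 1K x 16 built from two 1K x 8 blocks placed side by side.
module ram_1kx16 (
    input         clk,
    input         we,
    input  [9:0]  addr,
    input  [15:0] din,
    output [15:0] dout
);
    // Low byte and high byte share address and write enable
    ram_1kx8 lo (.clk(clk), .we(we), .addr(addr), .din(din[7:0]),  .dout(dout[7:0]));
    ram_1kx8 hi (.clk(clk), .we(we), .addr(addr), .din(din[15:8]), .dout(dout[15:8]));
endmodule
```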




Increasing the depth. Example: given 1Kx8, want 2Kx8
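
A minimal sketch of the depth case, again assuming a hypothetical `ram_1kx8` block with synchronous write and asynchronous read. The extra (most significant) address bit steers the write enable to one bank and selects which bank's output is returned.

```verilog
// Assumed building block: 1K x 8 RAM, synchronous write, asynchronous read.
module ram_1kx8 (
    input        clk,
    input        we,
    input  [9:0] addr,
    input  [7:0] din,
    output [7:0] dout
);
    reg [7:0] mem [0:1023];
    always @(posedge clk)
        if (we) mem[addr] <= din;
    assign dout = mem[addr];
endmodule

// 2K x 8 built from two 1K x 8 banks.
module ram_2kx8 (
    input         clk,
    input         we,
    input  [10:0] addr,
    input  [7:0]  din,
    output [7:0]  dout
);
    wire       sel = addr[10];   // bank select: top address bit
    wire [7:0] dout0, dout1;

    // Only the selected bank is written
    ram_1kx8 bank0 (.clk(clk), .we(we & ~sel), .addr(addr[9:0]), .din(din), .dout(dout0));
    ram_1kx8 bank1 (.clk(clk), .we(we &  sel), .addr(addr[9:0]), .din(din), .dout(dout1));

    // Output MUX picks the selected bank
    assign dout = sel ? dout1 : dout0;
endmodule
```

If the building block had a synchronous read instead, `sel` would need to be registered so that the output MUX lines up with the delayed read data.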




## **Multi-ported Memory**

#### • Motivation:

- Consider CPU core register file:
  - 1 read or write per cycle limits processor performance.
  - Complicates pipelining. Difficult for different instructions to simultaneously read or write regfile.
  - Common arrangement in pipelined CPUs is 2 read ports and 1 write port.
- I/O data buffering:
  - Dual-porting allows both sides to simultaneously access memory at full bandwidth.

[Figure: a data buffer sits between a disk or network interface and the CPU.]
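
The "2 read ports and 1 write port" arrangement can be sketched as follows. This is an illustrative register file only, assuming 32 registers of 32 bits and the MIPS convention that register 0 always reads as zero; the module and port names are hypothetical and are not the project's specified interface.

```verilog
// Hypothetical 32 x 32 register file: 2 asynchronous read ports, 1 synchronous write port.
module regfile_2r1w (
    input         clk,
    input         we,
    input  [4:0]  wa,    // write address
    input  [31:0] wd,    // write data
    input  [4:0]  ra1,   // read address, port 1
    input  [4:0]  ra2,   // read address, port 2
    output [31:0] rd1,
    output [31:0] rd2
);
    reg [31:0] regs [0:31];

    always @(posedge clk)
        if (we) regs[wa] <= wd;

    // Both ports read in the same cycle; register 0 reads as zero (assumed MIPS convention)
    assign rd1 = (ra1 == 5'd0) ? 32'd0 : regs[ra1];
    assign rd2 = (ra2 == 5'd0) ? 32'd0 : regs[ra2];
endmodule
```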

#### **Dual-ported Memory Internals**

- Add decoder, another set of read/write logic, bit lines, word lines:
- Example cell: SRAM



# Adding Ports to Primitive Memory Blocks

Adding a read port to a simple dual port (SDP) memory.

Example: given 1Kx8 SDP, want 1 write & 2 read ports.
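
One common way to get the second read port is to replicate the memory: both copies receive every write, and each copy serves one read port. The sketch below assumes a hypothetical `sdp_1kx8` block (one write port, one asynchronous read port); it illustrates the replication idea and is not necessarily the exact circuit drawn on the slide.

```verilog
// Assumed simple dual-port block: 1 write port, 1 read port.
module sdp_1kx8 (
    input        clk,
    input        we,
    input  [9:0] waddr,
    input  [7:0] din,
    input  [9:0] raddr,
    output [7:0] dout
);
    reg [7:0] mem [0:1023];
    always @(posedge clk)
        if (we) mem[waddr] <= din;
    assign dout = mem[raddr];
endmodule

// 1 write port, 2 read ports: two copies that always hold identical data.
module ram_1w2r (
    input        clk,
    input        we,
    input  [9:0] waddr,
    input  [7:0] din,
    input  [9:0] raddr0,
    input  [9:0] raddr1,
    output [7:0] dout0,
    output [7:0] dout1
);
    // Every write goes to both copies; each copy has its own read address
    sdp_1kx8 copy0 (.clk(clk), .we(we), .waddr(waddr), .din(din), .raddr(raddr0), .dout(dout0));
    sdp_1kx8 copy1 (.clk(clk), .we(we), .waddr(waddr), .din(din), .raddr(raddr1), .dout(dout1));
endmodule
```

The cost is twice the storage for the same capacity.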




How to add a write port to a simple dual port memory. Example: given 1Kx8 SDP, want 1 read & 2 write ports.
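
Adding a write port is harder than adding a read port, because replication alone does not keep the copies consistent. One known approach, sketched here under stated assumptions and not necessarily the one shown on the slide, gives each write port its own bank and keeps a small table (often called a live value table) recording which bank holds the most recent value for each address. The `sdp_1kx8` block and all names are hypothetical; when both ports write the same address in the same cycle, this sketch lets port 1 win.

```verilog
// Assumed simple dual-port block: 1 write port, 1 read port.
module sdp_1kx8 (
    input        clk,
    input        we,
    input  [9:0] waddr,
    input  [7:0] din,
    input  [9:0] raddr,
    output [7:0] dout
);
    reg [7:0] mem [0:1023];
    always @(posedge clk)
        if (we) mem[waddr] <= din;
    assign dout = mem[raddr];
endmodule

// 2 write ports, 1 read port, using one bank per write port.
module ram_2w1r (
    input        clk,
    input        we0,
    input  [9:0] waddr0,
    input  [7:0] din0,
    input        we1,
    input  [9:0] waddr1,
    input  [7:0] din1,
    input  [9:0] raddr,
    output [7:0] dout
);
    wire [7:0] dout0, dout1;

    sdp_1kx8 bank0 (.clk(clk), .we(we0), .waddr(waddr0), .din(din0), .raddr(raddr), .dout(dout0));
    sdp_1kx8 bank1 (.clk(clk), .we(we1), .waddr(waddr1), .din(din1), .raddr(raddr), .dout(dout1));

    // One bit per address: 0 = newest value is in bank0, 1 = newest value is in bank1.
    // This table itself needs two write ports, so it would be built from flip-flops.
    reg newest [0:1023];

    always @(posedge clk) begin
        if (we0) newest[waddr0] <= 1'b0;
        if (we1) newest[waddr1] <= 1'b1;  // later assignment wins on a same-address conflict
    end

    assign dout = newest[raddr] ? dout1 : dout0;
endmodule
```

In simulation, `newest` is undefined for addresses that have never been written, so a read before the first write returns unknown data.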









## **Example Distributed RAM (LUT RAM)**



# **Distributed RAM Primitives**



- Quad-Port 64 x 1-bit RAM
- Simple Dual-Port 64 x 3-bit RAM
- Single-Port 128 x 1-bit RAM
- Dual-Port 128 x 1-bit RAM
- Single-Port 256 x 1-bit RAM

All are built from a single slice or less.

Remember, though, that the SLICEM LUT is naturally only 1 read and 1 write port.



#### **Distributed RAM Timing**



Figure 5-27: Simplified Virtex-5 FPGA SLICEM Distributed RAM

Spring 2009

EECS150 - Lec03-FPGA

| Device | CLB Array (Row x Col) | Virtex-5 Slices <sup>(1)</sup> | Max Distributed RAM (Kb) | DSP48E Slices <sup>(2)</sup> | Block RAM Blocks, 18 Kb <sup>(3)</sup> | Block RAM Blocks, 36 Kb | Block RAM Max (Kb) | CMTs <sup>(4)</sup> | PowerPC Processor Blocks | Endpoint Blocks for PCI Express | Ethernet MACs <sup>(5)</sup> | Max RocketIO Transceivers <sup>(6)</sup>, GTP | Max RocketIO Transceivers <sup>(6)</sup>, GTX | Total I/O Banks <sup>(8)</sup> | Max User I/O <sup>(7)</sup> |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| XC5VLX30   | 80 x 30                          | 4,800                             | 320                            | 32                              | 64                   | 32    | 1,152       | 2                   | N/A                 | N/A                          | N/A                             | N/A                                         | N/A | 13                          | 400                        |
| XC5VLX50   | 120 x 30                         | 7,200                             | 480                            | 48                              | 96                   | 48    | 1,728       | 6                   | N/A                 | N/A                          | N/A                             | N/A                                         | N/A | 17                          | 560                        |
| XC5VLX85   | 120 x 54                         | 12,960                            | 840                            | 48                              | 192                  | 96    | 3,456       | 6                   | N/A                 | N/A                          | N/A                             | N/A                                         | N/A | 17                          | 560                        |
| XC5VLX110  | 160 x 54                         | 17,280                            | 1,120                          | 64                              | 256                  | 128   | 4,608       | 6                   | N/A                 | N/A                          | N/A                             | N/A                                         | N/A | 23                          | 800                        |
| XC5VLX155  | 160 x 76                         | 24,320                            | 1,640                          | 128                             | 384                  | 192   | 6,912       | 6                   | N/A                 | N/A                          | N/A                             | N/A                                         | N/A | 23                          | 800                        |
| XC5VLX220  | 160 x 108                        | 34,560                            | 2,280                          | 128                             | 384                  | 192   | 6,912       | 6                   | N/A                 | N/A                          | N/A                             | N/A                                         | N/A | 23                          | 800                        |
| XC5VLX330  | 240 x 108                        | 51,840                            | 3,420                          | 192                             | 576                  | 288   | 10,368      | 6                   | N/A                 | N/A                          | N/A                             | N/A                                         | N/A | 33                          | 1,200                      |
| XC5VLX20T  | 60 x 26                          | 3,120                             | 210                            | 24                              | 52                   | 26    | 936         | 1                   | N/A                 | 1                            | 2                               | 4                                           | N/A | 7                           | 172                        |
| XC5VLX30T  | 80 x 30                          | 4,800                             | 320                            | 32                              | 72                   | 36    | 1,296       | 2                   | N/A                 | 1                            | 4                               | 8                                           | N/A | 12                          | 360                        |
| XC5VLX50T  | 120 x 30                         | 7,200                             | 480                            | 48                              | 120                  | 60    | 2,160       | 6                   | N/A                 | 1                            | 4                               | 12                                          | N/A | 15                          | 480                        |
| XC5VLX85T  | 120 x 54                         | 12,960                            | 840                            | 48                              | 216                  | 108   | 3,888       | 6                   | N/A                 | 1                            | 4                               | 12                                          | N/A | 15                          | 480                        |
| XC5VLX110T | 160 x 54                         | 17,280                            | 1,120                          | 64                              | 296                  | 148   | 5,328       | 6                   | N/A                 | 1                            | 4                               | 16                                          | N/A | 20                          | 680                        |
| XC5VLX155T | 160 x 76                         | 24,320                            | 1,640                          | 128                             | 424                  | 212   | 7,632       | 6                   | N/A                 | 1                            | 4                               | 16                                          | N/A | 20                          | 680                        |
| XC5VLX220T | 160 x 108                        | 34,560                            | 2,280                          | 128                             | 424                  | 212   | 7,632       | 6                   | N/A                 | 1                            | 4                               | 16                                          | N/A | 20                          | 680                        |
| XC5VLX330T | 240 x 108                        | 51,840                            | 3,420                          | 192                             | 648                  | 324   | 11,664      | 6                   | N/A                 | 1                            | 4                               | 24                                          | N/A | 27                          | 960                        |
| XC5VSX35T  | 80 x 34                          | 5,440                             | 520                            | 192                             | 168                  | 84    | 3,024       | 2                   | N/A                 | 1                            | 4                               | 8                                           | N/A | 12                          | 360                        |
| XC5VSX50T  | 120 x 34                         | 8,160                             | 780                            | 288                             | 264                  | 132   | 4,752       | 6                   | N/A                 | 1                            | 4                               | 12                                          | N/A | 15                          | 480                        |
| XC5VSX95T  | 160 x 46                         | 14,720                            | 1,520                          | 640                             | 488                  | 244   | 8,784       | 6                   | N/A                 | 1                            | 4                               | 16                                          | N/A | 19                          | 640                        |
| XC5VSX240T | 240 x 78                         | 37,440                            | 4,200                          | 1,056                           | 1,032                | 516   | 18,576      | 6                   | N/A                 | 1                            | 4                               | 24                                          | N/A | 27                          | 960                        |
| XC5VTX150T | 200 x 58                         | 23,200                            | 1,500                          | 80                              | 456                  | 228   | 8,208       | 6                   | N/A                 | 1                            | 4                               | N/A                                         | 40  | 20                          | 680                        |
| XC5VTX240T | 240 x 78                         | 37,440                            | 2,400                          | 96                              | 648                  | 324   | 11,664      | 6                   | N/A                 | 1                            | 4                               | N/A                                         | 48  | 20                          | 680                        |
| XC5VFX30T  | 80 x 38                          | 5,120                             | 380                            | 64                              | 136                  | 68    | 2,448       | 2                   | 1                   | 1                            | 4                               | N/A                                         | 8   | 12                          | 360                        |
| XC5VFX70T  | 160 x 38                         | 11,200                            | 820                            | 128                             | 296                  | 148   | 5,328       | 6                   | 1                   | 3                            | 4                               | N/A                                         | 16  | 19                          | 640                        |
| XC5VFX100T | 160 x 56                         | 16,000                            | 1,240                          | 256                             | 456                  | 228   | 8,208       | 6                   | 2                   | 3                            | 4                               | N/A                                         | 16  | 20                          | 680                        |
| XC5VFX130T | 200 x 56                         | 20,480                            | 1,580                          | 320                             | 596                  | 298   | 10,728      | 6                   | 2                   | 3                            | 6                               | N/A                                         | 20  | 24                          | 840                        |
| XC5VFX200T | 240 x 68                         | 30,720                            | 2,280                          | 384                             | 912                  | 456   | 16,416      | 6                   | 2                   | 4                            | 8                               | N/A                                         | 24  | 27                          | 960                        |

Table 1: Virtex-5 FPGA Family Members
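
The three Block RAM columns in the table are related by simple arithmetic: each 36 Kb block can be used as two 18 Kb blocks, and the maximum capacity is 36 Kb times the number of 36 Kb blocks. As a worked check using the XC5VLX110T row (the symbols $N_{18}$ and $N_{36}$ are introduced here only for illustration):

$$
\begin{aligned}
N_{18} &= 2 \times N_{36} = 2 \times 148 = 296 \\
\text{Max (Kb)} &= 36 \times N_{36} = 36 \times 148 = 5{,}328
\end{aligned}
$$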

#### **Block RAM Overview**




- 36K bits of data total, can be configured as 2 independent 18Kb RAMs, or one 36Kb RAM.
- Each 36Kb block RAM can be configured as:
  - 64Kx1 (when cascaded with an adjacent 36Kb block RAM), 32Kx1, 16Kx2, 8Kx4, 4Kx9, 2Kx18, or 1Kx36 memory.
- Each 18Kb block RAM can be configured as:
  - 16Kx1, 8Kx2, 4Kx4, 2Kx9, or 1Kx18 memory.
- Write and Read are synchronous operations.
- The two ports are symmetrical and totally independent (can have different clocks), sharing only the stored data.
- Each port can be configured in one of the available widths, independent of the other port. The read port width can be different from the write port width for each port.
- The memory content can be initialized or cleared by the configuration bitstream.




- Note this is in the default mode, "WRITE\_FIRST". Other possible modes are "READ\_FIRST" and "NO\_CHANGE".
- Optional output register, would delay appearance of output data by one cycle.
- Maximum clock rate, roughly 400MHz.
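
The three write modes differ only in what appears on the data output during a write cycle. The behavioral sketch below shows the difference; the module name, the 1K x 18 size, and the integer `MODE` encoding are assumptions made for illustration. Whether a given synthesizer maps each pattern onto the corresponding block RAM mode should be checked against its user guide.

```verilog
// Hypothetical single-port synchronous RAM illustrating the three write modes.
module ram_write_modes #(
    parameter MODE = 0   // assumed encoding: 0 = WRITE_FIRST, 1 = READ_FIRST, 2 = NO_CHANGE
) (
    input             clk,
    input             we,
    input      [9:0]  addr,
    input      [17:0] din,
    output reg [17:0] dout
);
    reg [17:0] ram [0:1023];

    generate
        if (MODE == 0) begin : write_first
            always @(posedge clk) begin
                if (we) begin
                    ram[addr] <= din;
                    dout      <= din;        // newly written data appears on the output
                end else
                    dout <= ram[addr];
            end
        end else if (MODE == 1) begin : read_first
            always @(posedge clk) begin
                if (we) ram[addr] <= din;
                dout <= ram[addr];           // previous contents appear on the output
            end
        end else begin : no_change
            always @(posedge clk) begin
                if (we) ram[addr] <= din;
                else    dout <= ram[addr];   // output holds its value during a write
            end
        end
    endgenerate
endmodule
```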


# Verilog Synthesis Notes

- Block RAMs and LUT RAMs all exist as primitive library elements (similar to FDRSE). However, it is much more convenient to use inference.
- Depending on how you write your Verilog, you will get either a collection of block RAMs, a collection of LUT RAMs, or a collection of flip-flops.
- The synthesizer uses size and read style (synchronous versus asynchronous) to determine the best primitive type to use.
- It is possible to force mapping to a particular primitive by using synthesis directives. However, if you write your Verilog correctly, you will not need to use directives.
- The synthesizer has limited capabilities (e.g., it can combine primitives for more depth and width, but is limited on porting options). Be careful, as you might not get what you want.
- See the Synplify User Guide and the XST User Guide for examples.

#### **Inferring RAMs in Verilog**

[Code listing not recovered: 64X1 RAM implementation using distributed RAM.]
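
A 64X1 RAM that should map to distributed (LUT) RAM can be sketched as below. This is an illustrative reconstruction of the general pattern, not the slide's original listing; the module and port names are assumptions. The key features are a `reg` array written inside a clocked `always` block and an asynchronous (combinational) read.

```verilog
// Hypothetical 64 x 1 RAM intended to infer distributed RAM.
module ram64x1 (clk, we, addr, d, q);
    input        clk;
    input        we;
    input  [5:0] addr;
    input        d;
    output       q;

    reg mem [63:0];            // 64 one-bit words

    // Synchronous write
    always @(posedge clk)
        if (we)
            mem[addr] <= d;

    // Asynchronous read: this read style is what steers inference toward LUT RAM
    assign q = mem[addr];
endmodule
```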


## **Dual-read-port LUT RAM**

```
//
// Multiple-Port RAM Descriptions
//
module v_rams_17 (clk, we, wa, ra1, ra2, di, do1, do2);
    input clk;
    input we;
    input [5:0] wa;
    input [5:0] ra1;
    input [5:0] ra2;
    input [15:0] di;
    output [15:0] do1;
    output [15:0] do2;
    reg [15:0] ram [63:0];
    // Synchronous write
    always @(posedge clk)
    begin
        if (we)
            ram[wa] <= di;
    end
    // Multiple references to the same array give two asynchronous read ports
    assign do1 = ram[ra1];
    assign do2 = ram[ra2];
endmodule
```

#### **Block RAM Inference**

```
//
// Single-Port RAM with Synchronous Read
//
module v_rams_07 (clk, we, a, di, do);
   input clk;
   input we;
   input [5:0] a;
   input [15:0] di;
   output [15:0] do;
   reg [15:0] ram [63:0];
   reg [5:0] read_a;
   always @(posedge clk) begin
       if (we)
         ram[a] <= di;
       read_a <= a;   // Synchronous read (registered address) infers Block RAM
   end
   assign do = ram[read_a];
endmodule
```


### **Block RAM Initialization**

```
module RAMB4_S4 (data_out, ADDR, data_in, CLK, WE);
   output [3:0] data_out;
   input [2:0] ADDR;
   input [3:0] data_in;
   input CLK, WE;
   reg [3:0] mem [7:0];
   reg [2:0] read_addr;
   // "data.dat" contains initial RAM contents; it gets put into the bitfile
   // and loaded at configuration time. (Remake bits to change contents.)
   initial
     begin
       $readmemb("data.dat", mem);
     end
   // Registered read address gives a synchronous read
   always @(posedge CLK)
     read_addr <= ADDR;
   assign data_out = mem[read_addr];
   // Synchronous write
   always @(posedge CLK)
     if (WE) mem[ADDR] <= data_in;
endmodule
```

#### **Dual-Port Block RAM**

    module test (data0, data1, waddr0, waddr1, we0, we1, clk0, clk1, q0, q1);
        parameter d_width = 8;
        parameter addr_width = 8;
        parameter mem_depth = 256;
        input [d_width-1:0] data0, data1;
        input [addr_width-1:0] waddr0, waddr1;
        input we0, we1, clk0, clk1;
        output [d_width-1:0] q0, q1;
        reg [d_width-1:0] mem [mem_depth-1:0];
        reg [addr_width-1:0] reg_waddr0, reg_waddr1;
        assign q0 = mem[reg_waddr0];
        assign q1 = mem[reg_waddr1];
        // Port 0: write and registered read address, clocked by clk0
        always @(posedge clk0) begin
            if (we0) mem[waddr0] <= data0;
            reg_waddr0 <= waddr0;
        end
        // Port 1: write and registered read address, clocked by clk1
        always @(posedge clk1) begin
            if (we1) mem[waddr1] <= data1;
            reg_waddr1 <= waddr1;
        end
    endmodule

# Processor Design Considerations (1/2)


#### • Register File: Consider distributed RAM (LUT RAM)

- Size is close to what is needed: distributed RAM primitive configurations are 32 or 64 words deep. Extra width is easily achieved by parallel arrangements.
- LUT-RAM configurations offer multi-porting options useful for register files.
- Asynchronous read might be useful, since it gives flexibility on where to put the register read in the pipeline.

#### • Instruction / Data Memories: Consider Block RAM

- Higher density, lower cost for a large number of bits.
- A single 36Kb Block RAM implements 1K 32-bit words.
- Configuration-stream-based initialization permits a simple "boot strap" procedure.

#### • Other Memories in Project? Video "Frame Buffer"?
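
As a sketch of the Block RAM suggestion, a 1K x 32 instruction memory with a synchronous read and configuration-time initialization could look like the following. The module name, port names, and the file name `program.hex` are hypothetical; this is not the project's specified memory interface.

```verilog
// Hypothetical 1K x 32 instruction memory intended to fit one 36Kb Block RAM.
module imem_1kx32 (
    input             clk,
    input      [9:0]  addr,   // word address
    output reg [31:0] instr
);
    reg [31:0] mem [0:1023];

    // Initial contents are placed in the bitstream ("program.hex" is an assumed file name)
    initial $readmemh("program.hex", mem);

    // Synchronous read: data is available one cycle after the address is presented
    always @(posedge clk)
        instr <= mem[addr];
endmodule
```

Because the read is synchronous, the pipeline has to present the address one cycle before the instruction is needed.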

#### **XUP Board External SRAM**

