# <u>EECS150 - Digital Design</u> <u>Lecture 10 - Interfacing</u>

### Feb 21, 2013 John Wawrzynek

Spring 2013

EECS150 - Lec10-interfacing

Page 1

# <u>Topics</u>

- How do components (modules) communicate?
- Combining Hardware with Processors next week.

#### Synchronous Data Transfer

- In synchronous systems, the clock signal is used to coordinate the movement of data around the system.
- Take for example, transferring from module to module:



- By design, the clock period is sufficiently long to accommodate wire delay and time to get the data into the receiver.
- Assumes:
  - sender is ready to send data on each cycle & receiver is ready to receive data on each cycle
- What if the communication is sporadic?

```
Spring 2013
```

EECS150 - Lec10-interfacing

Page 3

#### Data Transfer with Control



- After checking to see if the receiver is ready to receive, the sender asserts the "valid" signal to indicate that the data lines hold data for transferring
- As with synchronous transfer, the sender assumes that the transfer happens successfully
- Design constraint: the ready signal needs to be stable early enough in the cycle to allow the sender to respond with data and the valid signal before set-up time of receiver.
- Can we pipeline the control?

### First-in-first-out (FIFO) Memory

- Used to implement queues.
- These find common use in computers and communication circuits.
- Generally, used to "decouple" actions of producer and consumer: stating state



- Producer can perform many writes without consumer performing any reads (or vis versa). However, because of finite buffer size, on average, need equal number of reads and writes.
- Typical uses:
  - interfacing I/O devices.
    Example network interface.
    Data bursts from network, then processor bursts to memory buffer (or reads one word at a time from interface).
    Operations not synchronized.
  - Example: Audio output.
    Processor produces output samples in bursts (during process swap-in time). Audio DAC clocks it out at constant sample rate.

Spring 2013

EECS150 - Lec10-interfacing

Page 5



- After write or read operation, FULL and EMPTY indicate status of buffer.
- Used by external logic to control own reading from or writing to the buffer.
- FIFO resets to EMPTY state.
- HALF FULL (or other indicator of partial fullness) is optional.



write ptr-

If pointers equal after write  $\Rightarrow$  FULL:



If pointers equal after read  $\Rightarrow$  EMPTY:



Spring 2013

#### **Decoupled Communication**



- · Allow sender and receiver to transfer data independently
- Assumes, on average, equal number of sends and receives
- FIFO buffer must be large enough to accommodate instantaneous difference in send and receive rate

Spring 2013

EECS150-Lec10-interfacing

Page 7

#### **Communicating Across Clock Boundaries**

- Many synchronous systems need to interface to asynchronous input signals:
  - Consider a computer system running at some clock frequency, say 1GHz with:
    - Interrupts from I/O devices, keystrokes, etc.
    - Data transfers from devices with their own clocks
      - Ethernet has its own 100MHz clock
      - PCI bus transfers, 66MHz standard clock.
  - These signals could have no known timing relationship with the system clock of the CPU.

#### "Synchronizer" Circuit

• For a single asynchronous input, we use a simple flip-flop to bring the external input signal into the timing domain of the system clock:



• The D flip-flop samples the asynchronous input at each cycle and produces a synchronous output that meets the setup time of the next stage.

Spring 2013

EECS150 - Lec10-interfacing

Page 9

## "Synchronizer" Circuit

 It is essential for asynchronous inputs to be synchronized at only one place.



- Two flip-flops may not receive the clock and input signals at precisely the same time (clock *and* data skew).
- When the asynchronous changes near the clock edge, one flip-flop may sample input as 1 and the other as 0.

Spring 2013

## "Synchronizer" Circuit

- Single point of synchronization is even more important when input goes to a combinational logic block (ex. FSM)
- The CL block can accidentally hide the fact that the signal is synchronized at multiple points.
- The CL magnifies the chance of the multiple points of synchronization seeing different values.



Spring 2013

EECS150 - Lec10-interfacing

Page 11

## Synchronizer Failure & Metastability

V<sub>out1</sub>  $= V_{in2}$ 

- We think of flip-flops having only two stable states - but all have a third metastable state halfway between 0 and 1.
- When the setup and hold times of a flip-flop are not met, the flip-flop could be put into the metastable state.
- Noise will be amplified and push the flip-flop one way or other.
- However, in theory, the time to transition to a legal state is unbounded
- Does this really happen?
- The probability is low, but number of trials is high!



Spring 2013

## Synchronizer Failure & Metastability

- If the system uses a synchronizer output while the output is still in the metastable state  $\Rightarrow$  synchronizer failure.
- Initial versions of several commercial ICs have suffered from metastability problems effectively synchronization failure:
  - AMD9513 system timing controller
  - AMD9519 interrupt controller
  - Zilog Z-80 Serial I/O interface
  - Intel 8048 microprocessor
  - AMD 29000 microprocessor
- To avoid synchronizer failure wait long enough before using a synchronizer's output. "Long enough", according to Wakerly, is so that the mean time between synchronizer failures is several orders of magnitude longer than the designer's expected length of employment!
- In practice all we can do is reduce the probability of failure to a vanishing small value.

```
Spring 2013
```

EECS150 - Lec10-interfacing

Page 13

# Reliable Synchronizer Design

- The probability that a flip-flop stays in the metastable state decreases exponentially with time.
- Therefore, any scheme that delays using the signal can be used to decrease the probability of failure.
- In practice, delaying the signal by a cycle is usually sufficient:



- If the clock period is greater than metastability resolution time plus FF2 setup time, FF2 gets a synchronized version of ASYNCIN.
- Multi-cycle synchronizers (using counters or more cascaded flipflops) are even better - but often overkill.

## Xilinx Virtex5 FIFOs

- Virtex5 BlockRAMS include dedicated circuits for FIFOs (details in User Guide (ug190)).
- Takes advantage of separate dual ports and independent ports clocks.



• Often used for crossing clock boundaries.

Spring 2013

EECS150 - Lec10-interfacing

Page 15

# **Purely Asynchronous Circuits**

- Many researchers (and a few industrial designers) have proposed a variety of circuit design methodologies that **eliminate the need for a globally distributed clock**.
- They cite a variety of important potential advantages over synchronous systems.
- To date, these attempts have remained mainly in Universities.
- A few commercial asynchronous chips/systems have been build.
- Sometimes, asynchronous blocks sometimes appear inside otherwise synchronous systems.
  - Asynchronous techniques have long been employed in DRAM and other memory chips for generation internal control without external clocks. (Precharge/sense-amplifier timing based on address line changes.
- In GALS (globally asynchronous locally synchronous) systems, independently synchronous islands communicate asynchronously.

Page 16

#### Delay Insensitive (self-timed transfer)



Request/acknowledge "handshake" signal pair used to coordinate data transfer.



### Delay Insensitive (self-timed transfer)



- Only two transitions per transfer. Maybe higher performance.
- More complex logic. 4-cycle return to zero can usually be overlapped with other operations.