Digital Simulation: Performance and Use of Logic Simulation

Performance and Use of Logic Simulation

Two simple logic circuits, an AND gate with an inverter and an R-S latch, are used to demonstrate the performance and use of logic simulation. The circuits are deliberately simple so that the behaviour of the simulator, rather than the complexity of the circuit function, is in the foreground.

Inverter with AND

The circuit inverts the input signal IN_AND0 to produce the variable INVERT. The AND gate has two inputs, INVERT and IN_AND1. The two outputs of the circuit are OUT_AND and INVERT (fig. 11.8, list 11.1).

Output INVERT was added to show the value of the internal node explicitly.

Functional simulation: circuit inv_and

Functional simulation uses a simple model with a constant time delay (unit time delay) for every gate. Dynamic effects caused by differing gate delays, which may produce glitches, cannot be discovered at this stage. Additionally, all signal changes are assumed to be ideal step functions. The technology of the target device is not considered either.

Test stimuli are developed by the designer or a co-designer (list 11.2). The actual test program was entered using a graphic interface and altered manually afterwards. This speeds up the generation process because one can see the circuit reaction directly and develop the test program accordingly. Ineffective test steps can be deleted manually.

Most workbenches provide tools for inspecting the signals at every node of the hardware. This greatly helps in identifying design faults during the design process.

Using the test stimuli given in list 11.2, the functional simulation produces the timing diagram of fig. 11.9.

To make each combination of the two input variables easy to recognize, the inputs are bundled into an additional variable INPUTS whose values are displayed as hexadecimal numbers. The values 0, 3, 2, and 1 show that all input combinations are applied in the first 4 steps of the test stimuli.

Simulation begins with both inputs at 0. After one unit time delay both outputs change from undefined to defined values, that is INVERT = 1 and OUT_AND = 0. The delay for output OUT_AND is only one time unit because IN_AND1 = 0 directly forces the output OUT_AND to 0, disregarding the unknown output value of INVERT.

Depending on the sequence in which the input variables change, a hazard may occur. This is the case when both circuit inputs change from 00 to 11 at the time marked ➀ (fig. 11.9); this is depicted as a change of the bundle variable INPUTS from 0 to 3. After one time unit the output OUT_AND changes to 1, and it returns to 0 one time unit later.

In particular, the values change as follows for the first part of the input pattern. Input IN_AND0 changes from 0 to 1 at ➀. Therefore the output INVERT changes to 0 at ➁, one time unit after ➀. During the period between ➀ and ➁, IN_AND1 = 1 and INVERT = 1. Because of this, OUT_AND goes to 1 at ➁. OUT_AND returns to 0 one time unit later, at ➂, because INVERT changed to 0 at ➁.
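The sequence described above can be reproduced with a minimal unit-delay sketch in Python. It is only an illustration, not the simulator used for fig. 11.9; the function names step and simulate are assumptions. Every gate output at the next time step is computed from the node values of the current step, which is what makes the one-unit pulse on OUT_AND visible.

```python
def step(in_and0, in_and1, invert, out_and):
    """Advance one unit time delay; each gate sees the node values of the previous step."""
    next_invert = 1 - in_and0        # inverter driven by IN_AND0
    next_out_and = invert & in_and1  # AND gate still sees the OLD value of INVERT
    return next_invert, next_out_and


def simulate(in_and0, in_and1, invert, out_and, steps):
    """Hold the inputs constant and record (INVERT, OUT_AND) for every time step."""
    trace = [(invert, out_and)]
    for _ in range(steps):
        invert, out_and = step(in_and0, in_and1, invert, out_and)
        trace.append((invert, out_and))
    return trace


# The settled state for inputs 00 is INVERT = 1, OUT_AND = 0.
# At t = 0 both inputs switch to 1 (INPUTS changes from 0 to 3).
trace = simulate(in_and0=1, in_and1=1, invert=1, out_and=0, steps=3)
for t, (invert, out_and) in enumerate(trace):
    print(f"t={t}: INVERT={invert} OUT_AND={out_and}")
```

In the trace, OUT_AND is 1 only at t = 1, which corresponds to the interval between ➁ and ➂.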

The pulse at output OUT_AND is shown clearly by the simulator. Whether it appears, however, depends on the value of the relevant parameter in the AND model used in the simulator: a pulse shorter than the time specified by that parameter is suppressed by the simulator.
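The text does not name this parameter. The following sketch assumes it behaves like a minimum pulse width, as is common for inertial delay models; the function name suppress_short_pulses and the parameter min_width are hypothetical. The waveform is a list of samples, one per time unit.

```python
def suppress_short_pulses(waveform, min_width):
    """Remove pulses that last fewer than min_width samples.

    A pulse is a run of samples that differs from the level before it and
    is followed by a return to another level. A final run that lasts until
    the end of the waveform is a level change, not a pulse, and is kept.
    """
    result = list(waveform)
    i = 1
    while i < len(result):
        if result[i] == result[i - 1]:
            i += 1
            continue
        # Find the end of the run that starts at index i.
        j = i
        while j < len(result) and result[j] == result[i]:
            j += 1
        if j < len(result) and j - i < min_width:
            for k in range(i, j):
                result[k] = result[i - 1]  # overwrite the pulse with the previous level
        i = j
    return result


out_and = [0, 1, 0, 0]  # the one-unit pulse from the unit-delay simulation
print(suppress_short_pulses(out_and, min_width=1))  # pulse is kept
print(suppress_short_pulses(out_and, min_width=2))  # pulse is suppressed
```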

This logic model does not necessarily reflect the behaviour of the target device, as the following passage shows.

A short note concerning the input values in the timing diagram of fig. 11.9: input variables may also have the value X (unknown). In the design example the input IN_AND0 is set to X at ➃. After one time unit INVERT changes to X, which causes the output OUT_AND to change to X as well after another time unit. If the inputs then return to 0, INVERT and OUT_AND take the values 1 and 0 respectively, both after one time unit. This is because a 0 at one input of an AND gate (in this case IN_AND1) asserts a 0 at its output, regardless of the value of its other inputs.
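The handling of the value X can be illustrated with a small three-valued logic sketch. The names X, not3 and and3 are assumptions chosen for this example; real simulators usually support more signal values than these three.

```python
X = "X"  # unknown logic value


def not3(a):
    """Inverter: an unknown input gives an unknown output."""
    return X if a == X else 1 - a


def and3(a, b):
    """AND gate: a 0 at any input forces the output to 0, even if the other input is X."""
    if a == 0 or b == 0:
        return 0
    if a == X or b == X:
        return X
    return 1


print(not3(X))      # X propagates through the inverter
print(and3(X, 1))   # X propagates through the AND gate
print(and3(X, 0))   # the controlling 0 hides the unknown value
```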

Pre-layout simulation: circuit inv_and

After functional simulation the structure of the circuit must be tailored to the target device. The logic of the circuit is converted in such a way that the specific logic modules of the target device are used exclusively. To demonstrate this effect the FPGA 4003 of XILINX is used as a target device. This process occurs similarly in the development of ASICs.

The FPGA uses Configurable Logic Blocks (CLB), which are its logic basis. In a CLB, look-up tables (LUT), multiplexers, and flip flops can be configured to implement the intended logic function. In this case LOGIC2 performs the synthesis for the circuit inv_and and generates the logic function (fig. 11.10) tailored to the FPGA. The designer has to verify the logic implementation with pre-layout simulation. Naturally the test stimuli of the functional simulation are used again to show the differences of the implementation.

In pre-layout simulation the load imposed on outputs is ignored. Load is imposed by the capacitance of interconnections and the input load of connected gates.

Most interesting is that the output INVERT is generated directly by an inverter. The reason for this implementation is that a CLB has an inverter for each external output, which is used directly in this case and requires no additional resources of the CLB. The logic function itself is implemented with an LUT. Logic functions with 4 inputs can be programmed in this LUT without any need for optimization.

Pre-layout simulation (fig. 11.11) shows that no pulse is generated after the change of INPUTS from 0 to 3. All changes of output values occur after one unit time delay.

As mentioned before, the logic function is implemented in an LUT within the CLB. Accordingly no inverters are used for the logic implementation. The logic is just programmed into the LUT. The output of an LUT will change after one unit time delay independent of the logic programmed in the LUT. In fig. 11.12 both inputs F2 (IN_AND0) and F4 (IN_AND1) feed the LUT and the output F of the LUT is connected to output X (OUT_AND).

The layout of the logic AND within the FPGA is shown in fig. 11.13. Logic functions of each signal in the FPGA can be evaluated by the workbench. The output F is equal to /F2*F4, using the internal variables of the FPGA. F2 refers to IN_AND0 and F4 to IN_AND1 (fig. 11.14).

This feature is very useful for debugging.
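How an LUT holds the function F = /F2*F4 can be sketched as a plain truth table with 16 entries. This is a simplified illustration only: the address bit order and the helper names program_lut and read_lut are assumptions and do not describe the actual configuration format of the device.

```python
def program_lut(func, n_inputs=4):
    """Build the truth table of func for all combinations of the LUT inputs."""
    table = []
    for address in range(2 ** n_inputs):
        # Assumed bit order: bit 0 = F1, bit 1 = F2, bit 2 = F3, bit 3 = F4.
        bits = [(address >> i) & 1 for i in range(n_inputs)]
        table.append(func(*bits))
    return table


def read_lut(table, f1, f2, f3, f4):
    """Look up the output; the effort is the same whatever function is stored."""
    address = f1 | (f2 << 1) | (f3 << 2) | (f4 << 3)
    return table[address]


# F = /F2 * F4; the inputs F1 and F3 are unused.
lut = program_lut(lambda f1, f2, f3, f4: (1 - f2) & f4)

print(read_lut(lut, 0, 0, 0, 1))  # IN_AND0 = 0, IN_AND1 = 1
print(read_lut(lut, 0, 1, 0, 1))  # IN_AND0 = 1, IN_AND1 = 1
```

Because the inversion is part of the stored table, no separate inverter node exists between the inputs and F, which is consistent with the absence of the pulse in pre-layout simulation.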

Post-layout simulation: circuit inv_and

For the post-layout simulation the design entry tool Foundation of Xilinx was used. Instead of the term post-layout simulation the term timing simulation is used (fig. 11.15). Before starting the simulation the circuit parameters are extracted from the actual layout.

The delays from input to output are significant. This is caused by the delay of the input buffers and output buffers. These do not show up in functional and pre-layout simulation.

Output INVERT goes high before the output signal OUT_AND. The path delay for the output INVERT is slightly shorter than the path delay for the output OUT_AND. This shows clearly that the results of the simulation depend on the hardware of the implementation.

Design example: RS flip flop

The RS flip flop implemented with NOR gates (fig. 11.16) is another design example to demonstrate the capability of functional simulation with unit time delay.

A special feature of this implementation is that the output QSTAR is not the negated Q. Because there is no standard way of drawing the schematic of an RS flip flop in the literature, it should be noted that the input SET is in the upper left of the schematic and the output QSTAR is on the upper right.

Functional simulation of the RS flip flop circuit

In functional simulation the simultaneous change of both inputs SET and RESET from 11 to 00 causes both outputs Q and QSTAR to oscillate in phase (fig. 11.17). The flip flop is in an unstable condition. The input combination 11 for SET and RESET is not permissible, and the simulator exposes the problem in this circuit.

The explanation in detail is as follows. Starting at 30 ns, both inputs are 1. Taken literally, these input values mean setting and resetting the flip flop at the same time, which is not a meaningful action. The flip flop reacts with both outputs Q and QSTAR at 1. What happens if both inputs then go low at 40 ns? A logic 1 at the outputs, fed back to the gate inputs, produces a logic 0 at the outputs NOR_WITH_SET and NOR_WITH_RESET after one unit time delay. All inputs of both NOR gates are then 0, which in turn leads to a 1 at both NOR_WITH_SET and NOR_WITH_RESET after another unit time delay, and the cycle repeats. This behavior shows up only in simulation: the delay times of both gates are identical in the simulated model, which is never true in a real logic circuit. Nevertheless, the simulation indicates that an illegal input combination has occurred.
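The in-phase oscillation can be reproduced with a minimal unit-delay sketch of two cross-coupled NOR gates. The sketch assumes ideal NOR gates and starts from the state described above (both outputs 1, both inputs 0); the function names are chosen for illustration, and the node naming in fig. 11.16 may map differently.

```python
def nor(a, b):
    """Two-input NOR gate for the values 0 and 1."""
    return 1 - (a | b)


def simulate_latch(set_, reset, q, qstar, steps):
    """Unit-delay simulation: both gates switch at the same instant."""
    trace = [(q, qstar)]
    for _ in range(steps):
        # The right-hand side uses the values of the previous time step for both gates.
        q, qstar = nor(reset, qstar), nor(set_, q)
        trace.append((q, qstar))
    return trace


# Both inputs have just gone low while both outputs are still 1.
trace = simulate_latch(set_=0, reset=0, q=1, qstar=1, steps=4)
for t, (q, qstar) in enumerate(trace):
    print(f"t={t}: Q={q} QSTAR={qstar}")
```

Both outputs always carry the same value and toggle in every time step, so the model never reaches a stable state.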

Post-layout simulation of circuit RS flip flop (timing simulation)

Pre-layout simulation is left out because no additional information is expected in this simple example. After layout all macro cells, output drivers, and the wiring are fixed. The actual parameters can be extracted from the circuit, which is an FPGA in this case. Using these parameters the circuit model is modified so that its performance matches that of the real circuit. So-called lumped models are mostly used; for instance, a single capacitor is inserted instead of the distributed capacitance of a wire. The process of inserting parameters extracted from the layout is called back annotation. In this way hardware-oriented delay parameters of the target device improve the accuracy of the simulation results.

The goal of post-layout simulation is to take dynamic behaviour into consideration

Varying delays of gates cause many events in the circuit at different times. The purpose of post-layout simulation is to analyse the dynamic behavior of the circuit. If post-layout simulation finds no unknown logic values and no oscillations in a circuit, the designer is one step closer to a functioning circuit with reproducible results.

Post-layout simulation produces a result with stable states (fig. 11.18) for the given test stimuli.

Both outputs, called P_Q:PAD and P_QSTAR.PAD in the diagram, have stable values. The reason is that in a real circuit the two gates have different delay times. This is the difference from the results of the functional simulation. Therefore one should use both functional simulation and post-layout simulation.
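Why unequal delays lead to a stable state can be sketched with a small inertial-delay model of the two cross-coupled NOR gates. This is an assumption-laden illustration, not the back-annotated model of the FPGA: the delays are arbitrary integer time units, and a scheduled output change is cancelled if the gate inputs change back before the delay has elapsed.

```python
def nor(a, b):
    """Two-input NOR gate for the values 0 and 1."""
    return 1 - (a | b)


def simulate_latch(delay_q, delay_qstar, q, qstar, end_time, set_=0, reset=0):
    """Return the list of (Q, QSTAR) for t = 0 .. end_time with inertial gate delays."""
    out = {"q": q, "qstar": qstar}
    delay = {"q": delay_q, "qstar": delay_qstar}
    pending = {}  # gate name -> (new value, time at which it takes effect)
    trace = []
    for t in range(end_time + 1):
        # 1. Apply every scheduled change that is due now.
        for gate, (value, due) in list(pending.items()):
            if due == t:
                out[gate] = value
                del pending[gate]
        trace.append((out["q"], out["qstar"]))
        # 2. Evaluate both gates on the current node values.
        target = {"q": nor(reset, out["qstar"]), "qstar": nor(set_, out["q"])}
        for gate in ("q", "qstar"):
            if target[gate] == out[gate]:
                pending.pop(gate, None)  # the cause disappeared: cancel the change
            elif gate not in pending:
                pending[gate] = (target[gate], t + delay[gate])
    return trace


equal = simulate_latch(1, 1, q=1, qstar=1, end_time=4)
unequal = simulate_latch(1, 2, q=1, qstar=1, end_time=4)
print("equal delays:  ", equal)    # keeps toggling
print("unequal delays:", unequal)  # the faster gate wins and the latch settles
```

Which output ends up at 1 depends on which gate is faster, so the final state is stable but still a consequence of the illegal input sequence.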

Design example: 4-bit synchronous binary counter

The 4-bit binary synchronous counter with carry and reset is a design example which will be designed in two ways, as a Medvedev machine and as a Moore machine. Simulation is used to demonstrate the differences between their timing. In particular the path delays of both circuits are analyzed. The differences of internal delay times which are caused by all elements associated with the function are discussed.

The counter is implemented in an FPGA, but all results are equally true for an ASIC design. Elements like input buffers, interconnections, and special clock lines exist in all implementations of logic circuits. The differences lie in the absolute values, which can be acquired in any technology by back annotation in the workbench used.

First, the 4-bit binary synchronous counter with carry and reset is implemented as a Moore machine. The carry signal is generated by decoding the state of the counter. The result will be compared with the second design, which is a Medvedev machine. Medvedev machines use flip flops for every output of the circuit. This means that an additional flip flop must be used for the carry output. It will be demonstrated that the speed of a circuit can be enhanced if additional hardware is used.

For the Moore machine it is expected that the delay time from clock edge up to the change of the value of the carry output is longer.

Functional simulation of both counters

It is shown in fig. 11.19 that the signal CY is decoded with an AND gate. The change of CY will occur after the outputs of the flip flops have changed. In contrast to this, in fig. 11.20 CY is the output of an additional flip flop. Therefore, all outputs will change synchronously.

In this case, a unit time delay of 1 ns is used. Functional simulation of the Moore machine shows a 2 ns time delay between the clock edge and the output change of the signal CY (fig. 11.21). The 2 ns are the sum of 1 ns for the output change of the flip flop and 1 ns for the gate delay of the AND gate. In contrast, only a 1 ns time delay can be observed in the Medvedev machine (fig. 11.22).
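The difference between the two variants can be sketched at clock-cycle level. The sketch assumes that CY is active when the counter state is 15 (AND of all four flip flop outputs), which the text does not state explicitly; the function and constant names are chosen for illustration.

```python
CLOCK_TO_Q_NS = 1  # unit delay of a flip flop
GATE_NS = 1        # unit delay of the decoding AND gate


def moore_counter(cycles):
    """CY is decoded from the state AFTER the flip flops have switched."""
    state, rows = 0, []
    for _ in range(cycles):
        state = (state + 1) % 16
        cy = int(state == 15)  # combinational decode behind the flip flops
        rows.append((state, cy))
    return rows


def medvedev_counter(cycles):
    """CY has its own flip flop; its next value is prepared BEFORE the clock edge."""
    state, cy, rows = 0, 0, []
    for _ in range(cycles):
        next_state = (state + 1) % 16
        next_cy = int(next_state == 15)
        state, cy = next_state, next_cy  # all outputs switch at the same clock edge
        rows.append((state, cy))
    return rows


moore_cy_delay = CLOCK_TO_Q_NS + GATE_NS  # flip flop, then AND gate
medvedev_cy_delay = CLOCK_TO_Q_NS         # flip flop only

print(moore_counter(16) == medvedev_counter(16))
print(moore_cy_delay, medvedev_cy_delay)
```

At cycle level both machines produce the same sequence; they differ only in how long after the clock edge CY becomes valid.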

Post-layout simulation: Moore machine

Post-layout simulation is called timing simulation in the XILINX Foundation software. The ‘minimum delay’ setting was selected for this simulation. The output buffer has a time delay of 8.5 ns.

For the Moore machine in fig. 11.23 there is a delay of 20.2 ns from the positive clock edge to CY. The Medvedev machine shows 16.5 ns (fig. 11.24).

What is the reason for the different results in functional simulation and in post-layout simulation? In post-layout simulation the extracted values for the capacitance and resistance of interconnections are taken into account, as well as the fan-out of the logic elements. Additionally, the delay times of buffers have to be taken into account for the real circuit in the IC because all signals of the counter are connected to the pins of the FPGA.

All timing values of delay paths are listed in detail for the simulation (list 11.3, list 11.4). For the Moore machine the delay time is 4.016 ns + 16.204 ns = 20.22 ns and for the Medvedev machine the delay time adds up to 4.023 ns + 12.486 ns = 16.509 ns.
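The sums can be checked with a few lines of Python. Only the partial delays quoted in the text are used; the variable names are chosen for illustration, and Decimal is used to avoid binary rounding noise in the printed values.

```python
from decimal import Decimal

# Partial delays in ns as quoted in the text (list 11.3 and list 11.4).
moore_parts = [Decimal("4.016"), Decimal("16.204")]
medvedev_parts = [Decimal("4.023"), Decimal("12.486")]

moore_total = sum(moore_parts)
medvedev_total = sum(medvedev_parts)
difference = moore_total - medvedev_total

print(f"Moore:    {moore_total} ns")
print(f"Medvedev: {medvedev_total} ns")
print(f"The Moore machine is slower by {difference} ns")
```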

Another model of the circuit is used to discuss delay times. The block diagram of fig. 11.25 depicts the inputs and outputs of this model in order to illustrate the path delays calculated in what follows. Every element that influences delay time is shown graphically in a diagram (fig. 11.26). The delay time for the path FFCLOCK → CLOCK is subdivided into three parts:

• the delay time from input pad to the output of the input buffer;

• the delay time of the connection between the output of the input buffer and the input of the output buffer;

• the delay time from input of the output buffer to the output pad.

In the simulation the delay time is calculated as 15 ns (list 11.5).

In detail, the delay time for the path FFCLOCK → CY consists of:

• the delay time from input pad to the output of the input buffer;

• the delay time of the connection between the output of the input buffer and the clock input of the flip flop;

• the delay time from clock edge to the output of the flip flop;

• the delay time of the connection between the output of the flip flop and the input of the LUT;

• the delay time of the LUT;

• the delay time of the connection between the output of the LUT and the input of the output buffer;

• the delay time from the input of the output buffer to the output pad.

In simulation a delay time of 20.2 ns is calculated. It has been shown in detail how delay times are simulated and how they are affected by the various elements of a circuit. The comparison of both counters in table 11.5 shows that the gate count in gate equivalents is 60 for the Medvedev machine and 54 for the Moore machine.

4-bit counter as ASIC

Both counters will be implemented in an ASIC in order to illustrate differences in speed. Additionally some properties of the technology used are demonstrated. Furthermore some features of the workbench are shown:

• Technology: Alcatel Mietec 0.5 μm;
• Library: MTC35000;
• Workbench: Mentor Design_Architect Version C1;
• Model Technology Modelsim Simulator Version 5.1g.

Logically the schematics of the ASIC implementation remain unchanged (fig. 11.27, fig. 11.28). In principle, simulation shows very similar results if the differences of the absolute values of delay times are not taken into account (fig. 11.29).

Signals of internal nodes are designated by Modelsim with special names. They are listed together with the signals in table 11.6 for the Medvedev machine to simplify the interpretation of the timing diagram.

There is no tool equivalent to the one used for the FPGA that lists the details of the delays for individual elements. Individual delay times can be extracted from the timing diagram, and the list must be created manually.

Note that no input and output buffers are included in this simulation. The delay time from the clock edge of FFCLOCK to output Q0 is 1,500 ps (fig. 11.29) in the Medvedev machine. No additional delay times are included in the simulation, and the delays of the input and output buffers, which are 425 ps and 420 ps respectively, are not included either. The path delay from the input pin of the clock to the output is therefore 425 ps + 1,500 ps + 420 ps = 2,345 ps.

This demonstrates the high speed of an ASIC.

If the simulation results of the Moore machine implemented in an ASIC are analyzed, the results are similar. In this case the path delay from the input pin of the clock to the output is 420 ps + 1,496 ps + 658 ps + 420 ps = 2,994 ps (fig. 11.30).
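The two ASIC path delays can be recomputed from the contributions given in the text. In this sketch the labels for the Medvedev path follow the text; the interpretation of the Moore contributions in the comments is an assumption, since the text lists only the numbers.

```python
# Delay contributions in ps, taken from the text.
medvedev_ps = {
    "input buffer": 425,
    "clock edge to Q0": 1500,
    "output buffer": 420,
}
# The text gives four values for the Moore path without naming them individually;
# presumably they are input buffer, flip flop, decoding logic, and output buffer.
moore_ps = [420, 1496, 658, 420]

medvedev_total = sum(medvedev_ps.values())
moore_total = sum(moore_ps)

print(f"Medvedev machine: {medvedev_total} ps")
print(f"Moore machine:    {moore_total} ps")
print(f"Difference:       {moore_total - medvedev_total} ps")
```

As in the FPGA implementation, the Moore machine has the longer path from the clock pin to the carry output.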
