Gate Level Simulation - Various Issues

·@azarmadr3·
The typical RTL-to-gate-level-netlist flow is
![gls_flow_chart.png](https://images.ecency.com/DQmVak1ayUgQ1UiucFkDZ4RJtLhuRfMZQQxfp77dSYTvhBj/gls_flow_chart.png)

GLS verification:-
- Verifies the netlist, which is the design expressed as gates
- The RTL TB setup is reused, with modifications
- Run on
  - DC - Synthesized netlist *Design Compiler*
  - PD - Place and Route netlist
    - the design representation closest to the actual geometries we manufacture

Why do GLS?

Verification | GLS
---|---
__RTL__ | Clock and reset circuit bring-up<br>PLL clashes with digital logic once timing and delays are included<br>Realistic power analysis with delays<br>CDC and synchronization checks
__LEC__ | Done to check RTL simulation vs synthesis equivalence<br>DFT verification, since scan chains are inserted after RTL synthesis<br>[Clock-tree synthesis](https://www.vlsiguide.com/2018/07/clock-tree-synthesis-cts.html)<br>Switching factor for power estimation<br>Analyzing X-state pessimism or optimism, in RTL or GLS
__STA__ | Dynamic timing behaviour of chip logic<br>STA cannot identify asynchronous interfaces<br>Static timing constraint requirements, like false and multi-cycle paths<br>Verifying system initialization and that the reset sequence is correct

GLS challenges:-
- Long compilation time with growing netlists and SDF annotations
- Shrinking technology nodes:
  * 90nm -> 65nm -> 45nm
  * Different nodes often imply different circuit generations and architectures
  * Generally, a smaller technology node means a smaller feature size, producing smaller transistors which are both faster and more power-efficient
- Complexity of the design is reflected in the SDF timing annotations
- Large compile DB sizes, which translate to
  * large machine memory requirements
  * frequent simulation hangs
- Longer simulation runtimes, extending up to *3-4 days or even a week*
- GLS debug is cumbersome and time consuming
  * false failures due to `x-propagation`

GLS setup requirements:-
- Netlist design
- Technology library models *gates, flops, hard macros (IC in block form)*
- SDF file for timing annotations
- Non-resettable flops list *from STA engineer*
- Special models for Analog/Mixed Signals like __PLL__ *Phase-locked loop*

GLS steps:-
- `GATESIM` switch to instantiate the netlist DUT
- SV environment for GLS
  - add `GATESIM` wherever we refer to the RTL hierarchy
    - because the hierarchy is optimized by the synthesis tool
  - Testcases, assertions and memory paths differ from RTL
- Changing paths for preloading memories (RAM)
  - Compiled C code, *in .hex form*, should be loaded into the memory from which the processor starts fetching
    - On-chip RAM, DDR RAM, Code-RAM, Data-RAM
    - the hierarchy of these memories may change during synthesis
- SDF annotations
- Adding a force file
  - `.do` files prepared by the GLS engineer from the list of non-resettable flops *supplied by STA engineers*
  - contains the commands to initialize non-resettable flops and memory outputs
  - force `-freeze` and `-deposit`
    - `freeze` fixes the value of a register throughout the simulation
    - `deposit` only initializes it, allowing it to be driven later by the design
  - can be used to address simulation hangs by forcing some flop values
    - `x` propagation from memory models can be identified by forcing the `NO_CORRUPT` bit on the memory
    - `clock` enables can also be forced
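
The force file step can be scripted. Below is a minimal Python sketch that turns a list of non-resettable flops into `force` lines for a `.do` file. It assumes a ModelSim/Questa-style `force -freeze | -deposit <path> <value>` command; the hierarchical paths and initial values are hypothetical and would come from the STA list in practice.

```python
def make_force_lines(flops, mode="deposit"):
    """Return one force command per non-resettable flop.

    flops: mapping of hierarchical path -> initial value (0 or 1).
    mode:  "deposit" initializes the flop and lets the design drive it later;
           "freeze" holds the value for the whole simulation.
    """
    if mode not in ("deposit", "freeze"):
        raise ValueError(f"unsupported force mode: {mode}")
    return [f"force -{mode} {path} {value}" for path, value in sorted(flops.items())]


# Hypothetical flop paths, made up for illustration only.
non_resettable = {
    "/tb/dut/u_core/sync_reg_0_/Q": 0,
    "/tb/dut/u_clkdiv/div_reg/Q": 1,
}

do_file = "\n".join(make_force_lines(non_resettable))
print(do_file)
```

Using `deposit` as the default keeps the flops free to toggle after initialization; `freeze` is better kept for debug, since it hides whatever the design would have driven.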

GLS Testcase Selection:-
- we don't run the entire RTL regression, *because*
  - GLS is time consuming
- must involve boot-up or initialization
- at least one testcase for each block in the design
- clock checking and source switching
- clock frequency scaling
- asynchronous paths in the design
- entry/exit checks for the different modes of the design
- dedicated tests for timing exceptions in the STA
- patterns covering multi-clock-domain paths in the design

For example

SoC Verification | GLS Verification
---|---
CPUs <br> Power Manager <br> Peripherals <br> Buses <br> Multimedia subsystem, etc. | Booting <br> Cache access with MMU enabled <br> Interrupt handling (Generic Interrupt Controller) <br> Power collapse entry and exit
The CPUs alone might have 200-300 testcases in total | The number of tests shrinks to 20-30
Testcases target different features of the CPUs | Peripherals may have one testcase each *like UART, I2C, SPI*

GLS types:-
- Zero / Unit delay simulation *no Annotations*
  - early clean-up of netlist
  - can be done in the early design cycle
  - before SDF is ready or annotations added
  - checks reset sequence and system boot
  - checks if there are issues due to scan insertion
  - can identify zero delay loops, and thus race conditions, in the design
  - faster than SDF sim
- Timing simulation with __SDF Annotations__ *behaviour close to real silicon (Si)*
  - SDF provides timing delay values for all cells and interconnects
    * min: best-case delay, for ideal Process, Voltage, Temperature *(PVT)* conditions
      - for hold timing checks
    * max: worst-case timing delay
      - for setup timing checks
    * typical: between min and max
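
To make the min/typ/max choice concrete, here is a minimal Python sketch that picks one value out of an SDF-style delay triple such as `(0.10:0.15:0.22)`. The helper name `pick_delay` and the sample numbers are assumptions for illustration; a real SDF file has many more constructs than this handles.

```python
import re

# Matches a "(min:typ:max)" triple with all three values present.
TRIPLE = re.compile(r"\(\s*([-\d.]+)\s*:\s*([-\d.]+)\s*:\s*([-\d.]+)\s*\)")


def pick_delay(triple, corner):
    """Return the min, typ or max value of a delay triple as a float."""
    match = TRIPLE.fullmatch(triple.strip())
    if match is None:
        raise ValueError(f"not a min:typ:max triple: {triple!r}")
    mn, typ, mx = (float(g) for g in match.groups())
    return {"min": mn, "typ": typ, "max": mx}[corner]


cell_delay = "(0.10:0.15:0.22)"  # made-up numbers
print("hold run uses :", pick_delay(cell_delay, "min"))
print("setup run uses:", pick_delay(cell_delay, "max"))
```

A hold-focused run would annotate the `min` values and a setup-focused run the `max` values, matching the bullets above.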

## GLS Issues

### TB compile time setup issues
- caused by design optimization in the synthesis tool
- names of hierarchies and signals may change
  - `genvar`-generated hierarchy names
  - bus signals getting bit-blasted
  - signals changed or removed by optimization
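
A quick way to find TB references that no longer compile is to check them against the netlist names. The Python sketch below assumes that a bus bit `name[3]` is bit-blasted to `name_3_`; that naming rule is only an assumption (it depends on the tool and its settings), and the signal names are hypothetical.

```python
import re


def resolve(rtl_ref, netlist_names):
    """Map an RTL reference to a netlist name, or return None if it is gone."""
    if rtl_ref in netlist_names:
        return rtl_ref
    # Try the assumed bit-blasted form: name[3] -> name_3_
    match = re.fullmatch(r"(.+)\[(\d+)\]", rtl_ref)
    if match:
        blasted = f"{match.group(1)}_{match.group(2)}_"
        if blasted in netlist_names:
            return blasted
    return None


# Hypothetical names for illustration.
netlist = {"u_core.data_3_", "u_core.valid"}
for ref in ["u_core.data[3]", "u_core.valid", "u_core.tmp_sum"]:
    print(ref, "->", resolve(ref, netlist))
```

References that resolve to `None` are the ones optimized away or renamed in a way the rule does not cover, so they need a manual look.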

### X-propagation
- X optimism refers to how simulations may incorrectly exhibit determinate behaviour even when inputs to logic are X and have an indeterminate value.
  - It can be dangerous and can mask real RTL bugs
  - transitions from `x` or `z` (for example `x` to `1`) count as a `posedge` or `negedge`
- `x` pessimism: the simulation drives `'z` or `'x` even in situations where the result is determinate despite an `x` input
  - e.g. a mux with sel as `x` while both data inputs are equal
- Causes:-
  - Library models:-
    - Each gate has a corresponding library model containing its functionality
    - models might have `specify` blocks which add delays, or use `#delay`
    - they can prevent an `x` from being updated or initialized in a given timing event
  - Design
    - non-resettable flops can't break out of `x-propagation`
    - clock divider example: without reset it can't break out of `x`, and with reset it can't generate the clock
  - Synthesis
    - resets might be obscured by logic
    - incomplete logic optimization interferes with reset
  - SDF
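
The mux case can be reproduced with a tiny three-valued model. This Python sketch is only an illustration of the semantics (values are the strings `"0"`, `"1"`, `"x"`; the function names are made up): the gate-level AND-OR mux shows pessimism, and the RTL-style `if` shows optimism.

```python
def v_not(a):
    return {"0": "1", "1": "0"}.get(a, "x")


def v_and(a, b):
    if a == "0" or b == "0":
        return "0"          # a controlling 0 wins even over x
    if a == "1" and b == "1":
        return "1"
    return "x"


def v_or(a, b):
    if a == "1" or b == "1":
        return "1"          # a controlling 1 wins even over x
    if a == "0" and b == "0":
        return "0"
    return "x"


def mux_gates(sel, a, b):
    # Gate-level AND-OR mux: (a & ~sel) | (b & sel)
    return v_or(v_and(a, v_not(sel)), v_and(b, sel))


def mux_rtl_if(sel, a, b):
    # RTL `if (sel) out = b; else out = a;` : an x condition takes the else branch
    return b if sel == "1" else a


print(mux_gates("x", "1", "1"))   # pessimism: both inputs are 1, yet the gates give x
print(mux_rtl_if("x", "0", "1"))  # optimism: result really unknown, yet RTL gives 0
```

With `sel = x` and both data inputs at `1`, real hardware would output `1`, but each AND gate sees `1 & x = x`, so the netlist simulation reports `x`.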

### Zero delay simulation issues
#### Zero delay loops
- Combinational gate loops formed in netlist
  - Adding small delay will break these loops
  - tool dependent command-line options to generate logs
    * `+vcs+loopreport=100000`
    * `+autofindloop` in Questa
- [Scheduling semantics](https://verificationguide.com/systemverilog/systemverilog-scheduling-semantics/)
  - circular logic with continuous assignments, either with `forever` or `always_*` blocks or `assign` statements
  - multi-bit circular logic can be evaded during synthesis
  - timing (SDF) simulations should not show zero delay loops, because the netlist is then a set of gates with the timing of each gate specified *(needs checking)*
    * Ex:- circular logic with `forever` loop
    * Ex:- circular logic with `always_comb`
    * Ex:- circular logic in netlist gates
- Fixing requires adding delays in gate paths
  - can be done using SDF annotations
  - `.tfile` will add delays without changing design code
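
Outside the simulator, a combinational loop is just a cycle in the gate connectivity graph. The Python sketch below finds one such cycle with a depth-first search. The netlist is represented as a hypothetical dict from gate name to the gates it drives, with flops left out because they break combinational paths. It is recursive, so it is only a sketch for small graphs.

```python
def find_comb_loop(fanout):
    """Return one combinational loop as a list of gate names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    color = {gate: WHITE for gate in fanout}
    path = []

    def visit(gate):
        color[gate] = GREY
        path.append(gate)
        for nxt in fanout.get(gate, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:         # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                loop = visit(nxt)
                if loop:
                    return loop
        path.pop()
        color[gate] = BLACK
        return None

    for gate in list(fanout):
        if color[gate] == WHITE:
            loop = visit(gate)
            if loop:
                return loop
    return None


# Hypothetical gates: u_nand and u_inv feed each other with no flop in between.
netlist = {"u_nand": ["u_inv"], "u_inv": ["u_nand"], "u_buf": ["u_nand"]}
print(find_comb_loop(netlist))
```

The gates on the reported path are the candidates for the small added delay mentioned above.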

### Functional bugs with Netlist
#### Multi driven signals
- FV (Formal Verification) will catch it
#### Glitches in simulation
- generally in clock
#### Unconnected ports
- `'hz` *(default value)* causing `x`-propagation
- caught in FV
#### Simulation hang issues
#### Additional functional issues, *found during debug*

### Simulation hang due to TB/model issues
- Memory data getting corrupted - *initialization*
- Memory data out signal `X`
- `x`-propagation from TB
- Non-resettable flops causing x-propagation
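
The non-resettable flop case is easy to see on a divide-by-2 stage, where the flop's D input is its own inverted output. The Python sketch below is a toy model (string values `"0"`, `"1"`, `"x"`, made-up function names): started from `x` the divider never recovers, while a deposited initial value lets it toggle.

```python
def v_not(a):
    # Inverting an unknown value gives an unknown value.
    return {"0": "1", "1": "0"}.get(a, "x")


def run_divider(q_init, cycles):
    """Simulate a divide-by-2 flop (D = ~Q) for a number of clock edges."""
    q = q_init
    trace = []
    for _ in range(cycles):
        q = v_not(q)        # value sampled at each clock edge
        trace.append(q)
    return "".join(trace)


print(run_divider("x", 4))  # no reset, power-up value unknown
print(run_divider("0", 4))  # same flop after depositing 0 on it
```

Anything clocked by the stuck divider also stays at `x`, which is how a single uninitialized flop can turn into a simulation hang.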

### SDF simulation issues
- discrepancies between the SDF file and design changes should trigger SDF file regeneration
- cell and net names in the netlist and the SDF must match
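
A name mismatch can be spotted before launching a long run. The Python sketch below pulls the `(INSTANCE ...)` names out of SDF text and compares them with a set of netlist instance names. The sample SDF fragment and instance names are made up, the hierarchy separator is assumed to be the same on both sides, and real flows would rely on the simulator's annotation log instead.

```python
import re


def sdf_instances(sdf_text):
    """Collect every non-empty INSTANCE name found in the SDF text."""
    return set(re.findall(r"\(INSTANCE\s+([^\s)]+)\)", sdf_text))


def compare(sdf_text, netlist_instances):
    """Report instances present on only one side."""
    sdf = sdf_instances(sdf_text)
    return {
        "missing_in_netlist": sorted(sdf - netlist_instances),
        "missing_in_sdf": sorted(netlist_instances - sdf),
    }


# Hypothetical SDF fragment and netlist instance list.
sdf_text = """
(CELL (CELLTYPE "INVX1") (INSTANCE u_core.u_inv))
(CELL (CELLTYPE "DFFX1") (INSTANCE u_core.old_reg))
"""
netlist_instances = {"u_core.u_inv", "u_core.new_reg"}
print(compare(sdf_text, netlist_instances))
```

Any entry in either list suggests the SDF was generated from a different netlist revision and should be regenerated.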

### Wishbone debugging:-
- `wb_ack_o` is used to validate the transaction.
- `udp_dff` was the cause of the problem. `udp` is User Defined Primitive