Cyclone V devices contain clock networks organized into a hierarchical structure, including GCLKs that can drive throughout the device. The GCLKs serve as low-skew clock sources for functional blocks, such as adaptive logic modules (ALMs), digital signal processing (DSP) blocks, embedded memory, and PLLs. Cyclone V I/O elements (IOEs) and internal logic can also drive GCLKs to create internally generated global clocks and other high fan-out control signals, such as synchronous or asynchronous clear and clock enable signals. The device-wide clock region has the maximum insertion delay when compared with other clock regions, but allows the signal to reach every destination in the device. It is a good option for routing global reset and clear signals or routing clocks throughout the device. Regional clocks (RCLKs) cover single quadrants of the FPGA. The internal logic within a given quadrant can also drive RCLKs to create internally generated regional clocks and other high fan-out control signals.

Each MLAB supports a maximum of 640 bits of simple dual-port SRAM. To do this, each ALM LUT is re-purposed as RAM. You can configure each ALM in an MLAB as a 32 x 2 memory block; with ten ALMs per LAB, this results in a 32 x 20 simple dual-port SRAM block in one MLAB (32 x 20 = 640 bits).

A diagram summarizing the ALM, and more ALM detail. A specific ALM configuration dumped from the Quartus Chip Planner interface: detail example.

The ALM in shared arithmetic mode can implement a 3-input add. This mode configures the ALM with four 4-input LUTs. Each LUT computes either the sum of three inputs or the carry of three inputs. The output of the carry computation is fed to the next adder using a dedicated connection called the shared arithmetic chain.

The final carry-out signal is routed to an ALM, where it is fed to local, row, or column interconnects.
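The 3-input add in shared arithmetic mode can be illustrated with a small behavioral model. The sketch below is only an assumption-laden illustration in Python, not a description of the actual ALM wiring: the function names (`lut_sum`, `lut_carry`, `full_adder`, `add3`) are hypothetical, and each bit position is modeled as one sum LUT, one carry LUT, and one dedicated adder. The carry LUT output of one bit is passed to the adder of the next bit, which stands in for the shared arithmetic chain.

```python
def lut_sum(x, y, z):
    """LUT configured to produce the sum bit of three 1-bit inputs."""
    return x ^ y ^ z


def lut_carry(x, y, z):
    """LUT configured to produce the carry bit (majority) of three 1-bit inputs."""
    return (x & y) | (x & z) | (y & z)


def full_adder(a, b, carry_in):
    """Dedicated 1-bit full adder: returns (sum, carry_out)."""
    total = a + b + carry_in
    return total & 1, total >> 1


def add3(x, y, z, width):
    """Add three unsigned width-bit integers, one bit position at a time."""
    result = 0
    shared = 0  # value arriving on the (modeled) shared arithmetic chain
    carry = 0   # value arriving on the ordinary carry chain
    for i in range(width + 2):  # two extra bits hold the largest possible sum
        xi, yi, zi = (x >> i) & 1, (y >> i) & 1, (z >> i) & 1
        bit, carry = full_adder(lut_sum(xi, yi, zi), shared, carry)
        shared = lut_carry(xi, yi, zi)  # handed to the adder of the next bit
        result |= bit << i
    return result
```

For example, `add3(5, 6, 7, 3)` evaluates to 18. The model works because the bitwise sum plus twice the bitwise carry of three numbers equals their arithmetic sum, so a single ripple adder per bit is enough to finish the addition.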
Carry chains can begin in either the first ALM or the fifth ALM in a LAB. The carry chain provides a fast carry function between the dedicated adders in arithmetic or shared arithmetic mode. The two-bit carry select feature in Cyclone V devices halves the propagation delay of carry chains within the ALM.
The ALM in arithmetic mode uses two sets of two 4-input LUTs along with two dedicated full adders. The dedicated adders allow the LUTs to perform pre-adder logic; therefore, each adder can add the output of two 4-input functions.
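To make "each adder can add the output of two 4-input functions" concrete, here is a minimal behavioral sketch in Python. It is an illustration under stated assumptions, not the real ALM netlist: the helper names (`make_lut4`, `lut4`, `add_sub`) are hypothetical, and the add/subtract function is simply one plausible example of pre-adder logic. One LUT passes bit `a` through, the other computes `b XOR subtract`, and the dedicated adder sums the two LUT outputs with the carry from the previous bit.

```python
def make_lut4(func):
    """Build a 16-entry truth table for a 4-input Boolean function."""
    return [
        func(i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1) & 1
        for i in range(16)
    ]


def lut4(table, a, b, c, d):
    """Look up the output of a 4-input LUT."""
    return table[a | (b << 1) | (c << 2) | (d << 3)]


def full_adder(a, b, carry_in):
    """Dedicated 1-bit full adder: returns (sum, carry_out)."""
    total = a + b + carry_in
    return total & 1, total >> 1


# Pre-adder logic (an assumed example): inputs are (a, b, subtract, unused).
PASS_A = make_lut4(lambda a, b, sub, unused: a)
B_XOR_SUB = make_lut4(lambda a, b, sub, unused: b ^ sub)


def add_sub(x, y, subtract, width):
    """Compute (x + y) or (x - y) modulo 2**width, one adder per bit."""
    result = 0
    carry = subtract  # carry-in of 1 completes the two's-complement negation
    for i in range(width):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        left = lut4(PASS_A, xi, yi, subtract, 0)
        right = lut4(B_XOR_SUB, xi, yi, subtract, 0)
        bit, carry = full_adder(left, right, carry)
        result |= bit << i
    return result
```

With a 4-bit width, `add_sub(9, 3, 0, 4)` gives 12 and `add_sub(9, 3, 1, 4)` gives 6. The point of the sketch is that the LUTs absorb the conditional inversion, so no logic outside the adder path is needed.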
Each ALM can be configured in several ways. Normal mode allows two 4-input logic functions to be implemented in one ALM, or a single function of up to six inputs. Up to eight data inputs from the LAB local interconnect are inputs to the combinational logic. The ALM can support certain combinations of completely independent functions and various combinations of functions that have common inputs. There are also 4 bits of (optionally) registered output. In extended mode, if the 7-input function is unregistered, the unused eighth input is available for register packing. Functions that fit into the template, as shown in the figure, often appear in designs as “if-else” statements in Verilog HDL code.

Zooming in to one column of the LAB structure shows some of the interconnect structure which connects ALMs within a LAB and connections between LABs. Each LAB can drive 30 ALMs through fast-local and direct-link interconnects: ten ALMs are in any given LAB and ten ALMs are in each of the adjacent LABs. The local interconnect can drive ALMs in the same LAB using column and row interconnects and ALM outputs in the same LAB. Neighboring LABs, MLABs, M10K blocks, or digital signal processing (DSP) blocks from the left or right can also drive the LAB’s local interconnect using the direct link connection. Longer distance connections are handled by row/column interconnects, which trade off speed and distance.

In the chip planner view, pale blue marks unused ALMs and dark blue marks ALMs in use.
Another view of the fabric is a screen dump from the Quartus chip planner, which shows the column structure color-coded by block type. The column structure mixes LABs, DSP blocks, and M10K memory for fast, hopefully efficient, routing.