
Frequency Scaled Router

guofengbupt edited this page May 27, 2016 · 6 revisions


Frequency Scaled Router

Frequency Scaled Router is an energy-efficient router that dynamically adapts its routing capacity to the real-time traffic load, achieving energy-proportional routing. A green prototype router that can operate at five different frequencies has been built on top of the NetFPGA reference router.

Project Summary

Status : In Progress

Version : 1.0

NetFPGA base source : 2.2.0

Download

  1. Download and install the project at the following link.

The Architecture of Reference Router and Frequency Scaled Router

Reference Router

The NetFPGA reference pipeline is designed in a modular style. Each stage of the pipeline is a separate module, which lets developers design and implement their own projects without starting from scratch. New functions such as energy-efficient mechanisms can be integrated by adding custom modules or by modifying the existing ones.

As shown in Figure 1, the NetFPGA reference pipeline consists of multiple modules, including eight receive queues, eight transmit queues and the user data path. Both the receive queues and the transmit queues are divided into two groups: four Media Access Control (MAC) interfaces and four Central Processing Unit (CPU) Direct Memory Access (DMA) interfaces. The receive queues accept packets from the I/O ports, namely the Ethernet ports and the Peripheral Component Interconnect (PCI) bus over DMA, while the transmit queues send packets out through those same I/O ports.

The pipeline in the user data path is 64 bits wide, and all internal module interfaces use a standard request-grant First-In-First-Out (FIFO) protocol. In the user data path, the Input Arbiter module decides which receive queue to service next, pulls a packet from that queue and hands it to the Output Port Lookup module. The Output Port Lookup module decides which port the packet leaves through. Once that decision is made, the packet is handed to the Output Queues module, which stores it in the corresponding output queue and sends it out when the corresponding transmit queue is ready to accept it for transmission.
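The flow through the user data path can be illustrated with a small software model. This is only a sketch: the round-robin arbitration, the dictionary-based lookup table and the packet fields (`dst`, `id`) are assumptions made for illustration and are not taken from the NetFPGA Verilog sources.

```python
from collections import deque

NUM_QUEUES = 8  # four MAC queues plus four CPU DMA queues, as described above


class InputArbiter:
    """Chooses the next receive queue to service.

    Round-robin order is an assumption; the text only says the arbiter
    decides which receive queue to service next.
    """

    def __init__(self, rx_queues):
        self.rx_queues = rx_queues
        self.next_index = 0

    def pull(self):
        """Return the next packet, or None if every receive queue is empty."""
        count = len(self.rx_queues)
        for offset in range(count):
            i = (self.next_index + offset) % count
            if self.rx_queues[i]:
                self.next_index = (i + 1) % count
                return self.rx_queues[i].popleft()
        return None


def output_port_lookup(packet, table):
    """Hypothetical lookup: map a destination key to an output port index."""
    return table.get(packet["dst"])


def run_pipeline(rx_queues, table):
    """Drain the receive queues into per-port output queues."""
    tx_queues = [deque() for _ in range(NUM_QUEUES)]
    arbiter = InputArbiter(rx_queues)
    while True:
        packet = arbiter.pull()
        if packet is None:
            break
        port = output_port_lookup(packet, table)
        if port is not None:  # packets with no matching entry are dropped
            tx_queues[port].append(packet)
    return tx_queues
```

With two packets waiting in receive queue 0 and one in receive queue 2, the arbiter in this model alternates between the queues rather than draining queue 0 first.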

Frequency Scaled Router

In the reference router (RR), the two SRAMs use the same clock as the core logic FPGA for writing and reading data, so that the transmit queues can transmit data with little or no delay between packets. A built-in register sets the operating frequency of the core logic FPGA to either 125 MHz or 62.5 MHz. However, because the SRAMs and the core logic FPGA are clocked synchronously, the RR cannot switch the operating frequency of the core logic FPGA on the fly. Toggling the operating frequency of the RR between 125 MHz and 62.5 MHz triggers a board reset, which restarts the SRAMs and the core logic FPGA hardware at the new common frequency. The board reset involves remirroring and reloading the MAC addresses, IP addresses, routing table and ARP table into the core logic FPGA hardware, which takes approximately 2 ms. All buffered packets are lost during the board reset.

To eliminate the board reset problem, a custom asynchronous FIFO (AFIFO) module is inserted between the SRAMs and the core logic FPGA. The AFIFO allows safe data exchange between the SRAM clock domain and the core logic FPGA clock domain, which are asynchronous to each other. As shown in Figure 2, the AFIFO module isolates the SRAMs and keeps them running at a constant 125 MHz, while the operating frequency of the core logic FPGA can be switched among the allowed frequencies in response to actual traffic processing needs.
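The page does not describe the internals of the AFIFO. A common technique in asynchronous FIFO designs in general, and only an assumption here, is to pass the read and write pointers between clock domains in Gray code, so that exactly one bit changes per increment and a pointer sampled mid-transition is off by at most one position. The sketch below shows that encoding; the function names are illustrative.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary counter value to its Gray-code representation."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a Gray-code value back to a plain binary counter value."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Count the bit positions in which two pointer values differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    depth = 16  # hypothetical FIFO depth, must be a power of two
    for ptr in range(depth):
        nxt = (ptr + 1) % depth
        print(ptr, format(bin_to_gray(ptr), "04b"),
              bits_changed(bin_to_gray(ptr), bin_to_gray(nxt)))
```

Running the loop prints each pointer value next to its Gray code, and the last column stays at 1 for every step, including the wrap-around from 15 back to 0.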

For the RR, power consumption at these two frequencies (125 MHz and 62.5 MHz) with different numbers of active ports (0 to 4), different aggregated traffic loads (400 Mb/s to 4 Gb/s) and different packet sizes (140 bytes, 531 bytes and 1470 bytes) is reported in our previous work. To quantify the power savings from energy-proportional techniques, the Frequency Scaled Router (FSR) was developed to provide the core logic FPGA with five operating frequency options (125 MHz, 62.5 MHz, 31.3 MHz, 15.6 MHz and 7.8 MHz). This is accomplished by integrating custom frequency division modules into the digital clock manager (DCM) available on the core logic Virtex II FPGA. The custom frequency division modules in the DCM provide advanced clocking capability: they generate new clock frequencies by dividing the source clock frequency by allowed divisors. Compared with the RR, the three additional operating frequencies of the FSR (31.3 MHz, 15.6 MHz and 7.8 MHz) are derived from the 125 MHz source clock by simultaneous frequency division with three custom divisors (4, 8 and 16).
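The five frequencies follow directly from dividing the 125 MHz source clock. The short sketch below reproduces them and also estimates an upper bound on the data-path bandwidth at each frequency. The bandwidth figure assumes that one 64-bit word moves through the user data path on every clock cycle, which is an idealisation and not a measured result; treating 62.5 MHz as a division by 2 is likewise just arithmetic, since the text only names the divisors 4, 8 and 16.

```python
SOURCE_CLOCK_MHZ = 125.0
DIVISORS = (1, 2, 4, 8, 16)   # 4, 8 and 16 are the custom divisors in the text
DATAPATH_WIDTH_BITS = 64      # width of the user data path


def core_frequencies_mhz():
    """Return the core logic frequencies obtained from each divisor."""
    return [SOURCE_CLOCK_MHZ / d for d in DIVISORS]


def raw_datapath_gbps(freq_mhz):
    """Idealised bandwidth: one 64-bit word per clock cycle (assumption)."""
    return freq_mhz * DATAPATH_WIDTH_BITS / 1000.0


if __name__ == "__main__":
    for f in core_frequencies_mhz():
        print(f"{f:8.4f} MHz -> at most {raw_datapath_gbps(f):.1f} Gb/s")
```

The exact values are 125, 62.5, 31.25, 15.625 and 7.8125 MHz; the figures 31.3, 15.6 and 7.8 MHz quoted above are these values rounded to one decimal place.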

Frequency Control Policies for Dynamic Frequency Scaling

Dynamic frequency scaling on the FSR is implemented by reading and writing the memory-mapped I/O registers introduced in the FSR, which adaptively control the operating frequency in response to the instantaneous traffic load. Since release 2.0.0 of the NetFPGA package (NFP), the reference pipeline has included a register system. The registers in this system report status information and set the control signals for each individual module.
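The register-based control can be pictured with a small in-memory model. Everything specific in this sketch is hypothetical: the register address, the frequency encoding and the helper names are placeholders chosen for illustration, and the `RegisterFile` class stands in for real memory-mapped access to the board.

```python
class RegisterFile:
    """In-memory stand-in for 32-bit memory-mapped registers."""

    def __init__(self):
        self._regs = {}

    def write(self, addr, value):
        self._regs[addr] = value & 0xFFFFFFFF  # registers are 32 bits wide here

    def read(self, addr):
        return self._regs.get(addr, 0)  # unwritten registers read as zero


# Hypothetical address and encoding, not taken from the FSR sources.
FREQ_SELECT_REG = 0x2000100
FREQ_CODES = {125.0: 0, 62.5: 1, 31.25: 2, 15.625: 3, 7.8125: 4}


def set_core_frequency(regs, mhz):
    """Request a core logic frequency by writing its code to the register."""
    if mhz not in FREQ_CODES:
        raise ValueError(f"unsupported frequency: {mhz} MHz")
    regs.write(FREQ_SELECT_REG, FREQ_CODES[mhz])


def get_core_frequency(regs):
    """Read back the currently selected core logic frequency in MHz."""
    code = regs.read(FREQ_SELECT_REG)
    for mhz, candidate in FREQ_CODES.items():
        if candidate == code:
            return mhz
    raise ValueError(f"unknown frequency code: {code}")
```

In this model a control program would call `set_core_frequency(regs, 31.25)` and later confirm the setting with `get_core_frequency(regs)`; on real hardware the two helpers would wrap whatever register access mechanism the NetFPGA package provides.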

As shown in Figure 3, the frequency control policies of the FSR are built on statistics monitoring and preset thresholds. The statistics monitoring reports the current core logic frequency, the total number of bytes received from all receive queues (byte_counter_received) and the total number of bytes dropped from all output queues (byte_counter_dropped) within a given sampling period. The choice of sampling period is critical, because it directly determines the delay between the request for a new frequency and the actual frequency transition. Experimental results indicate that 10 ms is a reasonable sampling period, consistent with the interval used in implementations of Adaptive Link Rate (ALR). The preset thresholds divide the routing capacity into five grades, and the grade is selected according to the traffic load observed in the previous 10 ms sampling period. Different frequency control policies may use different rules for setting the thresholds.
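A threshold-based policy of this kind can be sketched as follows. The counter names and the 10 ms sampling period come from the description above; the threshold values and the rule that any dropped bytes push the router up one grade are hypothetical and serve only to show the shape of such a policy.

```python
SAMPLING_PERIOD_MS = 10
FREQUENCIES_MHZ = (7.8125, 15.625, 31.25, 62.5, 125.0)  # five grades, low to high

# Hypothetical thresholds in Mb/s separating the five grades.
THRESHOLDS_MBPS = (250, 500, 1000, 2000)


def offered_load_mbps(byte_counter_received):
    """Convert bytes seen in one sampling period to a rate in Mb/s."""
    return byte_counter_received * 8 / (SAMPLING_PERIOD_MS * 1000)


def select_frequency(byte_counter_received, byte_counter_dropped=0):
    """Pick the core frequency for the next period from the last period's counters."""
    load = offered_load_mbps(byte_counter_received)
    # The grade is the number of thresholds the load exceeds (0 to 4).
    grade = sum(load > t for t in THRESHOLDS_MBPS)
    # Assumed rule: dropped bytes mean the current grade is too low.
    if byte_counter_dropped > 0:
        grade = min(grade + 1, len(FREQUENCIES_MHZ) - 1)
    return FREQUENCIES_MHZ[grade]


if __name__ == "__main__":
    for received in (125_000, 1_250_000, 5_000_000):
        print(received, offered_load_mbps(received), select_frequency(received))
```

For example, 1,250,000 bytes received in one 10 ms period corresponds to 1000 Mb/s, which exceeds the first two thresholds of this sketch and therefore selects the third grade, 31.25 MHz.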
