Digital Filter Design: Translation Source Text, Graduation Thesis (Edited Revision). Content summary:

2, and 3 full adders, the two independent 4-bit functions were used to generate the sum and carry outputs. We can easily include the AND gate in the CLB just by replacing, for example, X with (xi and ai) when configuring the CLB. The horizontal inputs (xi, ai) can use the horizontal longlines associated with each row to distribute the signal with a very short routing delay. Other interconnections can be made using the single-length or double-length lines via Programmable Interconnection Points (PIPs) or switching matrices.

Adder Implementation

In the XC4000 series, each CLB includes high-speed carry logic that can be activated by configuration. The two 4-input function generators may be configured as a 2-bit adder with built-in hidden carry that can be expanded to any length. The 16-bit adder in our MAC unit, which uses the dedicated carry logic, requires nine CLBs. The middle 14 bits use seven CLBs, one CLB is used for the MSB, and one is used for the LSB of the adder.

For each CLB in the middle section, the F function is used for the lower-order bit and the G function is used for the higher-order bit. Obviously, we need to use the G function for the LSB and the F function for the MSB.

In the case of the LSB CLB, the two values must be input on the G1 and G4 pins. The carry signal enters on the F1 pin, propagates through the G carry logic, and exits on the COUT pin. The F function of this CLB is not used by the adder and can be used for other purposes.

For the middle CLBs, the logic is configured to perform a 2-bit addition of A+B in both the F and G functions, with the lower-order A and B inputs on the F1 and F2 pins, and the higher-order A and B inputs on the G1 and G4 pins. The carry signal enters on the CIN pin, propagates through the F and G carry logic, and exits on the COUT pin.

For the MSB CLB, the two values must be input on the F1 and F2 pins. The carry signal enters on the CIN pin, propagates through the F carry logic, and exits on the COUT pin.
The G function generator of this CLB is used to access the carry-out signal or to calculate a two’s complement overflow.

The limitation of this built-in carry logic is that the carry-out (COUT) pin of a CLB can only be connected to the carry-in (CIN) pin of the CLB above or below it. Thus an adder using the fast carry logic can only be configured vertically in the array.

The dedicated carry circuitry greatly increases the efficiency and performance of adders. Conventional methods for improving performance, such as carry generate/propagate, are not useful even at the 16-bit level and are of marginal benefit at longer word lengths. In our case, the 16-bit adder has a combinatorial delay of only [...] ns.

MAC Implementation

We use the most significant 8 output bits of the multiplier as the input to the low-order bits of the adder. The 8-bit input of the adder is sign-extended and added to the previous accumulator output using two’s complement addition.

The basic structure of the MAC unit can use pipeline registers between the multiplier and the accumulator to increase the clock speed. The flip-flops in the CLBs are used as pipeline registers, and hence no additional CLBs are needed.

The layout of a single MAC unit on an XC4000-series part is shown in Figure 5.

The performance of the MAC unit with an 8-bit by 8-bit multiplier and a 16-bit accumulator is determined by the speed of the multiplier. The worst-case multiplier delay reported is approaching 100 ns. The MAC unit can thus support a clock speed better than 10 MHz. With the use of the horizontal longlines to distribute the critical-path signals, the speed can be further improved, although this may restrict the use of the MAC unit in various system configurations. The implementation of a MAC unit on an XC4000-series part requires 73 CLBs.

FILTERS

Filter Structures

The transfer function of an N-tap FIR filter is given by

H(z) = h(0) + h(1) z^-1 + ... + h(N-1) z^-(N-1)

where h(k) denotes the k-th filter coefficient. This structure can be realized in many ways, such as the canonical form, pipelined form, and inverted form, as depicted in Figure 6.
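To make the difference between the realizations concrete, the following is a minimal Python sketch of two of them. It assumes that the "canonical form" is the usual tapped-delay-line (direct) structure and that the "inverted form" is the transposed structure, in which every multiplier sees the current input and the registers sit between the adders; Figure 6 is not reproduced here, so this mapping is an assumption. The function names `fir_canonical` and `fir_inverted` are illustrative only.

```python
def fir_canonical(x, h):
    """Direct form: a delay line on the input, then one sum over all taps."""
    delay = [0] * len(h)                 # delay[k] holds x[n-k]
    y = []
    for sample in x:
        delay = [sample] + delay[:-1]    # shift the new sample into the line
        y.append(sum(c * d for c, d in zip(h, delay)))
    return y


def fir_inverted(x, h):
    """Transposed form (assumed to match the 'inverted form' in the text).

    Every coefficient multiplies the current input sample; partial sums
    move through registers placed between the adders.
    """
    n = len(h)
    state = [0] * n                      # state[k] feeds adder k; state[n-1] stays 0
    y = []
    for sample in x:
        products = [c * sample for c in h]
        y.append(products[0] + state[0])
        # Each register takes the next product plus the next register's old value.
        state = [products[k + 1] + state[k + 1] for k in range(n - 1)] + [0]
    return y
```

For `x = [1, 2, 3, 4, 5]` and `h = [1, -2, 3, 4]`, both functions return `[1, 0, 2, 8, 14]`. The structures compute the same output; in the transposed version only one multiplier and one adder lie between any two registers, regardless of the number of taps.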

High Performance Filters on FPGAs

The inverted form shown in Figure 6(c) is well-suited for achieving a high sampling rate even for higher-order filters. This is possible because the throughput does not depend strongly on the number of taps, due to extensive pipelining. The fact that the multipliers occupy a large area, however, might render the implementation of higher-order filters impractical.

It has been shown in [2] that a high-performance FIR filter with a substantial number of taps can be implemented on FPGAs by approximating the filter coefficients by a sum or difference of two power-of-two terms. Implementation of digital filters may be simplified by using only a limited number of power-of-two terms, so that only a small number of shift and add operations is required. A variety of techniques have been proposed [15, 16] to minimize the deterioration of the frequency response due to these constraints. Such coefficient optimization techniques yield performance sufficient for most practical applications.

Moderate Performance Filters on FPGAs

When the size of the chip is a constraint, the arithmetic resources need to be shared at the expense of speed. The structure shown in Figure 7 is suitable for sharing arithmetic resources. This is a multiply/accumulate (MAC) unit with four multipliers and an adder tree. The inputs and the corresponding filter coefficients are fed to the MAC unit as shown in Figure 7. With the insertion of pipeline registers, the clock speed is increased. The delay in the multiplier is greater than that in the adder, and hence the clock frequency depends on the delay in the multiplier. As there are four multipliers in this MAC unit, the summation of four terms is computed every clock cycle.
Hence a four-tap filter can be made to operate at a sampling rate equal to the clock rate, and an eight-tap filter at a sampling rate half that of the clock rate.

In general, if there are M multipliers in a chip and the delay in the multiplier is T sec, then a [...]
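The two stated cases (four taps at the clock rate, eight taps at half the clock rate, both with four multipliers) are consistent with a filter needing ceil(N/M) clock cycles per output sample when N taps share M multipliers. The source sentence is truncated, so this rule is an assumption inferred from those two cases, not the authors' stated formula. The sketch below, with hypothetical function names, computes the resulting sampling rate and emulates one output sample of the shared MAC unit.

```python
def cycles_per_sample(n_taps, n_multipliers):
    """Clock cycles per output sample; assumed to be ceil(N / M)."""
    return (n_taps + n_multipliers - 1) // n_multipliers


def sampling_rate_hz(clock_hz, n_taps, n_multipliers):
    """Sampling rate under the assumed ceil(N / M) rule."""
    return clock_hz / cycles_per_sample(n_taps, n_multipliers)


def shared_mac_output(window, coeffs, n_multipliers=4):
    """Compute one output sample by reusing n_multipliers multipliers.

    window[k] is the input sample x[n-k] and coeffs[k] is the matching
    coefficient. Each loop iteration stands for one clock cycle: up to
    n_multipliers products are formed, summed (the adder tree), and
    added to the accumulator. Returns (output, cycles_used).
    """
    acc = 0
    cycles = 0
    for start in range(0, len(coeffs), n_multipliers):
        stop = start + n_multipliers
        products = [c * s for c, s in zip(coeffs[start:stop], window[start:stop])]
        acc += sum(products)
        cycles += 1
    return acc, cycles
```

With an illustrative 10 MHz clock and four multipliers, `sampling_rate_hz(10_000_000, 4, 4)` gives 10 MHz and `sampling_rate_hz(10_000_000, 8, 4)` gives 5 MHz, matching the two cases in the text. Pipeline latency is ignored here, since it delays the output but does not change the throughput.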