Data Path - Biological Sequence Alignment

February 24, 2012

The control logic implementing the sequence alignment algorithm is described here.

DATAPATH

It is assumed that the scores of alignment are already present in a scoring matrix. The scoring matrix is implemented using a RAM memory. The logic for calculating scores is not implemented. It can be interfaced with the memory so as to write into it. It is designed to give the values of f(ai , bj) for a particular i (row) and all j (columns). In a single read operation it ouputs 8 bytes of data.

The scores along with the previous table entries are used to calculate all the w[j] values in parallel. This is done by the w_calculation module. This module receives 8 bytes from the scoring matrices corresponding to f(ai , bj) for the respective row and 8 bytes from the previously calculated row from the table namely T(i-1,j) for all values of j. It calculates the following.

w[j] = max ((T[i-1,j] + f(ai , '-'), (T[i-1,j-1] + f(ai , bj))
z[j] = w[j] – y[j] for all 8 values of j. f(ai , '-') is assumed to be constant '-1' and y[j] represents the zeroth row of the table and is assumed to be constant (-1 to -8). The output of the w_calculation module is 8 bytes of z which is used to calculate the table entries for the current row.

(aargh clarity vs boundaries - clarity wins)

The 8 bytes of z[j] are fed into 8 nodes of parallel prefix module. In 3 clock cycles the 8 bytes of x[j] are calculated in parallel. The operation performed is
x[j] = max((w[j]+gj), (x[j-1])

Once Z[j] values are known, the values of z[j] are added with y[j] which is stored in a separate RAM. 8 signed adders are used to perform the operation T[i,j] = x[j] – g[j] in parallel.

The result is stored into a Table Matrix which contains the optimized scores of alignment excluding the zeroth row and column. The table matrix is implemented as a RAM. It stores 8 rows each consisting of 8 bytes each corresponding to the optimal alignment. Trace back can be done by reading through the table. Trace back has not been implemented.

The final block diagram of the digital circuit is given here.

Search This Blog

Poly SPARC

Data Path - Biological Sequence Alignment

Comments

Post a Comment

Popular posts from this blog

Setting up atalanta ATPG to work on Linux

Sparse matrix - 3 tuple representation

Generating 16k ROM using Xilinx IP COREGEN and simulating it in modelsim