Sunday, March 27, 2011

OPENPICIDE (simulator for PicoBlaze): beginner's tutorial

(Continuation of the tutorial which describes the setup of the software in Ubuntu)

How to open a new project
1. Project -> new project
2. In the window that pops up
version -> fill in as per requirements
processor -> Xilinx PicoBlaze family
Processor (tab) -> PicoBlaze 3 (Note : memory bank size = 1024 instructions; the rest are defaults)
compiler (tab) -> Entity name : prog_rom (name of the required output vhd/v file)
VHDL template file : /location_of_kcpsm3/Assembler/rom_form.vhd
Verilog template file : /location_of_kcpsm3/Assembler/rom_form.v
sources -> if you have the assembly program ready, add it here
application -> editor options set as per user requirements

IMPORTANT NOTE while writing the program
1. Certain syntaxes are different from KCPSM; I need to write a script to automate the conversion.
2. The "constants" syntax is different - I have yet to find the new syntax.
(Kindly let me know if you find any other syntax differences.)

To check syntax: in the openPICIDE window choose
picoblaze -> check syntax (or press F7)

To simulate: in the openPICIDE window choose
picoblaze -> Enable Simulator

To compile: in the openPICIDE window choose
picoblaze -> compile to VHDL / compile to Verilog

BUGS IN SIMULATOR : the window size is too big.

for a sample test program CLICK HERE

Abishek Ramdas,
NYU Poly

Monday, March 21, 2011

Transition Fault Testing using pattern shifting - The sequential Pseudocode

Program - Transition fault testing using pattern shifting
The serial version of the program is described here in detail.
This is the program I had written for testing transition delay faults. I am trying to parallelize this program.

Both slow-to-rise and slow-to-fall patterns are to be tested.
To test a slow-to-rise fault, apply an sa1 pattern (for the modified netlist) followed by an sa0 pattern (for the unmodified netlist).
To test a slow-to-fall fault, apply an sa0 pattern (for the modified netlist) followed by an sa1 pattern (for the unmodified netlist).

                 pattern 1   pattern 2
 Slow to rise    sa1         sa0
 Slow to fall    sa0         sa1

pattern 1 - patterns for the modified netlist
pattern 2 - patterns for the unmodified netlist
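Before the algorithm below, it helps to pin down "compatible" and "merge". This is a minimal sketch assuming the usual ATPG convention of patterns as strings over {0, 1, x}, where x is a don't-care; the function names are mine, not from the program:

```python
def compatible(p, q):
    """Two patterns conflict only where one has '0' and the other '1'."""
    return all(a == b or a == 'x' or b == 'x' for a, b in zip(p, q))

def merge(p, q):
    """Keep the specified value at each position; 'x' survives only if both are 'x'."""
    return ''.join(a if a != 'x' else b for a, b in zip(p, q))
```

For example, "1x0" and "100" are compatible and merge to "100", while "1x0" and "0x0" conflict in the first bit.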

Testing for faults
A. Fault pattern generation
1. For each wire and fault type - slow to rise/fall (s2r, s2f):
   1. Generate the pattern1s and pattern2s.
   2. For each pattern1:
          for each pattern2:
              shift pattern1 by 1 bit and compare it with that pattern2;
              if the shifted pattern is compatible with a pattern2, merge them together - the slow-to-rise/fall fault is detected;
          if none of the pattern2s match pattern1, get the next pattern1 and repeat the process.
   3. If all pattern1s are exhausted and the fault is still not detected, mark the wire as undetected.
   4. Get the next wire and repeat the process.
B. Compaction
1. At the end you have, in a file, the patterns for the wires whose s2r and s2f faults are detected; compact them to reduce the number of test patterns:
   Read all the patterns in the output file into an array.
   For each pattern:
       while later patterns exist:
           compare the pattern with the patterns succeeding it;
           if compatible, replace the pattern with the merged pattern and continue comparing.

COMPARISON OF PATTERNS - shift comparison
1. Two sets of patterns.
2. Get a pattern from set 1.
3. Shift the pattern by 1 bit.
4. Compare the shifted pattern with all the patterns in set 2.
5. When a match is found, fetch the next pattern from set 1 and repeat the steps.
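The five steps above can be sketched as follows, again treating patterns as strings over {0, 1, x}. The shift direction and the 'x' fill for the vacated bit are my assumptions - in the real program they depend on how the two vectors line up in the scan order:

```python
def compatible(p, q):
    """Patterns conflict only where one has '0' and the other '1'."""
    return all(a == b or a == 'x' or b == 'x' for a, b in zip(p, q))

def shift_compare(set1, set2):
    """For each pattern in set 1: shift it by one bit, then look for a
    compatible partner in set 2. Returns the (original, partner) pairs."""
    matches = []
    for p1 in set1:
        shifted = 'x' + p1[:-1]   # assumed shift direction and 'x' fill
        for p2 in set2:
            if compatible(shifted, p2):
                matches.append((p1, p2))
                break             # match found - next pattern from set 1
    return matches
```

With this convention, shifting "10" gives "x1", which matches a pattern2 of "x1" but conflicts with "00".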

Comparison of patterns - compaction
1. Two sets of patterns.
2. Get a pattern from set 1.
3. Compare it with a pattern from set 2.
4. If compatible, merge and continue the comparison.

The program flow
1. Calls perl to create a file "circuit_name.sa0.flt" that has all the wires in a standard stuck at 0 representation.
2. For each wire
1. Read a wire /0 from the above created file and create a wire /1 representation for that wire.
2. Modify the netlist - make the wire under consideration an output and add it in the end of the netlist - we have the modified netlist
3. Test for slow to rise fault - put the tested faults into a file (circuit.ttest)
4. Test for slow to fall fault - put the tested faults into a file (circuit.ttest)
5. Compact the faults present in circuit.ttest and put the resulting compacted patterns into the same file.

Test for slow to rise fault
1. open fault file and put "wire /1" into the fault file and close - (test for wire sa1)
2. call atalanta for the fault file - generates pattern1 faults into the default output file (circuit.test).
3. call file_parser().  - parse the default output file of atalanta (circuit.test) to read the pattern1s into a 2D array.
1. If fault patterns (pattern1s) are not generated, the fault is undetected, proceed to test for slow to fall fault for the wire
4. if pattern1s are generated, generate pattern2s
open fault file and put "wire /0" into the fault file and close (test for wire sa0)
call atalanta to generate pattern2s
call file_parser() to read the pattern2s into another 2D array.
5. Pattern1 and Pattern2s are available in 2D arrays.
6. For each pattern 1
For each pattern 2
call compare_test() in mode 0
if compatible, get the merged pattern and break - fault is detected
else continue.
7. If fault is detected, put the wire and the merged pattern into a file (circuit.ttest) and proceed to test for slow to fall fault
8. If fault is not detected then the fault is undetected, proceed to test for slow to fall fault

Test for slow to fall fault 
1. open fault file and put "wire /0" into the fault file and close - (test for wire sa0)
2. call atalanta for the fault file - generates pattern1 faults into the default output file (circuit.test).
3. call file_parser().  - parse the default output file of atalanta (circuit.test) to read the pattern1s into a 2D array.
1. If fault patterns (pattern1s) are not generated, the fault is undetected; proceed to the next wire
4. if pattern1s are generated, generate pattern2s
open fault file and put "wire /1" into the fault file and close (test for wire sa1)
call atalanta to generate pattern2s
call file_parser() to read the pattern2s into another 2D array.
5. Pattern1 and Pattern2s are available in 2D arrays.
6. For each pattern 1
For each pattern 2
call compare_test() in mode 0
if compatible, get the merged pattern and break - fault is detected
else continue.
7. If the fault is detected, put the wire and the merged pattern into a file (circuit.ttest) and proceed to compaction
8. If the fault is not detected then the fault is undetected; proceed to compaction.

Compact the faults present in the output fault file (circuit.ttest)
1. call file_parser_ttest(). To read all the patterns into a 2D array
2. For each pattern (patternA) in the array:
       while later patterns (patternB) exist:
           compare patternA with the patternB;
           if compatible, merge them together and store the result in patternA;
           else continue the comparison with the next patternB;
       when the patternBs are exhausted, get the next patternA and repeat the compaction.
3. When all patternAs are compacted, write the patterns into the output file (circuit.ttest).
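This greedy absorb-and-drop pass can be sketched as below, assuming the {0, 1, x} pattern convention (x = don't-care); this is my reading of the description, not the program's actual code:

```python
def compatible(p, q):
    """Patterns conflict only where one has '0' and the other '1'."""
    return all(a == b or a == 'x' or b == 'x' for a, b in zip(p, q))

def merge(p, q):
    """Keep the specified value at each position."""
    return ''.join(a if a != 'x' else b for a, b in zip(p, q))

def compact(patterns):
    """Steps 2-3 above: each patternA absorbs every later pattern it is
    compatible with; absorbed patterns are dropped from the list."""
    result = []
    remaining = list(patterns)
    while remaining:
        current = remaining.pop(0)        # next patternA
        survivors = []
        for p in remaining:               # the patternBs
            if compatible(current, p):
                current = merge(current, p)
            else:
                survivors.append(p)
        result.append(current)
        remaining = survivors
    return result
```

For instance, ["1x", "10", "0x"] compacts to ["10", "0x"]: the first two merge, while "0x" conflicts with them and survives on its own.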

1. A call to a function written in perl. It reads the netlist, extracts the wires, appends a /0 to each, and writes them into a fault file.

2. func:
void atalanta_call(char *)
Desc: Forks a new process which calls the atalanta tool. The parent waits till the child completes. Atalanta generates a file which holds the test patterns for the fault under test.

3. file_parser()
Reads the file generated by atalanta and stores the patterns into a 2D array - parallelization possible

4. compare_patterns()
two modes of operation: mode 0 and mode 1 - mode 0 does a shift comparison (used in shift-pattern testing), mode 1 does a plain comparison (used in compaction).
Faster pattern-matching techniques may be possible - look into it.

5. copy_files()
copies one file into another.

6. file_parser_ttest()
reads the circuit.ttest patterns into a 2D array used for compaction.

The complete program is explained here. The next step is to analyse the program for parallelizability.
Things to be taken into consideration for parallelization:
1. Decomposition into tasks that can be run in parallel.
2. For the tasks, identify dependencies, shared address space, and message-passing operations.
3. Take into consideration the granularity of the code executing in parallel.
4. Assignment of the tasks to different threads.
5. Look into efficient utilization of the cache.

Running a system level simulation of OpenSPARCT1 with ModelSIM


Simulation of the OpenSPARC is very important if you are going to understand its inner details. The system-level simulation steps are given in the Design and Verification Guide, chapter 6.7. The flow is to compile the libraries required for simulation in ModelSIM and let the simulation scripts provided by the SPARC people take care of simulating the system. Two sets of libraries are to be compiled: one for the ISE and the other for the EDK, in that order. The system requirements are

ModelSIM SE/PE v6.5 is required to simulate
XPS 10.1.03 is required
If you have Xilinx 10.1, update it to 10.1.03:
1. open XPS
2. Help -> About
3. If the version is 10.1,
4. Help -> Xilinx Update
5. if the version is 10.1.03 then you are all set
RAM > 1.5 GB

Compiling the libraries for ModelSIM in ISE
source the Xilinx environment variables
source the ModelSIM environment variables

1. Open a new project in ISE
2. Choose the device and language parameters for the new project.
3. In the Processes window, click on Design Utilities to unfold the options
4. Right-click Compile HDL Simulation Libraries
5. Choose Properties and set
language : verilog
compiled directory : no need to change the default value (the environment variables must be properly set)
Simulation path : ~/location to ModelSIM/modeltech/linux (since mine is a Linux system; use win32 on a Windows system - automatically detected)
6. Open ~/location to modelsim/modelsim.ini and set "resolution = ps" (change write permissions if needed)

ERROR: unable to parse initialization file. Check if the
       file modelsim.ini is present in the current directory
       with read/write permissions
SOLUTION: Right-click on "Design Utilities" in the Processes window, choose Properties,
 and in Properties select language : verilog

The ISE libraries must be compiled first, followed by the EDK libraries.

Compiling libraries for ModelSIM in EDK
1. Make sure all the libraries for ISE are compiled
2. From the previous library compilation for ISE, you would have specified a "compiled directory" where all the libraries for ISE will be compiled. The default location is ~/*location to Xilinx*/*ver*/ISE/verilog/mti_se and ~/*location to Xilinx*/*ver*/ISE/vhdl/mti_se. Copy all folders inside these folders to ~/*location to modelsim*/modeltech/
3. open modelsim.
4. In the library window, right-click -> New -> Library
5. Create -> a map to an existing library
   Library name -> unisim
   Library path ~/*location to modelsim*/modeltech/unisim
6. Do this for the libraries unisim, simprim, XilinxCoreLib, unisims_ver, simprims_ver, XilinxCoreLib_ver
7. Open a new terminal (if required, source the ModelSIM and Xilinx XPS environment variables)
8. Type the following command:
compedklib -X ~/ModelSIM/modeltech/ -o ~/location_where_you_want_them_compiled

Setting up simulation
After compiling the EDK libraries, open XPS
1. Click on Edit -> Preferences
2. Make the EDK library point to the location where you compiled the EDK libraries
3. Make the Xilinx library point to where you compiled the ISE libraries.
4. Make /design/sys/edk/system.mhs and system.xmp writable by the user:
   cd ~/OpenSPARCT1/design/sys/edk/
   chmod u+w system.xmp system.mhs
5. Open the Design and Verification guide (OpenSPARCT1/doc/OpenSPARCT1_DVguide)
6. Follow the instructions given in chapter 6.7

COMPXLIB is a tool for compiling the Xilinx HDL-based simulation libraries with the tools provided by the simulation vendors.
The libraries are compiled and put in
/home/location to Xilinx/10.1/ISE/verilog (or) vhdl

How do I compile simulation models for the EDK 10.1 design tools?

1. First make sure that "compxlib" has been run. The example below compiles all of the UniSim and SimPrim models:
compxlib -s mti_se -f virtex -l vhdl -o ~/test/xsim10libs

2. Now run "compedklib":
compedklib -X ~/test/xsim10libs -o ~/test/xsim6libs/edk_nd_libs

compxlib runs fine, but I got the following errors when running compedklib.

error while loading shared libraries: cannot open shared object file: No such file or directory
SOLUTION: include ~/location to xilinx/ISE/lib/lin in the LD_LIBRARY_PATH environment variable:
export LD_LIBRARY_PATH=~/path to edk/EDK/lib/lin:~/path to ise/ISE/lib/lin

/home/abishek/opt/Xilinx/10.1/ISE/bin/lin/unwrapped/xilperl: error while loading
shared libraries: cannot open shared object file: No such file or
ERROR:MDT - Running child process return failure status: 127!

In the Synaptic package manager, I searched for the Berkeley DB database library version 4.1; version 4.4 was available, so I installed it.
Then create a symbolic link in /usr/lib/ (with sudo ln -s) from the library name the tool expects to the installed version.

Development system reference guide from xilinx

Abishek Ramdas
NYU Poly

Thursday, March 17, 2011

KCPSM Assembler and simulator for picoblaze

Hi :) (:
I was implementing code in KCPSM assembly language for the PicoBlaze microcontroller. Every time I had to test it, I had to dump it onto the FPGA and look for the desired output. I didn't have the time to search for a KCPSM simulator then, so I had to manage with the poor debugging capabilities and long dumping latencies to complete the assignment.

Moreover, the KCPSM assembler that comes with the package runs only in a Windows environment, so I had to use DOSEMU every time just to compile the program. All this sucked big time. So I went in search of a simulator/assembler for Linux, and then there was openPICIDE.

openPICIDE is open source, supports Linux, Windows, and Mac, and is downright amazing. Here I have included the steps to install and run openPICIDE on an Ubuntu system.

Simulation of Picoblaze assembly code
There are two tools to simulate picoblaze assembly language
1. openPICIDE
2. picoasm

openPICIDE is an integrated assembler development environment (IDE). It provides the following components for developing assembler code:
1. project manager
2. assembler editor with syntax highlighting
3. code parser
4. compiler
5. simulator

- Extract the files to the required location
- Change into the "build" folder
- run "cmake ../"
- run "make"
- run "sudo make install"
The default installation directory is /usr/local/openPICIDE/

Install cmake
1. sudo apt-get install cmake
2. you need KDE and QT4 development tools. find and install them using your package manager.

CMake Error at /usr/share/cmake-2.6/Modules/FindQt4.cmake:1432 (MESSAGE):   Qt qmake not found!
1. The QT4 development tools are required to be installed
2. The KDE libraries are to be installed if you don't have them already
3. In the Synaptic package manager, find qt4-dev-tools and install it

1. Assuming you are using the bash shell, add the following line to your ~/.bashrc (YOU NEED TO DO THIS STEP ONLY ONCE):
export PATH=$PATH:/usr/local/openPICIDE/v0.5.0/bin
2. Start a new shell and type: openpicide&


Monday, March 7, 2011

Parallelizing Transition fault testing algorithm


I am working towards the parallelization of one of my own sequential programs. The sequential program is used to generate test patterns for delay testing (testing of the rise and fall delays in a circuit). It takes a lot of time to run and is a massively parallel problem, so parallelization is a good way to achieve speedup. I gathered a few points on parallelizing the program from the book "Parallel Programming for Multicore and Cluster Systems" by Thomas Rauber and Gudula Rünger, and I present them here.

Aim: To parallelize the transition fault testing program so that it runs on more than one processor

Steps involved in parallelization

1. Decomposition of the computations :
GOAL : of task decomposition is keeping all the processors busy at all times.

a. The computations of the sequential algorithm are decomposed into tasks, and the dependencies between the tasks are determined. Tasks are the smallest units of parallelism.

b. Task may involve accesses to shared address space or may execute message passing operations

GRANULARITY : the computation time of a task. This must be taken into consideration when dividing the work into tasks: the granularity must be large enough to compensate for the scheduling overheads.

The decomposition step must find a good compromise between the number of tasks and their granularity.

2. Assignment of tasks to processes or threads :
GOAL: partition the tasks to achieve good load balancing.

a. The flow of control executed by a single processor is a process or a thread.
Usually the number of processes or threads is the same as the number of cores.

b. Take into consideration the number of memory accesses or communication operations for data exchange. EX: assign two tasks which work on the same data set to the same thread, since this leads to good cache usage.

c. SCHEDULING : the assignment of tasks to processes or threads is called scheduling.
    static scheduling
    dynamic scheduling

3. Mapping of processes or threads to physical cores - mapping can be done by the operating system, supported by program statements. The main goal is to get equal utilization of the processors or cores while keeping communication between the processors as small as possible.

The next step is to analyze the program so as to implement these steps and make detailed documentation. A job which is started properly is half done :)

Abishek Ramdas
NYU Poly

Saturday, March 5, 2011

My dad is an amazing poet. I am posting here a few of his Poems. Enjoy :)

A sense silently pervades.
of solace ?
the sense permeates.
of fulfillment ?
the deep crevices of my heart
is filled with a serene sensation.
solitude ?
achievement ?
i wonder...
words ebb as slight pink baubles
from the silent silvery
lake of my mind,
to burst forth on the canvas
of the terminal,
to converse with you
of something tangible to me.
a reaching is ma...

Truth and Lie

Truth and Lie

decisions decisions
choices that defines our lives
that defines our goals and our means
choices that define you.
what is good and what is bad
nobody really knows
what is right and what is wrong
both the definitions are lost
in the entangled webs of self righteousness and practicality.
none of them exist.
pretty pictures in the minds of individuals,
a hallucination for the begging soul to make believe.
there are choices then there is the truth
and there is the lie.
choose wisely between the truth and the lie
choose wisely who you are.



In life all things seem futile.
the work being done, the events that occur.
seems irrelevent to any condition
random acts that seem to arise
from the whim and fancy of the snake charmer.
we dance in awkward steps
to the barbaric noises from his drum

going around in circles
banality and platitude.
experiences, situations, exercises, tests
that seem to take us no where.
progress?? no.
hard work achieving what? no idea.
stagnant sardonic and even ridiculous at times

the pull is strong
to stop the bull shit
the utter futility in doing the job
the laziness
the lack of reason
the lack of motivation
sometimes it seems logical to just give up

but know that what you see are just the dots
unconnected dots spread around in space and time
irrelevent and even irritating to the untrained eye.
there is a grand picture that comes out of joining the dots.
a beautiful plan unfolding slowly
dots that can be connected.

so hang in there when the dots are being laid.
hang in there through the futility
the depression the terpidity.
work with acceptance. work through the skepticism,
and bless the credulous mind.
hoping some day we will get to see the beautiful picture
and all life makes perfect sense.

6T SRAM Cell design, Implementation and Testing

A lot of textbooks are available which tell us the logical operation of the 6T SRAM cell, but only a few deal with the actual calculations of the widths of the transistors. I took up the challenge to design a 6T SRAM cell from scratch.

The detailed report of my work in building a 6T SRAM incorporating the read and write conditions can be found using the following link.

The schematic was tested using the test circuit shown in the report. The layouts were drawn using Cadence Virtuoso. It was a great learning experience.

Abishek Ramdas
NYU Poly

Built In Self Test

Report on few techniques of Built In Self Test

It was required of me to read and write a report on Built-In Self Test. Reading and understanding the mathematical implications was very exciting. You can find the report in the following link.

Abishek Ramdas
NYU Poly

Setting up atalanta ATPG to work on Linux

Atalanta is an ATPG tool used to generate test patterns for combinational circuits; the atalanta binary is a Windows executable. I am using the Atalanta ATPG tool to generate the test patterns required for generating the transition fault test patterns.

You need to have WINE installed to run a Windows executable binary.
Create a symbolic link in /usr/bin to atalanta.exe; the name of the link is atalanta (not necessary if you do not have root privileges):

chmod u+x /place/atalanta/atalanta.exe
cd /usr/bin
sudo ln -s /place/atalanta/atalanta.exe ./atalanta

atalanta [option] filename
(put this in a shell script for convenience)

Simulation of a single core of OpenSPARCT1

MODELSIM - simulation of the sparc single core

1. create a working directory
2. open modelsim and create a library inside the working directory
3. after creating library open modelsim.ini file in an editor
4. search for Voptflow variable and change its value to 0 (Voptflow = 0)
5. add all the files listed in OpenSPARCT1/design/sys/iop/sparc/xst/sparc.flist into the working directory
6. add all the header files from OpenSPARCT1/design/sys/iop/include into this folder
7. vlog *.v
8. vsim sparc.v

The test benches are not known; I am in the process of identifying the test bench. :) I thought I would dig into the EDK project and into the MicroBlaze. One hell of a plan.

OpenSPARC Regression on NC Verilog, ModelSIM

A. Running the regression. 
Using NC Verilog simulator. core1_full regression. The core1 environment consists of one SPARC CPU core.

1. Download OpenSPARCT1.tar.bz2 to the directory "/home/abishek/OpenSPARCT1".

2. unzip and extract in the same folder

3. Set the following environment variables in OpenSPARCT1.bash; comment out the rest.

(create the folder specified by the path represented by the MODEL_DIR variable)
(if not running in solaris system)
(default path. change if required to change depending on the installation of NCverilog)
(if not running in solaris system)
PATH= (include only the necessary path variables, remove the rest)

4. Source the environment variables using
source OpenSPARCT1.bash

5. Depending on the system running the verification, we have to create a symbolic link:
cd $DV_ROOT/tools/env
ln -s Makefile.Linux.x86_64 Makefile.system (if it is an x86_64 cluster; mine is an i686 system - check with uname -a)

6. Change directory to the directory mentioned in the MODEL_DIR variable:
cd /home/abishek/OpenSPARCT1_model

7. Run the sims command to run the regression that is required (core1_full in this case):
sims -sim_type=ncv -group=core1_mini -novera_run -novera_build

8. Run the regreport command to get a summary of the regression:
regreport $PWD/yyyy_mm_dd_ID > report.log

(yyyy_mm_dd_ID is the date on which the regression was run; it is in the OpenSPARCT1_model directory created for this purpose)

B. Reference
README that comes along with the downloadables
OpenSPARC T1 Processor Design and Verification User's Guide
OpenSPARC T1 Processor External Interface Specification
OpenSPARC T1 Processor Datasheet
OpenSPARC T1 Processor Megacell Specification
OpenSPARC T1 Micro-Architecture Specification

NOTE : I tried recompiling the libraries using mkplilib but ran into problems. I think the problems lie mainly with the 32-bit and 64-bit libraries, and with the fact that I used Debian Linux and not OpenSolaris. So make sure that when you run the regression you use an OpenSolaris system.

Running the regression with ModelSIM - not solved, but it proceeds in the right direction. I feel it would work on OpenSolaris.

The SIMS command to be run
abishek@ubuntu:~/OpenSPARCT1_model$ sims -sim_type=mti -group=core1_mini -sim_build_cmd=/home/abishek/ModelSIM/modeltech/linux/vlog -sim_run_cmd=/home/abishek/ModelSIM/modeltech/linux/vsim -sim_build_args="-work /home/abishek/OpenSPARCT1_model/work" -sim_run_args=/home/abishek/OpenSPARCT1_model/work.cmp_top -novera_build -novera_run -novcs_run

Description - -sim_type=mti selects the ModelSIM (mti) simulator.
-sim_build_cmd - must point to the location of vlog compiler. this compiler is used to compile all the verilog files.
-sim_build_args -  contains the arguments that need to be passed to the vlog command.
-sim_run_cmd - used to open the vsim of MODELSIM
-sim_run_args -  arguments that are to be passed to the vsim command.

Error logs : solved
The same error appears in different blocks:
ERROR: Calling task $error outside of action block is illegal
REASON: the task $error is not defined in MODELSIM; the alternative to $error is $display. $error is defined in the Synopsys tools, while $display is defined in MODELSIM.
Fix: added the line `define MODELSIM 1

2. Error: /home/abishek/OpenSPARCT1/design/sys/iop/srams/rtl/bw_r_rf16x160.v(646): Calling task $error outside of action_block is illegal.

`ifdef MODELSIM
      $display ("sram_conflict", "conflict between read: %h and write: %h pointers", rd_a_ff, wr_a_ff);
`else
      $error ("sram_conflict", "conflict between read: %h and write: %h pointers", rd_a_ff, wr_a_ff);
`endif
Also add a `define MODELSIM 1 at the beginning of this file.

Abishek Ramdas
NYU Poly

OpenSPARC - Synthesis of OpenSPARC using Xilinx ISE - Abishek Ramdas

How to synthesize sparc.ngc (openSPARCT1) using Xilinx ISE IDE

Use Xilinx 10.1; there are compatibility issues with Xilinx 12.2.

The procedure was discussed by "formal guy" on the Xilinx forum, but a few more parameters needed to be set so that OpenSPARC could be synthesized using the ISE. The method described below is tested.

Here is a little more information on how to synthesize the T1 core manually from the ISE GUI. This is the procedure to follow if you can't use our automated scripts rsynp and rxil (for example, if you are on a Windows machine).

From the start menu, select ISE -> Project Navigator

From the Project Navigator GUI, select File -> New Project (the new project wizard may come up automatically if it is the first time you are running Project Navigator)

Select a project name and the project path
Click next
The next window is Device Properties:
Select the correct Device, Package, and speed grade for your board.
Click next
The next window is Create new source.
Just skip this and click next
The next window is Add existing Source
Just skip this and click next
Finish the Wizard to create the project.

Now search for the following file which is a source list for the sparc core:

You need to select add sources and add each file from the file list to the project.

Note that you may need to copy files ending with .behV to a new name ending with .v so that ISE recognizes the files as Verilog files.

The next step is to set the compile time macros: To do this:
(make sure "sparc" is the top module)
1. Look for the ISE sub-window on the left labeled "Processes"
2. Find the "Synthesize-XST" entry in that window
3. Right click on that entry and select "Properties"
4. In the popup window "Process Properties - Synthesis Options,
select "synthesis options in the left window
5. Set the property display level to "Advanced"
6. In the right list scroll down to find the property
"Verilog Macros"
7. Type the value FPGA_SYN FPGA_SYN_1THREAD FPGA_SYN_NO_SPU FPGA_SYN_8TLB in that box.
8. Now DESELECT "Process -> Properties -> Xilinx Specific Options -> Add I/O Buffers"

Finally Run synthesis:
From the Processes window, right-click on Synthesize-XST and select Run in the popup Menu.

Error Logs:
Error1 : cannot find verilog module sys.h
Fixed by adding all the include files - tlu.h, sys.h, sys_paths.h, lsu.h, iop.h, ifu.h, xst_defines.h - from /design/sys/include

ERROR2: cannot find bw_r_irf_register
Maybe because of changing the top module. I changed it back to bw_r_irf_register and ran synthesis; synthesis of bw_r_irf_register completed successfully. Then I changed the top module back to sparc.v and tried again.

Synthesis was successful; it ran for more than half an hour.

Error3: when adding sparc.ngc into the EDK, the process fails with the following "warnings":
INFO: NgBuild:889 - Pad Net '<something>' is not connected to an external port in this design. A new port '<something else>' is added and is connected to this signal

You need to resynthesise sparc.ngc. Follow the steps given above and make sure you do step 8: DESELECT "Process -> Properties -> Xilinx Specific Options -> Add I/O Buffers".
Ignoring this step causes sparc.ngc to be synthesised properly but fail during the EDK process.

Abishek Ramdas
NYU Poly

OpenSPARC - Beginners Introduction - Abishek Ramdas

Here I briefly describe the documents that are important if you are starting with OpenSPARC for the first time. I have also added a description of the important folders.

If you are trying to do synthesis, regression, diag tests read the OpenSPARC DV GUIDE. Most important resource.

    1. README : file : This file must be read initially; it specifies the values of the environment variables that are to be sourced.
    2. OpenSPARCT1.bash (or) OpenSPARCT1.csh : script : These are the scripts where the environment variables are actually initialized. Depending on your shell, one of these files is modified and sourced.
    3. script : This script is used to initialize the locations of the Xilinx tools that are required to run rxil and create NGC and V files. Comment out the lines that are not required by you.
    4. rxil : script : It is a script that checks the env variables set in the OpenSPARCT1.bash file and executes a perl script, OpenSPARCT1/tools/perlmod/rxil1.2

Different folders description
NOTE: This description includes a number of folders that are created after the successful running of the rxil script. Only the documents that are required by rxil are mentioned here. Working details and the interrelation of only a few folders are provided.

    1. OpenSPARC/design/sys/xst : folder : contains the xst files of the different devices supported by OpenSPARC. It also contains a block_list file; this file has the different modules that are to be synthesised (OpenSPARC/design/sys/xst/block.list).
    2. OpenSPARCT1/design/sys/iop : folder : contains the modules that are specified in the block list.
        It also has a number of modules that are not specified in the block list. These modules are used by the modules specified in the block list for their implementation.
    3. OpenSPARCT1/design/sys/iop/*module in blk list*/xst : folder : example
        /home/abishek/OpenSPARCT1/design/sys/iop/ccx/xst :
        1. contains a folder with the name of the device that is selected, e.g. XC4VLX200. This folder is the working folder for the module under consideration. It contains a copy of the flist file of the module. Flist files are described below.
        2. *module in blk list*.flist : ex ccx.flist : file : contains the list of modules that are utilized in the functioning of this module, with the relative locations of the Verilog files of those modules.
        3. Device#.xst : file : this is the xst file of the device under consideration. It is created after running the rxil script successfully. It is copied from the above-mentioned OpenSPARC/design/sys/xst folder and modified to suit the needs of the block under consideration.

Hope this was useful :)

Abishek Ramdas
NYU Poly

Detailed Description of RXIL of OpenSPARC - Abishek Ramdas

by Abishek Ramdas NYU Poly

         The RXIL1.2 version of the perl script is described here. This perl script is invoked by the rxil command followed by parameters for xilinx synthesis. Visit the design and verification guide for details on running the rxil command for xilinx.

     The rxil command is located at OpenSPARCT1/tools/bin/rxil and is a bash script. There are different versions of the rxil perl script inside OpenSPARCT1/tools/perlmod/ (e.g. rxil,1.0, rxil,1.1, rxil,1.2). The bash script finds the latest version of the rxil perl script and transfers control to it: it calls the configsrch script to search for the latest configuration of rxil (it executes configsrch rxil /), identifies the machine, and sets the environment variable PERL5OPT according to the machine. It then calls the current version of the rxil perl script and transfers control.

         The script identifies the device and the list of all the blocks that are to be synthesised. It creates a work directory and generates NGC and verilog netlist files for the modules that are specified. The same procedure is followed for any device selected. The common errors are also noted and their corrective measures are provided.
         The file consists of two parts: "run through the command line arguments" and "running part of rxil script". The first part identifies the command line arguments that are given and follows a course of operation depending on the arguments specified. The second part is responsible for the creation of the NGC and verilog files.


-device  if this parameter is used, the device xst file is checked for availability.
         The list of supported devices (this also depends on the Xilinx Synthesis Tool (XST) version on your machine; in our case we use 10.1.03. There is some migration problem if you are using Xilinx 12.2):
         XC4VFX100 XC4VLX200 XC5VLX110 XC5VLX220
         XC2VP70 XC4VFX60 XC5VLX110T XC5VLX155T XC5VLX330T
         the default device is XC4VLX200 (specified in rxil using the variable $device)
         this list corresponds to the device.xst files present in the OpenSPARC/design/sys/xst folder.

         if the following error occurs
         ERROR RXIL : Device #device is not found !!!
         then check that you are using one of the devices mentioned above.
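The availability check can be pictured with this hedged Python sketch; the helper names and the file list are illustrative, not taken from the actual rxil Perl script:

```python
import os

def device_xst_path(device, xst_dir="OpenSPARC/design/sys/xst"):
    """Path rxil would look up: <xst_dir>/<device>.xst"""
    return os.path.join(xst_dir, device + ".xst")

def device_supported(device, available_files):
    """rxil's check: is <device>.xst among the files in the xst folder?"""
    return (device + ".xst") in available_files

# Illustrative subset of the supported-device list above
SUPPORTED = ["XC4VFX100.xst", "XC4VLX200.xst", "XC5VLX110.xst"]
```

If the lookup fails, rxil reports the "Device ... is not found" error shown above.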
2. if the blocks are directly specified as arguments,
         the blocks given on the command line are pushed into a local array called block by the rxil script.
         The modules specified on the command line are compared with the entries in
         OpenSPARC/design/sys/xst/block.list; those that are present in block.list are
         added to another array, @block_list.

         the different modules are present in the OpenSPARCT1/design/sys/iop folder. The different
         modules are
         analog ccx2mb common dram fpu iobdg pads rtl scdata sparc ccx cmp ctu efc include jbi pr_macro scbuf sctag srams

         of these, block.list contains only the main modules that are to be synthesised. These main
         modules use the other folders mentioned above in their implementation, drawing on the
         verilog files found in those folders of iop.
         For example, ccx uses verilog files from common, analog, etc.

         RXIL ERROR : No matching modules found.
         if you get this error, make sure you have entered the module names correctly. You can enter them
         either as given in block.list or as just the module name; that is, the parameters srams/bw_r_icd and
         bw_r_icd are the same. The script takes care of unwanted white-space characters and the "srams/" prefix.
         In the end, all the modules in the argument list that match entries in
         block.list are added to an array @block_list.
-h (-help)
         If -h or -help is given as the parameter, the help menu is printed and the perl script exits.
-all
         If -all is specified, the file OpenSPARC/design/sys/xst/block.list is opened for reading and
         the modules listed in it are loaded into the array block_list.

         RXIL ERROR: No Matching modules found
         make sure the parameter when running rxil is either -all or blocks from the
         OpenSPARC/design/sys/xst/block.list file (see points 2 and 4).

   1. The modules for synthesis are present in the array block_list.
   2. Each module has an xst folder
   3. Inside the xst folder of each module a directory in the name of the device that is selected is created.
   4. Each module has a flist file. the flist file indicates the verilog files that are utilized by the module and the header files. look into the flist file of the modules mentioned in the block_list.
   5. This flist file of each module is copied into the device directory that is created.
   6. The files that are specified in this flist are then printed out on the screen.
   7. Inside the device directory, an "xst" folder is created, with a projnav.tmp folder inside it. An ASCII text file block.lso is created, with "work" written inside it.
   8. The xst file of the device selected is copied from OpenSPARC/design/sys/xst to the block_dir if it is not available in the device folder of the module. It is also modified to suit the conditions required for the particular block.
   9. This xst file is the $vertex_file; the details of the virtex file are shown on the screen.
   10. The command xst_cmd -ifn $vertex_file is executed on the particular device xst file that is chosen. NGC and v files are created using the netgen_cmd and xst_cmd commands.
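Steps 2 to 8 can be pictured with this hedged Python sketch of the per-block setup; the helper name and layout details are mine, and the real work is done by the rxil1.2 Perl script:

```python
import os, shutil

def prepare_block_dir(block_dir, device, flist, device_xst):
    """Create <block>/xst/<device>/ with the files rxil expects."""
    work = os.path.join(block_dir, "xst", device)
    os.makedirs(os.path.join(work, "xst", "projnav.tmp"), exist_ok=True)
    shutil.copy(flist, work)                      # per-block flist copy
    target_xst = os.path.join(work, device + ".xst")
    if not os.path.exists(target_xst):            # copy only if missing
        shutil.copy(device_xst, target_xst)
    with open(os.path.join(work, "block.lso"), "w") as f:
        f.write("work\n")                         # ASCII text containing "work"
    return work
```

After this setup the xst and netgen commands are run inside the returned directory.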
Rxil.1.2 perl script

   1. Knowledge on regular expressions in perl. Look in here
   2. to download the opensparct1 core
   3. Running rxil – my earlier document. To run the rxil without errors
   4. Rxil general information – my earlier document.
   5. OpenSPARC/tools/perlmod/rxil1.2 – the perl script under consideration.

Installing USB drivers in Linux

1. How to install USB driver in linux

1. $apt-get install libusb-dev fxload (for slackware, download fxload and install it manually)
2. download usb-driver-HEAD.tar.gz and extract it
3. go into the extracted folder and $make
4. $cp /path/to/ISE/bin/lin/xusbdfwu.rules /etc/udev/rules.d/xusbdfwu.rules 
5. $sed -i -e 's/TEMPNODE/tempnode/' -e 's/SYSFS/ATTRS/g' -e 's/BUS/SUBSYSTEMS/' /etc/udev/rules.d/xusbdfwu.rules
6. $cp /path/to/ISE/bin/lin/xusb*.hex /usr/share/
7. $restart udev
8. replug the cable
9. $export LD_PRELOAD=/path/to/ (or include this in the script)

Important Documents

Parallelization of SOBOL Quasi Random Number Generator

Sobol Quasi Random Number Generators

Sobol Quasi Random Number Generators (Sobol QRNG) are low-discrepancy (quasi-random) number generators. There are often applications, for example in financial engineering, where random numbers are to be generated within an upper and lower limit. The main requirement on the random numbers generated in these applications is that they must fill the N-space more uniformly than uncorrelated random points.

The serial algorithm for Sobol QRNG can be found at
"Implementation of Sobol Quasi Random Number generator “sobseq()” Chapter 7.7, Numerical Recepies in C,"

I have used pthreads to parallelize the algorithm. The main idea is "Divide and Conquer". If 64000 patterns are to be generated by 4 threads then each thread generates 64000/4 = 16000 random numbers.

The catch is that each thread must be provided with an initial seed so that no two threads generate the same set of random numbers. How do you generate the seed? That is another big story.
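In my report the workers are pthreads in C; here is a hedged Python sketch of the same block-decomposition idea. It uses the fact that the n-th point of the one-dimensional Sobol (van der Corput) sequence can be computed directly from the Gray code of n, so a worker can skip ahead to its own chunk without a shared seed. The function names are mine:

```python
def sobol_1d(n, bits=30):
    """n-th point of the 1-D Sobol sequence, computed directly (skip-ahead)."""
    v = [1 << (bits - 1 - k) for k in range(bits)]  # direction integers
    g = n ^ (n >> 1)                                # Gray code of n
    x, k = 0, 0
    while g:
        if g & 1:
            x ^= v[k]
        g >>= 1
        k += 1
    return x / float(1 << bits)

def worker_points(tid, nthreads, total):
    """Block decomposition: thread tid generates its own contiguous chunk."""
    chunk = total // nthreads
    start = tid * chunk
    return [sobol_1d(i) for i in range(start, start + chunk)]

# 64000 points over 4 threads -> each worker generates 16000,
# with no shared state and no duplicated values
```

The multi-dimensional case in the report uses the direction numbers from Bratley and Fox's ALGORITHM 659 instead of the powers of two above.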

Below you can find my detailed report on parallelizing the pseudo random number generator using pthreads.

Detailed Report :  Sobol QRNG parallelized version.

Useful References:
1. ALGORITHM 659: Implementing Sobol's Quasirandom Sequence Generator, Paul Bratley and Bennett L. Fox, Université de Montréal.
2. Implementation of Sobol Quasi Random Number generator “sobseq()”, Chapter 7.7, Numerical Recipes in C.
3. "A Primer for the Monte Carlo Method" by Sobol, 1994, CRC Press Inc., ISBN 0-8493-8673-X.
4. Quasirandom Number Generators for Parallel Monte Carlo Algorithms, B. C. Bromley, Journal of Parallel and Distributed Computing, 38, 101–104 (1996).
5. POSIX Threads Programming Tutorial.
6. Normalized distributions, Wikipedia.
7. Gray Code, Wikipedia.

It has been a good experience getting a feel for parallel programming. Save for a few bugs, the program is running fine. Need to resolve the bugs asap :)
I feel good.

MultiCore - Introduction

I have started working on Multicore Programming. The subject is fascinating because all the supercomputers are multicore (obviously). Multicore is introducing a paradigm shift in computing. Earlier, clock frequencies were increased to achieve higher speed, but a point has been reached where increasing the clock frequency is bound to give diminishing returns because of increased leakage and heating.

So to keep up with Moore's law, speedup is achieved by parallelizing the program and running it on multiple cores (Amdahl's law). It is a point of consideration that most programs are serial by nature and the parallelizable portion is very small compared to the serial portion, so one might argue that there might not be significant speedup. But consider a massively parallel machine that runs the same program on different sets of data. It offers a definite advantage to run them simultaneously on different sets of data. That is what I believe multicore programming is trying to achieve.

Good reading : the paper "Amdahl's Law in the Multicore Era"

It defines speedup to be
speedup = original execution time / enhanced execution time.
If a fraction f of a computation is sped up by a factor S, the overall speedup (of the entire computation) is
Speedup(f,S) = 1/((1-f)+(f/S))

N-core scenario:
a fraction f of the computation is "infinitely" parallelizable with no scheduling overhead, and (1-f) is sequential. If we use n parallel cores, the speedup is
Speedup(f,n) = 1/((1-f)+(f/n))

Amdahl argued that typical values of (1-f) were large enough to favour single processors. But a drawback of Amdahl's law is that the fraction of a program that is parallelizable remains fixed. John Gustafson said that Amdahl's law didn't do justice to massively parallel machines; his argument is that a machine with greater parallel computation ability lets the computation operate on larger data sets in the same amount of time.
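The two formulas can be sanity-checked numerically; this small Python sketch is my own illustration, not from the paper:

```python
def speedup(f, s):
    """Amdahl: a fraction f of the work is accelerated by a factor s."""
    return 1.0 / ((1.0 - f) + f / s)

def speedup_ncore(f, n):
    """Same formula with the parallel fraction f spread over n cores."""
    return speedup(f, n)

# A perfectly parallel program (f = 1) on 4 cores runs 4x faster,
# but with f = 0.9 even an enormous number of cores caps the speedup near 10x.
```

This makes Amdahl's point concrete: with a fixed parallel fraction, the sequential (1-f) term dominates as n grows.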

Procedure : writing and burning the PICOBLaze assembly code onto the FPGA board.

Here is the general procedure that is followed to burn an assembly code onto the Pico Blaze MC.

1. Write your own ROM program by learning the PicoBlaze instruction sets.  
2. Compile the program to VHDL using the KCPSM assembler (this assembler requires a 32-bit operating system).

3. Synthesize the VHDL program and the PicoBlaze soft code in ISE.
4. Generate bit file (You also need to include your modified UCF file).  
5. Download the bit file to the FPGA and check if that works.

You can find the detailed procedure in the manual that comes along with the downloads (last post). Read page 40 of the KCPSM manual on how to use the KCPSM assembler.


Problem Statement : Run the following assembly program on the FPGA Board (we don't care about the output, just about burning the program onto the soft MC)
loop : INPUT s2,00
       OUTPUT s2,00
       JUMP loop
1. Save the assembly code in a file called "Prog_rom.psm" (file name important)
2. Use a windows DOS box and follow the instructions on page 40 of the KCPSM manual on how to compile the program

3. After compiling the program, the output is a file called "Prog_rom.vhd" (a Prog_rom.v file is also generated; I am assuming you are using VHDL)

4. Open an ISE project and add the following files that come with the downloaded kcpsm zip file
        a. embedded_kcpsm3.vhd
        b. kcpsm3.vhd
        c. Prog_rom.vhd (generated by the assembler)
        (if you are using VHDL; verilog files are also available)

5. Modifying the ucf file of your FPGA
       a. change the names of the nets in the UCF file corresponding to the ports in the top level entity in "embedded_kcpsm3.vhd".
      b. Since it is a basic test, I changed the switches in my UCF to in_ports and Seven segment led nets to out_ports

6. Synthesize, translate, place and route

7. Generate the bit stream.

Note: KCPSM is a windows executable. Usually wine takes care of running windows executables but it shows some weird error in this case. The quickest fix according to google is using DOSemu (DOS emulator). I am yet to try whether KCPSM works in DOSemu. Will try and post the results soon.

Friendly neighborhood Abishek.

PICO BLaze an Intro

Bit of an Intro to PicoBlaze ....

PicoBlaze is the designation of a series of three free soft processor cores from Xilinx for use in their FPGA and CPLD products. They are based on an 8-bit RISC architecture and can reach speeds up to 100 MIPS on the Virtex-4 FPGA family. The processors have an 8-bit address and data port for access to a wide range of peripherals. The license of the cores allows their free use.

The processor HDL and the assembler-generated PROM are synthesised, then placed and routed on an FPGA. Once this is done, the FPGA acts as a processor, executing the instructions stored in the PROM one after the other. They can be used in typical embedded applications such as a weather monitoring station, robotic control, home automation, a washing machine, or a microwave controller. The advantage is that any number of custom digital blocks can be connected to the microcontroller to achieve any desired operation.

Where to download? Here is the link and it is free  
(download the version depending on what board you have)

I am the proud owner of Xilinx Spartan 3E FPGA, 500K (it is small but I am still a student! owning a Virtex 6 is a dream within a dream within a dream)

It is a student board and is very useful

The downside for me is that I have not been able to set up the USB driver for programming this board on linux, so I use a JTAG cable. If you find a method to install the USB driver I would be more than happy to know.

ARM programs (Basic)

So I was asked to write a few programs as a homework assignment. They were really simple but a good exercise with assembly code.

ARM Assembly Home Work 2

Date 02/10/11

Abishek Ramdas

1. x = (a+b);
    ADR r4, a       ;get address of variable 'a'
    LDR r0, [r4]    ;load the value of 'a' into r0. r0 <- a
    ADR r4, b       ;get address of 'b' into r4
    LDR r1, [r4]    ;r1 <- b
    ADD r3, r1, r0  ;r3 <- a+b
    ADR r4, x       ;get address of 'x' into r4
    STR r3, [r4]    ;x <- (a+b)

2. y = (c-d) + (e-f)
    ADR r4, c
    LDR r0,[r4]        ;r0 <- c
    ADR r4, d
    LDR r1, [r4]       ;r1 <- d
    SUB r3, r0, r1    ;r3 <- (r0-r1) or r3 <- (c-d)
    ADR r4, e
    LDR r0,[r4]        ;r0 <- e
    ADR r4, f
    LDR r1,[r4]        ;r1 <- f
    SUB r5, r0, r1    ;r5 <- (r0-r1) or r5 <- (e-f)
    ADD r6, r3, r5    ;r6 <- r3+r5 or r6 <- (c-d)+ (e-f)
    ADR r4, y   
    STR r6, [r4]        ;y <- r6 or y <- (c-d)+ (e-f)

3. z = a*(b+c) - d*e
    ADR r4, a
    LDR r0, [r4]         ;r0 <- a
    ADR r4, b
    LDR r1, [r4]         ;r1 <- b
    ADR r4, c
    LDR r2, [r4]         ;r2 <- c
    ADD r3, r1, r2       ;r3 <- (r1+r2) or r3 <- (b+c)
    UMULL r1, r2, r0, r3 ;r2:r1 <- (r0*r3) or r2:r1 <- a*(b+c) (64-bit result; plain MUL keeps only the low 32 bits)
    ADR r4, d
    LDR r5, [r4]         ;r5 <- d
    ADR r4, e
    LDR r6, [r4]         ;r6 <- e
    UMULL r7, r8, r5, r6 ;r8:r7 <- (r5*r6) or r8:r7 <- d*e (64-bit result)
    SUBS r6, r1, r7      ;r6 <- (r1-r7)  subtract the least significant 32 bits first, setting the carry (borrow) flag
    SBC r5, r2, r8       ;r5 <- (r2-r8-borrow)  subtract the most significant 32 bits with borrow
                         ;r5:r6 contains the resulting 64-bit value of a*(b+c) - d*e
    ADR r4, z
    STR r6, [r4]         ;little-endian word order: z holds the least significant 32 bits
    ADR r4, z+4          ;next 32-bit word
    STR r5, [r4]         ;z+4 holds the most significant 32 bits

A really good book for ARM ASM programming is
"ARM system developer's guide: designing and optimizing system software [Book] by Andrew N. Sloss, Dominic Symes, Chris Wright in Books".

I found it very interesting in the manner it explains each instruction and how its execution affects the state of the processor. Not the book prescribed in the syllabus though. The book prescribed in the syllabus is "Computers as components: principles of embedded computing system By Wayne Hendrix Wolf ".

From this book I learnt the importance of Formalism in system description and of a Unified Modelling Language for the behavioural and structural descriptions of a system. I will definitely use UML to describe the system in my project.

QEMU and ARM assembly simulation for linux (debian)

I was looking for some assembly language simulators for ARM, preferably open source (free!). After a lot of trial and error, I found that QEMU was the solution I was looking for. Here is a short description of QEMU.

"QEMU is a generic and open source machine emulator and virtualizer. When used as a machine emulator, QEMU can run OSes and programs made for one machine (e.g. an ARM board) on a different machine (e.g. your own PC).

When used as a virtualizer, QEMU achieves near native performances by executing the guest code directly on the host CPU. QEMU supports virtualization when executing under the Xen hypervisor or using the KVM kernel module in Linux. When using KVM, QEMU can virtualize x86, server and embedded PowerPC, and S390 guests."

(source : )

We didn't have a simulation environment set up in our school for running ARM programs, so I decided to get myself one because I knew there were a number of open-source hardware emulators (long live open source). Here I explain how to get QEMU running for ARM simulation.


sudo apt-get install qemu
complains something about kvm

follow the tutorial in this link

now qemu is installed for the processor in your system

Install the standard packages for arm emulation, namely
(search them in your synaptic package manager, they will pop up)

this will set up qemu for arm,

Now all i needed was a good tutorial to check if the setup is working, I found this wonderful amazing site.

follow the link in the tutorial

I ran the hello world arm in assembly and it works!! :)

Setting up ARM Simulation Environment using GNU Tool chain


Note : These are the steps I followed to set up C program development for ARM using the GNU tool chain. I am not using this currently; I am using an ARM emulator called QEMU.


(contains the files that are to be downloaded for simulator)

arm-elf-insight: error while loading shared libraries: cannot open shared object file: No such file or directory

cd /lib

this error is removed after this
1. synaptic package manager
2. search eclipse
3. mark and apply for installation

after it is installed you have to install the eclipse c/c++ development tool chain
In package manager search
eclipse cdt and install


1. Run -> external tools -> external tools configuration

2. Click on the Launch new configuration button

3. Name it "insight"

4. Location is /home/abishek/opt/ARM/bin/arm-elf-insight

5. select the work bench

6. Now go to the common tab of External Tools Configurations and configure it to be displayed in External Tools favorites menu.


setting up arm tools
using C to write ARM tools