ionipti: A blog about tech things. By nachumk (http://www.blogger.com/profile/15062996095552781936)<br />
<br />
<b>SystemVerilog Interfaces - An unpleasant construct</b> (posted 2018-05-22, updated 2018-05-23)<br />
I haven't written in a while, but I just can't hold myself back. I have spent days writing some fancy little test bench code and, because I can't fight my own nature, I just had to try to do it with the most advanced techniques possible, with the most elegant solutions.<br />
<br />
So I tried to use interfaces... and it has proven to be a colossal failure.<br />
<br />
Goals:<br />
1. Use the "correct" construct in each situation<br />
2. Minimal replication, both of signals and of code<br />
3. Use arrays<br />
<br />
Attempt:<br />
I have a couple of modules that I want to test, testing each separately and then together.<br />
I have interconnect between these modules.<br />
I have test bench components that can optionally be used to drive or receive one side of the interconnect. (An interface has signals that go between a DRIVER and a RECEIVER; I have test bench components that can emulate the DRIVER and RECEIVER.)<br />
<br />
An interface that declares a bunch of signals to connect two modules (DRIVER, RECEIVER) should have no clocking blocks. Why? Because clocking blocks are only for test bench logic in the interface, and they would create multiple-driver issues when a module is connected and a clocking block can also drive the same signals. (Workarounds are of course possible, but I'm trying to avoid those.)<br />
<br />
<b>First problem:</b><br />
Cannot declare a clocking block in a modport to make it only exist within the context of a test bench component connecting to the modport (or not exist when connecting to a module). We are stuck creating additional interface components to attach on top:<br />
<br />
I create an interface which declares the signals and the standard modports. Now this interface (SIGNAL_INTF) has no procedural test bench code.<br />
<br />
I then create interfaces called DRIVER_INTF and RECEIVER_INTF where I do not declare any signals - and I just use them to attach to SIGNAL_INTF as a DRIVER or RECEIVER. These blocks get the interface as a port.<br />
<br />
Now I have arrays of interfaces that I want to use in my code. So I write code to index into the interfaces.<br />
<br />
<b>Second problem:</b><br />
<blockquote class="tr_bq">
SIGNAL_INTF intf [NUM] ();<br />
always_comb begin<br />
sum = 0;<br />
for (int i = 0; i < NUM; i++)<br />
sum += <b>intf[i].value</b>;<br />
end</blockquote>
<br />
Nope - this syntax is not allowed. You can't index into an interface array with a non-constant expression, and 'i' is not a constant. Why? Because unlike a regular array, an interface array is not homogeneous: each element is a separate instance that can be configured differently (this has to do with defparams). Reference to discussion:<br />
https://stackoverflow.com/questions/45058765/arrays-of-interface-instances-in-systemverilog-with-parametrized-number-of-eleme<br />
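<br />
One possible workaround, sketched here under the assumption that SIGNAL_INTF carries an 8-bit value member (the width, the module name sum_intf_array, and the values array are made up for illustration): a generate loop indexes the interface array with a genvar, which is a constant, and copies each member into an ordinary array that procedural code can index freely.<br />

```systemverilog
// Hypothetical interface: only the 'value' member is taken from the post.
interface SIGNAL_INTF;
  logic [7:0] value;
endinterface

module sum_intf_array #(parameter int NUM = 4) (output logic [15:0] sum);
  SIGNAL_INTF intf [NUM] ();

  // Ordinary unpacked array: homogeneous, so a variable index is legal.
  logic [7:0] values [NUM];

  // A genvar is an elaboration-time constant, so intf[g] is allowed here.
  for (genvar g = 0; g < NUM; g++) begin : g_copy
    assign values[g] = intf[g].value;
  end

  always_comb begin
    sum = '0;
    for (int i = 0; i < NUM; i++)
      sum += values[i];
  end
endmodule
```

This does replicate the signals once, which cuts against goal 2, but it keeps the procedural loop intact.<br />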
<br />
And of course interfaces are not strongly typed (did I use that terminology wrong?), and it doesn't appear that they can be. For example, the simulation tool I'm using doesn't bother to check modports at all. Remove modports - same result, change interface name to generic 'interface' - same results. Interfaces are connected at elaboration time, but even then, they let you connect anything up. Makes modports useless. Because of this, parameterizing interfaces within a port list is unnecessary - but also not compilable. I wish it was, as I like to use strong typing. It helps me avoid bugs.<br />
<br />
<b>Third problem (not really a problem as modports don't really matter):</b><br />
No syntax support for parametrizable interface on a port list<br />
No way to pass in modport name for interface instance array in an instantiation connection<br />
<br />
Yep, there are workarounds for all of this, but if you are like me, and you wish to code in the most advanced and "proper" way possible, I would avoid using interfaces too much. A little bit might be OK, but they are not well thought out and definitely not language ready.<br />
<br />
<b>Converting a 2 Clock Read->Out FIFO to First Word Fall Through</b> (posted 2016-01-16)<br />
There's a bunch of pages describing how to write a First Word Fall Through FIFO, see http://www.billauer.co.il/reg_fifo.html<br />
http://www.deathbylogic.com/2015/01/vhdl-first-word-fall-through-fifo/<br />
<br />
But neither of these deals with how to write an FWFT FIFO on top of a FIFO that uses Xilinx's output registers. Why is this different?<br />
<br />
In a simple one clock read_enable->data_out scheme, you can pre buffer the first data out, and then use the first read acknowledge to trigger the next read enable. This gives you back to back results out of the FWFT FIFO.<br />
<br />
In a two clock read_enable->data_out scheme, you must wait an additional clock for your data, and that means that an FWFT implementation must perform 2 read enables without waiting for read acknowledges. You run the risk of overwriting the first read data with the second data. This makes it a bit tricky to convert such a FIFO to FWFT.<br />
<br />
And here's my solution:<br />
Use clock enables.<br />
Sounds simple, and it is. Using the clock enables allows you to read data out of the FIFO, and then control the FIFO logic to stop reading. Why not just use the read_enables?<br />
1. Can't stop read enables mid fetch without clock enables. Must stop second read enable from overwriting output data of the first read enable.<br />
2. Timing. My FWFT logic still makes 250 MHz + timing on a Virtex 6 and Virtex 7. (My Vivado project uses default synthesis and implementation options.)<br />
<br />
This is not as trivial as it sounds. You must carefully control both stages of clock enables. My own FIFO implementation can optionally implement arbitrarily wide and deep RAMB18/36E1 blocks to create its FIFO memory. Since I support this, I also make sure to support the full range of clock enables going into the RAMB primitives. Xilinx was very intelligent in the RAMB design, and b/c of their foresight, we get to control each stage of output separately. As long as you can control both stages of data properly, you can implement an FWFT with very little effort.<br />
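<br />
A minimal sketch of how the two clock enables might be generated, assuming a two-stage read path (RAM output latch, then output register) and treating each stage as a pipeline slot with its own valid flag. This is not my actual FIFO code: the module name, port names, and the fifo_empty / rd_ack handshake are all illustrative assumptions.<br />

```systemverilog
// Hypothetical control logic; names are illustrative, not from a real design.
module fwft_ce_ctrl (
  input  logic clk,
  input  logic rst,         // synchronous reset
  input  logic fifo_empty,  // underlying FIFO has nothing left to read
  input  logic rd_ack,      // consumer accepts the word currently on the output
  output logic ram_en,      // stage 1: RAM read (output latch) enable
  output logic reg_ce,      // stage 2: RAM output register clock enable
  output logic fifo_rd,     // advance the FIFO read pointer
  output logic valid        // stage 2 holds a valid word (FWFT "not empty")
);
  logic v1, v2;             // valid flags for stage 1 and stage 2

  // Stage 2 may load when it is empty or its word is being consumed.
  wire s2_ready = !v2 || rd_ack;
  // Stage 1 may load when it is empty or its word is moving on to stage 2.
  wire s1_ready = !v1 || s2_ready;

  assign fifo_rd = s1_ready && !fifo_empty;
  assign ram_en  = fifo_rd;
  assign reg_ce  = s2_ready && v1;
  assign valid   = v2;

  always_ff @(posedge clk) begin
    if (rst) begin
      v1 <= 1'b0;
      v2 <= 1'b0;
    end else begin
      if (s1_ready) v1 <= !fifo_empty;  // stage 1 refilled (or drained)
      if (s2_ready) v2 <= v1;           // stage 2 takes stage 1's word, if any
    end
  end
endmodule
```

With both stages full and no rd_ack, neither enable fires, so the second read can never overwrite the first. With rd_ack asserted every clock, both enables fire every clock and the output stays back to back.<br />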
<br />
Noah<br />
<br />
<b>CentOS 7 / RHEL 7 - XDMCP</b> (posted 2015-06-19)<br />
If you're trying to get XDMCP working on CentOS 7 or RHEL 7, you must switch to lightdm and xfce b/c GDM and Gnome (and I assume KDE) require direct hardware access. I kept getting a black screen when trying to connect. Switching to lightdm resolved this.<br />
<br />
This page explains it best.<br />
https://www.netsarang.com/forum/xmanager/4076/how_to_configure_xmanager_to_centos7<br />
<br />
<b>Synplify and CDC files - Compiler Directives Constraint files</b> (posted 2014-12-26)<br />
Synplify Compiler Directives Constraint (CDC) files:<div>
<br /></div>
<div>
Here are the missing links:</div>
<div>
<br /></div>
<div>
1. To refer to nets you must use the | syntax:</div>
<div>
define_directive {n:my_mod|my_net} {syn_keep} {1}</div>
<div>
<br /></div>
<div>
2. Remember that CDC files are applied at compile time, so there are no hierarchies here. Refer to a net by its module name, not by an instance path.</div>
<div>
Don't do this:</div>
<div>
define_directive {n:my_mod.my_submod_inst|my_net} {syn_keep} {1}</div>
<div>
Do this:</div>
<div>
define_directive {n:my_submod|my_net} {syn_keep} {1}</div>
<div>
<br /></div>
<div>
This is all you really have to know.</div>
<br />
<b>AES (Advanced Encryption Standard)</b> (posted 2013-10-11, updated 2014-12-26)<br />
I'm not gonna teach you how to perform AES encryption or decryption. So if you are looking for that, then you can turn right around and go. But I will tell you some interesting things that I learned about AES. These may help you in conjunction with other tutorials.<br />
<br />
1. Multiply is not multiply. You will need a function to perform gmul. It is multiply in a Galois field. I don't really know much about what a Galois field is, but it is an alternate universe when it comes to mathematics. So when they say multiply, this is what they mean.<br />
<br />
2. Add and subtract are actually XOR. Wherever it says subtract it is the same operation as add. Realize that most steps of AES (key generation, adding round keys, substituting bytes, and mixing columns) rely on Galois operations: multiply and add / subtract. Shifting rows is only a byte permutation.<br />
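<br />
Points 1 and 2 can be sketched as a single function. This is a minimal shift-and-XOR version, assuming the standard AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B); the package name aes_gf_pkg is made up, and a real design would more likely use look-up tables.<br />

```systemverilog
package aes_gf_pkg;
  // Multiply two bytes in GF(2^8), reduced by the AES polynomial 0x11B.
  function automatic logic [7:0] gmul(input logic [7:0] a, input logic [7:0] b);
    logic [7:0] p;
    p = 8'h00;
    for (int i = 0; i < 8; i++) begin
      if (b[0]) p ^= a;                             // Galois "add" is XOR
      a = {a[6:0], 1'b0} ^ (a[7] ? 8'h1B : 8'h00);  // multiply a by x, then reduce
      b >>= 1;
    end
    return p;
  endfunction
endpackage
```

As a sanity check against the worked example in FIPS-197, gmul(8'h57, 8'h13) should return 8'hFE.<br />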
<br />
3. Decryption is harder than encryption. Yes, that sounds weird, but it's true. What I mean is that to perform encryption you just need the key to begin with. You can actually generate the keys on the fly. To perform decryption, you MUST perform full key expansion to get the final key. Then you can work backwards on the fly. Also decryption's inverse mix columns step requires 4 multiply look-up tables as opposed to 2 for encryption's mix columns step.<br />
<br />
4. AES 256 is easier to implement than AES 192. The biggest difficulty is on the fly key generation. If you want to generate AES 128 on the fly, then it is the same sequence for each round of encryption. For 256, the sequence repeats every 2 rounds. For 192, it is different. B/c each round of key expansion produces 192 bits (24 bytes) and each round of encryption uses 16 bytes, you have to loop through 1.5 rounds of encryption before starting a new line of key expansion. Of course AES 256 requires more flops.<br />
<br />
<b>SystemVerilog array of objects initialization</b> (posted 2013-08-22, updated 2013-08-26)<br />
So I'm updating one of my testbenches and I want to create an array of objects. For example:<br />
A_class a_instance[num];<br />
<br />
I also want to pass these objects to modules and other created objects.<br />
<br />
B_class b_instance = new (a_instance[0]);<br />
C_mod c_modinst (.a(a_instance[0]));<br />
<br />
The biggest issue is that a_instance isn't yet initialized.<br />
<br />
If you try and initialize with<br />
<br />
initial begin<br />
for(int i = 0; i &lt; num; i++) a_instance[i] = new();<br />
end<br />
<br />
This won't work. It has something to do with object and module creation running before initial blocks. The initial statement is too late: a null object was already passed in, and that's what the object and modules will have.<br />
<br />
When I was passing a non array, it would work b/c the declaration included an assignment:<br />
A_class a_instance = new();<br />
but you can't call new on an array.<br />
<br />
Here's what appears to work:<br />
A_class a_instance[num] = '{num{A_class::create()}};<br />
<br />
This surprised me as I am using replication. It looks like each instance points to its own object. This resolves the problem. Now on the declaration line, I can initialize the objects. Passing the objects around works well now.<br />
<br />
Update from Idan's comment:<br />
A_class a_instance[num] = '{default:A_class::create()};<br />
<br />
This uses the default syntax for filling in an array. Much nicer than the replication mechanism.<br />
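<br />
If relying on the assignment pattern to call create() once per element feels fragile, another option is to build the whole array in a function and use that as the declaration initializer. This is only a sketch: the module name tb, the NUM parameter, the id field, and make_a_array are made up for illustration.<br />

```systemverilog
module tb;
  localparam int NUM = 4;   // hypothetical array size

  class A_class;
    int id;
  endclass

  typedef A_class a_array_t[NUM];

  // Build the array in a function so each element gets its own new().
  function automatic a_array_t make_a_array();
    a_array_t arr;
    foreach (arr[i]) begin
      arr[i] = new();
      arr[i].id = i;
    end
    return arr;
  endfunction

  // Declaration-time initialization: done before any initial block starts.
  A_class a_instance[NUM] = make_a_array();

  initial begin
    foreach (a_instance[i])
      $display("a_instance[%0d].id = %0d", i, a_instance[i].id);
  end
endmodule
```

The explicit foreach leaves no doubt that every element points to a distinct object.<br />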
<br />
About the create function: SystemVerilog doesn't allow you to call new on a class type so I use a create function instead:<br />
<br />
class a;<br />
function new();<br />
$display("creating a");<br />
endfunction<br />
static function a create();<br />
a a_inst;<br />
a_inst = new();<br />
return a_inst;<br />
endfunction<br />
endclass<br />
<br />
I believe others refer to this as a factory create function or something like that. Now creating a is as easy as calling a::create().<br />
<br />
<b>Installing rdesktop with user privileges (Red Hat EL 5.5)</b> (posted 2013-06-08)<br />
Recently I came across a challenge of installing remote desktop without root privileges. Here is the information. Kudos to: <a href="http://www.nordugrid.org/documents/rpm_for_everybody.html">http://www.nordugrid.org/documents/rpm_for_everybody.html</a> for showing me how to do this.<br />
<br />
In my home directory I performed these steps:<br />
# make rpm database<br />
mkdir rpmdb<br />
rpmdb --initdb --dbpath ~/rpmdb/<br />
# prepare folders<br />
mkdir -p rpmtop/RPMS/i386<br />
mkdir rpmtop/SRPMS<br />
mkdir rpmtop/SOURCES<br />
mkdir rpmtop/BUILD<br />
mkdir rpmtop/SPECS<br />
mkdir rpmtmp<br />
echo '%_dbpath /home/&lt;username&gt;/rpmdb' >> ~/.rpmmacros<br />
echo '%_topdir /home/&lt;username&gt;/rpmtop' >> ~/.rpmmacros<br />
echo '%_tmppath /home/&lt;username&gt;/rpmtmp' >> ~/.rpmmacros<br />
# copy system installed rpm list (must do this to meet dependency requirements of desired apps)<br />
cp /var/lib/rpm/* rpmdb/.<br />
# build rdesktop<br />
rpmbuild --rebuild rdesktop-1.6.0-3.src.rpm<br />
# install rdesktop<br />
rpm -ivh rpmtop/RPMS/x86_64/rdesktop-1.6.0-3.x86_64.rpm<br />
# since rdesktop uses keymaps by default from /usr/local, and since rdesktop isn't installed there, we will cheat by creating a link to 'user' available keymaps<br />
ln -s usr/share/rdesktop/ ~/.rdesktop<br />
<div>
<br /></div>
<div>
This will give you:</div>
<div>
/usr/bin/rdesktop</div>
<div>
which works wonderfully well for connecting to remote Windows systems.</div>
<br />
<b>Xilinx ISE (Project Navigator) x64 (64 bit) on Windows 8</b> (posted 2013-05-18, updated 2013-05-22)<br />
Quick tip for those frustrated by file dialog crashes in Xilinx ISE x64 on Windows 8.<br />
<div>
<br /></div>
<div>
Rename libPortability.dll to libPortability.dll.orig, and copy libPortabilityNOSH.dll to libPortability.dll.<br />
<div>
</div>
<div>
Do this in:</div>
<div>
C:\Xilinx\14.5\ISE_DS\ISE\lib\nt64</div>
<div>
C:\Xilinx\14.5\ISE_DS\common\lib\nt64 (copy dll from first location)</div>
<div>
</div>
<div>
This turns off SmartHeap.</div>
<div>
</div>
<div>
This will fix ISE and iMPACT crashes on file dialogs.</div>
<div>
</div>
<div>
This information came from another thread. Thank you to howardp from Xilinx:</div>
<div>
http://forums.xilinx.com/xlnx/board/crawl_message?board.id=DEENBD&message.id=1732</div>
</div>
<div>
<br /></div>
<div>
This doesn't resolve Vivado or PlanAhead issues. This only helps for ISE and iMPACT on Windows 8 x64.</div>
<br />
<b>SystemVerilog wish list and SV2012</b> (posted 2013-01-04)<br />
So I've read up a bit on the newest SystemVerilog standard, SV 2012. There are a few simple things I like:<br />
You can now call new from another object.<br />
In SV 2009:<br />
class cl_base;<br />
...<br />
endclass<br />
class cl_ext extends cl_base;<br />
...<br />
endclass<br />
<br />
So now I want to instantiate a cl_ext and point to it with a cl_base pointer.<br />
Some people will code this verbosely:<br />
cl_base cl_b_inst;<br />
cl_ext cl_e_inst = new();<br />
cl_b_inst = cl_e_inst;<br />
<br />
I have always resolved this using another method in cl_ext:<br />
static function cl_ext create();<br />
cl_ext t;<br />
t = new();<br />
return t;<br />
endfunction<br />
<br />
This way allows me to do this:<br />
cl_base cl_b_inst = cl_ext::create();<br />
<br />
But now, with SV 2012, you can directly call new:<br />
cl_base cl_b_inst = cl_ext::new();<br />
<br />
Now onto the next improvement that I am excited about: Multiple Inheritance! The new SV 2012 now supports multiple inheritance by using an interface class. Don't know how that works as I haven't used it yet.<br />
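<br />
For reference, a minimal sketch of what an interface class looks like. What it gives you is multiple inheritance of method prototypes, not of implementation. All names here (shapes_pkg, Printable, Resettable, Counter) are made up for illustration.<br />

```systemverilog
package shapes_pkg;
  // Interface classes declare prototypes only: no data, no method bodies.
  interface class Printable;
    pure virtual function void print();
  endclass

  interface class Resettable;
    pure virtual function void reset();
  endclass

  // One class can implement several interface classes at once.
  class Counter implements Printable, Resettable;
    int count;
    virtual function void print();
      $display("count = %0d", count);
    endfunction
    virtual function void reset();
      count = 0;
    endfunction
  endclass
endpackage

module tb_interface_class;
  import shapes_pkg::*;
  Counter    c;
  Printable  p;
  Resettable r;
  initial begin
    c = new();
    p = c;         // the same object, seen through each interface class
    r = c;
    c.count = 5;
    p.print();     // prints "count = 5"
    r.reset();
    p.print();     // prints "count = 0"
  end
endmodule
```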
<br />
Now, onto my wishlist:<br />
Allow constant functions to call system tasks. For example:<br />
localparam blah = $urandom();<br />
That'd help for some of my randomized testing.<br />
<br />
Variable length arguments would be nice, make it easier to create a new display function with added parameters.<br />
<br />
Pass signals directly into a class, but of course... that will never happen. For now you just have to wrap signals in an interface to keep them handy for a class to use.<br />
<br />
Allow multi dimensional arrays with both types and widths:<br />
wire [count - 1:0] int my_integers;<br />
<br />
Allow for seamless multidimensional array flipping:<br />
wire [a_count - 1:0] [b_count - 1:0] wires_a_by_b;<br />
for(int bi = 0; bi < b_count; bi++)<br />
b_reduce[bi] = $flip(wires_a_by_b)[bi];<br />
<br />
or something like that... This might work with a function, but I do believe that SystemVerilog still doesn't support unconstrained types for a function.<br />
<br />
Generate statements in a class:<br />
SV supports parameters in a class, but it won't allow for generate statements in a class. This is both unexpected and annoying. If class parameters are allowed and appear identical to parameters for a module or interface, then they should behave more or less the same!<br />
<br />
Wildcard connections of parameters. It would've helped me today.<br />
<br />
I know there is something I want having to do with clocking blocks... One second, I have to find it...<br />
So I want some indication of when a clocking block updates a signal. See forum post for more information.<br />
<br />
Here's the link to my question:<br />
http://verificationguild.com/modules.php?name=Forums&file=viewtopic&p=20576<br />
<div>
<br /></div>
<br />
Now onto Cadence:<br />
PLEASE allow modports inside generate statements!<br />
<br />
I know there is more, but I can't recall now sitting in front of the TV.<br />
<br />
<b>Xilinx KC705 PCI Express on Ivy Bridge (i7 3rd Gen)</b> (posted 2012-12-16, updated 2012-12-17)<br />
<span style="font-family: inherit;">I am doing work on a KC705 evaluation board from Xilinx. This chip (Kintex 7) uses the 7 Series Integrated Block for PCI Express. I am running this on a Gigabyte GA-Z77X-UP5 TH board. I have run into a brick wall trying to get this Xilinx board to work on this Gigabyte motherboard. Luckily, there is a note from Xilinx about this:</span><br />
<span style="font-family: inherit;">http://www.xilinx.com/support/answers/51135.htm</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">In short, there is a workaround. Check out this page, which tells you to set the <span style="background-color: white;">TX_RXDETECT_REF signal to 3'b011 instead of the default. The Answer Record also explains that this is due to an erratum on Ivy Bridge cores. It points to a web page by Intel and indicates that it is erratum </span><span style="background-color: white;">BV56</span><span style="background-color: white;">.</span></span><br />
<span style="font-family: inherit;">http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/3rd-gen-core-desktop-specification-update.pdf</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="background-color: white; font-family: inherit;">I don't understand the erratum, but I can attest to the fact that this fixes the problem. The KC705 is now successfully detected on this motherboard.</span><br />
<span style="background-color: white; font-family: inherit;"><br /></span>
UPDATE: If you take a look at<br />
http://www.xilinx.com/support/documentation/boards_and_kits/kc705_PCIe_pdf_xtp197_14.2.pdf<br />
you can see that by setting the bitstream configuration options and adding an EMCCLK you can achieve a PCIe-compliant FPGA load time. This method doesn't need a soft-reboot to properly enumerate the bus.<br />
<br />
See note above, kept for accuracy:<br />
Even with this success, there are still some failures... For example, the load time for the FPGA configuration is too long. This means that I have to do a soft-reboot after a hard reboot to get the PCIe link to work. Xilinx has created a method to solve this, but I have yet to figure it out. It is called Tandem PROM and Tandem PCIe. It is supposed to quickly load the PCIe portion, negotiate the link, and then load the rest.<br />
<br />
For now, I'm just doing a soft-reboot (ctrl-alt-del).<br />
<br />
<b>finding combinational loops in ncsim</b> (posted 2012-11-11)<br />
There may be better ways of doing this... but here was my way:<br />
<br />
You are running gate level or rtl simulations, and the simulation gets stuck. Yup, what do you do?<br />
<br />
Simple answer is not to have combinational loops. If you do, perhaps you've done something wrong. If you must have combinational loops, or you are debugging someone else's code, then here's the way I found the loop path. This is especially useful when you are unfamiliar with the code, and the code spans many files and many processes.<br />
<br />
Use NCSIM's built in "Create Force" option. I just debugged a combinational loop, and this worked like clockwork. I could follow the whole loop, and figure out which branches were being taken by selectively forcing signals. When a force caused the simulation to continue, that signal is part of the loop. If the force had no effect, that signal is not part of the loop. This worked well for me as it didn't take too long to get the simulator stuck. If it takes a long time before it gets stuck this may not work well as it requires continuously restarting the simulation to test each sequential branch.<br />
<br />
<b>VS 2005 convert to VS 2010 - property pages</b> (posted 2012-10-14, updated 2012-12-16)<br />
A client sent me their Visual Studio 2005 project. I have Visual Studio 2010 Express. I opened the project, and it offered to convert it. It failed miserably. First thing is, it couldn't handle the x64 configurations. So I opened the VS 2005 vcproj file, and I removed the x64 configurations. Yay!<br />
<br />
It now opens the project, (reporting that conversion passed without errors). Except that it is lying. The property pages have converted as pretty much empty files. It has the XML header, but all of the property pages are empty.<br />
<br />
Long story short, go to the command line, and use vcupgrade on the original vcproj file. One problem I noticed was that SolutionDir wasn't available, as you can't run vcupgrade on an sln file. So in the command prompt, type:<br />
<br />
set SolutionDir=(path of the project)<br />
<br />
Then run vcupgrade on the vcproj file. Make sure the path is a fixed path and not relative to the vcproj file.<br />
<br />
That does it, the property pages get converted properly!<br />
<br />
<b>Xilinx Zynq - harder to implement than one would hope</b> (posted 2012-09-01)<br />
<div><p>So I get a new Zynq ZC702 development board, and I'm tasked with making it work. I got the new ISE 14.2 development tools from Xilinx. Here are some things you might want to know if you encounter the same board.</p>
<p>The FPGA logic is the PL (Programmable Logic), and the A9 subsystem is the PS (Processing System). Don't attempt to use iMPACT to burn anything! The burning is all done through an application called zynq_flash.exe.</p>
<p>Do not use Project Navigator to build this FPGA bitstream! Even if you wanted to, which I did, you will find that ngdbuild crashes most of the time (when using the embedded A9) in this flow. Use PlanAhead. It is the new recommended flow from Xilinx. It's also a surprisingly good flow.</p>
<p>The flow I'm using is:<br>
Create a project in XPS. Set up everything that you need for interfacing to the PL there.<br>
Export the project for the SDK, it's an option in XPS. Don't generate the bitstream. Don't launch the SDK.<br>
Create a new project in PlanAhead. Import the .xmp project that you created in XPS. Right click on the embedded processor to create a stub file. Add your source files, and build the rest of the project there.<br>
Generate a bitstream.<br>
Open SDK. Create a new hardware platform specification. Browse for the .XML file under the XPS project in the SDK export folder. Choose the bitstream that was built by PlanAhead. If you have a .bmm file, then use it.<br>
Create a new BSP based on this platform.<br>
Create a new C project and choose Zynq FSBL. You need the FSBL (First Stage Bootloader) to burn the FPGA. It is also needed by the A9 to boot up. Select the platform that you created for the FSBL.<br>
Create a project for your application based on the platform you created.</p>
<p>To burn, right click on the application, and choose to create image. It will automatically add the FSBL, bitstream, and application. To burn the new .mcs file, use the Xilinx tools menu, and burn flash. This uses the zynq_flash application.</p>
<p>Some caveats, I have seen zynq_flash claim to burn even when it doesn't. If the erase step doesn't take at least a few seconds (more like 30+), then it's not working. I have never seen the verify option work at all. If it isn't burning, then turn off the board for a few minutes, and try again.</p>
<p>That's about it.</p>
</div>
<br />
<b>Libero SOC and IO constraints</b> (posted 2012-09-01, updated 2012-11-17)<br />
<div>
A rule that I have always kept to is : Keep design source files as text! A while back I had the misfortune of using ATEasy. Back then all ATEasy projects were binary, and the sources were stored in the project. This made it very hard to compare versions or to store the sources in revision control. This was bad back then. Even then, ATEasy became aware of the problem and added an option to store project files as text.<br />
Getting back to Microsemi / Actel... The newest version of the tool no longer allows you to use a PDC file to store IO constraints. You must maintain the constraints from within Designer. This is unpleasant to say the least. The constraints are held in the Designer project file. The project file is not only binary, but it is updated on every run.<br />
You can of course import a PDC file, but don't try and change the IOs around without first opening the tool to unassign all IOs from the Designer database. Otherwise you'll get errors about conflicting IO assignments.<br />
Please fix this!</div>
<br />
<b>SystemVerilog variable argument display (printf)</b> (posted 2012-08-03)<br />
Ah, the joy of perfection.<br />
<div>
<br /></div>
<div>
Any C / C++ programmers now programming in SystemVerilog must feel pretty constrained by language limitations. One of the frustrations that I felt today was the lack of variable argument (varargs) support. I wanted to be able to overload a print function (i.e. write my own $display) and selectively print or not print based on debug levels or the like. Actually... what I wanted was to store all the debug prints between runs, and only print them in the case of an error, but that's beside the point. I needed a way to overload the $display function.</div>
<div>
<br /></div>
<div>
Here is what I've finally come up with: (be ready, it's wild)</div>
<div>
<br /></div>
<div>
$display("Print with no arguments");</div>
<div>
$display("Print with 1 argument %t", $time());
</div>
<div>
$display("Print with 2 arguments %t, %d", $time(), 5);
</div>
<div>
$display("Print with 3 arguments %t, %d, %s", $time(), 5, "debug");
</div>
<div>
etc...</div>
<div>
<br /></div>
<div>
I wanted this to become:</div>
<div>
<div>
DEBUG_PRINT("Print with no arguments");</div>
<div>
DEBUG_PRINT("Print with 1 argument %t", $time());</div>
<div>
DEBUG_PRINT("Print with 2 arguments %t, %d", $time(), 5);</div>
<div>
DEBUG_PRINT("Print with 3 arguments %t, %d, %s", $time(), 5, "debug");</div>
<div>
etc...</div>
</div>
<div>
Is it possible??</div>
<div>
<br /></div>
<div>
Yes! (at least until someone points out a flaw)</div>
<div>
<br /></div>
<div>
The solution is based on macros that can have variable number of arguments, and the preprocessor allowing ifdef on substituted strings</div>
<div>
<br /></div>
<div>
<div>
function void my_debug(string s);</div>
<div>
...</div>
<div>
endfunction</div>
</div>
<div>
<br /></div>
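<div>
The body of my_debug is elided above. One way it might look, as a sketch of the "store the prints and only show them on an error" idea: the package name debug_pkg, the debug_log queue, and dump_debug are all made up for illustration.</div>

```systemverilog
package debug_pkg;
  string debug_log[$];   // queue of stored debug messages

  // Store the already-formatted message instead of printing it.
  function automatic void my_debug(string s);
    debug_log.push_back(s);
  endfunction

  // Call this from the error path to print everything stored so far.
  function automatic void dump_debug();
    foreach (debug_log[i])
      $display("%s", debug_log[i]);
    debug_log.delete();  // empty the queue after dumping
  endfunction
endpackage
```

<div>
Because the macro formats each message into a single string before calling my_debug, the function itself never has to deal with a variable number of arguments.</div>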
<div>
<div>
`define DELIM</div>
<div>
`define DEBUG_PRINT(p0, p1=ELIM, p2=ELIM, p3=ELIM, p4=ELIM, p5=ELIM) \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`ifdef D``p1 \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>my_debug($psprintf(p0)); \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`else \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`ifdef D``p2 \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>my_debug($psprintf(p0, p1)); \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`else \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`ifdef D``p3 \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>my_debug($psprintf(p0, p1, p2)); \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`else \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`ifdef D``p4 \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>my_debug($psprintf(p0, p1, p2, p3)); \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`else \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`ifdef D``p5 \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>my_debug($psprintf(p0, p1, p2, p3, p4)); \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`else \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>my_debug($psprintf(p0, p1, p2, p3, p4, p5)); \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`endif \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`endif \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`endif \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`endif \</div>
<div>
<span class="Apple-tab-span" style="white-space: pre;"> </span>`endif</div>
<div>
<br /></div>
<div>
This solution works on Cadence Incisive 12.1</div>
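<div>
<br /></div>
<div>
A usage sketch, under a few assumptions that are not confirmed above: the argument-count detection pastes D onto the ELIM default, so a macro named DELIM has to be defined somewhere; my_debug here is a stand-in that just calls $display; and the arguments are kept to plain identifiers because each one gets pasted into an `ifdef name. It relies on the DEBUG_PRINT definition above being compiled first.</div>
<div>
<pre>
// Assumed marker macro: makes `ifdef D``ELIM true for omitted arguments.
`define DELIM

module debug_print_demo;  // hypothetical module name
  // Stand-in for my_debug; the real implementation is not shown here.
  function void my_debug(string msg);
    $display("[DEBUG] %s", msg);
  endfunction

  int count = 3;
  int limit = 8;

  initial begin
    `DEBUG_PRINT("starting test")                      // only p0 given
    `DEBUG_PRINT("count=%0d", count)                   // p0 and p1 given
    `DEBUG_PRINT("count=%0d limit=%0d", count, limit)  // p0, p1 and p2 given
  end
endmodule
</pre></div>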
<div>
<br /></div>
</div>
<div>
Tell me what you think of this solution,</div>
<div>
Nachum</div>
<div>
<br /></div>nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com1tag:blogger.com,1999:blog-5244802588533844551.post-88151445758634329292012-07-07T17:50:00.001-07:002012-07-07T17:50:11.767-07:00ncsim and ReInvokeIf you are a chip designer, and you use Cadence's tools, you might get some advantage by reading this post. Cadence Incisive is a complete package for simulating digital designs (perhaps more than just digital...). Incisive includes such tools as ncvlog, ncvhdl, ncelab, ncupdate, ncsim, and others. ncvlog and ncvhdl compile Verilog/SV and VHDL respectively. ncelab elaborates the design from the top level module. ncupdate re-compiles all changed source files in an elaborated design and re-elaborates. ncsim loads an elaborated design, and can display it with the SimVision GUI.<br />
<br />
<span style="background-color: white;">A sorely underused feature is SimVision's ReInvoke. Using ReInvoke from the simulation menu will cause SimVision to call ncupdate and reload the elaborated design. What is most pleasant about re-invoking the design is that most of the GUI stuff is saved and restored once the re-invoke is done. This includes signals in waveform windows, and I assume it includes watch variables and breakpoints, etc...</span><br />
<span style="background-color: white;"><br /></span><br />
The problems with ReInvoke revolve around ncupdate. ncupdate does not properly update source files when packages are involved. This includes VHDL or SystemVerilog packages. Cadence has already informed me that ncupdate will NOT be fixed when it comes to updating SystemVerilog files (and I assume it will not fix VHDL package updates either). This has forced me to look for a way around the ncupdate from ReInvoke issue.<br />
<br />
My first attempt to resolve this was to perform a re-compilation and re-elaboration outside of SimVision, and then call ReInvoke. This DOES NOT work. The reason is that SimVision will call ncupdate, and ncupdate will once again check which files have been updated (assume the packages), re-compile the packages (again), and re-elaborate. The re-elaboration will fail since the packages have now been compiled after the modules that use them!<br />
<br />
To resolve this issue you can pass ncsim the "update" option together with the "nosource" option. This means that on ReInvoke, SimVision will NOT check timestamps or re-compile any source files. ReInvoke will now only re-elaborate.<br />
<br />
Using these options, you can recompile outside of SimVision, and let ReInvoke perform the re-elaboration. Voila! No more errors.<br />
<br />
Side note: ReInvoke adds the "update" option automatically even if you didn't call ncsim with the "update" option to begin with. But ncsim will not allow you to pass the "nosource" option without the "update" option. Make sure to add the "update" option on the call to ncsim to avoid any issues.nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0tag:blogger.com,1999:blog-5244802588533844551.post-31300094759489936062012-02-04T20:20:00.000-08:002012-06-27T22:23:46.656-07:00SystemVerilog Classes and ParametersI recently attempted to use an SV class with parameters. Simple parameters seem to work fine. You can templatize a class based on type or size. But that's about all you can do.<br />
<br />
The limitations of how parameters work within a class are disappointing. As you may be aware, a parameter in a module or interface gives you the flexibility of generating different code for different parameter values. But, as another example of the lack of uniformity throughout SystemVerilog, they chose not to allow generate statements in a class. So the parameters look the same, but they can't be used the same way.<br />
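<br />
A minimal sketch of what does work, with made-up names (bounded_queue, class_param_demo): a class templatized on type and size. The closest thing to a generate-if inside a class is an ordinary procedural if on the parameter, and unlike a generate branch, both arms have to compile for every parameter value.<br />
<pre>
// Hypothetical example class: parameterized on element type and depth.
class bounded_queue #(type T = int, int DEPTH = 8);
  T items[$];

  // Returns 1 if the item was stored, 0 if the queue is already full.
  function bit push(T item);
    if (items.size() >= DEPTH) return 0;
    items.push_back(item);
    return 1;
  endfunction

  // No generate-if is allowed here, so this is a plain procedural if
  // on the parameter. Both arms must be legal for every DEPTH.
  function string kind();
    if (DEPTH > 16) return "deep";
    else            return "shallow";
  endfunction
endclass

module class_param_demo;
  bounded_queue #(byte, 2) small_q = new();
  bounded_queue #(int, 32) big_q   = new();

  initial begin
    void'(small_q.push(8'h11));
    void'(big_q.push(1234));
    $display("small_q is %s, big_q is %s", small_q.kind(), big_q.kind());
  end
endmodule
</pre>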
<br />
Another disappointment.nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com1tag:blogger.com,1999:blog-5244802588533844551.post-8535966756860605052012-01-21T11:20:00.000-08:002012-01-21T14:10:32.352-08:00SystemVerilog, Cadence, Exporting a module taskI have recently been writing a lot of verification code with Cadence's tools. I use Incisive for RTL simulations. After using the tools for a while there are definitely limitations that are frustrating. But overall, the tools are very stable and support a large portion of SV, and support it really well. Kudos to Cadence for giving decent error messages on most occasions. That is really a huge plus even if the error is SV legal, but not yet supported.<br />
<br />
Tip for calling tasks from a module using SV with Cadence's Incisive:<br />
There are 2 ways, without using Verilog standard probing, to access tasks from within a module.<br />
1. Create an interface. Declare a modport for the module with an exported task. Let the module define the exported task.<br />
Cadence doesn't support this yet.<br />
2. Create a class with the desired task declaration. Extend that class from within the module. Output an instance of the extended class for use elsewhere.<br />
This method works fine with Cadence's tools.<br />
<br />
Most people are satisfied with probing. In a small testbench, probing works OK. But when creating a large testbench with many identical modules (or modules with similar functionality), it is much nicer to have an array of class instances to work with. They lend themselves to more modular test designs.<br />
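<br />
The second method can be sketched roughly as follows. All names here (api_pkg, dut_api, dut_wrapper, write_reg) are invented for illustration, and collecting the handles in a static queue is just one assumed way of "outputting" the instance; it is not necessarily how it was done in my test bench.<br />
<pre>
package api_pkg;
  // Base class holding only the task declaration.
  virtual class dut_api;
    static dut_api registry[$];  // one handle per module instance
    pure virtual task write_reg(input int addr, input int data);
  endclass
endpackage

module dut_wrapper (input logic clk);
  import api_pkg::*;

  int regs [int];  // stand-in for real module state

  // The module's own task that the test bench wants to call.
  task automatic write_reg_impl(input int addr, input int data);
    @(posedge clk);
    regs[addr] = data;
  endtask

  // Extend the base class from within the module, so the method
  // can reach the module's task directly.
  class dut_api_impl extends dut_api;
    virtual task write_reg(input int addr, input int data);
      write_reg_impl(addr, data);
    endtask
  endclass

  dut_api_impl api = new();
  initial dut_api::registry.push_back(api);
endmodule

module tb;
  logic clk = 0;
  always #5 clk = ~clk;

  dut_wrapper u0 (.clk(clk));
  dut_wrapper u1 (.clk(clk));

  initial begin
    #1;  // let every instance register its handle first
    foreach (api_pkg::dut_api::registry[i])
      api_pkg::dut_api::registry[i].write_reg(i, 32'hA5);
    $finish;
  end
endmodule
</pre>
The test bench only ever sees the base class type, which is what makes an array of otherwise identical (or merely similar) modules easy to loop over.<br />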
<br />
About: System Verilog, Cadence Incisive, ncsim, ncvlog, export task, extends, exported tasks, extended classes, inherited classes, inheritance<br />
<br />nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com1tag:blogger.com,1999:blog-5244802588533844551.post-38158641549725867322011-12-31T12:07:00.000-08:002011-12-31T12:07:35.855-08:00Rant about SystemVerilogWhy does SystemVerilog need so many constructs? There is too much overlap.<br />
<br />
An interface could almost be a module if you don't need any module instantiations inside. A task or function can exist in classes, interfaces, or modules. Which should they exist in? Call a task from a module or use an interface to control the IOs of the interface? Output an inherited class from the module and call the class' tasks? Use a typed or untyped mailbox? How about using nested classes? Should they be nested or just in a hidden package? Classes in packages or classes at the top-level with include files? What sounds right to you?nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0tag:blogger.com,1999:blog-5244802588533844551.post-52294185768818440412011-10-14T09:10:00.000-07:002011-10-14T09:10:20.555-07:00Remote desktop & tscontscon is the most useful little command that I only recently learned about. It allows you to switch a remote desktop session back to the local console.<br />
<br />
I use this to log onto my computer remotely and then switch it back to console when I'm done.<br />
<br />
I also used this to get around a FlexLM requirement that wouldn't let me open an application while I was logged on remotely. I wrote a short script that switched back to console and then opened the application. Once I logged back on remotely, the application was open.<br />
<br />
Anyhow, use this:<br />
tscon rdp-tcp#0 /dest:console<br />
<br />
If this doesn't work, use:<br />
query session<br />
and then use the rdp session as listed there.<br />
<br />
This will switch the desktop back to the console.nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0tag:blogger.com,1999:blog-5244802588533844551.post-54722122839443022852011-08-31T21:27:00.000-07:002011-08-31T21:27:04.165-07:00Actel and Libero IDEI have been using the SmartFusion FPGA for a while now. The advantage is a built-in Cortex M3. The disadvantage is Actel and their tools.<br />
<br />
Actel has violated a rule of design: they over-engineered Libero IDE. Libero attempts to automate constraints. Herein lies one class of problems.<br />
<br />
Libero determines all of your clock constraints from the MSS setup. Those constraints are then hard wired into the Designer project. There is no pdc file to edit for the clock constraints. These constraints can't be edited using Designer's constraint editor because the automatic constraints are read-only.<br />
<br />
Why is this bad?<br />
<br />
Libero has a bug where it sets the constraint incorrectly if you set the divider of GLx to a non power of 2. Divide GLA by 3 and the constraint is set as if it was divided by 2.<br />
<br />
In general, over-engineering complicates both users' and developers' lives. I am referring to customers as the users and the Actel-employed programmers as the developers. Over-engineering creates confusion, as nothing is intuitive anymore. It also opens the door to many more bugs.<br />
<br />
Xilinx is an example of a high quality company that doesn't over-engineer (from what I recall). Their tools are not great, but they leave all of the configuration in your hands. Even if they tried to automate something, they still let you override whatever it is.<br />
<br />
Another frustrating problem is how Libero splits constraints into 2 places. I recently configured an MSS I/O to route to the FPGA. I did this because I needed Schmitt Triggers. What is overly confusing is how the default constraints file located under component/work/&lt;top_entity_name&gt;/&lt;top_entity_name&gt;.pdc is no longer the file for these constraints. What is even more strange is that if you edit that pdc file in the I/O Attribute Editor, you see the MSS I/Os and they offer to enable Schmitt Triggers. Why is that bad? Because they don't work! The moment you attempt to compile in Designer, you get a strange error message telling you how you can't modify these IO settings!<br />
<br />
The solution? Under the MSS configurator there is a middle tab for I/O Attributes. Use that tab. That's not all folks! Now you must add a new pdc file to the Designer project. I imagine that if you hit the Designer button in Libero it will add the new pdc file to the Designer project, but as I rarely hit that button, b/c Designer is already open, I can't say whether this is true. But not only must you add this file, you must also remove any reference to the incorrectly placed constraints from the original pdc file. The MSS I/Os must only appear under the MSS pdc file.<br />
<br />
Folks at Actel seem to have forgotten what simple and intuitive means. They should refresh their memory.nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0tag:blogger.com,1999:blog-5244802588533844551.post-80807548851930437082011-06-14T21:47:00.000-07:002011-06-14T21:47:02.218-07:00More on metastabilityI spoke with the layout guy today, and I've once again had to re-evaluate my understanding. Seems like gates don't propagate metastability. Since gates are railed to VCC and GND, and they don't have internal feedback like flip-flops, metastability is unlikely to propagate through them. This includes regular gates, and buffers, and I'm sure other stuff too.<br />
<br />
This means that metastable oscillation is pretty unlikely. Unless of course the output of one flip flop is tied back to its input without going through buffers, or gates.<br />
<br />
I guess the only likely issue with metastable flops is the uncertainty of the output. Oh well. I'd still be careful.nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0tag:blogger.com,1999:blog-5244802588533844551.post-10218498531343166082011-06-12T13:37:00.000-07:002011-06-12T14:32:43.068-07:00Metastability - what logic and research has led me to believeA long time ago I began learning about hardware design. This was at a time when I was primarily doing software development. Without formal training, I was relying on the explanations of others, logic, and experimentation. It has been years, and I've advanced quite a ways.<br />
<div><br />
</div><div>When I started designing, the ideas of synchronization were explained to me. Always double or triple sample signals from external sources, or from other clock domains. As it was explained to me, each flop reduced the chances of metastability by a very large factor. 2 samples were considered enough to lower the chances of metastability to almost nothing. 3 flops were even better.</div><div><br />
</div><div>As I got more experience, I got used to only using 2 flops to move between clock domains, or to bring an external signal in. But what was really going on?</div><div><br />
</div><div>There are 2 reasons for synchronization flops.</div><div><br />
</div><div>1. Metastable oscillation. This requires an explanation of metastability.</div><div><br />
</div><div>Metastability is a state whereby a flip-flop will sample an input signal on a clock edge, but the input signal is in a state of transition. This is a common situation when sampling from another clock domain or from an external signal. When a flip-flop samples an input signal that is not stable high or stable low, the flip-flop's output is undefined. Suffice it to say that an undefined output means that the output does not rise or fall as would be expected. The output wavers and takes a while to settle in one state or another.</div><div><br />
</div><div>A metastable flop will settle after a while. The assumed time it takes for a metastable flop to settle is a complete clock period. While a flip-flop is metastable, any flop sampling the metastable flop's output could also go metastable.</div><div><br />
</div><div>A second synchronization flop guarantees a full clock period of time for the output of the first metastable synchronization flop to settle and arrive at the second synchronization flop. As a rule of synchronization, there can be no combinatorial logic between the first and second flops.</div><div><br />
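</div><div>As a minimal sketch of that rule (module and signal names are made up for illustration), a two-flop synchronizer is just two back-to-back registers with nothing in between:</div><div><br />
</div><div><pre>
// Hypothetical two-flop synchronizer.
module sync_2ff (
  input  logic clk,
  input  logic rst_n,
  input  logic async_in,  // from another clock domain or an external pin
  output logic sync_out
);
  logic meta;  // first stage: the flop that may go metastable

  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      meta     <= 1'b0;
      sync_out <= 1'b0;
    end else begin
      meta     <= async_in;  // samples the asynchronous signal
      sync_out <= meta;      // direct flop-to-flop path, no logic between
    end
  end
endmodule
</pre></div><div>The first stage gets a full clock period to settle before the second stage samples it, and only sync_out should be used by the rest of the design.</div><div><br />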
</div><div>Metastable oscillations can occur if there is a path whereby metastability can loop around. This can occur with a single flop, or with a sequence of flops. The simplest example of metastable oscillation involves just one flop. A single flop whose input comes from a mux that has a select signal to choose between an external signal and the flop's own output is prone to metastable oscillation. The flop can be used to sample an input signal, and then hold that value using the select of the input mux. But since the output of this flop is driven back to its input through a mux, there may not be enough time for the output to settle before it is sampled back into itself.</div><div><br />
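</div><div>For reference, the single-flop sample-and-hold described above looks roughly like this in RTL (a sketch with invented names, shown only to make the feedback path visible):</div><div><br />
</div><div><pre>
// Hypothetical sample-and-hold flop with a feedback mux.
module sample_hold_feedback (
  input  logic clk,
  input  logic sel,     // 1: sample ext_in, 0: hold the current value
  input  logic ext_in,  // asynchronous external signal
  output logic q
);
  always_ff @(posedge clk)
    q <= sel ? ext_in : q;  // when sel is 0, q loops back through the mux
endmodule
</pre></div><div>If q is still unsettled from sampling ext_in on the previous edge, the hold path feeds that unsettled value straight back into the same flop.</div><div><br />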
</div><div>2. Indeterminate input. When an output from another clock domain or an external signal is input into multiple flops at the same time, there is no guarantee that each flop will settle with the same input value. There are also the risks of metastability propagating through the sampling flops.</div>nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0tag:blogger.com,1999:blog-5244802588533844551.post-70535464619553929192011-06-01T21:17:00.000-07:002011-06-01T21:17:11.970-07:00Can you guess why multicycle constraints are problematic?As I have been suffering with synthesis tools that don't analyze clock domains during synthesis, I realized a new aspect to this problem. Have you ever designed a module that uses a multicycle path? Well it can cause you real headaches if you're not careful. Imagine this code:<br />
<br />
<br />
entity demo_module is<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>port (<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span> clk : in std_logic;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span> rstn : in std_logic;<br />
<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span> sel : in std_logic;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span> inputa : in std_logic_vector(7 downto 0);<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span> inputb : in std_logic_vector(7 downto 0);<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span> output : out std_logic_vector(7 downto 0)<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span> );<br />
end entity;<br />
<br />
architecture rtl of demo_module is<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>signal sel_input : std_logic_vector(7 downto 0);<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>signal input_d : std_logic_vector(7 downto 0);<br />
begin<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>process(sel, inputa, inputb)<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>begin<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>if(sel = '0') then<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>sel_input <= inputa;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>else<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>sel_input <= inputb;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>end if;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>end process;<br />
<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>process(clk, rstn)<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>begin<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>if(rstn = '0') then<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>input_d <= (others => '0');<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>output <= (others => '0');<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>elsif(clk'event and clk = '1') then<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>input_d <= sel_input;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>output <= sel_input xor input_d;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>end if;<br />
<span class="Apple-tab-span" style="white-space: pre;"> </span>end process;<br />
end rtl;<br />
<br />
I'm doing nothing complex here. Using sel to select which of the inputs I'm gonna sample and then output the xor of.<br />
<br />
Now let's say that it takes 2 cycles to change inputa or inputb, and I make sure that sel only switches every other cycle. This should be valid. Here's where the problem with synthesis tools comes into play. Synthesis tools can use any method they choose for implementing the sel mux. They do not have to use an actual mux. This could be done using a combination of gates / complex gates. What this means is that even when sel is stable on inputa, a change on inputb can cause a glitch (and vice versa). This is unimportant when inputa and inputb are guaranteed to change within a clock, but it is of vital importance when you want to use a multicycle constraint. The fact that inputa and inputb take more than one cycle to change means that at the clock edge when sel doesn't change (and sel is STABLE), inputa / inputb can be changing and thereby cause a glitch on the output of the muxing process (when done without a mux). This will then cause an incorrect value to be sampled into input_d.<br />
<br />
Beware!nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0tag:blogger.com,1999:blog-5244802588533844551.post-26528460986791570272011-05-26T23:18:00.000-07:002011-05-26T23:19:57.378-07:00Another metastability mistakeSo I was analyzing my design today and I happened upon another metastability mistake. I built this circuit:<br />
<div class="separator" style="clear: both; text-align: center;"><a href="http://3.bp.blogspot.com/-v5sf1yDxF2k/Td89zGtzZbI/AAAAAAAAAD4/nDgpl2m2wh4/s1600/possibly+metastable.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="158" src="http://3.bp.blogspot.com/-v5sf1yDxF2k/Td89zGtzZbI/AAAAAAAAAD4/nDgpl2m2wh4/s320/possibly+metastable.jpg" width="320" /></a></div>I wanted to synchronize an external signal, but only when enabled. There was a good reason to do this, but I didn't want to risk metastability. I figured that the second flop solved the issue. What I didn't take into account was the feedback path of the first flop when the enable was low. It was theoretically possible that the first flop would oscillate metastable, even if only for 1 additional clock. Under most conditions you would assume that the delay of a single mux is minor compared to the full period of the clock, and therefore it wouldn't be a problem. But it's not just an additional mux, it is also two wire paths instead of 1. That means that whatever max wire path distance the layout tools allow is now effectively doubled, as there is a path from the q to the mux input, and from the mux output to the d. A nice solution was replacing the mux and enable with a clock gate enabled by the enable. At least this way you don't have to worry about the feedback path.<br />
Be sure to balance the clocks!nachumkhttp://www.blogger.com/profile/15062996095552781936noreply@blogger.com0