As described here, I did some FPGA rework of a PCI-SCI bridge developed by CERN. The primary reason was to evaluate to what extent such reconfigurable communication hardware could be used to improve parallel computing tasks in the field of cluster computing.
While working with the CERN card and becoming more and more familiar with the general topic, I ran into several oddities that made this card somewhat difficult to use for the intended purpose, even with some FPGA redesign. It should be noted that the CERN PCI-SCI bridge was never intended for use as a high-speed cluster communication network. It was intended simply for quickly transferring data from A to B.
By the time I worked with the CERN PCI-SCI bridge, commercial PCI-SCI hardware had become available from Dolphin. These cards also had several architectural features that I did not like. So I had some ideas on how to improve such a PCI-SCI design and make it more suitable for cluster computing. These ideas could not be realized by just taking the CERN design and changing the FPGA contents. Additionally, that piece of hardware was already quite old, and better and faster FPGAs were available.
Hence, the logical conclusion was to design a completely new piece of hardware. The goal of this design was never to beat commercially available hardware in terms of raw bandwidth or latency. That is almost impossible when competing with FPGAs against ASICs in the same context. The goal was rather to provide a proof of concept that some interesting features could be embedded into a PCI-SCI framework. These features would significantly improve handling and make things possible that were not possible before. As a result, the overall system throughput (when everything is playing together) would be increased.
So I did an in-depth analysis of what should be inside such a new PCI-SCI bridge and how it should be structured logically. This work became my Diploma Thesis, which completed my computer science studies. It is available for download in the publications section.
My professor at the university (Wolfgang Rehm) asked me whether I wanted to stay there after my studies, and I took the opportunity. I continued to work on this bridge design and finally realized it in hardware. This was my first rather complex board design (10 layers, several 64-bit buses), and I was quite happy that the first board revision worked almost flawlessly.
In fact, the board was developed before the design of the FPGAs was complete. That is another interesting aspect of FPGAs: one can design incrementally and add function by function.
Over time, several features were implemented and big plans were made. Two colleagues of mine (Friedrich Seifert and Daniel Balkanski) worked on the Linux driver and general software infrastructure, while other colleagues (Carsten Dinkelmann and Sven Schindler) worked on an MPI (Message Passing Interface) library that could make use of the special hardware features.
Unfortunately, however, our chair at the university shifted to a completely different topic. As a result, all work on the hardware as well as the software layers was stopped. At that time, around 75% of the planned hardware functionality had been implemented.
Several papers on this project are available in my publications section. I am also keeping the old hardware-related project pages on this web server; they are available here. Please note that they might contain some dead links.
Although this project was never finished, some of its principles have found their way into commercially available hardware. So this work was not completely useless. In any case, I learned many, many lessons.