Simulation is playing an increasingly important role in the design and evaluation of current and future high performance computer systems. Sparc-Sulima is an open-source UltraSPARC SMP simulator, written in C++ and developed by the CC-NUMA Project. Sparc-Sulima simulates an UltraSPARC processor by interpreting each instruction. It has timing models which predict an UltraSPARC IIICu shared memory processor performance to a high degree of accuracy.
Currently, Sparc-Sulima can simulate an unmodified Solaris (UltraSPARC) binary (executable program). When that program executes a system call trap instruction, the corresponding operating system effects are `emulated' (the simulated computer state is updated directly, rather than via simulating a lengthy and complex sequence of kernel-level instructions), and simulated execution then resumes. This mechanism is handled by the SolEmn emulation layer in Sparc-Sulima. Threaded programs (e.g. using the pthreads Posix thread libraries) ultimately access thread-related operating system services via the Solaris _lwp (lightweight process) trap instructions. The limitation of the current implementation is that, at any time, there can only be one active thread per CPU. Such a limitation has been in all emulation-based computer simulators in the literature also.
The aim of this project is to extend the emulation of the _lwp trap instructions to remove this limitation. It will require implementing thread switching, which can occur by time-outs, and also by the _lwp_suspend and yield system calls. The current thread context save/restore mechanisms will also have to be extended. If successful, the work should be publishable.
An extension of this project, more implementation oriented, would be to simulate, rather then emulate the _lwp traps. This would involve writing a mini-kernel that would perform an implementation of lightweight threads. Now, when Sparc-Sulima executes a threaded program and encounters an _lwp trap, it will starts executing the kernel, just like on a real machine. Note that the same underlying algorithms of the emulation mode can be used; however, some considerations must now be given to the `locking' of shared data structures. The advantage of this is that kernel-level (notably memory system-related) overheads can now be recorded by the simulator. Thus, the simulator will be able to predict more accurately the threaded program's overheads.
It is part of the CC-NUMA Project. It has the potential for publishable work.
Last Modified: Peter Strazdins, 06 Feb 2008