| The Particle-in-Cell (PIC) algorithm is a powerful and highly accurate tool for simulating physical effects in a variety of applications. VORPAL is a highly flexible, object-oriented plasma simulation code based on the PIC model. The complexity of VORPAL's object-oriented code and algorithms leads to performance bottlenecks. Nevertheless, significant performance improvements can be achieved by introducing particle sorting, dynamic load balancing, parallel building, and run-time code compilation. Particle sorting increases the temporal locality of data structure accesses, thus maximizing utilization of the processor cache. Run-time code compilation overcomes the barriers of static compilation to further improve performance. Dynamic load balancing continuously adjusts the load distribution in a parallel environment to maximize the utilization of all processors. Parallel building reduces development time by using available parallel resources during the development phase. Together, these techniques form a vertical optimization strategy, extending from high-level to low-level, that contributes to an overall tripling of performance. |
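To illustrate why particle sorting helps the cache, the following is a minimal sketch, not VORPAL's actual implementation. It assumes a hypothetical `Particle` record on a uniform 1D grid with non-negative positions, and reorders particles with a counting sort so that particles in the same cell sit next to each other in memory. The names `Particle`, `cellIndex`, and `sortByCell` are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle record for a 1D grid (an assumption for illustration;
// not VORPAL's data layout).
struct Particle {
    double x;  // position, assumed to satisfy x >= 0
    double v;  // velocity
};

// Map a position to its cell index on a uniform grid that starts at 0.
inline std::size_t cellIndex(const Particle& p, double dx, std::size_t numCells) {
    std::size_t c = static_cast<std::size_t>(p.x / dx);
    return c < numCells ? c : numCells - 1;  // clamp particles at or past the upper edge
}

// Counting sort by cell index: O(N + numCells) work, and stable, so particles
// within one cell keep their original relative order.
std::vector<Particle> sortByCell(const std::vector<Particle>& in,
                                 double dx, std::size_t numCells) {
    assert(dx > 0.0 && numCells > 0);

    // offset[c + 1] first holds the number of particles in cell c.
    std::vector<std::size_t> offset(numCells + 1, 0);
    for (const Particle& p : in) {
        ++offset[cellIndex(p, dx, numCells) + 1];
    }

    // Prefix sum: offset[c] becomes the first output slot for cell c.
    for (std::size_t c = 0; c < numCells; ++c) {
        offset[c + 1] += offset[c];
    }

    // Scatter each particle into the next free slot of its cell.
    std::vector<Particle> out(in.size());
    for (const Particle& p : in) {
        out[offset[cellIndex(p, dx, numCells)]++] = p;
    }
    return out;
}
```

After such a sort, a loop over the particle array touches the field data of one cell for a run of consecutive particles before moving on to the next cell, so that field data is likely to still be in cache when it is reused. How often to re-sort is a tuning choice that this sketch does not address.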