Because recent scientific and engineering simulations require heavy computation on large volumes of data, high-performance computing (HPC) systems need both high computational capability and large memory capacity. Most recent HPC systems adopt a parallel processing architecture in which the computational capability of the processors is high but the performance of the memory system is constrained. Both the bytes per flop (B/F) ratio, defined as the ratio of memory bandwidth to floating-point performance (flop/s), and the memory capacity of a single node have decreased as HPC systems have evolved. To fully exploit the potential of recent HPC systems, practical scientific and engineering applications must be optimized with respect to not only their parallelism but also the limitations of the memory system. In this paper, we discuss a set of approaches for optimizing the memory access behavior of applications so that they can run on recent HPC systems with improved performance. Our approaches include memory footprint control, memory restructuring for active elements, elimination of redundant data structures through combined calculations, and optimized recalculation of data. To validate the effectiveness of these approaches, we implement a plasmonics simulation application on the NEC SX-ACE. Applying our approaches reduces the memory usage of the application from 35.6 GB to 512 MB for a small-scale dataset and from 65.1 GB to 4.3 GB for a large-scale dataset, which enables execution on a single node of a distributed parallel system with smaller memory capacity. In addition, the performance evaluation shows that the optimizations make execution 1.14 times faster.
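
As a reading aid, the B/F metric and the memory reductions reported above can be written out explicitly. The symbols $B_{\mathrm{mem}}$ (memory bandwidth in bytes/s), $F$ (floating-point rate in flop/s), and $r$ (reduction factor) are notation introduced here for illustration only. Whether peak or sustained figures are meant is not stated in the text, so the definition is left generic.

```latex
% B/F: bytes of memory traffic the node can supply per floating-point operation
\[
  \mathrm{B/F} \;=\; \frac{B_{\mathrm{mem}}\ [\text{bytes/s}]}{F\ [\text{flop/s}]}
\]

% Reduction factors implied by the reported memory usage
% (small-scale value depends on whether 1 GB is taken as 1000 MB or 1024 MB)
\[
  r_{\text{small}} \;=\; \frac{35.6\ \text{GB}}{512\ \text{MB}} \;\approx\; 70,
  \qquad
  r_{\text{large}} \;=\; \frac{65.1\ \text{GB}}{4.3\ \text{GB}} \;\approx\; 15.1
\]
```

A lower B/F means that fewer bytes can be fetched per floating-point operation, so applications that reuse little data become limited by memory bandwidth rather than by compute throughput.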