Summary
In this paper, we have discussed the potential of on-chip memory subsystems for future vector architectures. Performance evaluation based on early experiments suggests that even a moderate-sized on-chip cache of 512 KB to 2 MB compensates for the insufficient memory bandwidth of vector load/store units at 2 B/flop or lower, and boosts sustained system performance up to the level of 4 B/flop performance. Selective caching, in which only data with high locality of reference are cached, is also effective for making efficient use of limited on-chip cache capacity.
|Number of pages||18|
|Publication status||Published - 2008 Jan 1|
|Event||2007 7th Teraflop Workshop - Sendai, Japan; Duration: 2007 Nov 21 → 2007 Nov 22|
|Other||2007 7th Teraflop Workshop|
|Period||07/11/21 → 07/11/22|