An on-chip cache memory for vector processors, called a vector cache, has been proposed to achieve a high sustained memory bandwidth that is balanced with the high computational performance of future vector processors. Our previous research showed, from the viewpoint of architectural design, that 3D die-stacking technology can increase the capacity of the vector cache and thereby improve the performance of vector processors. However, the detailed design of vector caches using 3D die-stacking technology has not yet been discussed sufficiently. It is therefore still unclear how much the vector cache can benefit from 3D die-stacking technology in terms of cost, latency, and energy consumption. In this paper, vector caches are designed in detail to exploit the advantages of 3D die-stacking technology, such as reductions in long wires and energy consumption. For the cache design, this paper examines two strategies for partitioning a vector cache into blocks and placing them on multiple layers. One partitioning strategy places more emphasis on reducing the number of long wires. The other reduces the number of through-silicon vias (TSVs). This paper evaluates the latency, energy consumption, and number of TSVs for each cache partitioning strategy, and also discusses the 3D cache configuration suitable for vector processors.