3-D stacked integrated circuit technology is expected to overcome the limitations of 2-D microprocessor design. This paper examines the potential of 3-D integration in the design and implementation of large-scale arithmetic units. 3-D stacked parallel multipliers with various operand sizes are designed, and the effect of circuit scale on their performance is discussed. In a large-scale parallel multiplier, the conventional partitioning pattern requires many through-silicon vias (TSVs). This paper therefore proposes a partitioning pattern that reduces the number of TSVs in a large-scale 3-D stacked parallel multiplier. Based on the proposed pattern, 3-D stacked 32-, 64-, and 128-bit multipliers are designed and evaluated. For a 128-bit multiplier implemented in four layers, the proposed partitioning pattern achieves a 13.4% reduction in critical path delay and a 10.4% reduction in power consumption compared to the 2-D implementation.