Based on the acoustic features of the four Mandarin tones, this study investigated the perceptual patterns for two tone pairs, Tone 1 (T1) vs. Tone 4 (T4) and Tone 2 (T2) vs. Tone 3 (T3), which are considered difficult for both Japanese learners and native Chinese speakers to distinguish. We compared the performance of Mandarin and Japanese listeners on the perception of Mandarin tones in a classical categorical perception experiment that employed identification and discrimination tasks. The T1–T4 experiments used the fundamental frequency (fo) at the tone's endpoint as the acoustic cue, while the T2–T3 experiments used stimulus continua that changed gradually from T2 to T3, varying in the timing of the turning point (the inflection point of the tone), in Δfo (the pitch difference between the onset and the turning point), or in both acoustic dimensions. The results showed that when endpoint pitch was the acoustic parameter, both native Chinese speakers and Japanese learners perceived T1 and T4 categorically. When both the timing of the turning point and Δfo were varied, advanced learners of Chinese and beginners alike demonstrated quasi-categorical perception of T2 and T3, whereas when the timing of the turning point was the sole parameter, only a tendency toward categorical perception was observed.