Many-core processors have great potential for interactive photo-realistic image synthesis by ray tracing, but extracting their performance requires new approaches to program design. This paper proposes an architecture-aware ray-triangle intersection test algorithm. Because ray-triangle tests are the most expensive kernel in ray tracing, we explore an effective design and implementation of an intersection algorithm on a many-core graphics processing unit (GPU) and evaluate its performance through several experiments. Our implementation achieves fast intersection tests through high-performance parallel processing on the GPU, combined with effective data management and load balancing. Experimental results show that our algorithm on a 128-core GPU performs intersection tests 105 times faster than on a single-core processor. These results indicate that our intersection algorithm is promising for the coming era of many-core processors.
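
For readers unfamiliar with the kernel in question, a single ray-triangle test can be sketched as follows. This is a minimal illustration that assumes the widely used Möller-Trumbore formulation; it is not the architecture-aware algorithm proposed in this paper, and the names `Vec3` and `intersect` are hypothetical. On a many-core GPU, a test of this kind would typically be evaluated for many ray-triangle pairs in parallel.

```cpp
#include <cmath>

// Hypothetical 3-component vector type used only for this sketch.
struct Vec3 { double x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moller-Trumbore ray-triangle test (assumed here for illustration only).
// Returns true and writes the ray parameter t to *t_out when the ray
// orig + t * dir hits the triangle (v0, v1, v2) at some t > 0.
bool intersect(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, double* t_out) {
    const double eps = 1e-9;
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    double det = dot(e1, p);
    if (std::fabs(det) < eps) return false;       // ray parallel to the triangle plane
    double inv = 1.0 / det;
    Vec3 s = sub(orig, v0);
    double u = dot(s, p) * inv;                   // first barycentric coordinate
    if (u < 0.0 || u > 1.0) return false;
    Vec3 q = cross(s, e1);
    double v = dot(dir, q) * inv;                 // second barycentric coordinate
    if (v < 0.0 || u + v > 1.0) return false;
    double t = dot(e2, q) * inv;                  // distance along the ray
    if (t <= eps) return false;                   // hit is behind the ray origin
    *t_out = t;
    return true;
}
```

The early exits in this sketch make the cost of a test depend on the input, which is one reason load balancing matters when many such tests run side by side.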