High-performance computing has recently moved toward more heterogeneous architectures, and offloading part of an application's execution to dedicated accelerators has become standard practice. However, low productivity remains a problem when programming for accelerators. This paper proposes neoSYCL, a SYCL implementation for SX-Aurora TSUBASA that aims to improve productivity while achieving performance comparable to native implementations. Unlike other implementations, neoSYCL identifies and separates the kernel part of the SYCL code at the source-code level. This approach can therefore be ported easily to any heterogeneous architecture that uses the offload programming model. In this paper, we show evaluation results on SX-Aurora TSUBASA. To discuss productivity as well as performance quantitatively, we use two different benchmarks and code-complexity metrics in the evaluation. The results show that neoSYCL can improve productivity while reaching the same performance as native implementations.