Supporting large-scale connectivity is a critical challenge in the massive machine-type communications scenario. Grant-free random access (RA) is a promising solution because it reduces the severe signaling overhead of the contention-based RA procedure. However, collisions still occur because devices select spectrum resources at random. We therefore propose a distributed Q-learning-assisted grant-free RA scheme to alleviate collisions between devices. Because machine-type communications devices generate bursty traffic, a random packet arrival model is adopted in this paper. To cope with the difficulties that the random transmissions of devices pose for Q-learning, we design an action reward based on the active probabilities of the devices. In addition, we introduce power-domain non-orthogonal multiple access to further increase the number of devices that can access the network. Numerical results demonstrate the advantages of the proposed scheme in terms of the devices' successful access probability.
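To make the idea of distributed Q-learning for grant-free resource selection concrete, the following is a minimal Python sketch and not the scheme proposed in this paper. It rests on several assumptions: each device keeps a stateless Q-vector over the resources, packet arrivals are Bernoulli with a common probability `p_active`, exploration is epsilon-greedy, and the reward is a plain +1 for a collision-free transmission and -1 for a collision. The reward based on active probabilities and the power-domain non-orthogonal multiple access component are not specified here and are therefore omitted. The function name `simulate` and all parameter names are hypothetical.

```python
import random


def simulate(num_devices=8, num_slots=8, p_active=0.5, frames=5000,
             alpha=0.1, epsilon=0.1, seed=0):
    """Toy distributed Q-learning for grant-free resource selection.

    Assumptions (not taken from the paper): Bernoulli packet arrivals,
    stateless per-device Q-vectors, epsilon-greedy exploration, and a
    placeholder +1 / -1 reward for success / collision.
    """
    rng = random.Random(seed)
    # One independent Q-vector per device: no information is shared.
    q = [[0.0] * num_slots for _ in range(num_devices)]
    successes = 0
    attempts = 0

    for _ in range(frames):
        choices = {}
        for d in range(num_devices):
            # Bursty traffic: a device transmits only if a packet arrived.
            if rng.random() >= p_active:
                continue
            if rng.random() < epsilon:
                a = rng.randrange(num_slots)  # explore
            else:
                best = max(q[d])
                # Exploit, breaking ties between equal Q-values at random.
                a = rng.choice([k for k, v in enumerate(q[d]) if v == best])
            choices[d] = a

        # Count how many active devices picked each resource.
        load = [0] * num_slots
        for a in choices.values():
            load[a] += 1

        for d, a in choices.items():
            reward = 1.0 if load[a] == 1 else -1.0
            # Stateless Q-update: move Q toward the observed reward.
            q[d][a] += alpha * (reward - q[d][a])
            attempts += 1
            successes += reward > 0

    # Empirical successful access probability over all transmissions.
    return q, successes / max(attempts, 1)


if __name__ == "__main__":
    _, rate = simulate()
    print(f"empirical successful access probability: {rate:.3f}")
```

Because inactive devices neither transmit nor update their Q-values, the collision feedback a device receives for a given resource depends on which other devices happened to be active in that frame. This sketch does not correct for that effect.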