Owing to the powerful feature representation capability of deep learning and the effective policy learning capability of reinforcement learning (RL), deep reinforcement learning (DRL) has achieved remarkable results in a series of complex sequential decision-making problems. As DRL has grown popular in single-agent tasks, its application to multi-agent systems has flourished. Recently, multi-agent deep reinforcement learning (MADRL) has attracted increasing attention in the field of artificial intelligence, and scalability and transferability have become two of its important open issues. This paper first describes the development and typical algorithms of DRL. Then, three types of MADRL learning paradigms are introduced, and two typical classes of cooperative MADRL algorithms are analyzed, i.e., the value function decomposition approach and the centralized value function approach. In addition, we summarize six types of scalable MADRL models, such as attention mechanisms and graph neural networks, and survey the research progress of transfer learning and curriculum learning for MADRL with respect to transferability. Finally, we discuss the application prospects and research directions of MADRL, providing some insights for its further development.