Reinforcement learning techniques have been used successfully to solve single-agent optimization problems, but many real-world problems involve multiple agents, i.e., multi-agent systems. This explains the growing interest in multi-agent reinforcement learning (MARL) algorithms. To be applicable in large real-world domains, MARL algorithms need to be both stable and scalable. A MARL algorithm is scalable if it continues to perform adequately as the number of agents increases, and stable if all agents eventually converge to a stable joint policy. Unfortunately, most previous approaches lack at least one of these two crucial properties.

This dissertation proposes a scalable and stable MARL framework that uses a network of mediator agents. The network connections restrict the space of valid policies, which reduces search time and achieves scalability. Optimizing performance in such a system decomposes into two subproblems: optimizing the mediators' local policies and optimizing the structure of the network interconnecting mediators and servers. I present extensions to Markovian models that allow exponential savings in time and space. I also present the first integrated framework for MARL in a network, which includes both a MARL algorithm and a reorganization algorithm that work concurrently. To evaluate performance, I use the distributed task allocation problem as a motivating domain.
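To make the central idea concrete, the following is a minimal, self-contained sketch of how a mediator network can restrict the policy space in a toy distributed task allocation setting. It is an illustration only, not the dissertation's actual algorithms: the `Mediator` class, the tabular Q-learning update, the server speeds, and the two-mediator topology are all assumptions invented for this example. Each mediator learns only over the servers it is connected to, so a sparser network means a smaller joint policy space to search.

```python
import random
from collections import defaultdict

class Mediator:
    """A mediator learns, via tabular Q-learning, which of its *connected*
    servers to route each incoming task to. Because it can only choose
    among its neighbors, the network topology restricts its policy space."""

    def __init__(self, neighbors, alpha=0.1, epsilon=0.1):
        self.neighbors = neighbors          # server ids this mediator can reach
        self.alpha = alpha                  # learning rate
        self.epsilon = epsilon              # exploration rate
        self.q = defaultdict(float)         # Q[(task_type, server)] -> value estimate

    def choose_server(self, task_type):
        """Epsilon-greedy choice restricted to this mediator's neighbors."""
        if random.random() < self.epsilon:
            return random.choice(self.neighbors)
        return max(self.neighbors, key=lambda s: self.q[(task_type, s)])

    def update(self, task_type, server, reward):
        """One-step Q-learning update for a stateless (bandit-style) task."""
        key = (task_type, server)
        self.q[key] += self.alpha * (reward - self.q[key])

# Toy environment (assumed): servers differ in speed per task type; the
# reward is the negative simulated completion time, so faster is better.
server_speed = {0: {"A": 1.0, "B": 0.2},
                1: {"A": 0.3, "B": 1.0},
                2: {"A": 0.6, "B": 0.6}}

# Sparse topology: each mediator sees only two of the three servers,
# shrinking the joint policy space relative to a fully connected network.
mediators = [Mediator(neighbors=[0, 2]), Mediator(neighbors=[1, 2])]

for step in range(5000):
    m = random.choice(mediators)            # a task arrives at some mediator
    task = random.choice(["A", "B"])
    s = m.choose_server(task)
    reward = -1.0 / server_speed[s][task] + random.gauss(0, 0.1)
    m.update(task, s, reward)

for i, m in enumerate(mediators):
    greedy = {t: max(m.neighbors, key=lambda s: m.q[(t, s)]) for t in ("A", "B")}
    print(f"mediator {i} greedy policy: {greedy}")
```

In this sketch each mediator typically converges to routing type-A tasks to its fastest reachable A-server and type-B tasks likewise, while never considering servers outside its neighborhood. The second subproblem named above, reorganizing the network structure itself, is not shown here.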