Multirobot learning enables robots to adapt to their environment using real-world experience. Because multirobot learning is a young research area, no standards yet define its systematic implementation. Researchers have proposed several methods for implementing learning in decentralized multirobot systems. The most common approach is to place a separate learning entity on each robot. Each learning entity runs a single-robot algorithm but uses a specially designed reward system, crafted with human insight to achieve the best performance. These special reward systems usually take the form of subgoals, heuristics, shaped reinforcement, and progress estimators, whose details vary with the task.

This dissertation focuses on the use of a traditional reward system, which rewards robots only when they reach desired goals. Our research question is whether decentralized multirobot systems with traditional reward systems can achieve optimal performance. Our experiments indicate that traditional learning methods can be used effectively in decentralized multirobot systems, but only under certain conditions. The success and effectiveness of this approach are potentially affected by various factors, which we classify into two groups: the nature of the robots and the nature of the learning entities. We methodically test the effect of varying five common factors (reward scope, value function of the learning algorithm, diversity of robots, number of robots, and delay of global information), first in simulation and then on real robots. The results show that three of these factors (reward scope, value function of the learning algorithm, and delay of global information), if set up incorrectly, can prevent optimal, cooperative solutions.

At the end of this dissertation, we propose dynamic task selection, a multirobot group architecture that allows task sharing and promotes robustness. In the last chapter, we propose the use of heuristics to speed up learning by biasing the robots' exploration without disturbing the original goal.
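To make the contrast between the two reward styles concrete, the following is a minimal Python sketch, not taken from the dissertation's implementation: a traditional reward that fires only at the goal, next to a shaped reward with a progress-estimator term. The state fields `at_goal` and `distance_to_goal` are assumed purely for illustration.

```python
# Hypothetical illustration of the two reward styles discussed above.
# The state fields (at_goal, distance_to_goal) are assumptions for
# this sketch, not names from the dissertation.

def traditional_reward(state) -> float:
    """Traditional reward: nonzero only when the robot reaches the goal."""
    return 1.0 if state.at_goal else 0.0

def shaped_reward(state, prev_state) -> float:
    """Shaped reinforcement: adds a progress-estimator bonus that
    rewards moving closer to the goal, guiding exploration."""
    reward = 1.0 if state.at_goal else 0.0
    progress = prev_state.distance_to_goal - state.distance_to_goal
    return reward + 0.1 * progress  # small shaping term
```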
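One way to picture a per-robot learning entity and the "reward scope" factor is the hedged sketch below: each robot carries its own tabular Q-learner (a common single-robot algorithm, assumed here rather than stated in the abstract), and the scope setting decides whether a learner is trained on its own reward or on the team's pooled reward. The learning-rate, discount, and exploration values are illustrative only.

```python
import random
from collections import defaultdict

class RobotLearner:
    """One independent learning entity per robot (tabular Q-learning).
    States must be hashable; parameter values are illustrative."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:                 # explore
            return random.randrange(self.n_actions)
        return max(range(self.n_actions),                  # exploit
                   key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in range(self.n_actions))
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def scoped_rewards(individual_rewards, scope="local"):
    """Reward scope: feed each learner its own reward (local scope)
    or the team's pooled reward (global scope)."""
    if scope == "global":
        total = sum(individual_rewards)
        return [total] * len(individual_rewards)
    return individual_rewards
```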
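The closing proposal, biasing exploration with heuristics while leaving the reward untouched, might look roughly like the sketch below. The `heuristic` scoring function and the bias weight are assumptions made for illustration; because the reward signal is unchanged, the goal being learned is undisturbed.

```python
import random

def biased_explore(state, actions, q_values, heuristic, epsilon=0.2, bias=2.0):
    """Epsilon-greedy action selection whose exploration step is biased
    toward heuristically promising actions. Only the order in which
    actions are tried changes; the reward, and hence the original goal,
    is untouched. `heuristic(state, a)` is a hypothetical function
    assumed to return a non-negative score."""
    if random.random() >= epsilon:
        return max(actions, key=lambda a: q_values[(state, a)])  # exploit
    # Explore: sample actions in proportion to their heuristic score.
    weights = [1.0 + bias * heuristic(state, a) for a in actions]
    return random.choices(actions, weights=weights, k=1)[0]
```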