Dealers, Insiders and Bandits: Learning and its Effects on Market Outcomes

Unknown author (2006-07-12)

PhD thesis

This thesis seeks to contribute to the understanding of markets populated by boundedly rational agents who learn from experience. Bounded rationality and learning have both been the focus of much research in computer science, economics and finance theory. However, we are at a critical stage in defining the direction of future research in these areas. It is now clear that realistic learning problems faced by agents in market environments are often too hard to solve in a classically rational fashion. At the same time, the greatly increased computational power available today allows us to develop and analyze richer market models and to evaluate different learning procedures and algorithms within these models. The danger is that the ease with which complex markets can be simulated could lead to a plethora of models that attempt to explain every known fact about different markets. The first two chapters of this thesis define a principled approach to studying learning in rich models of market environments, and the rest of the thesis provides a proof of concept by demonstrating the applicability of this approach in modeling settings drawn from two different broad domains, financial market microstructure and search theory. In the domain of market microstructure, this thesis extends two important models from the theoretical finance literature. The third chapter introduces an algorithm for setting prices in dealer markets based on the model of Glosten and Milgrom (1985), and produces predictions about the behavior of prices in securities markets. In some cases, these results confirm economic intuitions in a significantly more complex setting (like the existence of a local profit maximum for a monopolistic market-maker) and in others they can be used to provide quantitative guesses for variables such as rates of convergence to efficient market conditions following price jumps that provide insider information. 
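To make the dealer's pricing problem concrete, the following is a minimal sketch of the textbook two-point version of the Glosten and Milgrom (1985) setting, not the algorithm developed in the third chapter. It assumes the asset value is either `v_low` or `v_high`, that a fraction `alpha` of traders are perfectly informed, and that uninformed traders buy or sell with equal probability; the function name and all parameter names are illustrative assumptions. A zero-profit dealer sets the ask to the expected value conditional on a buy and the bid to the expected value conditional on a sell.

```python
def zero_profit_quotes(p_high, v_low, v_high, alpha):
    """Return (bid, ask) for a zero-profit dealer in a two-point setting.

    Illustrative assumptions (not taken from the thesis):
      p_high : dealer's prior probability that the value is v_high (0 < p_high < 1)
      alpha  : fraction of traders who are perfectly informed (0 <= alpha <= 1)
      Informed traders buy when the value is high and sell when it is low;
      uninformed traders buy or sell with probability 1/2 each.
    """
    # Likelihood of a buy order under each value of the asset.
    buy_if_high = alpha + (1 - alpha) / 2
    buy_if_low = (1 - alpha) / 2

    # Marginal probabilities of observing a buy or a sell.
    p_buy = p_high * buy_if_high + (1 - p_high) * buy_if_low
    p_sell = 1 - p_buy

    # Bayes' rule: posterior probability of the high value after each order type.
    # By symmetry, P(sell | high) = buy_if_low and P(sell | low) = buy_if_high.
    post_after_buy = p_high * buy_if_high / p_buy
    post_after_sell = p_high * buy_if_low / p_sell

    ask = post_after_buy * v_high + (1 - post_after_buy) * v_low
    bid = post_after_sell * v_high + (1 - post_after_sell) * v_low
    return bid, ask


if __name__ == "__main__":
    bid, ask = zero_profit_quotes(p_high=0.5, v_low=0.0, v_high=1.0, alpha=0.5)
    print(f"bid={bid:.2f} ask={ask:.2f}")
```

Under these assumptions the spread widens as `alpha` grows and collapses to zero when no trader is informed, which is the adverse-selection intuition behind the model.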
The fourth chapter studies the problem faced by a trader with insider information in Kyle's (1985) model. I show how the insider trading problem can be usefully analyzed from the perspective of reinforcement learning when some important market parameters are unknown, and that the equilibrium behavior of an insider who knows these parameters can be learned by one who does not, but also that the time scale of convergence to the equilibrium behavior may be impractical, and agents with limited time horizons may be better off using approximate algorithms that do not converge to equilibrium behavior. The fifth and sixth chapters relate to search problems. Chapter 5 introduces models for a class of problems in which there is a search "season" prior to hiring or matching, like academic job markets. It solves for expected values in many cases, and studies the difference between a "high information" process where applicants are immediately told when they have been rejected and a "low information" process where employers do not send any signal when they reject an applicant. The most important intuition to emerge from the results is that the relative benefit of the high information process is much greater when applicants do not know their own "attractiveness," which implies that search markets might be able to eliminate inefficiencies effectively by providing good information, and we do not always have to think about redesigning markets as a whole. Chapter 6 studies two-sided search explicitly and introduces a new class of multi-agent learning problems, two-sided bandit problems, that capture the learning and decision problems of agents in matching markets in which agents must learn their preferences. It also empirically studies outcomes under different periodwise matching mechanisms and shows that some basic intuitions about the asymptotic stability of matchings are preserved in the model.
For example, when agents are matched in each period using the Gale-Shapley algorithm, asymptotic outcomes are always stable, while a matching mechanism that induces a stopping problem for some agents leads to the lowest probabilities of stability. By contributing to the state of the art in modeling different domains using computational techniques, this thesis demonstrates the success of the approach to modeling complex economic and social systems that is prescribed in the first two chapters.
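
For readers unfamiliar with the Gale-Shapley algorithm and the notion of stability mentioned above, the following is a minimal sketch of standard deferred acceptance with a stability check. It is not the periodwise mechanism studied in Chapter 6, where agents must learn their preferences; here the preferences are assumed to be known, complete, and strict, and the two sides are assumed to be of equal size. The function names and the example agents are illustrative assumptions.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Proposer-optimal stable matching via deferred acceptance.

    Both arguments map each agent to a complete, strict preference list over
    the other side (most preferred first). Assumes equal numbers on each side.
    Returns a dict mapping each proposer to the matched receiver.
    """
    # rank[r][p] is the position of proposer p in receiver r's list.
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of next receiver to try
    engaged_to = {}                               # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                     # r was unmatched, accepts p
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                     # r trades up, releases current
            free.append(current)
        else:
            free.append(p)                        # r rejects p, who tries again

    return {p: r for r, p in engaged_to.items()}


def is_stable(matching, proposer_prefs, receiver_prefs):
    """Return True if no proposer-receiver pair prefers each other to their partners."""
    partner_of = {r: p for p, r in matching.items()}
    for p, prefs in proposer_prefs.items():
        for r in prefs:
            if r == matching[p]:
                break  # receivers after this point are less preferred by p
            r_prefs = receiver_prefs[r]
            if r_prefs.index(p) < r_prefs.index(partner_of[r]):
                return False  # (p, r) is a blocking pair
    return True


if __name__ == "__main__":
    proposers = {"a": ["x", "y"], "b": ["x", "y"]}
    receivers = {"x": ["b", "a"], "y": ["a", "b"]}
    match = gale_shapley(proposers, receivers)
    print(match, is_stable(match, proposers, receivers))
```

A matching is stable when no pair of agents would both rather be matched to each other than to their assigned partners; `is_stable` checks exactly this condition for known preferences.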