A Short Course in Intermediate Microeconomics with Calculus
1st Edition
Roberto Serrano and Allan M. Feldman
Brown University
January 21, 2012

Contents

Preface
1 Introduction

I Theory of the Consumer

2 Preferences and Utility
  2.1 Introduction
  2.2 The Consumer’s Preference Relation
  2.3 The Marginal Rate of Substitution
  2.4 The Consumer’s Utility Function
  2.5 Utility Functions and the Marginal Rate of Substitution
  2.6 A Solved Problem
  Exercises
  Appendix: Differentiation of Functions
3 The Budget Constraint and the Consumer’s Optimal Choice
  3.1 Introduction
  3.2 The Standard Budget Constraint, the Budget Set and the Budget Line
  3.3 Shifts of the Budget Line
  3.4 Odd Budget Constraints
  3.5 Income and Consumption Over Time
  3.6 The Consumer’s Optimal Choice: Graphical Analysis
  3.7 The Consumer’s Optimal Choice: Utility Maximization Subject to the Budget Constraint
  3.8 Two Solved Problems
  Exercises
  Appendix: Maximization Subject to a Constraint: The Lagrange Function Method
4 Demand Functions
  4.1 Introduction
  4.2 Demand as a Function of Income
  4.3 Demand as a Function of Price
  4.4 Demand as a Function of Price of the Other Good
  4.5 Substitution and Income Effects
  4.6 The Compensated Demand Curve
  4.7 Elasticity
  4.8 The Market Demand Curve
  4.9 A Solved Problem
  Exercises
5 Supply Functions for Labor and Savings
  5.1 Introduction to the Supply of Labor
  5.2 Choice between Consumption and Leisure
  5.3 Substitution and Income Effects in Labor Supply
  5.4 Other Types of Budget Constraints
  5.5 Taxing the Consumer’s Wages
  5.6 Saving and Borrowing: the Intertemporal Choice of Consumption
  5.7 The Supply of Savings
  5.8 A Solved Problem
  Exercises
6 Welfare Economics 1: The One-Person Case
  6.1 Introduction
  6.2 Welfare Comparison of a Per Unit Tax and an Equivalent Lump Sum Tax
  6.3 Rebating a Per Unit Tax
  6.4 Measuring a Change in Welfare for One Person
  6.5 Measuring Welfare for Many People, A Preliminary Example
  6.6 A Solved Problem
  Exercises
  Appendix: Revealed Preference
7 Welfare Economics 2: The Many-Person Case
  7.1 Introduction
  7.2 Quasilinear Preferences
  7.3 Consumer’s Surplus
  7.4 A Consumer’s Surplus Example With Quasilinear Preferences
  7.5 Consumers’ Surplus
  7.6 A Last Word on the Quasilinearity Assumption
  7.7 A Solved Problem
  Exercises

II Theory of the Producer

8 Theory of the Firm 1: The Single-Input Model
  8.1 Introduction
  8.2 The Competitive Firm’s Problem, Focusing on Its Output
  8.3 The Competitive Firm’s Problem, Focusing on Its Input
  8.4 Multiple Outputs
  8.5 A Solved Problem
  Exercises
9 Theory of the Firm 2: The Long Run, Multiple-Input Model
  9.1 Introduction
  9.2 The Production Function in the Long Run
  9.3 Cost Minimization in the Long Run
  9.4 Profit Maximization in the Long Run
  9.5 A Solved Problem
  Exercises
10 Theory of the Firm 3: The Short Run, Multiple-Input Model
  10.1 Introduction
  10.2 The Production Function in the Short Run
  10.3 Cost Minimization in the Short Run
  10.4 Profit Maximization in the Short Run
  10.5 A Solved Problem
  Exercises

III Partial Equilibrium: Market Structure

11 Perfectly Competitive Markets
  11.1 Introduction
  11.2 Perfect Competition
  11.3 Market/Industry Supply
  11.4 Equilibrium in a Competitive Market
  11.5 Competitive Equilibrium and Social Surplus Maximization
  11.6 The Deadweight Loss of a Per Unit Tax
  11.7 A Solved Problem
  Exercises
12 Monopoly and Monopolistic Competition
  12.1 Introduction
  12.2 The Classical Solution to Monopoly
  12.3 Deadweight Loss From Monopoly: Comparing Monopoly and Competition
  12.4 Price Discrimination
  12.5 Monopolistic Competition
  12.6 A Solved Problem
  Exercises
13 Duopoly
  13.1 Introduction
  13.2 Cournot Competition
  13.3 More on Dynamics
  13.4 Collusion
  13.5 Stackelberg Competition
  13.6 Bertrand Competition
  13.7 A Solved Problem
  Exercises
14 Game Theory
  14.1 Introduction
  14.2 The Prisoners’ Dilemma, and the Idea of Dominant Strategy Equilibrium
  14.3 Prisoners’ Dilemma Complications: Experimental Evidence and Repeated Games
  14.4 The Battle of the Sexes, and the Idea of Nash Equilibrium
  14.5 Battle of the Sexes Complications: Multiple or No Nash Equilibria, and Mixed Strategies
  14.6 The Expanded Battle of the Sexes, When More Choices Make Players Worse Off
  14.7 Sequential Move Games
  14.8 Threats
  14.9 A Solved Problem
  Exercises

IV General Equilibrium

15 An Exchange Economy
  15.1 Introduction
  15.2 An Economy with Two Consumers and Two Goods
  15.3 Pareto Efficiency
  15.4 Competitive or Walrasian Equilibrium
  15.5 The Two Fundamental Theorems of Welfare Economics
  15.6 A Solved Problem
  Exercises
16 A Production Economy
  16.1 Introduction
  16.2 A Robinson Crusoe Production Economy
  16.3 Pareto Efficiency
  16.4 Walrasian or Competitive Equilibrium
  16.5 When There are Two Goods, Bread and Rum
  16.6 The Two Welfare Theorems Revisited
  16.7 A Solved Problem
  Exercises

V Market Failure

17 Externalities
  17.1 Introduction
  17.2 Examples of Externalities
  17.3 The Oil Refiner and the Fish Farm
  17.4 Classical Solutions to the Externality Problem: Pigou and Coase
  17.5 Modern Solutions for the Externality Problem: Markets for Pollution Rights
  17.6 Modern Solutions for the Externality Problem: Cap and Trade
  17.7 A Solved Problem
  Exercises
18 Public Goods
  18.1 Introduction
  18.2 Examples of Public Goods
  18.3 A Simple Model of an Economy with a Public Good
  18.4 The Samuelson Optimality Condition
  18.5 The “Free Rider” Problem and Voluntary Contribution Mechanisms
  18.6 How To Get Efficiency in Economies With Public Goods
  18.7 A Solved Problem
  Exercises
19 Uncertainty and Expected Utility
  19.1 Introduction and Examples
  19.2 Von Neumann-Morgenstern Expected Utility: Preliminaries
  19.3 Von Neumann-Morgenstern Expected Utility: Assumptions and Conclusion
  19.4 Von Neumann-Morgenstern Expected Utility. Examples
  19.5 A Solved Problem
  Exercises
20 Uncertainty and Asymmetric Information
  20.1 Introduction
  20.2 When Sellers Know More Than Buyers: The Market for “Lemons”
  20.3 When Buyers Know More Than Sellers: A Market for Health Insurance
  20.4 When Insurance Encourages Risk Taking: Moral Hazard
  20.5 The Principal-Agent Problem
  20.6 What Should Be Done About Market Failures Caused By Asymmetric Information
  20.7 A Solved Problem
  Exercises

Preface

Welcome to this intermediate microeconomics course. At this point, you should already have taken an introductory economics class that exposed you to the method and main ideas of the two parts of economic theory, microeconomics and macroeconomics. In addition, you should also have taken a calculus course. The reason is simple: calculus is basic to microeconomics, much of which is about maximizing something (for instance, utility, or profit), or about minimizing something else (for instance, costs). Calculus is the area of mathematics most suited to maximization and minimization problems, and using it makes microeconomic theory straightforward, transparent, and precise.

Microeconomics begins with the study of how economic agents in the private sector (consumers and firms) make their decisions. We start this course with a brief introduction, in Chapter 1. Then we turn to the main events: Part I of our course (Chapters 2 through 7) is about the theory of the consumer, and Part II (Chapters 8 through 10) is about the theory of the producer, that is, the firm. Part I provides a foundation for the demand curves that you saw in your principles course, and Part II provides a foundation for the supply curves that you saw.

Most economic decisions are made in the private sector, but governments also make many important economic decisions. We touch on these throughout the course, particularly when we discuss taxes, monopolies, externalities and public goods. Our main focus, though, is the private sector, since in market economies the private sector is, and should be, the main protagonist.

Next, Part III (Chapters 11 through 13) combines theories of the consumer and the producer into the study of individual markets. Here, our focus is on different types of market structure, depending on the market power of the firms producing the goods. Market power is related to the number of firms in the market. We begin, in Chapter 11, with the case of perfect competition, where each firm is powerless to affect the price of the good it sells; this is usually a consequence of there being many firms selling the same good. In Chapter 12 we analyze the polar opposite case, called monopoly, where only one firm provides the good. We also consider intermediate cases between these extremes: in Chapter 13 we analyze duopoly, where two firms compete in the market.

One important point that we emphasize is the strong connection between competition and the welfare of a society. This is the connection that was first discussed by Adam Smith, who wrote in 1776 that the invisible hand of market competition leads self-interested buyers and sellers to an outcome that is beneficial to society as a whole.

Our analysis in Part III is called partial equilibrium analysis because it focuses on one market in isolation. In Part IV (Chapters 15 and 16), we develop models that look at all markets simultaneously; this is called general equilibrium analysis. The general equilibrium approach is useful to understand the implications of interactions among the different markets. These interactions are of course essential in the economy.
A main theme in Part IV is the generalization of the invisible hand idea that market competition leads to the social good. We shall see that under certain conditions there are strong connections between competition in markets and the efficient allocation of resources. These connections, or fundamental theorems of welfare economics, as economists call them, are important both to people interested in economic ideas, and to people simply interested in what kind of economic world they want to inhabit.

Finally, Part V (Chapters 17, 18 and 20) focuses on the circumstances under which even competitive markets, left by themselves, fail to allocate resources efficiently. This is a very important area of study, because these market failures are common, and when they occur, governments, policy makers, and informed citizens must consider what policy interventions would best improve the performance of the unregulated market.

Our course includes two chapters that are not really part of the building-blocks flow from consumer theory through market failure. Chapter 14 is a basic introduction to game theory. The use of game theory is so prevalent in economics today that we think it is important to provide a treatment here, even if the theories of the consumer, of the firm, of competitive markets and of market failure could get along without it. A similar comment applies to Chapter 19, on uncertainty and expected utility. While most of this course describes decision problems and markets under complete information, the presence of uncertainty is crucial in much of economic life, and much modern microeconomic analysis centers around it.

Some instructors may choose to ignore these chapters in their intermediate microeconomics courses, but others may want to cover them. In order to free up some time to do that, we offer some suggestions:

We include two alternative treatments of the theory of the firm in this book.
The first is contained in Chapter 8, the single-input model of the firm, which abstracts from the cost minimization problem. The second is contained in Chapters 9 and 10, the multiple-input model of the firm, which includes the cost minimization problem. Chapter 8 can be viewed as a quick route, a “highway” to the supply curve. An instructor looking for time to teach some of the newer topics covered in Chapter 14 or Chapter 19 might cover Chapter 8 and omit Chapters 9 and 10. Another short cut in the theory of the firm section would be to omit Chapter 10, on the short-run, multiple-input model.

Also, our chapters on market failure generally contain basic theory in their first sections and applications in later sections. Instructors might choose to include or omit some of the theory or some of the applications, depending on time and interests.

This book has grown out of the lecture notes that Roberto Serrano developed to teach the intermediate microeconomics course at Brown University. The notes were shared with other instructors at Brown over the years. One of these instructors, Amy Serrano (Roberto’s wife), first had the idea of turning them into a book: “This looks like a good skeleton of something; perhaps flesh can be put around these bones.” Following this suggestion, Roberto and Allan began work on the book project.

We are grateful to all our intermediate microeconomics students who helped us develop and present this material. Martin Besfamille, Dror Brenner, Pedro Dal Bó, EeCheng Ong and Amy Serrano were kind enough to try out preliminary versions of the manuscript in their sections of the course at Brown. We thank them and their students for all the helpful comments that they provided. Amy also provided numerous comments that improved the exposition throughout, and her input was especially important in Chapter 7.
EeCheng provided superb assistance in completing the exercises and their solutions, as well as comprehensive proofreading and editing. Elise Fishelson gave us detailed comments on each chapter at a preliminary stage; Omer Ozak helped with some graphs and TeX issues; and Rachel Bell helped with some graphs. Barbara Feldman (Allan’s wife) was patient and encouraging. We thank the anonymous reviewers selected by Cambridge University Press for their helpful feedback and Scott Parris and Chris Harrison, our editors at Cambridge, for their encouragement and support of the project.

1 Introduction

Economists study the economic problem. The nature of the economic problem, however, has changed over time. For the classical school of economists (including Adam Smith (1723-1790), David Ricardo (1772-1823), Karl Marx (1818-1883), and John Stuart Mill (1806-1873)), the economic problem was to discover the laws which governed the production of goods and the distribution of goods among the different social classes: land owners, capitalists, workers. These laws were thought to be like natural laws or physical laws, similar to Newton’s law of gravitational attraction. Forces of history, and phenomena like the industrial revolution, produce “universal constants” which govern the production of goods and the distribution of wealth.

Towards the end of the 19th century, however, there was a major shift in the orientation of economics, brought about by the neoclassical school of economists. This group includes William Stanley Jevons (1835-1882), Leon Walras (1834-1910), Francis Ysidro Edgeworth (1845-1926), Vilfredo Pareto (1848-1923), and Alfred Marshall (1842-1924). The neoclassical revolution was a shift in the emphasis of the discipline, away from a search for natural laws of production and distribution, and toward the analysis of decision making by individuals and firms.

In this book we will describe modern microeconomics, which mostly follows the neoclassical path. For us, and for the majority of contemporary microeconomists, the economic problem is the problem of the “economic agent.” He lives in a world of scarcity. Economists focus on the fact that resources are limited or constrained. These constraints apply to men, women, households, firms, governments, and even humanity. On the other hand, our wants and needs are unlimited. We want more and better material things, for ourselves, our families, our children, our friends. Even if we are not personally greedy, we want better education for our children, better culture, better health for people in our country, and longer lives for everyone. Economics is about how decision makers choose among all the things that they want, given that they cannot have everything. The economic world is the world of limited resources and unlimited needs, and the economic problem is how to best meet those needs given those limited resources.

The key assumption in microeconomics, which could be taken as our slogan, our credo, is this: economic agents are rational. This means that they will choose the best alternatives, given what’s available, given the constraints. Of course we know that (to paraphrase Abraham Lincoln) some of the people behave irrationally all the time, and all of the people behave irrationally some of the time. But we will take rationality as our basic assumption, especially when important goods and services, and money, are at stake.

Economics applies the scientific method to the investigation and understanding of the economic problem. As with the natural sciences like biology, chemistry or physics, economics has theory, and it has empirical analysis. Modern economic theory usually involves the construction of abstract, often mathematical models, which are intended to help us understand some aspect of the economic world.
A useful model makes simplifying assumptions about the world. (A completely realistic economic model would usually be too complicated to be useful.) The assumptions incorporated in a useful model should be plausible or reasonable, and not absurd on their face. For instance, it is reasonable to assume that firms want to maximize profits, even though some firms may not be concerned with profits in some circumstances. It is reasonable to assume that a typical consumer wants to eat some food, wear some clothing, and live in a house or an apartment. It would be unreasonable to assume that a typical consumer wants to spend all her income on housing, and eat no food.

Once a model has assumptions, the economic analyst applies deductive reasoning and logic to it, in order to derive conclusions. This is where the use of mathematics is important. Correct logical and mathematical arguments clarify the structure of a model and help us avoid mistaken conclusions. The aim is to have a model which sheds some light on the economic world. For example, we might have a logical result like this: if we assume A, B, and C, then D holds, where D = “when the price of ice cream rises, the consumer will eat less of it.” If A, B, and C are very reasonable assumptions, then we feel confident that D will be true. On the other hand, if we do some empirical work and see that D is in fact false, then we are led to the conclusion that at least one of A, B, or C must be false. Either way, the logical proposition “A, B, and C together imply D” gives us insight into the way the economic world works.

Economics is divided between microeconomics and macroeconomics. Macroeconomics studies the economy from above, as if seen from space. It studies aggregate magnitudes, the big things like booms and busts, gross domestic product, rates of employment and unemployment, money supply and inflation. In contrast, microeconomics takes the close-up approach to understand the workings of the economy.
It begins by looking at how individuals, households, and firms make decisions, and how those decisions interact in markets. The individual decisions result in market variables, quantities demanded by buyers and supplied by sellers, and market prices.

When people, households, firms, and other economic agents make economic decisions, they alter the allocation of resources. For example, if many people suddenly want to buy some goods in large quantities, they may drive up the prices of those goods, they may drive up employment and wages of the workers who make those goods, they may drive up the profits of the firms that sell them, and they may drive down the wages of people making other goods and the profits of firms that supply the competing goods.

When a microeconomist analyzes a market in isolation, assuming that no effects are taking place in other markets, he is doing what is called partial equilibrium analysis. Partial equilibrium analysis focuses on the market for one good, and assumes prices and quantities of other goods are fixed. General equilibrium analysis assumes that what goes on in one market does affect prices and quantities in other markets. All markets in the economy interact, and all prices and quantities are determined more-or-less simultaneously. Obviously general equilibrium analysis is more difficult and complex than partial equilibrium analysis. Both types of analysis, however, are part of microeconomics, and we will do both in this book. Doing general equilibrium analysis allows the people who do microeconomics to connect to the aggregates of the economy, to see the “big picture.” This creates a link between microeconomics and macroeconomics.

We will now move on to begin our study, and we do so by considering how individual households make consumption decisions. This is called the theory of the consumer.

Part I Theory of the Consumer

2 Preferences and Utility

2.1 Introduction

Life is like a shopping center. The consumer enters it and sees lots of goods, in various quantities, that she might buy. A consumption bundle, or a bundle for short, is a combination of quantities of the various goods (and services) that are available. For instance, a consumption bundle might be 2 apples, 1 banana, 0 cookies, and 5 diet sodas. We would write this as (2, 1, 0, 5). Of course the consumer prefers some consumption bundles to others; that is, she has tastes or preferences regarding those bundles.

In this chapter we will discuss the economic theory of preferences in some detail. We will make various assumptions about a consumer’s feelings about alternative consumption bundles. We will assume that when given a choice between two alternative bundles, the consumer can make a comparison. (This assumption is called completeness.) We will assume that when looking at three alternatives, the consumer is rational in the sense that, if she says she likes the first better than the second and the second better than the third, she will also say that she likes the first better than the third. (This is part of what is called transitivity.) We will examine other basic assumptions that economists usually make about a consumer’s preferences: one says that the consumer prefers more of each good to less (called monotonicity), and another says that a consumer’s indifference curves (or sets of equally-desirable consumption bundles) have a certain plausible curvature (called convexity). We will describe and discuss the consumer’s rate of tradeoff of one good against another (called her marginal rate of substitution).

After discussing the consumer’s preferences, we will turn to her utility function.
A utility function is a numerical representation of how a consumer feels about alternative consumption bundles: if she likes the first bundle better than the second, then the utility function assigns a higher number to the first than to the second, and if she likes them equally well, then the utility function assigns the same number to both. We will analyze utility functions and describe marginal utility, which, loosely speaking, is the extra utility provided by one additional unit of a good. We will derive the relationship between the marginal utilities of two goods and the marginal rate of substitution of one of the goods for the other. We will provide various algebraic examples of utility functions, and, in the appendix, we will briefly review the calculus of derivatives and partial derivatives.

In this chapter and others to follow, we will often assume there are only two goods available, with x1 and x2 representing quantities of goods 1 and 2, respectively. Why only two goods? For two reasons: first, for simplicity (two goods gives a much simpler model than three goods or five thousand, often with no loss of generality); and second, because we are often interested in one particular good, and we can easily focus on that good and call the second good “all other goods,” or “everything else,” or “other stuff.” When there are two goods any consumption bundle can easily be shown in a standard two-dimensional graph, with the quantity of the first good on the horizontal axis and the quantity of the second good on the vertical axis. All the figures in this chapter are drawn this way.

In this chapter we will focus on the consumer’s preferences about bundles of goods, or how she feels about various things that she might consume. But in the shopping center of life some bundles are feasible or affordable for the consumer; these are the ones which her budget will allow.
Other bundles are non-feasible or unaffordable; these are the ones her budget won’t allow. We will focus on the consumer’s budget in Chapter 3.

2.2 The Consumer’s Preference Relation

The consumer has preferences over consumption bundles. We represent consumption bundles with symbols like X and Y. If there are two goods, X is a vector (x1, x2), where x1 is the quantity of good 1 and x2 is the quantity of good 2. The consumer can compare any pair of bundles and decide which one is better, or decide they are equally good. If she decides one is better than the other, we represent her feelings with what is called a preference relation; we use the symbol ≻ to represent the preference relation. That is, X ≻ Y means the consumer prefers bundle X over bundle Y. Presented with the choice between X and Y, she would choose X. We assume that if X ≻ Y, then Y ≻ X cannot be true; if the consumer likes X better than Y, then she had better not like Y better than X! Obviously, a consumer’s preferences might change over time, and might change as she learns more about the consumption bundles. (The relation ≻ is sometimes called the strict preference relation rather than the preference relation, because X ≻ Y means the consumer definitely, unambiguously, prefers X to Y, or strictly prefers X to Y.)

If the consumer likes X and Y equally well, we say she is indifferent between them. We write X ∼ Y in this case, and ∼ is called the indifference relation. Sometimes we will say that X and Y are indifferent bundles for the consumer. In this case, if presented with the choice between them, the consumer might choose X, might choose Y, might flip a coin, or might even ask us to choose for her. We assume that if X ∼ Y, then Y ∼ X must be true; if the consumer likes X exactly as well as Y, then she had better like Y exactly as well as X!
The reader might notice that the symbols for preference and for indifference are a little like the mathematical symbols > and =, for greater than and equal to, respectively. This is no accident. And, just as there is a mathematical relation that combines these two, ≥ for greater than or equal to, there is also a preference relation symbol ≿, for preferred or indifferent to. That is, we write X ≿ Y to represent the consumer’s either preferring X to Y, or being indifferent between the two. (The relation ≿ is sometimes called the weak preference relation.)

Assumptions on preferences: At this point we make some basic assumptions about the consumer’s preference and indifference relations. Our intention is to model the behavior of what we would consider a rational consumer. In this section we will assume the two goods are desirable to the consumer; we will touch on other possibilities (such as neutral goods or bads) in the Exercises.

Assumption 1. Completeness. For all consumption bundles X and Y, either X ≻ Y, or Y ≻ X, or X ∼ Y. That is, the consumer must like one better than the other, or like them equally well. This may seem obvious, but sometimes it’s not. For example, what if the consumer must choose what’s behind the screen on the left, or the screen on the right, and she has no idea what might be hidden behind the screens? That is, what if she doesn’t know what X and Y are? We force her to make a choice, or at least to say she is indifferent. Having a complete ordering of bundles is very important for our analysis throughout this book. (In Chapters 19 and 20 we will analyze consumer behavior under uncertainty, or incomplete information.)

Assumption 2. Transitivity. This assumption has four parts:

• First, transitivity of preference: if X ≻ Y and Y ≻ Z, then X ≻ Z.
• Second, transitivity of indifference: if X ∼ Y and Y ∼ Z, then X ∼ Z.
• Third, if X ≻ Y and Y ∼ Z, then X ≻ Z.
• Fourth and finally, if X ∼ Y and Y ≻ Z, then X ≻ Z.
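Completeness and the four parts of transitivity can be checked mechanically on any finite list of bundles. The following is a minimal sketch, not a construction from the text: it assumes the consumer's weak preference (≿) is supplied as a Python function `weak(x, y)`, and it derives strict preference and indifference from that function. The function names, the "compare total quantity" rule, and the three-item cycle are all hypothetical illustrations.

```python
from itertools import product


def strictly_prefers(weak, x, y):
    """x is strictly preferred to y: x is weakly preferred to y, but not the reverse."""
    return weak(x, y) and not weak(y, x)


def indifferent(weak, x, y):
    """x and y are indifferent: each is weakly preferred to the other."""
    return weak(x, y) and weak(y, x)


def is_complete(weak, bundles):
    """Assumption 1: every pair of bundles can be compared."""
    return all(weak(x, y) or weak(y, x) for x, y in product(bundles, repeat=2))


def is_transitive(weak, bundles):
    """Assumption 2: check all four parts on every triple of bundles."""
    def P(a, b):
        return strictly_prefers(weak, a, b)

    def I(a, b):
        return indifferent(weak, a, b)

    for x, y, z in product(bundles, repeat=3):
        if P(x, y) and P(y, z) and not P(x, z):
            return False  # part 1 fails
        if I(x, y) and I(y, z) and not I(x, z):
            return False  # part 2 fails
        if P(x, y) and I(y, z) and not P(x, z):
            return False  # part 3 fails
        if I(x, y) and P(y, z) and not P(x, z):
            return False  # part 4 fails
    return True


# Hypothetical rule 1: the consumer compares bundles by total quantity.
def by_total_quantity(x, y):
    return sum(x) >= sum(y)


# Hypothetical rule 2: a preference cycle a over b, b over c, c over a.
CYCLE = {("a", "b"), ("b", "c"), ("c", "a")}


def cyclic(x, y):
    return x == y or (x, y) in CYCLE


bundles = [(2, 1), (1, 2), (3, 3), (0, 0)]
print(is_complete(by_total_quantity, bundles), is_transitive(by_total_quantity, bundles))
print(is_complete(cyclic, "abc"), is_transitive(cyclic, "abc"))
```

The first rule passes both checks because it inherits its ordering from ordinary numbers. The cyclic rule is complete (every pair can be compared) but fails the first part of transitivity.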
The transitivity of preference assumption is meant to rule out irrational preference cycles. You would probably think your friend needs psychiatric help if she says she prefers Econ. 1 (the basic economics course) to Soc. 1 (the basic sociology course), and she prefers Soc. 1 to Psych. 1 (the basic psychology course), and she prefers Psych. 1 to Econ. 1. Cycles in preferences seem irrational. However, do not be too dogmatic about this assumption; there are interesting exceptions in the real world. We will provide one later on in the exercises.

The transitivity of indifference assumption (that is, if X ∼ Y and Y ∼ Z, then X ∼ Z) makes indifference curves possible. An indifference curve is a set of consumption bundles (or, when there are two goods, points in a two-dimensional graph) which the consumer thinks are all equally good; she is indifferent among them. We will use indifference curves frequently throughout this book, starting in Figure 2.1 below. The figure shows two consumption bundles, X and Y, and an indifference curve. The two bundles are on the same indifference curve, and therefore the consumer likes them equally well.

INSERT FIGURE 2.1 HERE
Caption for Fig. 2.1: At bundle X, the consumer is consuming x1 units of good 1 and x2 units of good 2. Similarly at bundle Y, she is consuming y1 units of good 1 and y2 units of good 2. Since X and Y are on one indifference curve, the consumer is indifferent between them.

Assumption 3. Monotonicity. We normally assume that goods are desirable, which means the consumer prefers consuming more of a good to consuming less. That is, suppose X and Y are two bundles of goods such that (1) X has more of one good (or both) than Y does and (2) X has at least as much of both goods as Y has. Then X ≻ Y. Of course there are times when this assumption is inappropriate. For instance, suppose a bundle of goods is a quantity of cake and a quantity of ice cream, which you will eat this evening.
After 3 slices of cake and 6 scoops of ice cream, more cake and more ice cream may not be welcome. But if the goods are more generally defined (e.g., education, housing), monotonicity is a very reasonable assumption.

Some important consequences of monotonicity are the following: indifference curves representing preferences over two desirable goods cannot be thick or upward sloping. Nor can they be vertical or horizontal. This should be apparent from Figure 2.2 below, which shows an upward sloping indifference curve, and a thick indifference curve. On any indifference curve, the consumer is indifferent between any pair of consumption bundles. A brief examination of the figure should convince the reader that the monotonicity assumption rules out both types of indifference curves shown, and similar arguments rule out vertical and horizontal indifference curves.

INSERT FIGURE 2.2 HERE
Caption for Fig. 2.2: Each indifference curve shown is a set of equally-desirable consumption bundles. For example, for any pair of bundles X and Y on the upward sloping curve, X ∼ Y. Can you see why the monotonicity assumption makes the upward sloping indifference curve impossible? How about the thick indifference curve?

In Figure 2.3 below we show a downward sloping thin indifference curve, which is what the monotonicity assumption requires. The figure also shows the set of bundles which by the monotonicity assumption must be preferred to all the bundles on the indifference curve (the more preferred set), and the set of bundles which by the monotonicity assumption must be liked less than all the bundles on the indifference curve (the less preferred set).

INSERT FIGURE 2.3 HERE
Caption for Fig. 2.3: The only graph compatible with monotonic preferences is a downward sloping thin indifference curve.

Another implication of the assumptions of transitivity (of indifference) and monotonicity is that two distinct indifference curves cannot cross.
This is shown in Figure 2.4.

INSERT FIGURE 2.4 HERE
Caption for Fig. 2.4: Two distinct indifference curves cannot cross. Here is why. Suppose the curves did cross at the point X. Since Y and X are on the same indifference curve, Y ∼ X. Since X and Z are on the same indifference curve, X ∼ Z. Then by transitivity of indifference, Y ∼ Z. But by monotonicity, Y ≻ Z. Therefore having the indifference curves cross leads to a contradiction.

Assumption 4. Convexity for indifference curves. This assumption means that averages of consumption bundles are preferred to extremes. Consider two distinct points on one indifference curve. The (arithmetic) average of the two points would be found by connecting them with a straight line segment, and then taking the midpoint of that segment. This is the standard average, which gives equal weight to the two extreme points. A weighted average gives possibly unequal weights to the two points; geometrically a weighted average would be any point on the line segment connecting the two original points, not just the midpoint. The assumption of convexity for indifference curves means this: for any two distinct points on the same indifference curve, the line segment connecting them (excepting its end points) lies above the indifference curve. In other words, if we take a weighted average of two distinct points, between which the consumer is indifferent, she prefers the weighted average to the original points. We show this in Figure 2.5 below.

We call preferences well behaved when indifference curves are downward sloping and convex.

INSERT FIGURE 2.5 HERE
Caption for Fig. 2.5: Convexity of preferences means that indifference curves are convex, as in the figure, rather than concave. This means that the consumer prefers averaged bundles over extreme bundles. For example, the bundle made up of 1/2 times X plus 1/2 times Y, that is X/2 + Y/2, is preferred to either X or Y.
This is what we normally assume to be the case.

In reality, of course, indifference curves are sometimes concave. There are many examples we can think of in which a consumer might like two goods, but not in combination. You may like sushi and chocolate ice cream, but not together in the same dish; you may like classical music and hip-hop, but not in the same evening; you may like pink clothing and orange clothing, but not in the same outfit. Again, if the goods are defined generally enough, like classical music consumption per year, hip-hop consumption per year, pink and orange clothing worn this year, the assumption of convexity of indifference curves becomes very reasonable. We show a concave indifference curve in Figure 2.6 below.

INSERT FIGURE 2.6 HERE
Caption for Fig. 2.6: A concave indifference curve. This consumer prefers the extreme points X and Y to the average X/2 + Y/2.

2.3 The Marginal Rate of Substitution

The marginal rate of substitution is an important and useful concept because it describes the consumer’s willingness to trade consumption of one good for consumption of the other. Consider this thought experiment. The consumer gives up a unit of good 1 in exchange for getting some amount of good 2. How much good 2 does she need to get in order to end up on the same indifference curve? This is the quantity of good 2 that she needs to replace one unit of good 1. Or, consider a slightly different thought experiment. The consumer gets a unit of good 1 in exchange for giving up some amount of good 2. How much good 2 can she give up and end up on the same indifference curve? This is the quantity of good 2 that she is willing to give up in exchange for a unit of good 1.

The answer to either of these questions is a measure of her valuation of a unit of good 1, in terms of units of good 2. This is the intuitive idea of the marginal rate of substitution of good 2 for good 1.
It is her rate of tradeoff between the two goods, the rate at which she can substitute good 2 for good 1 and remain as well off as she was before the substitution.

Now let ∆x1 represent a change in her consumption of good 1, and ∆x2 represent a change in her consumption of good 2, and suppose the two changes move her from a point on an indifference curve to another point on the same indifference curve. Remember that for well behaved preferences, indifference curves are downward sloping, and therefore one of the ∆’s will be positive and the other negative. If ∆xi > 0, she’s getting some good i; if ∆xi < 0, she’s giving up some good i. In the first thought experiment above, we let ∆x1 = −1; in the second, we let ∆x1 = +1. In both, we were really interested in the magnitude of the resulting ∆x2. This is the amount of good 2 needed to replace a unit of good 1, or the amount of good 2 that she would be willing to give up to get another unit of good 1.

At this point, rather than thinking about the consumer swapping a unit of good 1 in exchange for some amount of good 2, we consider the ratio ∆x2/∆x1. This ratio is the rate at which the consumer has to get good 2 in exchange for giving up good 1 (if ∆x1 < 0 and ∆x2 > 0), or the rate at which she has to give up good 2 in exchange for getting good 1 (if ∆x1 > 0 and ∆x2 < 0). Also, we assume that the ∆’s are very small, or infinitesimal. More formally, we take the limit as ∆x1 and ∆x2 approach 0.

Because we are assuming that ∆x1 and ∆x2 are small moves from a point on an indifference curve that leave the consumer on the same indifference curve, the ratio ∆x2/∆x1 represents the slope of that indifference curve at that point. Since the indifference curves are downward sloping,

\[ \Delta x_2 / \Delta x_1 = \text{Indifference Curve Slope} < 0. \]

The definition of the marginal rate of substitution of good 2 for good 1, which we will write \( MRS_{x_1,x_2} \), or just MRS for short, is

\[ MRS_{x_1,x_2} = MRS = -\Delta x_2 / \Delta x_1 = -\text{Indifference Curve Slope}. \]
More formally,

\[ MRS = \lim_{\Delta x_1, \Delta x_2 \to 0} \left( -\frac{\Delta x_2}{\Delta x_1} \right) = -\text{Indifference Curve Slope}. \]

In Figure 2.7 below, we show a downward sloping indifference curve, and a tangent line at a point X on the indifference curve. We show two increments from X, ∆x1 and ∆x2, that get the consumer back to the same indifference curve. Note that ∆x1 > 0 and ∆x2 < 0 in the figure. If the consumer gets ∆x1 units of good 1, she is willing to give up −∆x2 units of good 2. Her marginal rate of substitution is the limit of −∆x2/∆x1, as ∆x1 and ∆x2 approach zero. That is, her marginal rate of substitution is −1 times the slope of the indifference curve at X, or −1 times the slope of the tangent line at X.

INSERT FIGURE 2.7 HERE
Caption for Fig. 2.7: Intuitively, the marginal rate of substitution is an answer to one of these questions: “If I take away ∆x1 units of good 1, how much good 2 do I need to give you for you to remain indifferent?”, or “If I give you ∆x1 units of good 1, how much good 2 can I take away from you and have you remain indifferent?” The second question is illustrated here.

For well behaved preferences, the MRS decreases as you move down and to the right along an indifference curve. This makes good sense. It means that if a consumer consumes more and more of a good, while staying on the same indifference curve, she values an additional unit of that good less and less. To convince yourself that this is plausible, consider the following story.

A well-off woman (Ms. Well-Off) is lost in the middle of a desert. She is so thirsty, almost dying of thirst. She has no water (good 1), but she does have $100 (good 2) in her pocket. A profit-seeking local trader (Mr. Rip-Off), carrying water, offers her a drink, and asks her: “How much are you willing to pay me for your first glass of water?” (That is, “What is your MRS of money for water when you have no water, but $100?”) Honest to a fault, she answers $25. Mr.
Rip-Off immediately proposes this trade, and the first glass of water is sold for $25. At this point, Mr. Rip-Off asks again: “You are probably still thirsty, aren’t you? How much are you willing to pay for a second glass of water?” (That is, “What is your MRS of money for water when you already have had a glass of water, and you have $75 left?”) She now answers: “Yes, I am still thirsty. I would pay you $10 for a second glass.” They make this trade also. Her valuation of the second glass of water, her MRS of money for water, has dropped by more than half. This process continues for a while. By the time Ms. Well-Off has had nine or ten glasses of water, her MRS has dropped to zero, because at this point her need for water is much less pressing than her need for a bathroom.

INSERT FIGURE 2.8 HERE
Caption for Fig. 2.8: The MRS is decreasing because the consumer gets satiated with water as she consumes more of it. She is willing to pay less and less for the incremental drink.

2.4 The Consumer’s Utility Function

Mathematically, it is much easier to work with functions than with relations, such as the preference relation and the indifference relation. Our goal now is to construct a function that will represent the preferences of a consumer. Such a function is called a utility function.

Imagine that we assign a number to each bundle. For example, we assign the number u(X) = u(x1, x2) = 5, to the bundle X = (x1, x2); we assign the number u(Y) = u(y1, y2) = 4, to Y = (y1, y2); and so on. We say that such an assignment of numbers to bundles is a consumer’s utility function if:

• First, u(X) > u(Y) whenever X ≻ Y.
• And second, u(X) = u(Y) whenever X ∼ Y.

Note how this assignment of numbers to bundles is a faithful translation of the consumer’s preferences. It gives a higher utility number to the preferred bundle, and it gives the same number to two bundles that the consumer likes equally well.
This is the sense in which this function accurately represents the preferences of the consumer. Our consumer’s utility function is said to be an “ordinal” utility function rather than a “cardinal” utility function. An ordinal statement only gives information about relative magnitudes; for instance, “I like Tiffany more than Jennifer.” A cardinal statement provides information about magnitudes that can be added, subtracted, and so on. For instance, “Billy weighs 160 lbs. and Johnny weighs 120 lbs.” We can conclude from the latter statement that Billy weighs 40 lbs. more than Johnny, that the ratio of their weights is exactly 4/3, and that the sum of their weights is 280 lbs. Is utility an ordinal or a cardinal concept? The utilitarians, led by the English philosopher Jeremy Bentham (1748-1832), believed that utility is a cardinal magnitude, perhaps as measurable as length, weight, and so on. For them, statements like these would make sense: “I get three times as much utility from my consumption bundle as you get from your consumption bundle” or “I like a vacation cruise in the West Indies twice as much as you do.” Today, for the most part, we treat utility simply as an ordinal magnitude. All we care about is whether an individual’s utility number from one consumption bundle is larger than, equal to, or smaller than the same individual’s utility number from another bundle. For one individual, differences or ratios of utility numbers from different bundles generally do not matter, and comparisons of utilities across different individuals have no meaning. Under the ordinal interpretation of utility numbers, if we start with any utility function representing my preferences, and we transform it by adding a constant, it still represents my preferences perfectly well. Or, if we multiply it by a positive number, it still works perfectly well. 
Or, assuming all my utility numbers are positive, if we square all of them, or raise them all to a positive power, we are left with a modified utility function that still represents my preferences perfectly well. In short, if we start with a utility function representing my preferences, and modify it with what’s called an order-preserving transformation, then it still represents my preferences. All this is summed up in the following statement:

If u(X) = u(x1, x2) is a utility function that represents the preferences of a consumer, and f is any order-preserving transformation of u, the transformed function f(u(X)) = f(u(x1, x2)) is another utility function that also represents those preferences.

What is the connection between indifference curves and utility functions? The answer is that we use indifference curves to represent constant levels of utility. Remember that we are assuming the consumer’s utility level depends on her consumption of two goods, measured as variables x1 and x2. We need one axis to represent the amount of x1, and a second axis to represent the amount of x2. If we were to show utility in the same picture as quantities of the two goods, we would need a third axis to represent the utility level u that corresponds to the consumption bundle (x1, x2). A utility function in such a three-dimensional picture looks like a hillside. But three-dimensional pictures are hard to draw. It is much easier to draw two-dimensional graphs with level curves.

A level curve for a function is a set of points in the function’s domain, over which the function takes a constant value. If you’ve hiked or climbed mountains with the help of a topographical map, you have used a picture with level curves; an elevation contour on the map is a level curve. Similarly, a weather map has level curves; the isobar lines represent sets of points with the same barometric pressure. (Isobar means: the same barometric pressure.)
An indifference curve is a set of points in the consumption bundle picture, among which the consumer is indifferent. Since she is indifferent among these points, they all give her the same utility. Hence, the indifference curve is a level curve for her utility function. Therefore, in order to represent a consumer’s utility function, we will simply draw its level curves, its indifference curves, in the (x1, x2) quadrant. This is like transforming a three-dimensional picture of a mountain into a two-dimensional topographical map, with elevation contours.

INSERT FIGURE 2.9 HERE
Caption for Fig. 2.9: The indifference curves are the level curves of the utility function.

2.5 Utility Functions and the Marginal Rate of Substitution

Next we explain the connection between the marginal rate of substitution, and the utility function that represents the consumer’s preferences. Figure 2.10 below is similar to Figure 2.7. The marginal rate of substitution of good 2 for good 1, at the point X, is −∆x2/∆x1, roughly speaking. (And precisely speaking, in the limit.) How does this relate to a utility function for this consumer?

INSERT FIGURE 2.10 HERE
Caption for Fig. 2.10: Marginal utility and the marginal rate of substitution.

The marginal utility of good 1 is the rate at which the consumer’s utility increases as good 1 increases, while we hold the quantity of good 2 constant. Loosely speaking, it is the extra utility from an extra unit of good 1. More formally, let ∆x1 represent an increment of good 1. The marginal utility of good 1, which we write MU1, is defined as:

\[ MU_1 = \lim_{\Delta x_1 \to 0} \frac{u(x_1 + \Delta x_1, x_2) - u(x_1, x_2)}{\Delta x_1}. \]

If it weren’t for the presence of the variable x2, students would recognize this as the derivative of the function u(x1). And this is almost exactly what it is, except the function u(x1, x2) is really a function of two variables, the second of which, x2, is being held constant.
The derivative of a function of two variables, with respect to x1 while x2 is being held constant, is called the partial derivative of the function u(x1, x2) with respect to x1. A derivative is commonly shown with a d symbol, as in df(x)/dx. A partial derivative is commonly shown with a ∂ symbol instead of a d, and so the marginal utility of good 1 can be written as

\[ MU_1 = \frac{\partial u(x_1, x_2)}{\partial x_1} = \frac{\partial u}{\partial x_1}. \]

The marginal utility of good 2, which we write MU2, is defined as:

\[ MU_2 = \lim_{\Delta x_2 \to 0} \frac{u(x_1, x_2 + \Delta x_2) - u(x_1, x_2)}{\Delta x_2} = \frac{\partial u}{\partial x_2}. \]

Since marginal utility is derived from the utility function, which is ordinal, it shouldn’t be interpreted as a cardinal measure. That is, we don’t attach any meaning to a statement like “My marginal utility from an additional apple is 3.” We do attach meaning to a statement like “My marginal utility from an additional apple is 3, and my marginal utility from an additional banana is 2.” This simply means “I prefer an additional apple.”

Our main use of the marginal utility concept at this point is to calculate the consumer’s MRS. Consider Figure 2.10 again. From the bundle X = (x1, x2), we increase good 1 by ∆x1, and simultaneously decrease good 2 by ∆x2, to get back to the original indifference curve. If we evaluate the change in utility along the way (keeping in mind that we are really thinking of very small moves), we have the following: utility increases because of the increase in good 1, by an amount equal to the marginal utility of good 1 times ∆x1. At the same time, utility decreases because of the decrease in good 2, by an amount equal to the marginal utility of good 2 times ∆x2. The sum of the increase and the decrease is zero, since the consumer ends up on the original indifference curve. This gives the following equation (note that ∆x1 is positive and ∆x2 is negative):

\[ MU_1 \Delta x_1 + MU_2 \Delta x_2 = 0. \]

From this we easily get

\[ -\frac{\Delta x_2}{\Delta x_1} = \frac{MU_1}{MU_2}. \]

But MRS = −∆x2/∆x1. We conclude that

\[ MRS = \frac{MU_1}{MU_2}. \]
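The equality between the ratio of marginal utilities and the rate of tradeoff along an indifference curve can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the utility function u(x1, x2) = x1x2 as an example, approximates the marginal utilities with small finite differences rather than exact derivatives, and evaluates everything at the arbitrarily chosen bundle (2, 4). All function names are hypothetical.

```python
def u(x1, x2):
    """Example utility function (an assumption for this sketch): u = x1 * x2."""
    return x1 * x2


def marginal_utilities(x1, x2, h=1e-6):
    """Approximate MU1 and MU2 by central finite differences."""
    mu1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    mu2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return mu1, mu2


def mrs_from_marginal_utilities(x1, x2):
    """MRS computed as the ratio MU1 / MU2."""
    mu1, mu2 = marginal_utilities(x1, x2)
    return mu1 / mu2


def mrs_from_indifference_curve(x1, x2, dx1=1e-6):
    """MRS computed as -dx2/dx1 for a small move along the indifference curve.

    Solving u = x1 * x2 for x2 is specific to this example utility function.
    """
    level = u(x1, x2)
    new_x2 = level / (x1 + dx1)  # quantity of good 2 that keeps utility unchanged
    dx2 = new_x2 - x2            # negative: some good 2 is given up
    return -dx2 / dx1


# Both numbers should be close to x2 / x1 = 2 at the bundle (2, 4).
print(mrs_from_marginal_utilities(2.0, 4.0))
print(mrs_from_indifference_curve(2.0, 4.0))
```

The two printed values agree only approximately, because both rely on small but finite increments rather than true limits.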
This formula gives us a convenient tool for calculating the consumer’s marginal rate of substitution, either as a function of (x1, x2), or as a numerical value at a given point.

2.6 A Solved Problem

The Problem

For each of the following utility functions, find the marginal rate of substitution function, or MRS.

(a) \( u(x_1, x_2) = x_1 x_2 \);
(b) \( u(x_1, x_2) = 2x_2 \);
(c) \( u(x_1, x_2) = x_1 + x_2 \);
(d) \( u(x_1, x_2) = \min\{x_1, 2x_2\} \);
(e) \( u(x_1, x_2) = x_2 - x_1^2 \).

The Solution

We use the fact that the MRS equals the ratio of the marginal utilities, or \( MRS = MU_1/MU_2 \). In each case, we first calculate the marginal utilities, and then we find their ratio.

(a) Assume \( u(x_1, x_2) = x_1 x_2 \).

\[ MU_1 = \frac{\partial (x_1 x_2)}{\partial x_1} = x_2 \quad \text{and} \quad MU_2 = \frac{\partial (x_1 x_2)}{\partial x_2} = x_1. \]

Therefore

\[ MRS = \frac{MU_1}{MU_2} = \frac{x_2}{x_1}. \]

(b) Assume \( u(x_1, x_2) = 2x_2 \).

\[ MU_1 = \frac{\partial (2x_2)}{\partial x_1} = 0 \quad \text{and} \quad MU_2 = \frac{\partial (2x_2)}{\partial x_2} = 2. \]

Therefore

\[ MRS = \frac{MU_1}{MU_2} = \frac{0}{2} = 0. \]

(c) Assume \( u(x_1, x_2) = x_1 + x_2 \).

\[ MU_1 = \frac{\partial (x_1 + x_2)}{\partial x_1} = 1 \quad \text{and} \quad MU_2 = \frac{\partial (x_1 + x_2)}{\partial x_2} = 1. \]

Therefore

\[ MRS = \frac{MU_1}{MU_2} = \frac{1}{1} = 1. \]

(d) Assume \( u(x_1, x_2) = \min\{x_1, 2x_2\} \). The marginal utilities depend on whether \( x_1 < 2x_2 \), or \( x_1 > 2x_2 \). If \( x_1 < 2x_2 \), then

\[ MU_1 = \frac{\partial (\min\{x_1, 2x_2\})}{\partial x_1} = 1 \quad \text{and} \quad MU_2 = \frac{\partial (\min\{x_1, 2x_2\})}{\partial x_2} = 0. \]

Therefore

\[ MRS = \frac{MU_1}{MU_2} = \frac{1}{0} = \infty. \]

If \( x_1 > 2x_2 \), then

\[ MU_1 = \frac{\partial (\min\{x_1, 2x_2\})}{\partial x_1} = 0 \quad \text{and} \quad MU_2 = \frac{\partial (\min\{x_1, 2x_2\})}{\partial x_2} = 2. \]

Therefore

\[ MRS = \frac{MU_1}{MU_2} = \frac{0}{2} = 0. \]

Finally, if \( x_1 = 2x_2 \), then MRS is undefined.

(e) Assume \( u(x_1, x_2) = x_2 - x_1^2 \).

\[ MU_1 = \frac{\partial (x_2 - x_1^2)}{\partial x_1} = -2x_1 \quad \text{and} \quad MU_2 = \frac{\partial (x_2 - x_1^2)}{\partial x_2} = 1. \]

Therefore

\[ MRS = \frac{MU_1}{MU_2} = \frac{-2x_1}{1} = -2x_1. \]

Exercises

1. We assumed at the beginning of the chapter that a consumer’s preferences must be transitive, but we hinted that there might be interesting exceptions. Here are two:

(a) A consumer likes sugar in her coffee, but she simply cannot taste the difference between a cup of coffee with n grams of sugar in it and a cup of coffee with n + 1 grams.
Suppose a teaspoon of sugar is 10 grams, and suppose she takes her coffee with one teaspoon of sugar. Why does this violate transitivity?

(b) Let’s call a committee of three people a “consumer.” (Groups of people often act together as “consumers.”) Our committee makes decisions using majority voting. When they compare two alternatives x and y they simply take a vote, and the winner is said to be “preferred” by the committee to the loser. Suppose that the preferences of the individuals are as follows: Person 1 likes x best, y second best, and z third best. We write this in the following way: Person 1 : x, y, z. Assume the preferences of the other two people are: Person 2 : y, z, x; and Person 3 : z, x, y. Show that in this example the committee preferences produced by majority voting violate transitivity. (This is the famous “voting paradox” first described by the French philosopher and mathematician Marquis de Condorcet (1743-1794).)

2. Consider the utility function u(x1, x2) = x1x2.

(a) Graph the indifference curves for utility levels 1 and 2. (They are symmetric hyperbolas asymptotic to both axes.)

(b) Graph the locus of points for which the MRS of good 2 for good 1 is equal to 1, and the locus of points for which the MRS is equal to 2.

3. Different students at World’s Greatest University (W.G.U.) have different preferences about economics. Draw the indifference curves associated with each of the following statements. Measure “economics books” along the horizontal axis and “books about other subjects” along the vertical. Draw arrows indicating the direction in which utility is increasing.

(a) “I care only about the total amount of knowledge I acquire. It is the same whether that is economics knowledge or of any other kind. That is, all books on all subjects are perfect substitutes for me.”

(b) “I hate the Serrano/Feldman textbook and all other economics books. On the other hand, I love everything else in the W.G.U.
curriculum.”

(c) “I really like books about economics because I want to understand the economic world. Books about other subjects make no difference to me.”

(d) “I like all my courses and the liberal education that W.G.U. offers. That is, I prefer to read books on a variety of different subjects, rather than to read lots on one subject and little on the others.”

4. Sketch indifference curves for utility levels 1 and 2 for each of the following utility functions. Describe in a sentence or two the consumer’s preferences for the two goods.

(a) \( u(x_1, x_2) = 2x_2 \);
(b) \( u(x_1, x_2) = x_1 + x_2 \);
(c) \( u(x_1, x_2) = \min\{x_1, 2x_2\} \);
(d) \( u(x_1, x_2) = x_2 - x_1^2 \).

5. Donald likes fishing (x1) and hanging out in his hammock (x2). His utility function for these two activities is \( u(x_1, x_2) = 3x_1^2 x_2^4 \).

(a) Calculate MU1, the marginal utility of fishing.

(b) Calculate MU2, the marginal utility of hanging out in his hammock.

(c) Calculate MRS, the rate at which he is willing to substitute hanging out in his hammock for fishing.

(d) Last week, Donald fished 2 hours a day, and hung out in his hammock 4 hours a day. Using your formula for MRS from (c) above, find his MRS last week.

(e) This week, Donald is fishing 8 hours a day, and hanging out in his hammock 2 hours a day. Calculate his MRS this week. Has his MRS increased or decreased? Explain why.

(f) Is Donald happier or sadder this week compared to last week? Explain.

6. Suppose you are choosing between hours of work (a bad measured on the horizontal axis) and money (a good measured on the vertical axis).

(a) Explain the meaning of MRS in words.

(b) Should your MRS be positive or negative in this case?

(c) Is your MRS increasing, constant or decreasing as you increase the hours of work along an indifference curve? Explain and draw some indifference curves for this example.

Appendix: Differentiation of Functions

This short appendix is not meant to be a substitute for a calculus course.
However, it may serve as a helpful review.

Let’s begin with functions of one variable. Consider a function y = f(x). Its derivative is

\[ y' = f'(x) = \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}. \]

The derivative of the function f is the rate at which f increases as we increase x, the infinitesimal increment in f divided by the infinitesimal increment in x.

Some examples of differentiation of functions of one variable are:

• (1) \( y = 4x \), \( y' = 4 \);
• (2) \( y = 7x^2 \), \( y' = 14x \);
• (3) \( y = \ln x \), \( y' = 1/x \).

What about functions of several variables? Consider a function u(x1, x2), like our utility function. We define two partial derivatives of u, with respect to x1 and with respect to x2:

\[ \frac{\partial u}{\partial x_1} = \lim_{\Delta x_1 \to 0} \frac{u(x_1 + \Delta x_1, x_2) - u(x_1, x_2)}{\Delta x_1} \]

and

\[ \frac{\partial u}{\partial x_2} = \lim_{\Delta x_2 \to 0} \frac{u(x_1, x_2 + \Delta x_2) - u(x_1, x_2)}{\Delta x_2}. \]

The first is the rate at which u increases as we increase x1, while holding x2 constant. The second is the rate at which u increases as we increase x2, while holding x1 constant.

How do we partially differentiate a function of several variables? Almost exactly the same way we differentiate a function of one variable, except that we must remember that if we are differentiating with respect to variable xi, we treat any other variable xj as a constant.

Some examples are:

• (1) \( u(x_1, x_2) = x_1 x_2 \); \( \partial u/\partial x_1 = x_2 \), \( \partial u/\partial x_2 = x_1 \);
• (2) \( u(x_1, x_2) = x_1^2 x_2^3 \); \( \partial u/\partial x_1 = 2x_1 x_2^3 \), \( \partial u/\partial x_2 = 3x_1^2 x_2^2 \);
• (3) \( u(x_1, x_2) = \ln x_1 + 2 \ln x_2 \); \( \partial u/\partial x_1 = 1/x_1 \), \( \partial u/\partial x_2 = 2/x_2 \).

3 The Budget Constraint and the Consumer’s Optimal Choice

3.1 Introduction

In Chapter 2 we described the consumer’s preferences and utility function. Now we turn to what constrains him, and what he should do to achieve the best outcome given his constraint. The consumer prefers some bundles to other bundles. He wants to get to the most-preferred bundle, or the highest possible utility level, but he cannot afford everything. He has a budget constraint.
The consumer wants to make the best choice possible, the optimal choice, or the utility-maximizing choice, subject to his budget constraint.

In this lesson we will describe the consumer’s standard budget constraint. We will give some examples of special budget constraints created by non-market rationing devices, like coupon rationing. We will also analyze budget constraints involving consumption over time. After describing various budget constraints, we will turn to the consumer’s basic economic problem: how to find the best consumption bundle, or how to maximize his utility, subject to the budget constraint. We will do this graphically using indifference curves, and we will do it analytically with utility functions. In the appendix to this lesson we will describe the Lagrange function method for maximizing a function subject to a constraint.

3.2 The Standard Budget Constraint, the Budget Set and the Budget Line

A consumer cannot spend more money than he has. (We know about credit and will discuss it in a later section of this lesson.) We call what he has his income, written \(M\), for “money.” He wants to spend it on goods 1 and 2. Each has a price, represented by \(p_1\) and \(p_2\), respectively. The consumer’s standard budget constraint, or budget constraint for short, says that the amount he spends (the sum of price times quantity for each of the two goods) must be less than or equal to the money he has. This gives:

\[p_1 x_1 + p_2 x_2 \le M.\]

The budget set is the set of all bundles that satisfy the budget constraint, i.e., all the bundles the consumer can afford. Of course there will generally be many bundles available in the budget set.

The budget line is the set of bundles where the consumer is spending exactly what he has. That is, it is the set of bundles \((x_1, x_2)\) satisfying the equation

\[p_1 x_1 + p_2 x_2 = M.\]

The figure below represents the consumer’s budget line.

INSERT FIGURE 3.1 HERE
Caption for Fig.
3.1: The budget line is a downward sloping straight line. The intercepts are \(M/p_1\) and \(M/p_2\), and the slope is \(-p_1/p_2\).

The horizontal intercept of the budget line is the amount of good 1 the consumer would have if he spent all his money on that good; that is, if he consumed \(x_2 = 0\). This is \(x_1 = M/p_1\) units of good 1. Similarly, he would have \(M/p_2\) units of good 2 if he spent all his money on that good. Since the price per unit of each good is a constant, the budget line is a straight line connecting these two intercepts.

The slope of the budget line is negative. The absolute value of the slope, \(p_1/p_2\), is sometimes called the relative price of good 1. This is the amount \(\Delta x_2\) of good 2 that the consumer must give up, if he wants to consume an additional amount \(\Delta x_1\) of good 1. (Compare this with the MRS of good 2 for good 1, the amount \(\Delta x_2\) of good 2 that the consumer is just willing to give up, in order to consume an additional amount \(\Delta x_1\) of good 1.)

The budget line defines a tradeoff for the consumer who wants to increase his consumption of good 1 and simultaneously decrease his consumption of good 2. Note that in Figure 3.1, \(\Delta x_1\) is a positive number (good 1 is increasing) and \(\Delta x_2\) is a negative number (good 2 is decreasing). If the amount spent on the two goods remains constant, the sum of the increase in money spent on good 1 and the decrease in money spent on good 2 must be zero, or

\[p_1 \Delta x_1 + p_2 \Delta x_2 = 0.\]

This gives

\[-\frac{\Delta x_2}{\Delta x_1} = \frac{p_1}{p_2}.\]

3.3 Shifts of the Budget Line

If the consumer’s income changes, or if the prices of the goods change, the budget line moves. Figure 3.2 below shows how the budget line shifts if income increases while prices stay constant.

INSERT FIGURE 3.2 HERE
Caption of Fig. 3.2: In this figure, income increases from \(M\) to \(M'\). The budget line shifts out, parallel to itself. The new intercepts are \(M'/p_1\) and \(M'/p_2\) on the horizontal and vertical axes, respectively.
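The intercept and slope formulas for the budget line, and the parallel shift when income rises, can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers (prices 1 and 2, income rising from 10 to 20) are assumptions chosen for the example, not values from the figures.

```python
def budget_line(p1, p2, income):
    """Return (horizontal intercept, vertical intercept, slope)
    of the budget line p1*x1 + p2*x2 = income."""
    return income / p1, income / p2, -p1 / p2


def is_affordable(x1, x2, p1, p2, income):
    """True if the bundle (x1, x2) lies in the budget set."""
    return p1 * x1 + p2 * x2 <= income


# Illustrative (assumed) prices and income levels.
before = budget_line(p1=1.0, p2=2.0, income=10.0)
after = budget_line(p1=1.0, p2=2.0, income=20.0)

print("before:", before)
print("after: ", after)
# Income changes the intercepts but not the slope: a parallel shift.
print("same slope:", before[2] == after[2])
```

Both intercepts double when income doubles, while the slope, which depends only on the price ratio, is unchanged.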
If both prices decrease by the same proportion, the same kind of shift occurs. Suppose the new prices are \(p_1' = k p_1\) and \(p_2' = k p_2\), where \(k < 1\) is the same factor for both prices. Then the new budget line has slope \(-p_1'/p_2' = -[k p_1]/[k p_2] = -p_1/p_2\). The new intercept on the horizontal axis is \(M/p_1' = M/(k p_1) = (1/k)(M/p_1)\), which is farther out the axis because \(k < 1\). If income decreases while both prices stay the same, or if both prices rise by the same proportion while income stays constant, the budget line shifts in.

Now consider what happens when one price, say \(p_1\), rises, while the other price and income stay the same. Let \(p_1'\) be the new price and \(p_1\) the old, with \(p_1' > p_1\). If the consumer spends all his income on good 1, he will consume less, because the new intercept \(M/p_1'\) is smaller than the old \(M/p_1\). The intercept on the good 2 axis doesn’t move. The budget line gets steeper, because the absolute value of the new slope, \(p_1'/p_2\), is greater than the absolute value of the old slope, \(p_1/p_2\). Figure 3.3 below shows this important type of budget line shift.

INSERT FIGURE 3.3 HERE
Caption of Fig. 3.3: The price of good 1 rises, while the price of good 2, and income, stay the same.

3.4 Odd Budget Constraints

The standard budget constraint described above assumes first, that prices are constant for any quantities of the goods the consumer might want to consume. Second, it assumes that prices don’t depend on income. Third, it assumes that nothing constrains the consumer except prices and the money in his pocket. However, the real world often doesn’t follow these assumptions. The real world is full of non-standard budget constraints; here are two examples:

Example 1, A 2-for-1 Store Coupon. The consumer has one (and only one) coupon from a grocery store, allowing him to buy two units of good 1 for the price of one. The price of good 1 is 1 dollar per unit.
Also, the consumer’s income is \(M = 5\), and the price of good 2 is \(p_2 = 1\). It follows that the effective price is \(p_1 = 1/2\) if \(x_1 \le 2\), and \(p_1 = 1\) for each unit beyond \(x_1 = 2\). Figure 3.4 below illustrates this case. The intercept on the good 2 axis is obviously \(M/p_2 = 5\), while the intercept on the horizontal axis is somewhat less obviously 6: the first two units cost 1 dollar in total, and the remaining 4 dollars buy 4 more units.

INSERT FIGURE 3.4 HERE
Caption of Fig. 3.4: The case of a 2-for-the-price-of-1 promotional coupon.

Example 2, Ration Coupons. In times of war (and other emergencies, real or imagined) governments will sometimes ration scarce commodities (including food, fuel, and so on). This might mean that goods 1 and 2 sell for money at prices \(p_1\) and \(p_2\), but that the purchaser also needs a government coupon for each unit of the rationed good (say good 1) that he buys. Suppose the consumer has income of \(M = 100\); let \(p_1 = 1\), and \(p_2 = 2\), and suppose that the consumer has ration coupons for 50 units of good 1. The vertical intercept is \(x_2 = 50\) and the slope of the budget line for \(x_1 < 50\) is \(-1/2\). At the point \((x_1, x_2) = (50, 25)\), the budget line becomes vertical; the consumer cannot buy more than 50 units of good 1 since he only has 50 coupons.

INSERT FIGURE 3.5 HERE
Caption of Fig. 3.5: The case of wartime coupon rationing.

3.5 Income and Consumption Over Time

One very crucial type of budget constraint shows the consumer’s choices over time. This is called an intertemporal budget constraint. For this purpose, we start by assuming there are two time periods (“this year” and “next year”), and we let \(x_1\) represent consumption this year, and \(x_2\) represent consumption next year. For simplicity, we assume that a unit of the consumption good, called “stuff,” has a price of 1, both this year and next year. (Assuming that a unit of a good has a price of 1 is sometimes called normalizing the price. A good with a price of 1 is sometimes called a numeraire good.)
Since we are assuming the price of a unit of the good is the same this year and next year, we are assuming no price inflation. (We will add inflation to the mix in some exercises in this chapter, and again when we revisit this topic in Chapter 5.)

We assume that the consumer has income \(M_1\) this year, and will have income \(M_2\) next year. He could obviously choose \(x_1 = M_1\) and \(x_2 = M_2\). In this case he’s spending everything that he gets this year on his consumption this year, and spending everything that he gets next year on his consumption next year. He’s neither borrowing nor saving.

Alternatively, he could save some of this year’s income. In this case he spends some of \(M_1\) on consumption this year, and he sets some aside until next year, when he spends all that remains from this year, plus his income from next year. (We assume that he has monotonic preferences; he always prefers more stuff to less, and will therefore end up spending everything available by the end of next year.)

Assume for now that what the consumer doesn’t spend this year he hides under his mattress for next year. In other words, he puts the money he doesn’t spend away in a safe place, but he doesn’t get any interest on his savings. His budget constraint now says that what he consumes next year (\(x_2\)) must equal what he saved and put under his mattress this year (\(M_1 - x_1\)), plus his income next year (\(M_2\)). This gives

\[x_2 = (M_1 - x_1) + M_2\]

or

\[x_1 + x_2 = M_1 + M_2.\]

Note that we have written the budget constraint as an equation, rather than as an inequality, since the consumer ultimately spends all that he has.

Next let’s assume that the consumer doesn’t hide his money under his mattress. Instead, whatever he doesn’t spend this year he puts into a bank account (or an investment) that pays a fixed and certain rate of return \(i\) (\(i\) is for “interest,” expressed as a decimal).
Now what he saves and puts away in the first year, \((M_1 - x_1)\), he gets back with interest (multiply by \((1 + i)\)), causing it to grow to \((1 + i)(M_1 - x_1)\) next year. The consumer’s budget constraint now becomes

\[x_2 = (1 + i)(M_1 - x_1) + M_2\]

or

\[(1 + i)x_1 + x_2 = (1 + i)M_1 + M_2.\]

Finally, dividing both sides of the equation by \(1 + i\) gives

\[x_1 + \left(\frac{1}{1 + i}\right) x_2 = M_1 + \left(\frac{1}{1 + i}\right) M_2.\]

Economists call the term \(1/(1 + i)\) the discount factor. In general, the term present value means that some future amount, or amounts, or some series of amounts over time, are being converted to the current time, or current year equivalent. The term \(x_2/(1 + i)\) is called the (year 1) present value of year 2 consumption. The term \(M_2/(1 + i)\) is called the (year 1) present value of year 2 income. The left hand side of the budget equation, or \(x_1 + x_2/(1 + i)\), is called the present value of the consumer’s consumption stream, and the right hand side of the equation, or \(M_1 + M_2/(1 + i)\), is called the present value of the consumer’s income stream. Therefore the budget equation we just derived says that the present value of the consumption stream equals the present value of the income stream.

In the analysis above, we assumed the consumer saves some of his first-year income \(M_1\), in order to be able to consume more in the second year than his second-year income \(M_2\). Now let’s assume he does the reverse. That is, we now assume that he borrows against next year’s income, in order to increase this year’s consumption. For simplicity, we will assume that the interest rate \(i\) is the same for savers and borrowers. (This is of course quite unrealistic; in reality the interest rate paid to savers is normally much less than the interest rate paid by borrowers. To appreciate the difference, compare the interest rate applied to balances on your credit card to the interest rate paid to savers at your bank.)
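The saver’s budget equation and its present-value reading can be checked with a small numerical sketch. The figures used here (income of 100 in each year, a 10 percent interest rate, first-year consumption of 60) are assumptions picked only for illustration, and the function names are hypothetical.

```python
def present_value(amount_next_year, i):
    """Discount a year-2 amount back to year 1 using the factor 1/(1+i)."""
    return amount_next_year / (1.0 + i)


def next_year_consumption(x1, m1, m2, i):
    """Saver's budget line: x2 = (1+i)(M1 - x1) + M2."""
    return (1.0 + i) * (m1 - x1) + m2


# Assumed illustrative values.
m1, m2, i = 100.0, 100.0, 0.10
x1 = 60.0  # consume 60 this year, save 40

x2 = next_year_consumption(x1, m1, m2, i)
pv_consumption = x1 + present_value(x2, i)
pv_income = m1 + present_value(m2, i)

print("x2 =", x2)
print("PV of consumption stream =", pv_consumption)
print("PV of income stream      =", pv_income)
```

Up to floating-point rounding, the two present values coincide, which is what the budget equation asserts.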
If the consumer intends to spend less than his income in the second year, then \(x_2 < M_2\), or \(M_2 - x_2 > 0\). Suppose the consumer goes to his banker in the first year and asks this question: Next year I can pay you back \(M_2 - x_2\). How much can you lend me this year, based on this anticipated repayment? The banker reasons to himself: If I make a loan of \(L\) this year, I must get all my money back next year, plus interest, or a total of \((1 + i)L\). Therefore I require \((1 + i)L = M_2 - x_2\). Solving for \(L\) then gives \(L = (M_2 - x_2)/(1 + i)\). (Of course this process may be more complicated in the real world. In reality, bankers either require collateral or security for loans, as with real estate mortgages, or, for unsecured loans, they charge interest rates high enough to compensate for defaults.)

We can now lay out the consumer’s budget constraint in the case where he is a borrower. His consumption this year (\(x_1\)) is equal to his income this year (\(M_1\)), plus the loan he gets from his banker, \((M_2 - x_2)/(1 + i)\). This gives

\[x_1 = M_1 + \frac{M_2 - x_2}{1 + i},\]

or, rearranging terms,

\[x_1 + \left(\frac{1}{1 + i}\right) x_2 = M_1 + \left(\frac{1}{1 + i}\right) M_2.\]

But this is exactly the same budget equation as in the saver case!

To summarize, we have looked at a consumer who has income this year and income next year, and who will consume some stuff this year and more stuff next year. We assumed the consumer can save or borrow, and that the interest rate is the same for savers and for borrowers. We have shown that the consumer has a simple budget constraint involving consumption quantities this year and next, income this year and next, and the interest rate. We have shown that there is a simple and intuitive interpretation of the budget constraint: The present value of the consumer’s consumption stream must equal the present value of his income stream. This is a crucial result for the theory of intertemporal choice.
Moreover, the budget constraint we found, and more generally, the methodology of present values, are crucial in the theory and practice of finance.

3.6 The Consumer’s Optimal Choice: Graphical Analysis

As we said when we began with the theory of the consumer, the consumer will choose the bundle that he most prefers among those that he can afford. This is his optimal choice. To put it another way, he will find the highest indifference curve that’s consistent with his budget. Figure 3.6 below illustrates.

INSERT FIGURE 3.6 HERE
Caption of Fig. 3.6: The consumer’s optimal choice is \((x_1^*, x_2^*)\).

What conditions must be satisfied by the consumer’s optimal choice?

First, at the optimal point, the consumer’s indifference curve and budget line are just touching, as we can plainly see in Figure 3.6. The figure actually shows more than that; it shows the standard case where the indifference curve and the budget line are in fact tangent at \((x_1^*, x_2^*)\). That is, both slopes are well-defined and equal. (We will consider some examples where the slopes are either not defined, or not equal, below.)

Since the slopes are equal in the figure, the absolute values of the slopes are also equal. The absolute value of the slope of an indifference curve at a point is equal to the MRS, and the absolute value of the slope of the budget line is \(p_1/p_2\). Therefore for the standard case illustrated in Figure 3.6, we have

\[MRS = p_1/p_2.\]

Recall that the marginal rate of substitution is interpreted as the amount of good 2 the consumer is just willing to give up, in exchange for getting an increment of good 1, and the price ratio \(p_1/p_2\) is the amount of good 2 that the market demands the consumer give up, in exchange for an increment of good 1. At the optimal point \((x_1^*, x_2^*)\) of the figure, what the consumer is just willing to do is exactly equal to what the market demands that he do.
Now consider a point where \(MRS \neq p_1/p_2\), for instance the bundle \((x_1', x_2')\) in Figure 3.6. At that bundle, consider the possibility of the consumer giving up some of good 2, and getting some of good 1 in exchange. For a given increment of good 1, the consumer would be willing to give up much more of good 2 than the market requires that he give up (the indifference curve is relatively steep and the budget line is relatively flat). Therefore he would trade according to market prices, move down and to the right on the budget line, and make himself better off. The opposite adjustment would happen at the bundle \((x_1'', x_2'')\). And no such adjustment can happen at the optimal bundle \((x_1^*, x_2^*)\).

Second, the optimal point must be on the budget line. That is, it must be the case that

\[p_1 x_1 + p_2 x_2 = M.\]

This is because we are assuming monotonic preferences; the consumer always prefers more to less, and will spend all of his income. A bundle like \((\hat{x}_1, \hat{x}_2)\) in Figure 3.6 is not optimal and would not be chosen by the consumer, because he prefers, and can afford, bundles above and to the right of \((\hat{x}_1, \hat{x}_2)\), that is, bundles with more of both goods.

The principle behind the consumer’s optimal choice is always the same: he wants to buy the most-preferred bundle, or get to the highest indifference curve, that he can afford. However, our marginal rate of substitution condition assumes that the MRS of good 2 for good 1 is well defined, and that the optimal bundle has positive amounts of both goods. What happens to the consumer’s optimal choice without these assumptions? Consider the following examples:

• Marginal rate of substitution not defined. Suppose \(u(x_1, x_2) = \min\{x_1, x_2\}\); \(p_1 = 2\) and \(p_2 = 1\). When the utility function has this form, we call \(x_1\) and \(x_2\) perfect complements. This means a unit of good 1 is always consumed with exactly one unit of good 2. Think, for example, of a left shoe and a right shoe.
Unless you are missing a limb, you always want to consume exactly one right shoe with each left shoe. You can check to see that the MRS is not defined when \(x_1 = x_2\) because the utility function is not differentiable there. However, it’s easy to graph the consumer’s choice problem in this case.

INSERT FIGURE 3.7 HERE
Caption of Fig. 3.7: Perfect complements.

Clearly, the budget line equation \(p_1 x_1 + p_2 x_2 = M\) still holds because of monotonicity of preferences. And the second equation we need is \(x_1 = x_2\) in this case. It’s pointless for this consumer to choose a bundle where this condition is not met, given his preferences.

• Corner solution. Assume \(u(x_1, x_2) = x_1 + x_2\). When the utility function has this form, we call \(x_1\) and \(x_2\) perfect substitutes. This is like Coke and Pepsi for a consumer who (strangely) cannot taste the difference; a bottle of one soda can be freely substituted for a bottle of the other, with no effect on utility. Clearly, if \(p_1 < p_2\), this consumer will spend all his income on good 1. The utility function is differentiable and the MRS is equal to 1 everywhere. If the price ratio \(p_1/p_2 \neq 1\), it is impossible to have a tangency of an indifference curve with the budget line. But this only means that the optimal choice must be at an end point of the budget line; that is, on the good 1 axis or the good 2 axis. (Such an optimal choice is called a corner solution.) This consumer would drink only Coke, or only Pepsi, whichever is cheaper.

INSERT FIGURE 3.8 HERE
Caption of Fig. 3.8: Perfect substitutes. Note that \(p_1/p_2\), the absolute value of the slope of the budget line, is now less than 1.

In most of this book, we construct examples of optimal choices where indifference curves and budget lines are tangent. We do it this way to make the explanations simpler.
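The two non-tangency cases above can be illustrated with a short sketch. The prices \(p_1 = 2\) and \(p_2 = 1\) come from the perfect complements example; the income of 12 is an assumption added for the illustration, and the function names are hypothetical.

```python
def perfect_complements_choice(p1, p2, income):
    """u = min(x1, x2): combine x1 = x2 with p1*x1 + p2*x2 = income,
    which gives x1 = x2 = income / (p1 + p2)."""
    x = income / (p1 + p2)
    return x, x


def perfect_substitutes_choice(p1, p2, income):
    """u = x1 + x2: spend all income on the cheaper good (a corner solution).
    With equal prices every bundle on the budget line is optimal, so
    None is returned to signal that there is no unique choice."""
    if p1 < p2:
        return income / p1, 0.0
    if p2 < p1:
        return 0.0, income / p2
    return None


# p1 and p2 from the example above; income = 12 is an assumed value.
print(perfect_complements_choice(2.0, 1.0, 12.0))
print(perfect_substitutes_choice(2.0, 1.0, 12.0))
```

With these numbers the perfect-complements consumer buys equal amounts of both goods, while the perfect-substitutes consumer buys only good 2, the cheaper one.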
3.7 The Consumer’s Optimal Choice: Utility Maximization Subject to the Budget Constraint

As we have said, the consumer will choose the bundle that he most prefers among those that he can afford. This is the consumer’s optimal choice. In the last section we thought of this as finding the highest indifference curve that’s consistent with the consumer’s budget line. Now we think of the same problem, but this time we think of it as maximizing the consumer’s utility function subject to his budget constraint. We will assume in this section that the consumer’s optimal choice is at a point where an indifference curve is tangent to the budget line.

The consumer’s optimal choice \((x_1^*, x_2^*)\) is the solution to this problem:

Maximize \(u(x_1, x_2)\) subject to \(p_1 x_1 + p_2 x_2 = M\).

This is a special type of calculus problem; the objective function \(u(x_1, x_2)\) is being maximized subject to a constraint. If the constraint were not there, there would be no maximum, given our assumption of monotonicity. Therefore we cannot try to solve the problem by first maximizing \(u(x_1, x_2)\), and then worrying about the constraint. There are three ways we can solve the problem.

1. Brute force method. We could use the constraint to solve for one of the variables, plug the result back into the objective function, and then maximize the objective function, which has been reduced to a function of just one variable. (This function does have a maximum.) That is, we use the constraint to solve for \(x_2\):

\[x_2 = \frac{M - p_1 x_1}{p_2}.\]

We plug this into \(u(x_1, x_2)\) giving

\[u\left(x_1, \frac{M - p_1 x_1}{p_2}\right).\]

Note that \(x_2\) has disappeared from the utility function. We differentiate this function with respect to \(x_1\) and set the result equal to zero. Solving the resulting equation gives \(x_1^*\). We then plug this back into the budget equation, \(p_1 x_1^* + p_2 x_2 = M\), and use this to solve for \(x_2^*\).
The brute force method, as the name suggests, may be rather ugly and difficult, and we will try to avoid using it in what follows.

2. Use-the-graphs method. We could rely on what we learned from the graphs; at a consumer optimum where an indifference curve is tangent to the budget line, it must be the case that \(MRS = p_1/p_2\). We then combine this equation with the budget constraint equation \(p_1 x_1 + p_2 x_2 = M\) to solve for the two unknowns \((x_1^*, x_2^*)\). This is the method that we use most often in this book.

3. The Lagrange function method. The standard mathematical method for solving a constrained maximization problem is the following. First set up a special function, called the Lagrange function, that incorporates both the objective function and the constraint. In our case, the Lagrange function would be

\[L = u(x_1, x_2) + \lambda(M - (p_1 x_1 + p_2 x_2)).\]

In this function, \(\lambda\) is a special variable called the Lagrange multiplier. Next we proceed to find the first order conditions for the maximization of \(L\) with respect to \(x_1\), \(x_2\) and \(\lambda\); these boil down to \(MRS = p_1/p_2\) and \(p_1 x_1 + p_2 x_2 = M\). Finally we use the first-order conditions to solve for the optimal quantities of the goods \((x_1^*, x_2^*)\), and for the optimal \(\lambda^*\). This method is more elegant than methods 1 and 2 above, and the Lagrange multiplier has a nice economic interpretation in terms of how much the consumer would value a one dollar increase in his income. In general, however, we will stick to the use-the-graphs method in this book, since it is simpler than the Lagrange function method. We do describe the Lagrange method in more detail in the appendix to this chapter.

3.8 Two Solved Problems

Problem 1

Part 1. Assume \(p_1 = 1\) and \(p_2 = 2\), and the consumer has income \(M = 10\). Find the consumer’s budget constraint. Find utility-maximizing consumption bundles for the following utility functions:
(a) \(u(x_1, x_2) = x_1 + x_2\)
(b) \(u(x_1, x_2) = x_1 x_2\)

Part 2.
Now assume the prices change to \(p_1 = 2\) and \(p_2 = 1\). What is his new budget constraint? What happens to the utility maximizing consumption bundles in the two cases?

Solution to Problem 1

First note that with these utility functions the consumer will want to spend all his income. He will want to be on his budget line, not below it. The relevant budget constraint is an equation, not an inequality.

Part 1. In general, the budget constraint is \(p_1 x_1 + p_2 x_2 = M\). With prices \((p_1, p_2) = (1, 2)\) this gives \(x_1 + 2x_2 = 10\). His budget line has slope \(p_1/p_2 = 1/2\) in absolute value, going from intercept \(M/p_1 = 10\) on the good 1 (horizontal) axis to intercept \(M/p_2 = 5\) on the good 2 (vertical) axis.

(a) If \(u(x_1, x_2) = x_1 + x_2\), his indifference curves are straight lines, with slope equal to \(MRS = MU_1/MU_2 = 1/1 = 1\) in absolute value. There is no indifference curve/budget line tangency possible, because the indifference curves have slope 1 and the budget line has slope 1/2 (both in absolute value). To find the corner solution we can use a sketch like Figure 3.8 above, or we can simply calculate utility levels at the ends of the budget line. If he puts all his income into buying 10 units of good 1, \(u(10, 0) = 10\); if he puts all his income into buying 5 units of good 2, \(u(0, 5) = 5\). His optimal consumption bundle is therefore \((x_1^*, x_2^*) = (10, 0)\).

(b) If \(u(x_1, x_2) = x_1 x_2\), his indifference curves are hyperbolas. The tangency condition is \(MRS = p_1/p_2\) or

\[MRS = \frac{MU_1}{MU_2} = \frac{x_2}{x_1} = \frac{p_1}{p_2} = \frac{1}{2}.\]

This gives \(x_1 = 2x_2\). His budget constraint is \(x_1 + 2x_2 = 10\), and substituting for \(x_1\) gives \(4x_2 = 10\). It follows that the solution is \(x_1^* = 5\) and \(x_2^* = 2.5\).

Part 2. Now suppose the prices change to \((p_1, p_2) = (2, 1)\). His budget constraint becomes \(2x_1 + x_2 = 10\). His budget line now has slope \(p_1/p_2 = 2/1 = 2\) in absolute value, going from intercept \(M/p_1 = 5\) on the good 1 (horizontal) axis to intercept \(M/p_2 = 10\) on the good 2 (vertical) axis.
(a) If \(u(x_1, x_2) = x_1 + x_2\), his indifference curves are straight lines, with slope equal to \(MRS = MU_1/MU_2 = 1/1 = 1\) in absolute value, and again there is no indifference curve/budget line tangency possible. When we calculate utility levels at the ends of the budget line we find his optimal consumption bundle is now \((x_1^*, x_2^*) = (0, 10)\).

(b) If \(u(x_1, x_2) = x_1 x_2\), his indifference curves are hyperbolas. The tangency condition is again \(MRS = p_1/p_2\), which now leads to

\[MRS = \frac{x_2}{x_1} = \frac{p_1}{p_2} = \frac{2}{1}.\]

This gives \(x_1 = x_2/2\). His budget constraint is \(2x_1 + x_2 = 10\), and substituting for \(x_1\) gives \(2x_2 = 10\). It follows that the solution is \(x_1^* = 2.5\) and \(x_2^* = 5\).

Finally, let’s contrast the shifts in the optimal consumption bundles, as \(p_1/p_2\) changes from 1/2 to 2/1, in the cases of the two alternative utility functions. For \(u(x_1, x_2) = x_1 + x_2\), the optimal consumption bundle jumps as far as possible, from \((10, 0)\) to \((0, 10)\). For \(u(x_1, x_2) = x_1 x_2\), the optimal consumption bundle changes from \((5, 2.5)\) to \((2.5, 5)\). A moral of this story is that when the slope of the budget line changes, the consumer will substitute one good for the other dramatically in the straight-line indifference curve case, but only moderately in the curved (hyperbolic) indifference curve case.

Problem 2

The utility function \(u(x_1, x_2) = x_1^{\alpha} x_2^{\beta}\) is called a Cobb-Douglas utility function. (It is named after mathematician Charles Cobb and economist (and Illinois Senator) Paul Douglas. The chapters on the theory of the firm will provide a bit more information.) The constants \(\alpha\) and \(\beta\) are both positive. Suppose a consumer is maximizing this utility function subject to the budget constraint \(p_1 x_1 + p_2 x_2 = M\). The consumer will want to spend all his income on his utility-maximizing bundle \((x_1^*, x_2^*)\). Show the following:

(a) \[x_1^* = \left(\frac{\alpha}{\alpha + \beta}\right) \frac{M}{p_1},\]

(b) \[x_2^* = \left(\frac{\beta}{\alpha + \beta}\right) \frac{M}{p_2}.\]
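As a numerical sanity check on the two formulas (not a proof), one can compare them with a crude search along the budget line, in the spirit of the brute force method of Section 3.7. The sketch below uses the values from Problem 1, Part 1(b), which is the Cobb-Douglas case \(\alpha = \beta = 1\); the function names and the grid size are assumptions made for the illustration.

```python
def cobb_douglas_demand(alpha, beta, p1, p2, income):
    """Closed-form demands stated in Problem 2."""
    x1 = alpha / (alpha + beta) * income / p1
    x2 = beta / (alpha + beta) * income / p2
    return x1, x2


def grid_search_optimum(alpha, beta, p1, p2, income, steps=10000):
    """Evaluate utility at evenly spaced points on the budget line
    and return the best bundle found (an approximation only)."""
    best_x1, best_u = 0.0, -1.0
    for k in range(1, steps):
        x1 = (income / p1) * k / steps
        x2 = (income - p1 * x1) / p2  # stay on the budget line
        u = x1 ** alpha * x2 ** beta
        if u > best_u:
            best_x1, best_u = x1, u
    return best_x1, (income - p1 * best_x1) / p2


# Values from Problem 1, Part 1(b): u = x1*x2, p1 = 1, p2 = 2, M = 10.
formula = cobb_douglas_demand(1.0, 1.0, 1.0, 2.0, 10.0)
searched = grid_search_optimum(1.0, 1.0, 1.0, 2.0, 10.0)
print("formula:    ", formula)
print("grid search:", searched)
```

Both approaches should land on, or very near, the bundle (5, 2.5) found in Problem 1.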
Solution to Problem 2

(a) The indifference curves are similar to hyperbolas; however, they are not symmetric around a 45 degree line from the origin when \(\alpha \neq \beta\). Note that \(MU_1 = \alpha x_1^{\alpha - 1} x_2^{\beta}\) and \(MU_2 = \beta x_1^{\alpha} x_2^{\beta - 1}\). The tangency condition says the slope of the indifference curve must equal the slope of the budget line, which leads to

\[MRS = \frac{MU_1}{MU_2} = \frac{\alpha x_1^{\alpha - 1} x_2^{\beta}}{\beta x_1^{\alpha} x_2^{\beta - 1}} = \frac{\alpha x_2}{\beta x_1} = \frac{p_1}{p_2}.\]

This gives \(\beta p_1 x_1 = \alpha p_2 x_2\). It follows that \(p_1 x_1 = \alpha p_2 x_2/\beta\) or \(p_2 x_2 = \beta p_1 x_1/\alpha\). The budget constraint is \(p_1 x_1 + p_2 x_2 = M\). From \(p_2 x_2 = \beta p_1 x_1/\alpha\) and the budget constraint, we get \(p_1 x_1 + \beta p_1 x_1/\alpha = M\), or \((\alpha p_1 x_1 + \beta p_1 x_1)/\alpha = M\). This gives

\[x_1^* = \left(\frac{\alpha}{\alpha + \beta}\right) \frac{M}{p_1}.\]

(b) From \(p_1 x_1 = \alpha p_2 x_2/\beta\) and the budget constraint, we get \(\alpha p_2 x_2/\beta + p_2 x_2 = M\), or \((\alpha p_2 x_2 + \beta p_2 x_2)/\beta = M\). This gives

\[x_2^* = \left(\frac{\beta}{\alpha + \beta}\right) \frac{M}{p_2}.\]

Exercises

1. The consumer’s original budget equation is \(p_1 x_1 + p_2 x_2 = M\), where \(p_1\) and \(p_2\) are the original prices, and \(M\) is the original income level.
(a) If \(p_1\) doubles and \(p_2\) falls by half, what is the consumer’s new budget equation? How has the slope of the budget line changed?
(b) If \(p_1\) doubles and \(M\) triples, what is the equation for the new budget line? How has the slope of the budget line changed?

2. The consumer’s utility function is \(u(x_1, x_2) = x_1 x_2^2\).
(a) Graph her budget constraint for \(p_1 = 3\), \(p_2 = 2\) and \(M = 900\), and write down the equation for her budget line.
(b) Using the \(MRS = MU_1/MU_2 = p_1/p_2\) tangency condition, find her optimal consumption bundle for these prices and income.

3. George enjoys apples (\(a\)) and bananas (\(b\)). If he spends his entire allowance, he can afford 10 apples and 30 bananas. Alternatively, he can afford 15 apples and 15 bananas. The price of an apple is $3.
(a) Calculate George’s allowance, and the price of bananas.
(b) Assume his utility function is \(u(a, b) = a + b\). How many apples and bananas will he consume?

4. There are two goods in the world, pumpkins (\(x_1\)), and apple cider (\(x_2\)). Pumpkins are $2 each.
Cider is $7 per gallon for the first two gallons. After the second gallon, the price of cider drops to $4 per gallon.
(a) Peter’s income is $54. Draw his budget line. Solve for the intercepts on the \(x_1\) and \(x_2\) axes, and the kink in the budget line. Show these in your graph.
(b) Peter’s utility function is \(u(x_1, x_2) = x_1 + 3x_2\). Sketch some indifference curves in your graph. Find Peter’s optimal consumption bundle \((x_1^*, x_2^*)\).
(c) Paul’s income is $22. Draw his budget line in a new graph. Solve for the intercepts on the \(x_1\) and \(x_2\) axes, and the kink in the budget line. Show these in your graph for Paul.
(d) Paul’s utility function is \(u(x_1, x_2) = \min(3x_1, 2x_2)\). Sketch some indifference curves in your graph for Paul. Find Paul’s optimal consumption bundle \((x_1^*, x_2^*)\).

5. Olivia gets an allowance of $50 this week, but it will have to last her for two weeks, since Mom pays her every other week. Let \(c_1\) be her consumption this week (measured in units of stuff), and let \(c_2\) be her consumption next week (measured in the same units). The price of one unit of stuff this week is $1. Next week the price will be higher, because of inflation. Assume the inflation rate \(\pi\) is 1 percent per week, or 0.01 per week when expressed as a decimal. Therefore the price of one unit of stuff next week will be $1(1 + \(\pi\)) = $1.01. Olivia can borrow or lend at the local bank, and, whether she’s borrowing or lending, the interest rate \(i\) is 1 percent per week, or 0.01 when expressed as a decimal. Olivia’s utility function is \(u(c_1, c_2) = \ln(c_1) + \ln(c_2)\).
(a) Write down Olivia’s budget constraint, first in the abstract (with \(M\), \(\pi\) and \(i\)), and then with the given values incorporated.
(b) Find her optimal consumption bundle \((c_1^*, c_2^*)\).
(c) Assume the inflation rate rises to 10 percent, and the interest rate drops to zero. Find her new optimal consumption bundle.

6.
Sylvester’s preferences for consumption this period (\(c_1\)) and consumption next period (\(c_2\)) are given by the utility function \(u(c_1, c_2) = c_1^2 c_2\). Suppose the price of consumption this period is $1 per unit. Assume the interest rate \(i\) is 10 percent, and the inflation rate \(\pi\) is 5 percent. Sylvester has income of $100 this period and $100 next period.
(a) Write down the equation for his budget line, and show it in a graph. What are the intercepts of the budget line on the \(c_1\) and \(c_2\) axes? Label them in your graph. What is the slope of the budget line? Explain it briefly. Where is the zero savings point? Explain it briefly.
(b) Find Sylvester’s optimal consumption bundle \((c_1^*, c_2^*)\). We’ll call this \(S\) for short. Is Sylvester a lender or a borrower? Show \(S\) on your graph, and include the indifference curve passing through it.
(c) Suppose the interest rate \(i\) falls to 5 percent. Show the new budget line in your graph. Find Sylvester’s new optimal consumption bundle \(S'\).
(d) Is Sylvester better off or worse off at the new point \(S'\)?

Appendix: Maximization Subject to a Constraint: The Lagrange Function Method

Joseph Louis Lagrange (1736-1813), an Italian-French mathematician and astronomer, developed a widely used method for maximizing (or minimizing) a function subject to a constraint. Consumers maximize utility subject to budget constraints, and, as we shall see in a later chapter, firms minimize costs subject to output constraints, or maximize profits subject to cost constraints. Therefore the Lagrange method, also called the Lagrange multiplier method, is often used by economists.

We will illustrate the method for the case of a well-behaved utility function \(u(x_1, x_2)\) which a consumer is maximizing subject to a budget constraint \(p_1 x_1 + p_2 x_2 = M\). We assume that the consumer optimum is at a tangency point of an indifference curve and the budget line.
The Lagrange function method works as follows. We start with the utility function we want to maximize, u(x1, x2). We rewrite the consumer’s budget constraint as M − (p1x1 + p2x2) = 0. We write a new objective function, called the Lagrange function, as follows:

\[ L(x_1, x_2, \lambda) = u(x_1, x_2) + \lambda\,(M - (p_1 x_1 + p_2 x_2)). \]

Here λ is a new variable called the Lagrange multiplier. Intuitively, the Lagrange function is formed by taking the original objective function (i.e., u(x1, x2)), and adding to it λ times something that should be zero, at least at the optimal point (i.e., M − (p1x1 + p2x2)). Note that L is a function of three variables, x1, x2, and λ.

And then we maximize L. That is, we derive the first order conditions for the unconstrained maximization of L. It turns out that this process leads to the consumption bundle \((x_1^*, x_2^*)\) that solves the original constrained maximization problem. This is because the first order conditions for the unconstrained maximization of L produce the first order conditions for the constrained maximization of u(x1, x2). Moreover, the process also leads to a \(\lambda^*\) that has a nice intuitive interpretation. We’ll illustrate with two examples.

Example 1. Let u(x1, x2) = x1x2. Let p1 = 1, p2 = 2, and M = 10, as in the solved problem above, part 1(b). The Lagrange function is L = x1x2 + λ(10 − (x1 + 2x2)). The first order conditions for the maximization of L are:

\[ \frac{\partial L}{\partial x_1} = x_2 - \lambda \times 1 = 0, \qquad \frac{\partial L}{\partial x_2} = x_1 - \lambda \times 2 = 0, \qquad \text{and} \qquad \frac{\partial L}{\partial \lambda} = 10 - (x_1 + 2x_2) = 0. \]

Note that the first two first order conditions can be combined to get x2/x1 = 1/2. But this is exactly the same MRS = p1/p2 condition we are familiar with from the solved problem above, part 1(b). The last condition is of course the budget constraint. Solving for the optimal consumption bundle then gives \((x_1^*, x_2^*) = (10/2, 10/4) = (5, 2.5)\). Finally, solving for the optimal Lagrange multiplier gives \(\lambda^* = 2.5\).

What’s the interpretation of \(\lambda^*\)?
First consider the general description of the Lagrange method. Note that if we were to differentiate the Lagrange function L with respect to money income M, we would get λ. This suggests that λ is the rate of increase in L, and therefore in the objective function u(x1, x2), as we relax the budget constraint, that is, as we increase M.

Now consider our example. Assume we increase M somewhat, say by 1 dollar, to 11. Using the MRS = p1/p2 condition again gives x2/x1 = 1/2, and the budget constraint is now 11 − (x1 + 2x2) = 0. Solving both simultaneously gives \((x_1^{**}, x_2^{**}) = (11/2, 11/4)\). The increase in M results in an increase in utility for the consumer. That change in u is

\[ \Delta u = \frac{11}{2} \times \frac{11}{4} - \frac{10}{2} \times \frac{10}{4} = 15.125 - 12.5 = 2.625. \]

We conclude that

\[ \frac{\Delta u}{\Delta M} = \frac{2.625}{1} \approx \lambda^*. \]

That is, the Lagrange multiplier \(\lambda^*\) is approximately equal to the consumer’s increase in utility when his income rises by 1 dollar. If we let ΔM approach zero, the “approximately equal to” changes to “exactly equal to,” or

\[ \frac{\partial u}{\partial M} = 2.5 = \lambda^*. \]

That is, the Lagrange multiplier \(\lambda^*\) equals the consumer’s rate of increase in utility as his income rises, or as his budget constraint is relaxed.

Example 2. Let \(u(x_1, x_2) = x_1^2 x_2^3\). Let p1 = 2, p2 = 3, and M = 10. The Lagrange function is \(L = x_1^2 x_2^3 + \lambda(10 - (2x_1 + 3x_2))\). The first order conditions for the maximization of L are:

\[ \frac{\partial L}{\partial x_1} = 2x_1 x_2^3 - \lambda \times 2 = 0, \qquad \frac{\partial L}{\partial x_2} = 3x_1^2 x_2^2 - \lambda \times 3 = 0, \qquad \text{and} \qquad \frac{\partial L}{\partial \lambda} = 10 - (2x_1 + 3x_2) = 0. \]

In the first two first order conditions we bring the \(-\lambda p_i\) terms to the right hand sides of the equations, and then we divide each side of the first equation by the corresponding side of the second. This gives

\[ \frac{2x_2}{3x_1} = \frac{2}{3}, \]

and therefore x1 = x2. Now we use the last condition, that is, the consumer’s budget constraint, to calculate the optimal consumption point \((x_1^*, x_2^*) = (2, 2)\). We finish by calculating \(\lambda^* = x_1^*(x_2^*)^3 = (x_1^*)^2(x_2^*)^2 = 2^4 = 16\).
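Both examples can be checked numerically. The following is a minimal sketch and not part of the original text: it assumes a utility function of the form u(x1, x2) = x1^a x2^b, which covers both examples, and the function names are illustrative. Dividing the first two first order conditions gives a·x2/(b·x1) = p1/p2; substituting this into the budget constraint gives x1 = aM/((a + b)p1) and x2 = bM/((a + b)p2), and the multiplier then follows from the first condition.

```python
def optimum(a, b, p1, p2, M):
    """Tangency solution for u = x1**a * x2**b subject to p1*x1 + p2*x2 = M.

    Illustrative helper (not from the text). The closed forms come from
    combining the first order conditions with the budget constraint.
    """
    x1 = a * M / ((a + b) * p1)
    x2 = b * M / ((a + b) * p2)
    # First condition: dL/dx1 = a * x1**(a-1) * x2**b - lam * p1 = 0.
    lam = a * x1 ** (a - 1) * x2 ** b / p1
    return x1, x2, lam


def max_utility(a, b, p1, p2, M):
    """Utility attained at the optimum, viewed as a function of income M."""
    x1, x2, _ = optimum(a, b, p1, p2, M)
    return x1 ** a * x2 ** b


# Example 1: u = x1*x2, p1 = 1, p2 = 2, M = 10.
example_1 = optimum(1, 1, 1, 2, 10)
print(example_1)

# Utility gain from one extra dollar of income, as in the text.
delta_u = max_utility(1, 1, 1, 2, 11) - max_utility(1, 1, 1, 2, 10)
print(delta_u)

# A much smaller income change brings the ratio close to lambda.
h = 1e-6
rate = (max_utility(1, 1, 1, 2, 10 + h) - max_utility(1, 1, 1, 2, 10)) / h
print(rate)

# Example 2: u = x1**2 * x2**3, p1 = 2, p2 = 3, M = 10.
example_2 = optimum(2, 3, 2, 3, 10)
print(example_2)
```

The one-dollar change gives the 2.625 found above, while the very small change gives a ratio close to the multiplier of 2.5, which is the sense in which the multiplier measures the rate of increase in utility as the budget constraint is relaxed.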
4 Demand Functions

4.1 Introduction

Chapter 3 ended with a discussion of the consumer’s optimal choice. That is, we learned how the consumer decides how much of good 1 and good 2 she wants to buy, given prices p1 and p2, and given her income M. The consumer maximizes her utility u(x1, x2), subject to her budget constraint p1x1 + p2x2 = M. We call the utility maximizing bundle she chooses \((x_1^*, x_2^*)\). Because this optimal choice can be made for any values of p1, p2 and M, we are actually finding two functions: \(x_1^* = x_1(p_1, p_2, M)\) and \(x_2^* = x_2(p_1, p_2, M)\). These two functions show the amounts of good 1 or 2 the consumer wants to buy, that is, her demands for the two goods, given arbitrary values of prices and income. The functions are called her individual demand functions for goods 1 and 2, respectively. For notational ease we will drop the asterisks for the rest of this chapter.

We will start this chapter with a detailed study of individual demand functions. Each demand function depends on three independent variables. Because of this complication it would be impossible to graph them in the obvious way. (We would need four dimensions!) Therefore we will look at how demand changes as we vary one independent variable at a time. This exercise is called comparative statics, because we are comparing the consumer’s optimal consumption of goods 1 and 2, as one of the exogenous variables (one of the prices, or income) changes from one level to another. From now on we shall concentrate on the demand function for just one of the goods, say, good 1.

In Section 2 we will focus on demand as a function of income, holding prices constant. We will derive an Engel curve, a graph which shows the desired consumption of a good as a function of income. We will distinguish between normal goods (higher income results in higher consumption) and inferior goods (higher income results in lower consumption).
In Section 3 we will focus on demand as a function of price, and we will discuss the consumer’s standard demand curve (showing quantity as a function of price), as well as her inverse demand curve (showing price as a function of quantity). In Section 4 we will analyze the demand for good 1 as a function of the price of good 2, and in Section 5 we will consider income and substitution effects. An income effect is a change in consumption attributable to a change in income, and a substitution effect is a change in consumption attributable to a change in relative prices. There are actually several ways to measure the substitution effect, and we discuss and graph three alternative methods. In Section 6 we will discuss compensated demand curves, and in Section 7 we will develop the idea of elasticity, the economist’s favored method for measuring the change in one variable (e.g., demand for good 1) in response to the change in another variable (e.g., change in p1, or change in M). In Section 8 we will show how the market demand curve is found by adding together the demand curves of individual consumers, and finally, in Section 9, we will provide a solved problem.

4.2 Demand as a Function of Income

Consider the demand function for good 1: x1 = x1(p1, p2, M). In this section, p1 and p2 will remain fixed, so we focus on the function x1(M). That is, we are now interested in the consumer’s demand for a good as a function of the consumer’s income.

INSERT FIGURE 4.1 HERE

Note that in Figure 4.1, as income increases (and we move farther out from the origin) the amount of good 1 that the consumer wants to consume increases. A good that has this characteristic (as income rises, the consumer demands more) is called a normal good. As the word “normal” should suggest, many goods are normal.
A richer consumer probably wants to consume a bigger house (more square feet, more bedrooms, more bathrooms, more closets, more counter space, and more cabinets), a bigger yard, and a bigger or better car; she probably wants to travel or vacation more; she may want to consume more clothes or pairs of shoes. (Have you heard of Imelda Marcos? She was the wife of the president of the Philippines, and among other character flaws, she owned thousands of pairs of shoes.) The richer consumer probably also wants to buy more entertainment, more education, and more psychotherapy.

INSERT FIGURE 4.2 HERE

In Figure 4.2, we will draw a different picture to summarize this information. We now measure income M on the vertical axis and the desired amount of the good x1 on the horizontal axis, and we plot the pairs \((x_1, M)\), \((x_1', M')\) from the income expansion path. We connect our points to form a line, which we call an Engel curve for good 1. (Ernst Engel (1821-1896) was a German statistician and economist who first studied these curves.)

Note that in the Engel curve, and in other curves we will discuss in this chapter, we put the independent variable on the vertical axis, and the dependent variable on the horizontal axis. This is of course absolutely contrary to standard mathematical custom, which puts the dependent variable on the vertical axis. Economists have been plotting curves the “wrong” way since a great English economist, Alfred Marshall (1842-1924), started doing it in the late 19th century. We apologize for Marshall.

Of course there are some goods that the richer consumer wants to consume less of. These are the opposite of normal goods; they are called inferior goods. As income rises the consumer wants to consume less of them. Intercity bus service may be an example: as people become richer, they stop using the bus, and switch to the train, airplane or their own cars.
French fries at McDonald’s may also be an inferior good (even though they’re yummy!), because a consumer may go less often to McDonald’s as her income rises. The figure below shows the income expansion path for an inferior good. We won’t draw the (downward-sloping) Engel curve.

INSERT FIGURE 4.3 HERE

To complicate matters, it is really not possible for a good to be an inferior good for all levels of a consumer’s income. If her income is zero, her consumption of good 1 must be zero (because we are assuming goods have positive prices). And if her income is just above zero, her desired consumption of any particular good must be greater than or equal to what it was when her income was zero. Therefore no good can be an inferior good at income levels starting from M = 0. Think of those McDonald’s French fries. When the consumer is poor, as she gets a little more money, she may well buy more fries at McDonald’s. But when she’s rich enough, she starts to eat fewer and fewer fries at McDonald’s. (Perhaps she nibbles pommes frites at Chez Panisse.)

4.3 Demand as a Function of Price

As you recall, the consumer’s demand for good 1 depends on the price of good 1, the price of good 2, and on her income: x1 = x1(p1, p2, M). In this section, we assume that M and p2 are fixed. We focus on the crucial relationship between the price of good 1 and the quantity of good 1 she wants to consume: x1(p1). Figure 4.4 below shows how the consumer’s desired consumption bundle (x1, x2) shifts as the price of good 1 falls. This exercise is repeated for various different good 1 prices to get an offer curve.

INSERT FIGURE 4.4 HERE

Remember that a “normal” good is one for which the consumer’s demand rises as her income rises. We expect a rising price to have the opposite effect. That is, as the price of a good rises, the consumer’s demand for that good usually falls. If this is true for good 1 (as p1 rises, x1 falls) we call good 1 an ordinary good.

At this point we will construct a new graph.
In Figure 4.5 below we again put the independent variable, now the price p1, on the vertical axis, and the dependent variable, the amount of good 1 the consumer wants, x1, on the horizontal axis. We plot the points \((p_1, x_1)\), \((p_1', x_1')\), and so on, corresponding to the consumer’s optimal choices, and connect them with a line. The result is the consumer’s demand curve. Note that price and quantity demanded are inversely related in this graph.

INSERT FIGURE 4.5 HERE

Most goods are ordinary goods. That is, they obey the law of demand, which says that as the price goes up, the quantity demanded goes down, and as the price goes down, the quantity demanded goes up. In other words, demand curves are (ordinarily!) downward sloping.

However, there are examples of goods that are not ordinary, that disobey the law of demand. These are called Giffen goods, after Robert Giffen (1837-1910), an English statistician and economist. (Although we continue to call them Giffen goods, the attribution may be incorrect. It may have originated in Alfred Marshall’s Principles of Economics, 3rd edition, 1895: “As Mr. Giffen has pointed out, a rise in the price of bread makes so large a drain on the resources of the poorer labouring families and raises so much the marginal utility of money to them, that they are forced to curtail their consumption of meat and the more expensive farinaceous foods: and, bread being still the cheapest food which they can get and will take, they consume more, and not less of it.” In 2009 Wikipedia raised doubts about Marshall’s attribution, stating that “scholars have not been able to identify any passage in Giffen’s writings where he pointed this out,” but others have raised doubts about Wikipedia.)

In any case, whether we should call them Giffen goods or Marshall/Giffen goods, these goods violate the law of demand. The very poor consumer eats a lot of bread, because she cannot afford to eat much dairy, fish, meat, and vegetables.
The price of bread goes up, making her in effect poorer. She responds by cutting down even more on the expensive foods (and perhaps on clothing, housing, and so on). As a result she ends up eating even more bread. In Figure 4.6 below we show a Giffen good. The price p1 rises, and the consumer’s demand for good 1 rises as well.

INSERT FIGURE 4.6 HERE

We need to make one last comment before leaving the notion of Giffen goods. Most of us can think of people we know who consume some goods precisely because they are expensive, in order to show off their wealth. Perhaps this is why some people wear Rolex or Breitling watches, or drive Rolls Royce or Lamborghini cars, or sail 30-meter yachts. This kind of demand may violate the law of demand; that is, if Rolex watches sold for $20 instead of $2,000, some consumers might want fewer of them. But these are not examples of Giffen goods. They are called Veblen goods, after Thorstein Veblen (1857-1929). These examples lie outside of the standard economic model of consumer behavior, in which the consumer’s preferences, and her utility function, depend only on the intrinsic attributes of the goods she consumes, and not directly on the prices of those goods. In particular, the standard economic model does not incorporate a consumer’s desire to signal her wealth by buying expensive toys. That would require a different model, and we can’t do everything in such an inexpensive textbook!

Inverse demand. Consider again the demand curve for an ordinary good (Figure 4.5 above). We will sometimes want to read each point on the demand curve “vertically,” instead of “horizontally” (as we have been doing so far). That is, instead of saying that at each price there is a certain amount demanded, as shown by the demand curve, we may want to say that for each amount demanded, the consumer would be willing to pay a certain price.
With ordinary goods, this willingness to pay is decreasing in the amount demanded: for the consumption of a larger amount, the consumer is willing to pay a lower (per-unit) price. A demand curve looked at this way (for each quantity, there is a corresponding price) is called an inverse demand curve; it gives price as a function of quantity, which we write p1(x1). We will come back to this notion when we introduce the notion of consumer’s surplus.

4.4 Demand as a Function of Price of the Other Good

We are still analyzing the demand for good 1 as a function of three underlying variables, the price of good 1, the price of good 2, and the consumer’s income; that is x1 = x1(p1, p2, M). In this section we will fix p1 and M, and focus on the relationship between the demand for good 1 and the price of good 2. Formally, this is x1(p2).

Goods 1 and 2 are called substitutes if an increase in the price of one causes the demand for the other to increase. As relative prices change, one of the goods is substituted for the other. There are very close substitutes, and not-so-close substitutes. Exxon-Mobil gasoline and Shell gasoline are very close substitutes; the oil companies may claim they are different in some ways, but they are actually almost identical. Competing gas stations selling different brands will find that if one raises its price and the other doesn’t, customers will abruptly shift to the service station with the lower price, whether it is Exxon-Mobil or Shell. Two drugs in the same class, such as Advil and Tylenol, may be substitutes. Chicken and pork (the “other white meat”) are probably substitutes; as the price of one rises, the demand for the other probably increases.

Figure 4.7 below shows substitutes in the offer curve context. An analogous figure could be constructed in the price/quantity context, with the price of good 2 on the vertical axis and the desired quantity of good 1 on the horizontal.
This would be a cross demand curve, but we will omit it here.

INSERT FIGURE 4.7 HERE

Goods 1 and 2 are called complements if an increase in the price of one causes the demand for the other to decrease. For instance, ink jet printers and ink cartridges are complements, cell phones and ring tones are complements, and (large) autos and gasoline are complements. In 2008, as the price of gasoline soared in the United States, the demand for large SUVs and pickup trucks plummeted, leaving some auto makers in dire financial straits. Figure 4.8 shows complements in the offer curve context.

INSERT FIGURE 4.8 HERE

A final note about “substitutes” and “complements”: Saying that goods 1 and 2 are “substitutes” may be trickier than one might think. It is possible, for example, that a consumer’s demand for good 1 rises as the price of good 2 rises (suggesting good 1 “substitutes” for good 2), while, at the same time, the consumer’s demand for good 2 falls as the price of good 1 rises (suggesting good 2 “complements” good 1).

Here’s an example. Assume the consumer’s utility function is u(x1, x2) = (x1 − 1)(x2 + 1), for bundles with x1 ≥ 1. Think of good 1 as “food,” and assume x1 = 1 is the subsistence level. With 1 unit of food she is barely alive and has utility of zero, and any less food means she’s dead. Good 2 is “other stuff.” If you sketch a graph of her indifference curves, you will see that they are asymptotic to the line x1 = 1, but intersect the horizontal axis. Note that the marginal utility of good 1 is x2 + 1, and the marginal utility of good 2 is x1 − 1. The marginal rate of substitution is then (x2 + 1)/(x1 − 1). Setting this equal to the price ratio gives p1(x1 − 1) = p2(x2 + 1). Combining this with the standard budget constraint, p1x1 + p2x2 = M, leads to the consumer’s demand functions for goods 1 and 2. The two demand functions are:

\[ x_1(\cdot) = \frac{M + p_1 + p_2}{2p_1} \qquad \text{and} \qquad x_2(\cdot) = \frac{M - p_1 - p_2}{2p_2}. \]
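The two demand functions can be checked numerically. The following is a minimal sketch with an illustrative function name and arbitrarily chosen values (M = 10 and prices of 1 or 2, none of which come from the text). It confirms that the bundle exhausts the budget and satisfies the tangency condition, and it shows how each demand responds to a rise in the price of the other good.

```python
def demand(p1, p2, M):
    """Demands for u = (x1 - 1)*(x2 + 1).

    Assumes M >= p1 + p2, so that the demand for good 2 is not negative
    (this restriction is an inference, not stated in the text).
    """
    x1 = (M + p1 + p2) / (2 * p1)
    x2 = (M - p1 - p2) / (2 * p2)
    return x1, x2


M = 10  # hypothetical income
x1, x2 = demand(1, 1, M)

# The bundle exhausts the budget and satisfies p1*(x1 - 1) = p2*(x2 + 1).
budget_ok = abs(1 * x1 + 1 * x2 - M) < 1e-12
tangency_ok = abs(1 * (x1 - 1) - 1 * (x2 + 1)) < 1e-12

# Raise p2 from 1 to 2: the demand for good 1 (food) goes up.
x1_after_p2_rise, _ = demand(1, 2, M)

# Raise p1 from 1 to 2: the demand for good 2 (other stuff) goes down.
_, x2_after_p1_rise = demand(2, 1, M)

print(x1, x1_after_p2_rise)
print(x2, x2_after_p1_rise)
```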
From the first demand function for food, as the price p2 of other stuff rises, the consumer demands more food (suggesting “substitutes”). From the second demand function for other stuff, as the price p1 of food rises, she demands less other stuff (suggesting “complements”). The point of this example is that the terms “substitutes” and “complements” are useful for the intuitive understanding they provide. However, they are not mathematically precise terms.

4.5 Substitution and Income Effects

In section 2 above we discussed how a change in income affects the demand for good 1. For a normal good, an increase in income will cause an increase in the amount demanded; for an inferior good, it will cause a decrease in the amount demanded. In section 3 we discussed how a change in the price of good 1 affects the demand for good 1. Generally a decrease in the price will cause an increase in the amount demanded. However, if it is one of those rare Giffen goods, a decrease in the price will cause a decrease in the amount demanded.

Let us now think more carefully about what happens to the consumer’s demand for good 1 when p1 falls, while p2 and M stay the same. We realize that there are really two parts to the change in the consumer’s demand resulting from a decrease in p1. First, the relative price of good 1 (that is, the price of good 1 compared to the price of good 2, or p1/p2) has gone down. By itself, this change should cause the consumer to switch toward good 1 and away from good 2. Second, since one price has fallen while the other price and income have remained constant, the consumer is in effect richer. By itself, this change should cause the consumer to want to consume more of good 1 if it is a normal good, but less of good 1 if it is an inferior good. The first effect (switch toward good 1 because it has become relatively cheaper) is called the substitution effect.
The second effect (because you are richer, switch toward good 1 if it’s normal, and away from it if it’s inferior) is called the income effect.

In short, we are saying that if a consumer responds to a drop in p1, that response can be broken down or decomposed into two parts: the substitution effect (the effect of the change in relative prices), and the income effect (the effect of the consumer’s having become in a sense richer). There are (at least) two ways to approach this decomposition. The first is based on the analysis of Evgeny Slutsky (1880-1948), a Russian economist who wrote a seminal paper on consumer theory in 1915. The second is based on the analysis of the great 20th century English economist Sir John Hicks (1904-1989) whose influential book Value and Capital was published in 1939. We illustrate Slutsky’s breakdown into income and substitution effects in the next figure.

INSERT FIGURE 4.9 HERE

In Figure 4.9, when the price of good 1 falls from p1 to \(p_1'\), the consumer moves from the original point x on the original budget line, to the new point y on the new budget line. In Slutsky’s decomposition there is an intermediate (hypothetical) budget line, that has the slope of the new budget line, but passes through the original point x. If faced with that hypothetical budget line, the consumer would consume the (hypothetical) bundle z. Slutsky identifies the substitution effect as the move from x to z, and the income effect as the move from z to y. Note that the z to y move is a move between budget lines with equal slopes, and therefore involves no changes in relative prices.

The problem with Slutsky’s decomposition, according to Hicks, is this: the consumer doesn’t like x as much as the intervening hypothetical point z.
The move from the point x on the original budget line to the point y on the new budget line in effect makes the consumer richer, but this is also true of the move from the point x on the original budget line to the point z on the hypothetical budget line. As Hicks saw it, the decomposition needs a hypothetical budget line, and a different hypothetical point z, such that the consumer is just indifferent between x and z. The hypothetical budget line should have the same slope as the new budget line, as in Slutsky’s analysis. That is, it should be based on the new price ratio, rather than the old. But it should be tangent to the original indifference curve. All of this gives rise to the next figure, which shows Hicks’ breakdown into income and substitution effects.

In Figure 4.10 below, the consumer starts at x and ends up at y. The move from x to y is decomposed into a move from x to z and a move from z to y. The x to z move is along the same indifference curve; it reflects a change in relative prices but leaves the consumer exactly as well off as she was before. The z to y move reflects a change in income only; it leaves relative prices exactly as they were, but moves the consumer to a higher budget line.

INSERT FIGURE 4.10 HERE

Note that either figure can be used to decide whether good 1 is acting as an ordinary good or as a Giffen good. For this purpose, we only need to see whether y is to the right (ordinary good) or to the left (Giffen good) of x. To determine whether good 1 is normal or inferior, look at z compared to y; if y is to the right of z, it’s normal; if to the left, it’s inferior.

Both Slutsky and Hicks decomposed the total change in the amount of good 1 demanded into two parts: the substitution effect, showing the effect of the change in relative prices, and the income effect, showing the effect of the consumer’s having become richer (or poorer, if p1 were to rise instead of fall).
Slutsky’s method is still used by economists who are constructing price indexes, for example, because it is based on observables, the consumption bundles. However, for economic theorists who are interested in improvements (or reductions) in the welfare levels of consumers, the Hicks approach is better, since the points x and z in the Hicks decomposition figure are on the same indifference curve, and therefore the move from z to y more accurately reflects the consumer’s gain resulting from the price change. We will generally use the Hicks approach in this book.

As a final exercise in this section, we will now draw a substitution effect / income effect figure, Figure 4.11, under the assumption that p1 rises instead of falls. This means that the price change makes our consumer worse off, as if her income were to fall. To complicate things further, we will show that there are really two ways to do a decomposition in the style of Hicks. In our discussion around Figures 4.9 and 4.10, we constructed a hypothetical budget line which was parallel to the new budget line and tangent to the old indifference curve; that is, it incorporated the new price ratio \(p_1'/p_2\). With equally appealing logic we could have constructed the hypothetical budget line parallel to the old budget line and tangent to the new indifference curve; that is, incorporating the old price ratio p1/p2.

In Figure 4.11, the budget line shifts in, because p1 rises. The Hicks substitution effect, based on the old indifference curve and the new price ratio, is the move from x to z. The slightly different substitution effect, developed by Nicholas Kaldor (1908-1986), is based on the new indifference curve, and a hypothetical budget line whose slope is given by the old price ratio. The Kaldor substitution effect is the shift from w to y. Which one is right? Why, they both are, even though they give slightly different measurements!
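The three decompositions can be compared numerically. The following is a minimal sketch under stated assumptions: it uses the utility function u(x1, x2) = x1x2, chosen only because its demand functions have the simple closed forms x1 = M/(2p1) and x2 = M/(2p2); the prices and income are hypothetical, and the function names are illustrative. For this utility function the maximum utility attainable is M²/(4p1p2), so the income needed to reach a utility level u is 2√(u·p1·p2).

```python
import math


def demand(p1, p2, M):
    """Ordinary demands for the assumed utility u = x1 * x2."""
    return M / (2 * p1), M / (2 * p2)


def utility(bundle):
    x1, x2 = bundle
    return x1 * x2


def income_needed(u, p1, p2):
    """Smallest income that reaches utility u at prices (p1, p2)."""
    return 2 * math.sqrt(u * p1 * p2)


# Hypothetical numbers: p1 rises from 1 to 4, with p2 = 1 and M = 8.
p1_old, p1_new, p2, M = 1, 4, 1, 8

x = demand(p1_old, p2, M)  # original bundle
y = demand(p1_new, p2, M)  # final bundle

# Slutsky: new prices, with income that just buys the original bundle x.
z_slutsky = demand(p1_new, p2, p1_new * x[0] + p2 * x[1])

# Hicks: new prices, with income that just reaches the old utility level.
z_hicks = demand(p1_new, p2, income_needed(utility(x), p1_new, p2))

# Kaldor: old prices, with income that just reaches the new utility level.
w_kaldor = demand(p1_old, p2, income_needed(utility(y), p1_old, p2))

# Changes in x1 attributed to (substitution, income) under each method.
slutsky = (z_slutsky[0] - x[0], y[0] - z_slutsky[0])
hicks = (z_hicks[0] - x[0], y[0] - z_hicks[0])
kaldor = (y[0] - w_kaldor[0], w_kaldor[0] - x[0])

for name, (sub, inc) in [("Slutsky", slutsky), ("Hicks", hicks), ("Kaldor", kaldor)]:
    print(f"{name}: substitution {sub:+.2f}, income {inc:+.2f}, total {sub + inc:+.2f}")
```

With these numbers the total change in x1 is the same under all three methods, but the split between substitution and income effects differs, which is the point made in the text.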
INSERT FIGURE 4.11 HERE

It’s easy to see in any of the substitution effect / income effect figures that the substitution effect is always negative (or at least less than or equal to zero). That is, as price goes down, quantity demanded via the substitution effect goes up, and as price goes up, quantity demanded via the substitution effect goes down. The income effect, however, can have either sign. If the good is normal, the income effect is negative. That is, as price goes down, quantity demanded via the income effect goes up. On the other hand, if the good is inferior, the income effect is positive. That is, as price goes down, quantity demanded via the income effect goes down.

In short, for a normal good, the income and substitution effects work in the same direction. As price goes down, both say: consume more. But for an inferior good, the substitution effect and the income effect work in opposite directions. As price goes down, the substitution effect says: consume more. The income effect says: consume less. The net effect is then ambiguous, and if the income effect (consume less) outweighs the substitution effect (consume more), we have a Giffen good. An exercise at the end of this chapter invites the reader to construct a Hicks-style substitution effect / income effect picture of a Giffen good.

4.6 The Compensated Demand Curve

In section 3 we discussed demand curves. Recall that, in the abstract, demand for good 1 is a function x1(p1, p2, M) of three variables. We fix two of them, p2 and M, and we focus on how the desired quantity of good 1 depends on its price: x1(p1). With standard demand curve analysis, we can graph a demand curve and use it to see how the desired amount of good 1 changes in response to a change in its price. We now realize that the change in the desired quantity of good 1 can be viewed as the sum of a substitution effect change, and an income effect change.
An alternative tool to measure demand, the compensated demand curve, is a modified demand curve which shows how the desired amount of good 1 changes because of the substitution effect alone. The construction of a compensated demand curve assumes that the consumer maintains the same utility level (or stays on the same indifference curve). That is, we fix the utility level and the price p2, and we vary the price p1. In order to vary p1 while utility is held constant, we must simultaneously vary M. (To construct a standard demand curve, p1 is varied while p2 and M are fixed.)

The compensated demand curve construction methodology is illustrated in Figure 4.12 below. In the figure we show three alternative budget lines; since the absolute value of the slope of a budget line is p1/p2, the flattest budget line corresponds to the lowest p1, and the steepest corresponds to the highest p1. To construct the compensated demand curve itself (which we have not done in Figure 4.12), we would then plot the pairs (x1, p1), for the x1 in Figure 4.12 and the lowest p1; the \(x_1'\) in the figure and the middle p1; the \(x_1''\) in the figure and the highest p1; and so on.

It is clear from Figure 4.12 that as p1 rises the desired quantity of good 1 must fall. To put this another way, the moves from one desired consumption bundle to another in the figure are substitution effect moves (that is, they are moves resulting from changes in relative prices, as utility is held constant). And the compensated demand curve must be downward sloping because the substitution effect is negative.

INSERT FIGURE 4.12 HERE

Remember that a standard demand curve may violate the law of demand, if a good is a Giffen good, with a positive income effect outweighing the negative substitution effect. A compensated demand curve, in contrast, always obeys the law of demand (it’s always downward-sloping), because it only incorporates the substitution effect.
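The construction can be traced numerically. The following is a minimal sketch under stated assumptions: the utility function u(x1, x2) = x1x2, the values p2 = 1 and M = 8, the three prices, and the function names are all chosen for illustration and are not from the text. For this utility function, tangency (x2/x1 = p1/p2) together with x1x2 = u gives the compensated demand x1 = √(u·p2/p1), and the income needed to buy the compensated bundle is 2√(u·p1·p2).

```python
import math


def ordinary_demand(p1, M):
    """Ordinary demand for good 1 under the assumed utility u = x1 * x2."""
    return M / (2 * p1)


def compensated_demand(p1, p2, u):
    """Demand for good 1 when income is adjusted to keep utility at u."""
    return math.sqrt(u * p2 / p1)


def compensating_income(p1, p2, u):
    """Income that exactly buys the compensated bundle."""
    return 2 * math.sqrt(u * p1 * p2)


p2, M = 1, 8  # hypothetical values

# Utility at the starting price p1 = 1, held fixed along the compensated curve.
u0 = ordinary_demand(1, M) * (M / (2 * p2))

rows = []
for p1 in (1, 2, 4):
    rows.append((p1,
                 ordinary_demand(p1, M),
                 compensated_demand(p1, p2, u0),
                 compensating_income(p1, p2, u0)))

for p1, x_ord, x_comp, m_comp in rows:
    print(f"p1={p1}: ordinary {x_ord:.3f}, compensated {x_comp:.3f}, "
          f"income needed {m_comp:.3f}")
```

In this example good 1 is normal, so as p1 rises the compensated quantity falls by less than the ordinary quantity: the compensated curve leaves out the income effect, and the required income rises with p1 to keep utility constant.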
4.7 Elasticity

In the sections above we have talked about how the amount of good 1 demanded changes, as p1, M, or p2 changes. In this section we introduce a standard economists’ approach to measuring changes in one variable in response to changes in another variable.

Natural scientists usually measure how changes in one thing bring about changes in another thing by simply looking at ratios of those changes. For instance, if v is velocity, t is time, F is force, and m is mass,

\[ \frac{dv}{dt} = \frac{F}{m}. \]

You probably remember this familiar equation from high school physics as F = ma, Isaac Newton’s second law of motion. However, economists have traditionally been reluctant to look at ratios like \(dx_1/dp_1\) because these ratios are sensitive to units: an equation would change if the measurement of x1 were changed from ounces to pounds, grams or kilograms, or if the measurement of p1 were changed from U.S. dollars to Australian dollars, euros, or yen. However, if the ratios are of the form (percentage change in x1)/(percentage change in p1), they become pure numbers, free of units. The ratios can be used no matter what the commodity units or the currency units may be.

For this reason, economists usually look at what we call elasticities. The elasticity of A with respect to B is the percentage change in A divided by the percentage change in B. Or, to put it more intuitively, it is the percentage change in A, for a one percent change in B.

We will now consider the elasticity of demand for good 1 with respect to the price of good 1, or the price elasticity of demand, for short. Intuitively, the price elasticity of demand is the percentage change in the amount demanded, as the price changes by 1 percent. Since demand goes down as price goes up, if the price increases by 1 percent, the demand will decrease by some percentage. That is, the changes are in opposite directions.
In order to avoid having to worry too much about pluses and minuses, positive and negative changes, we will focus on magnitudes, or absolute values. To be precise, then, the price elasticity of demand is the absolute value of the ratio of the percentage change in the amount demanded to the percentage change in price.

We say that demand is elastic if the elasticity is greater than 1. We say that demand is inelastic if the elasticity is less than 1.

We start at a point (p1, x1) on the demand curve. Let dp1 represent a change in the price p1, and let dx1 represent the resulting change in quantity demanded. Note that if dp1 > 0, then dx1 < 0, and conversely, if dp1 < 0, then dx1 > 0. The percentage change in price is $(dp_1/p_1) \times 100$. The percentage change in quantity is $(dx_1/x_1) \times 100$. One percentage change will be positive (an increase), and the other will be negative (a decrease). We use the symbol $\epsilon_{x_1,p_1}$ for the price elasticity of demand. We now have:

$$\epsilon_{x_1,p_1} = \left| \frac{(dx_1/x_1) \times 100}{(dp_1/p_1) \times 100} \right| = -\frac{dx_1}{dp_1} \frac{p_1}{x_1}.$$

Clearly the price elasticity of demand is related to the (absolute value of the inverse of the) slope of the demand curve, (−dx1/dp1), but it is not equal to it. This is because elasticity is a ratio of percentage changes, rather than a ratio of changes. We illustrate with the following example.

Example. Consider a very simple linear demand curve: x1(p1) = 100 − p1. Note that dx1/dp1 is constant for any linear demand curve, and equals −1 for this one. It follows that $\epsilon_{x_1,p_1} = p_1/x_1$, which equals 1 at the midpoint (50, 50), is less than 1 at prices below 50, and is greater than 1 at prices above 50.

INSERT FIGURE 4.13 HERE

Relationship between price elasticity of demand and the consumer’s expenditure on the good. The amount the consumer spends on good 1 is E1(p1) = p1x1(p1), where x1(p1) is the quantity demanded as a function of price. Let’s see how the consumer’s expenditure on good 1 changes as p1 increases. We take the derivative of E1 with respect to p1:

$$\frac{dE_1}{dp_1} = x_1 + p_1 \frac{dx_1}{dp_1} = x_1 (1 - \epsilon_{x_1,p_1}).$$
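The relationship between expenditure and the price elasticity can be checked numerically for the linear demand curve x1(p1) = 100 − p1 from the example. The sketch below is illustrative only: the function names and the finite-difference step h are assumptions, and the derivatives are approximated rather than computed exactly.

```python
def demand(p1):
    """Linear demand curve from the example: x1(p1) = 100 - p1."""
    return 100.0 - p1

def price_elasticity(p1, h=1e-6):
    """Price elasticity of demand, -(dx1/dp1) * (p1/x1), via a central difference."""
    slope = (demand(p1 + h) - demand(p1 - h)) / (2.0 * h)
    return -slope * p1 / demand(p1)

def expenditure(p1):
    """Spending on good 1: E1(p1) = p1 * x1(p1)."""
    return p1 * demand(p1)

def expenditure_slope(p1, h=1e-6):
    """Numerical approximation of dE1/dp1."""
    return (expenditure(p1 + h) - expenditure(p1 - h)) / (2.0 * h)

for p1 in (25.0, 50.0, 75.0):
    eps = price_elasticity(p1)
    lhs = expenditure_slope(p1)          # dE1/dp1 computed directly
    rhs = demand(p1) * (1.0 - eps)       # x1 * (1 - elasticity)
    print(f"p1={p1}: elasticity={eps:.3f}, dE1/dp1={lhs:.3f}, x1(1-eps)={rhs:.3f}")
```

At the three prices the elasticity is about 1/3, 1, and 3, and the two expressions for the slope of expenditure agree up to numerical error.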
This means that after a price increase, the consumer’s spending on a good increases if the price elasticity $\epsilon_{x_1,p_1} < 1$, stays constant if $\epsilon_{x_1,p_1} = 1$, and decreases if $\epsilon_{x_1,p_1} > 1$. When the consumer’s demand for a good is inelastic (for instance, a low-wage working person’s demand for a very necessary commodity like bread or fuel), then as the price goes up, the consumer will spend more on that good. But if the consumer’s demand is elastic (for instance, a person’s demand for a good which can easily be done without, like exotic vacation trips), then as the price goes up, the consumer will spend less on that good.

4.8 The Market Demand Curve

Everything we have done so far involves only one consumer. The market demand for good 1 is the sum of the demands for good 1 of all the consumers. Suppose for example there are 100 people, and they all have the simple linear demand curve described above, x1(p1) = 100 − p1. How would we find the market demand curve? It’s very simple; we only need to remember that demand is a function of price; so price is the independent variable and demand is the dependent variable. In other words, given p1, we need to sum the x1’s, and not vice versa. The market demand equation is therefore

$$D_1(p_1) = \sum_{i=1}^{100} x_1^i(p_1) = 100(100 - p_1) = 10{,}000 - 100 p_1.$$

Since we are putting the dependent variable on the horizontal axis (following Marshall) instead of the vertical axis (the standard mathematical convention), if we are doing the summation graphically, we must be careful to sum the curves horizontally instead of vertically.

We must also be careful to remember that a consumer cannot consume a negative quantity of good 1, and so, for example, if the demand function is x1 = 50 − p1 and if p1 = 75, the consumer will not want to consume −25 units of good 1; she will want zero units. The following example illustrates this hazard. Assume there are only two consumers in the market.
The first one has demand for good 1 given by $x_1^1(p_1) = 100 - p_1$; the second one has demand for good 1 given by $x_1^2(p_1) = 50 - p_1$. Adding together the x1’s is the right thing to do; adding the p1’s would be wrong. (Graphically, add horizontally, not vertically.) This gives $D_1(p_1) = x_1^1 + x_1^2 = 150 - 2p_1$. This equation is fine if p1 ≤ 50. For instance, if p1 = 25, the market demand equation gives 150 − 50 = 100, which equals consumer 1’s demand (75) plus consumer 2’s demand (25). However, if you plug p1 = 75 into the market demand equation, you get zero, which is wrong. The difficulty is that when p1 = 75, the second consumer’s demand is zero, not −25, so market demand is just the first consumer’s 25 units.

We can avoid all this confusion if we proceed as follows. When there are two (or more) consumers, some of whose demand curves hit the vertical axis, first, carefully sketch the individual demand curves and include, as part of those demand curve graphs, the zero demand part. Second, add the curves horizontally. The figure below shows how the graph is drawn for the two person example discussed above.

INSERT FIGURE 4.14 HERE

4.9 A Solved Problem

The Problem

The Cobb-Douglas utility function $u(x, y) = x^a y^{1-a}$, where 0 ≤ a ≤ 1, represents a consumer’s preferences for tickets to baseball games (x) and football games (y). We’ll call our consumer Mr. CD (after mathematician Charles Cobb and Paul Douglas, an economist and Illinois Senator, who made this kind of function famous). Suppose that CD’s income is M, and that the ticket prices are px and py.

(a) Derive the demand functions for baseball and football tickets. Indicate whether these goods are normal or inferior, ordinary or Giffen, and whether x and y are complements or substitutes.

(b) Can you describe in words the preferences corresponding to a = 0 and a = 1? If a = 1/2?

The Solution

(a) First we’ll use the utility maximization condition MRS = MUx/MUy = px/py. This gives

$$MRS = \frac{MU_x}{MU_y} = \frac{a x^{a-1} y^{1-a}}{(1-a) x^a y^{-a}} = \frac{a y}{(1-a) x} = \frac{p_x}{p_y}.$$
From this we get

$$p_x x = \frac{a}{1-a} p_y y.$$

We then substitute this expression for $p_x x$ into the equation for the budget line, $p_x x + p_y y = M$. This gives

$$\frac{a}{1-a} p_y y + p_y y = \frac{a}{1-a} p_y y + \frac{1-a}{1-a} p_y y = \frac{1}{1-a} p_y y = M.$$

It follows that Mr. CD’s demand functions for football tickets and baseball tickets are, respectively,

$$y(\cdot) = \frac{(1-a) M}{p_y} \quad \text{and} \quad x(\cdot) = \frac{a M}{p_x}.$$

This implies that the goods are normal; for each good, as income M goes up, the quantity demanded goes up. The goods are also ordinary; for each good, as its own price goes up, the quantity demanded goes down. Finally, the goods are neither complements nor substitutes; a rise (or fall) in the price of one has no effect on the demand for the other.

(b) If a = 0, Mr. CD gets no utility from x; all he cares about is y, and whatever money he has, he spends on y. If a = 1, he only cares about x, and whatever money he has, he spends on x. If a = 1/2 he will always spend half of his money on x and half on y, no matter what the prices might be.

Exercises

1. Consider the utility function u(x1, x2) = x1x2.
(a) Show that the demand function for good 1 is x1(p1, p2, M) = M/(2p1).
(b) Is good 1 normal or inferior? Ordinary or Giffen? Are goods 1 and 2 substitutes or complements?

2. Consider the utility function u(x1, x2) = x1x2. Suppose the initial situation is given by p1 = 1, p2 = 1 and M = 10.
(a) If the price of good 1 rises to 2.50, show that the total effect on the consumer’s demand for good 1 equals −3.
(b) Using the Hicks method, show that the total effect can be decomposed into a substitution effect of −(5 − √10), and an income effect of −(√10 − 2).

3. Suppose the price of a Giffen good falls. Draw a Hicks-style graph showing the income and substitution effects.

4. Professor WL always drinks exactly one cup of coffee (good x) with exactly one spoonful of sugar (good y). He never varies the 1 to 1 proportions. His preferences for the two goods can be represented by the utility function u(x, y) = min(x, y).
(WL stands for Wassily Leontief (1905-1999), an economist who first analyzed functions of this type.) Notice that this utility function is not differentiable when x = y. (That is, the partial derivatives of u do not exist if x = y.) Suppose that WL’s income is 2, the price of a cup of coffee is 0.20 and the price of a spoonful of sugar is 0.05.
(a) Find WL’s optimal consumption point.
(b) Suppose that, due to new import taxes, the prices go up to 0.25 and 0.08, for coffee and sugar, respectively. (The taxes are 0.05 per cup of coffee and 0.03 per spoonful of sugar.) Find his new optimal consumption point. How much will he be paying in taxes?
(c) Derive his demand functions for cups of coffee and spoonsful of sugar. Are the goods normal or inferior, ordinary or Giffen, substitutes or complements?

5. Sammy and Jimmy are twin brothers. Each gets a weekly allowance of $2. Sammy’s preferences for baseball cards (good x) and “famous economists” cards (good y) can be represented by the utility function u(x, y) = xy. Suppose that both goods are $1 per unit.
(a) Solve for Sammy’s optimal consumption bundle.
(b) Suppose px rises to $2. What is Sammy’s new optimal consumption point?
(c) How much would his parents have to increase his allowance in order to leave him exactly as well off as he was originally?
(d) Jimmy’s preferences are represented by v(x, y) = ln(x) + ln(y). Answer parts (a), (b), and (c) for Jimmy. Comment.

6. “Ernie and Bert, Inc.” sells cookies (good x) and bananas (good y). They are offering the following deal to their customers. The price of bananas is fixed at $1 each. The first three cookies that a consumer buys are free; after the third cookie, the price of cookies is also $1 each. Cookie Monster’s utility function is u(x, y) = x(y + 3) and his income is $5.
(a) Draw Cookie Monster’s budget line.
(b) Find his optimal consumption point.
5 Supply Functions for Labor and Savings

5.1 Introduction to the Supply of Labor

In addition to creating the demand for goods and services, individuals also supply key factors of production to firms. In particular, they supply their labor, and, by saving, they supply capital, i.e., money, to producers. In the first part of this chapter we will study the decisions involving consumption and leisure, which are behind the supply of labor. We will model the standard budget constraint for the consumption/leisure choice, which involves the wage rate, consumption, and the time spent working versus the time spent on leisure. We will also model some special budget constraints, for example, involving non-labor income. We will analyze the effects of income taxation on the consumer’s labor/leisure choice. This analysis has some very interesting implications about the relative desirability of flat and progressive income taxes.

We will then turn to the consumer’s decisions regarding the supply of savings. We will discuss borrowing and saving, and revisit intertemporal budget constraints like the ones introduced in Chapter 3. Savings flow through the financial system and end up (hopefully) as part of the capital used by firms to produce more goods and services. We will model the consumer’s savings decision, and show how the amount the consumer saves depends on the interest rate and the inflation rate, as well as on the timing of the consumer’s income stream.

5.2 Choice between Consumption and Leisure

We now explain how a consumer decides how much labor to supply. In our simple model, he allocates his time between working and not working. If he works, he earns a wage; with his wage, he can consume. We assume he works to earn money with which to consume. This is of course a simplistic model of why people work.
Some scholars might say people work for the sake of working, that working is moral and not working is immoral, and that “idle hands do the devil’s work.” That is, people should and would work even without pay. (The great English philosopher and logician Bertrand Russell (1872-1970) takes the opposite view in In Praise of Idleness, where he writes “I hope that, after reading the following pages, the leaders of the YMCA will start a campaign to induce good young men to do nothing. If so, I shall not have lived in vain.”) For the most part, economists are on Russell’s side; work is not its own reward, and “hours of work” is a bad rather than a good. In fact, it is apparent that people usually need to be paid in order to work at least most of the hours they are working.

In our modeling of the decision about how much labor the worker supplies, there are two “goods.” One is consumption, and the other is leisure, that is, the time spent not working. The worker gets utility from consuming stuff, and from leisure. To consume more stuff he needs to spend more time working, which means less leisure.

We let c represent consumption. A unit of consumption is a certain quantity of goods and services. We let p be the price per unit of consumption. We let L represent leisure. (This is leisure per unit time. If the standard time interval is a day, then L is leisure hours per day, e.g., 16 hours.) The consumer’s utility depends on (L, c). In the graphs that follow, we will put leisure L on the horizontal axis and consumption c on the vertical axis. We let T represent the number of hours in a standard time interval, e.g., T = 24 per day. We let l represent hours of work per standard time interval, e.g., 8 hours per day. The individual looks for his most preferred (L, c) bundle subject to a budget constraint, and this leads to his decision about l, that is, how much time to spend working.

The budget constraint.
Recall the standard budget constraint for a consumer with income M, who is choosing quantities of goods 1 and 2, with prices p1 and p2:

p1x1 + p2x2 ≤ M.

To find the budget constraint in our consumption/leisure model, we need one more crucial ingredient. Our consumer earns money by working. We let w represent the wage rate, in dollars per hour. If the consumer’s income is exclusively from working, it is given by wl. A budget constraint says “what you spend is less than or equal to what you have.” This gives

pc ≤ wl.

But hours of labor plus hours of leisure must equal the total number of hours per standard time interval, or

l + L = T.

Substituting for l in the budget constraint and rearranging slightly then gives

wL + pc ≤ wT.

This constraint looks very much like the standard budget constraint shown above, and has a nice interpretation. It is as if the consumer’s total income is what he would get by working all the time available (for example, 24 hours per day). With this total income, the consumer buys stuff c, at a price of p per unit, and also buys leisure, at a price of w per hour. Economists say that the wage rate is the opportunity cost of leisure. For each hour of leisure you choose to consume, you forego w dollars in income, which you would have gotten if you had worked that hour.

INSERT FIGURE 5.1 HERE

Figure 5.1. Draw the budget constraint with L on the horizontal axis and c on the vertical axis. Label the intercepts: the horizontal at T, the vertical at wT/p. The absolute value of the slope is w/p.

Caption of Fig. 5.1: The budget constraint.

By construction, the horizontal intercept of the budget line is at T. A consumer at this point is consuming all leisure and no goods; this is the choice of maximum idleness, the super slacker choice. The vertical intercept of the budget line is at wT/p; this is the choice of maximum work, the super go-getter choice.
The absolute value of the slope of the budget line, or w/p, is the relative price of leisure in terms of consumption, or the real wage. The real wage is the amount of extra stuff you can consume if you work an extra hour.

Preferences. What about preferences? We continue to make the basic assumptions on preferences that were made in Chapter 2. In particular, both consumption and leisure are considered desirable goods. We assume the usual well behaved preferences, with downward sloping and convex indifference curves. Of course, there may be extreme kinds of preferences, such as those of the super go-getter (with horizontal or almost horizontal indifference curves), or those of the super slacker (whose indifference curves are vertical or almost vertical).

Figure 5.2 below shows the utility-maximizing choice of a typical individual, who spends part of his day working and consumes a positive amount of stuff. The consumer’s optimal choice is at the point (L∗, c∗), which means he wants to work l∗ hours, to have L∗ hours of leisure, and to consume c∗.

INSERT FIGURE 5.2 HERE

Caption of Fig. 5.2: The consumer’s optimal choice of leisure and consumption is at (L∗, c∗).

5.3 Substitution and Income Effects in Labor Supply

In Chapter 4 we discussed demand functions. We can do something similar in our consumption/leisure model. We can view w and p as variables, and we can find the corresponding demand functions for utility-maximizing consumption c∗ = c(w, p) and leisure L∗ = L(w, p). Once we have the leisure demand function, the labor supply function follows easily, because l∗(w, p) = T − L∗(w, p). That is, given the wage rate and price of consumption, we can find the amount of time the consumer wants to spend working. If p is fixed and only w is allowed to vary, then labor supply l∗(w) is simply a function of the wage rate w.
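The step from the leisure demand function to the labor supply function can be sketched in code. The example below assumes a particular utility function that is not taken from the text, the Cobb-Douglas form u(L, c) = L^b c^(1−b) with a hypothetical weight b = 2/3; the function names are also illustrative. With that form, the consumer spends the share b of his full income wT on leisure and the share 1 − b on consumption.

```python
def optimal_choice(w, p, T=24.0, b=2.0 / 3.0):
    """Utility-maximizing (L*, c*) subject to w*L + p*c = w*T.

    Assumes u(L, c) = L**b * c**(1 - b); b is a hypothetical preference weight.
    """
    full_income = w * T                    # income from working every hour
    L_star = b * full_income / w           # share b of full income buys leisure
    c_star = (1.0 - b) * full_income / p   # the rest buys consumption
    return L_star, c_star

def labor_supply(w, p, T=24.0, b=2.0 / 3.0):
    """Labor supply l*(w, p) = T - L*(w, p)."""
    L_star, _ = optimal_choice(w, p, T, b)
    return T - L_star

for w in (10.0, 20.0, 40.0):
    L_star, c_star = optimal_choice(w, p=1.0)
    print(f"w={w}: L*={L_star:.2f}, c*={c_star:.2f}, l*={labor_supply(w, 1.0):.2f}")
```

With this assumed utility function the chosen hours of work do not change with w. That is a property of the functional form, not a general result; other preferences can make hours rise or fall with the wage.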
This labor supply function can be graphed in the conventional (for economists) way, with w on the vertical axis and the consumer’s labor supply l∗(w) on the horizontal axis. Such a graph is called a labor supply curve. Should a labor supply curve have a positive slope (higher wage implies more labor) or a negative slope (higher wage means less labor)? We will answer this question below.

In Chapter 4 we also discussed income and substitution effects, normal, inferior, ordinary, and Giffen goods. As the price p1 of good 1 rises, by the substitution effect, the consumer will want less of it, and by the income effect, he will want less of it, provided it is a normal good. The net effect is that as p1 goes up, x∗1 goes down. The demand curve is downward sloping. If x1 is an inferior good, then the net effect of a rise in p1 is ambiguous.

We now turn to a similar analysis. How does the wage rate w affect the consumer’s desired consumption of leisure L∗, and his decision about how much time to spend working l∗ = T − L∗? You might think this discussion should be an easy application of what we already did in Chapter 4: as w, the price of leisure, goes up, the consumer wants less leisure and wants to supply more labor. However, the analysis we now do is different and slightly complicated, for a mathematically simple reason. Compare the Chapter 4 budget constraint:

p1x1 + p2x2 ≤ M

with the budget constraint we are now analyzing:

wL + pc ≤ wT.

The reason that what we do now is not an obvious and simple extension of the Chapter 4 analysis is that as w changes, both the left-hand side and the right-hand side of the budget constraint change.

With that warning, we can proceed. Let’s assume we have an initial situation where the consumer is maximizing utility subject to a budget constraint, as in Figure 5.2. Now suppose the wage rate goes up, from w to w′. How will this affect his demand for leisure and his supply of labor?
Figure 5.3 below shows the substitution effect. When the wage rate rises, the consumer will substitute consumption for leisure; he will want to consume more, and spend less time at leisure and more time at work.

INSERT FIGURE 5.3 HERE

Caption of Fig. 5.3: The substitution effect on the demand for leisure, or the supply of labor. The consumer wants less leisure (the substitution effect is L′ − L∗ < 0) and he wants to work more hours (the substitution effect is L∗ − L′ > 0).

However, if leisure is a normal good, as the wage rate rises, the consumer will want to consume more leisure. That is, as w rises and the substitution effect says “consume less leisure,” the income effect says “consume more leisure.” This is because the right hand side of the budget constraint, wT, also rises. The net effect of an increase in w will therefore be ambiguous, and by the same token, the net effect on the consumer’s amount of work supplied l∗ will be ambiguous.

If the substitution effect is larger than the income effect, then a rise in w will lead to the choice of less leisure and more time at work. If this happens, the consumer’s labor supply curve is upward sloping (higher w leads to higher l∗). Figure 5.4 below shows this case. We will leave it to the reader to draw a graph of the case where the substitution effect is smaller than the income effect, and an increase in w results in a decrease in l∗, so that the labor supply curve is downward sloping.

INSERT FIGURE 5.4 HERE

Caption of Fig. 5.4: The income effect is outweighed by the substitution effect. By the substitution effect the consumer wants less leisure (L′ < L∗) and by the income effect the consumer wants more leisure (L′ < L∗∗). But the substitution effect is bigger, and so the consumer ends up with less leisure (L∗∗ < L∗), i.e., with more hours of work.

Some empirical studies find that for low levels of the wage rate, the substitution effect dominates.
But as the wage rate (and hence income) increases, the income effect becomes more and more important, and finally overcomes the substitution effect. In this case the consumer wants to consume less leisure as w rises when w is low, but eventually, when w is high enough, he wants to consume more leisure as w rises. This results in what economists call a backwards bending labor supply curve.

5.4 Other Types of Budget Constraints

This section is devoted to describing the shapes of more elaborate and more realistic kinds of budget constraints in the consumption/leisure model.

A. Non-labor income. Suppose the consumer receives non-labor income of M dollars per time interval. This might be an allowance from his parents, an inheritance, a welfare check, or income from securities or bank accounts. Then, the budget constraint becomes pc ≤ w(T − L) + M if T − L > 0, and pc ≤ M if T − L = 0. Rearranging terms in the first inequality produces, when L < T,

wL + pc ≤ wT + M.

Figure 5.5 shows this budget constraint.

INSERT FIGURE 5.5 HERE

Caption of Fig. 5.5: Budget constraint with non-labor income.

We can perform a comparative statics exercise to see what happens when M changes. For instance, suppose the consumer originally has zero non-labor income, and then receives an inheritance. What happens to his labor supply? Figure 5.6 shows the answer; if leisure is a normal good, when M > 0 is given to the consumer, he works less.

INSERT FIGURE 5.6 HERE

Caption of Fig. 5.6: If leisure is normal, the consumer will work less (that is, take more leisure) after receiving an inheritance of M.

B. Unemployment benefits. Governments often provide unemployment benefits to people who are not working. We start by assuming a fixed unemployment benefit U, provided to the consumer if and only if he is not working, that is, at L = T. We will assume no non-labor income M. The budget constraint is pc ≤ w(T − L) if L < T, and pc ≤ U if L = T.
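The two budget constraints in this section can be written as small functions that return the most consumption the consumer can afford for a given amount of leisure. This is only a sketch: the function names and the numbers in the usage lines (w = 10, p = 1, M = U = 30) are hypothetical.

```python
def max_consumption_nonlabor(L, w, p, M, T=24.0):
    """Largest c satisfying p*c <= w*(T - L) + M (non-labor income M)."""
    if not 0.0 <= L <= T:
        raise ValueError("leisure must lie between 0 and T")
    return (w * (T - L) + M) / p

def max_consumption_benefit(L, w, p, U, T=24.0):
    """Largest c when a benefit U is paid only if the consumer does not work."""
    if not 0.0 <= L <= T:
        raise ValueError("leisure must lie between 0 and T")
    if L < T:
        return w * (T - L) / p   # any work at all: wages only, no benefit
    return U / p                 # no work: the benefit only

# Hypothetical numbers: wage 10 per hour, price 1, M = U = 30.
for L in (16.0, 23.0, 24.0):
    print(L, max_consumption_nonlabor(L, 10.0, 1.0, 30.0),
          max_consumption_benefit(L, 10.0, 1.0, 30.0))
```

With non-labor income the budget line simply shifts up by M/p at every L, while with the benefit the budget set has an isolated point at L = T.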
Note that the benefit is entirely lost if he does any work at all. In Figure 5.7 below, with an unemployment benefit equal to U, if the consumer doesn’t work, he will be at point A. If he does work, he will be at point B, at the tangency point of his budget constraint and his indifference curve. Note that A is below the indifference curve that goes through B. Therefore he prefers B to A, and so he will work when the benefit is U. But now suppose the unemployment benefit is U′, which puts him at point C if he doesn’t work. Since point C is above the indifference curve that goes through B, the consumer prefers C to B. Therefore he will choose not to work when the benefit is U′. In short, the presence of a large-enough unemployment benefit may motivate the consumer to stop working.

INSERT FIGURE 5.7 HERE

Caption of Fig. 5.7: Budget constraints with unemployment benefits. The benefit U has no effect on behavior because the point A is below the indifference curve through B. But U′ is large enough to cause the consumer to choose not to work, since the point C is above that indifference curve.

C. Overtime. Firms (and governments) often pay overtime to employees who work more than a standard number of hours per week (e.g., more than 40 hours per week). Under these contracts, the wage rate increases after the standard number of hours. For example it might go from w to 1.5w. (This would be a 50 percent overtime premium, or “time-and-a-half” for overtime.)

Figure 5.8 below shows how an overtime premium affects the consumer’s budget line. In the figure, the consumer is paid w per hour for up to T − L′ hours of work, and w′ > w per hour for hours of work beyond T − L′. Note that the absolute value of the slope of the budget line is w/p in the part of the budget line without overtime, and w′/p in the overtime part.
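The kinked budget line with an overtime premium can be sketched as a piecewise function of leisure. The function name and the numbers used below (w = 10, w′ = 15, p = 1, L′ = 16 with T = 24) are hypothetical and chosen only to make the two slopes easy to see.

```python
def max_consumption_overtime(L, w, w_overtime, p, L_prime, T=24.0):
    """Most consumption affordable at leisure L with an overtime wage.

    Hours up to T - L_prime are paid w; hours beyond that are paid w_overtime.
    """
    hours = T - L
    standard_hours = T - L_prime
    if hours <= standard_hours:
        earnings = w * hours
    else:
        earnings = w * standard_hours + w_overtime * (hours - standard_hours)
    return earnings / p

# Hypothetical "time-and-a-half" example: 8 standard hours, then 1.5 * w.
w, w_ot, p, L_prime = 10.0, 15.0, 1.0, 16.0
slope_regular = (max_consumption_overtime(16.0, w, w_ot, p, L_prime)
                 - max_consumption_overtime(17.0, w, w_ot, p, L_prime))
slope_overtime = (max_consumption_overtime(14.0, w, w_ot, p, L_prime)
                  - max_consumption_overtime(15.0, w, w_ot, p, L_prime))
print(slope_regular, slope_overtime)   # w/p and w'/p per hour of leisure given up
```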
In the figure the overtime wage rate w′ is just high enough so that this worker is indifferent between working overtime (at the “Overtime point”) and not working overtime (and ending up at the point (L∗, c∗)).

INSERT FIGURE 5.8 HERE

Caption of Fig. 5.8: A budget constraint with overtime pay for work hours beyond T − L′. Given this overtime rate w′, this consumer is indifferent between working overtime (and getting to the “Overtime point”), and working no overtime (and getting to the point (L∗, c∗)).

Since the consumer with the indifference curve shown in Figure 5.8 is indifferent between the overtime point and (L∗, c∗), the w′ used in the figure must be the minimum wage rate that the consumer has to be offered to get him to work overtime. At higher overtime wage rates, the consumer will work overtime. We leave it to the reader to draw an example.

5.5 Taxing the Consumer’s Wages

The consumption/leisure model reveals some interesting problems with taxation. One of the principal sources of government revenue throughout the world is the taxation of workers’ earnings. In the U.S., federal income taxes are largely derived from the earnings of employees, and Social Security taxes are largely taxes on wages (and therefore called payroll taxes).

An important characteristic of taxation of earnings in the U.S. and in most other countries is progressivity. A taxpayer’s average tax is the fraction of his income that he must pay as tax. A tax on income is said to be progressive if, as the taxpayer’s income rises, his average tax rises.

In the U.S., for instance, the federal income tax involves brackets with increasing rates.
For a single individual in 2011, the first $8,500 in taxable income is taxed at a rate of 10 percent (i.e., each additional dollar adds $0.10 to one’s tax bill), the next $26,000 is taxed at a rate of 15 percent (i.e., each additional dollar adds $0.15 to one’s tax bill), and so on. There are six tax brackets, with the bracket rates rising from a low of 10 percent at the lowest bracket, to a high of 35 percent at the highest bracket (which starts when taxable income is $379,150). The increasing bracket rates produce a progressive tax, once the taxpayer goes beyond the first bracket. Social Security taxes in the U.S. are not progressive. For Social Security, all wages up to a certain ceiling ($106,800 in 2010) are subject to the same flat rate; beyond the ceiling, the tax stops. Once the taxpayer’s income has risen beyond the ceiling, the Social Security tax is regressive, which means that as income rises, his average tax falls. (This does not mean that the Social Security system as a whole—taxes plus benefits financed by those taxes—is regressive. In fact the system as a whole, taxes plus benefits, is quite progressive.) Medicare taxes in the U.S. provide a good example of a flat tax; all the worker’s earnings are taxed at the same rate for Medicare, from the first dollar to the billionth and beyond. (For a typical employed person, both the Social Security tax and the Medicare tax are split equally between the employee and the employer, with the total rate, of Social Security tax plus Medicare tax, for someone whose wages do not exceed the Social Security ceiling, equal to 15.3 percent in 2010.) Obviously whether taxes are progressive, flat, or regressive is vitally important to workers, to the government, to the economy, and to society. This is an important political issue, and always will be. We can illustrate some parts of the controversy with our model. 
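The definitions of progressive and regressive taxes can be illustrated by computing average tax rates. The sketch below uses only the first two 2011 brackets quoted above, so it deliberately refuses incomes beyond $34,500 rather than guessing the higher brackets. The capped payroll tax uses the $106,800 ceiling from the text, but its 10 percent rate is a hypothetical placeholder, and all function names are illustrative.

```python
# First two brackets quoted in the text: (width of bracket, marginal rate).
FIRST_TWO_BRACKETS = [(8500.0, 0.10), (26000.0, 0.15)]

def bracket_tax(income, brackets):
    """Tax owed under a bracket schedule; valid only within the listed brackets."""
    covered = sum(width for width, _ in brackets)
    if income < 0.0 or income > covered:
        raise ValueError("income is outside the range covered by the listed brackets")
    tax, remaining = 0.0, income
    for width, rate in brackets:
        taxed_here = min(remaining, width)
        tax += taxed_here * rate
        remaining -= taxed_here
    return tax

def capped_flat_tax(wages, rate, ceiling):
    """Flat tax on wages up to a ceiling; nothing is owed on wages above it."""
    return rate * min(wages, ceiling)

CEILING = 106800.0      # ceiling quoted in the text
PLACEHOLDER_RATE = 0.10  # ASSUMPTION: hypothetical rate, for illustration only

for income in (8500.0, 20000.0, 34500.0):
    avg = bracket_tax(income, FIRST_TWO_BRACKETS) / income
    print(f"bracket tax, income {income:,.0f}: average rate {avg:.4f}")

for wages in (50000.0, 106800.0, 213600.0):
    avg = capped_flat_tax(wages, PLACEHOLDER_RATE, CEILING) / wages
    print(f"capped flat tax, wages {wages:,.0f}: average rate {avg:.4f}")
```

The average rate under the bracket schedule rises once income passes the first bracket, while the average rate under the capped tax is constant up to the ceiling and falls beyond it.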
Let us now assume a very simple progressive tax system with only two brackets: 0 percent for income below a certain threshold, and 50 percent for all income above the threshold. Let’s set the threshold so that our consumer must work exactly T/3 hours per standard time interval to reach it. (For instance, if the standard time interval is a day, so T = 24, and w is the hourly wage rate, the threshold is 8w dollars per day. If w = 50, and a worker works 5 days per week and 52 weeks per year, then we are assuming the threshold income at 8 × 50 × 5 × 52 = $104,000 per year. With these numbers, the consumer would pay no income tax on the first $104,000 in (annual) income, and $0.50 in income tax on each dollar above $104,000.)

In Figure 5.9 below, we show the consumer’s budget constraint, and a choice (L∗, c∗) which involves his earning enough such that he is subject to the income tax.

Now consider what would happen to this particular consumer if the two-bracket tax is replaced with a flat tax, at a rate t. With such a tax, the consumer’s budget constraint is replaced by a straight line, with intercept T on the horizontal axis and w(1 − t)T/p on the vertical axis. The slope of this budget line is w(1 − t)/p in absolute value. Now assume that the rate t is so cleverly chosen that the new straight-line budget constraint goes exactly through the point (L∗, c∗). This makes the transition from the progressive tax to the flat tax neutral for this consumer, in the sense that if he wanted to, he could continue to consume exactly the same leisure/consumption combination that he used to consume.

Looking at Figure 5.9, three remarkable conclusions become apparent:

(1) This consumer is better off when the two-bracket progressive tax is replaced by the flat tax.
(2) This consumer works more hours after the change.
(3) The government collects more income tax from this consumer after the change.
Conclusion (1) holds because the flat tax budget line must cross the consumer’s indifference curve at (L∗, c∗). Therefore, the consumer can find a new optimum on a higher indifference curve. Conclusion (2) holds because the chosen point under the flat tax, call it (L∗∗, c∗∗), must lie to the left of (L∗, c∗). Therefore L∗∗ < L∗, and it follows that l∗∗ > l∗. For conclusion (3), note that for any given L and l = T − L, going straight up to the no tax budget line gives wl/p, the consumer’s gross earnings expressed in units of consumption. And going straight up to the flat tax budget line gives the consumer’s net-of-flat-tax earnings w(1 − t)l/p, expressed in units of consumption. The vertical difference between the no tax budget and the flat tax budget, for a given L and l = T − L, is therefore wtl/p, or how much the government is collecting from this consumer, measured in units of consumption. This vertical difference increases as we move to the left. Since both tax systems collect the same amount at (L∗, c∗), where the two budget lines cross, and (L∗∗, c∗∗) lies to the left of that point, the flat tax collects more.

INSERT FIGURE 5.9 HERE

Caption of Fig. 5.9: Budget constraint with progressive tax (two brackets), showing the flat tax is better for this consumer, better for the government, and results in more hours of work by the consumer.

Given the remarkable triplet of conclusions, that the consumer is better off, works more, and pays more taxes under flat taxes, why haven’t all progressive taxes in the world been abandoned? The answer is that while Figure 5.9 is a convincing argument for one person, it is an argument for just one person. Imagine a second consumer, like the one whose budget appears in Figure 5.9, but one who earns half the wage. Under the progressive two-bracket tax, person 2 would most likely pay no income taxes at all (unless he worked more than 16 hours per day, which would be unusual). The introduction of the flat tax would therefore make him worse off.
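The comparison between the two-bracket tax and the flat tax can be reproduced numerically. The sketch below takes T = 24, w = 50, the threshold of 8w per day, and the 50 percent top rate from the text, but the utility function u(L, c) = L^0.4 c^0.6 is purely an assumption made so that the optimum can be computed, and the optimum is found by a simple grid search rather than by calculus. All names are hypothetical.

```python
T = 24.0   # hours per day
P = 1.0    # price per unit of consumption (assumed)
B = 0.4    # ASSUMED Cobb-Douglas weight on leisure; not from the text

def utility(leisure, consumption):
    """Assumed utility u(L, c) = L**B * c**(1 - B)."""
    return leisure ** B * consumption ** (1.0 - B)

def two_bracket_net(wage, hours, threshold, rate=0.5):
    """Net earnings when only income above `threshold` is taxed at `rate`."""
    gross = wage * hours
    return gross - rate * max(0.0, gross - threshold)

def flat_net(wage, hours, rate):
    """Net earnings when every dollar is taxed at `rate`."""
    return wage * (1.0 - rate) * hours

def best_choice(net_earnings, steps=24000):
    """Grid search over hours of work; returns (utility, hours, consumption)."""
    best = (-1.0, 0.0, 0.0)
    for i in range(1, steps):
        hours = T * i / steps
        consumption = net_earnings(hours) / P
        u = utility(T - hours, consumption)
        if u > best[0]:
            best = (u, hours, consumption)
    return best

WAGE = 50.0
THRESHOLD = WAGE * T / 3.0   # reached after T/3 = 8 hours of work

# High earner: optimum under the two-bracket tax, then under the neutral flat tax.
u0, l0, c0 = best_choice(lambda h: two_bracket_net(WAGE, h, THRESHOLD))
flat_rate = 1.0 - P * c0 / (WAGE * l0)   # flat line passes through (L*, c*)
u1, l1, c1 = best_choice(lambda h: flat_net(WAGE, h, flat_rate))
tax0 = WAGE * l0 - P * c0
tax1 = flat_rate * WAGE * l1

# Second consumer with half the wage, facing the same dollar threshold.
low_wage = WAGE / 2.0
v0, m0, _ = best_choice(lambda h: two_bracket_net(low_wage, h, THRESHOLD))
v1, m1, _ = best_choice(lambda h: flat_net(low_wage, h, flat_rate))

print(f"high earner: hours {l0:.2f} -> {l1:.2f}, tax {tax0:.2f} -> {tax1:.2f}, "
      f"better off: {u1 > u0}")
print(f"low earner: better off under flat tax: {v1 > v0}")
```

Under these assumed preferences the high earner works more, pays more tax, and reaches a higher utility level under the flat tax, while the low earner, who paid no tax before, ends up worse off.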
The conclusion is that, while replacing progressive taxes with flat (or flatter) taxes may be desirable for high earners or even average earners (possibly causing high earners to work more hours, and possibly causing government revenues to increase), it will likely make the low earners worse off.

5.6 Saving and Borrowing: the Intertemporal Choice of Consumption

In the previous sections of this chapter, we analyzed a consumption/leisure model of consumer behavior in order to draw conclusions about a consumer’s decision to supply his time and labor. He sells his labor for wages in the labor market. In the sections below, we consider how the consumer interacts with the capital market. This is the market (or markets) where the consumer puts his savings, or where he goes to borrow. We will show how the consumer decides how much money to save and loan to others (his supply of savings), or how much money to borrow from others.

To do this, we revisit the discussion of budget constraints over time laid out in Section 5 of Chapter 3. A budget constraint over time is called, more elegantly, an intertemporal budget constraint. It shows how much a consumer can consume today, and how much he can consume in the future. That is, it shows the tradeoff between consumption this year and consumption next year. The consumer’s decision about this tradeoff will immediately lead to his decision about how much to save, or how much to borrow.

We start by assuming there are two time periods, which can be called “today” and “tomorrow,” or “this year” and “next year,” or “year one” and “year two.” In Chapter 3, we had labeled consumption in the two years x1 and x2, respectively, but in this chapter we will use c1 and c2 in order to make the notation similar to that used in our discussion of the consumption/leisure model. So c1 is the number of units of goods and services, or “stuff,” that our person consumes this year. And c2 is the number of units he consumes next year.
A unit of stuff this year is the same amount of stuff as a unit next year; the only difference is the timing. We will again assume, as we did in Chapter 3, that a unit of stuff this year costs $1, or p1 = 1. Now, however, we will allow for inflation. The inflation rate is pi, expressed as a decimal. Therefore the price of a unit of stuff next year is p2 = 1 + pi. In Chapter 3, we assumed no inflation, and the prices of stuff this year and next year were equal, at $1 per unit. In this chapter that’s no longer true.

We will let M1 represent the consumer’s income this year, in dollars, and M2 represent his income next year, in dollars. In Chapter 3, where we assumed no inflation, a dollar this year was equal to a dollar next year, in terms of its buying power. In this chapter, where we allow inflation, that is no longer true. Therefore we must be careful to remember that M1 dollars in income this year will buy M1 units of stuff this year, but, because of inflation, M2 dollars in income next year will only buy M2/(1 + pi) units of stuff next year.

The budget constraint. The consumer with income M1 this year and M2 next year could obviously consume c1 = M1 units of stuff this year, and c2 = M2/(1 + pi) units of stuff next year. So (M1, M2/(1 + pi)) satisfies his budget constraint. If he did this, he would not be going to the capital market; he would neither save part of his first-year income and lend it to others, nor would he borrow from others against his anticipated second-year income. We will call the point (M1, M2/(1 + pi)), which is always available to the consumer, the zero savings point. This is the point on the budget constraint recommended by Lord Polonius, a character in William Shakespeare’s tragedy Hamlet (1603): “Neither a borrower nor a lender be; for loan oft loses both itself and friend, and borrowing dulls the edge of husbandry.” (Like most of the characters in the play, Polonius came to a bad end.)
On the other hand, our consumer might want to consume less than M1 this year, in which case he can set money aside, and then consume more than M2/(1 + pi) next year. The amount he saves is M1 − c1. We assume, as in Chapter 3, that if the consumer saves some money, he earns interest on his savings at a rate of i per period. We will also assume that i ≥ 0. (This is almost always true for nominal interest rates, that is, interest rates from which inflation rates have not been subtracted. However, during the panic of 2008-09, interest rates on U.S. Treasury Bills actually became slightly negative!) The budget constraint for the consumer who is saving money this year says that his spending on stuff next year has to be less than or equal to his income next year plus what he saved this year, with interest. This gives

(1 + pi)c2 ≤ M2 + (1 + i)(M1 − c1).

Rearranging terms leads to

(1 + i)c1 + (1 + pi)c2 ≤ (1 + i)M1 + M2,

and dividing both sides by 1 + i gives

c1 + ((1 + pi)/(1 + i))c2 ≤ M1 + (1/(1 + i))M2.

This is almost the same constraint as we found in Chapter 3. The differences are: (1) here we have an inequality (for the budget constraint) rather than an equality (for the budget line), (2) we have replaced x’s with c’s, and (3) we have allowed for inflation with pi.

We will assume here, as we did in Chapter 3, that the interest rate paid by a borrower is the same as the interest rate paid to a saver, that is, i. This is obviously an unrealistic assumption, but it is easy to modify later, and we will do so in an exercise. The consumer who borrows this year and pays off his loan next year consumes more than his income in the first year; he borrows c1 − M1. Next year he must pay back this amount with interest. His budget constraint says that his spending on stuff next year has to be less than or equal to his income next year minus what he borrowed this year with interest. This gives

(1 + pi)c2 ≤ M2 − (1 + i)(c1 − M1).
This leads to exactly the same formula as the one for the saver. The equation for the intertemporal budget line, for the lender, the borrower, or the consumer who neither lends nor borrows, is

c1 + ((1 + pi)/(1 + i))c2 = M1 + (1/(1 + i))M2.

In Figure 5.10 below, we show an intertemporal budget line. It goes through the zero savings point (M1, M2/(1 + pi)), which represents the consumer’s consumption bundle if he neither saves nor borrows. Its intercept on the horizontal axis is M1 + (1/(1 + i))M2. This is the present value of the consumer’s income stream. If the super impatient consumer decided to consume everything this year and nothing next year, this is how much stuff he could consume. The intercept on the vertical axis is ((1 + i)M1 + M2)/(1 + pi). This is how much stuff the super patient consumer could consume if he consumed nothing this year, and everything next year.

The slope of the intertemporal budget line, in absolute value, is (1 + i)/(1 + pi). This ratio is the relative price of current consumption, or the amount of future stuff that could be exchanged for one unit of current stuff. If the ratio is high, current consumption is relatively expensive. On the other hand, if the ratio is low, current consumption is relatively cheap. The ratio also has another important interpretation. Economists define the real interest rate, or δ, with the equation

1/(1 + δ) = (1 + pi)/(1 + i).

This gives δ = (1 + i)/(1 + pi) − 1; this makes δ approximately equal to i − pi. (So the real interest rate is roughly the nominal interest rate minus the inflation rate.) It follows immediately from the definition of δ that the slope of the intertemporal budget line, in absolute value, equals 1 + δ.

Given the budget constraint, which tells the consumer what’s possible, he chooses the combination of current consumption and future consumption that gets him to the highest indifference curve, or maximizes his utility.
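The quantities attached to the intertemporal budget line can be computed directly from M1, M2, i, and pi. The following is a minimal sketch; the function name and dictionary keys are illustrative, and the numbers (M1 = M2 = 100, i = 0.10, pi = 0.05) are borrowed from the solved problem in Section 5.8 of this chapter.

```python
def intertemporal_budget(M1, M2, i, pi):
    """Key quantities of the line c1 + ((1+pi)/(1+i))*c2 = M1 + M2/(1+i)."""
    slope = (1 + i) / (1 + pi)                    # absolute value; relative price of current stuff
    return {
        "present_value": M1 + M2 / (1 + i),       # horizontal intercept
        "vertical_intercept": ((1 + i) * M1 + M2) / (1 + pi),
        "slope": slope,
        "real_rate": slope - 1,                   # delta, from 1/(1+delta) = (1+pi)/(1+i)
        "zero_savings_point": (M1, M2 / (1 + pi)),
    }

M1, M2, i, pi = 100.0, 100.0, 0.10, 0.05
q = intertemporal_budget(M1, M2, i, pi)

# The zero savings point must lie on the budget line.
c1, c2 = q["zero_savings_point"]
lhs = c1 + ((1 + pi) / (1 + i)) * c2

for name, value in q.items():
    print(name, value)
print("delta vs. i - pi:", q["real_rate"], i - pi)
```

With these numbers the exact real rate is a little under 5 percent, which shows how close the approximation δ ≈ i − pi is when both rates are small.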
Figure 5.10 also shows an indifference curve for the consumer; given this indifference curve and the “original” budget line shown, he chooses to consume (c∗1, c∗2), and to save an amount M1 − c∗1. The figure also shows an alternative “new” budget line based on an increase in i.

INSERT FIGURE 5.10 HERE
Caption of Fig. 5.10: The intertemporal budget constraint for a saver, and a new budget constraint showing what he can consume when the interest rate increases.

5.7 The Supply of Savings

In response to given M1, M2, pi, and i, the consumer will decide on utility-maximizing c∗1 and c∗2. We will call his chosen savings level s∗, defined by s∗ = M1 − c∗1. If s∗ > 0, then he really is saving; if s∗ < 0, he is actually borrowing, and if s∗ = 0, he is at the zero savings point, and is neither saving nor borrowing. The consumer’s supply of savings is a function of the underlying variables M1, M2, pi, and i, and also depends, of course, on the consumer’s preferences or utility function. We now consider how s∗ depends on the interest rate i.

We start by assuming that consumption this year and consumption next year are both normal goods. That is, if the consumer’s budget line shifts out, but the slope of the budget line, (1 + i)/(1 + pi) in absolute value, does not change, then the consumer wants to consume more of both goods: more stuff this year and more stuff next year. This kind of shift would result, for example, from an increase in both M1 and M2.

Next, consider a saver, like the one whose budget line is shown in Figure 5.10. Suppose all variables are constant except for i; assume that i rises a bit. Now there is a new budget line. It still goes through the point (M1, M2/(1 + pi)), but it is now steeper, and the point (c∗1, c∗2) now lies below it. The increase in i has an income effect and a substitution effect.
Since (c∗1, c∗2) lies below the new budget line, our consumer, who is a saver, welcomes the increase in the interest rate; he can now afford more stuff this year and more stuff next year, so he is richer. By the income effect, his desired consumption of stuff this year rises. Therefore, his desired savings (income minus current consumption) falls by the income effect. Note that Figure 5.11 below, which is based on Figure 5.10, shows the income and substitution effects with dashed arrows; the income effect is the rightward sloping arrow.

But the new budget constraint is also steeper. Its slope (in absolute value), (1 + i)/(1 + pi), is the relative price of current consumption. Since consumption this year has become relatively more expensive, by the substitution effect the consumer’s desired consumption of stuff this year falls. Therefore, by the substitution effect his desired savings (income minus current consumption) rises. In Figure 5.11 below the substitution effect is the leftward sloping dashed arrow.

In short, for the saver, when i increases the income effect says “consume more and save less,” but the substitution effect says “consume less and save more.” Therefore the net effect of an increase in i on desired consumption this year, and on desired savings, is ambiguous. When i rises, the saver might save more (if the substitution effect prevails), or he might save less (if the income effect prevails).

In Figure 5.11 below, we show an enlarged and enhanced view of the crucial region of Figure 5.10. We add a hypothetical budget line with the same slope as the new budget line, as well as a new optimal point (c∗∗1, c∗∗2) on a new indifference curve. The substitution effect is the dashed arrow from the original consumption point (c∗1, c∗2) to a tangency point between the original indifference curve and the hypothetical budget line. The income effect is the dashed arrow to
The new consumption point (c∗∗1 , c∗∗2 ) might lie to the right or to the left of the original consumption point (c∗1, c∗2). Therefore the net effect of an increase in i, on current consumption and on savings, is ambiguous. INSERT FIGURE 5.11 HERE Caption of Fig 5.11: An enlarged view of the crucial area of Figure 5.10, which shows income and substitution effects. By the substitution effect, when the interest rate rises the saver wants to save more; by the income effect he wants to save less. The net effect is ambiguous. In the analysis above, we considered the effects of an increase in i on a saver. We will now turn to the same kind of analysis, but for a borrower. Figure 5.12 shows the case for a borrower. Note that Figure 5.12 is based on exactly the same indifference curve and exactly the same zero savings point (M1,M2/(1 + pi)) as Figure 5.10. The difference is that in Figure 5.10 both the original and the new budget lines are rather steep, and in Figure 5.12 both the original and the new budget lines are rather flat. Both figures represent the same consumer, who is a saver in Figure 5.10 because interest rates are high, and who is a borrower in Figure 5.12 because interest rates are low. INSERT FIGURE 5.12 HERE Caption of Fig. 5.12: The intertemporal budget constraint for a borrower, and a new budget constraint showing what he can consume when the interest rate increases. We consider again a small increase in i, which modifies Figure 5.12. There is a new budget line. It still goes through the point (M1,M2/(1 + pi)), but it is now steeper, and the point (c∗1, c∗2) now lies above it. Since (c∗1, c∗2) lies above the new budget line, our consumer, who is now a borrower rather than a saver, is hurt by the increase in the interest rate; he can no longer afford the combination of stuff this year and stuff next year that he used to consume—he is poorer. By the income effect, his desired consumption of stuff this year falls. 
Therefore, his desired savings s∗ = M1 − c∗1 rises by the income effect. (Remember that s∗ is a negative number for the borrower; when we say he saves more, we mean he borrows less.) The new budget line is steeper than the original. By the same argument made in the case of the saver, the borrower’s savings rise as a consequence of the substitution effect. (A steeper budget line indicates that current consumption is relatively more expensive, which implies less current consumption, which implies more savings.) In the case of the borrower, therefore, the net effect of an increase in i on desired consumption this year, c∗1, and on desired savings, s∗ = M1 − c∗1, is unambiguous. As i rises, the borrower will want to consume less this year, and he will want to save more, that is, borrow less.

The upshot of what we have done so far is that an increase in i has an ambiguous effect for the saver but an unambiguous effect for the borrower. For the borrower, higher i means save more, that is, borrow less. On the other hand, for the saver, higher i might result in more savings or less savings.

We’ll finish this discussion by emphasizing that the same consumer, with the same income stream and the same preferences, may sometimes choose to be a saver and sometimes choose to be a borrower. This was the case for our consumer shown in Figures 5.10 and 5.12. Whether a person chooses to be a saver or chooses to be a borrower depends on how steep the intertemporal budget line is, that is, it depends on (1 + i)/(1 + pi). All that differed between Figures 5.10 and 5.12 was i. For a given pi, the consumer will borrow the most when i = 0. (Remember that we are assuming the interest rate cannot be negative. Therefore the flattest possible intertemporal budget line has slope (1 + 0)/(1 + pi) in absolute value.)
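How s∗ moves with i, including the switch from borrowing to saving, can be traced numerically for one specific utility function. The sketch below assumes u(c1, c2) = c1c2, the utility function used in the solved problem of Section 5.8, for which the tangency condition implies that current consumption is half of the present value of income. The incomes M1 = 100 and M2 = 110 are hypothetical, chosen so that the switch happens at a positive interest rate. For this particular utility function savings rise with i throughout, so the sketch does not show the case in which a saver’s savings fall as i rises.

```python
def savings(i, M1=100.0, M2=110.0):
    """s* = M1 - c1* for u(c1, c2) = c1*c2 (assumed), where c1* is half of
    the present value of income. Inflation does not affect c1* in this case."""
    present_value = M1 + M2 / (1 + i)
    c1_star = present_value / 2
    return M1 - c1_star

rates = [0.0, 0.05, 0.10, 0.15, 0.20]
schedule = [(i, savings(i)) for i in rates]

for i, s in schedule:
    if s < -1e-9:
        role = "borrower"
    elif s > 1e-9:
        role = "saver"
    else:
        role = "neither"
    print(f"i = {i:.2f}   s* = {s:8.4f}   {role}")
```

In this example the consumer borrows the most at i = 0, borrows less as i rises, and is at the zero savings point when i = M2/M1 − 1 = 0.10.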
As i rises from 0, borrowing will decline monotonically (that is, negative savings will increase monotonically) until a “crossover” i is reached, at which point the consumer neither saves nor borrows. As i rises above the crossover, the consumer will have positive savings. However, with further increases he may reach a point of maximum savings, after which his savings may actually decline. That is, for i above the crossover point, when the consumer is saving, there is no longer a monotonic relationship between i and s∗. The interested reader is invited to sketch a savings supply curve, showing i on the vertical axis and desired savings s∗ on the horizontal axis. Remember to allow for positive s∗ for the saver case and negative s∗ for the borrower case, and show the “crossover” i.

5.8 A Solved Problem

The Problem

Penny is deciding how much money to allocate to consumption today and to consumption tomorrow, c1 and c2. Penny’s utility function is u(c1, c2) = c1c2. She will receive income of M1 = $100 today and M2 = $100 tomorrow. Suppose the interest rate is 10 percent (i = 0.10). Assume today’s price of stuff is $1 per unit, and the inflation rate is 5 percent (pi = 0.05). Assume Penny can either save or borrow at a bank, at the given interest rate. Of course she can only borrow up to a maximum of 100/(1 + i), the present value of tomorrow’s income.

(a) Write down the equation for Penny’s budget line.
(b) Calculate Penny’s optimal (c∗1, c∗2). How much does she save or borrow today?
(c) How does your answer to (b) change if the inflation rate goes up to 10 percent?
(d) Answer parts (a) and (b) for Milly, Penny’s friend, who has the same income stream and whose utility function is u(c1, c2) = √c1 + c2.

The Solution

(a) An intertemporal budget constraint says that the present value of the consumption stream must be less than or equal to the present value of the income stream.
On the budget line the present values must be equal. The present value of consumption is c1 + ((1 + pi)/(1 + i))c2. The present value of income is M1 + (1/(1 + i))M2. Therefore her budget line equation is

c1 + ((1 + pi)/(1 + i))c2 = M1 + (1/(1 + i))M2 ⇔ c1 + (1.05/1.10)c2 = 100 + 100/1.10.

Note that the slope of the budget line, in absolute value, is (1 + i)/(1 + pi) = 1.10/1.05.

(b) Next we use the tangency condition, which says the MRS should equal the absolute value of the slope of the budget line. This gives MRS = MU1/MU2 = (1 + i)/(1 + pi) = 1.10/1.05. Since her utility function is u(c1, c2) = c1c2, MU1 = c2 and MU2 = c1. Therefore the tangency condition is

c2/c1 = (1 + i)/(1 + pi) = 1.10/1.05,

which gives c1 = (1.05/1.10)c2. When we plug this into the budget line equation we get

(1.05/1.10)c2 + (1.05/1.10)c2 = 100 + 100/1.10.

It follows that the optimal quantities are c∗1 = (1.05/1.10)100 = 95.455 and c∗2 = 100. Since M1 = 100, she saves M1 − c∗1 = 4.545 today.

(c) If pi rises from 5 percent to 10 percent, the budget line equation changes to

c1 + (1.10/1.10)c2 = 100 + 100/1.10.

The tangency condition changes to

c2/c1 = (1 + i)/(1 + pi) = 1.10/1.10 = 1,

which gives c1 = c2. Plugging this into the budget line equation leads directly to c∗1 = 95.455 and c∗2 = 95.455. Note that the increase in inflation has made Penny worse off, and she has reacted by cutting back on her consumption tomorrow, which has become relatively more expensive.

(d) Milly’s utility function is u(c1, c2) = √c1 + c2. It’s clear that she’s going to favor consumption tomorrow over consumption today, since c is usually a lot bigger than √c. But she won’t ignore consumption today entirely. Her marginal utility of consumption today is (1/2)c1^(−1/2). Her marginal utility of consumption tomorrow is 1. Therefore her tangency condition is

MU1/MU2 = (1/2)c1^(−1/2)/1 = (1 + i)/(1 + pi) = 1.10/1.05.

This gives c1^(−1/2) = 2.20/1.05. Therefore c∗1 = (1.05/2.20)^2 ≈ 0.2278.
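The numbers in parts (b) and (c), and the value of c∗1 just derived for Milly, can be checked with a few lines of arithmetic. The sketch below only evaluates the tangency and budget conditions stated above; the function names are illustrative.

```python
def penny_optimum(M1, M2, i, pi):
    """u = c1*c2: the tangency c2/c1 = (1+i)/(1+pi) means half of the present
    value of income is spent on each period's consumption."""
    present_value = M1 + M2 / (1 + i)
    c1 = present_value / 2
    c2 = (present_value / 2) * (1 + i) / (1 + pi)
    return c1, c2

def milly_c1(i, pi):
    """u = sqrt(c1) + c2: the tangency (1/2)*c1**(-1/2) = (1+i)/(1+pi), solved for c1."""
    return ((1 + pi) / (2 * (1 + i))) ** 2

M1, M2, i = 100.0, 100.0, 0.10

c1_b, c2_b = penny_optimum(M1, M2, i, pi=0.05)   # part (b)
c1_c, c2_c = penny_optimum(M1, M2, i, pi=0.10)   # part (c)
c1_milly = milly_c1(i, pi=0.05)                  # part (d), first step

print(f"(b) c1* = {c1_b:.3f}, c2* = {c2_b:.3f}, savings = {M1 - c1_b:.3f}")
print(f"(c) c1* = {c1_c:.3f}, c2* = {c2_c:.3f}, savings = {M1 - c1_c:.3f}")
print(f"(d) Milly's c1* = {c1_milly:.4f}")
```

Note that Penny’s current consumption, and therefore her savings, do not change between (b) and (c); with this utility function only c∗2 responds to the higher inflation rate.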
Her budget line equation is the same as Penny’s:

c1 + (1.05/1.10)c2 = 100 + 100/1.10.

We now plug in c∗1 ≈ 0.2278 and get c∗2 ≈ 199.7614. She therefore saves M1 − c∗1 ≈ 99.77 today.

Exercises

1. In the consumption/leisure model, let the consumer’s utility function be u(L, c) = Lc. Suppose the price of stuff is p = 1. Can you show that the daily labor supply curve is l∗(w) = 12? (That is, this consumer’s preferences are such that, regardless of the wage rate, she wants to work exactly half her day.)

2. Humpty Dumpty earns some non-labor income and decides not to work. Draw a graph with leisure on the horizontal axis and consumption on the vertical axis, showing his preferences, budget constraint, and the optimal consumption bundle. Label the intercepts and the optimal bundle.

3. Suppose the interest rate for savers is i1 and the interest rate for borrowers is i2, with i1 < i2.
(a) Sketch the consumer’s intertemporal budget line.
(b) Now assume i1 > i2. Show with a graph that the consumer might be indifferent between borrowing a large amount of money, or saving a large amount of money.

4. Mr. A’s preferences for present consumption (c1) and future consumption (c2) are given by the utility function u(c1, c2) = c1^(1/3) c2^(2/3), while Mr. B’s are given by v(c1, c2) = c1^(2/3) c2^(1/3). Suppose that the price of current consumption is 1, and that the interest rate and the inflation rate both equal 5 percent. Finally, both Mr. A’s income and Mr. B’s income are $100 per period, today and tomorrow.
(a) Write down the equation of the budget line and show it graphically, labeling the two intercepts with the axes, the slope, and the zero savings point. Comment on each of these values.
(b) Solve for Mr. A’s and Mr. B’s optimal consumption bundles. Can you say whether they are lenders or borrowers?
(c) For those values of income, price, and the inflation rate, find Mr. A’s and Mr. B’s savings supply curves. Next, suppose they are the only consumers in the economy.
Find the aggregate supply of savings of this economy. Represent the three curves graphically.
(d) Assume that the interest rate goes up to 10 percent. Find the new consumption bundles of the two consumers. Are they better or worse off than in the initial situation? Discuss.

5. Sketch a consumer’s savings supply curve, with i on the vertical axis and s∗ on the horizontal axis. Be sure to show s∗ positive for one range of i and negative for another, and show the critical i where the consumer crosses over from being a borrower to being a saver.

6. Analyze the effect of a decrease in pi on the saver’s choice of current consumption c∗1 and savings s∗. Is the effect the same as an increase in i? In your analysis, be careful to consider the saver’s budget constraint, and how decreasing pi or increasing i impacts that constraint.

6 Welfare Economics 1: The One-Person Case

6.1 Introduction

This chapter is part of welfare economics. Welfare economics is about the well-being of society: which institutions, which policies, which market structures, which distributions of goods or wealth, make society better or worse off. This type of question has interested economists since the time of Adam Smith (1723-1790), whose The Wealth of Nations was published in the same year (1776) as the American Declaration of Independence.

Welfare economics is easy if society can be modeled as just one person, plus a government deciding on something like a tax policy, and if the only question is whether that one person would be better off with policy A or policy B. This is the kind of analysis we will do in Sections 2 and 3 of this chapter. The appendix to this chapter covers the theory of revealed preference, and relates it to some of the Section 2 material.

Welfare economics is somewhat more difficult if we want to know how much that one individual prefers policy A to policy B.
We look at this issue in Section 4.

Welfare economics is complicated if there are two or more people in society, and if we are trying to determine whether policy A is better for the many-person society than policy B. In Section 5 of this chapter we provide an example which touches on the problem, but the many-person model is mainly discussed in the following chapter.

A different branch of welfare economics explores the connections between competitive markets and what economists call Pareto efficiency or Pareto optimality. We will do this later, in our chapters on exchange and production economies.

6.2 Welfare Comparison of a Per Unit Tax and an Equivalent Lump Sum Tax

We will now focus on one consumer, whom we will call the typical consumer or Ms. Typical. We assume she has income M, and that there are two goods, with prices p1 and p2. We assume that Ms. Typical has well behaved preferences, and a standard utility function that depends on the quantities (x1, x2). Her utility maximizing bundle is x∗ = (x∗1, x∗2). This point is the solution to:

max u(x1, x2) subject to p1x1 + p2x2 ≤ M.

Note that (x∗1, x∗2) is on the budget line, and is a point of tangency between the budget line and an indifference curve, so that MRS = p1/p2.

Now suppose a per unit tax on good 1 is introduced. A per unit tax on good 1, also sometimes called a specific tax, is a fixed amount of money that the consumer must pay to the government for each unit of good 1 that she consumes. For instance, a state gasoline tax in the U.S. is a per unit tax; it is figured as cents per gallon. A per unit tax is different from a percentage or ad valorem tax, which is figured as a percentage of the price, rather than as a fixed amount of money per unit. For example, state (and city and county) sales taxes in the U.S. are typically percentage taxes. In 2010, U.S.
sales taxes (state plus city and county) ranged between a low of 0 percent in Delaware, New Hampshire, and Oregon, and a high of over 9 percent in California, Illinois, and Tennessee. (Value added taxes in the European Union are also percentage taxes, but unlike U.S. sales taxes, these percentage taxes are paid several times at steps along the production process, when inputs or unfinished goods are being sold from firm to firm, as well as when the final product is sold by the last firm to the consumer. At each step, the tax is calculated by multiplying the given percentage by the value added by the producer at the current step, that is, price minus cost of materials and other inputs.)

We now model a per unit tax. We will let t represent the per unit tax on good 1. With the introduction of the tax, the cost of a unit of good 1 to the consumer rises from p1 to p1 + t. Her budget constraint becomes (p1 + t)x1 + p2x2 ≤ M. She solves the maximization problem

max u(x1, x2) subject to (p1 + t)x1 + p2x2 ≤ M.

Let’s call the solution x∗∗ = (x∗∗1, x∗∗2). Notice that the government tax revenues must be equal to tx∗∗1.

In contrast to per unit taxes or ad valorem (percentage) taxes, we might imagine a tax on Ms. Typical’s income M. In this chapter we have not discussed how she came by that income (for instance, by working a number of hours at a wage rate, by inheritance, and so on). Nor do we want to worry about this significant issue. We will simply assume that somehow the government takes a sum of money out of her pocket, without creating incentives or disincentives for her to earn the money in the first place. This kind of tax acts like an unavoidable, and perhaps unanticipated, wallet lightening. A tax which a consumer must pay, but which is independent of any decision she might make, is called a lump sum tax. Lump sum taxes are actually rather unusual in the real world.
So-called poll taxes are examples; a poll tax is a fixed tax imposed on every person or every adult in a given community, that is, a per head tax. Poll taxes were once used in southern U.S. states to limit voting by blacks and poor whites, and they are now generally viewed as unconstitutional in the U.S.

We now let T represent the lump sum tax. With this tax in place, and no per unit tax, Ms. Typical’s budget constraint becomes p1x1 + p2x2 ≤ M − T. Let us also assume that the lump sum tax T is chosen so that the tax revenue generated is the same as with the per unit tax. That is, T = tx∗∗1. Which of these two taxes would our typical consumer prefer? With the lump sum tax, she would solve this problem:

max u(x1, x2) subject to p1x1 + p2x2 ≤ M − T = M − tx∗∗1.

Note that the bundle x∗∗ = (x∗∗1, x∗∗2) is still feasible for the consumer, since it was feasible under the per unit tax. That is, x∗∗ is on the lump sum tax budget line. But the lump sum tax budget line has slope p1/p2, and at the point x∗∗, Ms. Typical’s MRS = (p1 + t)/p2. Therefore, her indifference curve must cross the lump sum tax budget line at that point, and so she will choose something else she prefers. (See Figure 6.1 below.) Let us call the something else she prefers x∗∗∗ = (x∗∗∗1, x∗∗∗2).

The moral of this story is that with the lump sum tax, which raises the same revenue as the per unit tax, the consumer ends up at a point x∗∗∗ which she likes better than the point she ends up with under the per unit tax. In short, an equivalent lump sum tax is better for the consumer than a per unit tax, and raises the same amount of money for the government. This result is shown in Figure 6.1.

INSERT FIGURE 6.1 HERE
Caption of Fig. 6.1: Comparison of a per unit tax and a lump sum tax on income.

Why is the lump sum tax better? In a sense, the typical consumer’s decision is affected less by the lump sum tax than by the per unit tax.
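The comparison can be made concrete with a specific utility function. The sketch below assumes u(x1, x2) = x1x2, for which the consumer spends half of her income on each good, and uses hypothetical values M = 100, p1 = p2 = 1, and t = 1. None of these numbers, nor the function names, come from the text; they only illustrate the argument above.

```python
def demand(income, price1, price2):
    """Utility-maximizing bundle for the assumed utility u = x1*x2:
    half of income is spent on each good."""
    return income / (2 * price1), income / (2 * price2)

def utility(bundle):
    x1, x2 = bundle
    return x1 * x2

M, p1, p2, t = 100.0, 1.0, 1.0, 1.0          # hypothetical values

x_unit = demand(M, p1 + t, p2)               # x**: choice under the per unit tax
T = t * x_unit[0]                            # equivalent lump sum tax, T = t * x1**
x_lump = demand(M - T, p1, p2)               # x***: choice under the lump sum tax

# x** lies exactly on the lump sum tax budget line.
cost_of_x_unit = p1 * x_unit[0] + p2 * x_unit[1]

print("per unit tax:", x_unit, "utility", utility(x_unit), "revenue", T)
print("lump sum tax:", x_lump, "utility", utility(x_lump), "revenue", T)
print("x** costs", cost_of_x_unit, "at untaxed prices; income after T is", M - T)
```

Both taxes raise the same revenue, yet the bundle chosen under the lump sum tax yields higher utility. Because utility is ordinal, the size of the utility gap carries no meaning; only the ranking does.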
With the lump sum tax, she has T = tx∗∗1 less to spend on all goods, and so she is poorer; but with her reduced income, she faces undistorted market prices. On the other hand, with the per unit tax, she is again poorer in the sense that she cannot afford her no-tax bundle of goods, and, in addition, the per unit tax causes her to inappropriately substitute good 2 for good 1, because the relative prices are distorted by the tax.

Finally, let’s think about this question: The lump sum tax would make the consumer better off than the per unit tax. But by how much? We will come back to this in Section 4 below.

6.3 Rebating a Per Unit Tax

Occasionally a government may impose a per unit tax on a commodity, say caviar, perhaps because it thinks consuming caviar is an immoral extravagance. However, it may want to compensate the typical consumer for the caviar tax, by rebating the revenue in some other way. For instance, the government may decide to send a check to Ms. Typical at the end of the year, equal to the amount of caviar taxes paid by her. Would this policy make her better off, the same, or worse off?

For the purposes of this analysis, we will assume that Ms. Typical does not connect the total per unit tax paid over the course of the year for her caviar purchases with the rebate check received at the end of the year. That is, the rebate is a lump sum rebate.

Before the government sets up the per unit tax/lump sum rebate scheme, Ms. Typical is maximizing utility subject to prices (p1, p2) and income M. That is, she is solving the problem:

max u(x1, x2) subject to p1x1 + p2x2 ≤ M.

We let the bundle x∗ = (x∗1, x∗2) represent her choice absent the scheme. This point is shown in Figure 6.2 below.

Now the government imposes the per unit tax on good 1, caviar, and rebates the proceeds in a lump sum. The cost of a unit of good 1 to Ms. Typical becomes p1 + t, where t is the per unit tax.
The government is collecting t times the number of units of caviar she consumes, and simultaneously rebating to her what it collects. We’ll let R be the lump sum rebate. Also, we’ll let x∗∗ = (x∗∗1, x∗∗2) represent her new chosen point. This is the solution to the following utility maximization problem:

max u(x1, x2) subject to (p1 + t)x1 + p2x2 ≤ M + R.

Note that R = tx∗∗1, but the consumer does not act on this information. That is, she does not plug the formula for R into the budget constraint and thereby “solve out” the tx1 term. However, we know that R = tx∗∗1, and therefore at (x∗∗1, x∗∗2), the following must be true:

(p1 + t)x∗∗1 + p2x∗∗2 = M + tx∗∗1.

Therefore

p1x∗∗1 + p2x∗∗2 = M.

In short, the point x∗∗ satisfies her original budget constraint. That is, it was affordable when the consumer chose x∗. Therefore the consumer must be worse off at x∗∗ than at x∗. All this is shown in Figure 6.2 below.

INSERT FIGURE 6.2 HERE
Caption of Fig. 6.2: Rebating a per unit tax.

Schemes like this are quite common in the real world, although the real-world schemes usually have a serious and important reason for the government’s action. (In our example, the government’s rationale, discouraging the immoral extravagance of eating caviar, obviously wasn’t serious.) As an example of a real-world scheme of this type, but one with a serious reason behind it, consider the following. In early 2009, people in the U.S. Congress proposed a carbon emissions tax (which would ultimately be paid by users of electricity, gasoline, and so on). To mitigate the negative impact on Ms. Typical, payroll taxes would simultaneously be reduced. For the typical consumer, the burden of the new taxes on electricity and fuel would be just offset by the reduction in payroll taxes. Our analysis above suggests that this would be a bad thing for Ms.
Typical, but when the benefits of reduced carbon emissions are factored in, a bad thing might become a good thing. (The proposal failed in Congress.) We will discuss pollution-based externalities in Chapter 17.

6.4 Measuring a Change in Welfare for One Person

Our discussion of per unit taxes and lump sum taxes in the sections above made no attempt to measure how much Ms. Typical might like one policy more than another. We will now turn to that question. We continue to assume there is just one person.

Recall our comparison of a per unit tax and an equivalent lump sum tax, illustrated in Figure 6.1. We learned from that comparison that the consumer prefers the lump sum tax, which gets her to \(x^{***}\), to the per unit tax, which gets her to \(x^{**}\). This is because \(x^{***}\) is on a higher indifference curve than \(x^{**}\). If we were asked “By how much does she prefer \(x^{***}\)?”, we might give an answer along these lines: “Her utility levels at the two points are \(u(x^{**}) = 7\) and \(u(x^{***}) = 9\), and therefore she likes \(x^{***}\) exactly 2 utility units more than \(x^{**}\).” (Since we did not specify a utility function in this example, we just made up the 7 and the 9.) The problem with this answer is that, although it might be true, it only means that she prefers \(x^{***}\) to \(x^{**}\). That is, since utility functions are ordinal, a 2 utility unit preference for one point over another means no more and no less than a 0.2 utility unit preference, or a 200 utility unit preference.

In order to get a meaningful idea of how much Ms. Typical prefers one point to another, we need a measurement in more tangible units, such as units of goods, or units of money. A measurement in units of utility will not do. In Chapter 4, in our discussion of the Hicks substitution effect, we developed the means for transforming utility unit gains and losses into corresponding dollar gains and losses.
The reader might look back at Figure 4.10 to recall how Hicks decomposes a move from an initial consumption bundle to a new consumption bundle into an income effect and a substitution effect. The substitution effect shows how the consumer’s consumption bundle shifts because of a shift in prices, holding utility constant, while the income effect shows how the consumer’s consumption bundle shifts because her income has in effect changed; she has in effect become richer or poorer. The income effect abstracts away the change in relative prices, and makes the shift from a point on an old indifference curve to a point on a new indifference curve a consequence of a change in income alone (measured in dollars). This gives us an easy and objective way to measure a consumer’s gain or loss from a shift in consumption: compute the dollar amount of the income effect.

Also recall that in Chapter 4, Figure 4.11, we indicated there are really two ways to do an income/substitution effect decomposition in the Hicks style: one proposed by Hicks himself and an alternative proposed by Kaldor. The Hicks version uses relative prices at the new point to produce a hypothetical budget line tangent to the old indifference curve, while the Kaldor version uses relative prices at the old point to produce a hypothetical budget line tangent to the new indifference curve.

Before proceeding, let’s lay out some terminology. Suppose a consumer shifts from one consumption point to another. We know that her move can be decomposed into a substitution effect and an income effect, using either the Hicks decomposition or the Kaldor decomposition. Whichever decomposition we use, the income effect is a move from one bundle to another bundle; and at each of these bundles there is a budget line (one real and one hypothetical) tangent to an indifference curve. The two budget lines are parallel because they are based on the same prices.
The dollar amount or dollar value of the income effect is the dollar difference between the two parallel budget lines; that is, the dollars it would take to get from one budget line to the other. The compensating variation measure of the consumer’s gain (or loss if negative) is the dollar amount of the Hicks income effect, the income effect based on the new prices. The equivalent variation measure of the consumer’s gain (or loss if negative) is the dollar amount of the Kaldor income effect, the income effect based on the old prices.

We now turn to an algebraic example, which will make everything crystal clear.

Example 1. Measuring one consumer’s welfare change in dollars, when \(p_1\) rises, for a simple product utility function.

Let’s assume a consumer has utility function \(u(x) = u(x_1, x_2) = x_1 x_2\). Assume her income is \(M = 18\), and the prices at the start are \(p_1 = 1\) and \(p_2 = 1\). In order to maximize her utility subject to her budget constraint, she sets her MRS equal to the price ratio \(p_1/p_2\). This gives

\[ MRS = \frac{MU_1}{MU_2} = \frac{x_2}{x_1} = \frac{p_1}{p_2}. \]

Therefore \(p_1 x_1 = p_2 x_2\). Since her budget constraint is \(p_1 x_1 + p_2 x_2 = M\), we get \(2 p_1 x_1 = M\). Therefore her demand function for good 1 is

\[ x_1 = \frac{M}{2 p_1}. \]

At the initial prices of \((p_1, p_2) = (1, 1)\), and since \(M = 18\), her utility maximizing bundle is \(x^* = (x_1^*, x_2^*) = (9, 9)\). Her initial utility level is \(u(x^*) = 9 \times 9 = 81\).

Now let’s assume the price of good 1 rises to \(p_1 = 2.25\), while \(M\) and \(p_2\) remain the same. The consumer is worse off since she can no longer afford the bundle she had been consuming. We want to measure her loss. We first use the demand function to find her new desired quantity of good 1: \(x_1^{**} = M/(2 p_1) = 18/(2 \times 2.25) = 18/4.5 = 4\). Then we use the budget constraint to find \(x_2^{**} = 9\). Therefore her new utility maximizing bundle is \(x^{**} = (x_1^{**}, x_2^{**}) = (4, 9)\). Her new utility level is \(u(x^{**}) = 4 \times 9 = 36\).

We will use Table 6.1 below to record the quantity and cost variables we are calculating.
In the table, we have columns identifying the original (“old”) prices (1, 1), as well as the revised (“new”) prices (2.25, 1). The original utility maximizing bundle \(x^*\) is shown on line 1 of the table, and the new utility maximizing bundle \(x^{**}\) is shown on line 2. Entries under the “old” and “new” price columns are the dollar costs of the given bundles at those prices. For example, the cost of \(x^*\) at the old prices is $18. The cost of \(x^{**}\) at the new prices is also $18. The other lines of the table will be explained below.

| | Old Prices, \((p_1, p_2) = (1, 1)\) | New Prices, \((p_1, p_2) = (2.25, 1)\) |
|---|---|---|
| 1. \(x^* = (9, 9)\) | $18 | |
| 2. \(x^{**} = (4, 9)\) | | $18 |
| 3. \(y = (6, 13.5)\) | | $27 |
| 4. \(z = (6, 6)\) | $12 | |
| 5. \(y\) to \(x^{**}\) (C.V.) | | minus $9 |
| 6. \(x^*\) to \(z\) (E.V.) | minus $6 | |

Table 6.1

In the move from the old optimal point \(x^*\) to the new optimal point \(x^{**}\), the consumer’s utility drops by 81 − 36 = 45 utility units. But this information is not particularly helpful, because utility functions are ordinal. Also, in the move from \(x^*\) to \(x^{**}\), the consumer’s income stays constant, at $18. This suggests a loss of $18 − $18 = $0. But only a fool would say she has lost nothing.

How then do we measure her loss in dollars? We decompose the move from \(x^*\) to \(x^{**}\) into income and substitution effects, and then use the dollar value of the income effect. There are two very similar ways to do the decomposition: the compensating variation (Hicks) method is based on a hypothetical budget line that is tangent to the original indifference curve but with a slope based on the new prices; the equivalent variation (Kaldor) method is based on a hypothetical budget line that is tangent to the new indifference curve but with a slope based on the old prices.

We illustrate the two methods in Figure 6.3 below. The original consumer optimum is at \(x^* = (9, 9)\) and the new consumer optimum is at \(x^{**} = (4, 9)\).
To get the compensating variation measure, we draw a hypothetical budget line, tangent to the old indifference curve, but with a slope of 2.25/1 = 2.25 in absolute value, based on the new prices. This is the hypothetical budget \(B_1\) in the figure. The substitution effect is the \(x^*\) to \(y\) shift, and the income effect is the \(y\) to \(x^{**}\) shift. To find \(y\), we use the fact that \(y = (y_1, y_2)\) is on the old indifference curve, so \(y_1 y_2 = 81\); and we also use the fact that the slope of the indifference curve at \(y\), in absolute value, must equal 2.25. This gives \(MRS = y_2/y_1 = 2.25\), or \(y_2 = 2.25 y_1\). Putting these two equations together gives \(2.25 y_1^2 = 81\), or \(y_1^2 = 81/2.25 = 36\). Therefore \(y_1 = 6\) and \(y_2 = 13.5\). (This is Table 6.1, line 3.) At the new prices \((p_1, p_2) = (2.25, 1)\), \(y\) would cost \(2.25 \times 6 + 1 \times 13.5 = 27\); at the new prices \(x^{**}\) costs 18. Therefore the dollar value of the income effect is 18 − 27 = −9. (Table 6.1, line 5.) That is, using the compensating variation measure, the move from the old point \(x^*\) to the new point \(x^{**}\) is equivalent to the consumer losing $9.

To get the equivalent variation measure, we draw a hypothetical budget line, tangent to the new indifference curve, but with a slope of 1/1 = 1 in absolute value, based on the old prices. This is the hypothetical budget \(B_2\) in the figure. The substitution effect is now the \(z\) to \(x^{**}\) shift, and the income effect is the \(x^*\) to \(z\) shift. To find \(z\), we use the fact that \(z = (z_1, z_2)\) is on the new indifference curve, so \(z_1 z_2 = 4 \times 9 = 36\); and we also use the fact that the slope of the indifference curve at \(z\), in absolute value, must equal 1.0. This gives \(MRS = z_2/z_1 = 1\), or \(z_2 = z_1\). Putting these two equations together gives \(z_1^2 = 36\). Therefore \(z_1 = 6\) and \(z_2 = 6\). (This is Table 6.1, line 4.) At the old prices \((p_1, p_2) = (1, 1)\), \(z\) would cost \(1 \times 6 + 1 \times 6 = 12\); at the old prices \(x^*\) costs 18. Therefore the dollar value of the income effect is 12 − 18 = −6. (Table 6.1, line 6.)
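The two calculations above can be checked numerically. The short Python sketch below is only an illustration and is not part of the original example; the function names (`demand`, `tangent_bundle`, `cost`) are our own, and the formulas assume the product utility function \(u = x_1 x_2\) with an interior optimum.

```python
import math

def demand(p1, p2, income):
    """Optimal bundle for u = x1*x2: the tangency p1*x1 = p2*x2 puts half of income on each good."""
    return (income / (2 * p1), income / (2 * p2))

def utility(x):
    return x[0] * x[1]

def cost(p, x):
    """Dollar cost of bundle x at prices p."""
    return p[0] * x[0] + p[1] * x[1]

def tangent_bundle(p, u_level):
    """Bundle on the indifference curve x1*x2 = u_level where MRS = x2/x1 equals p1/p2."""
    x1 = math.sqrt(u_level * p[1] / p[0])
    x2 = (p[0] / p[1]) * x1
    return (x1, x2)

income = 18.0
p_old, p_new = (1.0, 1.0), (2.25, 1.0)

x_old = demand(*p_old, income)              # x*  = (9, 9)
x_new = demand(*p_new, income)              # x** = (4, 9)
y = tangent_bundle(p_new, utility(x_old))   # old indifference curve, new prices
z = tangent_bundle(p_old, utility(x_new))   # new indifference curve, old prices

cv = cost(p_new, x_new) - cost(p_new, y)    # income effect y -> x**, valued at new prices
ev = cost(p_old, z) - cost(p_old, x_old)    # income effect x* -> z, valued at old prices

print("y =", y, " z =", z)
print("compensating variation =", cv)       # -9.0
print("equivalent variation   =", ev)       # -6.0
```

The printed values should match lines 3 through 6 of Table 6.1.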
That is, using the equivalent variation measure, the move from the old point \(x^*\) to the new point \(x^{**}\) is equivalent to the consumer losing $6.

[Figure 6.3: There are two ways to measure the consumer’s loss in dollars. Compensating variation is the dollar value of the \(y\) to \(x^{**}\) income effect (based on the new prices); equivalent variation is the dollar value of the \(x^*\) to \(z\) income effect (based on the old prices).]

At this point we can say the following. Our consumer’s move from \(x^*\) to \(x^{**}\) has made her worse off. She has lost 45 utility units, which means she is worse off, but the number “45” doesn’t mean anything in particular. But the move has an effect that can be measured in dollars; and measured in the compensating variation (Hicks income effect) fashion, the result is −$9; measured in the slightly different equivalent variation (Kaldor income effect) fashion, the result is −$6.

Before leaving this section, we must point out a disturbing peculiarity of the compensating variation and equivalent variation measures. Take another look at Figure 6.3. Now, instead of assuming the consumer starts at \(x^*\) and finishes at \(x^{**}\), suppose she starts at \(x^{**}\) and finishes at \(x^*\). In other words, we reverse the direction of the move. Then what had been new is now old, and what had been old is now new. Therefore, for this shift, by the compensating variation measure, the consumer has gained $6 (the dollar value of the \(z\) to \(x^*\) income effect). For this shift, by the equivalent variation measure, the consumer has gained $9 (the dollar value of the \(x^{**}\) to \(y\) income effect). This is all very confusing, but here’s the worst part: Now consider a round trip, from \(x^*\) to \(x^{**}\), and then back, from \(x^{**}\) to \(x^*\). According to the compensating variation measure, the consumer in effect loses $9 on the \(x^*\) to \(x^{**}\) leg. According to the compensating variation measure, she in effect gains $6 on the return leg.
Therefore, by the compensating variation measure, starting at \(x^*\) and ending back up at \(x^*\), the consumer has a net loss of $3. But this is absurd, a paradox, because she ends up at the same point where she started!

We have shown that compensating variation and equivalent variation measures of one consumer’s gains (or losses) from a move from one consumption bundle to another are useful gauges of how much the consumer gains or loses. This is because they are dollar measures of the gains (or losses), and therefore much more tangible than utility measures of those gains or losses. On the other hand, at least for some utility functions, these measures may produce strange results. The reason is the presence of income effects, which make things quite tricky.

6.5 Measuring Welfare for Many People, A Preliminary Example

We will now turn to another example that illustrates how choices between policies or programs might be made by a government. In this example we assume there are many people in society, instead of just one. In other words, we are no longer thinking of alternative government policies and how they might affect Ms. Typical. We are now thinking of alternative government policies and how they affect a population of consumers with different preferences and different utility functions.

In the last section, we carefully discussed compensating and equivalent variation measures of one consumer’s gain or loss from a change in a budget constraint variable, the price of one good. Both of these measures in effect converted utility changes into dollar equivalents. In the following example, we will use a similar although less carefully defined measure of welfare: willingness-to-pay. We will assume that various consumers are being offered a choice between two different policies (or “programs”) by their government, and each consumer (or “family”) would be willing to pay, at most, a certain dollar amount for each policy.
This example is simpler than the last example in the sense that there is no change in relative prices. The example will show that if the government makes choices by adding together the willingness-to-pay numbers of the various families, it risks putting too much importance on the preferences of the rich. Example 2. Measuring society’s preference for one program over another, in dollars. Should schools teach music or art (if they cannot afford both)? Suppose there are two regions of the country, the East and the West. People in the East love music and are less interested in the visual arts. People in the West love the visual arts and are less interested in music. Assume there are equal numbers of families in the two regions. There is a public school system which educates children in both regions. The school system is short of funds and can only afford to provide a music education program, or an art education program, but not both. Nor can it afford to provide music in one region and art in the other. Policy M is to teach music in all schools in the country; policy A is to teach art in all schools in the country. Assume that each family in each region would be willing to pay up to 2 percent of family income each year for the type of education (music or visual arts) they like more, and up to 1 percent of family income each year for the type of education they like less. Now suppose social welfare for each education program is measured simply by summing each family’s willingness-to-pay, over all families in the country. Assume that the government opts for the education program that maximizes aggregate willingness-to-pay. We might call this the naive willingness-to-pay approach. It is “naive” in the Webster’s dictionary sense, of being too simple, too deficient in worldly wisdom. Here is the problem with measuring society’s preference this way: The decision will simply depend on aggregate family income in the two regions of the country. 
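To see how aggregate income drives the outcome, here is a small Python sketch of the naive willingness-to-pay rule. The family incomes are made-up numbers chosen only for illustration, and the function name `naive_wtp_choice` is our own; the only ingredients taken from the example are the 2 percent and 1 percent willingness-to-pay rates.

```python
def naive_wtp_choice(east_incomes, west_incomes):
    """Pick the policy with the larger aggregate willingness-to-pay.

    Eastern families pay up to 2% of income for music (M) and 1% for art (A);
    Western families pay up to 2% for art and 1% for music.
    Totals are kept in whole percent-of-income units to avoid rounding issues.
    """
    east_total, west_total = sum(east_incomes), sum(west_incomes)
    wtp_music = (2 * east_total + 1 * west_total) / 100
    wtp_art = (1 * east_total + 2 * west_total) / 100
    if wtp_music > wtp_art:
        choice = "M"
    elif wtp_art > wtp_music:
        choice = "A"
    else:
        choice = "tie"
    return choice, wtp_music, wtp_art

# Hypothetical data: 1,000 families per region, identical incomes within a region.
richer_east = naive_wtp_choice([60_000] * 1000, [50_000] * 1000)
richer_west = naive_wtp_choice([50_000] * 1000, [60_000] * 1000)

print("East richer:", richer_east)
print("West richer:", richer_west)
```

In this sketch nothing about the families’ tastes changes between the two runs; only the location of the higher incomes does.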
If the East is richer, the decision will be M; if the West is richer, it will be A. The government will provide the program preferred by the richer region. Even worse, if every family in the country is middle-income except for one super-rich family, the decision will depend only on whether the super-rich family lives in the East or in the West. Most economists would reject this approach.

Should we then abandon the naive willingness-to-pay approach and just use compensating variation, or equivalent variation, or some similar measure? But this might produce paradoxes and inconsistencies like those in Example 1 (which was about one person), compounded in a world with many people.

In the following chapter we will continue to discuss these problems. We will introduce an assumption, called “quasilinearity,” which will prevent the paradox of Example 1, and also prevent the unpalatable outcome of Example 2.

6.6 A Solved Problem

The Problem

The King of Phoenicia wants to build a new school, which will cost roughly $100 million. To pay for it he will tax the people of Phoenicia. There are a million people in the country, and they are all alike. They consume only two goods, fruits (\(f\)) and nuts (\(n\)). They all have the same utility function, \(u(f, n) = f(n + 2)\), and they all have the same income, \(M\) = $1,000. The prices of fruits and nuts are \(p_f\) = $4 and \(p_n\) = $4. The King’s Council of Economic Advisors presents him with three alternative proposals: (1) Impose a per unit tax of $1 on fruits. (2) Impose a per unit tax of $1 on nuts. (3) Impose a lump sum tax of $100 on each person. How do these proposals affect the citizens of Phoenicia? How much revenue will each proposal raise?

The Solution

With no tax, each citizen’s budget constraint is \(4f + 4n = 1{,}000\). The tangency condition for utility maximization is

\[ MRS = \frac{MU_f}{MU_n} = \frac{n + 2}{f} = \frac{p_f}{p_n} = \frac{4}{4} = 1. \]

This gives \(n = f - 2\).
Plugging this into the budget constraint gives the optimal consumption bundle \((f^*, n^*) = (126, 124)\). Utility with no tax is \(u(f^*, n^*) = 126(124 + 2) = 15{,}876\).

(1) With a per unit tax of $1 on fruits, the consumer must pay $4 + $1 = $5 for each unit of fruit he consumes. The budget constraint is now \(5f + 4n = 1{,}000\). The tangency condition is now

\[ MRS = \frac{n + 2}{f} = \frac{5}{4}. \]

This gives \(n = (5/4)f - 2\). Plugging this into the budget constraint gives the optimal consumption bundle \((f^*, n^*) = (100.8, 124)\). Utility with the per unit fruit tax is \(u(f^*, n^*) = 100.8(124 + 2) \approx 12{,}701\). Tax revenue from one consumer is $1 × 100.8 = $100.8, and total tax revenue is a million times this, or $100.8 million.

(2) With a per unit tax of $1 on nuts, the consumer must pay $4 + $1 = $5 for each unit of nuts he consumes. The budget constraint is now \(4f + 5n = 1{,}000\). The tangency condition is now

\[ MRS = \frac{n + 2}{f} = \frac{4}{5}. \]

This gives \(n = (4/5)f - 2\). Plugging this into the budget constraint gives the optimal consumption bundle \((f^*, n^*) = (126.25, 99)\). Utility with the per unit nut tax is \(u(f^*, n^*) = 126.25(99 + 2) \approx 12{,}751\). Tax revenue from one consumer is $1 × 99 = $99, and total tax revenue is a million times this, or $99 million.

(3) With a lump sum tax of $100, the prices stay the same, \((p_f, p_n) = (4, 4)\). However, the consumer’s income drops by $100. The budget constraint is now \(4f + 4n = 900\). The tangency condition is

\[ MRS = \frac{n + 2}{f} = \frac{4}{4} = 1. \]

This gives \(n = f - 2\). Plugging this into the budget constraint gives the optimal consumption bundle \((f^*, n^*) = (113.5, 111.5)\). Utility with the lump sum tax is \(u(f^*, n^*) = 113.5(111.5 + 2) \approx 12{,}882\). Tax revenue from one consumer is $100, and total tax revenue is a million times this, or $100 million.

We conclude that all three proposals raise roughly $100 million. But the citizens prefer proposal 3 over proposal 2 over proposal 1. The King has heard about the Arab Spring uprisings of 2011, and so he goes with the lump sum tax.
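The arithmetic in this solution can be verified with a few lines of Python. This is a sketch under the assumptions of the problem (utility \(u(f, n) = f(n + 2)\) and an interior optimum); the helper names `optimum` and `utility` are ours, and the closed form for \(f\) simply combines the tangency condition \((n + 2)/f = p_f/p_n\) with the budget constraint.

```python
def optimum(pf, pn, income):
    """Interior optimum for u(f, n) = f*(n + 2).

    Tangency gives n = (pf/pn)*f - 2; substituting into pf*f + pn*n = income
    gives 2*pf*f - 2*pn = income, so f = (income + 2*pn) / (2*pf).
    """
    f = (income + 2 * pn) / (2 * pf)
    n = (pf / pn) * f - 2
    return f, n

def utility(f, n):
    return f * (n + 2)

BASE_PF, BASE_PN, INCOME = 4.0, 4.0, 1000.0

# (label, per unit tax on fruits, per unit tax on nuts, lump sum tax)
proposals = [
    ("no tax", 0.0, 0.0, 0.0),
    ("(1) fruit tax", 1.0, 0.0, 0.0),
    ("(2) nut tax", 0.0, 1.0, 0.0),
    ("(3) lump sum tax", 0.0, 0.0, 100.0),
]

results = {}
for label, t_f, t_n, lump in proposals:
    f, n = optimum(BASE_PF + t_f, BASE_PN + t_n, INCOME - lump)
    revenue = t_f * f + t_n * n + lump   # tax collected from one consumer
    results[label] = (f, n, utility(f, n), revenue)
    print(f"{label}: f = {f:.2f}, n = {n:.2f}, "
          f"utility = {utility(f, n):,.2f}, revenue per person = ${revenue:.2f}")
```

Multiplying each per-person revenue figure by one million gives the totals quoted in the solution, and the utility column reproduces the ranking of proposal 3 over 2 over 1.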
Exercises

1. Leah spends $200 a month on berries (\(b\)) and cream (\(c\)). Her utility function is \(u(b, c) = bc\). Berries cost $4 a pint and cream costs $2 a pint.
(a) Find Leah’s optimal consumption bundle, and calculate her utility at that bundle.
(b) Suppose a 25 percent tax on cream is imposed. What is Leah’s optimal consumption bundle?
(c) The government is contemplating a subsidy on berries. What would the net price of berries have to be, so that with the 25 percent tax on cream and the subsidized berry price, Leah ends up with the same utility as in (a)?

2. Rachel gets a weekly allowance of $45, which is spent on milk (\(m\)) and cookies (\(c\)). Her utility function is \(u(m, c) = mc^2 + 100\). A glass of milk is $1 and a cookie is $3.
(a) Find Rachel’s optimal consumption bundle, and calculate her utility at that bundle.
(b) Suppose the government taxes Rachel $1 for each cookie consumed. At the end of the week, the government sends Rachel a rebate check equal to the amount of cookie taxes she paid. Rachel, however, does not connect the rebate check with the cookie taxes she paid. Find Rachel’s new consumption bundle, and show that she is worse off than in (a).

3. There are two goods in the world, \(x\) and \(y\). William’s utility function is \(u(x, y) = \min(x, y)\) and Mary’s utility function is \(v(x, y) = x + y\). If a tax is imposed on good \(x\), how does William’s utility change? How about Mary’s utility?

4. Louis’s utility function for champagne (\(c\)) and soda (\(s\)) can be written as \(u(c, s) = 10c^4 s\). A bottle of champagne is $32 and a bottle of soda is $1. His monthly budget for champagne and soda is $80.
(a) Find Louis’s optimal consumption bundle, \((c^*, s^*)\), and his utility level at this bundle.
(b) Suppose a new study shows that champagne has tremendous health benefits, and a bill subsidizing the consumption of champagne is passed. The net price of champagne with the subsidy is $16.
Find Louis’s new consumption bundle, \((c^{**}, s^{**})\), and his utility level at this bundle.
(c) Using the Hicks notion of income and substitution effects, calculate the dollar value of the income effect.

5. A couple’s utility function for condominiums (good \(x\), measured in square feet) and other stuff (good \(y\)) is given by \(u(x, y) = x^2 y\). Suppose that the couple’s income is $30,000, the initial price of \(x\) is $100, and the price of \(y\) is $1. The local government offers the following alternative housing programs. Which program does the couple prefer? How much would each program cost the government? Comment briefly.
(a) A lump-sum subsidy of 3,000 dollars, independent of the size of the condominium purchased.
(b) A subsidy on the price of condominiums such that the net price per square foot is $80.

6. President Clinton has appointed you as her Secretary of the Economy. Assume that goods may be classified into just two types (\(x\) and \(y\)), and that the preferences of all consumers are \(u(x, y) = \min(x, y)\). Let \(x^*\) and \(y^*\) be the initial amounts demanded. Suppose \(p_x = 10\) and \(p_y = 10\). A third of the consumers, Group A, earn $500 each, a third of the consumers, Group B, earn $400 each, and a third of the consumers, Group C, earn $300 each. You present the following plan to Congress:
(i) Lump sum income taxes will be levied on everybody. Consumers in Group A pay a tax of $68 each, consumers in Group B pay a tax of $40 each, and consumers in Group C pay a tax of $12 each.
(ii) The two goods will have per unit subsidies of \(s_x = 1\) and \(s_y = 1\). The subsidies are chosen so that the new (net-of-subsidy) prices \(p_x - s_x\) and \(p_y - s_y\) satisfy \(s_x x^{**} + s_y y^{**} = T\) and \((p_x - s_x)/(p_y - s_y) = p_x/p_y\), where \(p_x\) and \(p_y\) are the initial prices, \(x^{**}\) and \(y^{**}\) are the amounts demanded after the policy intervention, and \(T\) is the total tax collected.
Members of Congress argue that the plan should be rejected on the grounds that it worsens the welfare of the median consumer.
Can you show that their argument is wrong? Who would be better off and who would be worse off if the policy were implemented? Calculate each group’s optimal consumption bundle and utility pre-policy and post-policy.

Appendix: Revealed Preference

In our approach to the theory of the consumer, starting in Chapter 2, we began by assuming that the consumer has preferences. We made certain assumptions about preferences and the utility functions that represent preferences. From the preferences, the utility functions, and the budget constraints, we derived demand curves, and we discussed the properties those demand curves should have. This line of reasoning started with the properties of preferences or utility, which generally are not directly observable, and moved to the properties of demand, which are directly observable.

In the 1930s and 1940s, the great 20th century American economist Paul Samuelson (1915-2009) developed a different sort of theory, which started with the properties of demand, rather than the properties of preferences or utility. Since it started with demand, which is observable, it was in a sense more empirical than classical consumer theory. Samuelson (and others) worked out the assumptions about consumer choice that would produce the same kind of logical structure as the standard preference-based consumer theory.

The essential idea of Samuelson’s theory is this. Suppose a consumer chooses a bundle of goods \((x_1, x_2)\), when she could have chosen a different bundle \((y_1, y_2)\), given the prices of the goods and her income. Then she has directly demonstrated that she prefers \((x_1, x_2)\) to \((y_1, y_2)\), or directly revealed a preference for \((x_1, x_2)\) over \((y_1, y_2)\). Samuelson then proposed a basic assumption about the bundles that the consumer chooses: if the consumer directly reveals a preference for \((x_1, x_2)\) over \((y_1, y_2)\), then she should not directly reveal a preference for \((y_1, y_2)\) over \((x_1, x_2)\).
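Samuelson’s assumption can be tested mechanically on observed choices. The Python sketch below is one hedged way to do so; it is not taken from the text. The function names and the sample observations are hypothetical, and we assume that each observation records the prices the consumer faced and the bundle she bought, and that she spent her whole income, so the cost of the chosen bundle serves as her income.

```python
def cost(prices, bundle):
    """Dollar cost of a bundle at the given prices."""
    return sum(p * q for p, q in zip(prices, bundle))

def directly_reveals_preference(obs_a, obs_b):
    """True if the bundle chosen in obs_a was picked while the different
    bundle chosen in obs_b was affordable at obs_a's prices and income."""
    prices_a, bundle_a = obs_a
    _, bundle_b = obs_b
    return bundle_a != bundle_b and cost(prices_a, bundle_b) <= cost(prices_a, bundle_a)

def violates_assumption(obs_a, obs_b):
    """True if each chosen bundle is directly revealed preferred to the other."""
    return (directly_reveals_preference(obs_a, obs_b)
            and directly_reveals_preference(obs_b, obs_a))

# Hypothetical observations: (prices, chosen bundle).
# Consistent pair: (4, 2) was affordable when (2, 6) was chosen, but not vice versa.
consistent = (((2, 1), (2, 6)), ((1, 2), (4, 2)))
# Inconsistent pair: each chosen bundle was affordable when the other was chosen.
inconsistent = (((2, 1), (4, 2)), ((1, 2), (2, 6)))

print(violates_assumption(*consistent))    # False
print(violates_assumption(*inconsistent))  # True
```

The check uses only prices and purchased quantities, with no reference to a utility function.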
Note that this assumption, although it incorporates the word “preference,” is purely a statement about what bundles of goods the consumer does or does not choose. That is, it is a statement about bundles the consumer chooses, which are observable, and not about preference relations or utility functions, which are not observable.

Samuelson’s basic assumption about choice is now called the weak axiom of revealed preference or WARP for short.

Figure 6.4 below shows two pairs of budget lines in two graphs. In both graphs, the consumer chooses the point \((x_1, x_2)\) when her budget line is \(B_x\). In each graph, there is an alternative budget line \(B_y\), flatter than \(B_x\), under which the consumer chooses a different bundle \((y_1, y_2)\). In both graphs, \((y_1, y_2)\) lies on \(B_x\), which means the consumer could have chosen \((y_1, y_2)\) when she in fact was choosing \((x_1, x_2)\). The upper graph is consistent with the weak axiom of revealed preference, but the lower graph is not. That is, in the upper graph, the consumer is directly revealing her preference for \((x_1, x_2)\) over \((y_1, y_2)\), but she is not (illogically) also directly revealing her preference for \((y_1, y_2)\) over \((x_1, x_2)\). On the other hand, in the lower graph, the consumer is directly revealing her preference for \((x_1, x_2)\) over \((y_1, y_2)\), and, simultaneously, illogically, she is directly revealing her preference for \((y_1, y_2)\) over \((x_1, x_2)\).

[Figure 6.4: When WARP is satisfied, and when it is violated.]

In some applications, the weak axiom of revealed preference is strengthened to what is now called the strong axiom of revealed preference or SARP for short.
The idea of SARP is the following: we say a consumer indirectly reveals a preference for \((x_1, x_2)\) over \((y_1, y_2)\) if there is some string of alternative bundles (call them \(AB_1\), \(AB_2\), and so on), such that she directly reveals a preference for \((x_1, x_2)\) over \(AB_1\), and she directly reveals a preference for \(AB_1\) over \(AB_2\), and so on, until she directly reveals a preference for the last in the string of alternative bundles, say, \(AB_k\), over \((y_1, y_2)\). The assumption of SARP says that if the consumer indirectly reveals a preference for \((x_1, x_2)\) over \((y_1, y_2)\), then she should not indirectly reveal a preference for \((y_1, y_2)\) over \((x_1, x_2)\).

In Figure 6.5 below, we show three bundles and the associated budget lines which give rise to their choice. The consumption bundles are \((x_1, x_2)\), \((y_1, y_2)\), and \((z_1, z_2)\), and the corresponding budget lines are labeled \(B_x\), \(B_y\), and \(B_z\). The figure illustrates the idea of SARP, because \((x_1, x_2)\) is indirectly revealed preferred to \((z_1, z_2)\), but \((z_1, z_2)\) is not indirectly revealed preferred to \((x_1, x_2)\).

[Figure 6.5: Illustrating SARP. The bundle \((x_1, x_2)\) is indirectly revealed preferred to \((z_1, z_2)\), but not vice versa.]

We end this section with some hints about how revealed preference theory might be merged into the welfare economics analysis we have been doing in this chapter, as well as the consumer theory we did in the previous chapters.

First, consider the welfare comparison of a per unit tax and the equivalent lump sum income tax, which we discussed in the second section of this chapter, particularly in Figure 6.1 above. To understand how revealed preference relates, take another look at Figure 6.1, focusing on the three points \((x_1^*, x_2^*)\), \((x_1^{**}, x_2^{**})\), and \((x_1^{***}, x_2^{***})\), and the three budget lines in the figure, labeled “no tax,” “per unit tax,” and “lump sum tax.” Try to ignore the indifference curves in the figure.
Obviously the consumer is revealing her preference for the no tax consumption bundle \((x_1^*, x_2^*)\) over the other two bundles. Moreover, as Figure 6.1 is drawn, the consumer is directly revealing her preference for \((x_1^{***}, x_2^{***})\) over \((x_1^{**}, x_2^{**})\). Therefore, the lump sum tax policy is better than the per unit tax policy. Finally, the point \((x_1^{***}, x_2^{***})\) must lie where it does, to the right and below \((x_1^{**}, x_2^{**})\), because if it were to the left and above, WARP would be violated. The reader should figure out why.

Second, consider the discussion of the Slutsky substitution effect, from Section 5 of Chapter 4, particularly in Figure 4.9. A quick look back at that figure should convince the reader that the figure is virtually identical to Figure 6.1. Consequently, revealed preference analysis can be applied there as well. In particular, we can easily establish, without relying on indifference curves, that WARP implies that the Slutsky substitution effect is negative.

7 Welfare Economics 2: The Many-Person Case

7.1 Introduction

Most of the last chapter was about the well-being of a one-person society. It is appropriate to model society as having just one person (our Ms. Typical) if, for example, we want to decide between alternative tax policies that impact a homogeneous population (made up of many Ms. Typicals) in a uniform fashion. But if people are very different (with different preferences and income levels, for instance), and are differently affected by any particular policy choice, it is wrong to model society this way. In this chapter, we will assume that there are two or more people.

How do we determine whether policy A is better than policy B if there are various people affected by those policies, in various different ways? This is the crucial problem we now face. We touched on this problem in Section 6.5 of the last chapter, but now we explore it further.
We know from Chapter 2 that utility is an ordinal measure, and so it probably makes no sense to add together the utility levels of two or more people to get a social utility measure. But if this is so, is it possible to judge alternative government policies, institutions, or market structures by adding together numbers that in some way represent individual assessments of those alternatives? Economists use the idea of consumers’ surplus to do this, and we explore consumers’ surplus in this chapter.

We start off by quickly revisiting problematic Examples 1 and 2 from the last chapter, and then we introduce an assumption, called quasilinear preferences, or quasilinearity, which rules them out. We then carefully define consumer’s surplus for a single individual. We show that under the quasilinearity assumption, a consumer’s change in welfare as measured by compensating variation, equivalent variation, or the change in consumer’s surplus, all agree (which rules out Example 1). We show that under the quasilinearity assumption, the consumer’s demand for good 1 is independent of income (which rules out Example 2). Then we define consumers’ surplus, for two or more individuals, and present an example. We conclude with some comments on the restrictiveness of the quasilinearity assumption: while it is sometimes a good approximation to reality (for instance, when the good under study is subject to small income effects), at other times it is an unrealistic and inappropriate assumption.

7.2 Quasilinear Preferences

Recall that Chapter 6 ended with two examples which were slightly unsettling. To recapitulate, they were:

Example 1. Measuring one consumer’s welfare change in dollars, when \(p_1\) rises, for a simple product utility function. The consumer has utility function \(u(x) = x_1 x_2\). Her income is \(M = 18\), and the prices at the start are \(p_1 = 1\) and \(p_2 = 1\). Based on these prices, she chooses a utility maximizing bundle \(x^*\).
Suppose the first price rises to p1 = 2.25. Now she chooses a utility maximizing bundle x∗∗. She is worse off; her utility change is −45 (from 81 at x∗ = (9, 9) to 36 at x∗∗ = (4, 9)). By the compensating variation measure, she is worse off by $9. By the equivalent variation measure, she is worse off by $6. So the dollar measures of her welfare loss are somewhat inconsistent, and this can lead to paradoxical conclusions. These inconsistencies suggest potential measurement errors, and as will be explained below, can be traced back to the presence of income effects.

Example 2. Measuring society’s preference for one program over another, in dollars. Should schools teach music or art (if they cannot afford both)? The government must choose between policy M (music education) and policy A (art education). There are two regions of the country, the East and the West. Preferences are different for M and A in the two regions. Social welfare from the alternative policies is measured by aggregating naive willingness-to-pay. We call this “naive” because it is too simple, too deficient in informed judgment. Naive willingness-to-pay depends only on income levels; each family is willing to pay 2 percent of its income for the program it likes more, but only 1 percent of its income for the program it likes less. The problem is that this social welfare measure is too dependent on the income distribution. Society’s choice between M and A will depend on whether wealthy people live in the East or the West.

We now turn to an important assumption that prevents problems like these. There are situations in which this assumption will be a good approximation of reality, but there will be others when it won’t be; Section 7.6 will elaborate on this. The key is to assume that the goods available to consumers in society, and the utility functions of those consumers, have a special property. There is one good which enters everyone’s utility function in the same way, and it enters as a simple additive term.
For example, if the special good is apples, then each person’s utility can be written as

ui(everything) = vi(everything except apples) + apples.

Here the vi function represents person i’s utility from everything except apples. If this is the case, we can measure a change in i’s utility as an apple equivalent, and there is no problem in measurement, because an apple is an apple. (We are speaking loosely here; we do know about McIntosh versus Red Delicious, and fresh versus rotten!) Moreover, if we measure apples in 1 dollar units, then a change in i’s utility becomes a dollar equivalent, and there is again no problem in measurement, because a dollar is a dollar. Finally, if we are looking at social welfare rather than the welfare of one individual, instead of summing over utilities for the various people, which is not legitimate, we can sum over quantities of apples or quantities of dollars for the various people, which is perfectly legitimate!

More formally, we proceed this way. We will assume there are two goods. The first good enters the utility functions of different people in different ways; opinions are divided about it. But the second good enters everybody’s utility function as a simple additive term. We will assume that the second good is measured in 1 dollar units, so p2 = 1. (When we measure a good in units chosen so that 1 unit costs 1 dollar, we call it a numeraire good.) Although good 2 should be viewed as a real “good,” that is, as something the consumer consumes, it may also be a composite of various other things whose relative prices don’t change. We will call it the money good, although it should not be mistaken for the consumer’s income M.

Now let’s focus on one consumer. When we are considering just one person, we do not need a subscript i to identify her. Her utility u is a function of the quantities of the two goods which she consumes, (x1, x2).
We say that she has quasilinear preferences if her utility function can be written as

u(x1, x2) = v(x1) + x2.

Here v is an increasing and concave function of x1. Note that x2 enters our consumer’s utility function as a simple additive term. We say that the utility functions of the various individuals, as a group, satisfy quasilinear preferences, if they can all be written in this form, with the quantity of the second good entering in the same way in every utility function, but with the v(x1) terms generally differing between consumers.

At this point we can note that the quasilinearity assumption would rule out the simple product utility function used in Example 1.

7.3 Consumer’s Surplus

Note the location of the apostrophe in the heading of this section; we are again focusing on one person. Our consumer wants to maximize her utility subject to her budget constraint:

max u(x1, x2) = v(x1) + x2 subject to p1x1 + x2 = M.

The two equations that describe her choice are the budget line equation and the tangency condition

MRS = v′(x1)/1 = v′(x1) = p1/p2 = p1.

We assumed above that the function v is concave. This assumption guarantees that the consumer’s preferences are convex; that is, the consumer’s indifference curves have the standard curvature. If we move to the right on any indifference curve, it must get flatter and flatter. But quasilinearity implies much more. It implies that the consumer’s indifference curves have some very special properties. If we fix x1, her marginal rate of substitution is constant, equal to v′(x1), the marginal utility of the first good at x1. Therefore, if we graph some indifference curves in an x1, x2 picture, and if we go straight up from a fixed x1, (1) the slopes of all the indifference curves above the given x1 are the same. It is also the case that (2) if we take any two different indifference curves, the vertical gap between them is constant, no matter what x1 might be.
That is, under quasilinearity, any two indifference curves are vertical translations of each other. We say, somewhat loosely speaking, that the indifference curves are “parallel.” (The quotation marks are there because being parallel is usually defined as a geometric property of straight lines.) Showing point (2) will be left as an exercise at the end of this chapter. Figure 7.1 below shows a consumer’s indifference curves under the assumption of quasilinear preferences.

INSERT FIGURE 7.1 HERE
Caption of Fig. 7.1: Indifference map for quasilinear preferences.

Recall that in Example 1, as the consumer moves from x∗ to x∗∗, her compensating variation loss (of $9) and her equivalent variation loss (of $6) are different. This leads to a worrisome paradox. We have already noted that the quasilinearity assumption rules out the product utility function used in Example 1. But quasilinearity does more than rule out the Example 1 utility function; quasilinearity rules out the possibility of a difference between compensating variation and equivalent variation. Therefore it rules out the type of paradox encountered in Example 1. We will show this in Figure 7.2 below.

Figure 7.2 is based on the income and price assumptions of Example 1. However, we now assume a quasilinear utility function u(x1, x2) = v(x1) + x2. In Figure 7.2, as in Figure 6.3, compensating variation is the y to x∗∗ income effect move, based on the new prices, while equivalent variation is the x∗ to z income effect move, based on the old prices. Because of quasilinearity, y is directly above x∗∗, and x∗ is directly above z. Also because of quasilinearity, the vertical gap between the two indifference curves in the figure is constant. Since p2 = 1 and since the vertical gaps between y and x∗∗ and between x∗ and z are equal, compensating variation must equal equivalent variation, and the Example 1 paradox is impossible.

INSERT FIGURE 7.2 HERE
Caption of Fig.
7.2: The move from x∗ to x∗∗ under quasilinearity. No absurd result is possible.

At this point we will derive a consumer’s demand function under the quasilinearity assumption. The consumer wants to maximize v(x1) + x2, subject to p1x1 + x2 = M. (We are assuming p2 = 1.) From the tangency condition, we get v′(x1) = p1. We let (v′)^{−1} represent the inverse of the v′ function. Therefore x1 = (v′)^{−1}(p1) shows the consumer’s desired consumption of good 1, contingent on the price p1. That is, (v′)^{−1} is the consumer’s demand function for good 1. Her demand for good 2 is given by x2 = M − p1x1. We’ll assume for simplicity that the consumer’s income M is large enough that she spends positive amounts on both goods.

Now note that under quasilinearity, the consumer’s demand for good 1 is independent of her income M; her demand for good 1 depends only on p1, the price of good 1 (or more generally on the relative price of good 1, p1/p2). This rules out the objectionable outcome of Example 2.

We can easily draw the consumer’s demand curve for good 1. We simply graph the equation for the demand function, x1 = (v′)^{−1}(p1), or equivalently, we graph the equation for the inverse demand function, p1 = v′(x1). (There is just one graph for demand and for inverse demand; the only difference is that demand is read from vertical (price) to horizontal (quantity), and inverse demand is read from horizontal (quantity) to vertical (price).) See Figure 7.3 below, in which we also include a horizontal line at a particular price p∗1. When we read the graph from vertical to horizontal, it shows the amount of good 1 the consumer wants to consume at any particular price. When we read the same graph from horizontal to vertical, it shows, for each x1, the corresponding p1 = v′(x1). But v′(x1) = MRS.
So, for a given x1, the height of the (inverse) demand curve is the number of dollars or units of good 2 (remember we are assuming p2 = 1), that the consumer would be willing to give up in exchange for one more unit of good 1. (The “one more unit” is what economists often call the marginal unit; it’s the additional or incremental unit.) Naturally we call the height of the (inverse) demand curve the consumer’s willingness-to-pay for a marginal unit of good 1. It is important to note that the equation v′(x1) = p1 does not include the terms M or x2. Therefore neither the consumer’s demand for x1 (contingent on p1) nor her willingness-to-pay for an additional unit of good 1 (contingent on x1) depends on how much income she has, or on how much good 2 she is consuming. Consequently, willingness-to-pay as we use it here and in the rest of this chapter avoids the bad implications of naive willingness-to-pay as described in Example 2. Now consider the horizontal line in Figure 7.3. It crosses the demand curve at the point (x∗1, p∗1). When the consumer is paying a price of p∗1 and consuming x∗1 units of good 1, her willingness-to-pay is p∗1 for the additional or marginal unit, that is, for the x∗1th unit. But for any x1 < x∗1, the willingness-to-pay is higher. For the first unit, the second unit, and so on, her willingness-to-pay is the height of the demand curve at each of those points. In a sense, when she is buying x∗1 at price p∗1, she is getting a real bargain, because she’s getting the first unit, the second unit, and so on, at lower prices than what she is willing to pay. And this is true for every bit of good 1 that she is consuming up to the marginal unit, the x∗1th. We can now suggest a way to measure our consumer’s benefit from the situation at hand, of being able to buy and consume x∗1 units of good 1, (and simultaneously, buy and consume M−p∗1x∗1 units of good 2). 
Consumer’s surplus is the aggregate amount, over all units consumed, of the consumer’s willingness-to-pay for the additional units, minus the amount actually paid. This is the area in Figure 7.3 under the demand curve and above the horizontal line at p∗1. Note that under the assumption of quasilinearity, the consumer’s surplus is not sensitive to income levels or levels of consumption of the other good(s). And, finally, note that the consumer’s surplus is measured in dollars (or in units of good 2), rather than in utility units for our particular consumer. This will allow us to add together the consumer’s surpluses of two different consumers.

INSERT FIGURE 7.3 HERE
Caption of Fig. 7.3: The consumer’s willingness-to-pay for good 1, and the consumer’s surplus.

7.4 A Consumer’s Surplus Example With Quasilinear Preferences

We will now work out an example similar to Example 1, but with quasilinear preferences. Remember that Example 1 produced a disturbing inconsistency between compensating variation and equivalent variation, and a strange paradox. In the example to which we now turn, we will see that the compensating variation and equivalent variation measures are exactly equal (making inconsistencies and paradoxes impossible), and that they are precisely equal to the change in the consumer’s surplus.

Example 3. A quasilinear utility function, with a rise in p1. Compensating variation, equivalent variation, and the change in consumer’s surplus. As in Example 1, we assume the consumer’s income is M = 18, and the prices at the start are p1 = 1 and p2 = 1. As before, the change will be an increase in the price of good 1, to p1 = 2.25. In contrast to Example 1, we now assume the utility function is u(x1, x2) = ln x1 + x2. Note that u(x1, x2) is quasilinear. Setting MRS equal to the price ratio gives

\[ MRS = \frac{MU_1}{MU_2} = \frac{1}{x_1} = \frac{p_1}{p_2} = p_1. \]

Therefore the consumer’s demand function for good 1 is x1 = 1/p1.
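As a rough numerical check on this demand function, the following Python sketch maximizes the Example 3 utility along the budget line for several income levels. The function names (`utility`, `demand_good1`), the extra income levels 50 and 200, and the use of a ternary search are illustrative assumptions, not part of the text; the search is valid here only because the objective is concave in x1.

```python
import math

def utility(x1, x2):
    """Example 3 utility: u(x1, x2) = ln(x1) + x2 (quasilinear)."""
    return math.log(x1) + x2

def demand_good1(p1, M, tol=1e-10):
    """Numerically find the utility-maximizing x1 on the budget line.

    Substituting x2 = M - p1*x1 (with p2 = 1) leaves a concave function
    of x1 alone, so a ternary search over (0, M/p1) locates its maximum.
    This is a hypothetical helper for illustration only.
    """
    lo, hi = 1e-12, M / p1
    while hi - lo > tol:
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if utility(a, M - p1 * a) < utility(b, M - p1 * b):
            lo = a   # the maximum lies to the right of a
        else:
            hi = b   # the maximum lies to the left of b
    return (lo + hi) / 2

# Compare the numerical optimum with the closed form x1 = 1/p1
# at the Example 3 income (18) and two assumed higher incomes.
for M in (18, 50, 200):
    for p1 in (1.0, 2.25):
        x1 = demand_good1(p1, M)
        print(f"M={M:>3}, p1={p1:<4}: x1 = {x1:.4f}  (1/p1 = {1 / p1:.4f})")
```

In this sketch the computed x1 should track 1/p1 at every income level, up to the numerical tolerance of the search.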
Note that her demand for good 1 is independent of her income M. At the start, when p1 = 1 and p2 = 1, she will choose x∗1 = 1, and since her budget constraint is x1 + x2 = M = 18, x∗2 = 17. That is, her initial utility maximizing bundle is x∗ = (x∗1, x∗2) = (1, 17). Her initial utility level is u(x∗) = ln 1 + 17 = 17.

Now we assume the price of good 1 rises to p1 = 2.25, while M and p2 remain the same. The consumer is worse off, since she can no longer afford the bundle she had been consuming before. We first use the demand function to find her new desired quantity of good 1: x∗∗1 = 1/p1 = 1/2.25 = 4/9. Then we use the budget constraint, 2.25(4/9) + x2 = 18, which gives x∗∗2 = 17. Therefore her new utility maximizing bundle is x∗∗ = (x∗∗1, x∗∗2) = (4/9, 17). Her new utility level is u(x∗∗) = ln(4/9) + 17 = 17 − ln(9/4) ≈ 16.19.

In the move from the old point x∗ to the new point x∗∗, her utility drops by 17 − 16.19 = 0.81 utility units. As was the case with Example 1, this information is not particularly helpful, because we don’t know how to interpret a utility unit.

We now want to measure her welfare change in terms of the compensating variation measure (the income effect based on the new prices) and the equivalent variation measure (the income effect based on old prices). It will help to refer to Figure 7.2, which shows all the relevant points and lines, although the horizontal and vertical coordinates are off because that figure was not drawn for this particular quasilinear utility function. The original consumer optimum is at x∗ = (1, 17) and the new consumer optimum is at x∗∗ = (4/9, 17).

To find the compensating variation measure of the consumer’s loss, we need to find the point corresponding to y in Figure 7.2. We know that y is on the old indifference curve, so ln y1 + y2 = u(x∗) = 17. We also know that y is directly above x∗∗, and so y1 = x∗∗1 = 4/9.
Therefore the vertical coordinate of y is y2 = 17 − ln(4/9) = 17 + ln(9/4). Therefore the vertical gap between y and x∗∗ equals 17 + ln(9/4) − 17 = ln(9/4) ≈ 0.81. Since p2 = 1, the compensating variation measure of the consumer’s loss is $0.81.

The equivalent variation measure is based on the vertical gap between x∗ and z in Figure 7.2. We will not go over the detailed calculations, which are obviously very similar to what we just did. Remember, though, that under the quasilinearity assumption, the vertical distance or gap between two indifference curves is constant as x1 varies. In short, the equivalent variation measure of the consumer’s loss is also ln(9/4), or $0.81.

Next, we will figure out the consumer’s loss of consumer’s surplus as p1 changes from 1.0 to 2.25. Figure 7.3 shows consumer’s surplus as the area under a demand curve (or, more precisely, an inverse demand curve) but above p∗1, for a consumer consuming x∗1 units of good 1 when the price is p∗1. Figure 7.4 below applies this method to our Example 3 consumer, whose inverse demand function is p1 = 1/x1.

In Figure 7.4, when the price is p1 = 1, consumer’s surplus is the area under the inverse demand curve but above p1 = 1. When the price is p1 = 2.25, consumer’s surplus is the area under the inverse demand curve but above p1 = 2.25. The change in consumer’s surplus is therefore the roughly trapezoidal difference between these two areas, shown with cross-hatching in the figure.

We need to find the area of the cross-hatched region. Let’s call it C.S. Note first that C.S. is the sum of the area of a (cross-hatched) rectangle, and the area of a (cross-hatched) roughly triangular region ABC. The area of the cross-hatched rectangle is (4/9) × (9/4 − 1) = (4/9) × (5/4) = 5/9. To find the area of the ABC rough triangle in Figure 7.4, we first take the integral of the inverse demand function, going from x1 = 4/9 to x1 = 1.
This gives us the area below the inverse demand curve, down to the horizontal axis, between x1 = 4/9 and x1 = 1. Then we subtract the area of the rectangle immediately below the ABC rough triangle, which is (5/9) × 1 = 5/9. In short, C.S. will equal the integral of the inverse demand curve from x1 = 4/9 to x1 = 1, minus 5/9 (for the area of the non-cross-hatched rectangle below ABC), plus 5/9 (for the area of the cross-hatched rectangle to the left of ABC). That is, C.S. equals the integral of the inverse demand curve from x1 = 4/9 to x1 = 1. We now have

\[ \text{C.S.} = \int_{4/9}^{1} \frac{1}{x_1}\, dx_1 = \ln 1 - \ln\left(\frac{4}{9}\right) = 0 + \ln\left(\frac{9}{4}\right) \approx 0.81. \]

We conclude that when p1 rises from 1 to 2.25, our consumer with quasilinear preferences suffers a loss. Whether we calculate her loss using the compensating variation measure, the equivalent variation measure, or the consumer’s surplus measure, we get the same answer, $0.81. And 0.81 is also her loss in utility units.

INSERT FIGURE 7.4 HERE
Caption of Fig. 7.4: The cross-hatched area is the loss of consumer’s surplus when p1 rises from 1 to 2.25. For this quasilinear preferences example, compensating variation, equivalent variation, and consumer’s surplus all agree.

7.5 Consumers’ Surplus

Note the changed location of the apostrophe. We are now discussing a social measure, and allowing for many people.

Suppose we have a group of consumers with quasilinear preferences. For each consumer, separately, we do the exercise described above. This will produce a demand curve for each consumer, showing consumer i’s demand for good 1, contingent on p1, the price of good 1. We write this as xi1 = (v′i)^{−1}(p1). We then add all the demand functions together. That is, for each p1, we add together the amounts demanded by the various consumers. (Since economists put good 1 on the horizontal axis and p1 on the vertical axis, this is sometimes called “adding the demand curves horizontally.” See Chapter 4.)
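The horizontal summation just described can be sketched in a few lines of Python. The first consumer uses v(x1) = ln x1 from Example 3; the second consumer, with v(x1) = 2√x1, is a purely hypothetical addition chosen because it is increasing and concave and has a simple closed-form demand. All function names are illustrative assumptions.

```python
def demand_consumer_1(p1):
    """Demand from v1(x1) = ln(x1): v1'(x1) = 1/x1 = p1, so x1 = 1/p1."""
    return 1.0 / p1

def demand_consumer_2(p1):
    """Demand from the hypothetical v2(x1) = 2*sqrt(x1):
    v2'(x1) = x1**(-0.5) = p1, so x1 = 1/p1**2."""
    return 1.0 / p1 ** 2

def market_demand(p1, individual_demands):
    """Add the quantities demanded at the common price p1 (horizontal sum)."""
    return sum(demand(p1) for demand in individual_demands)

consumers = [demand_consumer_1, demand_consumer_2]

# Tabulate individual and market quantities at a few assumed prices.
for p1 in (0.5, 1.0, 2.0):
    quantities = [demand(p1) for demand in consumers]
    print(f"p1={p1}: individual = {quantities}, market = {market_demand(p1, consumers)}")
```

Note that no income argument appears anywhere in the sketch: under quasilinearity each individual demand, and therefore the sum, depends on p1 alone.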
Let X1(p1) be the resulting market demand curve. Based on the assumptions we have been making in this section, the market demand curve is independent of the incomes of the various consumers and of their consumption levels for the other good.

As was the case with individual demand curves, we can read points on the market demand curve X1(p1) in two ways. We can read them from the vertical coordinate (price) to the horizontal coordinate (quantity). This reading says that at each price p1, there is a market demand X1(p1) that is the sum of the individual demands of all the consumers: X1(p1) = x11(p1) + x21(p1) + ... Or, we can read them from the horizontal coordinate (total quantity demanded) to the vertical coordinate (price), with the understanding that the total demand is based on particular amounts demanded by consumer 1, consumer 2, and so on. With this reading, for a given X1, there is an underlying list of desired quantities of good 1 for the various consumers, which sum to X1. In this list, consumer 1’s willingness-to-pay for an additional unit is equal to p1, and consumer 2’s willingness-to-pay for an additional unit is also equal to p1, and so on, for all the consumers.

Now fix the price p1 at p∗1. There is a corresponding market demand X∗1. For each of the consumers, there is an individual demand curve, and an amount of consumer’s surplus associated with the consumer being able to buy her desired amount of good 1 at the price p∗1 per unit. That amount of consumer’s surplus for consumer i is in money units or units of good 2. Adding together money amounts of consumer’s surplus over all the consumers makes perfectly good sense. Doing so produces the consumers’ surplus, which is simply the aggregate of the individual consumer’s surpluses.

Finally, the economist can find consumers’ surplus in a simple graphical way without looking at the separate demand curves of the various consumers.
For the fixed p∗1, simply find the area under the market demand curve, above the horizontal line at p∗1. That is, take the market demand function, reading it from the horizontal axis (quantity) to the vertical axis (price), and integrate it from X1 = 0 to X1 = X∗1. Then subtract p∗1X∗1. The result is consumers’ surplus.

The formal mathematical definition is as follows. Let X1(p1) represent the market demand function (reading from price to market quantity), and let V1(X1) represent the inverse market demand function (reading from market quantity to price). Let p∗1 be a given market price, and suppose X∗1 is the corresponding market quantity. We abbreviate consumers’ surplus Cs.S. (note the plural “consumers”). Consumers’ surplus, given (p∗1, X∗1), is now defined as:

\[ \text{Cs.S.} = \int_{0}^{X_1^*} V_1(X_1)\, dX_1 - p_1^* X_1^*. \]

In Figure 7.5 below, we show two consumers’ individual demand curves for good 1, assumed here for the sake of simplicity to be linear, and we show a market demand curve, which is drawn as the (horizontal) sum of the two individual demand curves. The market price is at p∗1. Consumers’ surplus is shown in the graph of the market demand curve. We leave it as an exercise for the student to show that in Figure 7.5, consumers’ surplus equals consumer 1’s surplus plus consumer 2’s surplus.

INSERT FIGURE 7.5 HERE
Caption of Fig. 7.5: Consumer’s surplus for two consumers, and consumers’ surplus.

Example 4. Build a bridge? Here is a rather typical application of consumers’ surplus: Let’s suppose the government is considering building a bridge. There is just one size, and it will either be built or not be built. It will cost $1,000,000 (per unit time) if built, and nothing if not built. Assume quasilinear preferences. Local econometricians have estimated a demand curve for the bridge, given by X(p) = 1,000,000 − 200,000p, where p is the price charged by the government for use of the bridge, per unit time, or the user fee.
The government wants to build the bridge if it is socially worthwhile to do so. Building the bridge is socially worthwhile if and only if the net social benefit is non-negative, that is, if and only if consumers’ surplus given the price, plus the net profit to the government (revenue minus cost, or pX(p) − $1,000,000) is greater than or equal to zero. Should the bridge be built? If so, what price should be charged?

To determine whether or not the bridge should be built, we can calculate consumers’ surplus at the user fee which maximizes the consumers’ surplus, namely p = 0. The consumers’ surplus triangle at p = 0 has a height of $5 and a base of 1,000,000, and therefore an area (one half base times height) of $2,500,000. This far exceeds the cost of building the bridge, and so the bridge certainly should be built.

However, the government might be reluctant to build the bridge and charge a user fee of $0. What if it requires a p just high enough to cover its cost of building the bridge? In that case, it requires that the fee revenue cover costs, or pX(p) = 1,000,000. Dividing both sides by 200,000 leads to 5p − p^2 = 5, and taking the smaller root gives p = (5 − √5)/2 ≈ 1.382. But is charging p = $1.382 the best policy? We leave it as an exercise for the student to compare the net social benefit (consumers’ surplus plus government revenue minus the cost of the bridge) under three alternative policies: build the bridge and charge a user fee of p = $0, or build the bridge and charge a user fee of p = $1.382, or build the bridge and charge the user fee that maximizes government revenue.

7.6 A Last Word on the Quasilinearity Assumption

Our measures of consumer’s surplus (one person) and consumers’ surplus (many people) make good sense under the assumption of quasilinearity, but may be objectionable without that assumption.
We shouldn’t be complacent about this, and shouldn’t take the quasilinearity assumption lightly, because lots of plausible utility functions are not quasilinear. For example, the utility function we used in Example 1, u(x1, x2) = x1x2, is not quasilinear, and neither are commonly used variations on the same theme, such as u(x1, x2) = x1^α x2^β. (This is called a Cobb-Douglas utility function, after similarly defined production functions that will be discussed in Chapter 9.) These utility functions, and many others, simply cannot be rewritten as v(x1) + x2, and their indifference curves do not have the properties of quasilinear indifference curves that we used in this chapter, namely that the MRS is constant for fixed x1, and that for any two indifference curves, the vertical gap between them is constant as x1 varies. For utility functions like these, income effects on good 1 are not zero. Then, as Example 1 showed, compensating variation and equivalent variation measures produce paradoxical results, and demand functions for good 1 depend on income M. In short, quasilinearity is a big assumption, which is not likely to be true for a utility function blindly picked out of a hat, and we should not take it lightly.

Economists sometimes try to avoid the quasilinearity assumption by measuring consumer’s surplus and consumers’ surplus with compensated demand curves. With such demand curves, the income effects have already been teased out, and so the objectionable results of Examples 1 and 2 may be avoided. In a similar fashion, economists may try to avoid problems in the logic of the consumer’s and consumers’ surplus by focusing on goods for which income effects are minor (goods like zucchini, apples, toothpaste, or the services of a bridge used in the commute to work).
But we should be aware of the limitations of the concept when applying it to markets with significant income effects (such as housing, healthcare, etc.).

And to return to the issue raised in Example 2, the quasilinearity assumption makes our conclusions independent of the income distribution. Quasilinearity makes demand independent of income, and in effect makes the marginal utility of income equal to 1 for every consumer. But are we really willing to say that an additional dollar of income is valued the same by Bill Gates and by the poorest person in your hometown?

7.7 A Solved Problem

The Problem

The small state of Rhode Island in the U.S. is planning to build a state of the art highway from Providence to Newport. The aggregate demand for highway services between the two cities is given by x = 100,000 − 20,000P, where x measures the number of cars and P the toll charged to the user. Assume quasilinear preferences. The total cost of the highway is estimated as $300,000. (All prices and costs quoted are per unit time.) The state has decided not to charge a toll on this road. The project will go ahead if consumers’ surplus covers at least the total cost of the project.

(a) Should the highway be built?

(b) Newport plans an advertising campaign (“Come to Newport – The City By the Sea!”) that will cost $10,000, and will cause the highway demand to double to x = 200,000 − 40,000P. Should the ad campaign and the highway project be carried out?

The Solution

(a) The demand curve is linear. Its vertical intercept is P = 100,000/20,000 = 5, and its horizontal intercept is x = 100,000. Consumers’ surplus is the area under the demand curve and above the horizontal axis (where P = 0). Therefore consumers’ surplus is the area of a triangle with height 5 and base 100,000, or (1/2) · 5 · 100,000 = $250,000. Since this is less than the $300,000 cost, the highway shouldn’t be built.

(b) If the advertising campaign happens, the demand curve will shift out.
The new vertical intercept is P = 200,000/40,000 = 5, which is the same as the old intercept, and the new horizontal intercept is x = 200,000. The new consumers’ surplus is the area of a triangle with height 5 and base 200,000, or (1/2) · 5 · 200,000 = $500,000. Since $500,000 is greater than the sum of the cost of the highway ($300,000) and the cost of the advertising campaign ($10,000), both the ad campaign and the highway project should be carried out.

Exercises

1. Assume a consumer has quasilinear preferences. Consider two of her indifference curves, corresponding to u(x1, x2) = 5 and u(x1, x2) = 10. Show that the vertical distance between the two indifference curves remains constant, no matter what x1 might be.

2. In Example 3, compensating variation, equivalent variation, and the change in consumer’s surplus were all equal to −$0.81. The change in utility was −0.81 utility units. Can you explain why the change measured in utility units was identical to the change measured in dollars?

3. In Figure 7.5, prove (using simple geometry) that consumers’ surplus equals consumer 1’s surplus plus consumer 2’s surplus.

4. Consider again the story of the bridge in Example 4.
(a) Calculate the net social benefit when p = 0 and when p = (5 − √5)/2 ≈ 1.382.
(b) Find the price that would maximize government revenue from the bridge, pX(p). If the government chooses the price that maximizes revenue, what is the net social benefit?
(c) Explain this claim: since the cost of the bridge is fixed at $1,000,000, the net social benefit must be maximized when p = 0. Can you find a formula for the net social benefit?

5. Consider the utility function of Example 1, u(x) = u(x1, x2) = x1x2. Assume again that the consumer has M = 18, that the prices are p1 = 1 and p2 = 1 to start, and that the price of good 1 rises to p1 = 2.25. Recall that the consumer’s demand function for good 1 is x1 = M/(2p1).
Find the loss of consumer’s surplus resulting from the rise in p1. (Hint: Look at the methodology of Example 3, sketch a picture like Figure 7.4, and integrate.)

6. Carter’s utility function is u(x, y) = 10x + (1/3)x^3 + y.
(a) Find his demand function for x, x(px). How many units of x does he demand when px = 1 and py = 1?
(b) Find his inverse demand function for x, px(x). What is his consumer surplus?
(c) Suppose the price of x rises to px = 6 while py is unchanged. How many units of x does he demand now? What is his new consumer surplus?

Part II Theory of the Producer

8 Theory of the Firm 1: The Single-Input Model

8.1 Introduction

Production is the transformation of inputs into outputs. The production process typically takes place within firms. They buy or hire various inputs, and combine them using available technology to produce various outputs of goods and services. Then they sell the outputs they produce. For example, a firm that makes video games hires different kinds of labor (game experts, programmers, sales people, accountants, lawyers, and so on) and buys or rents various capital goods (office space, computer equipment, internet access, furniture, and so on) to make and market games. A farm, whose land and machinery are more or less fixed in the short term, employs labor to produce corn. In the farm example, it’s plausible to think of the production process as one that uses one input to produce one output.

In this chapter, we will develop a simple production model with just one input and one output; we call it the single-input/single-output model. At the end of the chapter we will briefly describe a model with a single input and multiple outputs (most firms in reality produce many outputs) and we will provide techniques for solving its profit maximization problem.
Later, in the next chapter, we will move on to the case of the production of a single output with multiple inputs, the multiple-input/single-output model.

Focusing on the simple single-input/single-output model is definitely not the usual textbook approach. Most books on microeconomics start with a two-input/one-output model, the kind that we will cover in our next chapter. We think that either approach (single-input/single-output or multiple-input/single-output) can be used to introduce a reader to the most important implications of the theory of the firm. In this book we give the reader a choice.

When we developed the theory of the consumer in Chapters 2 and 3, we modeled his goals (finding a most-preferred bundle, or maximizing utility) and his constraints (the budget constraint). Similarly, we will now model the goals and the constraints for a typical firm. As for goals, we will assume that the firm wants to maximize its profit, that is, revenue minus the cost of production. There may be some debate about this assumption. For instance, some analysts assume that firms want to maximize market shares, rather than profits, or that the managers of some firms may be more interested in maximizing their own compensation levels, rather than their firms’ profits. Moreover, there are many important institutions in society which are explicitly non-profit, including most government institutions, and many schools, charities, universities and hospitals. Our model may not fit them well at all. But economists feel that the profit motive (money, money, money) usually motivates private firms that are in the business of producing and selling goods and services.

What about the firm’s constraints? There are two kinds. First, there are technological constraints. This means that the firm must work with an existing body of scientific and technical knowledge, and the restrictions imposed by nature.
For example, it is impossible for Federal Express, no matter how effectively managed, to deliver a package on the day before it was sent. That’s due to a law of nature. And it’s impossible for Pfizer (currently the world’s largest pharmaceutical firm) to make a pill that cures all forms of cancer. That’s because it cannot be done given the reality of science and technology. Such restrictions are embodied in the idea of a production function. A production function is a mathematical description of how the firm can transform inputs into outputs, given the technological constraints. Second, there are market constraints. These are the constraints on the prices and quantities of the inputs the firm uses, and on the prices and quantities of the outputs the firm sells. We will start this chapter by describing the single-input/single-output model. In Section 2, we will assume the firm’s output is its choice variable, and in Section 3, we will assume the firm’s input is its choice variable. In Section 2, we will derive the firm’s supply function for its output, or its supply curve, and in Section 3, we will derive the firm’s demand function for its input, or its demand curve. In Section 4, we will consider the case of many outputs, that is, the single-input/multiple-output model. 8.2 The Competitive Firm’s Problem, Focusing on Its Output The Production Function. We assume now that the firm produces one output using one input. The output quantity is y; the input quantity is x. (For instance, the farm produces y bushels of corn using x units of labor.) The technological constraints on the firm are represented by its production function y = f(x). The function shows the maximum output y the firm can produce if it uses x units of the input. The first basic assumption we make about the production function is monotonicity. This means that as x increases, y also increases. 
Formally, the first derivative of the production function, \(f'(x)\), is positive. This is a very plausible assumption in most situations. Think of the farm that grows corn: more labor on the farm means more corn. Of course, if the farm is 10 acres, and the farm already employs so many workers that they cannot physically fit on 10 acres, then monotonicity is implausible. But we will assume that firms try to make money, and a firm interested in making money would never use so many workers that an additional unit of labor produces negative additional output.

The second basic assumption that we make about the production function has to do with its curvature. We have two alternative versions of this assumption: a simple version and a more realistic version. The simple assumption is concavity. Concavity says that while increases in x lead to increases in y, the increases get smaller and smaller as x gets bigger and bigger. Mathematically, while the first derivative of the production function, \(f'(x)\), is positive, the second derivative, \(f''(x)\), is negative. This is sometimes described as the assumption of “diminishing returns.” Think of that 10-acre farm. Suppose there are n workers, and one worker is added. This will result in some increment of corn, say 100 bushels. But going from n + 1 to n + 2 workers will result in a smaller increment of corn, say 90 bushels, and so on and so forth.

David Ricardo (1772-1823), a descendant of Sephardic Jews from Portugal and an eloquent member of the British Parliament, was one of the great classical economists (along with Adam Smith (1723-1790) and Thomas Malthus (1766-1834)). He once gave a speech against the corn laws, which put tariffs on grain imports to Britain. In the speech he justified the assumption of diminishing returns.
He argued that if it were not for diminishing returns, one could feed all of England, and all the world for that matter, simply by putting more and more labor into raising grain planted in one flower pot. His reductio ad absurdum argument convinced his opponents that diminishing returns are real, but the corn laws lived on until 1846.

The more realistic curvature assumption is that the production function is convex at first, and then turns concave. More formally, \(f'(x)\) is always positive, but \(f''(x)\) is at first positive, passes through zero, and then becomes (and stays) negative. That is, when x is small, \(f'(x)\) is increasing as you increase x, but when x is big, \(f'(x)\) is decreasing. This is sometimes described as “increasing returns” when the firm is small, followed by “diminishing returns” when the firm is large. The firm becomes more and more efficient as it grows from size zero, in the sense that the incremental output of an additional worker gets greater and greater, until it reaches maximal efficiency, in the sense that the incremental output of an additional worker is maximized, after which it becomes less and less efficient as it grows beyond that size. There is evidence that real production functions do have this property, and so we will call it the real-world convexity/concavity assumption, or the real-world assumption for short. For reasons which will soon become clear, this is also called the U-shaped average cost curve assumption.

In Figure 8.1 below we draw two production functions; the first is always concave. The second shows the real-world case, with a production function that is convex at first and then becomes concave.

We will also assume f(0) = 0. This means that our firm has the option of choosing to use no input x and produce no output y. That is, our firm can shut down, hire no input, and sell nothing. If it does this, it has zero profit.

INSERT FIGURE 8.1 HERE
Caption of Fig.
8.1: Two production functions: the always concave case and the real-world case.

Price or Market Constraints. So far we have talked about the technological constraints. We now turn to the market constraints. At this point we will assume that our firm operates in perfectly competitive markets. When we say that a firm (or, for that matter, a consumer) operates in a competitive market, we mean that the firm (or the consumer) takes the price as given. We will assume that our firm is competitive in the market where it buys or hires its input x, and that it is competitive in the market where it sells its output y. Let w represent the input price (w suggesting “wage”) and let p represent the output price. Our competitive firm takes both w and p as given and fixed, that is, beyond its control. The firm acts as a price-taker when it is deciding how to maximize profits.

The assumption that the firm is competitive in the markets for its input and for its output is especially reasonable for small firms (small in the sense that they only use a small fraction of the input good sold to various firms, and only provide a small fraction of the output good sold by the various firms that sell the same good). For instance, a corn farmer in the Midwestern U.S., even one with a 1,000-acre farm, will produce only a minute fraction of the corn produced each year for the U.S. market. The farm will be competitive in the market for its output. It will also probably only use a small part of the labor input available in the local labor market. So the small farmer is competitive in its input market and in its output market. On the other hand, there are some very large firms that buy corn, and some of them may be so large that their decisions about how much to buy will affect the market price for corn. We would not call them competitive in the corn market.
Note that this example illustrates that in any market there are buyers and there are sellers, and there may be competitive behavior or non-competitive behavior on either side of the market.

Profit. Let’s now think more carefully about profit. Profit is the difference between revenue and cost. We will write \(\pi\) for profit. Profit is usually measured as money per unit time, or what is called a flow, such as “$500 per month” or “$1 billion per year.” However, we drop the time reference when no confusion results. Revenue is the money that comes into the firm from the sale of its output. In our case, revenue is \(p\,y(x)\), where \(y(x) = f(x)\) is the output produced from x. Cost is the money that goes out of the firm because of its purchase of its input. In our case, cost is \(wx\). Like profit, revenue and cost are both money amounts per unit time, but we will usually drop the time reference.

Our firm wants to select the input quantity x, and/or the output quantity y, that will maximize profit:
\[\pi = py - wx.\]
Note that we can substitute \(y(x)\) for y in the above equation, in which case we are left with the simple problem of finding the x that maximizes profit, as a function of x:
\[\pi(x) = p\,y(x) - wx.\]
Alternatively, we can look at the inverse of the production function \(x = f^{-1}(y)\). We can use this function to substitute for x in the expression for profit, and then we are left with the simple problem of finding the y that maximizes profit, as a function of y:
\[\pi(y) = py - wf^{-1}(y).\]
In this section, we take the second approach. We “solve out” the input variable x, treat profit as a function of output y, and solve the profit maximization problem by finding the y that maximizes \(\pi(y)\).

Consider again the inverse of the production function \(x = f^{-1}(y)\). For a given output level y, it shows the amount of the input x that the firm must use to produce y. Note that when there is only one input, the firm doesn’t have much to think about. If it wants to produce and sell y, it must use \(x = f^{-1}(y)\).
However, when we analyze the behavior of the firm that uses two or more inputs, then this stage of the decision making will become much more interesting and complicated, because the firm will have to decide, for a given level of output y, what combination of inputs will produce that output at least cost. We put off this question to the next chapter. For now, if the firm is to produce and sell y, it must buy or hire x. Looked at this way, the function \(f^{-1}(y)\) shows what’s called the firm’s conditional input demand or conditional factor demand. Figure 8.2 shows this factor demand function for the real-world case of a production function that is convex at first, and then turns concave. Note that the function shown in Figure 8.2, which is the inverse of the second function shown in Figure 8.1, can be found by flipping the axes in Figure 8.1: just put x on the vertical axis and y on the horizontal axis. Also note that this function starts out concave, and then turns convex.

INSERT FIGURE 8.2 HERE
Caption of Fig. 8.2: The conditional factor demand in the real-world case.

Total Cost, Average Cost, and Marginal Cost. Back to the problem at hand, which is to maximize the firm’s profit (as a function of y), that is, to maximize
\[\pi(y) = py - wf^{-1}(y).\]
We will call \(wf^{-1}(y)\) the firm’s total cost function and write it as \(C(y)\). This is the cost in dollars (or other currency) that the firm must pay if it wants to produce y units of the output. We show a graph of the total cost function, or the total cost curve, in Figure 8.3 below. Notice that the total cost curve is simply a translation of the conditional input demand curve in this simple single-input model. In Figure 8.3, we also show a point P on the total cost curve, and, at P, a ray from the origin through P, labeled l1, and a line l2 that is tangent to the total cost curve at P. The horizontal component of P is identified as y.
The point Q in the graph, corresponding to output \(y^*\), is where a ray from the origin is tangent to the total cost curve.

INSERT FIGURE 8.3 HERE
Caption of Fig. 8.3: The total cost curve in the real-world case.

Now consider the straight lines in Figure 8.3. First, look at l1. The slope of l1 is important; it equals the height of P, which is the total cost of producing the quantity y, divided by the horizontal coordinate of P, which is y. That is, the slope of l1 is the total cost of producing the given amount, divided by that amount. In short, it is the average cost of producing the given amount. Next, consider the slope of l2 at P. This is the slope of a tangent line to the total cost curve, or the slope of the total cost curve itself. It is the rate of change of total cost as the quantity changes, or intuitively, the extra cost of producing an additional unit, given that the firm is at the point P. This is the marginal cost at the given quantity.

More formally, for any quantity y, average cost, written AC(y), is defined by
\[AC(y) = \frac{C(y)}{y}.\]
For any quantity y, marginal cost, written MC(y), is defined as the derivative of total cost at y, or
\[MC(y) = C'(y) = \frac{dC(y)}{dy}.\]

Looking at Figure 8.3, and focusing on the slopes of the lines l1 and l2, should convince the reader of several important things. First, average cost starts as a large positive number (the slope of the total cost curve at the origin), then declines monotonically until it reaches a minimum at the point Q in the figure (corresponding to output \(y^*\)). After that it rises monotonically. Second, marginal cost starts as a large positive number (again the slope of the total cost curve at the origin), declines monotonically until it reaches a minimum (in Figure 8.3, at the point P), and then rises monotonically. Third, at the point where average cost reaches its minimum (that is, at the point Q), average cost and marginal cost must be equal.
Fourth, to the left of that point, average cost exceeds marginal cost, and to the right of that point average cost is less than marginal cost. Based on these observations about the important real-world case, we can draw the average cost and marginal cost curves of Figure 8.4 below.

INSERT FIGURE 8.4 HERE
Caption of Fig. 8.4: Marginal and average cost curves in the real-world case.

The relationship between average cost and marginal cost in the real-world case can be easily derived mathematically. Recall that \(AC(y) = C(y)/y\). Differentiating gives
\[\frac{dAC(y)}{dy} = \frac{yC'(y) - C(y)}{y^2}.\]
At the bottom of the average cost curve, this derivative has to be zero, which gives
\[\frac{yC'(y) - C(y)}{y^2} = 0, \quad\text{or}\quad MC(y) = C'(y) = \frac{C(y)}{y} = AC(y).\]
Therefore when average cost is at its minimum, average cost is equal to marginal cost. To the left of the average cost minimum point, average cost is declining as y increases, or
\[\frac{dAC(y)}{dy} = \frac{yC'(y) - C(y)}{y^2} < 0.\]
This leads immediately to \(C'(y) < C(y)/y\), or marginal cost is less than average cost. And to the right of the average cost minimum point, we easily see from an almost identical argument that marginal cost is greater than average cost.

Profit Maximization With Output as the Choice Variable. We are now ready to solve the profit maximization problem for the competitive firm:
\[\max_y\ \pi(y) = py - wf^{-1}(y) = py - C(y).\]
Note that y is constrained to be greater than or equal to zero, and remember that we are assuming that the firm has the option of choosing x = 0, in which case y = 0, and therefore \(\pi = 0\). Once we have found the solution to this problem for every output price p, we will have the supply curve of the firm; that is, for each p, we will have the output y that the firm will supply to the market.

We start by observing that since the firm can choose \(\pi = 0\), the y it chooses must produce non-negative profits. Therefore \(\pi(y) = py - C(y) \ge 0\).
Moving C(y) to the right-hand side and dividing both sides of the inequality by y (for y > 0) gives \(p \ge AC(y)\), which in turn implies
\[p \ge \min AC(y).\]
We conclude that if the market price p for the output doesn’t even cover the firm’s minimum average cost of producing its output, the firm won’t produce anything. In Figure 8.5 below, we show a “produce-nothing” situation. In the figure, we consider the possible choice of an arbitrarily chosen \(y'\), but at that point (as at all the points), the price is less than average cost.

INSERT FIGURE 8.5 HERE
Caption of Fig. 8.5: \(p < \min AC(y)\) leads to zero supply.

Now we assume that the market price p is high enough to cover minimum average cost. How much output should the firm produce? The first order condition for maximizing profit says that the derivative of \(\pi(y)\) should be zero. Of course, setting the derivative of a function equal to zero will find the function’s minima as well as its maxima. To find the maximum of a function, we set the first derivative equal to zero (the first order condition), and we also require the second derivative to be less than or equal to zero (the second order condition).

The first order condition for profit maximization is
\[\frac{d\pi(y)}{dy} = p - \frac{dC(y)}{dy} = p - MC(y) = 0.\]
This gives the crucial basic rule for profit maximization for a competitive firm: price equals marginal cost, or
\[p = MC(y).\]
The second order condition for profit maximization is
\[\frac{d^2\pi(y)}{dy^2} = \frac{d(p - MC(y))}{dy} \le 0,\]
and this leads directly to
\[\frac{dMC(y)}{dy} \ge 0.\]
In short, at a profit-maximizing point, price equals marginal cost, and marginal cost is rising (or at least not falling). The marginal cost curve cannot be downward sloping at the point of maximum profit.

In sum, profit maximization for a competitive firm implies the following:
(1) If p is less than the minimum of average cost, the firm produces nothing.
(2) If the firm is producing something, it will choose an output level y where p = MC(y), and where marginal cost is rising (or at least not falling).

In Figure 8.6, we show average cost and marginal cost curves for the real-world case, and a horizontal line at price p. The line passes through the marginal cost curve at two points; one intersection represents the profit minimum, and the other represents the profit maximum. The reader should note that at the point labeled “minimum profit,” the price p is less than average cost, which implies that profits are negative. Also, at the “minimum profit” point, marginal cost is declining, indicating that this point fails the second order condition for a maximum. (It actually satisfies the second order condition for a profit minimum.)

INSERT FIGURE 8.6 HERE
Caption of Fig. 8.6: Supply the quantity where p = MC(y) and the slope of MC(y) is non-negative.

Now let’s describe the supply curve for our competitive firm in the real-world case. As long as \(p < \min AC(y)\), the firm supplies zero. That is, in Figure 8.6, if the market price is less than \(\min AC(y)\), the supply curve coincides with the vertical axis. If \(p \ge \min AC(y)\), the firm will find the y where p = MC(y) and where MC(y) is rising (or at least not falling). That is, in Figure 8.6, if \(p \ge \min AC(y)\), the supply curve will coincide with that part of the MC(y) curve that lies above the AC(y) curve. Figure 8.7 below shows the competitive firm’s supply curve.

INSERT FIGURE 8.7 HERE
Caption of Fig. 8.7: The competitive firm’s supply curve.

8.3 The Competitive Firm’s Problem, Focusing on Its Input

Remember that the firm chooses its output y and/or its input x, so as to maximize its profit \(\pi = py - wx\), given its production function y = f(x). In Section 2 above, we substituted for x using the inverse of the production function and maximized \(\pi(y) = py - wf^{-1}(y)\).
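The output-side procedure recalled here can be sketched numerically. The block below is only an illustration, not part of the original derivation: it borrows the cost function C(y) = 1,000 + (y − 10)^3 from the solved problem of Section 8.5 (with w = 1), and the function names (`total_cost`, `mc_roots`, `supply`) are hypothetical. It applies the two-part rule: supply zero when p is below minimum average cost, otherwise pick the root of p = MC(y) at which marginal cost is rising.

```python
import math

MIN_AC = 75.0  # minimum average cost of this cost function, reached at y = 15


def total_cost(y):
    """C(y) = 1,000 + (y - 10)^3, the Section 8.5 cost function with w = 1."""
    return 1000.0 + (y - 10.0) ** 3


def profit(p, y):
    """Profit as a function of output: pi(y) = p*y - C(y)."""
    return p * y - total_cost(y)


def mc_roots(p):
    """Both solutions of p = MC(y) = 3(y - 10)^2: (falling-MC root, rising-MC root)."""
    r = math.sqrt(p / 3.0)
    return 10.0 - r, 10.0 + r


def supply(p):
    """Zero output below min AC; otherwise the root of p = MC(y) where MC is rising."""
    if p < MIN_AC:
        return 0.0
    return mc_roots(p)[1]


for price in (50.0, 75.0, 108.0):
    y_star = supply(price)
    print(price, y_star, profit(price, y_star))
```

Comparing `profit` at the two values returned by `mc_roots` for a price above 75 shows the pattern of Figure 8.6: the root on the falling part of the marginal cost curve yields negative profit, while the root on the rising part yields the maximum.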
That is, we viewed profit as a function of output quantity, we found important conditions for profit maximization, and we derived the firm’s supply function for the output. Now we will do it the other way; that is, we will treat profit as a function of input quantity, we will find new and equally important conditions for profit maximization, and we will derive the firm’s demand function for the input. In other words, in this section we will consider how to maximize profit, given by
\[\pi(x) = pf(x) - wx.\]

Marginal Product and Average Product. Before proceeding we need to introduce a few concepts related to the production function f(x). First, consider increasing x by a small amount, say, one unit. Then output y goes up by some amount. Somewhat loosely speaking, the extra output resulting from another unit of input is called the “marginal product” of the input. For example, one extra worker on the farm might produce another 100 bushels of corn. More precisely, the marginal product of the input is defined as the derivative of the production function, or
\[MP(x) = \frac{df(x)}{dx} = f'(x).\]
Of course we are already familiar with this function, having discussed it when we were describing the curvature of the production function.

Marginal product is, intuitively, the extra output from one extra unit of input. This is generally quite different from the average output from all the units of input. The average product is defined, quite simply, as the average output of all those input units:
\[AP(x) = \frac{f(x)}{x}.\]

Given the curvature assumptions we have made for the production function in the real-world case, both MP(x) and AP(x) are at first increasing as x increases, and then switch to decreasing as x increases. To see this, refer back to Figure 8.1 showing the production function.
In the “real-world” panel, where marginal product would be the slope of a tangent to the f(x) function, and where average product would be the slope of a ray from the origin to the f(x) function, it is clear that MP(x) (the slope of a tangent line) rises and then falls, and that AP(x) (the slope of a ray from the origin) also rises and then falls. It is also clear that marginal product and average product start out equal at x = 0 (where they both equal the slope of the production function at the origin). Marginal product must reach its maximum first (at the point of inflection of the f(x) function in Figure 8.1), and average product reaches its maximum second (at the input level where a ray from the origin is tangent to f(x)). Finally, it is clear from Figure 8.1 that at the point where average product is maximized, AP(x) is equal to MP(x). All this leads to Figure 8.8 below, which shows the marginal and average product curves for the real-world firm.

INSERT FIGURE 8.8 HERE
Caption of Fig. 8.8: Marginal and average product.

Marginal product and average product are measured in units of output. To convert the measures into dollars, we simply multiply by the output price p. This produces what we call the value of marginal product, or VMP, and the value of average product, or VAP, respectively. The simple definitions are
\[VMP(x) = p\,MP(x) \quad\text{and}\quad VAP(x) = p\,AP(x).\]
To graph these functions, simply take the graphs shown in Figure 8.8, and rescale the height of every point by the factor p. This will be done in Figures 8.9 and 8.10 below.

Profit Maximization With Input as the Choice Variable. We can now return to the profit maximization problem. The firm wants to maximize \(\pi(x) = pf(x) - wx\), and we know that if it chooses x = y = 0, profit will be zero, so it will not accept negative profit. Given that it won’t accept negative profit, it will only choose an x > 0 if \(\pi(x) = pf(x) - wx \ge 0\). Therefore \(pf(x)/x \ge w\), or \(VAP(x) \ge w\).
This leads to a condition that must hold if the firm is to use any input:
\[\max VAP(x) \ge w.\]
To put it another way, the input price w must be less than or equal to the maximum of the VAP curve. If this condition is met, we can use the first and second order conditions to see how much x the firm wants to hire in order to maximize profit. The first order condition says that the derivative of \(\pi(x)\) should be zero, and the second order condition says that the derivative should be falling (or at least not rising) at the profit-maximizing x.

The first order condition is:
\[\frac{d\pi(x)}{dx} = p\frac{df(x)}{dx} - w = VMP(x) - w = 0.\]
This yields a simple expression:
\[VMP(x) = w.\]
The second order condition says that the derivative of \(d\pi(x)/dx\) should be less than or equal to zero:
\[\frac{d^2\pi(x)}{dx^2} = \frac{dVMP(x)}{dx} \le 0.\]
That is, the VMP(x) curve should be downward sloping (or at least not upward sloping).

Figure 8.9 below shows VMP(x) and VAP(x) curves, and is based on Figure 8.8, with the curves scaled up by a multiplicative factor p. In Figure 8.9, we also show three possible input prices, w1, w2, and w3. At both w2 and w3, the input price is too high, above the maximum of the VAP(x) curve. So the firm would opt for x = y = 0 and \(\pi = 0\) under either of these prices. At the price w1, however, there are many values of x for which \(VAP(x) \ge w_1\). The firm would choose the profit-maximizing x by using the first order condition, \(VMP(x) = w_1\). That condition leads to two points, which are circled. However, only the point on the right satisfies the second order condition, which requires the VMP(x) curve to be downward sloping.

INSERT FIGURE 8.9 HERE
Caption of Fig. 8.9: Choosing x to maximize profits.

Repeating what was done in Figure 8.9 for all possible levels of the input price w gives us the firm’s demand curve for its input. This is shown in Figure 8.10 below.
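The input-side rule just described can be sketched in a few lines of code. This is an illustration under stated assumptions, not an example from the text: the production function f(x) = 3x^2 − x^3 on 0 ≤ x ≤ 2 is hypothetical, chosen because it is convex up to x = 1 and concave afterwards, with f(0) = 0 and f′(x) > 0 inside the interval. All function names are likewise made up for the sketch.

```python
import math


def production(x):
    """Hypothetical technology f(x) = 3x^2 - x^3 on 0 <= x <= 2 (convex, then concave)."""
    return 3.0 * x ** 2 - x ** 3


def marginal_product(x):
    """MP(x) = f'(x) = 6x - 3x^2."""
    return 6.0 * x - 3.0 * x ** 2


def average_product(x):
    """AP(x) = f(x)/x = 3x - x^2, for x > 0."""
    return production(x) / x


def profit(p, w, x):
    """Profit as a function of input: pi(x) = p*f(x) - w*x."""
    return p * production(x) - w * x


MAX_AP = 2.25  # AP(x) peaks at x = 1.5, where AP(1.5) = MP(1.5) = 2.25


def input_demand(p, w):
    """Profit-maximizing input for output price p > 0 and input price w >= 0."""
    if w > p * MAX_AP:
        # w lies above the maximum of VAP: any x > 0 would give negative profit.
        return 0.0
    # VMP(x) = w  <=>  3x^2 - 6x + w/p = 0, with roots x = 1 -/+ sqrt(1 - w/(3p)).
    # Only the larger root lies on the downward-sloping part of the VMP curve.
    return 1.0 + math.sqrt(1.0 - w / (3.0 * p))


for wage in (10.0, 9.0, 5.0):
    x_star = input_demand(4.0, wage)
    print(wage, x_star, profit(4.0, wage, x_star))
```

With p = 4, the maximum of VAP is 9, so the loop covers one input price above that maximum, one exactly at it, and one below it. Tracing `input_demand` over a range of w values gives a curve of the kind the text describes: zero above the VAP maximum, then the downward-sloping part of VMP.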
This input demand curve should not be confused with the firm’s conditional input demand, which shows how much x the firm needs in order to produce a given level of output y. The demand curve we are now considering shows the quantity x of the input the firm wants to employ, for a given input price w, so as to maximize profits. As Figure 8.10 shows, for w above the maximum of the VAP curve, the firm will use none of the input good, because if it did use any, it would end up with negative profit. Once the input good price falls below the maximum of the VAP curve, however, the firm can make positive profits. The quantity of the input good it buys or hires is found on the VMP curve (first order condition), but only on the downward sloping part of that curve (second order condition).

INSERT FIGURE 8.10 HERE
Caption of Fig. 8.10: The profit maximizing firm’s input demand curve.

8.4 Multiple Outputs

Most firms in the real world produce multiple outputs. In this section, we model a firm with one input and multiple outputs, to show how the conditions we have derived in earlier sections of this chapter might be generalized. As before, the analysis can be done by focusing on the outputs, or by focusing on the input. Focusing on the input involves techniques that are similar to the multiple-input/single-output case that will be developed in the next chapter, and will be left as an exercise at the end of that chapter. At this point we will focus on the outputs.

Suppose then that the firm produces two different outputs, \(y_1\) and \(y_2\), using one input x. In Section 2 above, where there was one output y, we started with the production function y = f(x), and then we used its inverse \(x = f^{-1}(y)\) to derive profit maximization conditions involving y, such as p = MC(y). We will do something similar here. But there is one small complication.
When there are two (or more) outputs, it would be wrong to write a production function \((y_1, y_2) = f(x)\). Why would this be wrong? Simply because for any fixed level of the input x, there are many combinations of \(y_1\) and \(y_2\) that the firm might be able to produce. On the other hand, the inverse production function \(f^{-1}\) does make perfect sense. Writing \(x = f^{-1}(y_1, y_2)\) simply means that if the firm is going to produce \(y_1\) units of output 1 and \(y_2\) units of output 2, it needs to employ x units of the input. Therefore in this section we will use the inverse production function \(f^{-1}\) to represent the firm’s technological constraint, and we will use this notation even though the production function f(x) itself is not defined.

In Section 2 above, where there was one output, we assumed monotonicity. This meant that as x increases, y also increases, or \(dy/dx = df(x)/dx > 0\). Here we will make a similar assumption, but on the inverse production function \(f^{-1}\), and for one output at a time. We will assume that \(\partial x/\partial y_i = \partial f^{-1}(y_1, y_2)/\partial y_i > 0\) for i = 1 and 2. That is, if the firm wants to increase its production of one of the outputs, while keeping the other constant, it will have to hire more units of the input.

Also recall that in Section 2 above, the second basic assumption we made about the production function involved its curvature. The simple version of that assumption was that the production function is concave; the real-world assumption was that it is first convex and then concave. Concavity of a production function translates into convexity of its inverse. (A quick look at the graph of a concave production function in Figure 8.1 should convince you of this.) Therefore in this section, for the simple version, we shall assume that \(f^{-1}\) is a (strictly) convex function. Somewhat loosely speaking, strict convexity of \(f^{-1}\) means the following. Suppose \(x^0 = f^{-1}(y_1^0, y_2^0)\) and \(x^1 = f^{-1}(y_1^1, y_2^1)\).
Consider the average of the two output vectors: \((0.5y_1^0 + 0.5y_1^1,\ 0.5y_2^0 + 0.5y_2^1)\). To produce this combination of outputs, it would take less than the average of the input quantities, \(0.5x^0 + 0.5x^1\). In a sense, the firm gains by averaging output vectors, perhaps because it can profitably rearrange the units of the input going to each output. Finally, we assume, similar to what we assumed before, that if the firm hires no input, it produces zero of both outputs.

With respect to the firm’s market constraints, we continue to assume that the firm is competitive in all markets where it operates. That is, it acts as a price-taker in both of the output markets, and in the input market. Let \(p_1\) and \(p_2\) be the output prices, and let w be the input price. The firm wants to maximize profit, given by
\[\pi(y_1, y_2) = p_1y_1 + p_2y_2 - wx = p_1y_1 + p_2y_2 - wf^{-1}(y_1, y_2)\]
subject to the constraint that \(\pi(y_1, y_2) \ge 0\). (Remember that the firm can hire no x and produce no outputs, giving \(\pi = 0\).)

Assuming that profit is non-negative and that the firm is operating, how much of each output should it produce? The first order conditions for profit maximization are now:
\[\frac{\partial \pi}{\partial y_1} = p_1 - w\frac{\partial f^{-1}(y_1, y_2)}{\partial y_1} = 0, \qquad \frac{\partial \pi}{\partial y_2} = p_2 - w\frac{\partial f^{-1}(y_1, y_2)}{\partial y_2} = 0.\]
But what is \(w\,\partial f^{-1}(y_1, y_2)/\partial y_i\)? It is the price of x times the extra amount of input x required to produce another unit of output i, while holding output j constant. That is, it is the marginal cost of output i, or \(MC_i\). Therefore the two first order conditions for profit maximization say that
\[p_1 = MC_1 \quad\text{and}\quad p_2 = MC_2.\]
These conditions are exactly parallel to the p = MC condition of the single-output case. In order to maximize profit, the firm will produce an amount of each output that equates price and marginal cost. Finally, provided the non-negative profit condition is met, the solution to this set of equations gives the firm’s supply functions for outputs 1 and 2.
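A small numerical sketch may help make the two first order conditions concrete. It rests on an assumption that is not in the text: a hypothetical inverse production function \(f^{-1}(y_1, y_2) = y_1^2 + y_1y_2 + y_2^2\), which is strictly convex, increasing in each output for positive outputs, and equal to zero when both outputs are zero. The function names are also invented for the illustration, and only the interior case (both outputs positive) is handled.

```python
def input_required(y1, y2):
    """Hypothetical inverse production function x = y1^2 + y1*y2 + y2^2 (strictly convex)."""
    return y1 ** 2 + y1 * y2 + y2 ** 2


def marginal_costs(y1, y2, w):
    """MC_i = w * (partial derivative of the inverse production function in y_i)."""
    return w * (2.0 * y1 + y2), w * (y1 + 2.0 * y2)


def supply(p1, p2, w):
    """Solve p1 = MC1 and p2 = MC2; valid only when both outputs come out positive."""
    y1 = (2.0 * p1 - p2) / (3.0 * w)
    y2 = (2.0 * p2 - p1) / (3.0 * w)
    if y1 <= 0.0 or y2 <= 0.0:
        raise ValueError("corner case: the interior first order conditions do not apply")
    return y1, y2


def profit(p1, p2, w, y1, y2):
    """pi(y1, y2) = p1*y1 + p2*y2 - w * f^{-1}(y1, y2)."""
    return p1 * y1 + p2 * y2 - w * input_required(y1, y2)


y1_star, y2_star = supply(4.0, 5.0, 1.0)
print(y1_star, y2_star)
print(marginal_costs(y1_star, y2_star, 1.0))
print(profit(4.0, 5.0, 1.0, y1_star, y2_star))
```

In this sketch the supply of output 1 falls when the price of output 2 rises, because the two outputs compete for the same input.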
In general, each of these will depend on both prices; that is, they would be written $y_1^*(p_1, p_2)$ and $y_2^*(p_1, p_2)$.

Given our assumption of convexity for $f^{-1}$, it is not necessary to check the second order profit maximization conditions. However, in more general cases such as the real-world case described in Section 2, we would have to check them. The second order conditions would be generalizations of the non-decreasing marginal cost conditions, also described earlier in Section 2.

8.5 A Solved Problem

The Problem

A competitive firm’s production function is $y = 10 + (x - 1{,}000)^{1/3}$. The price of the input x is w = 1.

(a) Show that the firm’s total cost curve is $C(y) = 1{,}000 + (y - 10)^3$.

(b) Show that the minimum of the marginal cost curve is at y = 10, and the minimum of the average cost curve is at y = 15.

(c) Finally, show that the firm supplies zero when p < 75, and the firm supplies $y(p) = 10 + \sqrt{p/3}$ when p ≥ 75.

The Solution

(a) First, we solve for x as a function of y. We rewrite the production function as $y - 10 = (x - 1{,}000)^{1/3}$. Then we cube both sides: $(y - 10)^3 = x - 1{,}000$, and this gives the inverse of the production function: $x = 1{,}000 + (y - 10)^3$. Since cost is wx and since w = 1, the cost function is $C(y) = 1{,}000 + (y - 10)^3$.

(b) Marginal cost is the derivative of the cost function:

$$MC(y) = \frac{dC(y)}{dy} = 3(y - 10)^2.$$

To find the minimum of the marginal cost curve, we differentiate MC(y) and set the result equal to zero:

$$\frac{dMC(y)}{dy} = 6(y - 10) = 0.$$

This gives y = 10. Average cost is

$$AC(y) = \frac{C(y)}{y} = \frac{1{,}000 + (y - 10)^3}{y}.$$

Differentiating AC(y) and setting the result equal to zero gives

$$\frac{dAC(y)}{dy} = \frac{y\,(3(y - 10)^2) - (1{,}000 + (y - 10)^3)\cdot 1}{y^2} = 0.$$

This leads to $3y(y - 10)^2 = 1{,}000 + (y - 10)^3$, which simplifies to $(2y + 10)(y - 10)^2 = 1{,}000$, which yields y = 15. Consequently, average cost is minimized at y = 15.
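Parts (a) and (b) can also be checked numerically. The following is a rough sketch, not part of the original solution: it confirms that the cost function inverts the production function, and it locates the minima of MC(y) and AC(y) by a simple grid search. The grid range and step size are our own choices, and the helper `grid_argmin` is hypothetical.

```python
def production(x):
    """y = 10 + (x - 1000)^(1/3), using the real cube root for x < 1000."""
    d = x - 1000.0
    cube_root = abs(d) ** (1.0 / 3.0)
    return 10.0 + (cube_root if d >= 0 else -cube_root)


def cost(y):
    """Total cost from part (a), with input price w = 1."""
    return 1000.0 + (y - 10.0) ** 3


def marginal_cost(y):
    """Derivative of the cost function."""
    return 3.0 * (y - 10.0) ** 2


def average_cost(y):
    """Total cost per unit of output (defined for y > 0)."""
    return cost(y) / y


def grid_argmin(func, lo, hi, steps):
    """Return the grid point in [lo, hi] at which func is smallest."""
    points = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(points, key=func)


# Search output levels from 1 to 30 in steps of 0.01 (an arbitrary grid).
y_min_mc = grid_argmin(marginal_cost, 1.0, 30.0, 2900)
y_min_ac = grid_argmin(average_cost, 1.0, 30.0, 2900)
print(y_min_mc, y_min_ac, average_cost(y_min_ac))
```

A grid search only approximates a minimum to within the grid spacing, so it supports the calculus above rather than replacing it.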
(c) At y = 15, the point where average cost is minimized, average cost is

$$AC(15) = \frac{1{,}000 + (15 - 10)^3}{15} = \frac{1{,}000 + 125}{15} = 75.$$

Therefore, for any price p < 75 and any y > 0, price is less than average cost. If p < 75, the firm will produce y = 0. For p ≥ 75, the firm will maximize profit by setting price equal to marginal cost, which gives $p = MC(y) = 3(y - 10)^2$. Solving this equation for y as a function of p leads to $y(p) = 10 + \sqrt{p/3}$.

Exercises

1. Let the production function be $y = x^{1/2}$.
(a) Show that the production function y(x) is concave.
(b) Suppose the price of x is w = 1. Find the firm’s total cost curve C(y), average cost curve AC(y), and marginal cost curve MC(y).
(c) Find the firm’s supply curve y∗(p).
(d) Suppose the price of y is p = 10. Calculate the firm’s profit.

2. Assume the production function is $y = 5x^{1/3} - 30$, and the price of x is w = 1.
(a) Derive the firm’s total cost curve C(y), average cost curve AC(y), and marginal cost curve MC(y).
(b) What is the firm’s supply curve y∗(p)?

3. Consider the production function from question 1 above, $y = x^{1/2}$. Assume x ≥ 1.
(a) Show that the inverse production function x(y) is convex.
(b) The price of y is p = 10. Find the firm’s marginal product MP(x) and average product AP(x).
(c) Find the firm’s value of marginal product VMP(x) and value of average product VAP(x).
(d) Find the firm’s input demand curve x∗(w).
(e) Suppose the price of x is w = 1. Calculate the firm’s profit.

4. Suppose the production function is $y = x^{2/3} + 13x$, and the price of y is p = 6. Assume x ≥ 1.
(a) Find the firm’s marginal product MP(x) and average product AP(x).
(b) Derive the firm’s value of marginal product VMP(x) and value of average product VAP(x).
(c) What is the firm’s input demand curve x∗(w)?

5. Consider the single-input/multiple-output model.
Recall that $x = f^{-1}(y_1, y_2)$, the inverse production function, represents the firm’s technological constraint. Can you solve the profit maximization problem for this firm by focusing on the input variable? Hint: Do it with the following four steps. (Note: Because we have not specified the $f^{-1}$ function, this is a graphical exercise, without specific functional or numerical solutions.)
(a) An isofactor curve is a locus of output combinations that use the same level of input. In a graph of the (y1, y2)-quadrant, sketch some isofactor curves, assuming $f^{-1}$ is convex.
(b) An isorevenue line is a locus of output combinations that yield the same total revenue. Plot several isorevenue lines on the same graph as the isofactor curves.
(c) Solve the revenue maximization problem for a fixed level of input. This will yield the conditional output supply curves y1(p1, p2, x) and y2(p1, p2, x).
(d) Finally, write down the profit maximization problem, making profit a function of the single variable x.

6. The inverse production function with one input and two outputs is $x = y_1^2 + y_2^2 + y_1 y_2$. Assume the price of x is w = 1.
(a) Find the firm’s total cost curve C(y1, y2) and marginal cost curves MC1(y1) and MC2(y2).
(b) Find the firm’s supply curves $y_1^*(p_1, p_2)$ and $y_2^*(p_1, p_2)$, subject to the non-negative profit condition.
(c) Suppose p1 = 1 and p2 = 1. Calculate $y_1^*$ and $y_2^*$. What is the firm’s profit?
(d) Suppose p2 rises to 2. Recalculate $y_1^*$ and $y_2^*$. How has the firm’s profit changed?

9 Theory of the Firm 2: The Long Run, Multiple-Input Model

9.1 Introduction

In most of the last chapter we modeled a firm with one input and one output. However, assuming one input is unrealistic; most goods and services are produced by firms with a variety of different inputs. The production of something as simple as corn really requires land, labor, trucks, tractors, combines, fertilizer, pesticides, possibly irrigation, and so on.
Moreover, the single-input model fails to capture a basic economic problem. Normally there are many ways to combine inputs to produce a desired level of the output; some of the ways are expensive and some are cheap. How does the firm combine various inputs so as to produce a given level of output at the least cost?

In this chapter we will assume there are two or more inputs which the firm combines in some way to produce its output. We will analyze how the firm decides how much output to produce, and how much of each input to use, so as to minimize its costs and maximize its profits. That is, we will now develop the multiple-input/single-output model.

As we indicated in the introduction to the last chapter, it is possible to learn about the most important results in the theory of the firm by studying either the single-input/single-output model or the multiple-input/single-output model. (The only important topic that the single-input/single-output model cannot handle is cost minimization.) This book differs somewhat from the typical textbook on microeconomics because it gives the reader the choice between these two models. We now turn to the second model.

In the last chapter, x was the quantity of the (single) input and y was the quantity of the output. The production function was y = f(x). Now we assume there are two or more inputs. We let x1 represent the quantity of input 1 used by the firm, x2 the quantity of input 2, x3 the quantity of input 3, and so on. The production function now becomes y = f(x1, x2, x3, ...). Most of the important implications of profit maximization with two or more inputs can be seen with just two inputs, and so we will focus on that case. In short, in this chapter the production function is assumed to be y = f(x1, x2).

The function f(x1, x2) represents the technological constraints facing the firm, as did the function f(x) in the last chapter.
The firm must work within the constraints imposed by nature, science, and technology. The firm also faces market constraints, involving the price of its output p, and the prices of its inputs w1 and w2.

There is another kind of constraint that we will consider in this chapter and the next, having to do with the variability of the firm’s inputs. Some input quantities can be changed quickly and easily; others can’t. For example, if the firm’s inputs are electricity or phone service, the quantities used can easily be varied hour by hour, even minute by minute. But if the inputs are, for example, acres of farmland planted in corn, or pharmaceutical research to develop new drugs, the input quantities can only be varied over periods of months, years, or even decades. So time horizons and the degrees of variability of input levels within those time horizons create a new type of constraint on the firm.

In this chapter we will assume the inputs are both (or all) freely variable. In the next chapter we will assume one or more of the inputs is fixed over the underlying time horizon, while one or more of the inputs is variable. Economists call a period of time that is so long that all of the firm’s inputs are freely variable the long run, and they call a period of time that is so short that one or more inputs is fixed the short run. Therefore this chapter is about the theory of the firm in the long run. The next chapter is about the theory of the firm in the short run. We are doing the long run theory first because it is simpler and more elegant than the short run theory.

We apologize for the inescapable vagueness about how long a time is long run, and how short a time is short run, but the study of economics is different from the study of, say, chemistry or physics. Moreover, as the great economist John Maynard Keynes (1883-1946) once quipped, “The long run is a misleading guide to current affairs.
In the long run we are all dead.” (Keynes was writing about macroeconomic policy, rather than microeconomic theory, when he created this gem.)

A complication of the long run/short run dichotomy is the possibility of bankruptcy. For example, in 2009, American automobile companies General Motors and Chrysler went through what is called Chapter 11 bankruptcy in order to escape the burdens of debt and union contracts, both of which create massive costs that are impossible to escape in the short run. (Massive government aid also helped!) Chapter 11 bankruptcy allows a firm to shield itself from its creditors in a court; the court has the power to rewrite or erase the firm’s contractual obligations and thus modify or end such costs. In standard microeconomic theory, the long run is a period of time long enough to modify all the firm’s costs. In the real world, bankruptcy is another way to modify costs, and it may be faster than the “long run.” We’ll end this very brief discussion of bankruptcy by paraphrasing Keynes: in the long run we are all dead, and if not dead, perhaps bankrupt.

9.2 The Production Function in the Long Run

We are now assuming that y = f(x1, x2), and that both inputs are free to vary. Suppose a firm can produce 1 unit of its output by using 2 units of input 1, which we’ll call “workers,” and 3 units of input 2, “raw materials.” Now if 2 workers and 3 units of raw materials can produce 1 unit of the output, it’s easy to imagine that 3 workers and 3 units of raw materials can also produce 1 unit of output. Just ask the third worker to sit at a table in the shop or office and text messages to her children, while the original 2 workers make the 1 unit of output! Naturally we want to refine our notion of the production function to rule out this possibility, and so we will impose the assumption of technological efficiency.
When there are various combinations of inputs that produce the same level of output, we call such combinations production techniques. For instance, in the example above, (x1, x2) = (2, 3) and (x1, x2) = (3, 3) are alternative production techniques, both resulting in output y = 1. We will say that a production technique is technologically inefficient if there is another production technique (or combination of techniques) that results in the same level of output, but uses less of one of the inputs and no more of the other input or inputs. Otherwise it is technologically efficient. In our example, (3, 3) is technologically inefficient. Since firms generally have to pay for the inputs they use, they don’t want to use technologically inefficient production techniques. Therefore we will confine the application of the production function f to efficient production techniques. That is, when we write y = f(x1, x2) in what follows, it is understood that (x1, x2) is an efficient production technique.

We now define an isoquant. The prefix “iso” is Greek for “the same” or “equal,” and “quant” is short for “quantity.” An isoquant is a set of efficient production techniques that result in the same quantity of output. To graph an isoquant, we start with a picture that has quantities of the inputs x1 and x2 on the horizontal and vertical axes, respectively, and we identify a locus of points (x1, x2) which all produce a fixed quantity of output y. When we do this for different y’s, we get different isoquants. Isoquants for y = 1, 2, and 3 are shown in Figure 9.1 below. The isoquants in Figure 9.1 also happen to be “evenly spaced”; we will explain the meaning of this below.

Figure 9.1: A map of isoquants.

The reader may think that isoquants look vaguely familiar. And so they should.
Isoquants in the theory of the firm play a role very similar to that of indifference curves in the theory of the consumer, introduced back in Chapter 2. Recall that a consumer wants to get to higher and higher indifference curves, all else equal. Similarly, a firm wants to get to higher and higher isoquants, all else equal. (Of course the exact meanings of these statements depend on what we mean by “all else equal.”)

There is, however, one important difference between isoquants and indifference curves, which we should point out immediately. In consumer theory, utility is an ordinal measure; only relative utilities matter, and in the statement “I am on the indifference curve for u = 3,” the number 3 has no intrinsic meaning. On the other hand, in the theory of the firm, output is a cardinal measure. In the statement “the firm is at y = 3,” the number 3 does have significance. It means that the firm is producing 3 cars, or 3 bushels of wheat, or 3 units of whatever the firm produces. Moreover, the u = 6 indifference curve is better for the consumer than the u = 3 indifference curve, but not twice as good. In contrast, the y = 6 isoquant really does produce twice the output of the y = 3 isoquant.

Marginal products of the inputs, and assumptions about production functions and isoquants. In Chapter 8, on the single-input/single-output model, we defined the marginal product of the input. Intuitively, the marginal product is the extra output resulting from another unit of the input. Formally, it is the derivative of the production function f(x) with respect to x. When there are two (or more) inputs, the definition is quite similar. Intuitively, the marginal product of input 1, for instance, is the extra output resulting from an additional unit of input 1 and zero additional units of input 2.
Formally, the marginal product of input 1 is the derivative of the production function with respect to x1, holding x2 constant, or the partial derivative of f(x1, x2) with respect to x1. That is,

$$MP_1 = \frac{\partial f(x_1, x_2)}{\partial x_1}.$$

This may also be written as $\partial f/\partial x_1$, or $\partial f(x_1, x_2)/\partial x_1$, or $f_1(x_1, x_2)$. If we need to emphasize where the partial derivative is being computed, we will write MP1(x1, x2) if it is being evaluated at (x1, x2). The marginal product for input 2 is defined similarly.

Note the strong resemblance between the marginal product of input 1 in the theory of the firm, and the marginal utility of good 1 in the theory of the consumer, as defined in Chapter 2. The reader may remember that, in the theory of the consumer, the marginal rate of substitution was equal to the marginal utility of good 1 divided by the marginal utility of good 2. As we shall see, there is a close parallel in the theory of the firm.

We will now turn to the assumptions we make about the production function and its associated isoquants.

Assumption 1. Monotonicity. If a firm increases one input without decreasing the other, output increases. That is, the marginal products $MP_1 = \partial f(x_1, x_2)/\partial x_1$ and $MP_2 = \partial f(x_1, x_2)/\partial x_2$, or the partial derivatives of f with respect to x1 and x2, are both positive. This rules out instances of technological inefficiency like that described in the example above. It also rules out the possibility of “fat” isoquants; that is, isoquants that are other than thin lines. And it rules out isoquants that are horizontal, vertical, or upward sloping. That is, it forces isoquants to be downward sloping.

Assumption 2. Convexity. The isoquants are convex. There are several reasons why this is a plausible assumption.
First, in many cases, it is reasonable to assume that production techniques can operate at different levels, with output levels scaled proportionately, and that a firm can use two or more production techniques simultaneously, without the techniques interfering with each other. For instance, suppose (x1, x2) = (2, 3) is an efficient production technique that gives y = 1, and suppose (x1, x2) = (3, 2) is another efficient technique that gives y = 1. Consider running the first technique at half level, and the second also at half level. It is reasonable to assume that this would produce at least 1/2 + 1/2 = 1 unit of output. And running the two techniques at these levels would require (1/2)(2, 3) + (1/2)(3, 2) = (2.5, 2.5) units of inputs 1 and 2. Now think of the y = 1 isoquant. It has to pass through (2, 3) and (3, 2). It would fail to be convex if it passed above the point midway between these two points, that is, (2.5, 2.5). But it cannot pass above (2.5, 2.5), because at that point the firm can produce at least y = 1 just by running those two production techniques at half level. And there are likely to be other techniques available that would transform inputs of (2.5, 2.5) into output of y > 1. This implies that the isoquant running through (2, 3) and (3, 2) should pass below the point (2.5, 2.5). This would mean that the isoquant is (strictly) convex.

Second, if isoquants are not convex, a firm will use “extreme” input bundles (extreme in the sense that one of the input levels is zero). That is, the inputs would not be used in combination. Since we observe firms using combinations of inputs, it is plausible to assume convexity.

Technical rate of substitution. The reader will recall that in the theory of the consumer, a crucial concept is the marginal rate of substitution of good 2 for good 1, or $MRS_{x_1,x_2}$, or MRS for short.
The intuition is this: if the consumer gives up a unit of good 1, how much good 2 does he need to replace it, and remain on the same indifference curve? The marginal rate of substitution of good 2 for good 1 is negative 1 times the slope of an indifference curve. Also recall from Chapter 2 the relationship between the marginal rate of substitution and the ratio of the marginal utilities: MRS = MU1/MU2.

In the theory of the firm, we have a concept exactly analogous to the marginal rate of substitution concept in the theory of the consumer, and we have a relationship just like the MRS = MU1/MU2 relationship in the theory of the consumer. The technical rate of substitution of input 2 for input 1, formally written $TRS_{x_1,x_2}$, or TRS for short, is defined as

$$TRS_{x_1,x_2} = -\frac{\Delta x_2}{\Delta x_1},$$

where ∆x1 and ∆x2 are small (more precisely, infinitesimal) increments in inputs x1 and x2, one negative and the other positive, which leave the firm on the same isoquant. In other words, TRS is negative 1 times the slope of the isoquant. The intuition is this: if the firm uses a unit less of input 1, how much more of input 2 does it need to use in order to keep output constant, that is, to remain on the same isoquant? To put it another way, TRS is the value of a unit of input 1 in the production process, measured in terms of the units of input 2 needed to replace it.

We can establish the relationship among TRS, MP1, and MP2 with an argument very similar to the one we made in Chapter 2 on the theory of the consumer. Imagine we start at an input bundle (x1, x2) and we simultaneously reduce x1 and increase x2 in a way that leaves y unchanged. The increments are ∆x1 < 0 and ∆x2 > 0. Then output changes by MP1∆x1 < 0 because of the reduction in input 1, and it changes by MP2∆x2 > 0 because of the increase in input 2. Since output is unchanged, we end up on the same isoquant, and therefore the net effect is zero.
This implies that MP1∆x1 + MP2∆x2 = 0. Therefore

$$TRS = -\frac{\Delta x_2}{\Delta x_1} = \frac{MP_1}{MP_2} = \frac{\partial f(x_1, x_2)/\partial x_1}{\partial f(x_1, x_2)/\partial x_2}.$$

Now let’s reconsider the idea of convexity for an isoquant. Consider moving toward the right and down along a single isoquant. Convexity means that negative 1 times the slope of the isoquant, or the absolute value of the slope, is declining as we move to the right and down, or TRS declines as we move to the right and down. That is, as x1, the quantity of input 1, gets greater and greater, the value of an extra unit of input 1 in the production process gets smaller and smaller. This seems a very plausible assumption for TRS, and another reason to view convexity for isoquants as a reasonable basic assumption.

Returns to scale. The reader will recall that in Chapter 8, on the single-input/single-output model of production, we assumed that the production function f(x) was either (1) concave, or more realistically, (2) at first convex and then concave. The intuition was that a concave production function, with a negative second derivative f''(x), represents “diminishing returns,” and that a production function that starts convex and then becomes concave represents the more realistic real-world case of “increasing returns” when the firm is small, eventually becoming “diminishing returns” when the firm is large.

We will now consider similar notions applied to the firm with two (or more) inputs. For this purpose, it is conventional practice to consider what happens when all inputs are scaled up proportionately, rather than to consider what happens as each input is modified incrementally. So we proceed as follows. Assume both inputs are scaled up by a constant t > 1. (Remember we are talking about production in the long run, so there are no input quantities that are fixed and cannot be scaled up.) Then, if output changes by the same scale factor t, the production function is said to have the property of constant returns to scale.
For instance, if the firm doubles all its inputs, or increases them by 100 percent, output should double, that is, rise by 100 percent. More formally, f(x1, x2) is a constant returns to scale production function, if, for any t > 1 and any (x1, x2),

f(tx1, tx2) = tf(x1, x2).

If scaling up the inputs results in output increasing, but by less than the scale factor, the production function is said to have the property of decreasing returns to scale. For instance, the firm might double all its inputs, and see its output rise by 50 percent as a result. More formally, f(x1, x2) is a decreasing returns to scale production function, if, for any t > 1 and any (x1, x2),

f(tx1, tx2) < tf(x1, x2).

If scaling up the inputs results in output increasing, and by more than the scale factor, the production function is said to have the property of increasing returns to scale. For instance, the firm might double all its inputs, and see its output rise by 150 percent as a result. More formally, f(x1, x2) is an increasing returns to scale production function, if, for any t > 1 and any (x1, x2),

f(tx1, tx2) > tf(x1, x2).

When the isoquants for a constant returns to scale production function are graphed, they are “evenly spaced,” in the sense that the y = 2 isoquant is twice as far from the origin as the y = 1 isoquant, the y = 3 isoquant is three times as far from the origin as the y = 1 isoquant, and so on. Figure 9.1 was drawn under the constant returns to scale assumption. Under increasing returns to scale, as you move away from the origin, isoquants get closer and closer to each other, and under decreasing returns to scale, as you move away from the origin, isoquants get farther and farther apart. Figure 9.2 below illustrates decreasing returns to scale. Decreasing returns to scale is the scale assumption that corresponds to our Chapter 8 assumption of concavity for the production function f(x).
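The three definitions can be illustrated with a short Python sketch. The production functions used here, of the form f(x1, x2) = x1^a · x2^b, are hypothetical examples chosen for convenience; they are not introduced in this section. Note also that the function below compares f(tx1, tx2) with tf(x1, x2) at a single input bundle and a single scale factor, whereas the definitions require the comparison to hold for every t > 1 and every (x1, x2). It is a spot check, not a proof.

```python
import math


def power_production(a, b):
    """Return a hypothetical production function f(x1, x2) = x1^a * x2^b."""
    def f(x1, x2):
        return (x1 ** a) * (x2 ** b)
    return f


def returns_to_scale(f, x1, x2, t=2.0):
    """Compare f(t*x1, t*x2) with t*f(x1, x2) at one bundle and one t > 1."""
    scaled_output = f(t * x1, t * x2)
    benchmark = t * f(x1, x2)
    # Allow for floating-point rounding when testing for equality.
    if math.isclose(scaled_output, benchmark, rel_tol=1e-9):
        return "constant"
    return "increasing" if scaled_output > benchmark else "decreasing"


for a, b in [(0.5, 0.5), (0.25, 0.25), (1.0, 1.0)]:
    print(a, b, returns_to_scale(power_production(a, b), 4.0, 9.0))
```

For this family of functions, scaling both inputs by t multiplies output by t^(a+b), so the comparison comes down to whether a + b equals, falls short of, or exceeds 1.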
Figure 9.2: Decreasing returns to scale.

In the Chapter 8 real-world case, the production function f(x) starts convex and then becomes concave. The isoquant-spacing assumption that corresponds to this case is the following: when output is low, successive isoquants get closer and closer to each other, but when output is high, successive isoquants get farther and farther from each other. Loosely speaking, this is increasing returns to scale at the start, but becoming decreasing returns to scale at the end. Figure 9.3 shows the real-world case.

Figure 9.3: Returns to scale in the real world.

Marginal products and TRS in the constant returns to scale case. A constant returns to scale production function f(x1, x2) scales output proportionately when both the inputs are scaled up by a factor t > 1, so f(tx1, tx2) = tf(x1, x2). Taking the partial derivative of the left side of this equation (that is, f(tx1, tx2)) with respect to x1 (that is, differentiating with respect to x1 while holding x2 constant) gives

$$\frac{\partial f(tx_1, tx_2)}{\partial (tx_1)}\,\frac{d(tx_1)}{dx_1} = t\,\frac{\partial f(tx_1, tx_2)}{\partial (tx_1)} = t\,MP_1(tx_1, tx_2).$$

Taking the partial derivative of the right side of the equation (that is, tf(x1, x2)) with respect to x1 gives

$$t\,\frac{\partial f(x_1, x_2)}{\partial x_1} = t\,MP_1(x_1, x_2).$$

Now setting the partial derivative of the left hand side equal to the partial derivative of the right hand side, and canceling out the t’s on the left and on the right, we get

$$MP_1(tx_1, tx_2) = MP_1(x_1, x_2).$$

We conclude that for a constant returns to scale production function, if both inputs are scaled up proportionately, the marginal products of the inputs don’t change. Scaling (x1, x2) up or down is graphically equivalent to moving up or down a ray from the origin, in a graph with inputs 1 and 2 on the horizontal and vertical axes.
Therefore, for a constant returns to scale production function, MP1 and MP2 are constant as the firm moves along rays from the origin, or, to put it another way, MP1 and MP2 remain constant as long as the input ratio x2/x1 remains constant.

Now consider what happens to the technical rate of substitution as we vary x1 and x2, but keep the ratio x2/x1 fixed. Since TRS = MP1/MP2, and since the marginal products don’t change, TRS remains constant. In other words, for a constant returns to scale production function, if you scale up both inputs proportionately, the technical rate of substitution stays constant. In a graph with inputs 1 and 2 on the horizontal and vertical axes, if you move out a ray from the origin, the slopes of the isoquants crossing that ray are all the same.

9.3 Cost Minimization in the Long Run

We now turn to the topic of cost minimization. We continue to assume that the time horizon is long run, and that both inputs are freely variable. Cost minimization wasn’t an issue in Chapter 8, where we discussed profit maximization by a firm using only one input. This is because if a firm produces y with one input x, according to the production function y = f(x), there is only one way to produce a given level of output y0, and only one possible cost: use x0 = $f^{-1}$(y0) and pay wx0 for it. In the single-input case it is a waste of time to search for a cheaper way to produce y0.

But now we are assuming the firm is producing its output y, and it is using two inputs in quantities x1 and x2 to do so. The prices for the inputs are w1 and w2, respectively. The firm is of course constrained by its production function y = f(x1, x2). But for any given level of output, say y0, there may be infinitely many ways to produce that output; all the input combinations on the y = y0 isoquant will do it. The firm wants to maximize its profits, but in order to maximize profits, it must minimize costs.
At the risk of belaboring the obvious, let’s emphasize this point. The firm’s profit equals its revenue less its cost, or, in our notation,

$$\pi = py - C(y) = py - (w_1 x_1 + w_2 x_2).$$

If, for a given y, the cost C(y) is not at the minimum, then profit can obviously be increased by switching to a lower-cost method of producing the given y. In other words, cost minimization is a necessary condition for profit maximization.

Isocost lines and the condition for cost minimization. To facilitate the analysis of cost minimization, we use isocost lines. (Remember “iso” means “the same” or “equal.”) An isocost line is a set of input combinations (x1, x2), all with the same cost. For a given level of cost, say C0, the isocost line is the graph of

w1x1 + w2x2 = C0.

This equation may seem vaguely familiar to the reader, because it looks like the equation for the budget line in the theory of the consumer:

p1x1 + p2x2 = M.

Of course the consumer tries to get to the highest indifference curve, for a given budget line. The firm, in contrast, will try to get to the lowest isocost line, for a given isoquant. Figure 9.4 below shows several isocost lines corresponding to cost levels C0, C1, and C2. A lower isocost line in the figure corresponds to a lower level of cost since w1 and w2 are both assumed to be positive. The slope of an isocost line is very much analogous to the slope of the consumer’s budget line; it equals w1/w2 in absolute value. Note that Figure 9.4 includes one isoquant, which happens to be tangent to the C1 isocost line. Figure 9.4 makes it clear that in order to minimize the cost of producing y0, the firm will try to find a point where the y0 isoquant is tangent to an isocost line.

Figure 9.4: Three isocost lines and one isoquant. Cost is minimized at the tangency point.

We noted above that cost minimization is a necessary condition for profit maximization.
We now describe the tangency condition that must hold for cost minimization. Suppose the firm is producing y0. Suppose the production function is differentiable, the isoquants are smooth and convex, and there is an isoquant/isocost line tangency. To produce y0 at least cost, the firm must choose the input combination (x1, x2) where the isoquant is tangent to an isocost line. Since the slope of the isoquant is −TRS and the slope of the isocost line is −w1/w2, we have the following basic cost-minimization condition:

$$TRS = \frac{w_1}{w_2}.$$

This condition should look familiar; it is exactly like the corresponding tangency condition in the theory of the consumer. The consumer tangency condition says that in order to get to the highest indifference curve subject to the budget constraint, the consumer must satisfy

$$MRS = \frac{p_1}{p_2}.$$

Now let’s consider a variable output level y instead of a particular level y0. The firm is facing input prices (w1, w2). To produce an arbitrary y at the least cost, the firm will solve for a pair of input levels, which we’ll call $x_1^*$ and $x_2^*$, that satisfy (1) the cost minimization tangency condition

$$TRS = \frac{w_1}{w_2}$$

and (2) the production function equation

y = f(x1, x2).

The desired input levels $(x_1^*, x_2^*)$ now depend on the input prices (w1, w2), and on the output level y, and so we write them as $x_1^*(w_1, w_2, y)$ and $x_2^*(w_1, w_2, y)$. These two functions show how much of the two inputs the firm wants to hire, given the input prices, and given the level of output. These are called the conditional factor demands or conditional input demands for the firm. In a typical application of this kind of analysis, w1 and w2 are fixed, and the word “conditional” is used here because the amounts of the inputs that the firm demands depend on the level of output the firm will choose.

With the conditional input demands, we can define the firm’s long run cost function or long run cost curve.
This shows, for any level of output $y$, the least cost of producing $y$ (assuming fixed input prices, and assuming efficient production techniques). The long run cost function is

$$C(y) = w_1x_1^*(w_1, w_2, y) + w_2x_2^*(w_1, w_2, y).$$

The relation between long run cost curves and returns to scale. Loosely speaking, the returns of a technology and the costs of production are mirror images. That is, when returns are high, costs are low, and vice versa. Somewhat more precisely, when returns are increasing, costs are falling, and vice versa. As we will see below, this is what we find for constant, increasing, and decreasing returns to scale.

Suppose a firm’s production function satisfies constant returns to scale. Let $(x_1^*, x_2^*)$ be the cost-minimizing input combination that results in one unit of output. Then $C(1) = w_1x_1^* + w_2x_2^*$. Now, if the firm wants to produce an arbitrary $y$ units of output, by constant returns to scale, it can do so by multiplying by $y$ both the input quantities that gave one unit of output. Moreover, as we saw above, scaling up both the inputs in this fashion will leave $MP_1$, $MP_2$, and $TRS$ unchanged. Therefore since $(x_1^*, x_2^*)$ was a point of tangency between an isoquant and an isocost line, $(yx_1^*, yx_2^*)$ will also be a tangency point. Therefore $(yx_1^*, yx_2^*)$ is the least-cost way to produce $y$ units of output, and $C(y) = yC(1)$. That is, for constant returns to scale, the long run cost function is a very simple linear function that makes $C(y)$ directly proportional to output $y$. We show this kind of linear cost function in Figure 9.5 below.

INSERT FIGURE 9.5 HERE
Caption of Fig. 9.5: Total cost with constant returns.

Remember that average cost is total cost divided by quantity, or $AC(y) = C(y)/y$. For our constant returns to scale case, average cost is constant. This is because $AC(y) = C(y)/y = yC(1)/y = C(1)$.
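As a rough numerical check of the proportionality $C(y) = yC(1)$, the sketch below minimizes cost along an isoquant for a constant returns Cobb-Douglas technology. The technology $y = x_1^{1/2}x_2^{1/2}$, the prices $w_1 = 1$ and $w_2 = 4$, and all function names are illustrative assumptions rather than anything specified in the text, and the golden-section search is simply one convenient way to locate the minimum.

```python
def isoquant_x2(y, x1, alpha, beta):
    """Input 2 needed to produce y with x1 units of input 1,
    for the assumed technology y = x1**alpha * x2**beta."""
    return (y / x1**alpha) ** (1.0 / beta)


def min_cost(y, w1, w2, alpha, beta, lo=1e-6, hi=1e3, iters=200):
    """Least cost of producing y, found by golden-section search over x1
    along the y isoquant (cost is unimodal in x1 for this technology)."""
    def cost(x1):
        return w1 * x1 + w2 * isoquant_x2(y, x1, alpha, beta)

    g = (5 ** 0.5 - 1) / 2          # golden-ratio step
    a, b = lo, hi
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if cost(c) < cost(d):
            b = d                   # minimum lies in [a, d]
        else:
            a = c                   # minimum lies in [c, b]
    return cost((a + b) / 2)


# Hypothetical parameters: constant returns since alpha + beta = 1.
W1, W2, ALPHA, BETA = 1.0, 4.0, 0.5, 0.5

for y in (1.0, 2.0, 5.0, 10.0):
    c = min_cost(y, W1, W2, ALPHA, BETA)
    print(f"y = {y:5.1f}   C(y) = {c:8.4f}   C(y)/y = {c / y:.4f}")
```

Under these assumed numbers the ratio $C(y)/y$ should come out the same at every output level, which is the constant average cost described above.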
Also recall that marginal cost is the derivative of the cost function, or, intuitively, the extra cost per additional unit of output. That is, $MC(y) = dC(y)/dy$. In the constant returns to scale case,

$$MC(y) = \frac{d(C(1)y)}{dy} = C(1) = AC(y).$$

In short, with constant returns to scale, total cost is a linear function of $y$, and both average and marginal cost are constant, equal to each other, and equal to the cost of producing just 1 unit of output.

Next we assume the firm’s production function satisfies increasing returns to scale. The cost of producing one unit of output is $C(1)$. If the firm wants to produce $y > 1$ units of output, it can scale up the input quantities by less than $y$, since if it scaled up by $y$, by the increasing returns to scale assumption, output would rise to more than $y$. Therefore the cost of producing $y$ will be less than $yC(1)$. In short, $C(y) < yC(1)$. This implies that the total cost curve is concave, as shown in Figure 9.6 below.

INSERT FIGURE 9.6 HERE
Caption of Fig. 9.6: Total cost with increasing returns. Average cost is the slope of $l_1$, and marginal cost is the slope of $l_2$.

In Figure 9.6, we have included a line $l_2$ that is tangent to the total cost curve at a point $P$, as well as a line $l_1$ going from the origin through the point $P$. The slope of the tangent line is $dC(y)/dy$, or marginal cost, and the slope of the line from the origin is $C(y)/y$, or average cost. This is similar to what we did in Chapter 8, Figure 8.3. It is clear from Figure 9.6 that at any point on the total cost curve, the slope of the tangent line, or marginal cost, is less than the slope of the line from the origin, or average cost. In short, for the increasing returns to scale case, $MC(y) < AC(y)$. Also, as $y$ increases, both marginal cost and average cost are decreasing.

Now we assume the firm’s production function satisfies decreasing returns to scale.
We continue to assume that the cost of producing one unit of output is $C(1)$. Now, if the firm wants to produce $y > 1$ units of output, it must scale up the input quantities by more than $y$. Therefore the cost of producing $y$ will be more than $yC(1)$. In short, $C(y) > yC(1)$. This implies that the total cost curve is convex, as shown in Figure 9.7 below.

INSERT FIGURE 9.7 HERE
Caption of Fig. 9.7: Total cost with decreasing returns. Average cost is the slope of $l_1$, and marginal cost is the slope of $l_2$.

In Figure 9.7, we have again included a line $l_2$ that is tangent to the total cost curve at a point $P$, as well as a line $l_1$ going from the origin through the point $P$. The slope of the tangent line is $dC(y)/dy$, or marginal cost, and the slope of the line from the origin is $C(y)/y$, or average cost. It is clear from Figure 9.7 that at any point on the total cost curve, the slope of the tangent line, or marginal cost, is greater than the slope of the line from the origin, or average cost. In short, for the decreasing returns to scale case, $MC(y) > AC(y)$. Also, as $y$ increases, both marginal cost and average cost are increasing.

Finally, let us suppose, loosely speaking, that the production function satisfies increasing returns to scale at the beginning, changes to constant returns to scale, and then changes to decreasing returns to scale at the end. We say “loosely speaking” here because, as we have defined constant, increasing, and decreasing returns, the properties are universal, and not attached to particular scales of operation for the firm. In terms of isoquants, we are now assuming that the isoquants first get closer and closer together, and ultimately get farther and farther apart. Now the total cost curve is first concave, and then turns to convex, as in Figure 9.8 below. Note that this figure is just like Figure 8.3 in Chapter 8, on the single-input model.

INSERT FIGURE 9.8 HERE
Caption of Fig. 9.8: Total cost under increasing returns to scale followed by decreasing returns to scale. Average cost is the slope of $l_1$, and marginal cost is the slope of $l_2$.

The reader should refer back to Chapter 8, Figure 8.4, to recall the appearance of the U-shaped average and marginal cost curves in the real-world case for the single-input model. A similar figure applies here in the multi-input model. Both average and marginal cost first decline and then rise, and the marginal cost curve passes through the minimum of the average cost curve. When average cost is declining, marginal cost is below average cost, and when average cost is rising, marginal cost is above average cost.

9.4 Profit Maximization in the Long Run

Throughout this chapter we have been assuming that the firm is competitive in the market for its inputs. That is, it takes the input prices $w_1$ and $w_2$ as given and fixed. It is too small to affect the input prices. We now also assume that our firm is competitive in the market for its output. That is, its choice of $y$ doesn’t affect the output price $p$. (Note that we will return to a careful analysis of the competitive market assumption in a subsequent chapter.)

We’ll now derive the conditions for profit maximization in the long run, and the firm’s long run supply curve. This section should look very familiar, because what we do here is very similar to what we already did in Chapter 8, the single-input model.

In the long run, both (or all) inputs are free to vary. If the firm produces nothing, $y$, $x_1$, and $x_2$ are all zero, revenue is zero, total cost is zero, and profit is zero. Therefore whenever the firm chooses a $y$ to maximize its profit, it will only consider output quantities for which profit is non-negative. Therefore

$$\pi(y) = py - C(y) \ge 0.$$

Dividing both sides of the inequality by $y$ gives $p \ge AC(y)$, which can hold for some $y$ only if $p \ge \min AC(y)$.
That is, the firm will only operate if the market price equals or exceeds minimum average cost. In the long run, it won’t be in business if being in business means losing money.

If the price $p$ is high enough for the firm to operate without a loss, then it must decide how much to produce. Its profit is $\pi(y) = py - C(y)$. The first order condition for maximizing this function is

$$\frac{d\pi(y)}{dy} = p - \frac{dC(y)}{dy} = p - MC(y) = 0.$$

This gives $p = MC(y)$, or price equals marginal cost. The second order condition for maximizing profit is

$$\frac{d^2\pi(y)}{dy^2} = \frac{d(p - MC(y))}{dy} \le 0,$$

which gives

$$\frac{dMC(y)}{dy} \ge 0.$$

That is, at the profit-maximizing point, price equals marginal cost, and marginal cost is rising (or at least not falling).

We refer the reader back to Figure 8.6 of Chapter 8 to view a graph illustrating the profit-maximizing choice in the real-world case of a firm with U-shaped average and marginal cost curves. That graph includes a horizontal line at a price $p$ which is greater than the minimum of the average cost curve. The horizontal line intersects the marginal cost curve at two points; both points satisfy the first order condition, but only one satisfies the second order condition.

We also refer the reader back to Figure 8.7 of Chapter 8 to view the graph of the profit-maximizing firm’s supply curve in the real-world case. In summary, in the typical competitive firm case, with U-shaped average and marginal cost curves, in the long run, the firm supplies nothing if $p < \min AC(y)$. But if $p \ge \min AC(y)$ the amount the firm supplies is given by the upward-sloping part of the marginal cost curve.

Returns to scale and long run supply. We will now analyze the firm’s profit-maximization decision, and its supply curve, in the cases of constant, decreasing, and increasing returns to scale.
If the firm’s production function is constant returns to scale, its total cost, average cost, and marginal cost curves are as illustrated in Figure 9.5 above. In particular, marginal cost and average cost are constant (that is, horizontal lines) at $C(1)$. For $p > C(1)$, the firm would want to supply an unlimited amount, and would have unlimited profit. For $p = C(1)$, the firm would supply any quantity; all would result in zero profit. For $p < C(1)$, the firm would supply nothing, and would have profit of zero.

Next we assume an increasing returns to scale production function, and the corresponding total cost curve, as in Figure 9.6. We can see from that figure that average cost (the slope of the ray $l_1$, from the origin through a point on $C(y)$) is always greater than marginal cost (the slope of the tangent $l_2$ to $C(y)$), and both average cost and marginal cost are declining as $y$ increases. Figure 9.9 shows the average and marginal cost curves corresponding to this case. Note that at the price/quantity combination $(p, y^*)$ illustrated, the first order condition $p = MC(y)$ is satisfied, but the second order condition is not. In fact, that point is a profit minimum, rather than a profit maximum. Given the price $p$, if the firm increases its output to $y'$, it will break even, and moving to the right of $y'$ allows increasing profit without limit. In other words, given that average cost and marginal cost continue to decline, the firm wants to produce an infinite amount of its output, and earn infinite profits. This of course would eventually result in the firm becoming so large that its choice of $y$ would affect the price $p$. In other words, the assumption of increasing returns to scale, over all levels of output, is ultimately inconsistent with the assumption of competitive behavior.

INSERT FIGURE 9.9 HERE
Caption of Fig. 9.9: Unbounded supply under increasing returns.
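To make the profit-minimum point concrete, here is a small numerical sketch. It borrows the increasing returns cost function $C(y) = 2y^{1/2}$ that the solved problem in Section 9.5 obtains for $\alpha = \beta = 1$ and $w_1 = w_2 = 1$; the price $p = 0.5$ and the function names are hypothetical choices made only for illustration.

```python
import math


def cost(y):
    """Assumed increasing-returns cost function C(y) = 2*sqrt(y)."""
    return 2.0 * math.sqrt(y)


def profit(p, y):
    """Profit pi(y) = p*y - C(y) at output price p."""
    return p * y - cost(y)


p = 0.5                  # hypothetical market price
y_star = 1.0 / p**2      # solves p = MC(y) = y**(-1/2): first order condition
y_prime = 4.0 / p**2     # solves p = AC(y) = 2*y**(-1/2): break-even output

# Profit is lowest at y_star, zero at y_prime, and grows beyond y_prime.
for y in (0.5 * y_star, y_star, 1.5 * y_star, y_prime, 10 * y_prime):
    print(f"y = {y:7.2f}   profit = {profit(p, y):8.3f}")
```

In this sketch the output satisfying $p = MC(y)$ yields a lower profit than the outputs on either side of it, and profit keeps rising past the break-even output, which is the pattern Figure 9.9 is meant to convey.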
Finally, we assume a decreasing returns to scale production function, and the corresponding total cost curve, as in Figure 9.7. In that figure, we can see that average cost is less than marginal cost, and both are increasing as $y$ increases. Figure 9.10 shows the average and marginal cost curves corresponding to this case. Both curves are upward sloping, and the marginal cost curve lies above the average cost curve. Given a market price $p$, as shown, the firm will maximize profit by producing the $y^*$ shown. The firm’s profit will then be

$$\pi(y^*) = py^* - C(y^*) = py^* - AC(y^*)y^*,$$

or the area of the cross-hatched rectangle. The marginal cost curve is also the firm’s supply curve, at least for prices greater than or equal to the intercept of $MC(y)$ with the vertical axis.

INSERT FIGURE 9.10 HERE
Caption of Fig. 9.10: The long run supply with decreasing returns.

9.5 A Solved Problem

The Problem

Consider the production function

$$y = f(x_1, x_2) = x_1^{\alpha}x_2^{\beta},$$

where $\alpha$ and $\beta$ are positive constants. Assume the input prices are $w_1$ and $w_2$. This production function is called a Cobb-Douglas production function, after the people who first studied it, Charles Cobb (1875-1949) and Paul Douglas (1892-1976). Cobb was a mathematician and economist; Douglas was an economist at the University of Chicago who became an important and influential Democratic senator from Illinois.

(a) Show that if $\alpha + \beta < 1$, the production function has the property of decreasing returns to scale. (We will say “the production function is decreasing returns to scale” for short.) Show that if $\alpha + \beta = 1$, the production function is constant returns to scale; and show that if $\alpha + \beta > 1$, the production function is increasing returns to scale.

(b) Find the marginal products $MP_1$ and $MP_2$, and the technical rate of substitution $TRS$.

(c) Now assume $w_1 = 1$ and $w_2 = 1$, and also assume $\alpha = 1$ and $\beta = 1$.
Find the long run cost function $C(y)$, the average cost function $AC(y)$, and the marginal cost function $MC(y)$.

The Solution

(a) Note that

$$f(tx_1, tx_2) = (tx_1)^{\alpha}(tx_2)^{\beta} = t^{\alpha+\beta}x_1^{\alpha}x_2^{\beta}, \quad\text{and}\quad tf(x_1, x_2) = tx_1^{\alpha}x_2^{\beta}.$$

A production function is decreasing returns to scale if for any $t > 1$, $f(tx_1, tx_2) < tf(x_1, x_2)$; it is constant returns to scale if for any $t > 1$, $f(tx_1, tx_2) = tf(x_1, x_2)$; and it is increasing returns to scale if for any $t > 1$, $f(tx_1, tx_2) > tf(x_1, x_2)$. Therefore, the Cobb-Douglas production function is decreasing returns to scale if

$$t^{\alpha+\beta}x_1^{\alpha}x_2^{\beta} < tx_1^{\alpha}x_2^{\beta} \;\Leftrightarrow\; t^{\alpha+\beta} < t \;\Leftrightarrow\; \alpha + \beta < 1.$$

Similarly, the Cobb-Douglas production function is constant returns to scale if

$$t^{\alpha+\beta}x_1^{\alpha}x_2^{\beta} = tx_1^{\alpha}x_2^{\beta} \;\Leftrightarrow\; t^{\alpha+\beta} = t \;\Leftrightarrow\; \alpha + \beta = 1.$$

And finally, the Cobb-Douglas production function is increasing returns to scale if

$$t^{\alpha+\beta}x_1^{\alpha}x_2^{\beta} > tx_1^{\alpha}x_2^{\beta} \;\Leftrightarrow\; t^{\alpha+\beta} > t \;\Leftrightarrow\; \alpha + \beta > 1.$$

(b) To find $MP_1$, we take the partial derivative of $f(x_1, x_2)$ with respect to $x_1$. This gives

$$MP_1 = \frac{\partial (x_1^{\alpha}x_2^{\beta})}{\partial x_1} = \alpha x_1^{\alpha-1}x_2^{\beta}.$$

$MP_2$ is found similarly:

$$MP_2 = \frac{\partial (x_1^{\alpha}x_2^{\beta})}{\partial x_2} = \beta x_1^{\alpha}x_2^{\beta-1}.$$

The technical rate of substitution is $MP_1$ divided by $MP_2$, or

$$TRS = \frac{MP_1}{MP_2} = \frac{\alpha x_1^{\alpha-1}x_2^{\beta}}{\beta x_1^{\alpha}x_2^{\beta-1}} = \frac{\alpha x_2}{\beta x_1}.$$

(c) Now we are assuming $w_1 = w_2 = 1$, and we are also assuming $\alpha = \beta = 1$. These two assumptions will make things a lot easier. For cost minimization, in the general case, the firm finds input combinations where $TRS = w_1/w_2$. This gives

$$TRS = \frac{\alpha x_2}{\beta x_1} = \frac{x_2}{x_1} = \frac{w_1}{w_2} = 1.$$

Therefore $x_2 = x_1$. Substituting back in the production function, we now get

$$y = x_1^{\alpha}x_2^{\beta} = x_1x_2 = x_1^2.$$

Therefore the cost-minimizing inputs are $x_1^* = y^{1/2}$ and $x_2^* = y^{1/2}$. The cost function is therefore

$$C(y) = w_1x_1^* + w_2x_2^* = y^{1/2} + y^{1/2} = 2y^{1/2}.$$

The average cost and marginal cost functions are

$$AC(y) = \frac{C(y)}{y} = \frac{2y^{1/2}}{y} = 2y^{-1/2}, \qquad MC(y) = \frac{dC(y)}{dy} = \frac{d(2y^{1/2})}{dy} = y^{-1/2}.$$

Exercises

1. Explain why the concepts of constant, increasing, and decreasing returns to scale make sense when applied to isoquants, but would not make sense in the theory of the consumer, if applied to indifference curves. That is, why does the spacing between successive isoquants make sense, while the spacing of successive indifference curves does not?

2. If the price of the output of a profit-maximizing firm rises, how will the firm’s output change?

3. Suppose a firm’s production function is $y = x_1^{1/4}x_2^{1/4}$. The prices of the inputs are $w_1 = 1$ and $w_2 = 2$.
(a) Show that the long run conditional factor demands are $x_1^*(y) = \sqrt{2}\,y^2$ and $x_2^*(y) = y^2/\sqrt{2}$.
(b) Show that the long run cost function is $C(y) = 2\sqrt{2}\,y^2$.
(c) Show that the long run supply curve for the firm is given by $y^*(p) = p/(4\sqrt{2})$.

4. A firm produces computers with two factors of production: labor $L$ and capital $K$. Its production function is $y = LK/10$. Suppose the factor prices are $w_L = 10$ and $w_K = 100$.
(a) Graph the isoquants for $y$ equal to 1, 2, and 3. Does this technology show increasing, constant, or decreasing returns to scale? Why?
(b) Derive the conditional factor demands.
(c) Derive the long run cost function $C(y)$.
(d) If the firm wants to produce one computer, how many units of labor and how many units of capital should it use? How much will it cost? What if the firm wants to produce two computers?
(e) Derive the firm’s long run average cost function $AC(y)$ and long run marginal cost function $MC(y)$. Graph $AC(y)$ and $MC(y)$. What is the firm’s long run supply curve?

5. Let the firm’s production function be given by $y = x_1 + x_2$. Suppose $w_1 = 2$ and $w_2 = 1$.
(a) Derive the conditional factor demands and use them to find the long run cost function for this firm.
(b) For these factor prices, derive and graph the firm’s long run supply curve.
(c) Suppose the price of the second input, $w_2$, rises to $2 per unit. What is the long run cost curve?
Derive and graph the new supply curve. Hint: Since these isoquants are straight lines, cost minimization cannot require tangencies of isoquants and isocost lines.

6. Consider a production function which uses three inputs: $y = x_1^{1/5}x_2^{1/5}x_3^{1/5}$. Suppose the factor prices are $w_1 = w_2 = w_3 = 1$.
(a) What are the conditional factor demands $x_1^*(y)$, $x_2^*(y)$, and $x_3^*(y)$?
(b) Find the long run cost function $C(y)$.
(c) Find the long run supply curve $y^*(p)$.

10 Theory of the Firm 3: The Short Run, Multiple-Input Model

10.1 Introduction

In Chapter 8 we modeled a firm with one input and one output. In Chapter 9 we developed a more general model, with multiple inputs and one output. Both the Chapter 8 single-input/single-output and the Chapter 9 multiple-input/single-output models were long run models, which means that the inputs were freely variable. If we assume a production function like $y = f(x_1, x_2)$, long run analysis means that both $x_1$ and $x_2$ can be varied by the firm. In the short run, however, some inputs cannot be varied, because the time horizon is too short. How short is “short run” and how long is “long run” in reality depends on the facts of the firm, and so our economic analysis is necessarily a little vague about the time units. But we can be exact about what we mean by a short run model. A short run theory of the firm model is one in which some of the input quantities are fixed.

In this chapter we will develop our short run model. If there are $n$ inputs, $x_1, x_2, \ldots, x_n$, with input prices $w_1, w_2, \ldots, w_n$, short run means that some of the inputs are fixed at non-zero levels, while others are variable. If the production function is $y = f(x_1, x_2)$, with two inputs, short run means $x_2$ is fixed at a non-zero level, while $x_1$ is variable.

One main implication should be immediately clear: in a short run model, the cost function has a non-zero fixed part.
When there are just two inputs, this is $w_2$ times the fixed quantity of input 2. When there are $n$ inputs, this is the sum of the prices of the fixed inputs times the respective quantities of those inputs. Moreover, when there are just two inputs, one fixed and one variable, the short run model will be much like the Chapter 8 model, but with a fixed cost element attached; and if there are three or more inputs, with one or more fixed and two or more variable, the short run model will be much like the Chapter 9 model, but with the fixed cost element attached.

10.2 The Production Function in the Short Run

In the short run the firm doesn’t have time to vary the level of one or more of its inputs. In what follows, we will generally assume there are only two inputs, with $x_1$ variable but $x_2$ fixed, and a production function $y = f(x_1, x_2)$. We let $x_2^0 > 0$ represent the fixed level of the second input. The production function $f(x_1, x_2)$ is now constrained to $f(x_1, x_2^0)$. This constrained function is called a short run production function.

Since $f(x_1, x_2^0)$ depends on only one variable (that is, $x_1$), the analysis of the short run production function is almost exactly the same as the analysis of the single-input production function $f(x)$ of Chapter 8. The only differences arise when the variable input is zero: in the single-input model, if $x = 0$, then $y = 0$, and cost and profit are also zero; but in the two-input short run model, if $x_1 = 0$, $y$ may not be zero, cost is not zero (since $x_2^0 > 0$), and profit will probably not be zero.

The short run production function can be graphed, with $x_1$ on the horizontal axis and $y$ on the vertical; the result will be similar to the graph of $f(x)$ in Chapter 8 (that is, Figure 8.1), although it will not necessarily pass through the point $(0, 0)$. As we did in Chapters 8 and 9, we can define average and marginal products for the variable input $x_1$ for the short run production function.
The average product of input 1 is $y/x_1$, and the marginal product is, roughly speaking, the extra output per extra unit of input 1. More formally, average product is

$$AP(x_1) = \frac{f(x_1, x_2^0)}{x_1}.$$

Marginal product is the derivative of the short run production function $f(x_1, x_2^0)$ with respect to the variable $x_1$, or the slope of the short run production function with the given $x_2^0$. More formally, marginal product is

$$MP(x_1) = \frac{\partial f(x_1, x_2^0)}{\partial x_1}.$$

If we need to show the underlying fixed level of input 2 in the expressions for average product and/or marginal product, we can add the term $\mid x_2^0$, which simply means “given $x_2 = x_2^0$.” For average product, for example, we can write $AP(x_1 \mid x_2^0)$.

As with the single-input model of Chapter 8, economists believe that the short run $MP(x_1)$ and $AP(x_1)$ curves for a “real-world” firm should be roughly parabolic, first increasing, and then decreasing, similar to the curves in Figure 8.8. That is, given a fixed $x_2^0$, marginal product of $x_1$ first rises, as the firm in a sense gets more productive, reaches a peak, and then declines, as the firm in a sense gets less productive. Average product does likewise, although it peaks after marginal product. And where average product reaches its peak, marginal product passes through it.

10.3 Cost Minimization in the Short Run

Remember that the short run is a period of time so short that input 2 cannot be varied. Because it must pay for $x_2^0$, the firm has an inescapable cost of $w_2x_2^0$. This is called the firm’s fixed cost. Fixed cost is written $FC = w_2x_2^0$.

In the short run the firm can vary $x_1$. Therefore $w_1x_1$ is the variable part of the firm’s total cost. This is called the firm’s variable cost, and it is written $VC(y)$. Note that we have written variable cost as an explicit function of $y$, which we did not do for $FC$. We did this for an obvious reason: variable cost varies with $y$, whereas fixed cost doesn’t.
In order to calculate the $VC(y)$ function, we first note that $VC(y) = w_1x_1$. Next we need to substitute a function of $y$ in place of $x_1$. To do this, we use the firm’s short run demand function for input 1. This is the amount of input 1 needed to produce a given quantity of $y$, subject to the constraint that input 2 is fixed at $x_2^0$. In the two-input model, with one input fixed in the short run, the short run demand function is simply the inverse of the production function, with the constraint, of course, that input 2 is fixed at $x_2^0$. We write the inverse production function as $x_1(y) = f^{-1}(y)$. If we need to show the underlying fixed level of input 2 in this expression, we can write $x_1(y) = f^{-1}(y \mid x_2^0)$.

And now we can show variable cost as an explicit function of $y$. It is simply

$$VC(y) = w_1x_1(y) = w_1f^{-1}(y).$$

The firm’s total cost is the sum of its fixed cost and its variable cost, or

$$C^S(y) = FC + VC(y).$$

We have put a superscript $S$ on the total cost function to emphasize that it is a short run cost function, and that it is contingent on what is fixed in the short run, namely $x_2^0$. If we need to be explicit about the level of the fixed input, we will write this as $C^S(y \mid x_2^0)$. We will reserve the $C(y)$ notation, with no superscript, to represent long run total cost. Note that $FC$ and $VC(y)$ are only defined in the short run, and therefore we don’t bother to put $S$ superscripts on them.

In Figure 10.1 below, we show an isoquant for a given level of output $y_0$. We show two isocost lines: one is the isocost line the firm could get to in the long run, when it could vary input 2 and escape its fixed cost $FC = w_2x_2^0$. The other is the isocost line it must settle for in the short run, when it must use $x_2^0$ and it cannot escape the fixed cost.
The long run cost-minimizing input combination is $(x_1^*, x_2^*)$, but in the short run the firm will simply use $x_2^0$ units of input 2, and whatever level of input 1 is sufficient to produce $y_0$ units of output when it’s using $x_2^0$ units of input 2. That is, it will operate at the point $(x_1, x_2^0)$ shown in the figure.

INSERT FIGURE 10.1 HERE
Caption of Fig. 10.1: Short run cost minimization is at $(x_1, x_2^0)$; long run is at $(x_1^*, x_2^*)$.

As we can plainly see from Figure 10.1, $(x_1, x_2^0)$ is a more costly way to produce $y_0$ than $(x_1^*, x_2^*)$. In other words, at $(x_1, x_2^0)$ the firm is not “fully” minimizing costs. That is, for any given level of output, short run total cost is greater than or equal to long run total cost. At $(x_1, x_2^0)$, the firm is choosing a technologically efficient way to produce $y_0$, subject to the constraint that $x_2 = x_2^0$. Other than that, it is not doing anything clever when it decides on the $x_1$ that it will use. That is, short run cost minimization when there are only two inputs is a mindless process. Of course, if there are three or more inputs, and the third input is fixed in the short run, then the firm does have to make an intelligent choice of $x_1$ and $x_2$, along the lines we laid out in Chapter 9.

We can now write down the firm’s short run cost function, based on the formulas discussed above:

$$C^S(y) = FC + VC(y) = w_2x_2^0 + w_1x_1(y) = w_2x_2^0 + w_1f^{-1}(y).$$

If we need to be explicit about the underlying fixed level of input 2, we can write $C^S(y \mid x_2^0)$ instead of $C^S(y)$.

In Figure 10.2 below, we show a short run total cost curve, under the assumption that the marginal product of input 1 is first increasing, and then decreasing. Note that this figure is very similar to Figure 8.3 of Chapter 8, which illustrated the total cost curve in the “real-world” case for the single-input model.
The notable difference is this: the short run total cost curve starts out (when $y = 0$) at a positive intercept, namely the fixed cost.

INSERT FIGURE 10.2 HERE
Caption of Fig. 10.2: The short run total cost curve.

In the next figure, we show marginal and average costs corresponding to Figure 10.2 above. This figure is of course similar to Figure 8.4 of Chapter 8. The idea of short run marginal cost is similar to the idea of long run marginal cost: just take the total cost function and differentiate it. However it is important to note that short run and long run marginal costs are generally not equal, since the long run cost function allows $x_2$ to freely vary. Therefore we must distinguish between them. Just as we used a superscript $S$ to identify short run total cost ($C^S(y)$), we will use a superscript $S$ to identify short run marginal cost. That is, short run marginal cost will be written $MC^S(y)$.

In the long run analysis, there is only one concept of average cost: just take total cost and divide by $y$. But in the short run analysis, there are two relevant alternative notions of average cost, one of which includes the firm’s fixed cost, and the other of which excludes it. The first is called average total cost and the second is called average variable cost. They are formally defined as

$$ATC(y) = \frac{C^S(y)}{y} = \frac{FC + VC(y)}{y}, \quad\text{and}\quad AVC(y) = \frac{VC(y)}{y}.$$

Since average total cost includes $FC/y$, whereas average variable cost doesn’t, it’s clear that $ATC(y) > AVC(y)$. Because we only use the $ATC(y)$ and $AVC(y)$ terminology in the context of short run analysis, we will not bother to remind the reader that they are short run concepts with the superscript $S$.

Figure 10.3 below shows marginal cost, average variable cost, and average total cost curves in the case where the marginal product of input 1 is first increasing, and then decreasing. This is the “real-world” case for the short run cost function.
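The definitions of $ATC(y)$, $AVC(y)$, and $MC^S(y)$ can be tabulated for a simple case. The sketch below assumes the short run cost function $C^S(y) = 18 + y^2/9$ from the solved problem in Section 10.5. That cost function has an always-increasing marginal cost, so it is a simpler shape than the “real-world” curves, and the function names are illustrative only.

```python
import math

FC = 18.0                          # fixed cost, assumed from Section 10.5


def vc(y):
    """Variable cost VC(y) = y^2 / 9 (assumed example)."""
    return y**2 / 9.0


def atc(y):
    """Average total cost: (FC + VC(y)) / y."""
    return (FC + vc(y)) / y


def avc(y):
    """Average variable cost: VC(y) / y."""
    return vc(y) / y


def mcs(y):
    """Short run marginal cost: derivative of FC + y^2/9 with respect to y."""
    return 2.0 * y / 9.0


for y in (3.0, 6.0, 9.0, 12.0, 15.0):
    print(f"y = {y:5.1f}  ATC = {atc(y):6.3f}  AVC = {avc(y):6.3f}  MC = {mcs(y):6.3f}")

# ATC is minimized where MC = ATC; here that is y = sqrt(9 * FC).
y_min = math.sqrt(9.0 * FC)
print(f"ATC minimum near y = {y_min:.3f}: ATC = {atc(y_min):.4f}, MC = {mcs(y_min):.4f}")
```

In this assumed example $ATC(y)$ lies above $AVC(y)$ at every output, the gap $FC/y$ shrinks as $y$ grows, and marginal cost equals average total cost at the output where average total cost is lowest.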
INSERT FIGURE 10.3 HERE
Caption of Fig. 10.3: Marginal, average variable, and average total cost curves in the short run.

We’ll end this section by making some observations about long run and short run cost curves in “real-world” cases. We know their shapes may be similar: first concave, and then convex. We also know that every short run cost curve must lie on or above the long run cost curve. This is so because no matter what the fixed $x_2^0$ might be, and no matter what $y$ may be, the cost of producing $y$ has to be lower, or at least not higher, when $x_2$ is free to vary (that is, in the long run) than when it is fixed. The short run total cost curve will just touch the long run total cost curve if, for the given $y$, $x_2^0$ happens to precisely equal the long run cost-minimizing level for input 2. Otherwise the short run total cost curve will be higher.

Figure 10.4 below shows a long run cost curve and two short run cost curves in the “real-world” concave-then-convex case. One short run cost curve, labeled $C^S(y \mid x_2^0)$, is based on $x_2^0$; the other, labeled $C^S(y \mid x_2^1)$, is based on a higher input 2 level, $x_2^1$. You can see that the long run curve seems to lie just below the short run curves, supporting them from below in a sense, or “enveloping” them. For this reason, the long run cost curve is said to be the envelope of the short run curves. (Note that the long run cost curve is the envelope of all the short run curves, not just the two illustrated!) For each short run curve, there is one $y$ for which long run and short run costs coincide; at every other $y$ the short run curve lies above the long run curve.

INSERT FIGURE 10.4 HERE
Caption of Fig. 10.4: Long run and short run total cost curves.

10.4 Profit Maximization in the Short Run

We are continuing to assume that $x_2$ is fixed at $x_2^0$. For any output level $y$, the firm chooses the technologically efficient $x_1$ which, when combined with $x_2^0$, produces $y$.
In a previous section of this chapter, we constructed the short run total cost curve $C^S(y)$, and in Figure 10.3, we graphed short run marginal cost, average total cost, and average variable cost curves in the “real-world” firm case. Keep in mind that the $MC^S(y)$, $ATC(y)$, and $AVC(y)$ curves all depend on the underlying $x_2^0$.

We define short run profit as $\pi^S(y) = py - C^S(y)$. Since $C^S(y) = FC + VC(y)$, this implies

$$\pi^S(y) = py - FC - VC(y).$$

But variable cost at $y = 0$ is zero, and therefore profit at $y = 0$ is $\pi^S(0) = -FC$. In other words, in the short run, if the firm elects to produce nothing, it loses money (in the amount $FC$). Therefore, in the short run, the firm will produce output even if it is losing money, provided the amount it is losing is less than $FC$. This implies the firm is willing to produce as long as

$$\pi^S(y) = py - C^S(y) \ge -FC.$$

Rearranging gives $py \ge C^S(y) - FC = VC(y)$, and dividing both sides of the inequality by $y$ gives $p \ge AVC(y)$, which can hold for some $y$ only if $p \ge \min AVC(y)$. That is, in the short run, the firm is willing to lose money, but it is not willing to lose more than its fixed cost, and this implies it will only operate if the market price of its output is greater than or equal to the minimum of average variable cost.

Once the firm decides it is worthwhile to stay in business, it must choose the best $y$. Here is what it does. Its profit is $\pi^S(y) = py - C^S(y)$. The first order condition for maximizing this function is

$$\frac{d\pi^S(y)}{dy} = p - \frac{dC^S(y)}{dy} = p - MC^S(y) = 0.$$

This gives $p = MC^S(y)$, or price equals short run marginal cost. This condition looks just like the one derived in Chapter 8, the single-input model, and in Chapter 9, on the long run multiple-input model. Of course in the present context, the marginal cost function referred to is short run marginal cost, contingent on $x_2^0$.
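The shutdown rule and the first order condition can be combined into a short run supply rule. The sketch below is only an illustration: the fixed cost of 10 and the cubic variable cost $VC(y) = y^3 - 4y^2 + 6y$ are hypothetical numbers chosen to give a U-shaped average variable cost curve, and the function names are not from the text.

```python
import math

FC = 10.0                              # hypothetical fixed cost


def vc(y):
    """Hypothetical variable cost with a U-shaped AVC curve."""
    return y**3 - 4 * y**2 + 6 * y


def avc(y):
    """Average variable cost VC(y)/y; minimized at y = 2."""
    return y**2 - 4 * y + 6


def mc(y):
    """Short run marginal cost, the derivative of VC(y)."""
    return 3 * y**2 - 8 * y + 6


MIN_AVC = avc(2.0)                     # minimum of AVC, equal to 2


def supply(p):
    """Short run supply: shut down if p < min AVC, else set p = MC(y)
    on the rising part of the marginal cost curve."""
    if p < MIN_AVC:
        return 0.0
    # Larger root of 3y^2 - 8y + (6 - p) = 0, where MC is increasing.
    return (8 + math.sqrt(64 - 12 * (6 - p))) / 6


def profit(p, y):
    """Short run profit p*y - FC - VC(y)."""
    return p * y - FC - vc(y)


for p in (1.0, 2.0, 6.0, 12.0):
    y = supply(p)
    print(f"p = {p:5.1f}  y* = {y:6.3f}  profit = {profit(p, y):8.3f}")
```

With these assumed numbers, a price below the minimum of average variable cost leads to zero output and a loss equal to fixed cost, an intermediate price leads to positive output with a loss smaller than fixed cost, and a high enough price leads to positive profit.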
The second order condition for maximizing profit is

$$\frac{d^2\pi^S(y)}{dy^2} = \frac{d(p - MC^S(y))}{dy} \leq 0,$$

which gives $\frac{dMC^S(y)}{dy} \geq 0$. In sum, at the profit-maximizing point, price equals short run marginal cost, and short run marginal cost is rising (or at least not falling). Again, this looks like the results we have already seen, except it’s all short run, with input 2 fixed at $x_2^0$.

Figure 10.5 below shows the short run supply curve, and other relevant information, for the profit maximizing firm. It is based on Figure 10.3. For $p < p_1$ in the figure, price is below the minimum of the average variable cost curve, and the firm shuts down (and loses $FC$). For a market price between $p_1$ and $p_2$, the firm stays open and covers some part of its fixed cost. However, its profit is negative or zero. (The firm’s loss is $FC$ if $p = p_1$, less than $FC$ if $p_2 > p > p_1$, and 0 if $p = p_2$.) For $p > p_2$, the firm is making money; it is covering all of its costs including fixed cost, and more. It has money to spare. The firm’s profit can be found graphically in a way similar to the way used in Figure 9.10 in Chapter 9.

INSERT FIGURE 10.5 HERE
Caption of Fig. 10.5: The short run supply curve of a competitive firm.

We can describe the profit-maximizing firm’s supply in the short run this way. The firm supplies nothing if the market price is below the minimum of average variable cost. However when price is above that level, the firm’s supply curve is the upward-sloping part of the $MC^S(y)$ curve. Given a market price of $p$, the firm maximizes profit by solving for the $y^*$ where price equals short run marginal cost, where short run marginal cost is rising, and where price is greater than or equal to average variable cost.

10.5 A Solved Problem

The Problem

Consider the Cobb-Douglas production function $y = f(x_1, x_2) = x_1^{1/2} x_2^{1/2}$. Assume the input prices are $w_1 = 1$ and $w_2 = 2$.
Assume the input 2 quantity is fixed in the short run at $x_2^0 = 9$. Assume the output price is $p = 2$.

(a) Find the short run production function $f(x_1, x_2^0)$, the short run average product function, and the short run marginal product function.

(b) Find fixed cost $FC$, the variable cost function $VC(y)$, the average variable cost function $AVC(y)$, and the short run total cost function $C^S(y)$.

(c) Find short run marginal cost $MC^S$. Find the profit-maximizing output. Find the short run profit level for the firm. Is the firm’s profit positive or negative? Explain.

The Solution

(a) To get the short run production function, we just substitute the constant $x_2^0 = 9$ for the variable $x_2$. This gives $f(x_1, x_2^0) = 3x_1^{1/2}$. To get short run average product, we divide $f(x_1, x_2^0)$ by $x_1$, or

$$AP(x_1) = \frac{3x_1^{1/2}}{x_1} = 3x_1^{-1/2}.$$

Short run marginal product is found by differentiating the short run production function, which gives

$$MP(x_1) = \frac{d(3x_1^{1/2})}{dx_1} = \frac{3}{2}x_1^{-1/2}.$$

(Note that we have written the derivative with $d$ symbols instead of $\partial$ symbols to avoid confusion.)

(b) Fixed cost is $FC = w_2 x_2^0 = 2 \times 9 = 18$. To find the variable cost function, we will need the inverse production function. Therefore we invert the short run production function $y = f(x_1, x_2^0) = 3x_1^{1/2}$. This gives $x_1(y) = y^2/9$. The variable cost function is now

$$VC(y) = w_1 x_1(y) = 1 \times \frac{y^2}{9} = \frac{y^2}{9}.$$

Average variable cost is $VC(y)/y$, which gives $AVC(y) = (y^2/9)/y = y/9$. Short run total cost is

$$C^S(y) = FC + VC(y) = 18 + \frac{y^2}{9}.$$

(c) To get short run marginal cost, we differentiate the short run cost function $C^S(y)$. This gives

$$MC^S = \frac{d(18 + y^2/9)}{dy} = \frac{2y}{9}.$$

Note that short run marginal cost is an always-increasing function of $y$. To find the short run profit-maximizing output $y^*$, we first set $MC^S(y) = 2y/9 = p = 2$ and solve. This gives $y^* = 9$. Next we check to see that price is greater than or equal to average variable cost.
This gives $p = 2 \geq AVC(y^*) = y^*/9 = 9/9 = 1$, and it checks out. The profit for the firm is

$$\pi^S(y^*) = py^* - C^S(y^*) = 2 \times 9 - (18 + 9^2/9) = 18 - (18 + 9) = -9.$$

The firm’s profit is negative, but it doesn’t fold in the short run because if it produced nothing, its profit would be $\pi^S(0) = -FC = -18$.

Exercises

1. Let $f(x_1, x_2) = \left(243 + \frac{1}{3}(x_1 - 9)^3\right)x_2$.
(a) Calculate $AP(x_1|1)$ and $MP(x_1|1)$. (This assumes that input 1 is the variable input and input 2 is fixed, as we assumed in most of this chapter, with $x_2^0 = 1$.)
(b) Calculate $AP(x_2|x_1^0)$ and $MP(x_2|x_1^0)$. (This assumes that input 2 is the variable input and input 1 is fixed.)
(c) Can you see from the comparison how these two inputs play very different roles in this technology?

2. Consider a profit-maximizing firm with a decreasing returns to scale production function $y(x_1, x_2)$ and input 2 fixed at $x_2^0$. Explain what happens to the conditional factor demand for input 1, $x_1^*$, and profit, $\pi$, in each of the following cases.
(a) The price of input 1, $w_1$, rises.
(b) The price of input 2, $w_2$, falls.
(c) The price of the output, $p$, rises.

3. Recall the production function from Chapter 9, Exercise 3, $y = x_1^{1/4} x_2^{1/4}$. The prices of the inputs are $w_1 = 1$ and $w_2 = 2$. Assume that the amount of input 2 is fixed at one unit. The firm must of course pay for that one unit, but it cannot increase or decrease $x_2$.
(a) Find the short run cost function $C^S(y)$.
(b) Show that the supply curve for the firm, given this constraint, is $y^*(p) = (p/4)^{1/3}$.

4. Consider the production function $y = LK/10$, where $L$ is labor and $K$ is capital. (This is from Chapter 9, Exercise 4.) The factor prices are $w_L = 10$ and $w_K = 100$. Suppose the amount of capital, $K$, is fixed at 1 unit.
(a) Derive the short run cost function $C^S(y)$.
(b) Derive and graph the average total cost function $ATC(y)$, the average variable cost function $AVC(y)$, and the short run marginal cost function $MC^S(y)$. What is the firm’s short run supply curve? (For simplicity, assume $y > 0$.)

5. A cake baker bakes cakes. His short run cost function is $C^S(y) = 100 + 10y - 2y^2 + y^3$, where $y$ is the number of cakes.
(a) Derive and graph his average total cost, average variable cost, and marginal cost curves.
(b) What is his short run supply curve?

6. Explain why a profit-maximizing firm might choose to produce output even though it is making negative profit by doing so.

Part III Partial Equilibrium: Market Structure

11 Perfectly Competitive Markets

11.1 Introduction

In this chapter, we put together consumers interested in buying a good and firms interested in selling the good. We will start out by describing what we mean by perfect competition; this requires price-taking behavior by all parties, homogeneous goods, perfect information, and free entry and exit in the long run. We will derive industry supply curves in the short run and in the long run. With consumers’ actions aggregated into an industry demand curve, and firms’ actions aggregated into an industry supply curve, we will discuss excess demand and excess supply. Then we will describe the competitive market equilibrium.

Next we will turn to the welfare properties of the market equilibrium. We will define producer’s surplus for a single firm and producers’ surplus for all the firms in the market. We will show how the competitive market equilibrium maximizes social surplus, that is, the sum of consumers’ surplus and producers’ surplus. Finally we will analyze the deadweight loss, or loss in social surplus, created by a per-unit tax on the good being sold in the market.
We will use the idea of the market demand curve, developed in Chapter 4, and the idea of consumers’ surplus, developed in Chapter 7. We will also extend the welfare economics analysis of Chapter 7, but this time with an eye on both the consumers and the producers of the good.

11.2 Perfect Competition

As in Chapters 8 through 10, we are focusing on competitive firms. A firm is competitive if it takes prices as given, that is, beyond its control. A market is competitive if all the agents in that market (that is, all the buyers and all the sellers) take prices as given, that is, beyond their control. The idea of competitive markets is fundamental in economic thinking, but it is of course an ideal, and the reality is often different. It is useful to study an ideal even if the reality differs. For example, scientists analyze the motion of objects falling in a vacuum, even though true vacuums are rare on the surface of the earth. Where would Newton’s laws be if he had to carefully incorporate in his equations the effects of atmospheric and other frictions? Let’s try to justify the assumption of a competitive market by briefly describing the properties that result in competition. Perfectly competitive markets typically have these features:

1. Price-taking behavior. Each firm is so small, compared to the total market, that whatever quantity it sells, it does not affect the market price. For instance, suppliers of commodities like wheat, corn, heating oil, crude oil, gold, and silver are mostly so small that they do not influence world market prices. (However, decisions of Exxon-Mobil may affect oil prices, and the Hunt brothers of Texas once tried to corner the world market for silver.) The polar opposite of price-taking behavior is the behavior of a monopolist, a firm which is the one-and-only supplier of a good.
For instance, a pharmaceutical firm producing a patented drug, for which there are few or no close substitutes, knows very well that it can sell many units if it charges a low price, and fewer units if it charges a higher price; it knows the downward-sloping demand curve it faces, and it finds the most profitable price/quantity combination, rather than taking the price as constant and solving for the most profitable quantity.

2. Homogeneous goods. The good that each firm produces is identical to the good that the other firms produce. For instance, the farmer producing number 2 corn, or winter wheat, is selling a product that is indistinguishable from that sold by thousands of other farmers. If what you produce is identical to what hundreds or thousands of other firms produce, you are unlikely to be able to sway the price; if you charge a penny more per bushel for your corn, you won’t be able to sell any of it. On the other hand, the producer of Coca-Cola is selling a product that is somewhat different from Pepsi-Cola or RC-Cola, or generic grocery-store cola. And so the Coca-Cola company can vary its price and still have plenty of sales. It will therefore search for the most profitable price/quantity combination, rather than taking its price as constant.

3. Perfect information. The standard economic model assumes that every buyer and every seller in the market has perfect information about all the relevant facts. All the buyers and all the sellers know the market price and the characteristics of the good being bought and sold. This assumption might be violated, for example, in labor markets, where workers might know things about their productivity that the firms that employ them do not know.

4. Free entry and exit in the long run. Recall that in Chapters 9 and 10 we made much of the distinction between the short run and the long run.
For purposes of the theory of the firm, the short run is a period of time so short that a firm is unable to vary one of its inputs. The long run is a period of time long enough that all input levels can be freely varied by that firm. This implies, among other things, that short run cost functions have a fixed term and a variable term, whereas long run cost functions have no fixed term. It also implies that the long run cost of producing zero units of output is zero, whereas the short run cost of producing zero is positive. In this chapter, we introduce a different type of distinction between the short run and the long run, that is quite independent of the issue of whether or not a firm has time to vary the level of input 2. This distinction looks at the composition of the set of firms producing a good. We know that in reality, firms enter and exit industries, and that entering or exiting a business takes time. A feature of a perfectly competitive market is that, with enough time, entering and leaving a market are possible at a negligible cost. That is, in a competitive industry where firms make positive profits, there are no barriers to entry to protect those profits. Similarly, in a competitive industry where firms make losses year after year, firms shut down. For purposes of this chapter, when we are referring to the set of firms producing a good, the short run is a period of time so short that firms are unable to enter or leave the industry, and the long run is a period of time long enough that firms can freely enter or leave the industry. For example, consider children setting up stands to sell cups of lemonade on a hot summer day in a nice residential neighborhood. With mom’s help, they can put up a table on the sidewalk in 20 minutes, and they can bring out the lemonade and cups in 10 minutes. The short run is around 30 minutes; a couple of hours is long run. Alternatively, consider power companies building nuclear generating facilities in the United States. 
Short run might be 10 years, and long run might be 50 years. Of course, as with the nuclear generating facilities, the time scale for entering or exiting the industry may be very much a consequence of various government policies. In any case, in the long run, when there is sufficient time for competing firms to enter or exit an industry, it is more likely that firms in the industry will be competitive, and that they will have to take the price as given. Pharmaceuticals again provide an example: take the short run as a period of time during which a drug formula is protected by a patent, and the long run as a period of time longer than the patent duration. In the long run, producers of a particular drug formula will have generic competition, and firms producing the drug will be much more likely to have to accept the market price.

11.3 Market/Industry Supply

We begin by deriving the short run industry supply curve. By “short run” in this chapter, we only mean that the number of firms is fixed; that is, there is insufficient time for firms to enter or leave the industry. We do not mean “short run” in the sense that a particular firm does not have time to vary one of its inputs; that is a separate issue. In fact, for the sake of simplicity, we will illustrate with average cost ($AC(y)$) and marginal cost ($MC(y)$) curves, rather than with (short run) average variable cost ($AVC(y)$) and short run marginal cost ($MC^S(y)$) curves from the last chapter.

So how is industry supply determined? The answer is very easy in the short run, when the set of firms doesn’t change. Simply add the separate supply curves of the various firms in the industry; for each price $p$, add up the desired amounts the various firms want to supply. In other words, add the supply curves horizontally. We will illustrate with some examples.

Example 1. Two firms with different cost curves.
Figure 11.1 below has three panels; the first is for firm 1, the second is for firm 2, and the third is for the market. We let $y_1$ and $y_2$ represent the output quantities of firms 1 and 2, respectively. In the firm 1 panel, marginal cost ($MC_1(y_1)$) and average cost ($AC_1(y_1)$) curves are shown. A horizontal line at $p_1$ is drawn at the point where marginal cost crosses the minimum of average cost. For prices below $p_1$, firm 1 wants to supply zero. For prices at or above $p_1$, it wants to supply the amounts indicated by the $MC_1(y_1)$ curve. Firm 1’s supply curve is shown in bold. The panel for firm 2 is similar, with a horizontal line at $p_2$ where firm 2’s marginal cost curve crosses the minimum of average cost. In the market panel, the supply curves of the two firms have been added together horizontally, and the market supply is $y = y_1 + y_2$.

INSERT FIGURE 11.1 HERE
Caption of Fig. 11.1: Short run industry supply with an industry comprising two firms.

We see that there is a discontinuity of industry supply at the price $p_1$, and another one at the price $p_2$; these are the minima of the average cost curves of the two firms. That is, there are jumps in market supply $y$ at each of these prices. If there are many similar firms, of course, the jump in market supply caused when one firm starts producing may be negligible.

Example 2. A large number of identical firms. Suppose the minimum of each firm’s average cost curve is at $p^*$, the same for all firms. All the firms will supply nothing if the market price $p < p^*$. If $p \geq p^*$, each firm will go to its marginal cost curve to determine how much to supply. Let’s assume there are 100 firms. The left hand panel in Figure 11.2 below shows the marginal cost ($MC_i(y_i)$) and average cost ($AC_i(y_i)$) curves for firm $i$. Firm $i$’s supply curve is shown in bold. Since all the firms are the same, we will write $y_i(p) = y^*(p)$ for the $i$-th firm’s supply curve.
The right hand panel in the figure is the market supply curve. It is of course zero for $p < p^*$. For $p \geq p^*$, it is $y = \sum_{i=1}^{100} y_i(p) = 100\,y^*(p)$.

INSERT FIGURE 11.2 HERE
Caption of Fig. 11.2: The short run industry supply (identical firms).

(As we noted in Section 11.2 above, the short run/long run distinction in the theory of the firm is different from the short run/long run distinction in the theory of competitive markets. In the theory of the firm, short run means that one or more input levels are fixed; in the theory of competitive markets, short run means that the number of firms in the market is fixed. If, however, the time frames were the same, so that “short run” meant that for each firm, some input levels were fixed and the number of firms in the market were fixed, then in Figures 11.1 and 11.2 above, we would simply replace $AC_i$ with $AVC_i$, and $MC_i$ with $MC_i^S$.)

Deriving the long run industry supply curve is slightly more complicated, because in the theory of competitive markets, “long run” means that the number of firms and the identity of firms in the industry are not fixed. In the competitive market in the long run, there is free entry and exit. When existing firms are making profits, other firms will enter the market; conversely, when existing firms are losing money, they will leave the market. So the number of firms in the market will vary, rising if incumbent firms are making positive profits, and falling if incumbent firms are making losses.

We now suppose that different firms, either in the industry or potentially in it, have different technologies and therefore different average and marginal cost curves. Every firm’s average cost curve has a minimum, and those minima will generally vary among the firms. Let $p^*$ represent the smallest of the average cost minima. In the long run, there is free entry and exit by firms.
Therefore, a firm will not be in the industry if the market price is below the minimum of its average cost curve; that is, if its profits are negative. Conversely, a firm will be in the industry if the market price is at or above the minimum of its average cost curve; that is, if its profits are greater than or equal to zero.

Now suppose, hypothetically, that the market price is below $p^*$. Then every firm that might potentially produce this good opts out of the business. Market supply is zero. Next, suppose the market price is exactly equal to $p^*$. Then the firm (or firms) whose minimum average cost equals $p^*$ is in business, and producing the quantity that gives that average cost. This firm (or firms) has profits of zero. Finally, suppose the market price is $p > p^*$. Then the firm (or firms) whose minimum average cost equals $p^*$ is in business, and making positive profits. Moreover, every firm with minimum average cost above $p^*$ and up to $p$ is also in business, and making positive (or at worst zero) profits. Each one of the firms in the market will produce that quantity $y$ for which its marginal cost equals the price $p$.

All of this will give rise to a generally upward-sloping market supply curve. At the lowest price for which there is greater-than-zero supply, namely $p^*$, only the firm (or firms) with minimum average cost equal to $p^*$ is operating. As the price moves higher, the firm already in the market produces more of the output good. Then the price rises high enough for the next firm to enter the market, that is, the firm with the next-to-lowest minimum average cost. This produces a step in the supply function. Then another firm moves in, producing another step. If the firms are relatively small compared to the size of the market, these steps may be small; if the firms are relatively large, the steps are large. We show a market supply curve produced by this kind of reasoning in Figure 11.3 below.

INSERT FIGURE 11.3 HERE
Caption of Fig.
11.3: Long run industry supply with three potential firms with different cost curves.

The generally upward-sloping long-run market supply curve shown above looks the way it does because we assumed that the firms which are actually in the industry, and those which are potentially in the industry, have different average and marginal cost curves. Sometimes it is appropriate to assume otherwise. In particular, let us now consider the special case where all firms in the industry, or potentially in the industry, have identical average and marginal cost curves. This means that they all have equal access to the same technologies, the same information, and so on. In the real world, there are many factors that make firms different, including ownership of patents, copyrights and trademarks, different stores of knowledge and technical expertise, different locations, different managers, and different histories. We are now assuming away those differences.

In this case, all the firms in the industry, and all potential firms in the industry, have the same cost curves. The minimum average cost is the same for all, at, say, $p^*$. At any market price $p < p^*$, every firm will choose, in the long run, to leave the industry. Market supply will be zero. At any market price $p > p^*$, every firm will have positive profits, and more and more firms will enter the industry, without limit. (Consider that if you owned one of the firms, you’d be making $X$ dollars in profit, and you would think to yourself: I shall open a second identical firm, and make $2X$, and then a third identical firm, and make $3X$, and so on.) Therefore at any market price $p > p^*$, in the long run, market supply will be unlimited, or plus infinity. And finally, at the price $p^*$, market supply can be anything, depending on the number of firms choosing to operate (and earn zero profits).
In short, in the case of a market with many identical firms, with minimum average cost for each firm equal to $p^*$, the market supply curve is a horizontal line at $p^*$.

Finally, note that in the long run, when the market price is $p^*$, each active firm in the industry will be producing where its marginal cost equals its average cost equals $p^*$. It will be making zero profits. (Zero profits does not mean zero returns to capital, and so the zero-profit firm may be paying dividends to shareholders and interest to creditors.) The price for each unit of output will exactly equal the marginal cost of the unit (implying no profit at the margin, a consequence of profit maximization), and will also exactly equal the average cost of producing all the units the firm is producing (implying no profits overall). That is, the absence of barriers to entry in this long run perfectly competitive market wipes out all positive economic profits. In Figure 11.4 below, we show the typical firm’s average and marginal cost curves, as well as its supply curve in the left panel. In the right panel, we show the market supply curve.

INSERT FIGURE 11.4 HERE
Caption of Fig. 11.4: Long run industry supply when all the firms are identical.

11.4 Equilibrium in a Competitive Market

A competitive market has buyers and sellers, who are looking to buy or sell a good at the market price. The buyers are utility-maximizing consumers, whose various demands for the good have been aggregated into a market demand curve, which we’ll call $D(p)$. The sellers are profit-maximizing firms, whose desired sales have been aggregated into a market supply curve, which we’ll call $S(p)$. To put it another way, for a given price $p$, $D(p)$ represents the total amount that the various consumers want to buy, and $S(p)$ represents the total amount that the various firms want to sell.
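To make $D(p)$ and $S(p)$ concrete, here is a minimal sketch with hypothetical linear curves; all the numbers are assumptions for illustration, not taken from the text. Market supply is the horizontal sum of 100 identical firms, each with an assumed marginal cost $MC(y) = 2y$, so each firm supplies $p/2$. The illustrative function `excess_demand` reports the gap $D(p) - S(p)$, and a simple bisection looks for the price at which that gap is zero.

```python
N_FIRMS = 100  # hypothetical number of identical firms


def firm_supply(p):
    # Assumed marginal cost MC(y) = 2y, so p = MC gives y = p / 2.
    return max(p, 0.0) / 2.0


def market_supply(p):
    # Horizontal sum: add the quantities all firms supply at price p.
    return N_FIRMS * firm_supply(p)


def market_demand(p):
    # Hypothetical linear market demand D(p) = 600 - 10p (never negative).
    return max(600.0 - 10.0 * p, 0.0)


def excess_demand(p):
    return market_demand(p) - market_supply(p)


def clearing_price(lo=0.0, hi=60.0, iterations=60):
    # Bisection: excess demand is positive at lo and negative at hi.
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if excess_demand(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


p_star = clearing_price()
print(excess_demand(8.0), excess_demand(12.0), p_star, market_supply(p_star))
```

With these assumed curves, buyers want 120 more units than sellers offer at a price of 8, sellers offer 120 more than buyers want at a price of 12, and the two quantities match at a price of about 10, where roughly 500 units change hands.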
If the aggregate demand is greater than the aggregate supply, or $D(p) > S(p)$, then some consumers will not be able to buy the quantities they planned to buy at the market price. They will be disappointed, frustrated, unhappy, and unable to purchase the good. This is called excess demand. In a market with excess demand, frustrated buyers will attempt to get what they want by bidding up the price, and sellers, seeing that there are unsatisfied buyers who couldn’t get what they wanted, will likely raise their prices.

If the aggregate demand is less than the aggregate supply, or $D(p) < S(p)$, then some sellers will not be able to sell the quantities they planned to sell at the market price. They will be disappointed, frustrated, unhappy, and unable to get rid of the good. This is called excess supply. In a market with excess supply, frustrated sellers will attempt to sell what they have on hand by lowering the price, and buyers, seeing that there are unsatisfied sellers who couldn’t sell what they wanted, will be likely to try to buy at a discounted price.

If the aggregate demand is equal to the aggregate supply, or $D(p) = S(p)$, then the number of units of the good that the various sellers want to sell is exactly equal to the number of units that the various buyers want to buy. Every party in the market can buy or sell the good, exactly as planned. We say that the market clears. There are no frustrated buyers or sellers, and no one has an obvious incentive to raise or lower the price. This is called a market equilibrium.

We can have a market equilibrium in the short run (with insufficient time for firms to enter or exit), or in the long run. Figure 11.5 below shows a short run equilibrium, with an upward-sloping market supply curve $S(p)$. Note that this figure does not include the steps that occur as different firms opt to start supplying positive quantities of the good.
In the figure, there is equilibrium at price equal to $p^*$ and market quantity (supplied and demanded) equal to $y^*$. At a lower price $p_1$, there is excess demand, and at a higher price $p_2$, there is excess supply.

INSERT FIGURE 11.5 HERE
Caption of Fig. 11.5: Competitive equilibrium in the short run.

Now let’s consider a competitive equilibrium in the long run, in that special case where all firms in the industry, or potentially in the industry, have exactly the same average and marginal cost curves, with minimum average cost at $p^*$. As in Figure 11.4, the long run industry supply curve is a horizontal line at $p^*$. The market equilibrium is shown in Figure 11.6 below. Note how if the market price were other than $p^*$, the market could not be in equilibrium. There would be excess demand for $p < p^*$, or excess supply for $p > p^*$. In these cases, either consumers want to buy more than firms are willing to supply, or firms end up with unsold units.

INSERT FIGURE 11.6 HERE
Caption of Fig. 11.6: Long run competitive equilibrium, when all the firms are identical.

In a competitive industry in the long run, when all the firms are identical, the price of the good is entirely determined by technological considerations. Each firm is operating at the level where average cost is minimized and equal to $p^*$, and each firm is making zero profits. The demand side determines only the number of units sold, and therefore the number of firms in the industry. (There is a small complication because, in reality, the number of firms should be an integer. But this is of minimal importance when the market equilibrium quantity is so large that there are many firms, and we will leave it to the interested reader to work out the details when there are only a few firms in equilibrium.)

11.5 Competitive Equilibrium and Social Surplus Maximization

Recall that in Chapter 7, we discussed the ideas of consumer’s surplus and consumers’ surplus. Note the locations of the apostrophes!
One consumer’s benefit from being able to buy a good at price $p$ can, under certain conditions, be properly measured as the area under his demand curve and above the horizontal line at $p$. If the consumer is buying a single unit of the good, the surplus is very intuitive: it is equal to his (maximum) willingness to pay for the item, less what he actually pays. If he is buying many units, these increments can be added up, unit by unit, to the quantity he actually buys. This adding up of surplus amounts, unit by unit, implies that his consumer’s surplus is the area under his demand curve and above $p$.

We also found in Chapter 7 that under the assumption of quasilinear preferences, the consumer’s surpluses of various consumers can be accurately calculated, and can be accurately aggregated into consumers’ surplus. Consumers’ surplus can then be used to measure the aggregate net benefit, in dollars, to all the consumers in a market who are buying a good at price $p$. It is, roughly speaking, aggregate willingness to pay for all the units consumed, less the total amount actually paid. For the group of consumers in a market, consumers’ surplus is measured by starting with the market demand curve, and calculating the area under the demand curve but above the horizontal line at the price $p$, from quantity zero to quantity $D(p)$, the market demand at price $p$.

There is a similar methodology for measuring the aggregate benefit to all the sellers of a good in a market, who are getting a market price $p$. The firms that produce the good measure their own benefits as profits, in dollars. Therefore, there is no conceptual difficulty in adding together the benefits of firm $i$ and firm $j$, as there was when we wanted to add together the benefits of consumers $i$ and $j$, who have (non-comparable) utility functions.
We will now define a measure of the benefit to a firm, or a group of firms, of being able to produce and sell a good at a market price $p$, and that measure will be based on profit, almost. We say “almost” because we want a measure that can be seen on a graph with market supply and market demand curves. We know that the market supply curve is the horizontal sum of the supply curves of various firms, and we know that, for a single firm, the supply curve is just its marginal cost curve, above the minimum of its average cost curve. We want to use an area under a marginal cost curve when we figure a firm’s costs and therefore its profit, so that we can relate profit to the market supply curve. The simple way to do this is to first note that for firm $i$ producing output $y_i$, profit is

$$\pi_i = \text{Revenue} - \text{Cost} = py_i - C_i(y_i).$$

Next, note that the area under firm $i$’s marginal cost curve is

$$\int_0^{y_i} MC_i(y)\,dy = C_i(y_i) - C_i(0),$$

where $C_i(0)$ is firm $i$’s fixed cost. Therefore, firm $i$’s profit is given by

$$\pi_i = py_i - \int_0^{y_i} MC_i(y)\,dy - C_i(0).$$

We will define producer’s surplus for firm $i$ as

$$PS_i = py_i - \int_0^{y_i} MC_i(y)\,dy.$$

It follows that $\pi_i = PS_i - C_i(0)$ or $PS_i = \pi_i + C_i(0)$.

What does producer’s surplus mean graphically? Producer’s surplus is revenue $py_i$, minus the area under the firm’s marginal cost curve from zero to $y_i$. Revenue is the area under the horizontal line at height $p$, from zero to $y_i$. Producer’s surplus is then the area below $p$ but above marginal cost, from zero to $y_i$. This is the graphical measure we will use for finding firm $i$’s benefit. Note that it equals firm $i$’s profit plus a constant, the constant being firm $i$’s fixed cost, that is, its cost at output level zero.
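As a quick numerical check of the relation $PS_i = \pi_i + C_i(0)$, the sketch below reuses the cost function from the solved problem in Section 10.5, $C^S(y) = 18 + y^2/9$, with $p = 2$ and $y = 9$. Treating that short run cost function as the firm’s $C_i(y_i)$ is an assumption made here for illustration, and the function names and the midpoint-rule integration are illustrative choices as well.

```python
def marginal_cost(y):
    # Short run marginal cost from Section 10.5: MC^S(y) = 2y / 9.
    return 2.0 * y / 9.0


def total_cost(y):
    # Short run total cost from Section 10.5: C^S(y) = 18 + y^2 / 9.
    return 18.0 + y**2 / 9.0


def area_under_mc(y_max, steps=10_000):
    # Midpoint-rule approximation of the integral of MC from 0 to y_max.
    width = y_max / steps
    return sum(marginal_cost((k + 0.5) * width) for k in range(steps)) * width


p, y = 2.0, 9.0  # price and profit-maximizing output from Section 10.5
producer_surplus = p * y - area_under_mc(y)
profit = p * y - total_cost(y)
fixed_cost = total_cost(0.0)
print(producer_surplus, profit, profit + fixed_cost)
```

Here producer’s surplus comes out at about 9 even though profit is -9, and the difference is the fixed cost of 18.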
Note also that if $C_i(0) = 0$, then producer’s surplus is exactly the same as profit; this would be the case in what we have described as the long run situation for a firm, the time horizon over which the firm can vary all its inputs, and set them all equal to zero when it is producing zero output.

Figure 11.7 below shows producer’s surplus for firm i in the special case where the marginal cost curve $MC_i(y_i)$ is a straight line from the origin. Note that the area of the cross-hatched rectangle $p y_i^*$ is the firm’s revenue, and the area under the marginal cost curve equals cost at $y_i^*$ minus cost at zero. Producer’s surplus is the area below the horizontal line at height p but above the marginal cost curve. It is equal to profit plus the constant $C_i(0)$, firm i’s fixed cost.

INSERT FIGURE 11.7 HERE
Caption of Fig. 11.7: Revenue, producer’s surplus, and the area under the firm’s supply curve.

In the real-world case of a U-shaped marginal cost curve, the supply curve for the firm is not exactly the marginal cost curve; it is zero up to the point where average cost equals marginal cost, and then it coincides with marginal cost. (Look back at Figures 11.1 and 11.2.) This complicates the relationship between producer’s surplus and profit slightly, but does not change it in a fundamental way. We will explore this complication in an exercise.

To go from producer’s surplus to producers’ surplus, we consider a market with various firms supplying the good, which sells at the market price p. The market supply curve S(p) is constructed by horizontally adding the supply curves, that is, the marginal cost curves, of the various firms. For the given price p, we could figure each firm’s producer’s surplus separately, and then add them all together. This sum would then represent aggregate profit, plus the various $C_i(0)$ constants for the various firms.
Or, equivalently, we can take the market supply curve, draw the horizontal line at height p going through it, and then take producers’ surplus as the area below the horizontal line at height p but above the market supply curve.

In Figure 11.8 below, we provide two panels. Both are somewhat similar to Figure 11.5, in the sense that they show a downward-sloping market demand curve D(p) and an upward-sloping market supply curve S(p). In the left panel, the market price is at $p^*$, the equilibrium price at which supply equals demand, and there are no frustrated buyers or sellers in the market. In this panel, the consumers’ surplus area and the producers’ surplus area are cross-hatched, and are identified as C.S. and P.S., respectively. In this market, the aggregate net benefit to the various consumers, who are able to buy the good they consume at price $p^*$ but who are willing to pay more, is C.S. The aggregate net benefit to the various firms, equal to aggregate producers’ surplus, or, roughly speaking, aggregate profit, is P.S. The sum of the two areas, therefore, represents total net benefit to society (except for the constants, the fixed costs) that results from the existence of this market for this good. This is called the social surplus.

The right panel of Figure 11.8 shows what would happen in this market if all transactions had to be made at a price $p^{**}$, higher than the equilibrium price. (Imagine, for example, that the government passed a law that made it illegal to buy or sell the good at any price lower than $p^{**}$.) The number of units sold would then be $y^{**} < y^*$, the lesser of the amounts supplied and demanded at the non-equilibrium price. Some units might be produced but not sold. (This assumes that the law did not also force unwilling consumers to buy more units than they want at the high price $p^{**}$.) The right panel also shows producers’ surplus and consumers’ surplus, given the new (non-equilibrium) situation.
In fact, the area identified as producers’ surplus in the right panel may be an overestimate of the real producers’ surplus: in this situation, it is possible that some of the units actually sold may have marginal costs higher than the height of S(p) at $y^{**}$. Note that the panels are based on identical S(p) and D(p) curves. A quick examination of the two panels should convince the reader that the total net benefit to society, or the social surplus, measured as the sum of consumers’ surplus and producers’ surplus, is greater in the competitive equilibrium (left panel). To put this another way, the fact that the market is forced to operate at a non-equilibrium price $p^{**}$ results in a loss to society. That loss is identified in the right panel. Note that it is a triangular area below the demand curve and above the supply curve, from $y^{**}$ to $y^*$. This triangular area is called the deadweight loss triangle or, for short, simply the loss triangle. This may underestimate the real loss, since, as mentioned above, the area identified as producers’ surplus in the figure may be an overestimate of the real producers’ surplus.

INSERT FIGURE 11.8 HERE
Caption of Fig. 11.8: The competitive equilibrium, and producers’ surplus, consumers’ surplus, and deadweight loss when the price is too high.

Figure 11.8 suggests an extremely important and remarkable result. The competitive market where the price is allowed to find its equilibrium, and where buyers and sellers engage in trade in order to individually maximize their utilities or profits, will result in a maximum net benefit to society as a whole. Markets where the price is prevented from finding its natural equilibrium will show a deadweight loss. This reflects the position of free-market economists since the time of Adam Smith (1723-1790). The position is this: let the market operate, with each person seeking to maximize his own benefit, ignoring the welfare of others, and the outcome will actually be best for society.
This beautiful and amazing result constitutes the basis for many recommendations made by market-oriented economists. Of course it depends on a number of assumptions, some of which have been explicit and some of which have been implicit. (Among our explicit assumptions are the assumptions of competitive behavior, and quasilinearity for consumers’ utility functions. Among our implicit assumptions are the assumptions that all the buyers and sellers are knowledgeable and rational, and that there are no market failures of the types described in Chapters 17, 18, and 20 below, on externalities, public goods, and asymmetric information, respectively.)

We conclude this section with another intuition. If a consumer values a unit of a good at more than the market price p, he will benefit if he can consume another unit. If a producer can produce another unit at less than the market price p, he will profit if he can sell another unit. In a free competitive market, the equilibrium price adjusts so that all trades with possible gains (all the instances where a producer can produce another unit of the good for an amount of money less than what some consumer is willing to pay) are carried out. At the final “marginal” trade, the willingness to pay is just $p^*$, and the marginal cost of the unit is $p^*$; the net benefit to society is zero for that last unit traded. All trades with positive net benefit have been made. In a market with a dictated non-equilibrium price, however, there are potential trades which would result in net gains to society, which can never take place.

11.6 The Deadweight Loss of a Per Unit Tax

In our chapters on the theory of the consumer, we explored some of the effects of taxes on the consumer’s behavior. Now we will explore how taxes affect a market equilibrium. We will use the consumers’ surplus and producers’ surplus measures, and we will see how per unit taxes affect welfare.

Effects of a per unit tax in the short run.
Suppose there is a market for a good, and a short run equilibrium in that market. By short run we only mean that the set of firms in the industry is fixed; there is insufficient time for entry and exit. Assume the government steps in and imposes a per unit tax of t on that good. Sometimes when a government imposes a tax, the buyer is legally responsible for paying it; at other times the seller is legally responsible. (Sometimes they have to split the tax and each pay half; this is the structure for Social Security and Medicare taxes in the U.S.) If the buyer is legally responsible for the tax, the seller will sometimes act as the collection agent, collecting the tax and sending it on to the tax authorities. State sales taxes in the U.S., on items sold in local stores, are legally structured this way. Sometimes the buyer is legally responsible for the tax, but the seller doesn’t have to act as a collection agent. State sales taxes on items sold over the internet are often set up this way. (In many states residents who buy online from a vendor without a physical presence in the state, such as Amazon.com, are supposed to report their purchases and pay the taxes themselves directly to the state. Of course, almost nobody does!)

We will see below that the party who is legally responsible for the tax may or may not be the party that actually ends up paying it, even if the parties adhere strictly to the law. Who bears the burden of the tax will depend on the slopes of the demand and supply curves, or the elasticities of demand and supply. Most importantly, we will see that the introduction of a per unit tax creates a deadweight loss.

We start with Figure 11.9 below, which shows an upward-sloping supply curve and a downward-sloping demand curve, as well as a no-tax market equilibrium. For simplicity of computation, we use linear supply and demand curves; supply is given by $S(p) = -\alpha_0 + \alpha_1 p$ and demand is given by $D(p) = \beta_0 - \beta_1 p$.
The alphas and betas are all positive constants. The market equilibrium is where the supply and demand curves intersect; the market equilibrium price is $p^*$ and the market equilibrium quantity is $y^*$. The figure also shows consumers’ surplus C.S. and producers’ surplus P.S. Note that social surplus is maximized, and there is no deadweight loss triangle. Also note the vertical intercepts of the supply and demand curves. To find either one, set S(p) or D(p) equal to zero and solve for p. This gives $\alpha_0/\alpha_1$ as the intercept of the supply curve, and $\beta_0/\beta_1$ as the intercept of the demand curve.

INSERT FIGURE 11.9 HERE
Caption of Fig. 11.9: The market equilibrium before the introduction of a per unit tax.

Now assume the government imposes a per unit tax t. Assume the law specifies that the producers are legally liable for the tax. Now if the producers sell a unit of the good for a price p, they will actually end up with $p - t$, since they have to send t to the government. The result is that the market supply curve gets shifted; if p is the market price, the amount supplied will be

$$S(p) = -\alpha_0 + \alpha_1(p - t) = -\alpha_0 + \alpha_1 p - \alpha_1 t.$$

The new (post-tax) supply curve has the same slope as the old one since $dS(p)/dp$ is still equal to $\alpha_1$. That is, the new supply curve is parallel to the old (pre-tax) supply curve. However it is higher by the tax t, because in order to get firms to supply any particular quantity, you must pay them t more per unit than when there is no tax. (More formally, for the no-tax supply curve, the intercept on the vertical axis is $\alpha_0/\alpha_1$. For the with-tax supply curve, the intercept on the vertical axis, found by solving for p when $S(p) = 0$, is $\alpha_0/\alpha_1 + t$.)

In Figure 11.10, we show what happens to the market when the per unit tax is introduced. The new supply curve is shown, parallel to the original supply curve, but displaced upward by t.
There is now a new consumers’ surplus triangle, which is smaller than the original no-tax consumers’ surplus triangle. There is a new producers’ surplus triangle, also smaller than the original no-tax producers’ surplus triangle. So the tax makes both consumers and producers worse off, even though legally this is only a tax on producers. But there is a new benefit-to-society factor here, and that is government revenue from the tax. In dollar terms, it equals the area of the rectangle DFGH. All things considered, the net benefit to society is now given by the total of consumers’ surplus, producers’ surplus, and government revenue. But this total is less than the social surplus when there is no tax, and is less by the deadweight loss triangle DEF.

INSERT FIGURE 11.10 HERE
Caption of Fig. 11.10: The market equilibrium after the introduction of a per unit tax.

Introducing the per unit tax creates a deadweight loss, because it creates a gap between what the producers receive per unit ($p - t$) and what the consumers pay per unit (p). There are consumers who value extra units of the good at more than $p - t$ but less than p, and there are producers who could produce extra units of the good profitably at a cost more than $p - t$ but less than p. Unfortunately there is no way for those parties to (legally) get together and make those mutually beneficial transactions. The result is much like the result shown in the right panel of Figure 11.8, where the government mandated a price above the market equilibrium price. Production and consumption stop at a point where there would still be gains to society from producing and consuming more.

As we said above, even though the tax is the legal responsibility of the producers, it is actually a burden on both producers and consumers. How much of the tax burden actually falls on the producers and how much on the consumers will depend on the slopes (or elasticities) of the supply and demand curves.
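To make the split of the burden concrete, here is a small sketch of the linear model used in this section, with supply $-\alpha_0 + \alpha_1(p - t)$ and demand $\beta_0 - \beta_1 p$. The parameter values and the tax of 4 are hypothetical, chosen only to give round numbers, and the function names are illustrative. The code solves for the equilibrium with and without the tax, measures how much of the tax falls on each side of the market, and compares social surplus before and after.

```python
from fractions import Fraction


def equilibrium(alpha0, alpha1, beta0, beta1, tax=0):
    """Solve -alpha0 + alpha1*(p - tax) = beta0 - beta1*p for the market price p.

    Returns (price paid by consumers, net price kept by producers, quantity).
    """
    consumer_price = Fraction(alpha0 + beta0 + alpha1 * tax, alpha1 + beta1)
    producer_price = consumer_price - tax
    quantity = beta0 - beta1 * consumer_price
    return consumer_price, producer_price, quantity


def social_surplus(alpha0, alpha1, beta0, beta1, tax=0):
    """Consumers' surplus + producers' surplus + tax revenue, from triangle areas."""
    consumer_price, producer_price, quantity = equilibrium(
        alpha0, alpha1, beta0, beta1, tax
    )
    demand_intercept = Fraction(beta0, beta1)    # price at which D(p) = 0
    supply_intercept = Fraction(alpha0, alpha1)  # net price at which supply is 0
    consumers = Fraction(1, 2) * (demand_intercept - consumer_price) * quantity
    producers = Fraction(1, 2) * (producer_price - supply_intercept) * quantity
    revenue = tax * quantity
    return consumers + producers + revenue


# Hypothetical parameter values, not taken from the text.
params = dict(alpha0=20, alpha1=3, beta0=100, beta1=1)
tax = 4

p_before, _, y_before = equilibrium(**params)
p_after, net_after, y_after = equilibrium(**params, tax=tax)

consumer_burden = p_after - p_before     # rise in the price consumers pay
producer_burden = p_before - net_after   # fall in the net price producers keep
deadweight_loss = social_surplus(**params) - social_surplus(**params, tax=tax)
loss_triangle = Fraction(1, 2) * tax * (y_before - y_after)

print(consumer_burden, producer_burden)  # the two burdens add up to the tax
print(deadweight_loss, loss_triangle)    # the two measures of the loss agree
```

In this assumed example the supply curve is the more price-responsive one ($\alpha_1 > \beta_1$), and consumers end up bearing three quarters of the tax; in general the consumers’ share in this linear model is $\alpha_1/(\alpha_1 + \beta_1)$.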
The general rule is that if the elasticity of supply is greater than the elasticity of demand, the demanders, i.e., consumers, will bear more of the burden, and vice versa. The less elastic side of the market gets stuck with more of the tax. We’ll illustrate this result below, by looking at the long run case of identical firms, in which the supply curve is horizontal, that is, infinitely elastic.

Long run effects of a per unit tax, with many identical firms.

Let’s go back to the long run case with many identical firms. We assume firms can freely enter and exit the industry and that all the firms in the industry, or potentially in the industry, have identical average and marginal cost curves. This means that they all have equal access to the same patents, the same technologies, the same information, the same management, and so on. In the long run, the industry supply curve is a horizontal line at a price $p^*$, equal to the minimum of the average cost curve that all the firms and potential firms share. The elasticity of supply is generally defined as a percentage change in quantity supplied divided by a percentage change in price. With a horizontal supply curve, the supply elasticity is plus infinity, as elastic as it can be. The intuition is that a tiny percentage increase in the price (starting, say, just below $p^*$) will result in an infinitely large percentage increase in the amount supplied.

In Figure 11.11 below, we show what happens when a per unit tax of t, a legal obligation of the suppliers, is introduced in a market with many identical firms in the long run. This figure is much like Figure 11.10, except that the no-tax supply curve is a horizontal line with a vertical intercept of $p^*$. (That is, for $p < p^*$, zero units of the good are supplied, for $p = p^*$, an arbitrary number of units are supplied depending on demand, and for $p > p^*$, an infinite number of units are supplied.)
The supply curve with the per unit tax, payable by firms, is another horizontal line, t dollars above the first.

INSERT FIGURE 11.11 HERE
Caption of Fig. 11.11: Effects of a per unit tax, in the long run, with many identical firms.

In Figure 11.11, we see that when the supply curve is horizontal, that is, infinitely elastic, the introduction of a per unit tax creates a deadweight loss triangle, as it did in the general case. What’s new here is that even though the tax is theoretically paid by the firms that supply the good, in fact all the burden of the tax is borne by consumers. This is a consequence of the infinitely elastic supply curve.

11.7 A Solved Problem

The Problem

The market demand for good x is $x_D = a - bp$ and the market supply is $x_S = cp$, where $a > 0$, $b > 0$, and $c > 0$.

(a) Calculate the competitive equilibrium of this market (that is, the equilibrium price and quantity). Calculate the consumers’ and the producers’ surpluses.
(b) The government imposes a per unit tax t on x, which must be paid by the sellers. Calculate the new competitive equilibrium and the government revenue.
(c) Calculate the new consumers’ and producers’ surpluses.
(d) Is society better off or worse off after the tax? By how much?

The Solution

(a) To find the competitive equilibrium, we set supply equal to demand, or $x_S = x_D$, or $cp = a - bp$. This gives the competitive equilibrium price

$$p^* = \frac{a}{b+c}.$$

We substitute back into either the supply function or the demand function, to get the competitive equilibrium quantity

$$x^* = cp^* = \frac{ac}{b+c}.$$

To find consumers’ surplus and producers’ surplus, it is helpful to sketch a graph of the supply and demand curves; this is easy because they are straight lines. Our graph is Figure 11.12 below; we will use it for parts (a), (b), (c), and (d). Note that it is similar to Figures 11.9 and 11.10.

INSERT FIGURE 11.12 HERE
Caption of Fig.
11.12: The original market equilibrium, the new market equilibrium after the per-unit tax t is imposed, the original C.S., the original P.S., and the welfare loss triangle DEF.

In Figure 11.12, consumers’ surplus is the area of the upper triangle, with positively-sloped cross-hatching; producers’ surplus is the area of the lower triangle, with negatively-sloped cross-hatching. The area of a triangle is one half the base times the height, and we will use the horizontal distance $x^* = ac/(b+c)$ as the base of each triangle. The height of the consumers’ surplus triangle is $a/b - p^* = a/b - a/(b+c)$. All this gives

$$\text{C.S.} = \frac{1}{2}\left(\frac{ac}{b+c}\right)\left(\frac{a}{b} - \frac{a}{b+c}\right) = \frac{1}{2}\left(\frac{ac}{b+c}\right)\left(\frac{ac}{b(b+c)}\right) = \frac{(ac)^2}{2b(b+c)^2}$$

for consumers’ surplus. Similarly, since the supply curve starts at the origin, the height of the producers’ surplus triangle is $p^* = a/(b+c)$, and producers’ surplus is

$$\text{P.S.} = \frac{1}{2}\left(\frac{ac}{b+c}\right)\left(\frac{a}{b+c}\right) = \frac{a^2 c}{2(b+c)^2}.$$

(b) With a per unit tax of t payable by the sellers, we have to refigure the supply curve. Before the tax was imposed, the supply function was $x_S = cp$; since firms now have to pay t per unit off the top, they are netting $p - t$ for each unit they sell, rather than p. Therefore the new supply function is $x_S = c(p - t)$. The new supply curve is the dashed line in Figure 11.12. Setting supply equal to demand now gives $c(p - t) = a - bp$. This gives the new competitive equilibrium price

$$p^{**} = \frac{a + ct}{b+c}.$$

We substitute back into either the supply function or the demand function, to get the new competitive equilibrium quantity

$$x^{**} = c(p^{**} - t) = \frac{c(a + ct)}{b+c} - ct = \frac{ac - bct}{b+c}.$$

Figure 11.12 includes the new equilibrium price $p^{**}$, the new equilibrium price net of the tax $p^{**} - t$, and the new equilibrium quantity $x^{**}$. Note that the new price is greater than the old pre-tax price by an amount

$$\Delta p^{**} = \frac{a + ct}{b+c} - \frac{a}{b+c} = \frac{ct}{b+c}.$$

Consumers have to pay this new higher price, and are worse off. Producers are also worse off; they now receive $p^{**} - t = (a + ct)/(b+c) - t = (a - bt)/(b+c)$ per unit.
Comparing this to the old $p^*$ that they used to receive, producers are now getting $bt/(b+c)$ less per unit. Note that there is now a gap of t between the price $p^{**}$ paid by consumers and the (net) amount $p^{**} - t$ received by producers. The revenue received by the government is

$$\text{Revenue} = t x^{**} = \frac{t(ac - bct)}{b+c}.$$

(c) To figure the new consumers’ surplus, we again find the area of a triangle, one half the base times the height. The base of the new C.S. triangle is $x^{**} = (ac - bct)/(b+c)$. The height of the new C.S. triangle is $a/b - p^{**} = a/b - (a + ct)/(b+c)$. Therefore the new consumers’ surplus is

$$\text{New C.S.} = \frac{1}{2}\left(\frac{ac - bct}{b+c}\right)\left(\frac{a}{b} - \frac{a + ct}{b+c}\right) = \frac{1}{2}\left(\frac{ac - bct}{b+c}\right)\left(\frac{ac - bct}{b(b+c)}\right) = \frac{(ac - bct)^2}{2b(b+c)^2}.$$

The new producers’ surplus is the area of a triangle with base $x^{**}$ and height $p^{**} - t$. This gives

$$\text{New P.S.} = \frac{1}{2}\left(\frac{ac - bct}{b+c}\right)\left(\frac{a + ct}{b+c} - t\right) = \frac{1}{2}\left(\frac{ac - bct}{b+c}\right)\left(\frac{a - bt}{b+c}\right) = \frac{(a - bt)(ac - bct)}{2(b+c)^2}.$$

(d) We could take the original social welfare total of C.S. plus P.S., and compare that to the new social welfare total, of the new C.S. plus the new P.S. plus government revenue. But that would require a lot of ugly algebra! Instead, let’s look at Figure 11.12. Society is worse off by the area of the welfare loss triangle DEF. To calculate the area of DEF, let’s use the DE side as the base, and the difference $x^* - x^{**}$ as the height. The social loss produced by the tax is

$$\text{Welfare Loss} = \frac{1}{2}\,t\,(x^* - x^{**}) = \frac{1}{2}\,t\left(\frac{ac}{b+c} - \frac{ac - bct}{b+c}\right) = \frac{bct^2}{2(b+c)}.$$

Exercises

1. The short run market for coffee can be described by an upward-sloping supply curve and a downward-sloping demand curve. Suppose this market is perfectly competitive. How are the equilibrium price and quantity exchanged affected by the following perturbations?
(a) An increase in consumers’ income (assume that coffee is a normal good).
(b) An increase in the price of the factors of production.
(c) A technological improvement in the coffee industry.

2. Suppose the best technology to produce a good is given by the production function $y = \sqrt{x_1 x_2}$. Let the input prices be $w_1 = 4$ and $w_2 = 1$. Assume the number of firms in the industry can vary (this is long run analysis), and that any firm can use the production function above.
(a) Show that the industry supply is infinitely elastic at $p = 4$.
(b) If the market demand is given by $D(p) = 1{,}000{,}000 - 1{,}000p$, how many units of good y are exchanged in equilibrium?

3. Good h is produced in Asia. There are 10,000 firms producing good h according to the technology described by $h = K^{1/3}L^{2/3}$, where K is land and L is labor. The unit prices of land and labor are $256 and $1, respectively.
(a) Derive the long run marginal and average cost curves for each of these producers. What is the long run market supply of good h?
(b) This good is mainly consumed in the U.S. Suppose the market demand is $h_D = 36{,}000/p$. Calculate the competitive equilibrium price, the amount of good h exchanged in the market, and each producer’s output and profit.
(c) Following a generalized campaign in the press and a few successful actions carried out by the DEA, the demand for good h shrinks to $h_D = 24{,}000/p$. Compute the new competitive market equilibrium for good h. Could we have a different number of producers in the market?

4. Suppose the wine industry is made up of many small identical firms. A representative firm’s long run cost function is $C(y_i) = 992 - 12y_i^2 + y_i^3$ if $y_i > 0$, and $C(y_i) = 0$ otherwise.
(a) Derive the representative firm’s market supply curve for wine.
(b) If the market demand for wine is $y_D = 1{,}140 - 10p$, calculate the long run competitive equilibrium of this industry (i.e., indicate the equilibrium price, the amount of good y exchanged in the market, the number of firms in the market, and each firm’s output and profit).

5. Dakota is a firm that produces rocking horses.
The market for rocking horses is perfectly competitive. Dakota’s cost function is $C(y) = 12y^2 + 40y + 2{,}450$. The market price of rocking horses is $p^* = 140$.
(a) Write down Dakota’s profit function $\pi(y)$.
(b) What is the firm’s optimal level of output $y^*$?
(c) Calculate the firm’s profits $\pi^*$ at the optimal level of output.
(d) What will happen to the number of firms in the rocking horse industry in the long run?

6. Now consider the market for rocking horses in the long run. Dakota’s cost function is $C(y) = 12y^2 + 40y + 2{,}450$.
(a) What is Dakota’s level of output in the long run $y^{**}$?
(b) What is the market price of rocking horses in the long run $p^{**}$?
(c) Show that the firm earns zero profits in the long run.
(d) What will happen to the firm’s profits if the government decides to impose a tax, $t = 5$, per rocking horse?

12 Monopoly and Monopolistic Competition

12.1 Introduction

In the last chapter, we studied the behavior of competitive firms, that is, firms that take market prices as given and outside their control. Generally, such firms are small enough relative to their markets that their decisions have no effect on the market prices. Now we will study the polar opposite, the market in which only one firm supplies a particular good. This is called a monopoly market and the firm is a monopoly firm or monopolist. The word “monopoly” is from Greek, and means “one seller.”

In the first part of this chapter, we will analyze the classical solution to the monopoly problem. Then we will consider various price discrimination techniques that monopolies can employ to increase their profits. At the end of the chapter, we will look at a special market structure, called monopolistic competition, in which there are many firms producing goods that are very similar, but not identical, such as different brands of laundry detergent.

There are various reasons why some markets are monopolies or near-monopolies.
Sometimes there are technological reasons. For example, there may be very large start-up costs. The classic example is the provision of a utility in an urban market via pipelines. If a firm is to sell water or natural gas in a city, it may need a network of underground pipes leading from source points to tens of thousands of residential and commercial customers. Having two or more firms installing such a network would be unnecessarily costly, and the first firm to get its pipes in the ground would have a tremendous advantage over later-arriving firms. This is the case of what is called a natural monopoly.

The natural monopoly idea used to be applied to the provision of many utilities, including water, natural gas, electricity, and phone service. However, changes in law and in technology in the last 40 years have taken telephone service off the list of natural monopolies. The natural monopolist was AT&T (“Ma Bell”) in the United States. AT&T was broken up in legal actions between 1974 and 1984, and rapidly changing technology, culminating in the development of cellular phone systems, eventually undid the technological basis for that monopoly. Innovations in the provision of electricity have changed our view of the local electric company as a natural monopoly. Now the firm that owns the wire network may be viewed as a natural monopolist, but there may be other different firms, non-monopolists, that actually generate the power. The same may be true of natural gas. The firm that owns the pipeline may be a natural monopolist, but it may act simply as a delivery service between competitive gas producers and competitive gas customers.

Often monopolies exist because the government has granted them, and a firm has a monopoly in the provision of some good or service only because a state makes it very difficult or illegal for another firm to come in and compete.
For example, the British East India Company originated with a charter granted by Queen Elizabeth I in 1600, giving that company a legal monopoly in the trade with India and China in various goods (including cotton, silk, tea, and opium). Patents, copyrights, and trademarks are legal monopolies granted by a state to an inventor (or writer, composer, performer, or artist), usually with a limited term. For instance, in the U.S., a patent is granted by the U.S. Patent Office to a person or a firm, and gives its owner exclusive rights over an invention for a period of 20 years; other countries have similar patent laws. Copyrights last much longer and trademarks may last indefinitely.

In the formal analysis in the next section, we will assume, as we did in our analysis of competitive markets, that there is one homogeneous good, and that all buyers and the seller have perfect information. But we will depart from our analysis of competitive markets by assuming that the seller does not take the price as given. We also assume that even if profits are positive, there are barriers to entry which serve to preserve the monopoly.

A final note before proceeding with the monopoly model. A monopoly is a single seller of some good or service. A single buyer of a good or service is called a monopsony. As an example, suppose an isolated town has only one major employer: a large diamond mine located nearby. If the mine is the only (significant) buyer of labor services in the town, it is called a monopsony. In this chapter we will not analyze the theory of monopsonies, because that theory is formally quite similar to the theory of monopolies.

12.2 The Classical Solution to Monopoly

Let us now assume there is a monopoly firm producing a good. We let y denote the quantity of the good. We assume in this section that the good is sold at a price p.
(Later on we will analyze what happens when the monopolist sells the good to different people at different prices, or sells it to the same customer at different per-unit prices for different quantities. But for now there is one price in the market, which depends on the quantity y the monopolist decides to produce.) There is a downward-sloping demand curve for the good, written y(p). The inverse demand curve is p(y).

The monopoly firm wants to maximize its profit, just as a competitive firm wants to maximize its profit. The firm’s revenue is $R(y) = p(y)y$. We assume that the monopolist has a long run cost curve C(y). The firm’s profit is given by

$$\pi(y) = R(y) - C(y) = p(y)y - C(y).$$

The monopolist’s problem is to choose y so as to maximize this function.

Before proceeding, let’s consider how the monopolist’s problem differs from the profit-maximization problem of the competitive firm. A competitive firm operates in a market. There is a market demand curve. But the competitive firm is small enough that its decisions have little or no impact on the market price. Therefore its problem is to maximize

$$\pi(y) = R(y) - C(y) = py - C(y),$$

for a given p. That is, the competitive firm takes the price p as given and fixed, whereas the monopolist takes p as variable, and as a function of its own output y. In terms of the formal analysis, this is the crucial difference between the monopoly firm and the competitive firm.

From this point on, we will assume that when the monopolist is choosing its profit-maximizing output, $p \geq AC(y)$, or price is greater than or equal to average cost. If this were not the case, the firm would be losing money, and would leave the market in the long run.

Now let’s turn to the profit maximization conditions. The first order condition for profit maximization says that the first derivative of profit with respect to y should be zero.
The second order condition says that the second derivative of profit with respect to y should be less than or equal to zero. The first order condition gives the following:

$$\frac{d\pi(y)}{dy} = \frac{dR(y)}{dy} - \frac{dC(y)}{dy} = 0 \quad\text{or}\quad \frac{dR(y)}{dy} = \frac{dC(y)}{dy} \quad\text{or}\quad MR(y) = MC(y).$$

In short, marginal revenue equals marginal cost. Now let’s focus on marginal revenue. Note that

$$MR(y) = \frac{dR(y)}{dy} = \frac{d}{dy}\bigl(p(y)y\bigr) = p(y) + \frac{dp(y)}{dy}\,y.$$

Since the demand curve is downward-sloping, $dp(y)/dy$ is negative. Therefore $MR(y) < p(y)$. That is, for any output level y, price is greater than marginal revenue. Since the profit-maximizing monopolist will set marginal revenue equal to marginal cost, the monopolist must end up charging a price that is greater than marginal cost.

Recall that in Chapter 4 we discussed the concept of price elasticity of demand. This is, roughly speaking, the percentage change in amount demanded divided by the percentage change in price. In that chapter, we used the symbol $\varepsilon_{x_1,p_1}$ to represent the price elasticity of demand for good 1, with price $p_1$. We will use price elasticity of demand in this chapter also, but we will start with simplified notation, using $\varepsilon$ to represent the price elasticity of demand for the good being sold by the monopolist. Therefore, let the price elasticity of demand for the monopolist’s product be:

$$\varepsilon = -\frac{dy/y}{dp(y)/p(y)}.$$

Inverting both sides and rearranging slightly gives

$$\frac{dp(y)}{dy}\,y = -\frac{p(y)}{\varepsilon}.$$

Now substituting this elasticity formula into the expression for marginal revenue gives the following:

$$MR(y) = p(y) + \frac{dp(y)}{dy}\,y = p(y) - \frac{p(y)}{\varepsilon} = p(y)(1 - 1/\varepsilon).$$

The result of all of this is that the monopolist’s first order condition for profit maximization can be rewritten as follows, in terms of the elasticity of demand:

$$MR(y) = p(y)(1 - 1/\varepsilon) = MC(y).$$

Note that this implies

$$\frac{p(y)}{MC(y)} = \frac{1}{1 - 1/\varepsilon}.$$

The competitive firm charges a price equal to marginal cost.
But as we observed above the monopolist charges a price that is greater than marginal cost. And we now know that the gap between the price and the marginal cost is $p(y)/\epsilon$. Because of this gap the monopoly market must be inefficient: There must be customers who would love to consume the gizmo that the monopolist is selling at the monopolist’s marginal cost or above, but who won’t buy it at the monopoly market price. The monopolist is setting the price too high (from the social point of view) and producing less than the (socially) optimal output, where price equals marginal cost.

We can make a few more observations about the monopolist’s first-order condition for profit maximization. In the equations above both p(y) and MC(y) are positive numbers. Therefore $(1 - 1/\epsilon)$ must be positive. It follows that $\epsilon > 1$. That is, the monopolist will always operate on the elastic part of the demand curve.

Let’s now define a variable called markup. The intuition is this: a monopoly charges a price that exceeds the marginal cost of the good. The markup is the fractional (or percentage) amount by which price exceeds cost. For instance, if the markup is 0.5 (or 50 percent), the price exceeds marginal cost by 0.5 times marginal cost (or 50 percent). Formally, the markup is

$$\frac{p(y)}{MC(y)} - 1 = \frac{1}{1 - 1/\epsilon} - 1 = \frac{1}{\epsilon - 1}.$$

Therefore, the markup increases as demand gets less elastic. (Remember that $\epsilon > 1$.) If $\epsilon$ approaches 1, the markup approaches infinity. On the other hand, if $\epsilon$ approaches plus infinity (the competitive case), the markup approaches zero.

To this point, we have only discussed the first-order condition for profit maximization. We now turn briefly to the second-order condition. This condition says that at the profit-maximizing level of output, the second derivative of profit with respect to y must be less than or equal to zero. It follows that

$$\frac{d^2\pi(y)}{dy^2} = \frac{d^2R(y)}{dy^2} - \frac{d^2C(y)}{dy^2} \le 0 \quad\text{or}\quad \frac{dMR(y)}{dy} \le \frac{dMC(y)}{dy}.$$
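The markup formula above, markup = 1/(ε − 1), can be tabulated to see how quickly the markup falls as demand becomes more elastic. The following is a minimal sketch; the function name `markup` and the sample elasticities are illustrative choices, not part of the text.

```python
from fractions import Fraction

def markup(elasticity):
    """Markup p/MC - 1 = 1/(elasticity - 1); only defined for elasticity > 1."""
    if elasticity <= 1:
        raise ValueError("the first-order condition requires elasticity > 1")
    return 1 / (elasticity - 1)

# Illustrative elasticities (assumed values, chosen only to show the pattern).
for eps in (Fraction(11, 10), Fraction(3, 2), 2, 3, 5, 100):
    print(f"elasticity = {float(eps):6.2f}   markup = {float(markup(eps)):8.4f}")
```

As a cross-check against the linear example worked in this section (p(y) = 100 − y, C(y) = y²): at the optimum y = 25 and p = 75, the elasticity is (dy/dp in absolute value) × p/y = 1 × 75/25 = 3, so the formula gives a markup of 1/2, which agrees with 75/50 − 1.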
In short, the first order condition for profit maximization says that marginal revenue must equal marginal cost. The second order condition says that the slope of the marginal revenue curve must be less than or equal to the slope of the marginal cost curve.

Example. Let’s assume the inverse demand curve is given by p(y) = 100 − y. Then the monopolist’s total revenue is $R(y) = p(y)y = (100 - y)y = 100y - y^2$. Differentiate to get marginal revenue: MR(y) = 100 − 2y. Assume the total cost function is $C(y) = y^2$. Differentiate to get marginal cost: MC(y) = 2y. Note that average cost $AC(y) = y^2/y = y$. The first order condition for profit maximization requires setting marginal revenue equal to marginal cost. This gives 100 − 2y = 2y, or y = 25. So the monopolist knows it should sell 25 units. It then uses the inverse demand function again to determine the price to charge: p(y) = 100 − y = 100 − 25 = 75.

Figure 12.1 below shows the demand curve (graphically identical to the inverse demand curve), the marginal revenue curve, and the marginal cost curve for this example, as well as the profit-maximizing solution. Note that the monopolist using this graph first finds the point A, where MR(y) and MC(y) cross. This gives the profit-maximizing quantity y = 25. Then the monopolist reads up to the demand curve, point B, to get the profit-maximizing price p(25) = 75.

INSERT FIGURE 12.1 HERE

Caption of Fig. 12.1: The classical solution to the monopoly problem in this simple example.

12.3 Deadweight Loss From Monopoly: Comparing Monopoly and Competition

We indicated above that the monopoly solution cannot be efficient because the monopoly firm is selling its product or service at a price greater than marginal cost.
Therefore there must be customers or potential customers who would like to buy additional units of the good at more than the additional cost of producing those units, but these potential transactions, which would be beneficial to some people and harmful to none, do not take place. We will now formalize this argument by examining consumers’ surplus and producer’s surplus in a monopoly market. (Recall the limitations on the possible use of the consumers’ surplus concept described in Chapter 7.) We will show that the net benefit to society of the monopoly market, as represented by consumers’ surplus plus producer’s surplus, is not as great as it would be if the monopolist were acting like a competitive firm, that is, acting as a price taker. We will show this with the graph we used in the Example above. Consider Figure 12.2 below. This is based on the particular demand curve and marginal cost curve assumed in the Example, but the argument obviously generalizes. In Figure 12.2, the monopolist finds the point A where MR(y) = MC(y). This determines the monopoly firm’s profit-maximizing quantity, shown as “monopoly quantity” on the horizontal axis. To sell that quantity, the monopolist goes up to the demand curve, at point B, to find the optimal price, shown as “monopoly price” on the vertical axis. Aggregate dollar benefit to consumers can now be measured as the area under the demand curve and above the monopoly price. This is consumers’ surplus, shown in the figure as the area with the upward-sloping cross-hatching. The benefit to the monopolist is producer’s surplus, which equals the producer’s profit, if there are no fixed costs (as we have been assuming in this chapter), or profit displaced by C(0) if there are fixed costs. In Figure 12.2, producer’s surplus (i.e., profit) is the area under the horizontal line from the monopoly price and above the marginal cost curve.
This is the area with the downward-sloping cross-hatching. Point C in Figure 12.2 is where the demand curve and the marginal cost curve intersect. The horizontal coordinate of point C is labeled “competitive quantity” for reasons which will become clear. Note that in Figure 12.2 there are units of the good, to the right of the monopoly quantity, for which the height of the demand curve exceeds the height of the marginal cost curve. That is, for y greater than the monopoly quantity but less than the competitive quantity, p(y) > MC(y). Now imagine, hypothetically, that the monopolist could get together with each of the potential customers who might like to buy one or more of those units, and negotiate some price for each such unit, with the price being greater than MC(y) and less than p(y). All such transactions would make the buyers better off, and the monopolist better off, and would leave the buyers who were already buying at the monopoly price unaffected. The aggregate gain to society, if such hypothetical transactions were made, would be the area of the triangle ABC. Of course monopoly firms in the real world do not make these hypothetical transactions. That is, the area ABC represents potential benefit to society which is unrealized in the presence of monopoly. It is called deadweight loss due to the monopoly. What if the monopolist were somehow made to act as if it were a competitive firm? Suppose the firm is told that it must always charge a price p equal to marginal cost MC(y). (This is of course the way a competitive firm acts, since profit maximization by a competitive firm implies p = MC(y).) Suppose the firm continues to operate on the demand curve (that is, its price p equals p(y)). The result is that the firm ends up at the intersection of the demand curve and the MC(y) curve, or at point C in Figure 12.2.
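To put numbers on the triangle ABC, the sketch below reuses the functions from the Example (p(y) = 100 − y and MC(y) = 2y, with no fixed cost) and compares total surplus at the monopoly point and at point C. The variable names are illustrative, and the areas are computed with elementary triangle formulas that rely on both curves being straight lines.

```python
def inverse_demand(y):
    return 100 - y            # p(y) from the Example

def marginal_cost(y):
    return 2 * y              # MC(y), from C(y) = y^2

y_m = 25                      # monopoly quantity: 100 - 2y = 2y
y_c = 100 / 3                 # competitive quantity: 100 - y = 2y
p_m = inverse_demand(y_m)     # monopoly price, point B
p_c = inverse_demand(y_c)     # price at point C

# Monopoly: consumers' surplus is the triangle above p_m; producer's surplus
# equals profit because there is no fixed cost.
cs_monopoly = 0.5 * (100 - p_m) * y_m
ps_monopoly = p_m * y_m - y_m ** 2

# Price-taking outcome at C: triangles above and below the price p_c.
cs_competitive = 0.5 * (100 - p_c) * y_c
ps_competitive = 0.5 * p_c * y_c

# Deadweight loss two ways: as triangle ABC, and as the drop in total surplus.
dwl_triangle = 0.5 * (p_m - marginal_cost(y_m)) * (y_c - y_m)
dwl_difference = (cs_competitive + ps_competitive) - (cs_monopoly + ps_monopoly)

print(f"monopoly:    CS = {cs_monopoly:.2f}, PS = {ps_monopoly:.2f}")
print(f"competitive: CS = {cs_competitive:.2f}, PS = {ps_competitive:.2f}")
print(f"deadweight loss = {dwl_triangle:.2f}")
```

Both calculations give 625/6, roughly 104.17, so in this example the area of ABC is exactly the total surplus lost when the firm moves from point C to the monopoly point.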
There is a new consumers’ surplus triangle (not shown in the figure), the floor of which goes through the point C. There is a new producer’s surplus area (also not shown in the figure), whose ceiling goes through the point C. Total net benefit to society expands, and deadweight loss disappears.

This is the theoretical reason why economists are generally opposed to monopoly, and are generally inclined toward policies that promote competition. It is the basis for price-equals-marginal-cost regulation of natural monopolies. State-granted monopolies like patents are more complicated; there is an economic rationale for granting such monopolies, namely to encourage and promote innovation and invention. On the other hand, such monopolies create deadweight loss. The first argument suggests that patent protection should last more than 5 days, but the second suggests it should last less than 5 decades. Whether the lives of patents (and copyrights and trademarks) under the laws of the U.S. and other countries are too short or too long is an extremely interesting practical question, which we leave to others to answer.

INSERT FIGURE 12.2 HERE

Caption of Fig. 12.2: The deadweight loss of a monopoly.

12.4 Price Discrimination

Sometimes a monopolist can increase its profits by charging different prices to different people, or different prices for different units sold to one person. This is known as price discrimination. A familiar version is the price discrimination practiced by airlines, which commonly charge business travelers much more than vacationing tourists for seats in the same section of the same plane.

Economists usually distinguish among three types of price discrimination. Third degree price discrimination, which we will also call common price discrimination, means charging different customers different prices, but not different prices for different units sold to one customer.
A customer is charged a price that depends on the pigeonhole that the monopolist has placed him in, but that is independent of the number of units he buys. Second degree price discrimination means that the monopolist charges all its customers according to the same price schedule, but for any customer the price per unit depends on the number of units that customer is buying. First degree price discrimination, which we will also call perfect price discrimination, means that the monopolist charges different prices to different people (that is, puts us in different pigeonholes), and possibly charges individual people different per unit prices, depending on the number of units the customer buys. In this chapter, we will limit our analysis to the first and third degree versions, that is, to common price discrimination and perfect price discrimination. Note that both of these types of discrimination are only possible when the buyers of the monopolist’s product or service cannot easily resell it among themselves. That is, suppose the monopoly firm is selling gizmos to you at $10 each, and to me at $20 each. If we all know about it, and if a gizmo is easy to transport and resell, I might go to you to buy some for a price between $10 and $20, and thereby undo the monopolist’s discriminatory pricing scheme. The reader should think about the goods which are provided by discriminating monopolists in this light. Providers of electricity and natural gas may be able to price discriminate because it is difficult for customers to trade these things among themselves, but providers of heating oil, which can be easily trucked from place to place, may be less able to discriminate. Providers of airline trips are able to price discriminate because a ticket is issued to a person and I cannot sell you my ticket to fly from New York to L.A.
Sellers of prescription pharmaceuticals are able to price discriminate because a prescription, like an airline ticket, is attached to a person, making it difficult for you to sell me Vicodin. Sellers of generic drugs like ibuprofen, which can easily be bought and sold in large quantities by third parties, are much less able to price discriminate. (A five minute investigation in a large drug store will reveal that the ibuprofen seller may practice second degree price discrimination, in the form of bulk discounts, but not first or third degree discrimination.)

Common or Third Degree Price Discrimination. In this mild kind of price discrimination, the monopolist is able to partition the market for its product into a (typically small) number of distinct groups of buyers. For example, think of an airline selling tickets to business travelers and vacationers, or a publisher selling its academic journal to university libraries, professors, and (poor) students. We will assume that the buying groups are separate in the sense that there is no possibility of buyers reselling the product between groups. (For instance, the vacationer cannot sell his airline ticket to a business traveler, and the student cannot sell his cheap journal subscription to the university library.) Since the customer groups are separate and the monopolist is charging different prices to customers in the different groups, there are distinct demand curves for the different customer groups.

We now turn to our formal model of simple price discrimination. We will assume that there are two distinct groups of customers, and the monopolist discriminates between the two groups. (Analyzing three or more distinct buying groups is an easy and obvious extension of our model.)
Customers in markets 1 and 2 may know that the monopolist is charging a different price in the other market, but they cannot do anything about it; they are stuck in the group they are in, and the product cannot be resold by customers from one market to the other. As an aside, if you are interested in important policy debates related to this model, you can look into the issue of patent drug pricing in the U.S. versus pricing of the same drugs in Europe and Canada. Generally pharmaceuticals are priced much higher in the U.S. than in the rest of the developed world, even when they are produced by U.S. based companies. Some drug buyers in the U.S. wish they could buy their prescriptions elsewhere, but U.S. laws and regulations make doing so difficult.

We let $y_1$ and $y_2$ represent the quantities sold by the monopolist in markets 1 and 2, at prices $p_1$ and $p_2$, respectively. The total amount produced by the monopolist and sold in the two markets is $y = y_1 + y_2$. We let $p_1(y_1)$ and $p_2(y_2)$ represent the inverse demand curves in the two markets. The revenue functions in the two markets are $R_1(y_1)$ and $R_2(y_2)$. The product or service sold in market 1 is the same as that sold in market 2, and therefore the cost function for the monopolist just depends on its total output $y_1 + y_2$. That is, $C(y) = C(y_1 + y_2)$.

The monopolist wants to choose the quantities $y_1$ and $y_2$ to maximize its profits. How should it do it? Its profit function can be written as follows:

$$\pi(y_1, y_2) = R_1(y_1) + R_2(y_2) - C(y_1 + y_2) = p_1(y_1)y_1 + p_2(y_2)y_2 - C(y_1 + y_2).$$

The first order conditions for maximizing this function of two variables are that (1) the partial derivative of $\pi(y_1, y_2)$ with respect to $y_1$ must be zero, and (2) the partial derivative of $\pi(y_1, y_2)$ with respect to $y_2$ must be zero. Differentiating with respect to $y_1$ gives:

$$\frac{\partial \pi}{\partial y_1} = \frac{dR_1(y_1)}{dy_1} - \frac{dC(y)}{dy}\,\frac{dy}{dy_1} = 0, \quad\text{or}\quad \frac{dR_1(y_1)}{dy_1} = \frac{dC(y)}{dy}.$$

(The second form uses $dy/dy_1 = 1$, since $y = y_1 + y_2$.) Therefore $MR_1(y_1) = MC(y)$.
That is, marginal revenue in market 1 must equal marginal cost. We will now analyze the “marginal revenue equals marginal cost” result just as we did above, in Section 2 of this chapter, except that now we have to remember that there are two separate markets, and we are focusing on market 1 at the moment. Note that

$$MR_1(y_1) = \frac{dR_1(y_1)}{dy_1} = \frac{d}{dy_1}\bigl(p_1(y_1)y_1\bigr) = p_1(y_1) + \frac{dp_1(y_1)}{dy_1}\,y_1.$$

Let $\epsilon_1$ represent the price elasticity of demand in market 1. By the same reasoning as in Section 2, we find that

$$MR_1(y_1) = p_1(y_1) + \frac{dp_1(y_1)}{dy_1}\,y_1 = p_1(y_1)(1 - 1/\epsilon_1).$$

Setting marginal revenue equal to marginal cost now gives:

$$MR_1(y_1) = p_1(y_1)(1 - 1/\epsilon_1) = MC(y).$$

We have analyzed the first order condition for profit maximization for market 1. Now let’s turn to the first order condition for market 2: the partial derivative of $\pi(y_1, y_2)$ with respect to $y_2$ must be zero. By the same reasoning used above, this leads directly to

$$MR_2(y_2) = p_2(y_2)(1 - 1/\epsilon_2) = MC(y),$$

where $\epsilon_2$ is the price elasticity of demand in market 2. Putting our first order conditions together gives

$$MR_1(y_1) = MR_2(y_2) = MC(y),$$

and

$$p_1(y_1)(1 - 1/\epsilon_1) = p_2(y_2)(1 - 1/\epsilon_2).$$

We are now ready to conclude our discussion of simple (third degree) price discrimination. The monopolist who can price discriminate between two markets will end up selling quantities $y_1$ and $y_2$ such that $MR_1(y_1) = MR_2(y_2) = MC(y_1 + y_2)$. That is, marginal revenue in market 1 equals marginal revenue in market 2 equals marginal cost. By the same reasoning as in Section 2 above, the monopolist will operate in the elastic sections of the respective demand curves, that is, where $\epsilon_1, \epsilon_2 > 1$. By the price and elasticity formula above, the price will be lower in the market with demand that’s more elastic. For instance, if $\epsilon_1 = 3/2$, and $\epsilon_2 = 3$, then by the formula, $p_1$ will be twice $p_2$.
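The "twice" in the last sentence follows from rearranging the price and elasticity formula into a ratio, p1/p2 = (1 − 1/ε2)/(1 − 1/ε1). A small sketch of that arithmetic, using exact fractions so that no rounding creeps in; the function name `price_ratio` is an illustrative choice.

```python
from fractions import Fraction

def price_ratio(eps1, eps2):
    """Return p1/p2 implied by p1*(1 - 1/eps1) = p2*(1 - 1/eps2).

    Both elasticities must exceed 1, as the first order conditions require.
    """
    if eps1 <= 1 or eps2 <= 1:
        raise ValueError("both elasticities must be greater than 1")
    return (1 - 1 / Fraction(eps2)) / (1 - 1 / Fraction(eps1))

# The case in the text: elasticity 3/2 in market 1 and 3 in market 2.
print(price_ratio(Fraction(3, 2), 3))   # (2/3) / (1/3)
```

With ε1 = 3/2 the factor 1 − 1/ε1 is 1/3, and with ε2 = 3 the factor 1 − 1/ε2 is 2/3, so the ratio is 2: the less elastic market pays double.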
The moral of all this is that if you are buying from a price-discriminating monopolist, it’s better to be in the market with the higher price elasticity of demand. Be a flying vacationer rather than a business flyer! Perfect or First Degree Price Discrimination. Suppose a monopolist can charge different customers different prices, and charge different prices for different units going to the same customer. Moreover, suppose the monopolist knows exactly how much each customer is willing to pay for his first unit, his second, and so on. That is, the monopolist knows every customer’s demand curve (or inverse demand curve). Suppose the customers cannot transfer units of the monopolist’s product among themselves. Finally, assume (of course) that the monopolist wants to maximize profit. This is the most extreme price discrimination case, where each incremental unit of the monopolist’s product may be sold at a different price, a price which is the maximum that anyone would pay for the extra unit. This is perfect, or first degree, price discrimination. Fortunately for consumers, perfect price discrimination is a theoretical construct. But here is how the theoretical construct works. The monopolist knows all the customers, all their demand curves (or inverse demand curves), and every possible willingness to pay for additional units at every possible point. It therefore works its way down the market demand curve (more precisely, the inverse demand curve), and for each additional unit, it sells that unit to the person with the highest willingness to pay, given the number of units already sold to various buyers. Most importantly, the monopolist charges the customer the highest possible price for each additional unit. And this price is exactly equal to the height of the inverse demand curve, given the number of units already sold. 
The profit maximizing monopolist will continue to sell additional units (at different prices), as long as the price it is selling those units for exceeds MC(y). But once the monopolist reaches a price equal to MC(y), it stops searching for additional sales.

Given that the perfect price discriminating monopolist works its way down the inverse market demand curve, selling each incremental unit at a price given by the height of the curve, its revenue equals the area under the market (inverse) demand curve. That is, the revenue from selling y units is

$$R(y) = \int_0^y p(t)\,dt.$$

(Note that we are using a dummy variable t in the price function within the integral.) The perfect price discriminating monopolist’s profit is now given by

$$\pi(y) = \int_0^y p(t)\,dt - C(y).$$

The first order condition for profit maximization requires that the derivative of profit with respect to y equal zero. This gives:

$$\frac{d\pi(y)}{dy} = p(y) - \frac{dC(y)}{dy} = 0 \quad\text{or}\quad p(y) = MC(y).$$

That is, price equals marginal cost. Remember, of course, that each unit produced by the perfect price discriminating monopolist is being sold at a different price; the condition we have just derived is for the last or marginal unit.

The result of all this is shown in Figure 12.3 below. The figure shows an inverse demand curve p(y). This is a straight-line demand curve, similar to the ones in Figures 12.1 and 12.2. The monopolist’s marginal cost curve MC(y) is also shown in Figure 12.3; this one is roughly parabolic, unlike the ones in Figures 12.1 and 12.2. (A marginal cost curve which first falls and then rises is more realistic than a straight-line marginal cost curve.) The monopolist is producing a quantity $y^*$ where $p(y^*) = MC(y^*)$, and is charging the last customer this price for this unit. But all the previous units were sold at higher prices to the various customers, with each incremental unit being sold at the highest possible price to its buyer.
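As a numerical illustration of the revenue integral and the condition p(y) = MC(y), the sketch below applies them to the linear functions from the earlier Example (p(y) = 100 − y, C(y) = y²). That choice is an assumption made for easy comparison; it is not the roughly parabolic marginal cost curve drawn in Figure 12.3. The integral is approximated with a midpoint Riemann sum, which mimics working down the demand curve in small steps.

```python
def inverse_demand(t):
    return 100 - t                     # p(t), borrowed from the earlier Example

def cost(y):
    return y ** 2                      # C(y), so MC(y) = 2y

def revenue_perfect_discrimination(y, steps=10_000):
    """Midpoint-rule approximation of the integral of p(t) from 0 to y."""
    dt = y / steps
    return sum(inverse_demand((k + 0.5) * dt) for k in range(steps)) * dt

# Last unit sold where p(y) = MC(y): 100 - y = 2y.
y_star = 100 / 3
profit_discriminating = revenue_perfect_discrimination(y_star) - cost(y_star)

# Uniform-price monopoly from the Example, for comparison: y = 25, p = 75.
profit_uniform = 75 * 25 - cost(25)

print(f"perfect discrimination: y = {y_star:.2f}, profit = {profit_discriminating:.2f}")
print(f"uniform price:          y = 25, profit = {profit_uniform}")
```

Under these assumed functions the discriminating monopolist sells more (100/3 units instead of 25) and earns about 1666.67 instead of 1250; the larger figure equals the whole area between the demand curve and the marginal cost curve up to the point where they cross.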
The monopolist’s revenue equals the area under the inverse demand curve up to $y^*$, and its total cost equals the area under the marginal cost curve, up to that point (except possibly for any fixed cost C(0)). Profit is revenue minus cost, and is shown as the cross-hatched area (except possibly for any fixed cost). This area is producer’s surplus.

We conclude that: (1) There is no deadweight loss due to the operation of the perfect price discriminating monopoly firm. The sum of consumers’ surplus and producer’s surplus is the same in this figure as it would be if the monopoly were acting like a competitive firm (that is, finding the output level where price equals marginal cost, and selling to everybody at that price). In other words, a perfect price discriminating monopoly market is efficient. This is the good news. The bad news is (2) There is no consumers’ surplus. The buyers get no net benefit in this monopoly; all the benefit flows to the monopoly firm. Therefore, although the outcome is efficient, it’s grossly inequitable, with all the benefit flowing to the firm and none to the consumers.

INSERT FIGURE 12.3 HERE

Caption of Fig. 12.3: Perfect price discrimination.

We will end this section with a final comment about the real world. It’s hard to think of a real example of a perfect price discriminating monopolist, because a typical monopolist firm simply doesn’t know the amounts all its buyers would be willing to pay for additional units of its good or service. In fact buyers are often careful to conceal how much they are willing to pay, since they don’t want to be forced to pay extra for a product. Often buyers will actively conceal both their enthusiasm for the monopolist’s product, and their income or wealth (the depth of their pockets), and the typical firm has no good way to discover either of these ingredients in willingness-to-pay.
We are familiar, however, with one interesting case where the monopolist firm goes to considerable lengths to discover the buyer’s interest in the product, and to discover the buyer’s income or wealth. This is the case of the Famous University (e.g., Harvard, Yale, Princeton, Brown) offering financial aid (that is, discounted tuition and fees) to prospective students. The university asks questions on application forms whose answers reveal how much the applicant wants to go to that university, and the university asks detailed questions on financial aid forms to discover how deep the applicant’s (and her parents’) pockets are. Of course, universities are not profit-maximizing firms. Their price discrimination, although extensive, only takes place on the bottom part of the inverse demand curve; they do not actively try to extract the highest possible fees from students who are not on aid. But this is a case where the seller digs very deeply to find out how much the customer wants the product, and how much money the customer has available to pay for the product.

12.5 Monopolistic Competition

When economists use the term monopolistic competition they are referring to a market where there are many competing firms, producing products which are similar but not identical. Each particular product is produced by just one of the firms. The products produced by the firms are different, but just slightly; that is, they are close substitutes. Think of the market for laundry detergents, for example. Among the popular brands currently available in the U.S. are Tide, Gain, OxyClean, Method, All, and Cheer. Each is unique, possibly protected by patents, and certainly protected by trademarks. So each firm producing one of these brands can be called a monopolist; it is the only firm producing that particular brand of detergent. But the different brands of detergent are close substitutes, from the buyer’s perspective.
If a buyer usually uses Gain, but the price of Gain goes up slightly while the price of Tide goes down slightly, she will most likely switch. Therefore the firm that sells Gain may be a monopoly, strictly speaking, but demand for its product is very sensitive to the price it charges, and to the prices charged by competing brands.

In Chapter 11 on competitive markets, we assumed that firms are price takers: they take prices for the goods they produce as given and outside their control. In the analysis of monopolistic competition, we will drop that assumption. Rather, we will assume, as we did in earlier parts of this chapter, that firms are aware of how their pricing decisions affect demand for their products. But now we also assume that, although the maker of one particular product or brand has a monopoly on that particular product or brand, other firms compete with very similar products or brands. We also assume that firms are free to enter or leave the market, and if the firms in the market are making profits, new firms will enter and try to sell similar new products.

The main formal difference between the model in this section and the models in prior sections of this chapter is that firm i’s inverse demand curve doesn’t only depend on $y_i$. It also depends on the total number of firms in the market n. We write $p_i(y_i, n)$ for firm i’s inverse demand curve. As before, when $y_i$ rises, $p_i(y_i, n)$ falls, and when $y_i$ falls, $p_i(y_i, n)$ rises. That is, the inverse demand curve is downward sloping, for a given number of firms. We now also assume that as n rises, $p_i(y_i, n)$ falls, and as n falls, $p_i(y_i, n)$ rises. This means that to sell a given level of output, firm i must charge a lower price if the number of competing firms increases.

Firm i’s problem is to choose its output $y_i$ to maximize its profit $\pi_i(y_i, n)$. Of course it only controls $y_i$; it has no control over n (short of the drastic step of leaving the business itself).
Its profit is:

$$\pi_i(y_i, n) = R_i(y_i, n) - C_i(y_i) = p_i(y_i, n)\,y_i - C_i(y_i).$$

The first order condition for profit maximization says that the first derivative of profit with respect to $y_i$ should be zero. This gives

$$MR_i(y_i, n) = MC_i(y_i),$$

or marginal revenue for firm i equals marginal cost for firm i. So far, this is exactly like the classical analysis for monopoly.

But now we turn to what is new. We can solve for the equilibrium number of firms in the market. We assumed free entry into the monopolistic competition market. This implies that if firms in the market are making profits, new firms will enter, driving up the number of firms n, and driving prices and output levels per firm down. An equilibrium will occur when profits disappear. Firm i’s profit is zero when the price it is getting for its product equals average cost. This means that

$$p_i(y_i, n) = AC_i(y_i).$$

We now have two equations describing the equilibrium under monopolistic competition. The first says that firm i’s marginal revenue equals firm i’s marginal cost. The second says that the number of firms in the market adjusts until $p_i$ equals $AC_i$.

Figures 12.4 and 12.5 below illustrate how monopolistic competition works. Figure 12.4 revisits standard monopoly analysis. Recall that in Figures 12.1 and 12.2 we analyzed classical monopoly, but we used a special linear marginal cost function, MC(y) = 2y. Figure 12.4 is similar to 12.1 and 12.2, except that it shows a standard U-shaped average cost curve, and a standard marginal cost curve that falls at first and then rises, passing through the bottom of the average cost curve. $AC_i(y_i)$ and $MC_i(y_i)$ in Figure 12.4 represent average cost and marginal cost curves for a particular firm, firm i, in the group of monopolistic competitors. Figure 12.4 represents a short run situation in which firm i has few competitors, lots of demand for its product, and substantial profit.
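The two equilibrium equations stated above (marginal revenue equals marginal cost, and price equals average cost) can be solved explicitly once functional forms are chosen. The sketch below uses purely hypothetical forms that do not come from the text: an inverse demand p_i(y, n) = A/n − b·y, which falls in both y and n, and a cost function C_i(y) = F + c·y², whose average cost is U-shaped. The number of firms n is treated as a continuous variable. Note that this demand curve only shifts down as n rises; it does not also flatten, so it is a simplification of the story told with the figures.

```python
import math

def long_run_equilibrium(A, b, c, F):
    """Solve MR = MC and p = AC for the assumed forms
    p(y, n) = A/n - b*y and C(y) = F + c*y**2 (hypothetical, for illustration).

    MR = MC:  A/n - 2*b*y = 2*c*y   ->  y = A / (2*n*(b + c))
    p = AC:   zero profit at that y ->  n = A / (2*sqrt(F*(b + c)))
    """
    y = math.sqrt(F / (b + c))
    n = A / (2 * math.sqrt(F * (b + c)))
    p = A / n - b * y
    return y, n, p

# Assumed parameter values, chosen only so the numbers come out round.
A, b, c, F = 120, 1, 1, 8
y, n, p = long_run_equilibrium(A, b, c, F)

average_cost = F / y + c * y
min_ac_output = math.sqrt(F / c)       # output at the bottom of the AC curve

print(f"output per firm y* = {y:.2f}, number of firms n* = {n:.2f}, price p* = {p:.2f}")
print(f"average cost at y* = {average_cost:.2f}")
print(f"output at minimum average cost = {min_ac_output:.2f}")
```

With these assumed numbers each firm produces 2 units at a price of 6, which equals average cost, and 15 firms share the market. Average cost is minimized at a larger output (about 2.83), so each firm ends up producing to the left of the minimum of its average cost curve.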
The firm finds the point where marginal revenue equals marginal cost (point A), it chooses the corresponding output $y_i^*$, and its profits are then substantial. Deadweight loss due to this monopoly is also substantial, equal to the area of the triangle ABC.

INSERT FIGURE 12.4 HERE

Caption of Fig. 12.4: Firm i in the short run, before competition has driven profit down to zero in the monopolistic competition market.

Figure 12.4 represents firm i’s nicely profitable situation in the short run, before its competitors have entered the market and driven profits down to zero. As competitors enter, the inverse demand curve for i’s product drops and flattens. Competitors continue entering (that is, n continues increasing) until i’s profits are driven to zero, or $p_i(y_i, n) = AC_i(y_i)$. In the long run, firm i is making no profit, just like a firm in a competitive market in the long run.

Figure 12.5 shows the long run equilibrium. Note that Figure 12.5 is based on the same average cost and marginal cost curves as Figure 12.4. What has changed is that the demand and marginal revenue curves have shifted down and flattened. There is a new equilibrium $y_i^*$ for firm i, which now lies to the left of the minimum of $AC_i$. And there is a new equilibrium price $p_i^*$, which equals average cost: $p_i^* = AC_i(y_i^*)$.

INSERT FIGURE 12.5 HERE

Caption of Fig. 12.5: A long run equilibrium under monopolistic competition, with zero profit for firm i.

Notice how the long run equilibrium in a monopolistic competition market has both competitive and monopolistic features. It’s like an equilibrium in a competitive market in that the firms are making zero profits. But it differs from a competitive equilibrium in that the firms are not operating at the minima of their average cost curves. It’s like an equilibrium for a monopoly in that there is a deadweight loss triangle.
But firm i has a much smaller deadweight loss triangle in the monopolistic competition equilibrium, Figure 12.5, than it does when it’s a real monopolist, Figure 12.4.

Finally, the regulatory implications of monopolistic competition are unclear. When firm i is a profitable monopoly, as in Figure 12.4, forcing it to act competitively (and produce and sell at a point where price equals marginal cost) makes sense. But when firm i is a monopolistic competition firm, as in Figure 12.5, there is not much latitude for the government to force it to do anything, since it is already making zero profit. What then should the policy maker do about monopolistic competition? The answer is probably not much. However, consumers should be alert to the possibility that some of the firms in the market might want to force other firms out, to reduce competition, or that some of the firms in the market might want to create barriers to prevent other potential competitors from coming in. One can easily imagine a trade group, call it the U.S. Detergent Council, made up of the firms that make Tide, Gain, OxyClean, Method, All, and Cheer, getting together with their congressional allies to form a new government agency, say the U.S. Administration of Cleanliness. The U.S.A.C. could then prevent any other firms from selling detergent to U.S. consumers, without extremely time-consuming and expensive prior testing. This would enhance the profits of the firms already entrenched in the market. But it would be a dirty deal for the consumers.

12.6 A Solved Problem

The Problem

Esmeralda is a fortune teller, and can predict her clients’ futures. The demand for future-telling sessions is given by $y^D = 20 - p$, where y is the number of client sessions, and p is the price per session. Her future-telling costs increase more than proportionally with each session (she gets horrible headaches after a while!); her cost function is $C(y) = y^2$.
She works in a theme park called Promised Land.

(a) Suppose Promised Land requires that she charge the same price per session, for all customers and all sessions. Assume she is the only fortune teller who is allowed to operate in Promised Land. How many sessions does she sell? What price does she charge? What is her profit? What are the consumers’ and producers’ surpluses?

(b) Now assume that Promised Land drops the uniform price requirement. Assume further that Esmeralda can not only tell the future; she can also read customers’ minds, and see exactly how much each customer is willing to pay, at most, for each session. That is, she knows all their willingnesses-to-pay. What will she do now to maximize her profit? What prices will she charge? How many sessions will she sell? What profit will she make? What are the consumers’ and producers’ surpluses?

(c) Finally, assume that Promised Land decides to allow any and all fortune tellers to come and operate in the park. The fortune telling market becomes competitive. Assume the supply curve in this competitive market is \(y^S = p/2\), where p is the market price. Calculate the competitive equilibrium price and quantity. What are the consumers’ and producers’ surpluses? Compare these with corresponding values in parts (a) and (b).

Note: Treat y as a continuous variable.

The Solution

(a) We first invert the demand function to get price as a function of quantity: \(p(y) = 20 - y\). Esmeralda’s revenue function is \(R(y) = p(y)y = (20 - y)y = 20y - y^2\). We differentiate this to get marginal revenue: \(MR(y) = 20 - 2y\). Since her cost function is \(C(y) = y^2\), her marginal cost is \(MC(y) = 2y\). The first order condition for profit maximization for a non-price-discriminating monopolist is \(MR(y) = MC(y)\). This gives \(MR(y) = 20 - 2y = 2y = MC(y)\). The profit-maximizing number of sessions is therefore \(y = 5\). If we substitute this into the inverse demand function, we get \(p = 20 - y = 15\).
Her profit is \(\pi(y) = R(y) - C(y) = 20y - y^2 - y^2 = 100 - 25 - 25 = 50\).

See Figure 12.6 below for the demand function \(y^D\), the \(MR(y)\) function, and the \(MC(y)\) function, all straight lines. (Note that Figure 12.6 is very similar to Figure 12.2.) Figure 12.6 also includes the profit-maximizing output \(y = 5\) and price \(p(y) = 15\), as well as capital letters identifying some crucial points. (Point A, for example, is where marginal cost crosses marginal revenue, point B is where a vertical line from \(y = 5\) hits the demand curve, and point C is where marginal cost crosses the demand curve.)

Note that consumers’ surplus, in the case of the monopolist charging one price, is the area under the demand curve but above a horizontal line at the monopoly price; this is the area of the triangle HBG in Figure 12.6, which is \(\frac{1}{2} \times 5 \times 5 = 12.5\). Producers’ surplus is the area below the horizontal line at the monopoly price, but above the marginal cost curve; this is the area of the quadrilateral GBAE in the figure. Alternatively, given that there are no fixed costs in this example, producers’ surplus simply equals Esmeralda’s profit. Therefore producers’ surplus is 50, and the sum of consumers’ surplus and producers’ surplus is 62.5. (See Figure 12.2 for another view of the same sort of thing, with cross-hatching.) Note that deadweight loss is the area of triangle ABC.

INSERT FIGURE 12.6 HERE
Caption of Fig. 12.6: Assuming Esmeralda charges a uniform price p, the profit-maximizing output is \(y = 5\) and price is \(p(y) = 15\).

(b) If Esmeralda can read her customers’ minds, and if she is allowed to charge different prices as she pleases, then for each person and each session she charges the maximum that person would be willing to pay for that session. She is now a perfect price discriminating monopolist. This means that she works her way down the inverse demand curve, charging different prices for each session.
She continues to do this, unit by unit, as long as it is profitable to do so. Once she hits the point where willingness-to-pay equals marginal cost, she stops. To find that point, we set inverse demand equal to marginal cost, or \(20 - y = 2y\), which gives \(y = 20/3 \approx 6.67\). The price she charges for the last unit is \(p = 20 - 20/3 = 40/3 \approx 13.33\).

Her profit is revenue minus cost. Revenue is slightly complicated since she is charging a different price for each session. Given that the price for each session is the height of the inverse demand curve at that y, revenue is the area under the inverse demand curve from \(y = 0\) to \(y = 20/3\). That is, revenue is the area of the quadrilateral HCDE in Figure 12.6. In the figure, we see that HCDE is made up of a triangle HCF on top, plus a triangle FCE below the horizontal line at \(p = 13.33\) but above the marginal cost curve, plus another triangle CDE below the marginal cost curve and to the left of a vertical line at \(y = 6.67\). Using the usual rule for the area of a triangle, and adding together the areas of these three triangles, we get Esmeralda’s revenue as a perfect price discriminating monopolist:

\[ R(y) = \left( \frac{1}{2} \times \frac{20}{3} \times \left( 20 - \frac{40}{3} \right) \right) + \left( \frac{1}{2} \times \frac{20}{3} \times \frac{40}{3} \right) + \left( \frac{1}{2} \times \frac{20}{3} \times \frac{40}{3} \right) = 111.11. \]

We can find cost from the cost function: \(C(20/3) = (20/3)^2 = 44.44\). Alternatively, we can find it as the area under the marginal cost curve, that is, the area of the triangle CDE in the figure, which gives:

\[ C(y) = \frac{1}{2} \times \frac{20}{3} \times \frac{40}{3} = 44.44. \]

Finally, Esmeralda’s profit as a perfect price discriminating monopolist equals revenue minus cost. In Figure 12.6, this is the area of the quadrilateral HCDE minus the area of the triangle CDE, or the area of the large triangle HCE. This gives:

\[ \pi(y) = R(y) - C(y) = 111.11 - 44.44 = 66.67. \]

Note that for the perfect price discriminating monopolist, consumers’ surplus is zero, and producers’ surplus equals 66.67, the entire area of the large triangle HCE.
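The arithmetic in parts (a) and (b) can be checked with a short script. This is only a sketch under the problem's stated assumptions (demand \(y^D = 20 - p\), cost \(C(y) = y^2\)); the function and variable names are illustrative and not part of the text. Revenue under perfect price discrimination is computed here as the integral of the inverse demand curve, \(20y - y^2/2\), which is the same area the solution obtains by adding three triangles.

```python
from fractions import Fraction

def inverse_demand(y):
    """Inverse demand from the problem: p(y) = 20 - y."""
    return 20 - y

def cost(y):
    """Esmeralda's cost function: C(y) = y^2."""
    return y ** 2

# (a) Uniform price: MR = MC means 20 - 2y = 2y, so y = 20/4.
y_uniform = Fraction(20, 4)
p_uniform = inverse_demand(y_uniform)
profit_uniform = p_uniform * y_uniform - cost(y_uniform)
# Consumers' surplus: triangle under demand, above the monopoly price.
cs_uniform = Fraction(1, 2) * y_uniform * (20 - p_uniform)

# (b) Perfect price discrimination: p(y) = MC means 20 - y = 2y, so y = 20/3.
y_ppd = Fraction(20, 3)
# Revenue = area under inverse demand from 0 to y = 20y - y^2/2.
revenue_ppd = 20 * y_ppd - y_ppd ** 2 / 2
profit_ppd = revenue_ppd - cost(y_ppd)

print(float(y_uniform), float(p_uniform), float(profit_uniform), float(cs_uniform))
print(round(float(revenue_ppd), 2), round(float(cost(y_ppd)), 2), round(float(profit_ppd), 2))
```

Exact fractions are used so that the thirds in part (b) do not pick up rounding error before the final display.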
(c) With a competitive fortune-telling market, and supply curve \(y^S = p/2\), we calculate the equilibrium by setting supply equal to demand, or \(y^S = y^D\), which gives \(p/2 = 20 - p\). This gives a competitive equilibrium price of \(p = 40/3\) and a competitive equilibrium quantity of \(y = 20/3\). Note that the inverse supply function is \(p(y) = 2y\), identical to Esmeralda’s marginal cost function in parts (a) and (b).

Referring again to Figure 12.6, under the competitive outcome, consumers’ surplus is the area under the demand curve but above \(p = 40/3\), or the area of triangle HCF:

\[ CS = \frac{1}{2} \times \frac{20}{3} \times \left( 20 - \frac{40}{3} \right) = 22.22. \]

Producers’ surplus is the area below the horizontal line at \(p = 40/3\), but above the supply curve (which coincides with the original \(MC(y)\) curve), or the area of triangle FCE:

\[ PS = \frac{1}{2} \times \frac{20}{3} \times \frac{40}{3} = 44.44. \]

The sum of consumers’ surplus and producers’ surplus is now the area of the large triangle HCE:

\[ CS + PS = \frac{1}{2} \times 20 \times \frac{20}{3} = 66.67. \]

We see that the sum of consumers’ surplus and producers’ surplus, or total social surplus, is the same 66.67 in the perfect price discriminating monopolist case (part (b)), and in the competitive case (part (c)). In this sense, both the perfect price discriminating monopolist case and the competitive case are better for society than the case of the monopolist charging one price (part (a)), where the total of consumers’ surplus and producers’ surplus was 62.5. In the competitive case, the consumers actually do benefit: consumers’ surplus is 22.22. In the perfect price discrimination case, however, the monopolist gets all the surplus and consumers get none.

Exercises

1. Suppose a monopolist with constant marginal costs practices third degree price discrimination. Group A’s elasticity of demand is \(\epsilon_A\) and Group B’s is \(\epsilon_B\), and \(\epsilon_A > \epsilon_B\). Which group will face a higher price? Explain.

2. Vito Corleone’s family is the only supplier of good h in the U.S.
The market inverse demand for good h has been estimated to be \(p = 50 - \frac{h^D}{50}\). The costs of production and distribution are represented by \(C(h) = 12h\). Calculate the monopolist’s profit-maximizing level of output, the price at which it will be sold, and Corleone’s profits.

3. Consider a third degree price discriminating monopolist. Suppose \(p_1(y_1) = 100 - y_1\), \(p_2(y_2) = 75 - \frac{1}{2}y_2\), and let the cost curve be \(C(y) = y^2 = (y_1 + y_2)^2\). Show that the monopolist will produce \(y_1 = 18.75\), \(y_2 = 12.50\), and set prices \(p_1 = 81.25\), \(p_2 = 68.75\).

4. Horizon Telephone observes that there are two types of demand for telephone services: businesses and families. The businesses’ demand curve is \(x_B = 100 - p_B\), where \(x_B\) measures the hours of telephone services that businesses purchase per week and \(p_B\) is the price per hour charged to businesses. The families’ demand curve is \(x_F = 15 - \frac{p_F}{2}\), where \(x_F\) and \(p_F\) represent hours and price respectively. Horizon Telephone’s cost function is \(C(x) = x\), where \(x = x_B + x_F\).

(a) Suppose Horizon Telephone can price-discriminate between the two groups. Calculate the hours of telephone services that it sells to each group, the two prices, and total profits.

(b) Calculate the consumers’ and producer’s surplus under price discrimination.

(c) Suppose the government forbids price discrimination. Then, the total demand for telephone services is obtained from the horizontal sum of the demands from the two groups (businesses and families). Calculate the solution to the monopolist’s problem and its profits.

(d) Now calculate the consumers’ and producer’s surplus if price discrimination is forbidden. Is society better off or worse off after this change is introduced?

5. Sue has a monopoly over the production of strawberry shortcake. Her cost function is \(C(y) = y^2 + 10y\). The market demand curve for strawberry shortcake is \(p(y) = 100 - \frac{1}{2}y\).

(a) What is Sue’s profit-maximizing level of output \(y^*\)?
(b) What is the price \(p^*\) at this level of output?

(c) Calculate her profit \(\pi^*\).

(d) Find the consumers’ surplus at \(p^*\) and \(y^*\).

6. Consider Sue, the strawberry shortcake monopolist from Question 5. Suppose the dictator decides to force Sue to price at marginal cost.

(a) What is Sue’s new profit-maximizing level of output \(y^{**}\)? Compare your answer to Q5(a).

(b) What is the new price \(p^{**}\)? Compare your answer to Q5(b).

(c) Calculate her profit \(\pi^{**}\). Compare your answer to Q5(c).

(d) Find the consumers’ surplus at \(p^{**}\) and \(y^{**}\). Compare your answer to Q5(d).

(e) How does total welfare compare to the situation in Question 5?

13 Duopoly

13.1 Introduction

In this chapter, we study market structures that lie between perfect competition and monopoly. As before we assume, at least in most of this chapter, that there is one homogeneous good which is the same no matter who makes it. We assume everyone has perfect information about the good and its price. In our discussion of monopoly, we assumed there were barriers to entry which preserved the monopolist’s position. In this chapter, we also assume there are barriers to entry which prevent other firms from entering the market. However, we now assume there are already two (or more) firms in the market.

An oligopoly is a market with just a few firms. For instance, the market for cell phone service in our part of the U.S. is currently dominated by Verizon Wireless, AT&T, and Sprint, a total of three large companies. (There are also some smaller companies.) In this market, each of these large firms realizes that its own output and the output of each of its competitors will affect the market price. In contrast, in a competitive market (like the markets for wheat, corn, or cattle), there are hundreds or thousands of firms supplying the good, and each firm can safely ignore the possible effect of its own output or each competitor’s output on the price.
In this chapter, we assume that each firm takes into account how its own output, and its competitors’ outputs, affects the price, and through the price, its own profit.

Since one firm’s output decision will affect the profits of the other firms, firms in an oligopoly are likely to act strategically. Two firms are acting strategically when each looks at what the other is doing, and thinks along these lines: “I’ve got to make my production decisions contingent on what he does. If he sells 1,000 units, then to maximize my profits, I have to sell 1,100 units. And if I produce 1,100 units, does it make sense for him to produce 1,000 units?” The firms are reacting to each other. In a competitive market, in contrast, the firms do not react to each other; they only react to the market price, which they take as predetermined or fixed.

In this chapter, we will assume there are only two firms in the market. A market with just two firms is called a duopoly. Obviously a duopoly is the simplest sort of oligopoly, and many of the concepts and results that we will describe can be extended to the case of an oligopoly with more than two firms.

Duopoly analysis by economists dates back to the 19th century. Some of the central concepts of duopoly analysis have to do with strategic behavior, and the analysis of strategic behavior is the heart of the 20th century discipline called game theory. So game theory builds on duopoly theory. We will turn to game theory in the next chapter.

There are two fundamentally different approaches to duopoly theory. The first assumes that duopolists compete with each other through their choices of quantity: each firm decides on the quantity it should produce and sell in the market, contingent on the other firm’s quantity. The second assumes that duopolists compete with each other through their choices of price: each firm decides on the price it should charge, contingent on the price the other firm is charging.
The first approach was taken by the French mathematician and economist Antoine Augustin Cournot (1801-1877), who wrote about duopoly in 1838. The second approach was developed by another French mathematician, Joseph Louis François Bertrand (1822-1900), in 1883.

We will start in Section 2 by describing the basic Cournot duopoly model, and we will develop that model in Sections 3 and 4. The crucial behavioral assumption of the Cournot model is that each firm assumes the other firm’s output is given and fixed, and maximizes its own profit based on that assumption.

There are other behavioral assumptions that might be made about the two firms. One is the assumption made by the German economist Heinrich von Stackelberg (1905-1946). Stackelberg assumed, as did Cournot, that the firms make decisions about quantities. But he also assumed, unlike Cournot, that the two firms act differently: one of the duopolists acts as a follower (as in Cournot’s model), taking the other firm’s output as given and fixed, and choosing its own output based on that assumption, but the other duopolist acts as a leader, by anticipating that its rival will act as a follower, and choosing its own output based on that knowledge. We will describe the Stackelberg model in Section 5.

In Section 6, we will describe the Bertrand model, in which the firms compete with each other through their choices of price, instead of competing, as in Cournot (and Stackelberg), through their choices of quantity. We will see that there are two rather different versions of Bertrand’s model, depending on whether the good produced by the two firms is exactly the same (the homogeneous good case), or somewhat different (the differentiated goods case, e.g., Coke and Pepsi).

13.2 Cournot Competition

Assume there are two firms in the market. Firm 1 produces \(y_1\) units of the good; firm 2 produces \(y_2\) units. The total amount produced is \(y = y_1 + y_2\).
We assume there is a downward sloping inverse market demand curve \(p(y) = p(y_1 + y_2)\). We assume firm i has a cost curve \(C_i(y_i)\), for \(i = 1, 2\). Firm 1 wants to maximize its profit \(\pi_1\), given by:

\[ \pi_1(y_1, y_2) = p(y_1 + y_2)\,y_1 - C_1(y_1). \]

Similarly, firm 2 wants to maximize its profit \(\pi_2\), given by:

\[ \pi_2(y_1, y_2) = p(y_1 + y_2)\,y_2 - C_2(y_2). \]

The basic Cournot assumption is this: when firm 1 chooses its output \(y_1\) to maximize its profit, it takes firm 2’s output \(y_2\) as given and fixed; and, similarly, when firm 2 chooses its output \(y_2\) to maximize its profit, it takes firm 1’s output \(y_1\) as given and fixed. Therefore when firm 1 differentiates its profit function \(\pi_1(y_1, y_2)\), it treats \(y_2\) as a constant. This leads to the first order condition

\[ \frac{\partial \pi_1}{\partial y_1} = p(y) + \frac{dp(y)}{dy}\,y_1 - \frac{dC_1(y_1)}{dy_1} = 0. \]

Now firm 1 can solve this equation for \(y_1\) as a function of \(y_2\). We write the result as \(y_1 = r_1(y_2)\). The function \(r_1\) is called firm 1’s reaction function. It shows, for any output level \(y_2\) of firm 2, the quantity of the good that firm 1 should produce in order to maximize its profit.

Similarly, firm 2 maximizes its profit subject to the assumption that \(y_1\) is a constant. This leads to

\[ \frac{\partial \pi_2}{\partial y_2} = p(y) + \frac{dp(y)}{dy}\,y_2 - \frac{dC_2(y_2)}{dy_2} = 0. \]

Firm 2 can solve this equation for \(y_2\) as a function of \(y_1\), and we write the result as \(y_2 = r_2(y_1)\). The function \(r_2\) is firm 2’s reaction function. It shows, for any output level \(y_1\) of firm 1, the quantity of the good that firm 2 should produce in order to maximize its profit.

Now if the two firms randomly choose their output levels \(y_1\) and \(y_2\), it is almost certain that neither would be maximizing its profits subject to what the other one is doing. Neither firm would be behaving in a clever way. The result wouldn’t make sense. It would be doubly stupid.
And if firm 2 randomly chooses an output level \(y_2\), and then firm 1 uses its reaction function \(r_1\) to choose its output level \(y_1\), the result would be half sensible: sensible on the part of firm 1, but stupid on the part of firm 2.

But suppose the reaction functions intersect at a point \(y_1^*\) and \(y_2^*\), and suppose firm 1 chooses \(y_1^*\) and firm 2 chooses \(y_2^*\). This outcome does make very good sense for both firms, because firm 1 is making the best choice it can, subject to what firm 2 has chosen, and firm 2 is making the best choice it can, subject to what firm 1 has chosen. A Cournot equilibrium in a duopoly model is a pair of output levels \(y_1^*\) and \(y_2^*\) that are consistent in this sense: each firm i is maximizing its profit at \(y_i^*\), subject to what the other firm j has chosen, \(y_j^*\). The Cournot equilibrium is Augustin Cournot’s brilliant solution to the duopoly puzzle. In short, a Cournot equilibrium is a consistent, self-sustaining, and self-reinforcing outcome in the duopoly model.

We now turn to an example to show how the Cournot equilibrium can be found.

Example 1. Assume the inverse demand curve is \(p(y_1 + y_2) = 100 - y = 100 - y_1 - y_2\). Assume the cost curves are \(C_1(y_1) = 25y_1\) and \(C_2(y_2) = 25y_2\). Marginal cost for either firm is a constant 25. To find firm 1’s reaction function, we find the \(y_1\) that maximizes \(\pi_1(y_1, y_2)\), under the assumption that \(y_2\) is constant. Firm 1’s profit is:

\[ \pi_1(y_1, y_2) = (100 - y_1 - y_2)\,y_1 - 25y_1. \]

Differentiating with respect to \(y_1\) while holding \(y_2\) constant, and setting the result equal to zero, gives:

\[ \frac{\partial \pi_1}{\partial y_1} = 100 - 2y_1 - y_2 - 25 = 0. \]

Now solving for \(y_1\) as a function of \(y_2\) gives firm 1’s reaction function:

\[ y_1 = r_1(y_2) = 37.5 - y_2/2. \]

Firm 2’s profit \(\pi_2(y_1, y_2)\) is:

\[ \pi_2(y_1, y_2) = (100 - y_1 - y_2)\,y_2 - 25y_2. \]

Differentiating with respect to \(y_2\) while holding \(y_1\) constant, and setting the result equal to zero, gives:

\[ \frac{\partial \pi_2}{\partial y_2} = 100 - 2y_2 - y_1 - 25 = 0. \]

Therefore firm 2’s reaction function is:

\[ y_2 = r_2(y_1) = 37.5 - y_1/2. \]
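As a rough check on Example 1, the two reaction functions can be solved simultaneously in a few lines. The sketch below assumes the linear demand \(p = 100 - y_1 - y_2\) and constant marginal cost of 25 from the example; the names `A`, `B`, `MC`, and `reaction` are illustrative choices, not notation from the text. Because the two firms are identical, the intersection is symmetric, so setting \(y = r(y)\) gives \(y = (A - MC)/(3B)\).

```python
from fractions import Fraction

# Example 1 parameters: inverse demand p = A - B*(y1 + y2), marginal cost MC.
A, B, MC = 100, 1, 25

def reaction(y_other):
    """Best response from the first order condition
    A - 2*B*y_own - B*y_other - MC = 0, solved for y_own."""
    return (A - MC - B * Fraction(y_other)) / (2 * B)

# Symmetric intersection of the two reaction functions:
# y = (A - MC - B*y) / (2B)  implies  y = (A - MC) / (3B).
y_star = Fraction(A - MC, 3 * B)
price = A - B * (2 * y_star)
profit = (price - MC) * y_star

# Consistency check: each firm's output is a best response to the other's.
assert reaction(y_star) == y_star

print(y_star, price, profit)  # 25 50 625
```

With other hypothetical values of `A`, `B`, or `MC` the same formula applies, as long as both firms share the same constant marginal cost.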
In Figure 13.1 below, we show the reaction functions and the Cournot equilibrium in Example 1. The Cournot equilibrium is the point where the two reaction functions intersect. Solving the two reaction function equations simultaneously (\(y_1 = 37.5 - y_2/2\) and \(y_2 = 37.5 - y_1/2\)) easily gives \((y_1^*, y_2^*) = (25, 25)\). At (25, 25), each firm is maximizing its profit, given what the other firm is doing. The market price is \(100 - 25 - 25 = 50\). The reader can easily check that profit levels for the firms are \((\pi_1, \pi_2) = (625, 625)\). The output levels are mutually consistent; neither firm has an incentive to change, given what the other firm is doing. The Cournot equilibrium (25, 25) makes sense for firm 1, and simultaneously makes sense for firm 2.

INSERT FIGURE 13.1 HERE
Caption of Fig. 13.1: The Cournot equilibrium in Example 1.

Comparison with monopoly and competition. We can use Example 1 to show how a Cournot equilibrium in a duopoly compares to a monopoly outcome and a competitive outcome. The general result is that in a duopoly (and more generally an oligopoly), total output and price lie somewhere between what they would be under competition and what they would be under monopoly.

In Example 1, remember that the Cournot equilibrium price is $50, and the total quantity is 50. How would we describe the competitive outcome? We would have the same demand curve, but price would be equal to marginal cost. That is, the competitive supply curve would be a horizontal line at \(p = 25\). Combining this with the inverse demand curve \(p = 100 - y\) gives a competitive equilibrium at \(p_C\) = $25 and \(y_C = 75\), where the “C” subscript means “competitive.”

How would we describe the monopoly outcome? The monopolist would maximize profit by setting marginal revenue equal to marginal cost. In the example, \(MR(y) = 100 - 2y\) and \(MC(y) = 25\). Therefore \(100 - 2y = 25\), or \(y = 37.5\). Putting this y in the equation for the inverse demand curve gives \(p = 100 - y = 62.5\).
In the monopoly solution, then, we have \(p_M\) = $62.5 and \(y_M = 37.5\), where the “M” subscript stands for “monopoly.”

We conclude that the Cournot equilibrium in a duopoly lies between the competitive outcome and the monopoly outcome, both for quantity and price.

What about efficiency? We will now investigate the social surplus created at the Cournot equilibrium in the duopoly. As you might expect, the duopoly social surplus lies between the social surplus in the monopoly market, and the social surplus in the competitive market. In short, duopoly (and more generally oligopoly) creates some deadweight loss, but not as much as monopoly creates.

We show this in Figure 13.2 below. The figure is based on Example 1. It shows total output y on the horizontal axis. The outermost line is the inverse demand curve \(p(y) = 100 - y\). A monopolist in this market would find the corresponding marginal revenue curve \(MR(y) = 100 - 2y\). This is the steeper downward-sloping line shown in the figure. A monopolist would set marginal revenue equal to marginal cost, point A in the figure, to get the quantity \(y_M = 37.5\). He would then go up to the demand curve, to point B, and get the price \(p_M\) = $62.5. Total social surplus under monopoly in this example would be consumers’ surplus (the cross-hatched triangle) plus producer’s surplus (the cross-hatched square).

If this market were a duopoly, the Cournot equilibrium total quantity would be 50 (shown on the horizontal axis), and the price would be $50 (shown on the vertical axis). In a transition from a monopoly to duopoly, consumers’ surplus would grow and producers’ surplus would shrink. But the sum of the two welfare measures would definitely grow. It would grow by the area of the horizontally cross-hatched trapezoid in the figure.

Finally, if this were a competitive market, the equilibrium would require that price equal marginal cost (point C in the figure).
In a transition from duopoly to competition, consumers’ surplus would greatly expand and producers’ surplus would disappear. However, social surplus would definitely grow, by the area of the non-cross-hatched triangle.

INSERT FIGURE 13.2 HERE
Caption of Fig. 13.2: Welfare analysis of the duopoly, based on Example 1.

We conclude that the competitive outcome is best for society in the sense that it maximizes social surplus. The Cournot equilibrium in a duopoly is worse than the competitive outcome. The monopoly outcome is the worst of all.

13.3 More on Dynamics

We have been a little bit vague about how our two firms get to the Cournot equilibrium. The sophisticated and modern game-theory oriented economist looks at Cournot’s model and describes it as a simultaneous move game. This means that firms 1 and 2, with full knowledge of market demand, and full knowledge of their own cost function and their rival’s cost function, choose their output levels, one time only, and simultaneously. They end up with a pair of output levels \((y_1, y_2)\). If the pair is a Cournot equilibrium, the outcome makes sense for both firms; it’s doubly sensible. If it’s not a Cournot equilibrium, the outcome fails to make sense for at least one of the firms, and possibly for both firms. If each firm is a rational profit maximizer and expects the other firm to also be a rational profit maximizer, they should end up at the Cournot equilibrium.

It may be useful to discuss some other possible dynamics in Cournot’s model. These descriptions of dynamics necessarily go beyond the simple assumption of simultaneity. One possible dynamic has the firms taking turns reacting to each other. First, firm 1 reacts to firm 2’s output; then firm 2 reacts to firm 1’s output, and the process goes on until it (hopefully) gets to a Cournot equilibrium.
To make this discussion more understandable, let’s assume that there is a time dimension, and production and consumption are repeated time unit after time unit, say, day after day.

Let us assume, then, that firms 1 and 2 start at some initial output quantities, on day 0, say, \((y_1^0, y_2^0)\). On the morning of day 1, firm 1 looks at firm 2’s output, and calculates what it should produce, contingent on what firm 2 produced on day 0. That is, it goes to its reaction function, and calculates \(y_1 = r_1(y_2^0)\). This gives \((y_1^1, y_2^1)\) as the firm outputs on day 1, where \(y_1^1 = r_1(y_2^0)\) and where \(y_2^1 = y_2^0\).

On the morning of day 2, firm 2 looks at firm 1’s output, and calculates what it should produce, contingent on what firm 1 produced on day 1. That is, it goes to its reaction function, and calculates \(y_2 = r_2(y_1^1)\). This gives \((y_1^2, y_2^2)\) as the firm outputs on day 2, where \(y_1^2 = y_1^1\) and where \(y_2^2 = r_2(y_1^1)\).

In other words, the two firms take turns reacting to each other. The process continues, day after day, until it (hopefully) converges to a point where neither firm wants to make further modifications to its daily output. That point is a Cournot equilibrium. Of course, it is slightly odd to think that each firm will use its reaction function at each of its turns, since the reaction functions are based on the assumption that the rival’s output is fixed, even though the rival is in fact changing its planned output every other day. (A more rigorous treatment of this and other dynamic adjustment processes is beyond the scope of this book.)

Figure 13.3 illustrates this dynamic story. The process starts at some initial output levels \(P^0 = (y_1^0, y_2^0)\), shown on the vertical axis in the figure. On day 1, the process moves to \(P^1\); on day 2, it moves to \(P^2\), and so on. As in the story of Genesis, on the 7th day they rest. In the figure, the process converges nicely to the Cournot equilibrium.
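The turn-taking story can be simulated directly. The following is a minimal sketch using the Example 1 reaction function \(r(y) = 37.5 - y/2\) for both firms; the starting point (0, 50) and the helper name `alternate` are hypothetical choices made only for illustration (a point on the vertical axis, in the spirit of \(P^0\)), not values taken from the figure.

```python
def r(y_other):
    """Example 1 reaction function, the same for both firms: 37.5 - y_other/2."""
    return 37.5 - y_other / 2

def alternate(y1, y2, days):
    """Firms take turns: firm 1 reacts on odd days, firm 2 on even days.
    Returns the list of (y1, y2) pairs, starting with day 0."""
    path = [(y1, y2)]
    for day in range(1, days + 1):
        if day % 2 == 1:
            y1 = r(y2)  # firm 1 best-responds to firm 2's previous output
        else:
            y2 = r(y1)  # firm 2 best-responds to firm 1's previous output
        path.append((y1, y2))
    return path

# Hypothetical starting point on the vertical axis (y1 = 0).
path = alternate(0.0, 50.0, days=40)
for day, (y1, y2) in enumerate(path[:5]):
    print(day, y1, y2)
print(path[-1])
```

In this example each reaction halves the remaining distance to 25, so the path approaches (25, 25) quickly from this starting point.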
However, this dynamic process would not converge if the reaction functions had the wrong slopes at the equilibrium. The reader is invited to relabel the reaction functions to see what happens if \(r_1\) is less steep than \(r_2\)!

INSERT FIGURE 13.3 HERE
Caption of Fig. 13.3: A dynamic story about the Cournot equilibrium, based on Example 1.

13.4 Collusion

Let’s return to Example 1, and assume that our duopolists are at the Cournot equilibrium. Once again, to make this discussion more understandable, we assume that there is a time dimension, and production and consumption are repeated day after day. Since the two firms are at the Cournot equilibrium, they are producing and selling \(y_1^* = y_2^* = 25\), day after day. Given those production levels, the market price is \(p = 100 - 25 - 25 = 50\), day after day. Each firm has profits of \(\pi_i = p y_i^* - 25 y_i^* = 625\), day after day.

Now suppose one day the owners of the two firms meet for a game of golf. They have the following conversation:

Firm 1 owner: “I’m maximizing my profits at \(y_1^* = 25\). But this is based on your holding your output constant at \(y_2^* = 25\). What if we both cut output a little bit? Could we make more money that way?”

Firm 2 owner: “Well, if we each cut production by one unit, the market price would rise to $52, since the price is given by \(p = 100 - y_1 - y_2\). This means my revenue would change from $50 × 25 to $52 × 24. That’s almost no change; it’s a drop of $2, to be exact.”

Firm 1 owner: “But your costs would drop by $25. So your profit would shoot up.”

Firm 2 owner: “That’s right. In short, if we both cut back output by one unit, your profit would rise by $23, and mine would too!”

Then their caddy speaks up: “I’m an undercover federal agent.
You are both under arrest for colluding and conspiring to fix prices in the market for the gizmos you are producing.”

As the presence of our fictional caddy/federal agent suggests, it may be illegal for two duopolists, or more generally a group of firms in an oligopoly, to get together and make plans like this. A cartel is a group of producers or firms who organize (or conspire) to raise the price of the good they are selling by restricting supply. Under the antitrust laws of the U.S. and other developed nations, cartels are usually, but not always, illegal. One of the most notorious (but outside the reach of law) cartels of recent history is OPEC, the Organization of Petroleum Exporting Countries. This is an organization of countries whose main purpose is to keep petroleum prices high by controlling production in member countries. Legal cartels in the U.S. include sports leagues, such as Major League Baseball and the National Football League.

What exactly would our two duopolists do if they took it upon themselves to maximize joint or total profit, rather than simply letting each firm maximize its own profit, conditional on the other firm’s output? As our discussion above suggests, they might gain a lot if they agree to both reduce output.

Let \(\pi(y_1, y_2) = \pi_1(y_1, y_2) + \pi_2(y_1, y_2)\) represent total profit for the two firms combined. Then

\[ \pi(y_1, y_2) = p(y_1 + y_2)(y_1 + y_2) - C_1(y_1) - C_2(y_2). \]

The first order conditions for maximizing this function of two variables are:

\[ \frac{\partial \pi}{\partial y_1} = p(y_1 + y_2) + \frac{\partial p(y_1 + y_2)}{\partial y_1}(y_1 + y_2) - \frac{dC_1(y_1)}{dy_1} = 0 \]

and

\[ \frac{\partial \pi}{\partial y_2} = p(y_1 + y_2) + \frac{\partial p(y_1 + y_2)}{\partial y_2}(y_1 + y_2) - \frac{dC_2(y_2)}{dy_2} = 0. \]

Both of these conditions must hold for total profit to be maximized, at least for an interior maximum. (The first order condition for a maximum at a boundary is slightly different.)

In Example 2 below, we will examine joint profit maximization for the simple duopoly introduced in Example 1.

Example 2.
From Example 1, we have

\[ \pi(y_1, y_2) = (100 - y_1 - y_2)(y_1 + y_2) - 25y_1 - 25y_2 = 100y_1 + 100y_2 - y_1^2 - y_2^2 - 2y_1y_2 - 25y_1 - 25y_2. \]

Taking partial derivatives with respect to \(y_1\) and \(y_2\), we get

\[ \frac{\partial \pi}{\partial y_1} = 100 - 2y_1 - 2y_2 - 25 = 0 \quad \text{or} \quad y_1 + y_2 = 37.5. \]

Similarly,

\[ \frac{\partial \pi}{\partial y_2} = 100 - 2y_1 - 2y_2 - 25 = 0 \quad \text{or} \quad y_1 + y_2 = 37.5. \]

The first order conditions for joint profit maximization are identical, because the two firms have identical cost curves. We conclude that joint profit maximization requires \(y_1 + y_2 = 37.5\). For example, each firm could produce \(y_i = 37.5/2 = 18.75\). With these levels of output, \(\pi_1(y_1, y_2) = \pi_2(y_1, y_2) = (100 - 37.5)(18.75) - (25)(18.75) = 703.125\). Each firm would be making $703.125. This is considerably better than the $625 profit for each firm at the Cournot equilibrium.

In Figure 13.4 below, we show the joint profit maximization points, the collusion outcomes, for this duopoly example. The shaded line is the set of outcomes which maximize joint profits, that is, the set for which \(y_1 + y_2 = 37.5\). Note that (18.75, 18.75) is one of many possibilities, but they all involve total output of 37.5 units.

Finally, the reader should remember our Figure 13.2 comparison of monopoly, duopoly, and competition. In conjunction with that figure, we determined that a monopoly firm would produce 37.5 units. In Example 2 and Figure 13.4, the two duopolists together are producing a total of 37.5 units. We get the same answer because the duopolists in Example 2 are acting just like a monopolist!

INSERT FIGURE 13.4 HERE
Caption of Fig. 13.4: The reaction curves, the Cournot equilibrium, and the collusion outcomes, all based on Example 1.

Happily for consumers of their products, cartels and colluding duopolists are inherently unstable. Because a collusion agreement is not a Cournot equilibrium, each firm has an incentive to cheat on the agreement. For instance, to continue our numerical example, suppose the two duopolists have agreed to be at the joint profit maximizing point (18.75, 18.75).
Some time later, the owner of firm 1 wakes up one morning, and says to himself: “The hell with that lawbreaking S.O.B. If he’s going to produce 18.75 units per day, I shall greatly increase my own profits by using my reaction function to figure out what I should produce.” The answer is \(y_1 = r_1(y_2) = 37.5 - y_2/2 = 37.5 - 18.75/2 = 28.125\). As soon as the owner of firm 1 figures this out, he produces 28.125 units per day. This raises firm 1’s profits from $703.125 per day to \(\pi_1(y_1, y_2) = (100 - 28.125 - 18.75)28.125 - 25(28.125) \approx 791\), that is, about $791 per day, a gain of nearly $88.

Shortly thereafter, the owner of firm 2 realizes he’s been duped. So firm 2 reacts to firm 1’s output of \(y_1 = 28.125\). Firm 2 switches to \(y_2 = r_2(y_1) = 37.5 - y_1/2 = 37.5 - 28.125/2 \approx 23.44\). And so it goes. After a few rounds of this reacting and re-reacting, the duopoly may end up back at, or near, the Cournot equilibrium of (25, 25).

The point of this discussion is that duopolists, and more generally members of cartels, always have incentives to collude, to get together and plot against the public, to figure out how they might reduce output and increase their joint profits. However, having come to some kind of collusion agreement, the duopolists, or the cartel members, will be tempted to cheat. If they do start to cheat, they are likely to drift back toward a Cournot equilibrium.

This, then, is the big dynamic: independent profit maximization leads toward the Cournot solution. Then joint profit maximization leads toward the collusion solution. Unless the firms can enforce their collusion agreements, cheating and independent profit maximization lead back toward the Cournot solution. So turns the world of duopoly, or the world of cartels. In the absence of collusion enforcement mechanisms, the likely prediction for a duopoly, or for a cartel, is instability.

13.5 Stackelberg Competition

We will now turn to the duopoly model of the German economist Heinrich von Stackelberg.
Stackelberg assumed that one of the duopoly firms acts like a Cournot duopolist. That is, it takes the other firm’s output as given and fixed, and it chooses its own output based on that assumption. We call this firm the follower. Stackelberg assumed that the other firm anticipates this behavior, and maximizes its profit based on the assumption that its rival is a follower. We call this firm the leader.

Recall that in the analysis of the Cournot equilibrium, we formally assumed a simultaneous move structure. That is, we assumed the interaction between the two firms was one time only, and simultaneous. Our extensive informal discussion of dynamics involved stories about day-by-day interactions, reactions, and re-reactions, but that discussion was not necessary for the formal definition of the Cournot equilibrium.

In order to describe the Stackelberg model, we now formally assume that the interaction between the two firms takes place in two sequential steps. In the first step, the leader firm determines its planned output. In the second and final step, the follower firm determines its output. The firms then produce and sell their outputs, at a market price contingent on \(y_1 + y_2\), and make their profits.

Here is how it works. We will let firm 1 be the leader firm; its output is \(y_1\). Firm 2 is the follower firm; its output is \(y_2\). The follower firm is the second firm to act. It knows what the leader firm is producing, because that was determined (and announced) at step one. The follower firm acts just like a firm in the Cournot analysis: it takes \(y_1\) as given, and determines what maximizes its own profits given \(y_1\). Therefore it uses its reaction function to determine its output. That is, \(y_2 = r_2(y_1)\). But firm 1, the leader firm, knows how firm 2, the follower firm, behaves. That is, firm 1 anticipates that firm 2 will choose \(y_2\) by using its reaction function formula.
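The two-step timing just described can be sketched numerically. The example below is a rough illustration under stated assumptions: it borrows the inverse demand, the marginal cost of 25, and firm 2’s reaction function \(r_2(y_1) = 37.5 - y_1/2\) from Example 1, and it replaces calculus with a search over a hypothetical grid of leader outputs. The function names are invented for this sketch.

```python
# Sketch of the leader-then-follower structure, using Example 1's numbers.
# Assumed: inverse demand p = 100 - (y1 + y2), marginal cost 25 for both firms.

def price(total_output):
    return 100 - total_output

def follower_reaction(leader_output):
    """Firm 2's reaction function from Example 1, floored at zero output."""
    return max(0.0, 37.5 - leader_output / 2)

def leader_profit(leader_output):
    """Firm 1's profit once it anticipates how the follower will respond."""
    follower_output = follower_reaction(leader_output)
    total = leader_output + follower_output
    return price(total) * leader_output - 25 * leader_output

# Step one: the leader picks the output that is best given the anticipated
# response. Hypothetical grid: 0 to 75 in steps of 0.125.
grid = [k / 8 for k in range(601)]
leader_output = max(grid, key=leader_profit)

# Step two: the follower observes the leader's output and reacts to it.
follower_output = follower_reaction(leader_output)

print(leader_output, follower_output, leader_profit(leader_output))
```

With these assumptions the leader chooses 37.5 units and the follower responds with 18.75. The leader’s profit of 703.125 exceeds the 625 it would earn by choosing its Cournot output of 25, which `leader_profit(25)` reproduces.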
Therefore firm 1 can use its knowledge of firm 2’s behavior, and use the fact that it goes first and firm 2 goes second. It does this in a simple way; it just substitutes \(r_2(y_1)\) for \(y_2\) in its own profit function. This gets rid of the \(y_2\) terms; firm 1’s profit is now simply a function of \(y_1\), and firm 1 simply chooses \(y_1\) to maximize profits.

Formally, firm 1’s profit is \(\pi_1(y_1, y_2) = p(y_1 + y_2)y_1 - C_1(y_1)\). Substituting \(r_2(y_1)\) for \(y_2\) makes this a function of one variable only: \(\pi_1(y_1) = p(y_1 + r_2(y_1))y_1 - C_1(y_1)\). The first order condition for profit maximization is to set the derivative of profit with respect to \(y_1\) equal to zero. This gives:

\[\frac{d\pi_1(y_1)}{dy_1} = p(y_1 + r_2(y_1)) + y_1 \frac{dp}{dy}\left(1 + \frac{dr_2(y_1)}{dy_1}\right) - MC_1(y_1) = 0.\]

Let’s apply the result to Example 1. Recall that in that example, \(p(y) = 100 - y = 100 - y_1 - y_2\), \(C_1(y_1) = 25y_1\), and \(r_2(y_1) = 37.5 - y_1/2\). Therefore \(dp/dy = -1\), and \(dr_2(y_1)/dy_1 = -1/2\). Plugging into the first order condition then gives:

\[(100 - y_1 - (37.5 - y_1/2)) - y_1(1 - 1/2) - 25 = 0.\]

This gives \(y_1 = 37.5\). Putting this into the follower’s reaction function gives: \(y_2 = r_2(y_1) = 37.5 - 37.5/2 = 18.75\).

Note that we know that a Stackelberg leader firm will never end up with a profit level lower than what it gets at the Cournot equilibrium. One option always available to the leader is to announce its Cournot output, to which the follower would respond with its Cournot output. This would produce the Cournot profits for the two firms.

Figure 13.5 below shows the Cournot equilibrium, the collusion outcomes, and the Stackelberg equilibrium for the numerical example we have been using throughout this chapter.

[Figure 13.5: The reaction curves, the Cournot equilibrium, the collusion outcomes, and the Stackelberg equilibrium, all based on Example 1.]

13.6 Bertrand Competition

We will now turn to the model developed by the mathematician Joseph Louis Francois Bertrand (1822-1900).
While the Cournot model of duopoly assumes that each of the two firms decides on what quantity to produce, the Bertrand model assumes that each firm decides on what price to charge. As we shall see, this approach can lead to a very interesting but somewhat unrealistic model, with implications that are very different than the implications of the Cournot model. Or it can lead to a model that is perhaps more realistic than Cournot’s, but with implications similar to Cournot’s. This difference arises because we can develop the Bertrand analysis in either of two ways:

(1) We can assume, as we assumed for the Cournot model, that the two firms are producing exactly the same good. That is, whether produced by firm 1 or firm 2, a unit of the good is a unit of the good, as far as the buyers are concerned. This is the property of homogeneity. Commodities like electricity, oil, metals, or wheat, are homogeneous; whether produced by firm 1 or firm 2, a gallon of fuel oil is a gallon of fuel oil. For a homogeneous good being produced and sold by two firms, there can be only one price; if firm 1 tries to sell it at a slightly higher price than firm 2, its sales drop to zero.

(2) Alternatively, we can assume that the two firms produce goods which are similar but slightly different, or differentiated. Think of Coke and Pepsi, McDonald’s and Burger King, Bud Light and Miller Lite, or Schick and Gillette. If the two firms produce goods which are differentiated, they can charge different prices, and in fact commonly do so. Each is likely to claim that its good is both better and less expensive.

Homogeneous goods case. Let us now assume that firms 1 and 2 produce exactly the same good. For simplicity, we will assume that the two firms have identical constant marginal cost functions. Let \(MC\) represent the marginal cost of producing a unit of the good, and let \(MC\) be the same for both firms and constant over all output levels.
Let \(y_1\) and \(y_2\) represent the production levels of the two firms. We assume the firms set prices \(p_1\) and \(p_2\), and then sell their output to meet demand. Since the good is homogeneous, if firm \(i\) sets a lower price than firm \(j\), then all the customers will buy from firm \(i\). This implies that in any equilibrium where both firms are operating, we must have \(p_1 = p_2 = p\), where \(p\) is the (single) market price. If there is one price \(p\), the market demand curve is given by a function \(y = y(p)\). We will assume that if both firms are charging the same price \(p\), they will split the market demand equally: each will sell \(y_1 = y_2 = y(p)/2\) units of the good.

This model cannot be solved using standard calculus techniques. This is because although the function \(y(p)\) is well behaved, firm \(i\)’s demand function \(y_i(p_i)\) is not. It has a sharp discontinuity when \(p_i\) equals \(p_j\). If \(p_i < p_j\), demand for firm \(i\) is \(y(p_i)\); if \(p_i = p_j\), demand for firm \(i\) is \(y(p_i)/2\); and if \(p_i > p_j\), demand for firm \(i\) is zero.

Since we cannot use standard calculus techniques, we must reason along more abstract lines. Recall our definition of a Cournot equilibrium from Section 2 above. In a model where the two duopolists are reacting to each other by setting quantities, a Cournot equilibrium is a pair of output levels \(y_1^*\) and \(y_2^*\) that are consistent, in the sense that each firm \(i\) is maximizing its profit at \(y_i^*\), subject to what the other firm \(j\) has chosen, \(y_j^*\). Let us now define an equilibrium in a similar way, but for the current model where the two duopolists are reacting to each other by setting prices. A Bertrand equilibrium is a pair of prices \(p_1^*\) and \(p_2^*\) that are consistent, in the sense that each firm \(i\) is maximizing its profit with the choice of \(p_i^*\), subject to what the other firm \(j\) has chosen, \(p_j^*\).

What can we say about a Bertrand equilibrium in the homogeneous goods case? Let \((p_1^*, p_2^*)\) represent the equilibrium prices and let \((y_1^*, y_2^*)\) represent the corresponding equilibrium quantities.
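The discontinuity in a firm’s demand can be made concrete with a small sketch. The text leaves the market demand \(y(p)\) general; purely for illustration, the code below assumes the linear demand \(y(p) = 100 - p\) and a marginal cost of 25, both borrowed from Example 1. The function names are hypothetical.

```python
# Sketch of one firm's demand in the homogeneous-goods Bertrand model.
# Assumptions (not from this section): y(p) = 100 - p and MC = 25.

MC = 25

def market_demand(p):
    """Assumed linear market demand, floored at zero."""
    return max(0.0, 100 - p)

def firm_demand(own_price, rival_price):
    """Demand facing one firm; it jumps when the two prices are equal."""
    if own_price < rival_price:
        return market_demand(own_price)      # the firm takes the whole market
    if own_price == rival_price:
        return market_demand(own_price) / 2  # the market is split equally
    return 0.0                               # the firm sells nothing

def firm_profit(own_price, rival_price):
    return (own_price - MC) * firm_demand(own_price, rival_price)

# Demand for a firm whose rival charges 30, at prices just below,
# equal to, and just above 30.
for own_price in (29.5, 30, 30.5):
    print(own_price, firm_demand(own_price, 30), firm_profit(own_price, 30))
```

Under these assumptions, moving from a price of 30 to 29.5 against a rival charging 30 raises the firm’s sales from 35 to 70.5 units and its profit from 175 to 317.25, while a price of 30.5 leaves it with no sales at all. No derivative of `firm_demand` exists at the point where the prices are equal, which is why the calculus approach breaks down.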
Here’s what we can conclude:

(1) The firms must be charging the same price. That is, \(p_1^* = p_2^* = p^*\). Suppose to the contrary that they are charging different prices, and without loss of generality, assume \(p_1^* < p_2^*\). Then firm 1 is selling a positive quantity of the good, and firm 2 is selling nothing.

(a) If \(p_1^* < MC\), then firm 1 has negative profits and would be better off shutting down. So this cannot be an equilibrium.

(b) If \(p_1^* = MC\), firm 1 is making $0 on each unit it produces and sells. It could increase its price somewhat, while keeping it below \(p_2^*\), and make positive amounts on all the units it sells. (It would sell fewer units, but it would make money on each one.) So this cannot be an equilibrium.

(c) If \(p_1^* > MC\), firm 1 is making positive profits on all the units it produces and sells. But if this were the case, firm 2 would gain by entering the market with a price strictly between \(MC\) and \(p_1^*\), taking all of firm 1’s customers away. So this cannot be an equilibrium either.

We have established that \(p_1^* \neq p_2^*\) implies we cannot have a Bertrand equilibrium. Therefore, at a Bertrand equilibrium, we must have \(p_1^* = p_2^* = p^*\). Since we have assumed that demand is split equally between the two firms when their prices are the same, we have \(y_1^* = y_2^* = y(p^*)/2\).

(2) Marginal cost cannot be less than price; that is, \(MC < p^*\) cannot hold. Here’s why. Suppose the inequality held. Assume for concreteness that \(MC = 25\) and that \(p_1^* = p_2^* = p^* = 26\). Then either firm, say, firm 1, could shave its price to \(p_1 = 25.99\). By doing so, it would steal away all of firm 2’s customers (half of the total market), make almost a dollar profit on each of those sales, while giving up a penny’s profit on each of the sales it already had (half of the total market). Its profits would obviously go way up. This contradicts our assumption that firm 1 is choosing a price that maximizes its profit subject to what firm 2 has chosen.
(3) Marginal cost cannot be greater than price; that is, \(MC > p^*\) cannot hold. With constant marginal costs, for either firm \(i\), \(MC > p^* = p_i^*\) would imply negative profits, and the firm would opt to go out of business, rather than sell \(y_i^*\) at a price of \(p^*\).

(4) For both firms, marginal cost equals price; that is, \(MC = p^* = p_1^* = p_2^*\). This obviously follows from (2) and (3).

We conclude that in a Bertrand equilibrium, in the homogeneous good case, under the assumptions we have made, firms 1 and 2 will charge the same price, and the price will be equal to marginal cost. But this means that the duopoly market, in the Bertrand model with a homogeneous good, looks just like a competitive market. In particular, there is no inefficiency (no loss of social surplus) in the duopoly market.

Differentiated goods case. Now we assume that firms 1 and 2 produce goods that are differentiated: similar, but not identical. (Think of McDonald’s and Burger King.) We continue to let \(y_1\) and \(y_2\) represent the outputs of the two firms, and \(p_1\) and \(p_2\) represent the prices. Since the goods are different, \(p_i < p_j\) does not imply that firm \(j\)’s sales will drop to zero, and the firms will not be forced to charge the same price in equilibrium.

Now there are demand functions for each of the two firms that depend on the two prices: \(y_1 = y_1(p_1, p_2)\) and \(y_2 = y_2(p_1, p_2)\). For firm \(i\)’s demand function \(y_i\), the partial derivative with respect to \(p_i\) is assumed to be negative (as \(i\) raises its price, demand for \(i\)’s good falls), but the partial derivative with respect to \(p_j\) is positive (as \(i\)’s competitor \(j\) raises its price, demand for \(i\)’s good rises).

Firm 1’s profit is \(\pi_1(p_1, p_2) = p_1 y_1(p_1, p_2) - C_1(y_1(p_1, p_2))\), and firm 2’s profit is \(\pi_2(p_1, p_2) = p_2 y_2(p_1, p_2) - C_2(y_2(p_1, p_2))\). We write these as functions of the two prices \((p_1, p_2)\) because each firm chooses its own price, rather than its own quantity as in the Cournot model.
Firm 1 chooses \(p_1\) to maximize \(\pi_1(p_1, p_2)\), taking \(p_2\) as given and fixed; firm 2 chooses \(p_2\) to maximize \(\pi_2(p_1, p_2)\), taking \(p_1\) as given and fixed. Firm 1’s first order condition is

\[\frac{\partial \pi_1}{\partial p_1} = y_1(p_1, p_2) + \frac{\partial y_1(p_1, p_2)}{\partial p_1} p_1 - \frac{dC_1(y_1)}{dy_1} \frac{\partial y_1(p_1, p_2)}{\partial p_1} = 0.\]

We can write this more compactly as

\[\frac{\partial \pi_1}{\partial p_1} = y_1 + \frac{\partial y_1}{\partial p_1} p_1 - MC_1 \frac{\partial y_1}{\partial p_1} = 0.\]

Similarly, firm 2’s first order condition is

\[\frac{\partial \pi_2}{\partial p_2} = y_2 + \frac{\partial y_2}{\partial p_2} p_2 - MC_2 \frac{\partial y_2}{\partial p_2} = 0.\]

Recall that when we analyzed the Cournot model, we used the two firms’ first order conditions to derive reaction functions. We can do the same here, using firm \(i\)’s first order condition to find a reaction function that shows firm \(i\)’s profit-maximizing price as a function of firm \(j\)’s price: \(p_i = r_i(p_j)\).

To find a Bertrand equilibrium, we look for a pair of prices \(p_1^*\) and \(p_2^*\) that are consistent in the sense that each firm \(i\) is maximizing its profit with the choice of \(p_i^*\), subject to what the other firm \(j\) has chosen, \(p_j^*\). We will again let \(y_1^*\) and \(y_2^*\) represent the corresponding equilibrium quantities. In the differentiated goods case, we use the first order conditions for profit maximization (or the reaction functions) to find the equilibrium.

We will illustrate with an example below. Before we do, however, let’s use firm 1’s first order condition to show that in the differentiated goods Bertrand model (unlike the homogeneous good Bertrand model), at equilibrium, the price will be greater than marginal cost.

Here’s why. At the equilibrium, firm 1’s first order condition must be satisfied, which gives:

\[y_1^* + \frac{\partial y_1}{\partial p_1} p_1^* - MC_1(y_1^*) \frac{\partial y_1}{\partial p_1} = 0.\]

Rearranging gives:

\[y_1^* = -\frac{\partial y_1}{\partial p_1} \left(p_1^* - MC_1(y_1^*)\right).\]

But \(y_1^*\) is positive, \(\partial y_1/\partial p_1\) is negative, and therefore \(p_1^* - MC_1(y_1^*)\) is positive, or \(p_1^* > MC_1(y_1^*)\). That is, in the differentiated goods case, at the Bertrand equilibrium, price is greater than marginal cost for firm 1, and similarly for firm 2.
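A numerical sketch may help fix ideas. It assumes a linear demand of the form \(y_i = 50 - p_i + p_j/2\) and a constant marginal cost of 25, the figures used in the chapter’s Example 3. Iterating the best responses is only a convenient numerical device chosen for this illustration, not the solution method of the text, and all names in the code are hypothetical.

```python
# Sketch: differentiated-goods Bertrand pricing with an assumed linear demand
# y_i = INTERCEPT - OWN_SLOPE * p_i + CROSS_SLOPE * p_j and constant MC.

INTERCEPT, OWN_SLOPE, CROSS_SLOPE, MC = 50.0, 1.0, 0.5, 25.0

def demand(own_price, rival_price):
    return INTERCEPT - OWN_SLOPE * own_price + CROSS_SLOPE * rival_price

def best_response(rival_price):
    """Price solving the first order condition y + (dy/dp)(p - MC) = 0,
    where dy/dp = -OWN_SLOPE for this linear demand."""
    return (INTERCEPT + CROSS_SLOPE * rival_price + OWN_SLOPE * MC) / (2 * OWN_SLOPE)

# Start both prices at marginal cost and let each firm respond to the other.
p1, p2 = MC, MC
for _ in range(100):
    p1, p2 = best_response(p2), best_response(p1)

y1 = demand(p1, p2)
# Right-hand side of the rearranged condition: -(dy1/dp1) * (p1 - MC).
markup_side = OWN_SLOPE * (p1 - MC)

print(p1, p2, y1, markup_side, p1 > MC)
```

The prices settle at approximately 50 each, well above the marginal cost of 25, and the quantity sold by firm 1 matches \(-\frac{\partial y_1}{\partial p_1}(p_1 - MC_1)\), as the rearranged first order condition requires.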
In short, in the differentiated goods case, a Bertrand equilibrium has social welfare properties similar to the Cournot equilibrium discussed in Section 2 above.

Example 3. We again assume the cost curves are \(C_1(y_1) = 25y_1\) and \(C_2(y_2) = 25y_2\), and so \(MC_1 = MC_2 = 25\). We now assume the demand functions for firms 1 and 2 are \(y_1(p_1, p_2) = 50 - p_1 + p_2/2\) and \(y_2(p_1, p_2) = 50 - p_2 + p_1/2\).

Firm 1 wants to choose \(p_1\) to maximize

\[\pi_1(p_1, p_2) = p_1(50 - p_1 + p_2/2) - 25(50 - p_1 + p_2/2),\]

taking \(p_2\) as given. The first order condition gives firm 1’s reaction function, \(p_1 = r_1(p_2) = 75/2 + p_2/4\). Similarly, firm 2 wants to choose \(p_2\) to maximize

\[\pi_2(p_1, p_2) = p_2(50 - p_2 + p_1/2) - 25(50 - p_2 + p_1/2).\]

The first order condition leads to firm 2’s reaction function, \(p_2 = r_2(p_1) = 75/2 + p_1/4\).

Solving the two reaction functions (or the two first order conditions) simultaneously gives the Bertrand equilibrium prices of \(p_1^* = 50\) and \(p_2^* = 50\). The equilibrium quantities are \(y_1^* = 25\) and \(y_2^* = 25\). Equilibrium profit levels are easily calculated. For firm 1, \(\pi_1(p_1^*, p_2^*) = p_1^* y_1^* - C_1(y_1^*) = 50 \times 25 - 25 \times 25 = 625\), and similarly for firm 2.

Figure 13.6 below shows the reaction functions and the equilibrium prices for Example 3.

[Figure 13.6: The Bertrand equilibrium in Example 3.]

13.7 A Solved Problem

The Problem

Recall the inverse demand function assumed in Example 1: \(p(y_1 + y_2) = 100 - y_1 - y_2\). The cost functions in that example were \(C_1(y_1) = 25y_1\) and \(C_2(y_2) = 25y_2\). In order to find the Cournot equilibrium, we used the reaction functions of firms 1 and 2: \(y_1 = r_1(y_2) = 37.5 - y_2/2\), and \(y_2 = r_2(y_1) = 37.5 - y_1/2\). Recall from Example 1 that the Cournot equilibrium was \((y_1^*, y_2^*) = (25, 25)\), and the profit levels at the Cournot equilibrium were \((\pi_1, \pi_2) = (625, 625)\).

(a) Now assume firm 1 is a Stackelberg leader, and firm 2 is a Stackelberg follower. Calculate the Stackelberg equilibrium price and quantities. Also find the Stackelberg equilibrium profit levels.
(b) What would happen if both firms believed that they were Stackelberg leaders?

The Solution

(a) If firm 1 is a Stackelberg leader and firm 2 is a Stackelberg follower, firm 2 acts like a standard Cournot firm; it takes firm 1’s output \(y_1\) as given and fixed and chooses its own output in response. In other words, it chooses \(y_2\) to maximize \(\pi_2(y_1, y_2)\), under the assumption that \(y_1\) is constant. This means that it derives and uses its reaction function \(y_2 = r_2(y_1) = 37.5 - y_1/2\).

Firm 1, on the other hand, knows that \(y_2\) is not fixed; in fact, firm 1 knows exactly how firm 2 chooses \(y_2\) based on \(y_1\). In other words, firm 1 knows firm 2’s reaction function and exploits that knowledge. We can now write firm 1’s profit function as

\[\pi_1(y_1, y_2) = p(y_1 + y_2)y_1 - C_1(y_1) = p(y_1 + r_2(y_1))y_1 - C_1(y_1) = (100 - y_1 - r_2(y_1))y_1 - 25y_1.\]

The next step is to substitute for \(r_2(y_1)\), using \(r_2(y_1) = 37.5 - y_1/2\). After a little minor algebra and some rearranging, we get \(\pi_1(y_1) = 37.5y_1 - y_1^2/2\). We differentiate this function, set the result equal to zero and solve, which produces \(y_1^* = 37.5\). The \(*\) denotes the Stackelberg equilibrium quantity for firm 1, the leader firm. Plugging \(y_1^*\) into firm 2’s reaction function will allow us to solve for the follower firm’s Stackelberg equilibrium quantity: \(y_2^* = 37.5 - y_1^*/2 = 18.75\).

To find the Stackelberg equilibrium price, we insert \(y_1^*\) and \(y_2^*\) in the inverse demand function, which gives \(p^* = 100 - 37.5 - 18.75 = 43.75\). Profit levels are given by \(\pi_1^* = p^* y_1^* - 25y_1^* = (p^* - 25)y_1^* = (43.75 - 25)37.5 = 703.13\). Similarly, \(\pi_2^* = (p^* - 25)y_2^* = (43.75 - 25)18.75 = 351.56\).

(b) If firm 1 is a Stackelberg leader, it knows firm 2’s reaction function and takes advantage of it. In part (a), we found that this line of reasoning would lead firm 1 to choose \(y_1^* = 37.5\). Let us now assume that firm 2 also acts like a Stackelberg leader.
It figures out firm 1’s reaction function \(y_1 = r_1(y_2) = 37.5 - y_2/2\), plugs this into its own profit function \(\pi_2 = (100 - r_1(y_2) - y_2)y_2 - C_2(y_2)\), and maximizes, which leads to \(y_2^* = 37.5\). Now the market price would be \(p^* = 100 - 37.5 - 37.5 = 25\) and the profit level for each firm is \(\pi_i^* = (25 - 25)37.5 = 0\). In their quest for leadership, they end up with nothing.

This situation, however, is quite peculiar in that firm 1 believes firm 2 is choosing its output according to its reaction function, and firm 2 believes firm 1 is choosing its output according to its reaction function. But both beliefs are false. (We could call this situation a Stackelberg “disequilibrium.”)

Exercises

1. The Corleone and Chung families are the only providers of good \(h\) in the U.S. The market demand for good \(h\) is \(h = 1{,}200 - 20p\). The costs of production for each of them are represented by the cost functions \(C_1(h_1) = 10h_1\) and \(C_2(h_2) = 20h_2\), respectively. Suppose both families must choose their output levels simultaneously.

(a) Derive their reaction functions.

(b) Calculate the Cournot equilibrium in this market. Indicate output levels, market price, and individual profits.

2. Consider the Corleone and Chung families from Question 1. Suppose the two families sign an agreement to restrict the amount of good \(h\) in the market. By doing this, market price and profits will increase. Suppose the agreement specifies that given the cost differential, Corleone will receive 3/5 of the total profits and Chung will receive 2/5.

(a) Find the solution to this collusion problem. Indicate individual outputs, market price, total profits, and individual profits.

(b) Does either family have an incentive to break the agreement? Who and why?

Hint: Remember that the first order conditions obtained from differentiation give an interior solution. If the first order conditions do not yield a solution, try cases in which one of the firms produces zero.

3.
MBI and Pear are the only two producers of computers. MBI started producing computers earlier than Pear. MBI’s costs of production are given by \(C_1(y_1) = y_1^2\). Pear’s cost function is \(C_2(y_2) = 5y_2\). The national demand for computers is y = 106 − 105p.

(a) Calculate the Stackelberg equilibrium in which MBI is the leader in this market. Indicate output levels, market price, and the profits of each firm.

(b) Suppose that both firms enter this market at the same time. Calculate the Cournot equilibrium and compare it to the situation in part (a).

4. Reuben and Simeon are duopolists producing jeans in a differentiated goods market. The market demand for Reuben’s jeans is \(y_1 = 80 - p_1 + \tfrac{1}{2}p_2\), while the market demand for Simeon’s jeans is \(y_2 = 160 - p_2 + \tfrac{1}{2}p_1\). Reuben’s cost function is \(C_1(y_1) = 80y_1\) while Simeon’s cost function is \(C_2(y_2) = 160y_2\).

(a) Calculate the Bertrand equilibrium in this market. Indicate each firm’s price, output level, and profits.

(b) Find prices and output levels that would maximize joint profits, and calculate the maximum joint profits.

5. Laban and Jacob are sheep farmers in a differentiated goods market. The market demand for Laban’s sheep wool is \(y_1 = 34 - p_1 + \tfrac{1}{3}p_2\), while the market demand for Jacob’s sheep wool is \(y_2 = 40 - p_2 + \tfrac{1}{2}p_1\). Laban’s cost function is \(C_1(y_1) = 24y_1\), while Jacob’s cost function is \(C_2(y_2) = 20y_2\). They compete with each other through their choices of price.

(a) Calculate the equilibrium in which Laban is the price leader in this market, and Jacob is the price follower. Indicate prices, output levels, and individual profits.

(b) How do prices, output levels, and individual profits change if Jacob is the price leader in this market, and Laban is the price follower?

6.
Compare the social welfare properties of the following models of duopoly behavior: simultaneous quantity setting (Cournot), quantity leadership (Stackelberg), simultaneous price setting (Bertrand, both homogeneous goods and differentiated goods cases), price leadership, and collusion. Which model results in the highest output? The lowest output? The highest price? The lowest price?

14 Game Theory

14.1 Introduction

In the last chapter we discussed duopoly markets in which two firms compete to sell a product. In such markets, the firms behave strategically; each firm must think about what the other firm is doing in order to decide what it should do itself. The theory of duopoly was originally developed in the 19th century, but it led to the theory of games in the 20th century. The first major book in game theory, published in 1944, was Theory of Games and Economic Behavior, by John von Neumann (1903-1957) and Oskar Morgenstern (1902-1977). We will return to the contributions of von Neumann and Morgenstern in Chapter 19, on uncertainty and expected utility.

A group of people (or teams, firms, armies, countries) are in a game if their decision problems are interdependent, in the sense that the actions that all of them take influence the outcomes for everyone. Game theory is the study of games; it can also be called interactive decision theory. Many real-life interactions can be viewed as games. Obviously football, soccer, and baseball games are games. But so are the interactions of duopolists, the political campaigns between parties before an election, and the interactions of armed forces and countries. Even some interactions between animal or plant species in nature can be modeled as games. In fact, game theory has been used in many different fields in recent decades, including economics, political science, psychology, sociology, computer science, and biology.
This brief chapter is not meant to replace a formal course in game theory; it is only an introduction. The general emphasis is on how strategic behavior affects the interactions among rational players in a game. We will provide some basic definitions, and we will discuss a number of well-known simple examples. We will start with a description of the prisoners’ dilemma game, where we will introduce the idea of a dominant strategy equilibrium. We will briefly discuss repeated games in the prisoners’ dilemma context, and tit for tat strategies. Then we will describe the battle of the sexes game, and introduce the concept of Nash equilibrium. We will discuss the possibilities of there being multiple Nash equilibria, or no (pure strategy) Nash equilibria, and we discuss the idea of mixed strategy equilibria. We will then present an expanded battle of the sexes game, and we will see that in game theory, an expansion of choices may make players worse off instead of better off. At the end of the chapter, we will describe sequential move games, and we will briefly discuss threats.

14.2 The Prisoners’ Dilemma, and the Idea of Dominant Strategy Equilibrium

The most well-known example in game theory is the prisoners’ dilemma. (It was developed around 1950 by Merrill M. Flood (1908-1991) and Melvin Dresher (1911-1992) of the RAND Corporation. It was so-named by Albert W. Tucker (1905-1995), a Princeton University mathematics professor.) Consider the following. A crime is committed. The police arrive at the scene and arrest two suspects. Each of the suspects is taken to the police station for interrogation, and they are placed in separate cells. The cells are cold and nasty. The police interrogate them separately, and without any lawyers present. A police officer tells each one: “You can keep your mouth shut and refuse to testify. Or, you can confess and testify at trial.”

We use some special and potentially confusing terminology to describe this choice.
If a suspect refuses to testify, we say that he has chosen to cooperate with his fellow suspect. If a suspect confesses and testifies at trial, we say that he has chosen to defect from his fellow suspect. The reader will need to remember that to “cooperate” means to cooperate with the other suspect, not with the police, and also to remember that to “defect” means to defect from the other suspect.

The officer goes on: “If both of you refuse to testify, we will only be able to convict you on a minor charge, which will result in a sentence of 6 months in prison for each of you. If both of you confess and testify, you will each get 5 years in prison. If one of you refuses to testify (i.e., “cooperates”) while the other confesses and testifies (i.e., “defects”), the one who testifies will go free, and the one who refuses to testify will get a full 10 years in prison.” The officer concludes: “That’s what we’re offering you, you lowlife hooligan. Think it over. We’ll be back tomorrow to hear what you have to say.”

We now consider this question: given this information, how should a rational suspect behave? Should the suspects “cooperate” with each other (and tell the police nothing) or should they “defect” from each other (and confess)?

Table 14.1 below shows the prisoners’ dilemma game. In game theory, the people playing the game are called players, so we now refer to our suspects as players. Player 1 chooses the rows in the table, while player 2 chooses the columns. Each of them has two possible actions to choose: “Cooperate” or “Defect.” Each of the four action combinations results in payoffs to each player, in the form of prison time to be served. The outcomes are shown as the vectors in the cells of Table 14.1. The first entry is always the outcome for player 1, and the second is always the outcome for player 2.
For instance, if player 1 defects while player 2 cooperates (bottom row, left column of the table), prison time for player 1 is None, and prison time for player 2 is 10 years. Note that these outcomes are “bads” rather than “goods”; each player wants to minimize his outcome.

                                    Player 2
                        Cooperate             Defect
Player 1   Cooperate    6 months, 6 months    10 years, None
           Defect       None, 10 years        5 years, 5 years

Table 14.1: The prisoners’ dilemma.

Each suspect wants to minimize his own jail time. But each must think about what the other suspect will do. Let us now analyze the problem carefully.

Here’s how player 1 thinks about the game. He considers what player 2 might do. If player 2 cooperates, they are in the first column of the table. In this case, player 1 gets 6 months if he cooperates (first row), and no prison time if he defects (second row). Therefore, if player 2 cooperates, player 1 will defect. On the other hand, if player 2 defects, they are in the second column of the table. In this case, player 1 gets 10 years if he cooperates (first row), and 5 years if he defects (second row). Therefore if player 2 defects, player 1 will defect. We now realize that whatever action player 2 chooses, player 1 will want to defect.

We leave it to the reader to do the same type of analysis for player 2, whose payoffs are the second entries in each of the payoff vectors. When you do this, you will conclude that player 2 will want to defect, whatever action player 1 chooses.

In a game like this, actions that players might take are called strategies. A dominant strategy is a strategy which is optimal for a player, no matter what strategy the other player is choosing. In the prisoners’ dilemma, the best thing for player 1 to do is to defect, no matter what player 2 might do. Therefore “Defect” is a dominant strategy for player 1. Similarly, “Defect” is a dominant strategy for player 2.
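The column-by-column reasoning above can be mechanized. The following is a minimal sketch, assuming the prison terms of Table 14.1 expressed in years (6 months as 0.5, None as 0); the dictionary layout and the function names are hypothetical choices made for this illustration.

```python
# Sketch: search Table 14.1 for a dominant strategy.
# Keys are (player 1's action, player 2's action); values are prison years
# for (player 1, player 2). Fewer years is better.

YEARS = {
    ("Cooperate", "Cooperate"): (0.5, 0.5),
    ("Cooperate", "Defect"): (10, 0),
    ("Defect", "Cooperate"): (0, 10),
    ("Defect", "Defect"): (5, 5),
}
ACTIONS = ("Cooperate", "Defect")

def years_for(player, own_action, other_action):
    """Prison years for `player` (1 or 2), given both players' actions."""
    if player == 1:
        return YEARS[(own_action, other_action)][0]
    return YEARS[(other_action, own_action)][1]

def dominant_strategy(player):
    """Return an action that is optimal whatever the other player does,
    or None if the player has no such action."""
    for candidate in ACTIONS:
        if all(
            years_for(player, candidate, other) <= years_for(player, alternative, other)
            for other in ACTIONS
            for alternative in ACTIONS
        ):
            return candidate
    return None

print(dominant_strategy(1), dominant_strategy(2))
```

For each player the search returns “Defect”: against either action of the other player, defecting never leads to more prison time than cooperating. In a game where no action passes this test, the function returns `None`, so the same check can be reused on other two-by-two tables.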
When a pair of strategies are each dominant for the two players, the pair is called a dominant strategy equilibrium or a solution in dominant strategies. We now know that (Defect, Defect) is a dominant strategy equilibrium in the prisoners’ dilemma. Rational players should choose dominant strategies if they exist; they clearly make sense, since a dominant strategy is the best for a player no matter what the other player is doing. We conclude that the two suspects should both confess to the police, or defect from each other. Therefore they will each end up with a prison sentence of 5 years. Between the two of them, the total will be 10 years of prison. But this outcome is very peculiar, because if they had both chosen to keep their mouths shut, or cooperate with each other, they would have ended up with prison sentences of only 6 months each, and a total of 1 year between the two of them. Back in Chapter 11 on perfectly competitive markets, we introduced the reader to Adam Smith’s free market philosophy—his invisible hand theory. In brief, this is the theory that if the market is allowed to operate freely, with each consumer seeking to maximize his own utility and each firm seeking to maximize its own profits, with each of the players in the grand market game ignoring the welfare of all the others and doing the best it can for itself, the outcome will actually be best for society. That is, self-interested consumers and firms in a competitive market will end up maximizing social surplus, the sum of consumers’ and producers’ surplus. But now note the dramatically different conclusion in the prisoners’ dilemma. In this game, where we are focusing on the outcomes for the two suspects and ignoring the welfare of the police officers, the victims of the original crime, and the rest of society, the obvious and simple measure of social welfare for our two suspects is −1 times the sum of the two prison sentences. (We need the −1 to convert a cost—prison time—into a benefit.) 
But our analysis above indicates that each player, pursuing his own self-interest, maximizing his own welfare by minimizing his years in prison, will choose “Defect.” They will end up with a total of 10 years of prison between the two of them. If they had gotten together and determined what would be best for them, and if they had had some way to enforce their agreement, they would have decided on (Cooperate, Cooperate) instead. That would have resulted in a total of 1 year of prison between the two of them. But (Cooperate, Cooperate) is not an equilibrium in the prisoners’ dilemma, and even if they had agreed to keep silent before they were arrested by the police, they would likely have confessed anyway, because of the ever-present incentive to break such an agreement.

The moral of the story is important. In a game, because of the strategic interactions, pursuing individual self-interest may be inconsistent with maximizing social welfare. This matters in evaluating the performance of market institutions in these contexts. We saw in our analysis of duopoly in Chapter 13 that the Cournot equilibrium would not maximize the joint profits of the two duopolists. There are many other examples where strategic interactions result in individual players’ pursuit of private gains producing a loss to the group of players. Famous examples include international arms races and the overutilization of natural resources like fisheries. In these examples, dominant strategies lead to socially undesirable outcomes. The prisoners’ dilemma clearly illustrates the problem: the tension that may exist between self-interest and cooperation. These are two of the key forces in game theory and in reality.

14.3 Prisoners’ Dilemma Complications: Experimental Evidence and Repeated Games

We have argued above that (Defect, Defect) is a dominant strategy equilibrium in the prisoners’ dilemma game.
But social scientists have performed experiments to see whether people actually choose the “Defect” strategy. (These people are usually university students paid to be experimental subjects in a lab setting.) Often they don’t; often they choose “Cooperate” instead. There are many reasons why this might happen. Subjects may not understand the game, or they simply may not act in the “rational” way that game theorists say they should act. For instance, they might choose “Cooperate” because they believe cooperating is morally preferable to defecting, no matter what the payoffs are. Perhaps game theory is wrong in the sense that it does not correspond to how people actually behave. Another possibility is that the game theory model described above is incomplete. Perhaps we have left something out.

This possibility of incompleteness has led some game theorists to expand the model. One of the most important expansions is the idea of repeated games. A one shot game is a game that is played once. The players choose their strategies, there is an outcome and there are payoffs, and that’s that. A repeated game is played over and over. The players choose their strategies, there is an outcome and there are payoffs. Then they do it again. And again. And perhaps again. A repeated game might repeat n times, where n is known beforehand, or it might repeat n times, where n is not known beforehand, or it might repeat an infinite number of times.

Now suppose our prisoners’ dilemma is a repeated game and the players do not know n, but think that n might be large. Then a player may choose “Cooperate,” knowing it may cost him in the short run (the current game), but believing that if he chooses “Cooperate,” the other player will be more likely to also choose “Cooperate” in future plays of the game. Similarly, if one player chooses “Defect” in the current game, he may fear that the other player will punish him by defecting in the future.
Under certain conditions, namely if future payoffs matter enough, (Cooperate, Cooperate) is an equilibrium in the repeated prisoners’ dilemma. The moral of the story is that we may see cooperation in situations like the prisoners’ dilemma, where simple game theory indicates we should see defection, not because people are good-hearted or virtuous, but because of a dynamic social contract: “Let’s cooperate with each other now and get good payoffs; for if we don’t, in future periods we’ll punish each other and get bad payoffs.”

Players may also develop retaliatory repeated game strategies affecting their choices within a game, contingent on what has happened in prior periods in the game. One of the simplest is called “tit for tat.” The tit for tat repeated game strategy works like this. In the first period of the game, the player chooses “Cooperate.” In any subsequent period, the player looks at his opponent’s action in the previous period of the game. If the opponent chose “Cooperate” in the previous period, then the player chooses “Cooperate” in the current period; if the opponent chose “Defect” in the previous period, then the player chooses “Defect” in the current period. In short, the player matches what his opponent did in the last period of the game. This kind of repeated game strategy might be described as “crazy” or “tough,” but it might also be very effective. Under certain conditions, it can be shown that if player 1 plays “tit for tat,” there may be an equilibrium in which both players are choosing “Cooperate” most of the time. One lesson here is that it may sometimes be in the interest of people to have reputations as being “crazy” or “tough,” in order to induce beneficial changes in the behavior of others.

The moral of this story is that game theory can sometimes improve its predictions in explaining real-world phenomena by expanding its models.
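The tit for tat rule described above is mechanical enough to simulate. The following is only a minimal sketch: the 5-year and 6-month sentences come from the text, but the sentences for the two miscoordinated outcomes (0 and 10 years) are assumed here purely for illustration, and the function names are hypothetical.

```python
# Sketch of tit for tat in a repeated prisoners' dilemma.
# Payoffs are years in prison, so lower is better.
YEARS = {
    ("Cooperate", "Cooperate"): (0.5, 0.5),   # from the text
    ("Defect", "Defect"): (5.0, 5.0),         # from the text
    ("Defect", "Cooperate"): (0.0, 10.0),     # ASSUMED for illustration
    ("Cooperate", "Defect"): (10.0, 0.0),     # ASSUMED for illustration
}


def tit_for_tat(opponent_history):
    """Cooperate in the first period, then copy the opponent's last action."""
    return "Cooperate" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect in every period, whatever the opponent has done."""
    return "Defect"


def play(strategy1, strategy2, periods):
    """Repeat the one shot game; return both action histories and total years."""
    history1, history2 = [], []
    total1 = total2 = 0.0
    for _ in range(periods):
        # Each player sees only the other's past actions, not the current one.
        action1 = strategy1(history2)
        action2 = strategy2(history1)
        years1, years2 = YEARS[(action1, action2)]
        total1 += years1
        total2 += years2
        history1.append(action1)
        history2.append(action2)
    return history1, history2, (total1, total2)


print(play(tit_for_tat, tit_for_tat, 5))
print(play(tit_for_tat, always_defect, 4))
```

When both players use tit for tat, they cooperate in every period. Against a player who always defects, tit for tat cooperates once and then defects for the rest of the game.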
14.4 The Battle of the Sexes, and the Idea of Nash Equilibrium

Most games are not as simple to solve as the prisoners’ dilemma. That is, in most strategic situations, players do not have dominant strategies. In general, what each player will want to do will depend on what the other players are doing. Consequently, each player’s conjectures about the behavior of the other players are crucial for determining his own behavior. For example, remember the first duopoly game of the last chapter, and its solution, the Cournot equilibrium $(y_1^*, y_2^*)$. (Here $y_1^*$ is firm 1’s output, and $y_2^*$ is firm 2’s.) It is obvious that the Cournot equilibrium is not a dominant strategy equilibrium. If firm 2 decided to flood the market with product and drive the price down to zero, for example, firm 1 would not choose $y_1^*$. Rather, firm 1 would produce zero and save its production costs. This shows that producing $y_1^*$ is not a dominant strategy for firm 1. The same argument applies to firm 2.

We will now analyze a new game, the battle of the sexes. This was first studied by R. Duncan Luce (1925-) and Howard Raiffa (1924-), in their 1957 book Games and Decisions: Introduction and Critical Survey. A young woman (player 1) and her boyfriend (player 2) are out on Saturday night, driving in their own cars, on their way to meet each other for an evening together. Since this game was invented long before cellphones were around, they cannot communicate with each other. There are two options that they had talked about previously: a football game and an opera performance. But neither one of them can recall which option they had decided on. They like each other very much, and both would hate to spend the evening without the other. The young woman likes opera much better than football, but her boyfriend likes football better than opera. If the woman ends up at the opera with her boyfriend, her payoff is 3. But her payoff is 0 if she ends up at the opera without him.
If the woman ends up at the football game with her boyfriend, her payoff is 1. But her payoff is 0 if she ends up at the football game without him. Similarly for the young man, if he ends up at the football game with her, his payoff is 3; if he ends up at the opera with her, his payoff is 1; and if he ends up at either place without her, his payoff is 0.

Table 14.2 shows the game. The rows of the table are the woman’s possible strategies, and the columns are the man’s. In other words, the woman chooses the row, and the man chooses the column. The vector in each cell of the table shows the payoffs to the two players. For instance, if both of them choose football, they are in the first row, first column cell of the table. The payoff to the woman is then 1, and the payoff to the man is 3. Note that these payoffs, unlike the payoffs in the prisoners’ dilemma game, are “goods” rather than “bads”; each player wants to maximize rather than minimize her/his outcome.

                          Man
                    Football    Opera
Woman   Football      1, 3       0, 0
        Opera         0, 0       3, 1

Table 14.2: The battle of the sexes.

What predictions can we make about this game? First of all, note that there are no dominant strategies. For either player, “Football” is better if she/he expects the other to choose “Football,” but “Opera” is better if she/he expects the other to choose “Opera.”

The standard equilibrium concept in the battle of the sexes is the Nash equilibrium, named for the famous 20th century economist, mathematician, and game theorist John Nash (1928-). A Nash equilibrium is a pair of strategies, one for each player, such that player 1’s strategy is the best for her given player 2’s strategy, and such that player 2’s strategy is the best for him given player 1’s strategy. Each player’s strategy is a best response to the other’s.
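The best-response definition can be checked cell by cell. The sketch below is an illustration rather than part of the original analysis; it encodes the payoffs of Table 14.2 and tests every pair of pure strategies against the definition. The function and variable names are hypothetical.

```python
def pure_nash_equilibria(payoffs):
    """Return every cell where each player's strategy is a best response.

    payoffs[(row, col)] = (row player's payoff, column player's payoff).
    """
    rows = sorted({row for row, _ in payoffs})
    cols = sorted({col for _, col in payoffs})
    equilibria = []
    for row in rows:
        for col in cols:
            u1, u2 = payoffs[(row, col)]
            # Player 1 cannot gain by switching rows, holding the column fixed.
            row_is_best = all(u1 >= payoffs[(r, col)][0] for r in rows)
            # Player 2 cannot gain by switching columns, holding the row fixed.
            col_is_best = all(u2 >= payoffs[(row, c)][1] for c in cols)
            if row_is_best and col_is_best:
                equilibria.append((row, col))
    return equilibria


# Table 14.2: (woman's strategy, man's strategy) -> (woman's payoff, man's payoff)
BATTLE_OF_THE_SEXES = {
    ("Football", "Football"): (1, 3),
    ("Football", "Opera"): (0, 0),
    ("Opera", "Football"): (0, 0),
    ("Opera", "Opera"): (3, 1),
}

print(pure_nash_equilibria(BATTLE_OF_THE_SEXES))
```

The check returns the two coordinated cells, (Football, Football) and (Opera, Opera).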
The reader should note that a Cournot equilibrium in a duopoly model is a Nash equilibrium, and a Bertrand equilibrium in a duopoly model is also a Nash equilibrium in the corresponding duopoly game. Moreover, any dominant strategy equilibrium is a Nash equilibrium. For example, (Defect, Defect) in the prisoners’ dilemma is also a Nash equilibrium. This is because a dominant strategy for a player is always a best response for that player; therefore it is the best response when his opponent is playing his dominant strategy. But the reverse doesn’t hold, and there will generally be Nash equilibria in a game that are not dominant strategy equilibria. Remember that there are no dominant strategies in our battle of the sexes, and therefore no dominant strategy equilibria.

What about the existence of Nash equilibria in the battle of the sexes game? There are two Nash equilibria in the battle of the sexes: (Football, Football) with payoffs (1, 3), and (Opera, Opera) with payoffs (3, 1). Here is why (Football, Football) is a Nash equilibrium. (The argument for (Opera, Opera) is entirely symmetric.) If player 1 expects player 2 to drive to the football game, that’s what she will choose as well, because a payoff of 1 is greater than a payoff of 0. And if player 2 expects player 1 to drive to the football game, that’s what he will choose as well, because a payoff of 3 is greater than a payoff of 0.

Each Nash equilibrium is a theory of how the game should be played, consistent with assumed rationality of the players and the mutual knowledge of that rationality. It seems plausible to predict that player 1 and her boyfriend will end up at a Nash equilibrium in this game, or at least that they ought to end up at a Nash equilibrium. It is certainly the case that at the planning stages of the game, when the players are talking to each other about going to a football game or going to the opera, they are only considering going to the same event together.
That is, these rational players, in planning this game, would agree that the non-Nash outcomes are undesirable, and that the Nash equilibria, even though one is inferior to the other in each player’s eyes, are reasonable in the sense that neither player would want to break an agreement to be at such an outcome.

14.5 Battle of the Sexes Complications: Multiple or No Nash Equilibria, and Mixed Strategies

From the battle of the sexes, we see that there may be multiple Nash equilibria. So the Nash equilibrium concept may have some predictive power, since (Football, Football) and (Opera, Opera) seem more likely than (Football, Opera) and (Opera, Football), but it may not point to a unique outcome. Moreover, in this game, the players may end up at a non-Nash outcome by accident, if not by intent. That is, even if our young woman and her boyfriend know exactly what their preferences are, and are completely informed about Table 14.2 and the Nash equilibria in that table, they just don’t remember which event they had planned to attend, and they have no cellphones with which to communicate. Therefore they may end up apart, even though their feelings toward each other, and the power of Nash reasoning, say they should be together.

And things may get even trickier. There may be no equilibria of the kind we have been describing. Consider the following strangely modified battle of the sexes: Let the two players have the same payoffs as before when they are coordinated. That is, when they choose (Football, Football) and (Opera, Opera), the payoffs are (1, 3) and (3, 1), respectively. But when they are miscoordinated, and choose (Football, Opera) or (Opera, Football), they won’t get payoffs of (0, 0). Rather, they will get the following: at (Football, Opera) the payoffs are (4, −4), and at (Opera, Football) the payoffs are (2, −2). Here’s a possible explanation for these payoffs.
At the miscoordinated pairs of strategies, the totals of the payoffs to the young woman and her boyfriend are zero, as they were previously. The boyfriend’s payoffs are simple to explain. He’s happiest (payoff 3) when they are together at the football game, less happy (payoff 1) when they are together at the opera, even less happy (payoff -2) when he’s alone at the football game, and miserable (payoff -4) when he’s alone at the opera. It is more difficult to explain the young woman’s preferences, perhaps because women are more complex. When she and her boyfriend are together, she is happier at the opera (payoff 3) than at the football game (payoff 1). However, if they are miscoordinated and she is at the football game by herself, she is happiest (payoff 4). This surprising payoff is because she feels that although she loves opera and her boyfriend, it would be really good for her to be forced to learn something about football, and for him to be forced to learn something about opera. If she is at the opera by herself, her payoff is 2, not as good as being at the opera with him (payoff 3), but better than being at the football game with him (payoff 1).

Payoffs in the strangely modified battle of the sexes are shown in Table 14.3 below.

                          Man
                    Football    Opera
Woman   Football      1, 3       4, -4
        Opera         2, -2      3, 1

Table 14.3: The strangely modified battle of the sexes.

When we examine the payoffs in Table 14.3, we see the following. From the upper left cell, player 1 would want to move down to the lower left cell. From the lower left cell, player 2 would want to move right to the lower right cell. From the lower right cell, player 1 would want to move up to the upper right cell. From the upper right cell, player 2 would want to move left to the upper left cell. In short, at every pair of strategies, one of the players would be unhappy and would want to change her or his strategy.
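The chain of deviations just described can be traced step by step. The following sketch is illustrative only, with hypothetical function names; it starts at the upper left cell of Table 14.3 and repeatedly moves to the cell that the dissatisfied player would switch to.

```python
# Table 14.3: (woman's strategy, man's strategy) -> (woman's payoff, man's payoff)
MODIFIED = {
    ("Football", "Football"): (1, 3),
    ("Football", "Opera"): (4, -4),
    ("Opera", "Football"): (2, -2),
    ("Opera", "Opera"): (3, 1),
}
OTHER = {"Football": "Opera", "Opera": "Football"}


def profitable_deviation(cell):
    """Return the cell reached when some player gains by switching, else None."""
    woman, man = cell
    u_woman, u_man = MODIFIED[cell]
    # Player 1 (the woman) changes the row, holding the column fixed.
    if_woman_switches = (OTHER[woman], man)
    if MODIFIED[if_woman_switches][0] > u_woman:
        return if_woman_switches
    # Player 2 (the man) changes the column, holding the row fixed.
    if_man_switches = (woman, OTHER[man])
    if MODIFIED[if_man_switches][1] > u_man:
        return if_man_switches
    return None


def trace(start, steps):
    """Follow profitable deviations from `start` for at most `steps` moves."""
    path = [start]
    for _ in range(steps):
        next_cell = profitable_deviation(path[-1])
        if next_cell is None:
            break
        path.append(next_cell)
    return path


for cell in trace(("Football", "Football"), 4):
    print(cell)
```

After four moves the path is back at (Football, Football), so the deviations form a cycle and no cell is a resting point.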
Therefore, at least based on our definition of Nash equilibrium to this point, there is no Nash equilibrium in this game.

In fact, our definition of Nash equilibrium up to now has assumed that a player can only choose a single strategy with certainty. Player 1, for instance, can choose either “Football” or “Opera.” If she chooses “Football,” she goes to the football game for sure. Going to the football game for sure is called a pure strategy. The games we have been discussing to this point allow only pure strategies. Player 1 can go to the opera, or she can go to the football game. That’s it.

But there is another way to play games like this. Players might make random choices over pure strategies. For instance, player 1 might decide: “I’m going to flip a coin, and go to the football game if it’s heads, and to the opera if it’s tails.” This means she decides: “I’ll choose ‘Football’ with probability 1/2, and I’ll choose ‘Opera’ with probability 1/2.” This is an example of what is called a mixed strategy. More formally, if there are two pure strategies, say $S_1$ and $S_2$, a mixed strategy is a pair of probabilities, say $p_1$ and $p_2$, chosen by the player and summing to 1, with the player choosing $S_1$ with probability $p_1$ and choosing $S_2$ with probability $p_2$. (Note that any pure strategy is also a mixed strategy, but not vice versa. For example, the pure strategy $S_1$ is the same as the mixed strategy over $S_1$ and $S_2$ with $p_1 = 1$ and $p_2 = 0$.)

A pure strategy Nash equilibrium is a Nash equilibrium in which players use pure strategies. A mixed strategy Nash equilibrium is a Nash equilibrium in which players use mixed strategies. What we have shown with the strangely modified battle of the sexes is that there may be no pure strategy Nash equilibrium in a game. In a famous paper written in 1951, John Nash proved that under general conditions, any game with a finite number of pure strategies must have at least one mixed strategy equilibrium.
It follows that our strangely modified battle game must have a mixed strategy Nash equilibrium, even though it doesn’t have a pure strategy equilibrium. In this chapter we will not discuss how one might find the mixed strategy equilibrium which we know, thanks to Nash, must exist. In the rest of this chapter we will return to our focus on pure strategies and pure strategy equilibria.

14.6 The Expanded Battle of the Sexes, When More Choices Make Players Worse Off

In the decision problem for an individual consumer or firm, the expansion of the set of feasible actions has a clear effect: the decision maker cannot end up worse off than before, and will likely end up better off. Consider, for example, the basic consumer choice model. When the budget set expands, whether because of an increase in income with prices fixed, or because of a fall in prices with income fixed, the consumer will generally be better off, and will definitely not be worse off. In this section, we shall see that this basic property (expansion of the choice set is a good thing for the decision maker) may not hold in a strategic situation.

We now turn to an expanded battle of the sexes game. Here is the story. After the original battle of the sexes described above (not the strangely modified version), some weeks pass. Our couple gets into a fight. They are mad at each other, but they are still together. Another Saturday rolls around, and it’s time for another date. The old options of football and opera are still there, and our young woman and her boyfriend have exactly the same feelings they used to have about those options. But there is a new option available to them: each player can stay at home, and deliberately stand up her/his date. (We are assuming the two live separately, so if one stays at home, the other doesn’t immediately observe it.) If the woman stays at home and the boyfriend goes out, we will assume she gets a payoff of 2.
(This is the satisfaction of hurting her boyfriend.) And we will assume the boyfriend gets a payoff of -1. (This is the pain from discovering he was deliberately stood up.) Similarly, if the boyfriend stays at home and she goes out, we assume he gets a payoff of 2 and she gets a payoff of -1. If they both stay at home, we assume a payoff of 0 to each.

Table 14.4 below shows the table of payoff vectors. Note that the payoffs are exactly the same as they used to be for the four pairs of strategies in the original battle of the sexes, as shown in Table 14.2. What is new is the third row in Table 14.4, based on player 1 staying at home, and the third column, based on player 2 staying at home. Everything that player 1 and player 2 used to be able to do, they can still do. But now they have more options. The table showing the possible payoff vectors is now 3 by 3 instead of 2 by 2; it has 9 cells instead of 4. Each of the new cells looks worse for both players than at least one of the old cells.

                                Man
                     Football    Opera    Stay Home
Woman   Football       1, 3       0, 0      -1, 2
        Opera          0, 0       3, 1      -1, 2
        Stay Home      2, -1      2, -1      0, 0

Table 14.4: The expanded battle of the sexes.

We have expanded the options available to the two players. But whatever was available to them in the past is still available. What are the effects of this expansion of choices?

First, it is easy to see that the old Nash equilibria, of the original battle of the sexes game, are no longer Nash equilibria in this new game. Take, for instance, the pair of strategies (Football, Football). Now if the woman expects her boyfriend to drive to the football game, her best response is no longer to drive to the football game and meet him there, which would have given her a payoff of 1. Rather, she will stay at home, which will give her a payoff of 2.
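Best responses in Table 14.4 can be tabulated directly, which makes it easy to see which pairs of strategies survive as Nash equilibria. The sketch below is an illustration with hypothetical names, not part of the original analysis.

```python
ACTIONS = ["Football", "Opera", "Stay Home"]

# Table 14.4: (woman's strategy, man's strategy) -> (woman's payoff, man's payoff)
EXPANDED = {
    ("Football", "Football"): (1, 3),
    ("Football", "Opera"): (0, 0),
    ("Football", "Stay Home"): (-1, 2),
    ("Opera", "Football"): (0, 0),
    ("Opera", "Opera"): (3, 1),
    ("Opera", "Stay Home"): (-1, 2),
    ("Stay Home", "Football"): (2, -1),
    ("Stay Home", "Opera"): (2, -1),
    ("Stay Home", "Stay Home"): (0, 0),
}


def woman_best_responses(man_action):
    """Rows that maximize the woman's payoff within the man's column."""
    best = max(EXPANDED[(row, man_action)][0] for row in ACTIONS)
    return [row for row in ACTIONS if EXPANDED[(row, man_action)][0] == best]


def man_best_responses(woman_action):
    """Columns that maximize the man's payoff within the woman's row."""
    best = max(EXPANDED[(woman_action, col)][1] for col in ACTIONS)
    return [col for col in ACTIONS if EXPANDED[(woman_action, col)][1] == best]


# A cell is a Nash equilibrium when each strategy is a best response to the other.
nash = [
    (row, col)
    for row in ACTIONS
    for col in ACTIONS
    if row in woman_best_responses(col) and col in man_best_responses(row)
]

for action in ACTIONS:
    print(action, woman_best_responses(action), man_best_responses(action))
print(nash)
```

Under these payoffs the woman’s best response to “Football” is “Stay Home,” and the man’s best response to “Opera” is “Stay Home,” so neither of the old equilibria passes the mutual best-response test.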
Similarly, the pair of strategies (Opera, Opera) is no longer a Nash equilibrium, since now the man (whose payoff is 1 at (Opera, Opera)) prefers to stay home, which will give him a payoff of 2.

In fact, the only Nash equilibrium in the expanded battle of the sexes is (Stay Home, Stay Home), which has payoffs of 0 for both players. Let’s check this. If the woman expects her boyfriend to stay home, she looks at the third column of Table 14.4. She gets a payoff of -1 if she goes to the football game, a payoff of -1 if she goes to the opera, and a payoff of 0 if she stays home also. Her best response is therefore to stay home. The argument is symmetric for the man. If he thinks she is staying home, his best response is to stay home also. Therefore (Stay Home, Stay Home) is a Nash equilibrium. It is easy to see that any of the pairs of strategies where one person goes out and the other person stays at home cannot be a Nash equilibrium. We’ll leave it to the reader to check this.

The addition of a new strategy has had a major effect in the battle of the sexes. It has demoted the original pair of Nash equilibria; they are no longer Nash equilibria. It has created a new Nash equilibrium, which is now the only equilibrium in the game. Moreover, at the new Nash equilibrium, the payoff vector (0, 0) is worse for both players than the original Nash equilibrium payoff vectors of (1, 3) and (3, 1). The expansion of choices has had the effect of making both players worse off at the Nash equilibria. What produced this strange result? The addition of the new choice led to a very different strategic situation, which undermined the original Nash equilibria and paradoxically elevated a new, and worse, equilibrium.

14.7 Sequential Move Games

All the games we have presented so far are simultaneous move games. This means that (at least in theory) the two players choose their strategies at the same time, each one not knowing what the other is choosing.
Then there is an outcome, and payoffs are made. (The repeated games we mentioned in Section 14.3 above were sequences of simultaneous move games, with payoffs made at the end of each game in the sequence.) However, there are other games in which time plays a crucial role, where one player moves first and is observed by the other player who moves second, after which payoffs are made. And there are games where the players make a sequence of moves, alternating turns, with each player observing the other player’s move at each step of the process, and with payoffs made at the end. These games are called sequential move games or sequential games. We will discuss such games in this section.

In sequential move games, the conventional wisdom is that there is a first-mover advantage. It is better to move first, because a first move sets the tone for the rest of the game, and the first mover can create the kind of play that she or he wishes. In the game of tic-tac-toe, for instance, the first mover seems to have an advantage because he has 9 squares available at his first move, whereas the second mover only has 8 squares. (Studies indicate there is a first-mover advantage in tic-tac-toe for players of average skill, who make errors, but not for expert players. A tic-tac-toe game between experts should result in a tie.) In chess, there is serious debate about whether or not white has a first-move advantage over black. There are studies that indicate white wins a slightly higher proportion of tournament games than black. Some chess experts claim that perfectly played games should result in a draw; others claim that perfectly played games should result in a win for white.

In the following examples, we will show that in theory, sequential games do not necessarily provide an advantage to the first mover. Whether there exists a first-mover or a second-mover advantage will depend on the specifics of the game.
We now consider a sequential version of what is called the matching pennies game. Generally, in a matching pennies game, two players each place a penny on the table. If the pennies “match,” meaning they are both heads or both tails, a dollar is paid by one of the players to the other player. If the pennies “do not match” (one is a head and the other a tail), the dollar transfer goes in the opposite direction. This can be a simultaneous move game (in which case it is like the ancient and familiar odds-and-evens game) or a sequential move game.

We will now consider the sequential move game. Assume that player 1 moves first, and must put his penny on the table, either face up (“Heads”) or face down (“Tails”). Player 2 observes this. Then she moves, and puts down her penny, either face up (“Heads”) or face down (“Tails”). The rules of the game require that player 1 pay $1 to player 2 if the pennies match, and that player 2 pay $1 to player 1 if they do not match.

Figure 14.1 shows the game in the form of a game tree. A game tree is a diagram with connected nodes and branches. Time flows from left to right in the diagram. At the farthest left is a node, at which the first player to move (player 1 in this case) chooses a strategy. Each strategy is represented by a branch to the right. At the end of each of those branches are new nodes, at which the second player to move chooses her actions. The ultimate payoff vectors appear at the very end of the sequence of nodes and branches. In Figure 14.1, for example, the uppermost sequence of nodes and branches can be read as follows. Player 1 starts the game and chooses heads. Then player 2 moves and chooses heads. Then the game ends, with payoffs to players 1 and 2 of -1 and +1, respectively.

[Figure 14.1: The sequential version of matching pennies.]

To solve a sequential game like this, we apply a procedure called backward induction.
This procedure assumes that at each decision node, each player will behave optimally, given his or her theory about how the players will behave at nodes farther in the future. To solve the game with backward induction, we go to the last decision nodes in the game tree, the ones farthest in the future (and farthest to the right in the game tree). We determine the optimal action (or actions) for the player making the decision at that point in time. Having done so, we go backwards in time (and to the left in the game tree) and determine the optimal action (or actions) at the previous set of decision nodes. We repeat this until we have gone all the way back in time (and all the way to the left in the game tree), and determined the optimal action at the first node of the game, for the first mover.

Let’s do this in Figure 14.1. We go to the last decision nodes, the ones for player 2. At the upper node (which follows player 1’s choice of “Heads”), if player 2 chooses “Heads,” her payoff is +1. If she chooses “Tails,” her payoff is -1. Therefore she chooses “Heads.” At the lower decision node (which follows player 1’s choice of “Tails”), if player 2 chooses “Tails,” her payoff is +1. If she chooses “Heads,” her payoff is -1. Therefore she chooses “Tails.” We see at this stage that player 2 is always going to win the dollar. We now move to the left and decide what player 1 should do at the first decision node. The answer is that it doesn’t matter; he can choose “Heads” or “Tails.” The outcome is the same to him in either case. Either one of these leads to the payoff vector (-1, +1).

In short, in this game, player 2, the second mover, will win the dollar. This game has a clear second-mover advantage. This shows that whether there is a first-mover or a second-mover advantage in a game depends on the specifics of the game.
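The backward induction procedure is naturally recursive: solve each subtree first, then let the player at the current node pick the branch that is best for him or her. The sketch below applies this to the sequential matching pennies game. The nested-dictionary representation of the tree is an assumption made for illustration (it is not the book’s notation), and ties are broken in favor of the first branch listed.

```python
def solve(node):
    """Backward induction: return (payoff vector, list of actions played)."""
    if "payoffs" in node:            # terminal node: nothing left to choose
        return node["payoffs"], []
    mover = node["player"]           # 0 for player 1, 1 for player 2
    best = None
    for action, child in node["moves"].items():
        payoffs, path = solve(child)           # solve the later nodes first
        # Keep the branch giving the mover a strictly higher payoff;
        # on a tie, the earlier-listed branch is kept.
        if best is None or payoffs[mover] > best[0][mover]:
            best = (payoffs, [action] + path)
    return best


# Sequential matching pennies: player 1 pays $1 to player 2 if the pennies
# match, and player 2 pays $1 to player 1 if they do not.
MATCHING_PENNIES = {
    "player": 0,
    "moves": {
        "Heads": {"player": 1, "moves": {
            "Heads": {"payoffs": (-1, 1)},
            "Tails": {"payoffs": (1, -1)},
        }},
        "Tails": {"player": 1, "moves": {
            "Heads": {"payoffs": (1, -1)},
            "Tails": {"payoffs": (-1, 1)},
        }},
    },
}

print(solve(MATCHING_PENNIES))
```

Because player 1 is indifferent between his two actions, the path reported depends on the tie-breaking rule, but the payoff vector (-1, +1) does not.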
We will complete the discussion of the sequential matching pennies game with an observation about the distinction between strategies and actions. In game theory, a strategy is a complete contingent plan of the actions which a player will play in a game. If it is a simultaneous move game, where the actions all take place at one point in time, a strategy coincides with an action. In a sequential move game, a strategy does not necessarily coincide with an action because a player who moves later in the game can make his actions contingent on the history of actions before his. To be clear, in the sequential matching pennies game, player 1 has only two strategies, which coincide with his actions: “Heads” and “Tails.” But player 2 has four strategies: “Always Heads: After Heads and After Tails, Play Heads,” “Always Tails: After Heads and After Tails, Play Tails,” “Matching: After Heads, Play Heads, and After Tails, Play Tails,” and “Not Matching: After Heads, Play Tails, and After Tails, Play Heads.” Therefore, there are two backward induction strategy solutions to this game: player 1 chooses “Heads” and then player 2 chooses “Matching,” and player 1 chooses “Tails” and then player 2 chooses “Matching.” This more careful analysis still leads to the conclusion that the second player in the game will match the action of the first player, and will win the dollar.

We now consider a slightly different game, which we will call a duopoly sequential competition game. The two players are now two firms in a duopoly market. Firm 1 moves first and can produce a “High” output or a “Low” output. After firm 2 observes firm 1’s choice of output, it responds by also choosing either “High” or “Low.” Assume that the payoffs to the firms, that is, profits, are $(\pi_1, \pi_2) = (-1, -1)$ if both firms choose “High,” because the market is inundated with the product and the price falls below average cost. If both firms produce “Low,” profits are $(\pi_1, \pi_2) = (2, 2)$.
Finally, if one firm produces “High” and the other produces “Low,” assume that the firm with the higher output ends up with profit of 3, while the firm with the low output has profit of 1. Figure 14.2 represents this game in a game tree.

[Figure 14.2: A duopoly sequential competition game.]

Applying the backward induction procedure to this game, we go first to firm 2’s decision nodes. If firm 1 has produced “High,” firm 2 will produce “Low” because 1 is greater than -1. And if firm 1 has produced “Low,” firm 2 will respond with “High” because 3 is greater than 2. Now we go back to firm 1’s decision node. Firm 1 knows that firm 2 will do the opposite of what it has done. If firm 1 chooses “High,” it will end up with a payoff of 3. If firm 1 chooses “Low,” it will end up with a payoff of 1. Therefore firm 1 will choose “High.” Firm 2 will respond with “Low,” and the ultimate profits will be $(\pi_1, \pi_2) = (3, 1)$.

As you can easily see, there is a first-mover advantage in this game. The game is completely symmetric in payoffs, and so, if the roles of firm 1 and 2 were reversed (with firm 2 moving first and firm 1 moving second), we would end up with a similar outcome, with the first mover choosing “High,” and the second mover responding with “Low.” With the roles reversed, the payoff vector would be $(\pi_1, \pi_2) = (1, 3)$. This game should remind the reader of the Stackelberg solution to the duopoly model.

14.8 Threats

We conclude this chapter by briefly discussing threats. A threat is an announcement made by a player at the beginning of a sequential move game, indicating that at some node, at some point in time, he will depart from what is rational in order to punish the other player. The sequential move game framework can help us to evaluate the credibility of threats.
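One way to make credibility concrete is to ask, node by node, whether an announced plan for the second mover agrees with what that player would actually want to do on reaching the node. The sketch below does this for the duopoly sequential competition game of Section 14.7. The plan in which firm 2 always produces “High” is a hypothetical announcement used for illustration, and the function names are likewise hypothetical.

```python
ACTIONS = ["High", "Low"]

# (firm 1's output, firm 2's output) -> (firm 1's profit, firm 2's profit)
PROFITS = {
    ("High", "High"): (-1, -1),
    ("High", "Low"): (3, 1),
    ("Low", "High"): (1, 3),
    ("Low", "Low"): (2, 2),
}


def firm2_best_reply(firm1_action):
    """Firm 2's profit-maximizing output after observing firm 1's output."""
    return max(ACTIONS, key=lambda a2: PROFITS[(firm1_action, a2)][1])


def non_credible_nodes(plan):
    """Nodes (firm 1 actions) where the announced reply is not a best reply."""
    return [a1 for a1 in ACTIONS if plan[a1] != firm2_best_reply(a1)]


def firm1_choice(plan):
    """Firm 1's best output if it takes firm 2's plan at face value."""
    return max(ACTIONS, key=lambda a1: PROFITS[(a1, plan[a1])][0])


# The plan implied by backward induction: do the opposite of firm 1.
rational_plan = {a1: firm2_best_reply(a1) for a1 in ACTIONS}
# HYPOTHETICAL announced plan: produce "High" no matter what firm 1 does.
always_high = {"High": "High", "Low": "High"}

print(rational_plan, firm1_choice(rational_plan))
print(non_credible_nodes(always_high), firm1_choice(always_high))
```

The “always High” plan departs from firm 2’s best reply only at the node following firm 1’s choice of “High,” where carrying it out would cost firm 2 a profit of 1 and leave it with -1 instead. If firm 1 nevertheless took that plan at face value, it would switch to “Low.”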
For instance, in the duopoly sequential competition game of the section above, firm 2 could try to change the outcome of (High, Low) by threatening firm 1 as follows: “No matter what you do, my plan is to produce ‘High.’ Therefore if you decide to produce ‘High,’ we will actually end up with a payoff vector of (−1, −1). I won’t do what you think I ought to do. I will take us both down if you produce ‘High.’”

Obviously, if firm 1 believes the threat, it should produce “Low,” for which the payoff vector is (1, 3). A payoff of +1 is much better than a payoff of −1. But in a sequential move game like this, especially if it is played just one time, firm 1 probably should not believe firm 2’s threat. The reason is this: if firm 2 made the threat before the game started, and if firm 1 ignored the threat at the first move, firm 2 would make itself better off when it moves by not carrying through on its threat. If it drops the threat, it ends up with +1. If it carries through on its threat, it ends up with −1. So threats like this seem less credible from the vantage point of the backward induction procedure.

Of course life may be more complicated if games are played over and over, or if people (or firms) play games with different partners, and develop reputations that spread out to other players. If a game is played over and over between two players, an aggressive player may carry out threats in the initial games, so that his playing partner comes to believe that he will carry out his threats, no matter how self-destructive they may be. In this case, his partner becomes trained to give in to his threats. Or, if he plays with many different players who talk to each other, an aggressive player may want one player to see that he carries out his threats, so that word gets around.

Here is a final observation about some very large threats.
For most of the second half of the 20th century, there was a cold war with the United States on one side and the Soviet Union on the other. In this cold war, the two superpowers accumulated large stockpiles of nuclear weapons. Those stockpiles of weapons still exist. The superpowers threatened each other with those weapons. One reason the cold war never became a hot war was the two-way threat of mutual assured destruction, abbreviated MAD, also called nuclear deterrence or massive retaliation. The idea of the mutual assured destruction game was this. If one of the superpowers attacked the other, even in an indirect, non-devastating way, the superpower that had been attacked would retaliate with a massive nuclear strike. For instance, if the Soviet Union invaded (Western) Europe, the United States would launch nuclear weapons against the Soviet Union. This retaliation would lead to a world-wide nuclear war, effectively destroying both superpowers. The MAD game would have been played just one time. Our comments above suggest that the Soviet Union’s threats against the United States, and the United States’ threats against the Soviet Union, may have all been hollow threats. Or maybe they weren’t. Or maybe the threats were so huge that even if they were unbelievable, neither side could dare to test them.

14.9 A Solved Problem

The Problem

Consider the following coordination game. There are two players and two strategies available to each player: A and B. The payoffs in the first row (corresponding to player 1 choosing A) are (a, a) and (0, 0). The payoffs in the second row (corresponding to player 1 choosing B) are (0, 0) and (1, 1).

(a) Draw the 2 × 2 payoff matrix.
(b) For what values of a is (A, A) a dominant strategy equilibrium?
(c) For what values of a is (B, B) a dominant strategy equilibrium?
(d) Can you find the Nash equilibria of the game as a function of the parameter a?

The Solution

(a) The payoff matrix is shown in Table 14.5 below.
                    Player 2
                    A         B
Player 1    A       a, a      0, 0
            B       0, 0      1, 1

Table 14.5: When is (A, A) a dominant strategy equilibrium? When is (B, B) a dominant strategy equilibrium?

(b) (A, A) can never be a dominant strategy equilibrium, no matter what a is. For (A, A) to be a dominant strategy equilibrium, A would have to be a dominant strategy for both players. But if player 2 is playing B (right column), player 1 is better off with B (payoff 1) than with A (payoff 0). So no matter what a is, playing A cannot be a dominant strategy for player 1. (Similar comments apply to player 2.) Therefore (A, A) cannot be a dominant strategy equilibrium.

(c) If a ≤ 0, then B is a dominant strategy for player 1. If player 2 chooses A (left column), player 1 is at least as well off at B (payoff 0) as he is at A (payoff a); and if player 2 chooses B (right column), player 1 is better off at B (payoff 1) than at A (payoff 0). Similarly, B is a dominant strategy for player 2. If player 1 chooses A (top row), player 2 is at least as well off at B (payoff 0) as at A (payoff a); and if player 1 chooses B (bottom row), player 2 is better off at B (payoff 1) than at A (payoff 0). Since B is a dominant strategy for player 1, and B is a dominant strategy for player 2, (B, B) is a dominant strategy equilibrium. Since it is a dominant strategy equilibrium, it is also a Nash equilibrium.

(d) If a ≥ 0, then (A, A) is a Nash equilibrium. At (A, A), each player compares the payoff a from staying at A to the payoff 0 from deviating to B, and since a ≥ 0, neither can gain by deviating. But (B, B) is also a Nash equilibrium when a ≥ 0. This is because no matter how big a might be, at (B, B), the payoff for both players is 1, and a deviation by either player 1 or player 2 (not both simultaneously) would reduce that player’s payoff to 0. However, if a < 0, then (B, B) is the only Nash equilibrium.

Exercises

1. Consider the game of chicken with two players. If both players play “Macho,” each of them gets a payoff of 0.
If both players play “Chicken,” each of them gets a payoff of 6. If one player plays “Macho” and the other plays “Chicken,” the one who plays “Macho” gets a payoff of 7 and the one who plays “Chicken” gets a payoff of 2.
(a) Draw the payoff matrix.
(b) Does either player have a dominant strategy in this game?
(c) Find the Nash equilibrium or equilibria.

2. Jack and Jill want a treehouse to play in. They have to decide simultaneously whether to build or not to build. Each individual who builds bears a cost of 3. They both have access to the treehouse once it is built. If only one of them builds the treehouse, they each derive a utility of 2. If both of them build the treehouse, they each derive a utility of 4 (presumably the treehouse is more elaborate because two heads are better than one). If the treehouse is not built, they each derive a utility of 0.
(a) Draw the payoff matrix.
(b) What is Jack’s strategy? What is Jill’s strategy? What is the Nash equilibrium or equilibria?
(c) Does this game resemble the prisoner’s dilemma, the battle of the sexes, or chicken? Explain.

3. Sam and Dan are twins who like playing tricks on each other. Sam is deciding whether to take Dan’s blanket. Sam has a utility of 0 if he doesn’t take Dan’s blanket. If Sam takes Dan’s blanket, there is a possibility of Dan retaliating by taking Sam’s pillow, which leaves Sam with a utility of −5. If Dan doesn’t retaliate, Sam gets a utility of 5. Dan has a utility of 10 if Sam doesn’t take his blanket. If Sam takes his blanket, Dan’s utility is −10. Dan’s utility changes by X if he retaliates.
(a) Draw the game tree.
(b) For what values of X would we observe Sam taking Dan’s blanket in the backward induction equilibrium?

4. Consider the following sequential strategic situation, called the centipede game. The game has 100 stages. There are two players who take turns making decisions, starting with player 1. At stage $t = 1, \ldots, 99$, player 1 (if the stage is odd) or player 2 (if it is even) chooses whether to “Terminate the game” or to “Continue the game.” If the game is terminated at stage $t = 1, \ldots, 99$, the player terminating the game receives a payoff of $t$, while the other player receives a payoff of zero. Finally, at stage $t = 100$, player 2 chooses between action A with a payoff of 99 for each player, or action B with a payoff of zero for player 1 and a payoff of 100 for player 2.
(a) Draw the game tree for this strategic situation (the name of the game will become apparent then).
(b) What is the backward induction solution to this game?

5. Two players take turns choosing a number between 1 and 10, inclusive. The number is added to a running total. The player who takes the total to 100 (or greater) wins.
(a) What is the backward induction solution to this game? Map out the complete strategy.
(b) Is there a first-mover or a second-mover advantage in this game?

6. Consider a Bertrand duopoly with a homogeneous good, as in the first part of Section 6 of Chapter 13. Assume the market demand curve is $y = y_1 + y_2 = 1 - p$, where $p$ is the relevant market price, $y$ is the total amount demanded at that price, and $y_1$ and $y_2$ are the output levels for firms 1 and 2. Assume the firms’ cost functions are $C(y_i) = \frac{1}{2} y_i$ for $i = 1, 2$. The rules of the pricing game are as follows. The firms must each simultaneously name a price in the interval [0, 1]. If the prices are different, the firm with the lower price sells all the units demanded at that price, while the other firm sells nothing. If they name the same price, the amount demanded at that price is split equally between the two firms. Show that there is a unique Nash equilibrium, and find the equilibrium price and quantities.
Hint: Note that calculus cannot be used to solve this problem, because the firms’ profit functions are not continuous in the price variables. For instance, $\pi_1(p_1, p_2)$ is not continuous at $p_1 = p_2 = p$.
Part IV General Equilibrium

15 An Exchange Economy

15.1 Introduction

What economists call a pure exchange economy, or more simply an exchange economy, is a model of an economy with no production. Goods have already been produced, found, inherited, or endowed, and the only issue is how they should be distributed and consumed. Even though this model abstracts from production decisions, it illustrates important questions about the efficiency or inefficiency of allocations of goods among consumers, and provides important answers to those questions.

In this chapter, we will start with a very simple model of an exchange economy, and we will discuss the Pareto optimality or Pareto efficiency of allocations of goods among consumers. Then we will turn to the role of markets, and discuss market or competitive equilibrium allocations. Finally we will discuss the extremely important connections between markets and efficiency in an exchange economy. These connections between markets and efficiency are among the most important results in economic theory, and are appropriately called the fundamental theorems of welfare economics.

15.2 An Economy with Two Consumers and Two Goods

We will study the simplest possible exchange economy model, with only two consumers and two goods. A model of exchange cannot get much simpler, since if there were only one good or one person, there wouldn’t be any reason for trade. However, even though our model is extremely simple, it captures all the important issues, and it easily generalizes.

So let’s suppose there are two consumers. In recognition of Daniel Defoe’s very early novel Robinson Crusoe (published in 1719), we’ll call them Robinson and Friday. We’ll abbreviate Robinson as R; for Friday we’ll use F. We’ll assume there are only two consumption goods on their island, bread (good x) and rum (good y).
In this model of exchange, we are abstracting from the fact that Robinson and Friday produce rum (or somehow have acquired a stock of it), and produce bread (by making flour, mixing, and baking)! Therefore we will assume that there are fixed totals of rum and bread that are available, and that the only issue is how to distribute those totals between the two consumers.

Here’s how the distribution of the two goods works. The two consumers start with initial endowments of the goods. And then they make trades. We let $X$ represent the total quantity of good x, bread, that is available. We let $Y$ represent the total quantity of good y, rum, that is available. In general, if we are talking about an arbitrary bundle of goods for Robinson, we show it as $(x_R, y_R)$, where $x_R$ is his quantity of bread and $y_R$ is his quantity of rum. An arbitrary bundle of goods for Friday is $(x_F, y_F)$.

Robinson has initial endowments of the two goods, as does Friday. We will use the “naught” superscript (that is, 0) to indicate an initial quantity. Robinson’s initial bundle of goods is $(x_R^0, y_R^0)$. Friday’s initial bundle of goods is $(x_F^0, y_F^0)$. The quantities of the two goods in the initial bundles must be consistent with the assumed totals of bread and rum. That is, $X = x_R^0 + x_F^0$ and $Y = y_R^0 + y_F^0$. Moreover, if they start with their initial quantities and then trade, any bundles they end up with must also be consistent with the given totals. That is, if they end up at $((x_R, y_R), (x_F, y_F))$, it must be the case that $X = x_R + x_F$ and $Y = y_R + y_F$.

Robinson’s preferences for bread and rum are represented by the utility function $u_R(x_R, y_R)$, and, similarly, Friday’s preferences are represented by $u_F(x_F, y_F)$. That is, we assume that each consumer’s utility depends only on his own consumption bundle. Note that the utility functions $u_R$ and $u_F$ will generally be different, and unrelated to the initial bundles that Robinson and Friday happen to have.
The facts that preferences are generally different, and initial bundles are also generally different, make mutually beneficial trade probable.

To show our simple exchange economy with a graph, we use a diagram first suggested (in 1881) by the great Anglo-Irish economist Francis Ysidro Edgeworth (1845-1926). (Actually, Edgeworth didn’t really invent this diagram; the version we use today is due to the English economist Arthur Bowley (1869-1957).) This graph is called an Edgeworth box diagram. We show it in Figure 15.1 below. In the figure, note that the initial endowment is given by the point $W$. That is, $W$ is the allocation of bread x and rum y giving Robinson the bundle $(x_R^0, y_R^0)$ and giving Friday the bundle $(x_F^0, y_F^0)$. There are two indifference curves shown in the figure: $I_R$ belongs to Robinson, and $I_F$ belongs to Friday. The small arrows attached to those indifference curves indicate the directions of increasing utility.

INSERT FIGURE 15.1 HERE
Caption of Fig. 15.1: The Edgeworth box diagram.

The novel feature of the Edgeworth box diagram, which makes it different from other diagrams we have used, is that it has two origins. The lower left origin is for Robinson. Quantities for Robinson are measured from that origin. The upper right origin is for Friday. Quantities for Friday are measured from that origin. Of course this is a little confusing at first, because in a sense Friday is upside down and backwards! (This is why his indifference curve $I_F$ seems to look wrong.) But once the reader is past this confusion, the advantages of the diagram become apparent. First, we see that the quantities at the initial allocation $W$ add up as they should; that is, $X = x_R^0 + x_F^0$ and $Y = y_R^0 + y_F^0$. Second, we see that any allocation of the given totals between Robinson and Friday could be represented by some point in the diagram.
This is because if the totals in an arbitrary pair of bundles $((x_R, y_R), (x_F, y_F))$ add up to the given totals $X$ and $Y$, that is, if $X = x_R + x_F$ and $Y = y_R + y_F$, then the given allocation can be plotted as a single point in the box. (The interested reader should convince herself that this is true, by plotting such a point.) And third, the Edgeworth box diagram has the remarkable virtue that it easily shows four quantities, $((x_R^0, y_R^0), (x_F^0, y_F^0))$, in a two-dimensional picture!

15.3 Pareto Efficiency

In previous chapters, when we analyzed the welfare properties of competitive markets, monopoly, and duopoly, we looked at efficiency in terms of consumers’ and producers’ surplus. The sum of these two surpluses represents the total net benefit created in a market, to buyers and sellers, measured in money units, e.g., dollars. Consumers’ plus producers’ surplus is a measure of the “size of the economic pie,” created by the production and trade of some good. A market for a good is inefficient if there is a way to make that pie bigger (e.g., the standard monopoly case), and it is efficient if there is no way to make it bigger (e.g., the standard competitive case).

All this assumed that consumers’ surplus is well defined, which in turn required some special assumptions about preferences. Note that this kind of analysis focused on one good under study, and ignored what might have been happening in other markets for other goods, for labor and savings, and so on. It was therefore what is called partial equilibrium analysis, and models which study one good in this fashion are called partial equilibrium models.

But we are now looking at a simple model of exchange, without production. If we measure the size of the economic pie in terms of total quantities of bread and/or rum, the size cannot change because these total quantities are fixed.
There is no money in the model (at least not yet), and so it wouldn’t be easy to measure the size of the pie in money units. We might try to measure the economic pie in utility units, but we know that it would probably be wrong to try to add together Robinson’s utility and Friday’s utility. How then can we decide when an allocation of the fixed quantities of bread and rum, between the two consumers, is efficient (or when it is not)?

The solution to this problem was developed by the Italian economist Vilfredo Pareto (1848-1923), and so we call the central concept Pareto optimality or Pareto efficiency. Here are some important definitions.

First, we need to be careful about which allocations of bread and rum are possible and which are not. We will say that a pair of bundles of goods, $(x_R, y_R)$ and $(x_F, y_F)$, is a feasible allocation if all the quantities are non-negative and if $X = x_R + x_F$ and $Y = y_R + y_F$. That is, a feasible allocation is one in which the goods going to Robinson and Friday add up to the given totals. In fact, the feasible allocations in the exchange model are simply the points in the Edgeworth box diagram, no more and no less.

Second, if A and B are two feasible allocations, we will say that A Pareto dominates B if both Robinson and Friday like A at least as well as B, and at least one likes it better. If A Pareto dominates B, we call a move by Robinson and Friday from B to A a Pareto move.

Third, a feasible allocation is not Pareto optimal if there is a different feasible allocation which both of the consumers like at least as well, and which is preferred by at least one of them. That is, a feasible allocation is not Pareto optimal if there is a Pareto move from it. Note that any Pareto move would get a unanimous vote of approval (possibly with an abstention).
Fourth and finally, a feasible allocation is Pareto optimal or Pareto efficient if there is no feasible allocation which both of the consumers like at least as well, and which is preferred by at least one of them. That is, a feasible allocation is Pareto optimal if there is no Pareto move from it.

The reader can see a non-optimal feasible allocation in Figure 15.1; it is the initial allocation $W$, and there are many points in the Edgeworth box diagram (the lens-shaped area to the northwest of $W$), which Pareto dominate $W$. (Of course $W$ isn’t the only non-optimal allocation in Figure 15.1!)

So much for the Pareto-related definitions. Note that these definitions can easily be extended to exchange economies with any number of consumers and any number of goods, and can be extended, less easily, to any kind of economic model, including models with production as well as exchange. Note also that these definitions are not restricted to models of markets with just one good (“partial equilibrium models”). They are very general, and useful in general equilibrium models, that is, models which consider supply and demand in all markets simultaneously. Our pure exchange model, with two people and two goods, is a very simple example of a general equilibrium model.

Now think about a feasible allocation that is not Pareto optimal. It is obviously undesirable for society to be at that allocation, since there are other feasible allocations that are unambiguously better, in the sense that a move from the given non-optimal allocation to the alternative would get unanimous consent. Obviously, if an economy is at a non-Pareto optimal point, it should move to something better. But note that Pareto optimality has nothing to do with considerations of distributional fairness or equity. That is, a non-Pareto optimal allocation may be a lot more equal than a Pareto optimal one.
In fact, allocating all the bread and rum to Robinson (for example) is Pareto optimal, since there is no Pareto move away from that totally lopsided and unfair allocation. Moreover, giving both Robinson and Friday exactly half the bread and half the rum, the allocation which is the most equal of all the feasible allocations, is probably not Pareto optimal.

With all this said, we now return to our Edgeworth box diagrams. They should make the mysteries of Pareto optimality and non-optimality clear.

Feasible allocations and the Edgeworth box. Suppose we have a bundle of goods $(x_R, y_R)$ for Robinson and another bundle $(x_F, y_F)$ for Friday. For the pair of bundles $((x_R, y_R), (x_F, y_F))$ to be a feasible allocation, the numbers must add up to the totals $X$ and $Y$. Consider Figure 15.2 below. In it the two bundles are shown, but the quantities of bread (on the horizontal axis) and rum (on the vertical axis) do not add up to the total quantity of bread available, $X$ (the width of the box), or the total quantity of rum available, $Y$ (the height of the box). In other words, if you are interested in finding Pareto optimal allocations of bread and rum, don’t even think about the pair of bundles $(x_R, y_R)$ and $(x_F, y_F)$ shown in Figure 15.2, because that pair of bundles is just not possible.

INSERT FIGURE 15.2 HERE
Caption of Fig. 15.2: This pair of bundles is not feasible. Therefore it is not Pareto optimal.

In Figure 15.3, we show another pair of bundles of goods whose totals do not add up to $X$ and $Y$. However, this time the totals fall short. We’ll still consider this pair of bundles $((x_R, y_R), (x_F, y_F))$ non-feasible and therefore non-Pareto optimal. (Some economists would pronounce $((x_R, y_R), (x_F, y_F))$ “feasible,” because you could throw away some bread and some rum (the excesses of each good), starting at $X$ and $Y$, and get to the lesser totals.
But even if you take this approach, $((x_R, y_R), (x_F, y_F))$ still wouldn’t be Pareto optimal, because it is Pareto-dominated by another allocation formed by starting with $((x_R, y_R), (x_F, y_F))$, and then adding back half the excesses to each consumer’s bundle.)

INSERT FIGURE 15.3 HERE
Caption of Fig. 15.3: This pair of bundles is not Pareto optimal either, because it’s not feasible. Even if we were to expand our definition of feasibility to allow it, it still wouldn’t be Pareto optimal.

The moral of Figures 15.2 and 15.3 is that in the exchange economy model, an allocation of bread and rum must be feasible before we can decide whether or not it is Pareto optimal. To be feasible, it must be the case that $X = x_R + x_F$ and $Y = y_R + y_F$.

Tangencies of indifference curves and the Edgeworth box. From this point on, we will only consider feasible allocations. That is, we will only look at pairs of bundles $((x_R, y_R), (x_F, y_F))$ which can be represented by single points in the Edgeworth box diagram. The point $W$ in Figure 15.1 was such a feasible allocation. But we will draw a fresh figure, Figure 15.4 below, with a similar point, labeled $P$. In the figure, the indifference curves $I_R$ and $I_F$ cross at the point $P = ((x_R, y_R), (x_F, y_F))$. The arrows on the indifference curves show the directions of increasing utility. The point $P$ cannot be Pareto optimal, because a move to the interior of the lens-shaped area would make both consumers better off. (Moving to the other end of the lens, where $I_R$ and $I_F$ cross again, would leave both of them exactly as well off; moving to the edges of the lens would make one person better off and the other exactly as well off as at the highlighted point.)

We now have a tentative conclusion. If indifference curves of the two consumers cross at a point in an Edgeworth box diagram, that point must be non-Pareto optimal, and there must be other points, other feasible allocations, which Pareto dominate it.
Note, by the way, that if two indifference curves actually cross at a point in the box (rather than just touch each other) then that point must be in the interior of the box. That is, it must be the case that $x_R, y_R, x_F, y_F > 0$.

INSERT FIGURE 15.4 HERE
Caption of Fig. 15.4: A point in the Edgeworth box diagram that is not Pareto optimal.

If Robinson and Friday are at a point like $P$ in Figure 15.4, they can make trades that benefit one or both, and harm neither. That is, they can make Pareto moves. If they are free to trade, aware of the feasible allocations, and in touch with their preferences or utility functions, they will probably continue to trade until they can no longer make Pareto moves; that is, they will trade to a Pareto optimal allocation. Moreover, if the point where they end up is in the interior of the Edgeworth box diagram, it cannot be a point where Robinson’s and Friday’s indifference curves cross. Rather, assuming the indifference curves are smooth and do not have kinks, the point where they end up, where further Pareto moves are impossible, must be a tangency point.

In short, in the interior of the Edgeworth box diagram, the Pareto optimal points must be points of tangency between the indifference curves of Robinson and Friday. Figure 15.5 below shows one such Pareto optimal point, identified as $Q = ((x_R, y_R), (x_F, y_F))$. Four arrows are drawn from $Q$. The one pointing northeast suggests a move that would make Robinson better off, but would make Friday worse off. The one pointing southwest suggests a move that would make Friday better off, but would make Robinson worse off. A move in the direction of each of the other arrows would make both consumers worse off. Therefore there is no Pareto move away from $Q$, which means that $Q$ must be Pareto optimal.

INSERT FIGURE 15.5 HERE
Caption of Fig. 15.5: The allocation Q is Pareto optimal.
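The definitions of feasibility, Pareto dominance, and Pareto moves can be made concrete with a small numerical sketch. Everything specific here is an assumption made for illustration and does not come from the text: the totals of 10 units of each good, the utility functions $u_R = x_R y_R$ and $u_F = x_F y_F$, the two sample allocations `P` and `Q`, and all function names. The grid search is only a finite check over a coarse grid, not a proof of optimality.

```python
# Hypothetical totals of bread (X) and rum (Y).
X_TOTAL, Y_TOTAL = 10.0, 10.0


def u_R(x, y):
    """Robinson's utility (assumed functional form)."""
    return x * y


def u_F(x, y):
    """Friday's utility (assumed functional form)."""
    return x * y


def is_feasible(alloc, tol=1e-9):
    """Non-negative quantities that add up exactly to the given totals."""
    (xR, yR), (xF, yF) = alloc
    nonneg = min(xR, yR, xF, yF) >= 0
    return (nonneg and abs(xR + xF - X_TOTAL) <= tol
            and abs(yR + yF - Y_TOTAL) <= tol)


def utilities(alloc):
    (xR, yR), (xF, yF) = alloc
    return u_R(xR, yR), u_F(xF, yF)


def pareto_dominates(a, b):
    """Both like a at least as well as b, and at least one likes it better."""
    uaR, uaF = utilities(a)
    ubR, ubF = utilities(b)
    return uaR >= ubR and uaF >= ubF and (uaR > ubR or uaF > ubF)


def find_pareto_move(alloc, steps=20):
    """Search a grid of feasible allocations for one that dominates alloc."""
    for i in range(steps + 1):
        for j in range(steps + 1):
            xR = X_TOTAL * i / steps
            yR = Y_TOTAL * j / steps
            candidate = ((xR, yR), (X_TOTAL - xR, Y_TOTAL - yR))
            if pareto_dominates(candidate, alloc):
                return candidate
    return None


# P: indifference curves cross here; Q: they are tangent here.
P = ((8.0, 2.0), (2.0, 8.0))
Q = ((5.0, 5.0), (5.0, 5.0))

if __name__ == "__main__":
    print("Pareto move from P:", find_pareto_move(P))
    print("Pareto move from Q:", find_pareto_move(Q))
```

Under these assumed utilities the search finds a Pareto move away from `P` but none away from `Q`, which mirrors the contrast between Figures 15.4 and 15.5.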
All this leads to a necessary condition for Pareto optimality for points in the interior of the Edgeworth box diagram. For such a point to be Pareto optimal, the slopes of the indifference curves of Robinson and Friday must be equal at the point. That is, Robinson’s marginal rate of substitution of good y for good x must equal Friday’s marginal rate of substitution of y for x. This gives

$$MRS^R = MRS^F,$$

which in turn gives

$$\frac{MU_x^R}{MU_y^R} = \frac{MU_x^F}{MU_y^F}.$$

Note that the R superscript is for Robinson and the F superscript is for Friday.

The set of Pareto optimal allocations in the Edgeworth box diagram is called the contract curve. The name is fitting, for these are the allocations that could potentially be outcomes of trading contracts. That is, Robinson and Friday would be likely to agree to a contract that would take them from the initial allocation $W$ to the contract curve. Of course, where they end up on the contract curve depends on the location of the starting allocation $W$, and it may also depend on their bargaining abilities. If $W$ gives most of the bread and most of the rum to Robinson, they will end up somewhere on the contract curve where Robinson still has most of the bread and most of the rum.

In Figure 15.6, we show a contract curve. In the interior of the Edgeworth box diagram it is the set of tangency points. We show an initial allocation $W$ that gives most of the bread to Robinson and most of the rum to Friday. In this exchange economy, Robinson and Friday will trade to the contract curve, but not to anywhere on the contract curve. They will want to make Pareto moves. This means that neither should end up worse off than he was at the initial allocation $W$. The part of the contract curve where neither is worse off than he was at $W$ is between allocations A and B. This is called the core.

INSERT FIGURE 15.6 HERE
Caption of Fig. 15.6: The contract curve and the core.
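To see the tangency condition, the contract curve, and the core at work, here is a sketch for one hypothetical economy. The Cobb-Douglas utilities $u_R = x_R^{0.6} y_R^{0.4}$ and $u_F = x_F^{0.4} y_F^{0.6}$, the totals of 10 units of each good, the endowment giving Robinson (8, 2), and all function names are assumptions chosen for illustration. With these utilities, setting $MRS^R = MRS^F$ and substituting $x_F = X - x_R$, $y_F = Y - y_R$ gives $y_R$ as a function of $x_R$, which is what `contract_curve_y` computes.

```python
# Hypothetical totals and Cobb-Douglas exponents on bread.
X_TOTAL, Y_TOTAL = 10.0, 10.0
A, B = 0.6, 0.4  # Robinson's and Friday's exponents (assumed)


def u_R(x, y):
    return x ** A * y ** (1 - A)


def u_F(x, y):
    return x ** B * y ** (1 - B)


def mrs_R(x, y):
    """Robinson's MU_x / MU_y for the assumed Cobb-Douglas utility."""
    return (A / (1 - A)) * y / x


def mrs_F(x, y):
    """Friday's MU_x / MU_y for the assumed Cobb-Douglas utility."""
    return (B / (1 - B)) * y / x


def contract_curve_y(xR):
    """Robinson's rum on the contract curve, from MRS^R = MRS^F."""
    kR, kF = A / (1 - A), B / (1 - B)
    return kF * Y_TOTAL * xR / (kR * (X_TOTAL - xR) + kF * xR)


def core_points(endowment_R, n=99):
    """Grid points on the contract curve where neither is worse off than at W."""
    x0, y0 = endowment_R
    uR0 = u_R(x0, y0)
    uF0 = u_F(X_TOTAL - x0, Y_TOTAL - y0)
    points = []
    for i in range(1, n + 1):
        xR = X_TOTAL * i / (n + 1)
        yR = contract_curve_y(xR)
        if u_R(xR, yR) >= uR0 and u_F(X_TOTAL - xR, Y_TOTAL - yR) >= uF0:
            points.append((xR, yR))
    return points


if __name__ == "__main__":
    core = core_points((8.0, 2.0))
    print("core runs from", core[0], "to", core[-1])
```

The two ends of the printed range play the role of the allocations A and B in Figure 15.6, up to the coarseness of the grid.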
Most economists believe that Pareto moves are unambiguously good, and that Pareto optimality is desirable, since any non-Pareto optimal point is unambiguously inferior to some Pareto optimal one. Most economists who look at the Edgeworth box diagram agree that it would be a good thing to end up on the contract curve. There is of course disagreement about distribution, and so we do not claim that the Pareto optimal point A is better than (or worse than) the Pareto optimal point B. But we do agree that it would be a good thing to end up at some Pareto optimal point, that is, at some point on the contract curve.

We have suggested that in the exchange economy model, our two traders, starting at some initial allocation $W$, can simply trade, or barter, to get to the contract curve. Is there another way to get there? The answer is “Yes,” through market trade.

15.4 Competitive or Walrasian Equilibrium

We will now model a competitive market in our simple two-person two-good Robinson/Friday economy. This theory was first developed by the French economist Léon Walras (1834-1910). The market equilibrium idea we will describe is called a competitive equilibrium or a Walrasian equilibrium. The connections between market equilibria and Pareto optimality were rigorously analyzed in the 1950s, especially by the American economists Kenneth Arrow (1921-2017) and Lionel McKenzie (1919-2010), and the French economist Gérard Debreu (1921-2004).

Here’s the story of Walrasian equilibrium. Imagine an auctioneer lands on the island with Robinson and Friday. The auctioneer has no bread and no rum, nor does he have any desire to consume any. His sole function is to create a market where people can trade the two goods. He does this by calling out prices for the two goods. He starts by announcing $p_x$, the per unit bread price, and $p_y$, the per unit rum price. He announces that he will buy or sell any quantities of bread and/or rum, at those prices.
He asks Robinson and Friday: “What do you want to do at those prices?”

In our model of a competitive market economy, we assume Robinson and Friday take those prices as given and fixed, unaffected by their actions. (This is obviously a little unrealistic when we are talking about just two consumers. But the model is meant to be extended to cases where there are many consumers, in which case the assumption of competitive behavior becomes plausible.)

Now Robinson and Friday hear the Walrasian auctioneer announce a pair of prices $(p_x, p_y)$, and they understand that they should tell him what bundle they want to consume, based on those prices. Our traders have no money in the bank or in their pockets; they only have their initial bundles. Robinson and Friday hear the announced prices and know the bundles they start with. If Robinson starts with 10 loaves of bread, and decides he wants to consume 12 loaves, he will go to the auctioneer and swap some of his rum for the extra 2 loaves of bread. What exactly is his budget constraint? We could figure it in terms of such a swap; it would then be “value of bread acquired = value of rum given up” or

$$(x_R - x_R^0)\, p_x = (y_R^0 - y_R)\, p_y.$$

With a little rearranging, this gives

$$p_x x_R + p_y y_R = p_x x_R^0 + p_y y_R^0.$$

Alternatively, we could derive Robinson’s budget constraint by realizing that in a world where consumers do not have money income, what substitutes for income in the budget constraint is the value of the bundle the consumer starts out with. Robinson’s budget constraint should then say “value of his desired consumption bundle = value of the bundle he starts with,” which also gives

$$p_x x_R + p_y y_R = p_x x_R^0 + p_y y_R^0.$$

Now recall the Walrasian auctioneer has called out some prices, and asked Robinson and Friday: “What do you want to do at these prices?” Robinson of course wants to maximize his utility, or get to the highest indifference curve, subject to his budget constraint.
That is, he wants to maximize $u_R(x_R, y_R)$ subject to the constraint $p_x x_R + p_y y_R = p_x x_R^0 + p_y y_R^0$.

Think of the budget line implied by this budget constraint. Note that the absolute value of the slope of the budget line is $p_x/p_y$, and note that the budget line must go through the initial bundle $(x_R^0, y_R^0)$. All of this leads Robinson to conclude that he wants to consume some bundle, call it $A_R$ for now. Robinson tells the auctioneer that based on the announced prices, he wants to consume $A_R$. Friday goes through the same exercise, and he ends up telling the auctioneer that he wants to consume $B_F$.

In Figure 15.7 below we plot the results. There is one budget line going through the initial allocation $W$. We do not have two separate lines, one for Robinson and the other for Friday. This is because, first, either trader’s line must go through $W$, which represents both their initial bundles. And second, since the auctioneer called out only one set of prices, there is only one possible price ratio $p_x/p_y$ and only one possible slope. In the figure, we show the bundle $A_R$ that Robinson would like to consume, and the bundle $B_F$ that Friday would like to consume.

INSERT FIGURE 15.7 HERE
Caption of Fig. 15.7: At these prices, there is excess supply of good $x$ and excess demand for good $y$. The Walrasian auctioneer should announce new prices with a lower relative price for $x$, $p_x/p_y$.

Now it’s time for the Walrasian auctioneer to act. He asks himself: “Is it possible for Robinson to consume $A_R$ and for Friday to consume $B_F$?” The reader should immediately see the answer: “No,” because $(A_R, B_F)$ is not a feasible allocation. The totals do not add up to $X$ and $Y$. In particular, the amount of bread that the two want to consume is less than the amount $X$ that is available, and the amount of rum that the two want to consume is greater than the amount $Y$ that is available. So $(A_R, B_F)$ is just not possible. The Walrasian auctioneer sees this.
He says to himself: “The $(p_x, p_y)$ I announced must be changed. There is excess supply of bread and excess demand for rum. I must lower the relative price of bread $p_x/p_y$.” So he tells Robinson and Friday that there will be no trading at the previously announced prices. Instead he announces a new pair of prices, for which $p_x/p_y$ is a little lower than the first pair of prices. (For instance, if the original $(p_x, p_y)$ was $(2, 1)$, he announces new prices $(1.75, 1)$.) He tells Robinson and Friday to forget about the bundles they wanted to consume at the previous pair of prices. Instead, they should now tell him what bundles they want to consume at his newly announced prices. Robinson and Friday then figure out what bundles they want to consume at the new prices, and duly report back to the auctioneer.

The auctioneer together with Robinson and Friday continue this price-to-desired-consumption-bundles-to-price process until, finally, they end up with a pair of prices and desired consumption bundles that work. That is, the process continues until Robinson and Friday tell the Walrasian auctioneer that based on his latest price combination $(p_x^*, p_y^*)$, they want to consume certain bundles $A_R^*$ and $B_F^*$, and those bundles are consistent with the given totals of bread and rum; they are a feasible allocation, a single point in the Edgeworth box diagram. That is, at the pair of bundles $A_R^*$ and $B_F^*$, for each of the goods,

Total demand = Total supply.

Once this end has been reached, the Walrasian auctioneer makes his final announcement to Robinson and Friday: “We’re finally there. Make the trades at the $(p_x^*, p_y^*)$ prices, either through me as an intermediary or directly between yourselves. Then consume and enjoy!” (We have been somewhat casual about the nature of the dynamic price adjustment process. Analysis of convergence for the process is beyond our scope.)
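The price-adjustment story above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: both traders are given a Cobb-Douglas utility $u(x, y) = x^a y^{1-a}$, the endowments and the adjustment step `step` are made-up numbers, and rum is treated as the numeraire so that only $p_x$ moves. The function names `demand` and `tatonnement` are hypothetical.

```python
def demand(a, endowment, px, py):
    """Desired bundle for the assumed utility u(x, y) = x**a * y**(1 - a)."""
    x0, y0 = endowment
    wealth = px * x0 + py * y0  # value of the initial bundle at announced prices
    return a * wealth / px, (1 - a) * wealth / py


def tatonnement(traders, px=2.0, py=1.0, step=0.1, tol=1e-9, max_rounds=10_000):
    """Adjust px (py stays fixed as numeraire) until the bread market clears."""
    total_x = sum(endowment[0] for _, endowment in traders)
    bundles = [demand(a, e, px, py) for a, e in traders]
    for _ in range(max_rounds):
        excess_x = sum(b[0] for b in bundles) - total_x
        if abs(excess_x) < tol:
            break
        px += step * excess_x  # excess supply (negative) lowers the bread price
        bundles = [demand(a, e, px, py) for a, e in traders]
    return px, py, bundles


# Hypothetical traders: (a, (bread endowment, rum endowment)).
traders = [(0.5, (8.0, 2.0)), (0.5, (2.0, 8.0))]
px, py, bundles = tatonnement(traders)
print(px, py, bundles)
```

With these assumed numbers there is excess supply of bread at the opening prices $(2, 1)$, and the first revision lowers the bread price to $1.75$. The loop then settles near $p_x = 1$, where each trader wants roughly 5 loaves and 5 units of rum. Convergence here depends on the chosen utility functions and step size; the sketch makes no claim about the general case.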
The process we described above is called a Walrasian auctioneering process or Walrasian process or tatonnement process. (The word “tatonnement” is French for “groping.”) The end result is called a competitive equilibrium or Walrasian equilibrium. The pair of prices where it ends up, $p_x^*$ and $p_y^*$, are called the competitive equilibrium prices. The Walrasian process produces the equilibrium prices and a pair of consumption bundles $A_R^*$ and $B_F^*$, such that $A_R^*$ maximizes Robinson’s utility subject to his budget constraint with the equilibrium prices, $B_F^*$ maximizes Friday’s utility subject to his budget constraint with the equilibrium prices, and such that $(A_R^*, B_F^*)$ is a feasible allocation; that is, the desired total consumption of each good equals the total supply of that good. The allocation $(A_R^*, B_F^*)$ is called a competitive equilibrium allocation.

Figure 15.8 shows a competitive equilibrium. Note the crucial difference between Figure 15.7 and Figure 15.8: in Figure 15.7 the desired consumption bundles are two distinct points in the Edgeworth box, which means they are not a feasible allocation; there is excess supply of bread and excess demand for rum. This suggests the relative price of bread, $p_x/p_y$, should fall. That is, the budget line should get flatter. Figure 15.8 has a flatter budget line, and in that figure the desired consumption bundles do coincide in the Edgeworth box. They constitute a feasible allocation. Supply equals demand for each good.

INSERT FIGURE 15.8 HERE
Caption of Fig. 15.8: The Walrasian or competitive equilibrium.

Finally, note two last extremely important facts about $(A_R^*, B_F^*)$ in Figure 15.8. First, the competitive equilibrium allocation is a tangency point for the two indifference curves shown. That means it’s on the contract curve. It’s Pareto optimal! And second, a look at Figures 15.6 and 15.8 together should convince you that the competitive equilibrium allocation is in the core.
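The two facts just noted can be checked numerically for a specific example. The sketch below is an illustration only: it assumes Cobb-Douglas utilities $u(x, y) = x^a y^{1-a}$ with made-up exponents and endowments, normalizes $p_y = 1$, and uses the closed-form market-clearing price that follows from those assumed demand functions. With two traders, an allocation is in the core when it is Pareto optimal and leaves neither trader worse off than at the initial bundle, so the code tests tangency and the no-one-worse-off condition.

```python
import math


def equilibrium_price(a_r, w_r, a_f, w_f):
    """Market-clearing px (with py = 1) for two Cobb-Douglas traders."""
    return (a_r * w_r[1] + a_f * w_f[1]) / ((1 - a_r) * w_r[0] + (1 - a_f) * w_f[0])


def demand(a, w, px, py=1.0):
    """Desired bundle given the value of the initial bundle w."""
    wealth = px * w[0] + py * w[1]
    return a * wealth / px, (1 - a) * wealth / py


def utility(a, bundle):
    x, y = bundle
    return x**a * y**(1 - a)


def mrs(a, bundle):
    """MRS = MU_x / MU_y for u = x**a * y**(1 - a)."""
    x, y = bundle
    return (a / (1 - a)) * (y / x)


# Hypothetical preferences and endowments (bread, rum).
a_r, w_r = 0.6, (8.0, 2.0)
a_f, w_f = 0.3, (2.0, 8.0)

px = equilibrium_price(a_r, w_r, a_f, w_f)
bundle_r = demand(a_r, w_r, px)
bundle_f = demand(a_f, w_f, px)

# Desired totals equal the available totals of each good.
feasible = all(math.isclose(bundle_r[k] + bundle_f[k], w_r[k] + w_f[k]) for k in (0, 1))
# Both marginal rates of substitution equal the price ratio px / py = px.
tangent = math.isclose(mrs(a_r, bundle_r), px) and math.isclose(mrs(a_f, bundle_f), px)
# Neither trader is worse off than at the initial bundle.
no_one_worse = (utility(a_r, bundle_r) >= utility(a_r, w_r)
                and utility(a_f, bundle_f) >= utility(a_f, w_f))
print(feasible, tangent, no_one_worse)
```

Changing the exponents or endowments gives a different equilibrium price, but under these assumed utility functions the three checks should continue to hold.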
15.5 The Two Fundamental Theorems of Welfare Economics

The relationships between free markets and efficiency, and between market incentives and national wealth, have been written about since the time of Adam Smith (1723-1790), who published The Wealth of Nations in 1776. Smith’s arguments were neither formal nor mathematical; the formal and mathematical analysis was developed in the late 19th and mid-20th centuries. We now call the two basic results that relate Pareto optimality and competitive markets the first and second fundamental theorems of welfare economics.

Figure 15.8 illustrates the first fundamental theorem in our simple pure exchange model, with only two people and two goods. The figure shows that a competitive equilibrium allocation is Pareto optimal. That result easily extends to exchange models with any number of people and any number of goods, as well as to economic models with production as well as exchange. The result only requires a few assumptions; in particular, we must assume that there are markets and market prices for all the goods, that all the agents are competitive price takers, and that any individual’s utility depends only on his or her own consumption bundle, and not on the consumption bundles of other individuals. (Similarly, if there are firms, we must assume that they are all competitive price takers, and that any firm’s production function only depends on that firm’s inputs and outputs.) We’ll now state the first fundamental theorem, for a general exchange economy.

First fundamental theorem of welfare economics. Suppose there are markets and market prices for all the goods, that all the people are competitive price takers, and that each person’s utility depends only on her own bundle of goods. Then any competitive equilibrium allocation is Pareto optimal. In fact, any competitive equilibrium allocation is in the core.
This is an extremely important result, because it suggests that a society that relies on competitive markets will achieve Pareto optimality. Note that although there are lots of competitive allocations, most allocations in an exchange model are actually not Pareto optimal. The reader should look back at Figure 15.6 and think about throwing a dart at that Edgeworth box diagram, hoping to hit the contract curve. What are the odds you will hit it? At least in theory the odds are zero, because a line has zero area. So ending up at a Pareto optimal allocation is not easy, and the fact that the market mechanism does it is impressive. Moreover, the market mechanism is cheap (it only requires a Walrasian auctioneer, in theory, or perhaps something like eBay, in reality). It does not require that some central power learn everybody’s utility function (which would be terribly intrusive and dangerous) and then make distributional decisions; it only requires publicly known prices that move in response to excess supply or demand. In short, the competitive market mechanism is relatively cheap, relatively unobtrusive, relatively benign, and remarkably effective. This is what the first fundamental theorem of welfare economics helps us understand.

However, one important shortcoming of the first fundamental theorem is that the location of the competitive equilibrium allocation is highly dependent on the location of the initial allocation. In other words, if we start at an initial allocation that gives Robinson most of the bread and rum, we will end up at a competitive equilibrium allocation that gives Robinson most of the bread and rum. Or, more generally, if a society has a very unequal distribution of talents and abilities and initial quantities of various goods, it will end up with a competitive equilibrium that, while Pareto optimal, is very unequal. What can be done? This is where the second fundamental theorem comes in.
The second fundamental theorem of welfare economics uses all the assumptions of the first theorem, and adds an additional one, convexity. In particular, at least for the exchange version of the second fundamental theorem, we will assume that the traders have convex indifference curves. (This is in fact how we drew the indifference curves in Figures 15.4 through 15.8.) Here’s what the theorem says. Suppose the initial allocation in society is very skewed, very unfair, and therefore a competitive equilibrium based on it would be very unfair. Suppose that people in society have decided that there is a different, perhaps much fairer Pareto optimal allocation, that they want to get to. But they want to mostly use the market mechanism to get to that desired Pareto optimal point; they do not want a dictator announcing what bundle of goods each and every person should consume. Is there a slightly modified market mechanism that will get society from the initial allocation to the target Pareto optimal one? The answer is “Yes.”

Figure 15.9 below illustrates the theorem. The initial allocation $W$ gives all the bread and most of the rum to Robinson. Given this initial allocation, the Walrasian mechanism we described above, if left alone, will produce a competitive equilibrium that gives most of the bread and most of the rum to Robinson. This allocation is labeled Laissez faire in the figure. This is the outcome of the unshackled free market. (“Laissez faire” is French for “let do,” that is, let the market do its thing.) But it’s very unfair; it leaves Friday a pauper. A more equitable goal would be the Pareto optimal allocation labeled Target. Can the market, with a relatively small fix, be used to get society to Target?

INSERT FIGURE 15.9 HERE
Caption of Fig. 15.9: A very unfair Laissez Faire competitive equilibrium, and a more equitable, and Pareto optimal, Target.
Referring now to Figure 15.9, this is how the market mechanism is modified to get the economy to Target, the Pareto optimal allocation that’s more equitable than what the market would produce by itself. Since Target is Pareto optimal, and in the interior of the Edgeworth box diagram, it must be a tangency point of the two traders’ indifference curves. (The argument is more complicated if Target is not an interior point.) Since it’s a tangency point, we can draw a tangent line like the one in Figure 15.9. From the slope of the tangent line we can figure out what price ratio $p_x^*/p_y^*$ we’re going to need. One of the prices could be arbitrarily set to 1, and then the required price ratio would give the other price. (A good whose price is set equal to 1 is called a numeraire good.) This gives the required pair of prices $(p_x^*, p_y^*)$.

Now the government must step in and introduce some lump-sum taxes and transfers. These are imposed on Robinson and Friday. They are “lump sum” because they are independent of the quantities of goods the parties want to consume. A tax (money taken from a person) will be represented by a negative number; a transfer (or subsidy) will be represented by a positive number. Let $T_R$ be Robinson’s tax or transfer, and let $T_F$ be Friday’s. The government is not creating or destroying wealth, and so we require that $T_R + T_F = 0$. Whatever the government taxes away from one party will be quickly sent to the other party.

The government now sends Robinson a message: “Put $T_R$ onto the right hand side of your budget constraint, and assume prices $(p_x^*, p_y^*)$ for the two goods. If $T_R$ is a negative number, too bad for you. You’ve lost some money; send it to us. If it’s a positive number, good for you. You’ve gained some money; we’ll be wiring it to you today.” The government sends Friday a similar message about $T_F$.
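The recipe just described (read the price ratio off the tangent line at Target, then add a lump-sum amount to the right hand side of each budget constraint) can be sketched as follows. Everything specific here is an assumption for illustration: both traders are given the utility $u(x, y) = xy$, the box holds 10 loaves and 10 units of rum, the initial and Target allocations are made-up numbers, and the function names are hypothetical. The transfer for each trader is computed as the value of his Target bundle minus the value of his initial bundle at the required prices.

```python
import math


def mrs(bundle):
    """MRS = MU_x / MU_y = y / x for the assumed utility u(x, y) = x * y."""
    x, y = bundle
    return y / x


def demand(wealth, px, py):
    """Utility-maximizing bundle for u = x * y: half of wealth goes to each good."""
    return wealth / (2 * px), wealth / (2 * py)


def prices_and_transfers(endowments, target, py=1.0):
    """Prices from the tangent line at Target, then the lump-sum T for each trader."""
    slopes = [mrs(b) for b in target]
    if not math.isclose(slopes[0], slopes[1]):
        raise ValueError("Target is not a tangency point, so it is not Pareto optimal")
    px = slopes[0] * py  # rum is the numeraire good
    transfers = [
        px * (t[0] - w[0]) + py * (t[1] - w[1])  # value of Target minus value of endowment
        for w, t in zip(endowments, target)
    ]
    return px, py, transfers


# Hypothetical box with 10 loaves and 10 units of rum; Robinson starts with nearly everything.
endowments = [(10.0, 8.0), (0.0, 2.0)]   # (Robinson, Friday)
target = [(6.0, 6.0), (4.0, 4.0)]        # a more equal tangency point

px, py, (t_r, t_f) = prices_and_transfers(endowments, target)
# Each trader now maximizes utility with T added to the value of his initial bundle.
chosen = [
    demand(px * w[0] + py * w[1] + t, px, py)
    for w, t in zip(endowments, (t_r, t_f))
]
print(px, py, t_r, t_f, chosen)
```

In this made-up example Robinson pays a tax of 6 and Friday receives a transfer of 6, so the two amounts sum to zero, and at prices $(1, 1)$ each trader’s desired bundle is exactly his Target bundle.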
The budget constraints of Robinson and Friday now become

$$p_x^* x_R + p_y^* y_R = p_x^* x_R^0 + p_y^* y_R^0 + T_R$$

and

$$p_x^* x_F + p_y^* y_F = p_x^* x_F^0 + p_y^* y_F^0 + T_F.$$

Robinson and Friday now choose their desired consumption bundles, based on these budget constraints. And if the government sets the taxes and transfers at the right level, Robinson and Friday will end up at the desired Pareto optimal point, the point Target.

In short, by properly setting lump-sum taxes and transfers, society can get from any initial allocation, no matter how inequitable, to a more desirable Pareto optimal allocation without abandoning the use of the market mechanism. The second fundamental theorem of welfare economics says all this is possible. Here is a more formal statement.

Second fundamental theorem of welfare economics. Suppose there are markets for all the goods, that all the people are competitive price takers, and that each person’s utility depends only on his or her own bundle of goods. Suppose further that the traders have convex indifference curves. Let Target be any Pareto optimal allocation. Then there are competitive equilibrium prices for the goods, and a list of lump-sum taxes and transfers for the people, which sum to zero, such that when the budget constraints based on these prices are modified with these taxes and transfers, Target is the resulting competitive equilibrium allocation.

Loosely speaking, the first fundamental theorem of welfare economics says that any competitive equilibrium is Pareto optimal, and the second says that any Pareto optimal point is a competitive equilibrium, given the appropriate modification of the traders’ budget constraints. The second theorem needs an additional assumption (convexity), and relies heavily on the budget constraint modifications.
But the existence of the second theorem allows all economists to more-or-less agree: “We like the market mechanism; it gets us Pareto optimality.” Conservative economists tend to say “The market’s great, don’t touch it; let’s go to the Pareto optimal outcome it gives us.” Liberal economists tend to say “The market’s great, but the initial allocation is terrible; let’s use some taxes and transfers to fix the inequities, and then let’s go to the Pareto optimal outcome it gives us.” This debate is one of the things that makes life interesting for economists, and for many, many others.

15.6 A Solved Problem

The Problem

Consider a pure exchange economy with two consumers, 1 and 2, and two goods, $x$ and $y$. Consumer 1’s initial endowment is $w_1 = (1, 0)$, that is, 1 unit of good $x$ and 0 units of good $y$. Consumer 2’s initial endowment is $w_2 = (0, 1)$, that is, 0 units of good $x$ and 1 unit of good $y$. Let $W = (w_1, w_2)$ represent the initial allocation. Consumer $i$’s utility function (for $i = 1, 2$) is $u_i(x_i, y_i) = x_i y_i$, where $(x_i, y_i)$ represents $i$’s consumption bundle.

(a) Show this economy (with some indifference curves and the initial endowments) in an Edgeworth box diagram.

(b) Write down the equations that describe the Pareto efficient allocations. Identify them in the Edgeworth box. Is the initial endowment point Pareto efficient? Why or why not?

(c) Calculate the competitive equilibrium of this pure exchange economy. You should indicate final consumption bundles for each agent, and the equilibrium prices $p_x^*$ and $p_y^*$. (Remember that you can normalize the price of one good to be 1.)

(d) For any pair of prices $(p_x, p_y)$, consumers 1 and 2 can figure their desired consumption levels, and the net amounts of good $x$ and good $y$ that they want to buy or sell. For example, consumer 1’s net demand for good $x$ will be $x_1 - 1$. (This is a negative number, meaning that he will want to sell some of his initial 1 unit of $x$.)
Similarly, consumer 2’s net demand for good $x$ will be $x_2 - 0$. Adding over both consumers gives the total net demand for good $x$, or the excess demand for $x$ measured in units of $x$. (This might be positive or negative. If it’s negative, there is excess supply of good $x$.) Multiplying by $p_x$ would give the excess demand for $x$ measured in dollars. Assume that $(p_x, p_y)$ is any pair of positive prices. (Note that this is any pair of prices, not just the competitive equilibrium prices.) Show that the sum of excess demand for good $x$ in dollars and excess demand for good $y$ in dollars must be zero. This kind of result was first formally established by Leon Walras, and is therefore called Walras’ Law. Walras’ Law can be put this way: the sum of market excess demands, over all markets, measured in currency, must be zero.

The Solution

(a) We will not draw the Edgeworth box diagram; it is very similar to Figure 15.8. However, we will describe the diagram. Please sketch the diagram as you read this, if you have not already drawn it. The Edgeworth box diagram is a square, one unit on each side. Consumer 1’s origin is the lower left-hand corner; consumer 2’s origin is the upper right-hand corner. The initial point $W$ is the lower right-hand corner of the box. Indifference curves are generally symmetric hyperbolas, symmetric around the diagonal of the box that goes from consumer 1’s origin to consumer 2’s origin, that is, from lower left to upper right. However, the indifference curves that go through the initial point $W$ are “degenerate” hyperbolas; this means that for consumer 1, for instance, the indifference curve through his initial bundle $(1, 0)$ is given by $x_1 y_1 = 0$; graphically, this is his horizontal axis plus his vertical axis.

(b) The Pareto optimal points in this example are points of tangency between indifference curves of the two consumers. Tangency requires that consumer 1’s marginal rate of substitution equal consumer 2’s marginal rate of substitution.
Consumer $i$’s marginal rate of substitution is

$$MRS_i = \frac{MU_x^i}{MU_y^i} = \frac{y_i}{x_i}.$$

Setting the two consumers’ marginal rates of substitution equal, and then substituting $1 - x_1$ for $x_2$ and $1 - y_1$ for $y_2$, gives

$$\frac{y_1}{x_1} = \frac{y_2}{x_2} = \frac{1 - y_1}{1 - x_1}.$$

This leads directly to $x_1 = y_1$. Therefore the set of Pareto optimal points, that is, the contract curve, is simply the upward-sloping diagonal of the box diagram, from consumer 1’s origin to consumer 2’s origin. The initial point $W$ is obviously not efficient; it’s not on the contract curve. In fact any move from $W$ into the interior of the Edgeworth box diagram would make both consumers better off.

(c) At a competitive equilibrium in the interior of an Edgeworth box diagram, the price ratio $p_x^*/p_y^*$, consumer 1’s marginal rate of substitution, and consumer 2’s marginal rate of substitution must all be equal. On the contract curve, where $MRS_1 = MRS_2$, we found that $x_1 = y_1$ must hold. Since $MRS_1 = y_1/x_1$, $MRS_1 = 1/1 = 1$ on the contract curve. Therefore $p_x^*/p_y^* = 1$ at the competitive equilibrium. We are free to set the price for one of the goods (the numeraire good) equal to 1. Let’s make good $y$ the numeraire good. Then $p_y^* = 1$, and since $p_x^*/p_y^* = 1$, therefore $p_x^* = 1$ also.

To find the exact location of the competitive equilibrium allocation, we note that the competitive equilibrium budget line must have slope $p_x^*/p_y^* = 1$ in absolute value, and must start at the initial allocation $W$, which is the lower right-hand corner of the box. Therefore the competitive equilibrium budget line is the diagonal of the box going from the lower right corner to the upper left corner. The competitive equilibrium will be at the intersection of the competitive equilibrium budget line and the contract curve, which we already noted was the lower left to upper right diagonal. It follows that the competitive equilibrium allocation is the exact center of the box. Consumer 1’s competitive equilibrium bundle is therefore $(1/2, 1/2)$.
And consumer 2’s competitive equilibrium bundle is also $(1/2, 1/2)$.

(d) Let $(p_x, p_y)$ be any pair of positive prices, and let $(x_1, y_1)$ and $(x_2, y_2)$ be the corresponding desired consumption bundles of the two consumers. (The assumption of positive prices guarantees that no one wants to consume an infinite amount of $x$ or $y$.) We will let $\$ED(x)$ represent the excess demand for $x$, measured in dollars, and similarly $\$ED(y)$ will represent excess demand for $y$, measured in dollars. Note that

$$\$ED(x) = p_x(x_1 - 1) + p_x(x_2 - 0).$$

The sum of excess demands for goods $x$ and $y$ is

$$\$ED(x) + \$ED(y) = p_x(x_1 - 1) + p_x(x_2 - 0) + p_y(y_1 - 0) + p_y(y_2 - 1) = (p_x x_1 + p_y y_1 - p_x) + (p_x x_2 + p_y y_2 - p_y).$$

But consumer 1’s budget constraint says $p_x x_1 + p_y y_1 = p_x \times 1 + p_y \times 0 = p_x$, and so the terms in the first set of parentheses sum to zero. Similarly, by consumer 2’s budget constraint, the terms in the second set of parentheses sum to zero. Therefore

$$\$ED(x) + \$ED(y) = 0,$$

which is Walras’ Law.

Exercises

1. There are two goods in the world, tiramisu ($x$) and espresso ($y$). Michael and Angelo both consider tiramisu and espresso to be complements; each will consume a slice of tiramisu only if it is accompanied by a cup of espresso, and vice versa. Michael has five slices of tiramisu and a cup of espresso. Angelo has a slice of tiramisu and five cups of espresso.

(a) Draw an Edgeworth box for this exchange economy. Label it carefully. Mark the original endowment point $W$.

(b) Draw Michael’s and Angelo’s indifference curves passing through the endowment point.

(c) Can you suggest a Pareto improvement over the original endowment? Mark the new allocation $W'$. How many slices of tiramisu and cups of espresso will each of them consume at the new allocation?

2. Ginger has a pound of sausages ($x_g = 1$) and no potatoes ($y_g = 0$), and Fred has a pound of potatoes ($y_f = 1$) and no sausages ($x_f = 0$).
Assume Ginger has the utility function $u_g = x_g^{\alpha} y_g^{1-\alpha}$, Fred has the same utility function, $u_f = x_f^{\alpha} y_f^{1-\alpha}$, and the parameter $0 < \alpha < 1$ is the same for Ginger and Fred. (You may remember that these are called “Cobb-Douglas utility functions.”)

(a) Show that the contract curve is the diagonal of the Edgeworth box.

(b) Show that at the competitive equilibrium

$$\frac{p_x}{p_y} = \frac{\alpha}{1 - \alpha}.$$

3. Consider an exchange economy with two goods, $x$ and $y$, and two consumers, Rin and Tin. Rin’s utility function is $u_r = x_r y_r$ and his endowment is $\omega_r = (2, 2)$. Tin’s utility function is $u_t = x_t y_t^2$ and his endowment is $\omega_t = (3, 3)$. Duncan suggests that there might be a competitive equilibrium at $(x_r', y_r') = (4, 1)$, $(x_t', y_t') = (1, 4)$, with prices $p_x = p_y = 1$.

(a) Does Duncan’s suggested equilibrium allocation have the right totals of the two goods $x$ and $y$? Explain.

(b) Is Duncan’s suggested equilibrium allocation a Pareto improvement over the endowment? Explain.

(c) Write down Rin’s budget constraint given these prices. Solve for Rin’s optimal consumption bundle, $(x_r^*, y_r^*)$.

(d) Write down Tin’s budget constraint given these prices. Solve for Tin’s optimal consumption bundle, $(x_t^*, y_t^*)$.

(e) Is Duncan right that these bundles and these prices make a competitive equilibrium? Explain.

4. Consider Rin and Tin from Question 3. We shall now solve this general equilibrium model. We are free to set one of the prices equal to 1. We will let good $x$ be the numeraire good; that is, we will set $p_x = 1$, and we will solve for the appropriate $p_y$.

(a) Write down Rin’s budget constraint. Solve for Rin’s optimal consumption bundle, $(x_r^*, y_r^*)$, with $x_r^*$ and $y_r^*$ figured as functions of $p_y$.

(b) Write down Tin’s budget constraint. Solve for Tin’s optimal consumption bundle, $(x_t^*, y_t^*)$, with $x_t^*$ and $y_t^*$ figured as functions of $p_y$.

(c) Write down the market-clearing (i.e., total demand = total supply) condition for $x$.
Using your answers from (a) and (b), rewrite the market-clearing condition as a function of $p_y$, and solve for $p_y$.

(d) Plug $p_y$ back into your answers from (a) and (b) to find the competitive equilibrium.

5. There are two goods in the world, milk ($x$) and honey ($y$), and two consumers, Milne and Shepard. Milne’s utility function is $u_m = x_m y_m^3$ and his endowment is $\omega_m = (4, 4)$. Shepard’s utility function is $u_s = x_s y_s$ and his endowment is $\omega_s = (0, 0)$. We will again let $p_x = 1$, and we will let $p_y$ vary.

(a) Is the original endowment Pareto optimal? Explain.

Suppose the dictator sets Milne’s lump-sum tax at $T_m = -4$, and Shepard’s lump-sum transfer at $T_s = 4$.

(b) Write down Milne’s new budget constraint. Solve for Milne’s optimal consumption bundle, $(x_m^*, y_m^*)$, with $x_m^*$ and $y_m^*$ figured as functions of $p_y$.

(c) Write down Shepard’s new budget constraint. Solve for Shepard’s optimal consumption bundle, $(x_s^*, y_s^*)$, with $x_s^*$ and $y_s^*$ figured as functions of $p_y$.

(d) Solve for the competitive equilibrium, i.e., market-clearing $(x_m^*, y_m^*)$ and $(x_s^*, y_s^*)$, and the equilibrium $p_y$.

(e) Prove that the new equilibrium allocation is Pareto optimal.

6. “To achieve an efficient allocation, lump-sum taxes on consumers’ endowments and per unit taxes on the prices of goods are equivalent.” Do you agree with this assertion? Explain using the welfare theorems in your arguments.

16 A Production Economy

16.1 Introduction

In the last chapter, we analyzed a model of a pure exchange economy. Although there were only two people, Robinson Crusoe and Friday, and only two goods, bread and rum, our model was a general equilibrium model. That is, it was a model that took everything into account simultaneously: Robinson’s preferences for bread and for rum, as well as Friday’s, and Robinson’s initial endowments of bread and rum, as well as Friday’s. To keep that model simple, there were only two people and two goods, and there was no production.
The quantities of the two goods were taken as given and fixed.

We now turn to another general equilibrium model, where everything is taken into account simultaneously. But in this model we will analyze production. To keep this model easy we will assume there is only one person in the economy, who functions both as a producer and as a consumer. We call that one person Robinson Crusoe. (The reader interested in literature may remember that in Defoe’s novel, Robinson is alone on the island for many years before Friday arrives. Our production model can be viewed as an economic analysis of work and consumption on the island, before Friday’s arrival.)

In our analysis of the pure exchange economy, we discussed Pareto optimality (or Pareto efficiency) and related concepts, and we analyzed market equilibria. We showed the crucial connections between Pareto optimality and the market, connections that are expressed in the first and second fundamental theorems of welfare economics. Recall that the first theorem says, roughly speaking, that a market equilibrium is Pareto optimal. The second says, roughly speaking, that any Pareto optimal allocation can be achieved with the market mechanism.

In this chapter, we will describe the production economy, and identify the Pareto optimal production outcomes in that economy. We will discuss market equilibria in the production economy. We will end the chapter with production versions of the first and second fundamental theorems of welfare economics, which will provide the connections between the market mechanism and efficiency in production.

16.2 A Robinson Crusoe Production Economy

Robinson Crusoe is now alone on the island; Friday has not yet arrived. Robinson spends his days working and resting. We let $l$ represent the time he spends working per day (or per week, month, or other time unit). This is his labor. Time spent resting we’ll call leisure, written $L$.
Since there are only 24 hours in the day, labor and leisure time are connected: l+L = 24. (You may recall this kind of analysis of a consumer’s consumption/leisure choice from Chapter 5.) We will assume for now that when Robinson works, he produces bread. We use x to represent a quantity of bread (per day). Near the end of this chapter we will complicate matters by introducing the other good rum, represented by y. But for now, we assume Robinson is only producing bread. As a consumer, Robinson has preferences for the two things he enjoys, leisure and bread. We’ll assume for simplicity that he cares about the time he spends working, l, but only insofar as it limits the time he spends resting, since l = 24−L. (Recall that this is what we assumed in Chapter 5.) Robinson’s preferences regarding leisure and bread might be represented by a utility function v(L, x). The indifference curves for this utility function are downward sloping, with the usual convexity. However, our analysis of Robinson’s behavior as a consumer and producer will be much easier if we get rid of leisure L in the utility function, and replace it with labor l. Of course, leisure is a good (Robinson prefers more of it), and labor is a bad (Robinson prefers less of it). Let u(l, x) represent Robinson’s utility from labor and bread. Note that for the utility function u(l, x), the marginal utility of bread, MUx = ∂u(l, x)/∂x, is positive, and the marginal utility of labor, MUl = ∂u(l, x)/∂l, is negative. Robinson prefers more bread and less work. Because of this, the indifference curves for the utility function u(l, x) will be upward sloping and convex, instead of downward sloping and convex as with the indifference curves for v(L, x). Figure 16.1 below shows two indifference curve graphs; the top one is for the v(L, x) utility function, and has standard downward sloping indifference curves; the bottom one is for the u(l, x) utility function, and has upward sloping indifference curves. 
INSERT FIGURE 16.1 HERE
Caption of Fig. 16.1: Robinson as a consumer.

Now let’s consider Robinson as a producer of bread. He has a production function $x = f(l)$. This shows what is technologically feasible for him: how many loaves of bread he can produce (at best) for a given number of hours of labor. If we graph the production function, we get a picture of the best levels of output (bread) Robinson can achieve for given levels of input (labor). These are the points on the production function itself. Such points are called technologically efficient.

The production function is graphed in Figure 16.2 below. Points below the graph of the production function are feasible, but technologically inefficient. These points are possible combinations of labor and bread, but not the best Robinson could do; that is, with the same amounts of labor he could produce more bread. Points above the graph of the production function are non-feasible; they are simply impossible given Robinson’s technology for producing bread.

INSERT FIGURE 16.2 HERE
Caption of Fig. 16.2: Robinson as a producer.

16.3 Pareto Efficiency

Recall the definitions of Pareto optimality and related concepts that we discussed in the last chapter. First, to be optimal or efficient, an alternative must be feasible. Second, it must be the case that nothing Pareto dominates it. When there are two or more people in the economy, and $A$ and $B$ are two feasible alternatives, we say that $A$ Pareto dominates $B$ if everybody likes $A$ at least as well as $B$, and at least one person likes it better. When there is just one person, Robinson, $A$ Pareto dominates $B$ if both are feasible and Robinson likes $A$ better. An alternative is Pareto optimal or Pareto efficient if it’s feasible and there are no other feasible alternatives that Robinson likes better. In short, in a one-person economy, the Pareto optimal or efficient outcomes are simply the feasible outcomes that Robinson likes best.
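To make “the feasible outcome Robinson likes best” concrete, here is a small numerical sketch. The functional forms are assumptions chosen only for illustration and do not come from the text: a production function $f(l) = 4\sqrt{l}$ and a utility function $u(l, x) = x - l^2/8$, in which bread is a good and labor is a bad. Since any point below the production function is beaten by the point directly above it, the search only looks along $x = f(l)$, using a simple ternary search over $0 \le l \le 24$.

```python
import math


def f(l):
    """Assumed production function: loaves of bread from l hours of labor."""
    return 4.0 * math.sqrt(l)


def u(l, x):
    """Assumed utility: increasing in bread x, decreasing in labor l."""
    return x - l**2 / 8.0


def best_feasible_point(lo=0.0, hi=24.0, iters=200):
    """Ternary search for the labor level that maximizes u(l, f(l)) on [lo, hi].

    Relies on u(l, f(l)) being single-peaked in l, which holds for the assumed forms.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if u(m1, f(m1)) < u(m2, f(m2)):
            lo = m1  # the peak lies to the right of m1
        else:
            hi = m2  # the peak lies to the left of m2
    l_star = (lo + hi) / 2
    return l_star, f(l_star)


l_star, x_star = best_feasible_point()
print(l_star, x_star)
```

Under these assumed forms the search settles at approximately $l = 4$ hours of labor and $x = 8$ loaves, which is what setting the derivative of $4\sqrt{l} - l^2/8$ to zero gives by hand.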
Figure 16.3 below shows the (unique) efficient alternative in this simple economy. It is simply the point on the production function that maximizes Robinson’s utility. This is the point where the production function curve is tangent to the highest indifference curve, the point (l∗, x∗) in Figure 16.3.

INSERT FIGURE 16.3 HERE
Caption of Fig. 16.3: The Pareto efficient production/consumption point.

Figure 16.3 reveals the conditions that must hold for an interior (l∗ > 0 and x∗ > 0) Pareto optimal point in this simple economy. First, (l∗, x∗) must be technologically efficient. It must be on the production function: not below it, which would make it feasible but inefficient, and not above it, which would make it impossible, or non-feasible. It must satisfy the equation x = f(l).

Second, it must be a tangency point between the production function and an indifference curve. Very loosely speaking, the slope of an indifference curve is the marginal rate of substitution of bread for labor. We say “very loosely speaking” because in this model, labor l is a bad rather than a good, and this means the marginal rate of substitution will not have the usual sign. Remember that when we are looking at two goods, the marginal rate of substitution of the second good for the first good is the amount of the second good we would have to give the consumer to compensate for his consuming one less unit of the first good. Under our definition of MRS in Chapter 2, for two goods x1 and x2, MRSx1,x2 = MU1/MU2, which is positive. But since labor is a bad, the marginal rate of substitution of bread for labor, or MRSl,x = MUl/MUx, is negative. That is, to compensate Robinson for working one hour less, we would have to give him a negative quantity of bread.

In Figure 16.3, Robinson’s indifference curves are positively sloped. The slope of an indifference curve at any given point equals −1 times the MRS at that point, or

Indifference Curve Slope = −MRSl,x = −MUl/MUx > 0.
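For readers who want the intermediate step, the slope formula follows from holding utility constant along an indifference curve. The following is a standard total-differential sketch; the symbol \bar{u} for the fixed utility level is introduced here only for illustration.

```latex
% Along an indifference curve u(l, x) = \bar{u}, the total differential is zero.
\[
MU_l \, dl + MU_x \, dx = 0
\quad\Longrightarrow\quad
\left.\frac{dx}{dl}\right|_{u = \bar{u}}
= -\frac{MU_l}{MU_x}
= -MRS_{l,x} > 0 ,
\]
% positive because MU_l < 0 (labor is a bad) and MU_x > 0 (bread is a good).
```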
The slope of the production function is MP(l) = df(l)/dl. Therefore the optimal point (l∗, x∗) must satisfy the equation −MRSl,x = MP(l).

If there were a social welfare optimizer in heaven overseeing this simple Robinson Crusoe production economy, we now know what she would like to do. She would want to bring about the allocation (l∗, x∗) that maximizes Robinson’s utility, subject to his technological constraint. She would like to solve the two equations above, x = f(l) and −MRSl,x = MP(l). Could she do this with the market mechanism?

16.4 Walrasian or Competitive Equilibrium

We will now bring the market to Robinson. Of course this seems artificial and bizarre, something only crazy economists would want to do, since Robinson can get along fine without prices, profit maximization, budget constraints, and so on! But we do it anyway because we want to model how the market works, and our particular story is odd only because we chose to develop the simplest possible production model, with just one person, who is simultaneously the producer and consumer of everything.

So we will assume there is a market for bread and a market for labor. Robinson as a worker sells his labor in the labor market. Robinson as a consumer buys his bread in the bread market. Robinson as an entrepreneur is 100 percent owner of a bread-producing firm, called “Robby’s Natural Breadworks,” or “Robby’s” for short. Robby’s buys labor on the labor market (from Robinson, of course); it sells bread on the bread market (to Robinson, of course), and if it has profits, those profits go to its owner, namely Robinson. We will also assume that the markets for bread and labor are competitive; that is, we assume that the parties all take prices as given and fixed. We let px be the price (per loaf) of bread, and we let w be the price (per hour) of labor. The reader may recall from the last chapter that since only relative prices matter in a general equilibrium model, one of the prices can be set equal to 1.
The good whose price is set at 1 is called the numeraire. In order to simplify our Robinson production model slightly, we will now make bread the numeraire good. That is, we set px = 1. Later on, when we introduce a second good, we will go back to using px for the price of bread, and we will then make labor the numeraire good. Let’s now discuss what our players want to do in this market economy.

The firm. Robby’s Natural Breadworks wants to maximize profit. Its revenue is pxx = 1x = x, where x is the amount of bread it supplies on the market (per day or per unit time). Its cost is wl, where w is the wage and l is the number of hours of labor it is buying on the labor market (per day or per unit time). Its profit is πR = x − wl. It wants to maximize profit given its production technology, that is, its production function x = f(l). Any profit that Robby’s makes is immediately paid to its owner, Robinson.

In Figure 16.4 below, we show the profit maximization problem of the firm, and how that problem is solved. Note that the vertical axis is now “x or $” instead of just “x.” This is because we are assuming x is a numeraire good, with price equal to 1. The figure includes the production function x = f(l), and it also includes something new: isoprofit lines. An isoprofit line is a locus of input-output combinations in the figure that all produce the same level of profit. (Recall that “iso” means “the same,” as in “isoquant” in the theory of the firm.) The equation for an isoprofit line is x − wl = πR = a constant. We can rewrite this as x = wl + πR. And we can easily see from the latter equation that the slope of an isoprofit line is w, and the intercept of the isoprofit line on the vertical (bread) axis is the constant πR. In the figure, the higher isoprofit lines correspond to higher profit levels. Robby’s wants to get to the highest isoprofit line in Figure 16.4.
The solution to the profit-maximization problem is shown in the figure as (lD, xS). lD is Robby’s demand for labor, and xS is Robby’s supply of bread. (Both are per day or per unit time.) At the desired point (lD, xS), the production function is tangent to an isoprofit line. That is, their slopes are equal, and therefore w = df(l)/dl = MP(l). Note that Robby’s profit level at (lD, xS) is xS − wlD = π∗R. Here π∗R represents Robby’s maximum profit level.

INSERT FIGURE 16.4 HERE
Caption of Fig. 16.4: The firm maximizes profit at (lD, xS), at which point profit equals π∗R.

The consumer. Now let’s turn to Robinson. He wants to maximize his utility subject to his budget constraint. His utility function is u(l, x). His budget constraint says that what he wants to spend, 1x = x, must be less than or equal to his income. We will assume that he always wants more bread, and so he will spend all his income. His total income is the sum of his earnings as a worker, and his income as owner of Robby’s. In short, Robinson wants to solve the following problem: max u(l, x) subject to x = wl + πR.

To solve this utility maximization problem, Robinson looks for a tangency between one of his indifference curves and his budget line. Since the slope of an indifference curve is −MRSl,x and since the slope of the budget line is w, the tangency condition is −MRSl,x = w.

To connect Robinson the consumer with Robby’s the firm, we replace the general term πR in Robinson’s budget constraint with the maximum profit level of Robby’s, namely π∗R = xS − wlD. This gives Robinson the budget constraint x = wl + π∗R. In Figure 16.5 below, we show Robinson’s budget line, and one of his indifference curves tangent to his budget line. Note that his budget line, based on his budget constraint x = wl + π∗R, is exactly the same line as Robby’s maximum profit isoprofit line x − wl = π∗R. The optimal point in the figure for Robinson, the tangency point, is the point (lS, xD).
In short, Robinson wants to supply lS hours of labor (per day or per unit time), and he wants to consume xD loaves of bread (per day or per unit time).

INSERT FIGURE 16.5 HERE
Caption of Fig. 16.5: Robinson the consumer maximizes utility at (lS, xD), while Robby’s the firm wants to maximize profit at (lD, xS).

The reader who looks at Figure 16.5 for a moment should see a giant problem. The numbers don’t add up. The amount of labor demanded by Robby’s the firm exceeds the amount of labor supplied by Robinson the worker. That is, there is excess demand for labor. The amount of bread supplied by Robby’s the firm exceeds the amount demanded by Robinson the consumer. That is, there is excess supply of bread. In a real economy, excess demand for labor should make the wage rate rise, and excess supply of bread should make the price of bread fall. In our model, where bread is the numeraire good, the price of bread is fixed at $1 per unit, but the wage rate w should rise. And so the isoprofit lines in Figure 16.4, and the budget line in Figure 16.5, should all get steeper. (Their slopes equal w, which should be increasing.) The slopes of those lines should all change until there is neither excess demand for nor excess supply of either labor or bread.

Let us now define a Walrasian or competitive equilibrium in our simple production economy. This is a list of prices, w for labor and px = 1 for bread, an input-output vector for the firm (lD, xS), and a labor-supply and consumption vector (lS, xD) for the consumer, such that:

1. Given the prices, the firm maximizes profits at (lD, xS). That is, (lD, xS) solves the following problem: max πR = x − wl subject to x = f(l).

2. Given the prices, and given his budget constraint, the consumer maximizes utility at (lS, xD). That is, (lS, xD) solves the following problem: max u(l, x) subject to x = wl + π∗R, where π∗R represents the firm’s maximum profit level.

3. There is no excess demand and no excess supply, of labor or bread: lD = lS and xD = xS.

Figure 16.5 above shows a non-equilibrium situation; it fails point 3 in the definition of a competitive equilibrium. In Figure 16.6 below, we modify the budget line of Figure 16.5. We make it steeper (that is, we increase its slope w), so that the point chosen by Robby’s the firm (lD, xS) and the point chosen by Robinson the consumer (lS, xD) just coincide. In Figure 16.6, there is neither excess demand for nor excess supply of either labor or bread. In short, Figure 16.6 shows a competitive equilibrium.

As the reader can plainly see in the figure, the competitive equilibrium point (l∗, x∗) = (lD, xS) = (lS, xD) is Pareto optimal. In fact, in this simple economy, there is only one Pareto optimal point and only one competitive equilibrium point, and they are the same point.

INSERT FIGURE 16.6 HERE
Caption of Fig. 16.6: A Walrasian or competitive equilibrium at (lD, xS) = (lS, xD) = (l∗, x∗).

16.5 When There are Two Goods, Bread and Rum

Before turning to the production versions of the two fundamental theorems of welfare economics, we complicate our Robinson Crusoe model slightly by putting another consumption good into the picture. We now assume that Robby’s adds a second good to its (short) line of products, namely rum, and changes its name to Robby’s Natural Bread And Rum Works. It is still Robby’s for short. As before, l is a quantity of labor, x is a quantity of bread, and now y is a quantity of rum. We now have three prices, the wage rate w, the price of bread px and the price of rum py. In the previous sections of this chapter we let bread be the numeraire; that is, we assumed px = 1 for simplicity. In this section, we will let labor be the numeraire; that is, we will set w = 1, and let px and py vary. As before, we have one consumer/worker, namely Robinson.

The firm. As before, Robby’s wants to maximize profit.
Its revenue is now pxx + pyy, and its cost is wl = l. (All the units, as usual, are per day or per unit time.) Its profit is πR = pxx + pyy − l. It wants to maximize profit given its production technology. Any profit that Robby’s makes is immediately paid to its owner, Robinson.

Describing Robby’s production technology is slightly tricky, because we now have 1 input and 2 outputs. The familiar picture of a production isoquant from Chapter 9 is based on 2 inputs and 1 output, and Figures 16.2 through 16.6 in this chapter are based on 1 input and 1 output. The alert reader may remember that we discussed a 1 input, 2 output model in Chapter 8, Section 4. But here’s a direct and easy way to think about this problem: Suppose Robby’s has 2 sheds; one is for baking and the other is for brewing. Robinson the worker spends part of each day in the baking shed, and part of each day in the brewing shed. Each of these activities has a production function like the one pictured in Figures 16.2 through 16.6; the functions are concave, meaning that both activities exhibit decreasing returns to scale.

Now let’s fix Robinson’s total daily work hours at some number, like l0 = 8 hours per day. Quantities that are contingent on the assumed l0 will be called provisional. Let’s think about the possible maximum combinations of bread and rum that Robinson the worker might produce, given that he’s around for only 8 hours. That is, if he produced 0 loaves of bread, what’s the most rum he could produce? If he produced 1 loaf of bread, what’s the most rum he could produce? And so on. The answers to these questions would give a locus of points in a graph with bread x on one axis and rum y on the other axis. The usual convention is to put x on the horizontal axis and y on the vertical axis of such a graph. This locus of points is called Robby’s production frontier. Now the question is: What is the curvature of Robby’s production frontier?
The reader should be able to convince herself that if the separate production functions for bread and rum are concave, then the production frontier will also be concave. Note that the production frontier as we have defined it depends on the given input level (that is, l0 = 8); in other words, it is provisional, and a different input level would give a different provisional production frontier.

Figure 16.7 below is a graph with bread x on the horizontal axis and rum y on the vertical axis. The figure includes Robby’s provisional production frontier, a concave curve. Robby’s would never want to be below the production frontier, because below it the company could make more bread and more rum, for the given labor total of 8 hours per day. Points below the frontier are technologically inefficient. And Robby’s cannot get above the frontier with the given labor total; points above the frontier are non-feasible.

Now let’s consider Robby’s provisional profit maximizing problem. At this point we are only analyzing Robby’s choice of 2 variables, x and y, for a given level of the third variable l0. Robby’s wants to maximize its profit πR = pxx + pyy − l, but for a given l = l0, maximizing profit is the same as maximizing revenue pxx + pyy. In Figure 16.7 we have plotted two isoprofit lines. On each of these lines revenue and therefore profit is constant. They are straight lines with slope −px/py (that is, px/py in absolute value). The lower one is identified as “an isoprofit line”, and the higher one is identified as “Robby’s best isoprofit line”. Robby’s wants to maximize profit, and so it wants to get to the highest isoprofit line it can, subject to its production frontier. Given the prices px and py, and given the l0 value of 8 hours per day we started with, (xS, yS) represents the amounts of bread and rum that Robby’s wants to supply in the market.

We will let π∗R represent Robby’s profit level at the best-of-all isoprofit line. Note that π∗R = pxxS + pyyS − l0.
Also note the equation for the best isoprofit line is pxx + pyy = pxxS + pyyS.

INSERT FIGURE 16.7 HERE
Caption of Fig. 16.7: With two goods, bread and rum, and for a given amount of labor l0, the provisional Walrasian or competitive equilibrium is at (xS, yS) = (xD, yD) = (x∗, y∗).

The consumer. Now let’s turn back to Robinson the consumer. He wants to maximize his utility subject to his budget constraint. His utility function is u(l0, x, y). (Remember we are holding l constant at l0, and so this utility maximization will be provisional.) His budget constraint says that what he wants to spend, pxx + pyy, must be less than or equal to his income. We will assume that he always wants more bread and rum, and so he will spend all his income. His total income is the sum of his earnings as a worker, and his income as owner of Robby’s. In short, Robinson wants to maximize u(l0, x, y), subject to pxx + pyy = l0 + πR.

To solve this provisional utility maximization problem, Robinson looks for a tangency between one of his indifference curves and his budget line.

To connect Robinson the consumer with Robby’s the firm, we replace the general term πR in Robinson’s budget constraint with the maximum profit level of Robby’s, namely π∗R = pxxS + pyyS − l0. This gives Robinson the budget constraint

pxx + pyy = l0 + π∗R = l0 + pxxS + pyyS − l0 = pxxS + pyyS.

But this is precisely the same equation as the equation for Robby’s best isoprofit line. Therefore Robinson’s budget line, and Robby’s best isoprofit line, are one and the same line, the straight line tangent to the production frontier in Figure 16.7 above. We call Robinson’s utility-maximizing consumption bundle (xD, yD). At this consumption bundle, one of his indifference curves must be tangent to his budget line. We use the D superscripts because the bundle (xD, yD) represents Robinson’s provisional levels of demand for bread and rum in the market.
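The firm’s side of the provisional problem can be illustrated with a small numerical sketch. Everything specific here is an assumption made only for illustration: the two shed technologies x = √lx and y = √ly (the same form used in Exercise 4), the prices px = 3 and py = 4, and the helper names. With these technologies the provisional frontier is x^2 + y^2 = l0, a concave curve, and the sketch checks that at the revenue-maximizing point the slope of the frontier equals the slope of the isoprofit line.

```python
import math

L0 = 8.0            # provisional labor total used in the text
PX, PY = 3.0, 4.0   # hypothetical prices, chosen only for illustration

def outputs(l_bread, l_total=L0):
    """Bread and rum when l_bread hours go to baking and the rest to brewing.

    Assumed technologies: x = sqrt(l_x), y = sqrt(l_y).
    """
    return math.sqrt(l_bread), math.sqrt(l_total - l_bread)

def revenue(x, y, px=PX, py=PY):
    """Revenue px*x + py*y; for fixed labor, maximizing this maximizes profit."""
    return px * x + py * y

def provisional_supply(steps=80000):
    """Grid search over the labor split for the revenue-maximizing frontier point."""
    splits = (L0 * i / steps for i in range(steps + 1))
    best_split = max(splits, key=lambda lb: revenue(*outputs(lb)))
    return outputs(best_split)

x_s, y_s = provisional_supply()
frontier_slope = -x_s / y_s      # dy/dx along the frontier x**2 + y**2 = L0
isoprofit_slope = -PX / PY       # slope of every isoprofit line
profit = revenue(x_s, y_s) - L0  # labor is the numeraire, so cost is L0

print(x_s, y_s)
print(frontier_slope, isoprofit_slope)
print(profit)
```

The demand side would require an assumed utility function as well, so it is left out; the point of the sketch is only the tangency between the production frontier and the best isoprofit line.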
We now see that, contingent on l0, there is a supply bundle (xS, yS) that Robby’s intends to supply to the market, and there is a demand bundle (xD, yD) that Robinson wants to buy in the market. If these bundles are different, then supply and demand for bread are inconsistent, and there must be excess demand for one of the goods and excess supply of the other good. (This would be a 2-output analogy to what we saw in Figure 16.5, the 1-output model.) With excess demand in one market and excess supply in the other market, px/py must change. The price ratio would continue to change until there is no excess demand, and no excess supply.

When there is no excess supply and no excess demand, the supply and demand bundles coincide. That is, (xS, yS) = (xD, yD) = (x∗, y∗). This is what we show in Figure 16.7. In that figure we show what must hold in a provisional Walrasian or competitive equilibrium when there are 2 goods. The reader can plainly see in the figure that, contingent on the assumed l0, the competitive equilibrium point (x∗, y∗) is Pareto optimal. It is the best Robinson can do, given l0, and given the available production technology. In fact, in this economy, as in the 1-input, 1-output version of the Robinson Crusoe model discussed in previous sections of this chapter, there is only one Pareto optimal point and only one competitive equilibrium point, and they are the same point.

Up to this point in this section we assumed a fixed amount of labor l0, and so the quantities of bread and rum were all provisional. We conclude by briefly sketching how to solve for the equilibrium quantity of labor. This would allow us to transform provisional equilibria into genuine equilibria. Here is how the argument goes, in rough terms:

In the previous paragraphs, based on a given l0, we derived a provisional competitive equilibrium (x∗, y∗). Now imagine doing this analysis over and over, with different labor quantities.
For each l we start with, we derive a competitive equilibrium consumption bundle, which we now write (x∗(l), y∗(l)), to show that it depends on l. Robinson’s utility function becomes a function of l alone; that is,

u(l, x, y) = u(l, x∗(l), y∗(l)) = u(l).

Robinson’s utility function has been reduced to a function of just one variable. Under the assumptions we have made in this chapter, u(l) is a concave function and has a unique maximum. Let l∗ be the labor quantity that maximizes u(l). The triple (l∗, x∗(l∗), y∗(l∗)) is the absolute best combination of work and consumption for Robinson. It is the unique Pareto optimal consumption bundle of work, bread and rum.

What about prices and the competitive equilibrium? We continue to assume that labor is the numeraire good, and so w = 1. To find the price ratio px/py, we look at the slope of the production frontier, based on l∗, where that production frontier is tangent to an indifference curve, also based on l∗. This is similar to our discussion at Figure 16.7. Multiplying that slope by −1 gives px/py. Finally, to connect the price of labor (w = 1) and the separate prices for bread and rum (px, py), we set −1 times Robinson’s marginal rate of substitution of bread for labor, evaluated at the point (l∗, x∗(l∗), y∗(l∗)), equal to the price ratio w/px = 1/px. This allows us to find px, and then to find py. Once we have all three prices, (w = 1, px, py) for labor, bread and rum, respectively, we show that Robinson is maximizing his utility subject to his budget constraint at (l∗, x∗(l∗), y∗(l∗)), and that Robby’s is maximizing its profit at the same point.

In conclusion, we start by finding the l∗ that maximizes Robinson’s utility, and the corresponding amounts of bread and rum (x∗(l∗), y∗(l∗)). These constitute the unique Pareto optimum combination of labor, bread and rum. Then we find the right prices (w = 1, px, py).
With these prices, (l∗, x∗(l∗), y∗(l∗)) is the unique competitive equilibrium combination of labor, bread and rum. It is a genuine equilibrium, not a provisional equilibrium. And in this economy, the competitive equilibrium vector of labor, bread and rum, and the Pareto optimum, are one and the same point.

16.6 The Two Welfare Theorems Revisited

The reader may recall the two fundamental theorems of welfare economics from Chapter 15 on the exchange economy model. The first fundamental theorem said that a competitive equilibrium allocation in an exchange economy is Pareto optimal. The second fundamental theorem said that if indifference curves are convex, then any Pareto optimal allocation can be achieved by the market, provided the market is modified with taxes and transfers. Figures 16.6 and 16.7 above indicate that something similar must be true in the production model, since in those figures, the competitive equilibrium is Pareto optimal (theorem 1), and the (unique) Pareto optimal point is a competitive equilibrium (theorem 2).

In this section, we will carefully state the two welfare theorems for economies with production. We will state these theorems for models that are more general than the one-firm, one-consumer, one-input, one(or two)-output model we have been examining. Now let us assume that there are any number of consumers and any number of firms. The firms are owned by the consumers; each consumer has shares in the various firms. For example, Robinson may own 3 shares out of 100 (or 3 percent) of firm 1, 20 shares out of 200 (or 10 percent) of firm 2, and so on. We assume that the consumers own various inputs (like labor l) that they sell on input markets to the firms. We assume that each firm uses various inputs to produce some output (like bread x), which it sells to consumers on a market for that product. All the markets are competitive, which means that all agents, both consumers and firms, take prices as given.
We assume that each consumer’s utility depends only on his own bundle of goods, and that each firm’s output depends only on the inputs which it is using. For the second fundamental theorem, we also assume that all the consumers’ indifference curves are convex, and that the firms’ production technologies are also convex. (For a firm that only uses one input, like Robby’s Natural Breadworks, this means the production function is concave, as in Figure 16.2. That is, it satisfies the assumption of diminishing returns to scale.)

In this more general production economy model, a Walrasian or competitive equilibrium is a list of prices for inputs and outputs, a set of planned input-output vectors, one for each firm, and a set of planned consumption bundles (including intended supplies of inputs like l), one for each consumer, such that:

1. Given the list of prices, each firm is maximizing profits at its planned input-output vector. The firm distributes those profits to its shareholders, according to their ownership percentages.

2. Given the list of prices, and given the profits received from the firms in which he owns shares, each consumer is maximizing utility subject to his budget constraint, at the planned consumption bundle.

3. At these prices, total supply equals total demand for every good.

In the more general production economy model, Pareto optimality is defined in much the same way as it is in the pure exchange model. That is, we first define what is feasible. This depends on the initial allocation of all the goods to the consumers, including the various goods that are used as inputs by the firms. It also depends on the production technologies of the firms. Then, if A and B are both feasible, we say that A Pareto dominates B if everyone likes A at least as well as B, and at least one person likes it better. Finally, a feasible allocation is Pareto optimal or efficient if there is nothing feasible that Pareto dominates it.
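The definition of Pareto dominance translates almost word for word into a comparison of utility lists. The sketch below is illustrative only: it assumes each feasible alternative has already been summarized by one utility number per consumer, and the alternatives A, B, C with their utility values are made up for the example.

```python
def pareto_dominates(utils_a, utils_b):
    """True if A Pareto dominates B.

    utils_a[i] and utils_b[i] are consumer i's utility at alternatives A and B;
    both alternatives are assumed to be feasible.
    """
    if len(utils_a) != len(utils_b):
        raise ValueError("both alternatives must list a utility for every consumer")
    everyone_at_least_as_well = all(a >= b for a, b in zip(utils_a, utils_b))
    someone_strictly_better = any(a > b for a, b in zip(utils_a, utils_b))
    return everyone_at_least_as_well and someone_strictly_better

def pareto_optimal(feasible):
    """Names of the feasible alternatives that no other feasible alternative dominates."""
    return [
        name
        for name, utils in feasible.items()
        if not any(
            pareto_dominates(other_utils, utils)
            for other_name, other_utils in feasible.items()
            if other_name != name
        )
    ]

# Hypothetical two-consumer example: utilities are (consumer 1, consumer 2).
feasible = {"A": [3, 5], "B": [3, 4], "C": [6, 1]}
print(pareto_dominates(feasible["A"], feasible["B"]))
print(pareto_optimal(feasible))
```

Note that the test looks only at consumers’ utilities; firms’ profits never enter the comparison directly.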
As an aside, it is interesting to note that the concept of Pareto optimality in a general equilibrium model of an economy with firms and consumers ultimately only looks at the welfare of the consumers. It does not look at the welfare, or the profit levels, of the firms. Those profits flow back to the owners of the firms, and those owners are consumers. The reader may recall that partial equilibrium analysis, done in a market for a single good, adds together consumers’ surplus and producers’ surplus, which seems to imply that society should place some weight on the welfare (i.e., profitability) of firms. But general equilibrium analysis treats firms as producers of goods rather than money. Certainly they produce profits, and they should be profit maximizers, but those profits flow right back to the owners/consumers. In general equilibrium analysis, the ultimate purpose of firms, their reason for being, is to expand the set of things that are feasible so that consumers can achieve higher utility levels. We now turn to the two fundamental theorems of welfare economics for an economy with production and consumption, with any number of firms, consumers, and goods. We are making all the assumptions as listed above, and we are only repeating the crucial assumptions. First fundamental theorem of welfare economics. Suppose there are markets and market prices for all the inputs and outputs, that is, all the goods. Then any competitive equilibrium is Pareto optimal. Second fundamental theorem of welfare economics. Suppose there are markets and market prices for all the inputs and outputs, that is, all the goods. Suppose further that all the consumers have convex indifference curves, and that all the firms have convex technologies. Suppose there is a Pareto optimal allocation that is society’s Target. 
Then there is a vector of lump-sum taxes and transfers in the numeraire good, which sum to zero, such that when budget constraints are modified with these taxes and transfers, the Target is the resulting competitive equilibrium allocation.

As with the first and second fundamental theorems of welfare economics in the pure exchange model of the last chapter, the moral is simple: society should aim for Pareto optimality. In a competitive economy with producers and consumers, the market mechanism gives us Pareto optimality. That’s the first fundamental theorem. If the untouched competitive equilibrium is very unfair (if it makes some people very rich and many people very poor), don’t throw the baby out with the bath water. The market mechanism can be modified in a relatively minor way, by appropriate taxes and transfers, so that the modified competitive equilibrium is any Pareto optimal allocation we might want. That’s the second fundamental theorem.

16.7 A Solved Problem

The Problem

Robby’s production function for transforming labor l into bread x is given by x = 12√l. Robinson Crusoe’s utility function for labor and bread is u(l, x) = x − l^2/9. Robinson Crusoe owns Robby’s, and both Robinson the consumer and Robby’s the firm are price takers.

(a) Show that at the Pareto efficient allocation Robinson works 9 hours per day.
(b) Find the market equilibrium allocation, and explain why it is Pareto efficient.
(c) Find the equilibrium market prices and the equilibrium profit level for Robby’s.

The Solution

(a.1) Here is a direct, brute force approach. In order to find the efficient outcome, we substitute the production function x = 12√l into the utility function, making utility a function of only one variable, l:

u(l) = x − l^2/9 = 12√l − l^2/9.

Next we differentiate with respect to l and set the result equal to zero:

(12 l^(−1/2))/2 − 2l/9 = 0.

Rearranging terms gives l^(3/2) = 27, and therefore l∗ = 9.
Substituting this into the production function then gives x∗ = 36.

(a.2) Here is a less direct, but more intuitive, graphical story. Consider Figure 16.8, which is really Figure 16.3 with some new labels. We are looking for the utility-maximizing point (l∗, x∗), where an indifference curve and the production function are tangent. The tangency condition says the absolute value of the marginal rate of substitution of bread for labor, or −MRSl,x, should equal the marginal product of labor in the production of bread, or MP(l). This gives

−MRSl,x = −MUl/MUx = 2l/9 = MP(l) = (12 l^(−1/2))/2.

A little rearranging gives l^(3/2) = 27, which leads to l∗ = 9. Substituting into the production function gives x∗ = 36.

INSERT FIGURE 16.8 HERE
Caption of Fig. 16.8: The Pareto efficient production/consumption point (l∗, x∗).

(b) In this simple economy, for any pair of prices (w, px) there will be one and only one line representing both Robinson’s budget line and Robby’s highest isoprofit line. (See Figure 16.5.) Given this line, Robinson the consumer will want a point (lS, xD), and Robby’s the firm will want a point (lD, xS). In order to have a competitive equilibrium, with supply and demand equal in both markets, (lS, xD) and (lD, xS) must coincide. (See Figure 16.6.) Therefore our market equilibrium must be at the point (l∗, x∗) = (9, 36) in Figure 16.8. But this is the Pareto efficient point. (And of course the first fundamental theorem of welfare economics says the competitive equilibrium must be Pareto optimal.)

(c) Let’s set the price of bread equal to 1, so that bread is the numeraire good. The slope of Robinson’s budget line and the slope of Robby’s isoprofit line both equal the price ratio w/px = w/1 = w. But at the market equilibrium, the budget line, the isoprofit line, and the tangent line to the production function are all the same. The slope of the tangent line to the production function at (l∗, x∗) is MP(l∗) = (12 (l∗)^(−1/2))/2 = 6 × 9^(−1/2) = 2.
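As a rough numerical cross-check of parts (b) and (c), the wage adjustment story of Section 16.4 can be run on this example. The sketch assumes bread is the numeraire, derives labor demand from w = MP(l) = 6/√l and labor supply from −MRSl,x = 2l/9 = w, and then searches by bisection for the wage with zero excess demand for labor. The function names and the search interval [1.5, 3] are illustrative choices, not part of the text.

```python
import math

def labor_demand(w):
    """Robby's labor demand: solve w = MP(l) = 6 / sqrt(l) for l."""
    return (6.0 / w) ** 2

def labor_supply(w):
    """Robinson's labor supply: solve -MRS = 2l/9 = w for l.

    Utility is linear in bread here, so profit income does not shift labor supply.
    """
    return 4.5 * w

def excess_labor_demand(w):
    return labor_demand(w) - labor_supply(w)

def equilibrium_wage(lo=1.5, hi=3.0, tol=1e-10):
    """Bisection: excess demand is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess_labor_demand(mid) > 0:
            lo = mid   # wage too low: the firm wants more labor than is supplied
        else:
            hi = mid   # wage too high: Robinson offers more labor than is wanted
    return (lo + hi) / 2.0

w_eq = equilibrium_wage()
l_eq = labor_demand(w_eq)
x_eq = 12.0 * math.sqrt(l_eq)
profit_eq = x_eq - w_eq * l_eq

print(w_eq, l_eq, x_eq, profit_eq)
```

The bisection mimics the verbal argument: with excess demand for labor the wage rises, with excess supply it falls, and the process stops where lD = lS.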
Therefore the equilibrium wage is w = 2. 16 A Production Economy 315 Robby’s equilibrium profit is piR = 1 × 36− 2 × 9 = 18. Robby’s sends this profit to its shareholder Robinson, whose budget constraint now says x = 2l+ 18. At (lS, xD) = (l∗, x∗) = (9, 36), Robinson’s income is 2 × 9 = 18 from his wages as a worker, plus 18 from his ownership of Robby’s, for a total of 36, all of which he spends on 36 loaves of bread. 16 A Production Economy 316 Exercises 1. Robinson’s technology for producing coconuts (x) is represented by x = √l, where l is labor, in hours per day. His preferences for coconuts and labor are given by the utility function u(l, x) = x − l/2. Assume Robinson is the only consumer of coconuts, and the owner of the only firm which produces coconuts. Suppose the price of coconuts is set at 1. (a) Calculate the Pareto efficient allocation of this simple production economy. (b) Derive the competitive equilibrium of this economy. Find Robinson’s consumption of coconuts, his labor supply, the market wage rate, and the firm’s profits. 2. Consider Robinson from Question 1. Suppose his technology for producing coconuts (x) changes to x = l2/3. His utility function remains the same: u(l, x) = x − l/2. (a) Calculate the new Pareto efficient allocation. (b) Derive the competitive equilibrium of this economy. Find Robinson’s consumption of coconuts, his labor supply, the market wage rate, and the firm’s profits. 3. Remy produces omelettes (x) according to the technology x = √l + 1. He derives the following utility from his consumption of omelettes (x) and his labor (l): u(l, x) = x−2l2. Remy is the only consumer and producer of omelettes in his household. (a) Show that Remy consumes 1.5 omelettes at the Pareto efficient allocation. (b) Suppose the price of omelettes is set at 1. Find the market equilibrium allocation and wage rate. 4. 
Wendy’s technology for studying chapters of economics ($x$) and chapters of mathematics ($y$) is given by $x = \sqrt{l_x}$ and $y = \sqrt{l_y}$ respectively, where $l_x$ and $l_y$ are hours per day spent studying economics and mathematics, respectively. Her utility function is $u(l_x, l_y, x, y) = xy - \sqrt{l_x + l_y}$. Suppose that Wendy has decided to study for a total of four hours per day.
(a) How many hours should she spend on economics? How many hours on mathematics?
(b) How many chapters of each subject does she study?
(c) Calculate her utility.
(d) How does her utility change if she decides to double the number of hours she studies?

5. Robinson has expanded his production to coconuts ($x$) and mangos ($y$). His inverse production function is represented by $l = x^2 + y^2 + 3xy$, where $l$ is labor, in hours per day. His utility function is $u(l, x, y) = 3xy/2 + x + y - l/2$. Suppose the market wage rate $w$ is set at 1.
(a) Solve Robinson’s profit maximization problem. Derive his supply of coconuts and mangos, $x^S(p_x, p_y)$ and $y^S(p_x, p_y)$.
(b) Solve Robinson’s utility maximization problem. Derive his demand for coconuts and mangos, $x^D$ and $y^D$, and find his labor supply, $l^S$.
(c) Find the price of coconuts and mangos, $p_x$ and $p_y$.

6. Consider an economy where Robinson is the sole producer and the sole consumer of three goods, $x$, $y$, and $z$. Given prices $p_x$, $p_y$, and $p_z$, the wage $w$, and his inverse production function $l(x, y, z)$, write down his profit maximization problem and his utility maximization problem.

Part V Market Failure

17 Externalities

17.1 Introduction

The last two chapters focused on the connections between the market mechanism and Pareto optimality. We showed that in exchange and in production, the free market leads to efficiency, and that any efficient situation can be achieved via a slightly modified market mechanism. These important results relied on some equally important assumptions.
One assumption was that all the parties in the economy must act competitively; they must all assume prices are given and fixed, beyond their control. We know that when the assumption of competitive behavior breaks down, as with monopoly, duopoly, or oligopoly, the market does not lead to efficiency. The behavior of a monopolist, or more generally the non-competitive behavior of any player in a market, leads to what is called market failure. In this chapter, we will examine another very important kind of market failure, the kind produced by externalities. When we analyzed trade between two people, we assumed person i’s utility depended only on his bundle of goods, and not on person j’s. When we analyzed production by firms, we assumed that firm i’s costs and output depended only on its inputs, and not on the inputs or outputs of firm j. When we analyzed the interactions between firms and consumers, we assumed a consumer buying a firm’s output cared only about how much he consumed, and the price he paid. If the consumer was selling labor to the firm, he cared only about the quantity he was selling, and the price he received. The consumer did not care about what quantities of its output the firm might be selling to others, or what quantities of its inputs it might be buying from others. In other words, in past chapters we assumed that the players in the market economy—the consumers and the firms—affected each other only through market trade and the prices paid for inputs and outputs in various markets. Person j’s consumption of food might have an effect on person i, but that would be an “indirect” effect via the prices; j’s food consumption might result in food prices being somewhat higher than they would have been, thereby affecting i. But such indirect effects would be entirely incorporated in the market prices. 
However, the world is full of examples where the actions of consumers directly affect other consumers, in ways not captured by market prices, where the actions of firms directly affect other firms, in ways not captured by prices, and where the actions of firms and consumers directly interact, in ways not captured by prices. These interactions are called externalities. As we will see, they lead to inefficiency in markets. Externalities create very important market failures.

In this chapter we will analyze externalities. We will start with some examples. Then we will carefully describe how the market fails when externalities are present, and we will describe various possible remedies for the market failures created by externalities. The classical remedies for such market failures include Pigouvian taxes and subsidies, and Coasian legal remedies involving property rights. More modern remedies involve markets for pollution rights, including cap and trade markets.

17.2 Examples of Externalities

Externalities can be small or large, trivial or extremely important, and they can be negative or positive. Here are some examples.

1. Hip-hop music. Your neighbor buys hip-hop music and plays it on his stereo speakers. He downloads it for $1 per song. He plays it too loud, and each song he buys causes you $2 worth of misery. The result: his consumption of hip-hop has a direct effect on you, imposing a cost of $2 per unit on you. However, that cost is not captured in the market price of $1. As a result, he consumes too much hip-hop, and the market outcome is inefficient.

2. Flowers in your neighbor’s garden. Your neighbor has a flower garden which you can see from your window. She buys tulip bulbs which produce beautiful blooms. She pays $3 per bulb, but each bulb also gives you $1 worth of enjoyment. The result: her consumption of tulip bulbs has a direct positive effect on you, creating a benefit to you of $1 per unit.
That benefit is not reflected in your neighbor’s calculations. As a result, she buys too few bulbs, and the market outcome is inefficient.

3. Harley-Davidson motorcycles with aftermarket pipes. Harley-Davidson motorcycles, particularly the ones modified with “aftermarket” exhausts, can be heard by people within a half-mile radius. Their exhaust pipes produce a low-pitched, rumbling, thundering sound. The Harley owners love that sound, but the rest of us may not. Assume that such a bike costs its owner $0.25 per mile to own, maintain, and fuel. But assume that each mile of riding irritates 25 people who live, work, or play within earshot of the Harley, and assume that these neighbors, on average, would each say that the noise of the bike causes them $0.01 worth of irritation. The result: the Harley rider rides too much. He rides to the point where his extra benefit from another mile equals what it costs him, namely $0.25, but he does not take into account the $0.25 worth of irritation (25 people times $0.01 each) imposed on the neighbors. The market is inefficient.

4. Food consumption by people you care about. People you care about don’t have enough money to buy all the food they would like to eat. It pains you to see them hungry. The result: their consumption of food is inefficient. They consume to the point where another dollar’s worth of food produces a marginal benefit to them of $1. But another dollar’s worth of food that they might consume would also produce a benefit to you, say, $0.50 worth. They are consuming too little food, and the market is inefficient. Note that this externality problem is probably not a problem at all if the people you care about live in your house. The more distant they are, however, the more difficult the problem.

5. Mountaintop removal mining in Appalachia.
In West Virginia, Kentucky, and other Appalachian states in the eastern U.S., coal mining companies are mining coal by blasting the summits and summit ridges off of mountains, in order to expose the coal seams lying below. This is a less costly process than underground mining, but it may have unfortunate results. Although mountaintop mining firms are required to do some reclamation of the land once the coal is removed, the reclamation can only be partial. The mined areas are extensively deforested and the topography permanently altered. The soil and rock that were blasted off the coal seams may end up in valleys and streams below. Some watersheds may be polluted. Toxins and dust from blasting and removal of the overburden may have adverse health effects on nearby residents. In short, there may be various negative effects on local people and firms, as well as wider adverse environmental effects. In making their production decisions, the mountaintop mining firms consider the price of coal, the usual labor and materials costs of their operations, and the costs of required reclamation. But they may not count the costs of adverse health effects on the residents, or the costs of stream pollution, or the possible costs of permanent alteration of the terrain. Therefore they may produce too much coal, and their market decisions may be inefficient.

6. Fossil fuels and global warming. When we use fuel oil to heat our houses, gasoline to fuel our cars, or coal to generate our electricity, we release CO2 into the atmosphere. CO2 is a greenhouse gas, and many experts believe it contributes to global warming. The ultimate effects of global warming are uncertain, but the effects might be large and negative. The consumer is deciding how big a car to drive, and how many miles to drive it. The consumer looks at the price of gasoline, and consumes to the point where the marginal benefit per gallon of gasoline equals the price.
But the consumer does not take into account the cost imposed on others, possibly 50 or 100 years in the future, created by the CO2 produced by a gallon of gasoline. Therefore he uses too much gasoline, and his market decision is inefficient.

The list could go on and on. It is obvious that there are lots of externalities in the global economy, some questionable and some very clear, some minor and some extremely important. In the next section we will carefully analyze one externality situation.

17.3 The Oil Refiner and the Fish Farm

Suppose we have an oil (petroleum) refinery located on a river. The oil refinery is an industrial process plant which transforms crude oil into refined products, such as gasoline, heating oil, diesel fuel, and liquefied petroleum gas. By accident or by design, it sometimes dumps its wastes in the river, which flows down to a bay in the ocean. A fish farm is located in the bay. The fish farm is adversely affected by the oil refiner’s water pollution. More oil produces more pollution, and more pollution increases the fish farmer’s costs.

We will let $o$ and $f$ represent quantities of oil refined and fish produced by the two firms, respectively. They are competitive in the markets where they sell their outputs and where they buy their inputs. We assume that the refiner’s output is measured in barrels of crude oil converted into refined products. The price for refining a barrel of crude oil is $p_o$; the price for a unit of fish is $p_f$. The oil refiner’s cost function is $C_o(o)$, and the fish farmer’s cost function is $C_f(f, o)$. Note that the fish farmer’s cost function depends both on its own level of output $f$, and on the oil refiner’s level of production $o$. The externality in this example is the fact that $o$ shows up in $C_f$.

When they maximize profits, the firms set price equal to marginal cost. We assume both firms have positive and increasing marginal costs, as usual.
We let $MC_o(o)$ represent the oil refiner’s marginal cost function, and we let $MC_f(f, o)$ represent the fish farmer’s marginal cost function. Profit maximization by the oil refiner will lead it to its market-based level of output, which we will call $o^M$. Similarly, profit maximization by the fish farm will lead it to its market-based level of output $f^M$.

We turn to the oil refiner first. His economic problem is straightforward. (The actual oil refining process is complex.) He solves the following equation to find $o^M$:

$$p_o = \frac{dC_o(o)}{do} = MC_o(o).$$

Now consider the fish farmer. His problem is slightly less straightforward. His profit-maximization condition is

$$p_f = \frac{\partial C_f(f, o)}{\partial f} = MC_f(f, o).$$

Note that the fish farmer has no control over $o$, even though it appears in his cost function. He will take it as given, since it is outside his control. On the other hand, he does know that the oil refinery upstream hurts him at his fish farm. The damage shows up in his cost function $C_f(f, o)$. Obviously, if he produces more fish his costs increase; $MC_f(f, o) = \partial C_f(f, o)/\partial f$ is positive and increasing. But if the refinery processes more oil, it also costs the fish farmer more; $\partial C_f(f, o)/\partial o$ is also positive. The term $\partial C_f(f, o)/\partial o$ is called the marginal external cost imposed by the oil refiner on the fish farm, or $MEC$ for short. But the fish farmer cannot do much about $MEC$, at least not by himself.

In Figure 17.1 below, we show the oil refiner’s profit maximization problem. The firm faces a market price $p_o$, and has a marginal cost curve $MC_o(o)$. It maximizes profit at $o^M$, where price equals marginal cost.

INSERT FIGURE 17.1 HERE
Caption of Fig. 17.1: The market level of oil refining.

Here is how the market equilibrium for oil refining and fish production is determined. The oil refiner finds $o^M$ by setting price equal to marginal cost, or by using Figure 17.1.
Then the fish farmer incorporates $o^M$ in his price-equals-marginal-cost equation; that is, he solves $p_f = MC_f(f, o^M)$. This gives $f^M$.

We now have $o^M$ and $f^M$, with both firms maximizing profits, and everything looks fine. If the first fundamental theorem of welfare economics applied here, the market outcome would be Pareto optimal. But it turns out that it’s not optimal. Everyone calculated everything properly, except that $MEC$ was ignored. No one took into account the fact that more oil refined means higher costs for the fish farmer.

In order to carefully show the market failure, we must first define optimality in this small model. Then we must show that the competitive market outcome $(o^M, f^M)$ is not optimal. What then does optimality mean in this model with two firms? Each firm is interested in only one thing: its profit. Let $\pi_o$ be the oil refiner’s profit, and let $\pi_f$ be the fish farmer’s profit. If there were some alternative to the market equilibrium that increased the profit of both firms, $(o^M, f^M)$ would not be Pareto optimal. Since cash could be easily transferred from one firm to another, a necessary condition for optimality is that joint or total profit, $\pi_o + \pi_f$, must be maximized. In this model, $\pi_o + \pi_f$ is the size of the “economic pie” created by the activities of the two firms combined.

To find the Pareto efficient solution in this simple economy, then, we only need to solve the joint profit maximization problem. Therefore let us assume the two firms are combined, or merged, into an oil refining/fish farming conglomerate. The merged firm refines oil upstream on the river, and produces fish at the fish farm downstream in the bay. However, merging the two firms into one internalizes the externality. That is, a profit maximizing manager of a merged oil refining/fish farming firm would not ignore the marginal external cost imposed by his oil refining activity on his own fish production activity. The merged firm is interested in maximizing total profit from its two
The merged firm is interested in maximizing total profit from its two 17 Externalities 325 activities, pi(o, f) = pio + pif . The cost functions for the two activities remain as they were, but now the oil refining/fish farming firm wants to maximize: pi(o, f) = poo+ pff −Co(o)− Cf (f, o). There are two first order conditions for maximizing the function pi(o, f). The first requires that the partial derivative of pi(o, f) with respect to o must equal zero: ∂pi ∂o = po − dCo(o) do − ∂Cf (f, o) ∂o = 0, or po = dCo(o) do + ∂Cf(f, o) ∂o = MCo(o) + MEC. The second requires that the partial derivative of pi(o, f) with respect to f must equal zero: ∂pi ∂f = pf − ∂Cf (f, o) ∂f = 0, or pf = ∂Cf (f, o) ∂f = MCf (f, o). We will let (o∗, f∗) represent the solution to the two first order condition equations. That is, (o∗, f∗) is the Pareto optimal combination of production levels for oil and fish. How does the market combination (oM , fM) compare to the Pareto optimal (joint profit-maximizing) com- bination (o∗, f∗)? Note that they are different, because the pair of equations that produced (oM , fM) is different than the pair of equations that produced (o∗, f∗). Since (oM , fM) and (o∗, f∗) are different, the market outcome was not optimal. The difference was in the equation for determining the oil refinery quantity o. The market equation was po = MCo(o). This equation ignores MEC. But the joint-profit maximizing equa- tion is po = MCo(o) +MEC. This equation takes MEC into account. For efficiency, we must have price equals marginal cost plus marginal external cost. In Figure 17.2 below, we provide a graph that we can use to find o∗. In the figure, we have a marginal cost function MCo(o), the same as in Figure 17.1. We also have a horizontal line at the oil refining price po. But we now add another curve, a positive (and upward sloping) MEC curve. (In order to find o∗ in the figure, we need f = f∗. This is because the exact position of the MEC curve depends on f .) 
We also show the sum of the marginal cost and marginal external cost curves. And we see in the figure that the optimal quantity of oil to refine $o^*$ must be less than the market amount $o^M$.

INSERT FIGURE 17.2 HERE
Caption of Fig. 17.2: The efficient amount of oil to refine is less than the market amount. Note that $d - c$ equals the height of $MEC$ at $o^*$, and $a - b$ equals the height of the $MEC$ curve at $o^M$.

In short, the competitive market is inefficient, or non-Pareto optimal, when oil refining creates an externality that adversely affects fish production. This is a market failure, created by the oil refining/fish farming externality. Too much of the externality-causing good is produced, because under the market the negative effect of that good on the other firm, namely $MEC$, is ignored. In order for optimality to hold, external effects must not be ignored.

What can be done about situations like this? A free-market zealot might say: “The government should do nothing. Never touch the market. Socialists, keep your hands off! The market knows what it is doing!” But as we have seen, the analysis suggests this is wrong. On the other hand, an environmental zealot might say: “The government should crack down on those polluting sons of b---s. Activities that create water pollution and mess up fish farms should be shut down.” But this would also be wrong, because the optimal amount of oil to refine (with accompanying pollution) is not zero. (And we are aware that in reality, fish farms also pollute.) In the next section of this chapter, we will discuss solutions to the externality problem.

17.4 Classical Solutions to the Externality Problem: Pigou and Coase

We have seen that market decisions about production may be non-optimal or inefficient when there are external effects, that is, when one producer’s production decisions have direct effects on another producer’s costs.
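The gap between the market outcome $(o^M, f^M)$ and the joint-profit-maximizing outcome $(o^*, f^*)$ in the oil refiner/fish farm model can be illustrated with a small numerical sketch. The cost functions and prices below are hypothetical, chosen only so that marginal costs are positive and increasing and $MEC$ is positive and upward sloping; they do not come from the text. The function names are likewise illustrative, and interior (positive) solutions are assumed.

```python
# Hypothetical cost functions (assumptions for illustration only):
#   C_o(o)    = o**2                   so MC_o = 2*o
#   C_f(f, o) = f**2 + f*o + o**2 / 2  so MC_f = 2*f + o and MEC = f + o


def refiner_cost(o):
    """Oil refiner's cost C_o(o) (hypothetical)."""
    return o ** 2


def fish_cost(f, o):
    """Fish farmer's cost C_f(f, o), increasing in the refiner's output o (hypothetical)."""
    return f ** 2 + f * o + o ** 2 / 2


def joint_profit(o, f, p_o, p_f):
    """Total profit pi_o + pi_f of the two firms."""
    return p_o * o + p_f * f - refiner_cost(o) - fish_cost(f, o)


def market_outcome(p_o, p_f):
    """Each firm sets price equal to its own marginal cost; MEC is ignored."""
    o_m = p_o / 2          # p_o = MC_o = 2*o
    f_m = (p_f - o_m) / 2  # p_f = MC_f = 2*f + o_m, with o_m taken as given
    return o_m, f_m


def efficient_outcome(p_o, p_f):
    """Merged firm: p_o = MC_o + MEC = 3*o + f and p_f = MC_f = 2*f + o."""
    # Substituting f = (p_f - o) / 2 into the first condition gives
    # 2*p_o = 5*o + p_f, so o = (2*p_o - p_f) / 5.
    o_star = (2 * p_o - p_f) / 5
    f_star = (p_f - o_star) / 2
    return o_star, f_star


if __name__ == "__main__":
    p_o, p_f = 20.0, 30.0  # hypothetical prices
    o_m, f_m = market_outcome(p_o, p_f)
    o_s, f_s = efficient_outcome(p_o, p_f)
    print("market:   ", (o_m, f_m), joint_profit(o_m, f_m, p_o, p_f))
    print("efficient:", (o_s, f_s), joint_profit(o_s, f_s, p_o, p_f))
```

Under these assumed numbers the market refines more oil than the merged firm would, and joint profit is lower at the market outcome, which is the pattern the analysis above predicts.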
There are several possible solutions to externality-generated market failures, which will be outlined here and in the next two sections.

Pigouvian taxes and subsidies. The problem we saw in the last section came about because the oil refiner did not have to pay the external cost it imposed on the fish farmer. One possible solution is to tax the oil refiner. The tax would be collected by the government. Ideally, the tax would be tightly linked to the cost imposed by the oil refiner on the fish farmer. Such taxes were advocated in 1920 by the great English economist Arthur Pigou (1877-1959), in his book The Economics of Welfare. Therefore we call them Pigouvian taxes. If an externality is positive instead of negative, efficiency might require that the government pay the firm creating the externality a subsidy, in order to increase its output of the good with the beneficial externality. Such a subsidy is called a Pigouvian subsidy.

A Pigouvian tax can be defined at the margin, or in total. For our oil refiner/fish farmer example, the Pigouvian tax at the margin, which we will call $t(o)$, is set equal to

$$t(o) = MEC = \frac{\partial C_f(f^*, o)}{\partial o}.$$

When deciding how much oil to refine, the oil refiner now reasons that an extra unit of oil refined costs him his marginal cost $MC_o(o)$, plus $t(o) = MEC$. This means that he uses the $MC_o(o) + MEC$ curve in Figure 17.2, finding the point where it crosses the horizontal line at $p_o$. This leads him to the choice of $o^*$, the optimal output for the refinery.

The in-total version of the Pigouvian tax, which we will call $T(o)$, is simply the integral of the marginal Pigouvian tax, from a base point $o_0$ to whatever refinery output $o$ the firm chooses, all contingent on the efficient fish output $f^*$. (A note about notation: We apologize for $o_0$!) That is, it is the area under the $MEC$ curve from $o_0$ to $o$.
If the base point $o_0 = 0$, then the oil refiner must pay a total tax of $T(o)$, based on every unit of oil refined, with $T(o) = C_f(f^*, o) - C_f(f^*, 0)$, the total extra cost imposed on the fish farmer by the presence of the oil refiner.

The marginal version and the total version of the Pigouvian tax are really two versions of the same thing. The only notable difference between them is that the base point for calculating the tax is explicit in our definition of the total tax, whereas it is unstated in our definition of the marginal tax.

Finally, it is interesting and important to note that the base point for calculating the total version of the Pigouvian tax need not be zero. The government may decide that the oil refiner was there first, and ought to be allowed to produce some output $o_0 > 0$ tax-free. As long as $o_0$ is set less than or equal to $o^*$, imposing the total version of the Pigouvian tax on the oil refiner, that is setting $T(o) = C_f(f^*, o^*) - C_f(f^*, o_0)$, will still induce that firm to choose $o^*$.

Coasian property rights. It might be argued that the oil refiner is really not the “bad guy” in our story. It is just unfortunate that the fish farmer is located in the bay that the oil refiner’s river flows into. If it were not for that accident of location, there would be no externality. (The same kind of comment would apply to our hip-hop music story, and to our Harley exhaust noise story.) Often people (and firms) solve the problem of a bad location by moving. The fish farmer could move to a different bay, and if you live next to a hip-hop music fan and you like Beethoven, you could move to a different apartment. But moving is expensive, and for some externalities (for instance, global warming) it is impossible. In what follows we will assume that moving away from the externality is prohibitively expensive, or impossible.
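Returning briefly to the Pigouvian tax described earlier in this section, the following sketch illustrates the claim that the base point $o_0$ changes how much tax the refiner pays but not the output it chooses. Everything numerical here is an assumption made for illustration: the cost functions $C_o(o) = o^2$ and $C_f(f, o) = f^2 + fo + o^2/2$, the prices $p_o = 20$ and $p_f = 30$, and the resulting efficient levels $o^* = 2$ and $f^* = 14$ (which solve $20 = 3o + f$ and $30 = 2f + o$). The grid search is only a simple stand-in for solving the refiner's first order condition.

```python
# Hypothetical setup (assumptions for illustration, not from the text):
#   C_o(o) = o**2,  C_f(f, o) = f**2 + f*o + o**2 / 2,  p_o = 20,  p_f = 30.
# The efficient levels solve 20 = 3*o + f and 30 = 2*f + o, giving o* = 2, f* = 14.
P_O = 20.0
O_STAR = 2.0
F_STAR = 14.0


def refiner_cost(o):
    """Oil refiner's cost C_o(o) (hypothetical)."""
    return o ** 2


def fish_cost(f, o):
    """Fish farmer's cost C_f(f, o) (hypothetical)."""
    return f ** 2 + f * o + o ** 2 / 2


def marginal_tax(o):
    """Marginal Pigouvian tax t(o) = MEC evaluated at f*, here f* + o."""
    return F_STAR + o


def total_tax(o, o_base=0.0):
    """Total Pigouvian tax: C_f(f*, o) - C_f(f*, o_base); output up to o_base is tax-free."""
    if o <= o_base:
        return 0.0
    return fish_cost(F_STAR, o) - fish_cost(F_STAR, o_base)


def after_tax_profit(o, o_base=0.0):
    """Refiner's profit net of the total Pigouvian tax."""
    return P_O * o - refiner_cost(o) - total_tax(o, o_base)


def best_output(o_base=0.0, o_max=10.0, steps=1000):
    """Grid search for the refiner's profit-maximizing output under the tax."""
    grid = [o_max * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda o: after_tax_profit(o, o_base))


if __name__ == "__main__":
    for o_base in (0.0, 1.0, 2.0):
        o_chosen = best_output(o_base)
        print(o_base, o_chosen, total_tax(o_chosen, o_base))
```

In this sketch the refiner picks the same output for each base point at or below $o^*$, while the tax bill shrinks as the tax-free allowance grows; the base point affects the distribution of profit rather than efficiency.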
An English-born, American economist named Ronald Coase (1910-), in a 1960 article titled “The Problem of Social Cost,” took the position that externality problems are more the result of unlucky location than the result of bad behavior by bad guys. Coase argued that economists are too quick to advocate taxes, including Pigouvian taxes, to fix various problems, including externality problems. He argued that externality-based market failure is not a consequence of nasty firms dumping their costs on innocent firms. Rather, this kind of market failure is a consequence of ill-defined property rights. That is, the trouble between our oil refiner and our fish farmer, who happen to be neighbors connected by a river, is a consequence of the absence of clear legal rights to clean water in the river, or clear legal rights to dump waste in the river. (Of course Coase wrote before the enactment of clean water legislation in the United States, but laws like the 1972 Clean Water Act in the U.S. do not provide the legal structure that Coase had in mind anyway.)

Here’s the Coase solution to our market failure. Courts should create legal rules to deal with interactions like the one between the oil refiner and the fish farmer. (And in fact in Anglo-American law there are areas of law, such as nuisance law, which lay out rights of parties in somewhat similar situations.) The legal rules should make perfectly clear whether the oil refiner has the right to dump waste in the river, or the fish farmer has the right to waste-free water. The legal rules should specify how a party whose rights are violated can take legal action against the violator, and what remedies a court hearing such a case can use. Standard remedies include imposing money damages, or granting an injunction, which is a legal order to cease the violation. And, hopefully, the legal process should be quick, effective, and inexpensive.
(Coase is aware that these are properties that legal action may or may not have.)

Let’s now assume that courts give fish farmers clear rights to unpolluted waters, and let’s assume that courts use money damages (rather than injunctions) as remedies. Suppose the oil refiner is producing $o^M$ and the fish farmer is producing $f^M$. The fish farmer goes to court (assumed to be quick, easy, and cheap). He makes his case. The court rules in his favor, and grants money damages. Under the law, damages are set equal to the costs imposed on the fish farmer by the actions of the oil refiner. The court calculates damages accurately, and rules that the oil refiner must pay the fish farmer $C_f(f^M, o^M) - C_f(f^M, 0)$.

Now suppose that all the output variables and costs are per year, and will repeat over and over again as long as the parties don’t change their behavior. By the beginning of the next year, the oil refiner figures out what the legal system is going to do to him. He wises up, knowing that if he continues to produce $o^M$ he will be assessed money damages just like total Pigouvian taxes, year after year, even though no nasty Internal Revenue Service agents or other tax collectors are involved in this process. He then decides to stop producing $o^M$, and to produce $o^*$ instead, the efficient output. And the fish farmer ends up producing $f^*$, and collecting money damages every year from the oil refiner.

But the Coasian story doesn’t stop there. Let us now assume that the law allocates the initial rights in the opposite way. That is, courts lay out legal rules that say firms have clear and unambiguous rights to dump waste in rivers. But let us now also assume that the firm managers can meet, negotiate, and enter into contracts with each other, and do so at minimal cost. Here is what happens. The oil refiner starts out producing $o^M$. The fish farmer is aware of the pollution and the external costs imposed on him.
He goes to the oil refiner and says: “I know you have a legal right to put waste in the river. However I will pay you if you reduce your output somewhat, from $o^M$ to $o^*$. If you do this, and if I produce $f^*$, my costs will drop by $C_f(f^*, o^M) - C_f(f^*, o^*)$. Your profit from the oil refining operation will drop when you reduce output from $o^M$ to $o^*$. However, if you look at this graph I drew [at this point he pulls out Figure 17.2, which he tore out of this book], your reduction in profit is the area of triangle $\triangle(b, c, d)$. I will reimburse you that amount plus one half the area of triangle $\triangle(a, b, d)$. What do you say?” The oil refiner responds: “Yes, I have had courses in economics and calculus at M.I.T., and I see that your proposed contract would make my firm more profitable when your payment is added to my operating profit from oil refining.” So they sign the contract, and from that point onward, they produce $(o^*, f^*)$ per year, with the fish farmer paying the contracted amount to the oil refiner each year.

In short, Coase argues that the oil refiner/fish farmer externality problem can be remedied through the use of legal principles, rather than through the imposition of taxes. The law should grant clear rights to either one side or the other. If using the law is quick, easy, and cheap, and if it is easy and cheap to negotiate mutually-beneficial contracts, then for purposes of efficiency it does not matter whether the fish farmer is granted the right to clean water, or the oil refinery is granted the right to dump its waste in the river. Of course which party is granted the right does affect the profitability of the two parties.

Note that Coase does not claim that the assumptions we’ve made above, about quick and low-cost legal structures, and cheap and easy negotiation of contracts, necessarily hold in reality.
He takes the position that if there are high costs or frictions on one side or the other, courts should take those costs or frictions into consideration when they make their initial decisions about which side should be granted rights.

The Coase argument is now summed up in what is called the Coase Theorem. Applied to our oil refiner/fish farmer example, it says this:

1. Suppose the law grants a clear right to fish farmers to recover money damages from firms that pollute their water. Suppose further that the courts are quick, accurate, and cheap to use. Then the two firms will end up at the efficient output levels $(o^*, f^*)$, and the oil refiner will pay damages to the fish farmer for the external costs imposed on him.

2. Moreover, suppose the law grants a clear right to oil refiners to dump waste in the river. Suppose further that making and enforcing contracts between the parties is cheap, quick, and easy. Then the two firms will end up at the efficient output levels $(o^*, f^*)$, with the fish farmer making periodic contractual payments to the oil refiner in exchange for the reduction in his refinery output from $o^M$ to $o^*$.

17.5 Modern Solutions for the Externality Problem: Markets for Pollution Rights

Let’s complicate our oil refiner/fish farmer example slightly. Instead of just putting oil in the fish farmer’s cost and marginal cost functions, let us call the amount of pollution or waste produced by the oil refiner $x$. So the oil refining firm processes $o$ units of oil, and simultaneously produces $x$ units of pollution as an unintentional byproduct. Its cost function is now $C_o(o, x)$, with $\partial C_o(o, x)/\partial o > 0$. The amount of pollution $x$ might or might not be proportional to $o$. In fact, we assume that there will be various different (costly) techniques to reduce pollution $x$, for a given amount of oil refined $o$. We assume that $x$ is easy to observe and measure. Turning to the fish farmer, we replace $o$ with $x$ in its cost and marginal cost functions.
Its cost function is now $C_f(f, x)$. If the oil refiner and the fish farmer maximize profits in the usual fashion, they end up at the usual place, $(o^M, f^M)$, with a corresponding pollution level $x^M$.

But now we assume that the government sets up a market for pollution rights. This sounds odd, because no sane person wants to buy pollution. Here is what we mean:

1. The government decides on some benchmark level of pollution. We will call this benchmark level $x_0$. The idea is that the government will allow at most $x_0$ units of pollution (per unit time). Government officials must decide on how high (or low) to set $x_0$. To do so, they do an optimality analysis similar to what we did above to find $(o^*, f^*)$, and they figure out the corresponding pollution level $x^*$. The government officials decide that the oil refining firm must be allowed to produce at least $x^*$ units of pollution (per unit time), for otherwise $(o^*, f^*)$ is unattainable. Therefore they set the benchmark level at some $x_0 \geq x^*$.

2. They create pollution permits. A pollution permit allows its owner to produce one unit of pollution per unit time. A firm that produces a quantity of pollution without an equal number of permits is shut down.

3. They distribute the permits. The obvious way to distribute them would be to give them all to the oil refiner. But this is not the only way; they could also be given to the fish farmer, or split between the two firms, or given to poor people, or kept by the government.

4. Finally, the government sets up a market for pollution permits (or allows someone else to set up such a market). On the market, firms that want permits can buy them, and firms or people or governments that have permits can sell them.

We assume that the participants in this newly-created pollution permit market act competitively. The presence of this market solves the externality problem, because the externality gets incorporated in the pollution permit price.
Here are some partial explanations of this important result.

Suppose all the permits are granted to the fish farmer. The oil refiner needs permits (creating a source of demand in the permit market), and the fish farmer has permits he doesn’t need (creating a source of supply). The fish farmer sells \(x^*\) of his permits on the market; the oil refiner buys them. The market price is MEC at \((o^*, f^*)\). (This outcome is somewhat similar to Coase’s first scenario.)

Or, suppose \(x^*\) permits are granted to the oil refiner, and \(x_0 - x^*\) are granted to the fish farmer. Then no sales take place on the market.

Or, suppose all the permits are granted to the oil refiner, and \(x_0 > x^*\). Then the fish farmer buys \(x_0 - x^*\) permits on the market from the oil refiner. (This outcome is similar in flavor to, but much more plausible than, Coase’s second scenario.)

Or, suppose all the permits are granted to the government (which can sell them on the market). Then the oil refiner buys \(x^*\) permits, and the fish farmer buys the remaining \(x_0 - x^*\). (This outcome is somewhat similar to the outcome of the Pigouvian tax, paid on all units of oil refined, from zero.)

In short, we see that a market for pollution permits is another reasonable solution to market failure due to externalities. The permit market solves the externality problem because the externality gets incorporated in the price of the permit, and is therefore not ignored.

17.6 Modern Solutions for the Externality Problem: Cap and Trade

A pollution permit market can also be used to achieve a slightly different efficiency goal. We now turn to what is called a cap and trade system. A cap and trade market is a version of a pollution rights market, similar to what was described in the last section, but redesigned for use in an economy with multiple polluting firms. The purpose of cap and trade is to create market coordination of pollution abatement activities by multiple polluters.
Market coordination of pollution abatement will result in an efficient distribution of pollution abatement activities, or what we might call an efficient distribution of pollution origination (so called because this is about which of the oil refining firms originate which amounts of pollution).

To explain cap and trade, we now assume that there are many oil refining firms on the river. Each oil refiner refines oil and produces pollution simultaneously; producer i produces \((o_i, x_i)\). Their cost functions are different, and their pollution reduction techniques are different. Some firms are able to cut pollution output cheaply when their oil refining output is high, some when it is low, some always, some never. The fish farmer’s cost function depends on the total pollution in the river, the sum of the \(x_i\)’s, which we call x.

We let \((o_i^M, x_i^M)\) represent firm i’s output of oil refined and pollution produced, under a market or laissez faire policy. As usual, we let \(f^M\) represent the fish farmer’s output in the market equilibrium. Of course the market outcome is not efficient or Pareto optimal. We let \((o_i^*, x_i^*)\) represent efficient levels of oil refined and pollution produced by firm i, \(f^*\) represent the efficient output of fish, and \(x^* = x_1^* + x_2^* + \ldots\) the efficient total pollution level.

Now the government steps in to set up a cap and trade market. The government may not know the pollution reduction techniques of the different firms; it may not know the cost functions of the different firms. But we assume the government can easily observe each firm’s pollution output \(x_i\). Here is how cap and trade works:

1. The government first decides on some benchmark level of total pollution from all the oil refiners. We will call this benchmark level \(x_0\). We will assume now that the government chooses the efficient pollution level as its benchmark, so that \(x_0 = x^*\). The government creates \(x_0 = x^*\) permits.
(The model could be generalized to allow more permits than the efficient total pollution level, similar to what we did in the last section. However, it is simpler this way.) A firm needs one permit for each unit of pollution that it produces. Any firm operating without the required number of permits is shut down.

2. The government allocates those permits to the polluting oil refining firms. Since the government does not have detailed information about all the firms, the number of permits allocated to firm i may not equal \(x_i^*\). Let \(a_i\) represent the number of permits allocated to firm i. The government gives \(a_i\) permits to firm i; it does not sell them to firm i. We will assume for simplicity that the government allocates all \(x^*\) permits to the oil refiners. (The model could be easily generalized to allow the government to retain some of its permits for itself, or to allocate some permits to other parties.)

3. The government sets up a market for pollution permits (or allows someone else to set up such a market). On the market, firms that need permits can buy them. That is, if a firm wants to produce \(x_i\) units of pollution, and \(x_i > a_i\), then the firm can buy the additional permits it needs. On the other hand, if a firm wants to produce \(x_i\) units of pollution, and \(x_i < a_i\), then it was allocated more permits than it needs, and it can sell the extra permits on the market. (Similarly, in a more general model, if the government has allocated itself some permits it could sell those on the market, as could other parties who were allocated permits.)

What is the result? First of all, if the government set its benchmark correctly and created \(x^*\) permits in total, then the total output of pollution will be efficient, as will the fish farmer’s output \(f^*\). Second and more important, pollution will be mitigated in an efficient way. Here is an intuitive argument for this result.
Once again, we must decide what we mean by efficiency or Pareto optimality in this model. Since we are considering only the group of oil refining firms, the optimal set of production and pollution decisions is the one that maximizes total profit of all these firms. So we consider a conglomerate firm made up of all the oil refiners. As a group, they have been allocated \(x^*\) permits in total. As a group, they do not pay for permits. (They certainly pay each other for permits, but the total net amount paid, when you sum over all the oil refining firms, is zero.) Let n be the number of oil refiners. Aggregate profit for all n firms is given by:

\[ \pi = p_o(o_1 + o_2 + \ldots + o_n) - C_1(o_1, x_1) - C_2(o_2, x_2) - \ldots - C_n(o_n, x_n). \]

This is maximized subject to the constraint that \(x_1 + x_2 + \ldots + x_n = x^*\). A necessary condition for maximizing \(\pi\) subject to this constraint is that for every pair of firms i and j,

\[ \frac{\partial C_i(o_i, x_i)}{\partial x_i} = \frac{\partial C_j(o_j, x_j)}{\partial x_j}. \]

On the left hand side of the equation is firm i’s marginal cost of pollution (which is negative). This equals \(-1\) times its marginal cost of pollution abatement (which is positive). A similar comment applies for firm j on the right hand side of the equation.

This equation now has a nice intuitive interpretation. It says that holding \(o_i\) and \(o_j\) constant, efficiency requires that the marginal cost of pollution abatement at firm i must equal the marginal cost of pollution abatement at firm j. This is an intuitively clear efficiency condition. Suppose to the contrary that these marginal costs were different. For instance, suppose it costs firm i $100 to reduce pollution by 1 unit, holding \(o_i\) constant, whereas it costs firm j $50 to do the same thing, holding \(o_j\) constant. This couldn’t be efficient because the manager of the conglomerate firm that includes i and j could instruct i to allow 1 more unit of pollution (saving $100) and simultaneously instruct firm j to cut 1 unit of pollution (at a cost of $50).
The net result would be the same total amount of oil refined, the same total amount of pollution, but $50 more in profit.

Finally, we need to explain why the existence of the cap and trade market implies that

\[ \frac{\partial C_i(o_i, x_i)}{\partial x_i} = \frac{\partial C_j(o_j, x_j)}{\partial x_j} \]

must hold for every pair of firms. As we indicated above, abbreviating marginal cost with MC, the equation says:

Firm i’s MC of pollution abatement = Firm j’s MC of pollution abatement.

The reason is this. A competitive market price for pollution permits has been established in the cap and trade market. Call it \(p_x\). With the cap and trade market in place, the independent oil refining firms all buy and sell pollution permits on that market. If \(p_x\) is greater than firm i’s MC of pollution abatement, firm i can sell a permit on the market and spend part of the proceeds on abating its pollution by 1 unit. It would increase its profit that way, by the difference between \(p_x\) and its MC of pollution abatement. Similarly, if \(p_x\) is less than firm i’s MC of pollution abatement, firm i can increase its profit by buying a pollution permit and saving the cost of abating a unit of pollution. Therefore, for it to be maximizing its profit, firm i must end up at a point where \(p_x\) equals its MC of pollution abatement. But this must be true for all the oil refining firms. That is, \(p_x = -\partial C_i(o_i, x_i)/\partial x_i\) for all i. Therefore for all firms i and j,

\[ \frac{\partial C_i(o_i, x_i)}{\partial x_i} = \frac{\partial C_j(o_j, x_j)}{\partial x_j} \]

must hold, which is our condition for efficiency.

In sum, a cap and trade market establishes a price for a unit of pollution. All the oil refiners face that price. They all adjust their oil refining and pollution abatement activities so that the marginal cost of pollution abatement at all the firms equals the market price for a unit of pollution. This guarantees that they end up at a place where the marginal cost of pollution abatement is the same for every one of the firms, and this is the condition that must hold for efficiency.
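The equal-marginal-cost condition can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration and does not come from the model above: oil outputs are held fixed, firm i would emit `base[i]` units if it did no abatement, and cutting pollution to x costs `(k[i] / 2) * (base[i] - x) ** 2`, so its marginal cost of abatement is `k[i] * (base[i] - x)`. The numbers are chosen so that every firm ends up with positive pollution.

```python
# Hypothetical abatement technology (an assumption, not from the text):
# with oil output held fixed, firm i emits base[i] units if it does nothing,
# and its marginal cost of abatement at pollution level x is k[i] * (base[i] - x).

def permit_price(base, k, cap):
    """Permit price at which total pollution equals the cap.

    Each firm abates until its marginal cost of abatement equals the price,
    so x_i = base[i] - price / k[i]. Summing over firms and setting the
    total equal to cap gives the price. Assumes an interior solution.
    """
    return (sum(base) - cap) / sum(1.0 / ki for ki in k)


def equilibrium(base, k, cap, alloc):
    """Return price, pollution levels, marginal abatement costs, net purchases."""
    price = permit_price(base, k, cap)
    pollution = [b - price / ki for b, ki in zip(base, k)]
    mc_abatement = [ki * (b - x) for b, ki, x in zip(base, k, pollution)]
    # Positive entry: the firm buys permits; negative entry: it sells them.
    net_purchases = [x - a for x, a in zip(pollution, alloc)]
    return price, pollution, mc_abatement, net_purchases


base = [10.0, 8.0, 6.0]   # unabated pollution of three hypothetical firms
k = [1.0, 2.0, 4.0]       # slopes of their marginal abatement cost curves
cap = 17.0                # total permits created by the government
alloc = [7.0, 5.0, 5.0]   # free allocation a_i, summing to the cap

price, pollution, mc_abatement, net_purchases = equilibrium(base, k, cap, alloc)
print(price, pollution, mc_abatement, net_purchases)
```

In this sketch the allocation `alloc` affects only who buys and who sells; the price, the pollution levels, and the common marginal cost of abatement do not depend on it, and net purchases sum to zero across the firms.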
Before ending this chapter, we should comment on the politics of cap and trade. If the government were to allocate all the pollution permits to itself, and none to the oil refiners, the government would end up selling the permits to the firms and efficiency would still hold. This would be analogous to a Pigouvian-style tax on pollution, on all units produced, with the tax rate set equal to the permit price. But the oil refiners have a strong preference for having the permits allocated to themselves, rather than to the government or others. This way their only loss, in the aggregate, is the difference between profit at \(x^M\) and profit at \(x^*\). Moreover, cap and trade might make it more difficult for new competitors to enter the oil refining business.

In short: Polluting firms are happiest with no regulation at all. They would be less happy with cap and trade. They would be least happy with pollution taxes.

17.7 A Solved Problem

The Problem

Assume that there are two firms producing steel; firm 1’s output is \(s_1\) and firm 2’s is \(s_2\). Assume that the market price for steel is \(p_s = 1\). There is a negative externality because of pollution: Firm 1’s operation causes firm 2’s costs to rise. Assume that firm 1’s cost function is \(C_1(s_1) = s_1^2\) and that firm 2’s cost function is \(C_2(s_1, s_2) = (s_2 + 0.75s_1)^2\). In short, firm 1 is not affected by firm 2, but firm 2 is adversely affected by firm 1.

(a) Remember that in the short run firms can have negative profits, but that in the long run their profits must be non-negative. Find the market equilibrium \((s_1^M, s_2^M)\) in the short run.

(b) Show that the short run equilibrium from part (a) is not efficient.

(c) Describe the market equilibrium in the long run.

(d) Now assume that the negative externality works both ways; that is, assume \(C_1(s_1, s_2) = (s_1 + 0.75s_2)^2\) and \(C_2(s_1, s_2) = (s_2 + 0.75s_1)^2\). Find the market equilibrium, and show that it is not efficient.
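For readers who want to check their algebra on this problem, here is a minimal sketch in exact fractions. It assumes that each firm takes the other firm’s output as given and sets price equal to its own marginal cost, and it writes the cost functions as \((s_i + e_i s_j)^2\). The spillover coefficients `e1` and `e2` and the function names are labels introduced here, not notation from the problem; `e1 = 0` corresponds to the one-way externality and `e1 = e2 = 3/4` to part (d).

```python
from fractions import Fraction


def market_equilibrium(e1, e2, price=Fraction(1)):
    """Solve the two price = marginal cost conditions simultaneously.

    With cost (s_i + e_i * s_j) ** 2, firm i's condition is
    price = 2 * (s_i + e_i * s_j). The two conditions form a linear
    system in (s1, s2); this assumes an interior solution and e1 * e2 != 1.
    """
    half = price / 2
    det = 1 - e1 * e2
    s1 = half * (1 - e1) / det
    s2 = half * (1 - e2) / det
    return s1, s2


def profits(s1, s2, e1, e2, price=Fraction(1)):
    """Short-run profit of each firm at outputs (s1, s2)."""
    pi1 = price * s1 - (s1 + e1 * s2) ** 2
    pi2 = price * s2 - (s2 + e2 * s1) ** 2
    return pi1, pi2


e = Fraction(3, 4)
one_way = market_equilibrium(Fraction(0), e)   # setting of parts (a) to (c)
two_way = market_equilibrium(e, e)             # setting of part (d)
print(one_way, profits(*one_way, Fraction(0), e))
print(two_way, profits(*two_way, e, e))
```

Calling `profits` with one output forced to zero, and the other firm’s output reset to the level where its price equals its marginal cost, gives a quick way to compare total profit under alternative arrangements.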
The Solution

(a) Firm 1 maximizes profit by setting price equal to marginal cost. Marginal cost for firm 1 is \(MC_1 = 2s_1\). Price equals marginal cost gives \(1 = 2s_1\) or \(s_1^M = 0.5\). Firm 1’s profit is

\[ \pi_1 = p_s s_1^M - C_1(s_1^M) = 0.5 - (0.5)^2 = 0.25. \]

Firm 2 maximizes profit by setting price equal to marginal cost also, but \(s_1\) will appear in its marginal cost function. Firm 2 cannot do anything about \(s_1\), and so it assumes \(s_1\) is constant. Marginal cost for firm 2 is \(MC_2 = 2(s_2 + 0.75s_1)\). Price equals marginal cost for firm 2 gives \(1 = 2(s_2 + 0.75s_1)\) or \(s_2 = 0.5 - 0.75s_1\). Plugging in \(s_1^M = 0.5\) now gives \(s_2^M = 0.5 - 0.75 \times 0.5 = 0.125\). Firm 2’s profit is

\[ \pi_2 = p_s s_2^M - C_2(s_1^M, s_2^M) = 0.125 - (0.125 + 0.75 \times 0.5)^2 = 0.125 - 0.25 = -0.125. \]

Note that firm 2 is losing money, which is possible in the short run but not in the long run.

(b) In order to show that the outcome above is not efficient, we only have to show that there is an alternative arrangement which would result in higher total profits for firms 1 and 2. With the \((s_1^M, s_2^M)\) we’ve just calculated,

\[ \pi_1 + \pi_2 = 0.25 - 0.125 = 0.125. \]

Here’s an alternative that produces higher total profits. Shut down firm 1; that is, force \(s_1 = 0\). Then \(\pi_1 = 0\). Then firm 2 sets price equal to marginal cost, based on \(s_1 = 0\); this gives \(s_2 = 0.5\). Firm 2’s profit is now

\[ \pi_2 = 0.5 - (0.5)^2 = 0.25 > 0.125. \]

(c) In the long run, firm 2 will exit the market, since its profit is negative in the short run. With firm 2 gone, firm 1 sets \(s_1 = 0.5\) and has profit of \(\pi_1 = 0.25\), as in part (a). The externality problem disappears in the long run, since firm 2 has exited the market.

(d) Now we’re assuming the externality works both ways, and so firm 1’s cost function incorporates the externality just like firm 2’s. That is, \(C_1(s_1, s_2) = (s_1 + 0.75s_2)^2\). The profit maximization condition for firm 1 is \(1 = 2(s_1 + 0.75s_2)\) or \(s_1 = 0.5 - 0.75s_2\). For firm 2, the profit maximization condition is \(1 = 2(s_2 + 0.75s_1)\) or \(s_2 = 0.5 - 0.75s_1\).
Solving the two profit maximization conditions simultaneously (and writing them in fractional instead of decimal form) gives \(s_1^M = s_2^M = 2/7\). Profit levels are

\[ \pi_1 = \pi_2 = 2/7 - (2/7 + 3/4 \times 2/7)^2 = 1/28. \]

Total profits of both firms together are \(\pi_1 + \pi_2 = 2 \times 1/28 \approx 0.071\).

The outcome is inefficient; total profits would increase if one of the firms were forced to produce zero. For example, shut down firm 1; that is, force \(s_1 = 0\). Firm 2 sets price equal to marginal cost; this gives \(s_2 = 0.5\) and \(\pi_2 = 0.5 - (0.5)^2 = 0.25\) as in part (b) above. Firm 1’s negative (short run) profit is now \(\pi_1 = 0 - (0 + 0.75 \times 0.5)^2 \approx -0.141\). The sum of firm 1’s (short run) profit and firm 2’s profit is now

\[ \pi_1 + \pi_2 = -0.141 + 0.25 = 0.109 > 0.071. \]

Exercises

1. Moe has two puppies, and he is the sole caretaker of the puppies. Moe also has two housemates, Larry and Curly. Larry likes playing with the puppies while Curly dislikes the mess the puppies make. The three housemates’ utilities from the puppies are um(x) = x2 − 2x + 2, ul(x) = x2, and uc(x) = −x2 − 1, respectively, where x is the number of puppies.

(a) Calculate Moe’s, Larry’s, and Curly’s utilities from the two puppies.

(b) Suppose Moe is considering getting a third puppy. How would each of the housemates’ utility change?

(c) What number of puppies would maximize the total utility of the three housemates?

2. Sam and Gam are neighbors. Sam maintains a beautiful garden. His production function is p = 2h and his utility function is us(p, h) = 50p − p2 − 28h, where p is the number of plants in the garden and h is the number of hours spent gardening per week. The plants cost nothing, but they require time. Gam enjoys his neighbor’s garden; his utility is given by ug(p) = 14p2 + 3p.

(a) What is the equilibrium number of plants in Sam’s garden? How many hours per week does he spend gardening?

(b) Calculate Sam’s and Gam’s utilities.
(c) What is the Pareto optimal number of plants in Sam’s garden?

3. There are eleven people with identical preferences living in an apartment building. Each of them likes blasting his music, but complains about everyone else’s music. They all play music at different times during the day; so at any point in time at most one person is playing music. Each individual’s utility function is represented by ui(m, x) = 8m − 2m2 − 310x, where m is the number of hours per day that that individual is blasting his music, and x is the total number of hours per day that others are blasting their music.

(a) If each individual is unaware of the impact of his music on the other tenants, how many hours per day does he blast his music? What is the total number of hours per day during which somebody’s music is blasting?

(b) Calculate each individual’s utility.

(c) Suppose the tenants get together and realize the negative impact of each of their actions on the other tenants. They collectively agree to limit the number of hours of music-blasting in order to maximize everyone’s utility. They decide to limit everybody’s music playing to m∗ hours per day. What is m∗?

4. Flo is a farmer, and grows flowers on her farm, which is located right next to Beatrice’s property. Beatrice is a beekeeper. Each of Beatrice’s bee hives pollinates an acre of Flo’s flowers. Flo’s cost function is Cf (f) = 5(f − 13b)2, and Beatrice’s cost function is Cb(b) = 10(b− 12f)2, where f is the number of acres of flowers, and b is the number of bee hives. Each acre of flowers yields $50 worth of flowers and each bee hive yields $100 worth of honey.

(a) Suppose Flo maximizes her profits, taking the number of bee hives as given. Likewise, Beatrice maximizes her profits, taking the number of acres of flowers as given. How many acres of flowers and how many bee hives will there be? Calculate each of their profits.
(b) If Flo and Beatrice jointly maximize profits, how many acres of flowers and how many bee hives will there be? Calculate each of their profits.

(c) What sort of transfer would be necessary in order for both Flo and Beatrice to agree to jointly maximize their profits?

5. Clyde produces chemicals. His cost function is Cc(c) = 5c2 +100c, where c is the number of liters of chemicals produced. The market price for a liter of a chemical is pc = 700. Bonnie is a baker. She is adversely affected by the noxious fumes emitted by Clyde’s production of chemicals. Her cost function is Cb(b) = 12b2− 140b+ bc, where b is the number of baked goods produced. The market price for a baked good is pb = 10.

(a) How many liters of chemicals and how many baked goods will be produced in the market equilibrium?

(b) Suppose Clyde and Bonnie were to jointly maximize profits. How many liters of chemicals and how many baked goods will be produced?

(c) Compare the competitive market outcome from (a) and the joint profit maximization outcome from (b). Which outcome is Pareto optimal? Why?

(d) One of the solutions to the externality problem is Pigouvian taxes. In this example, who pays the tax? How does the Pigouvian tax solve the externality problem?

(e) Another solution is Coasian property rights. Suppose Clyde has the legal right to emit noxious fumes. How can they arrive at the Pareto optimal outcome?

6. There are two factories in Pollutopia emitting air pollutants. The government decides to crack down on pollution and cap total pollution at 30 units. Each factory is issued 15 permits; one permit allows the emission of one unit of pollution. A factory may consume more or fewer permits than originally issued by buying from or selling to the other factory. The two factories’ costs of pollution abatement are C1(x1) = 60x21 and C2(x2) = x32 respectively, where xi is the number of units of pollution produced by factory i.
How many permits does each factory end up with in equilibrium?

18 Public Goods

18.1 Introduction

In the last chapter we looked at market failures created by externalities. A good creates a (consumption-based) external effect, or an externality, when person i’s consumption of it has a direct effect (an effect that is not reflected in the market price) on someone else, person j. When externalities are present the market fails to give us efficiency or Pareto optimality. This is because person i, considering only the market price he must pay for it, fails to account for the cost (or the benefit) imposed on person j by i’s consumption of the good.

In this chapter, we will look at market failures created by public goods. A public good is a good that is non-exclusive in use. That is, if it is there and available for use by one consumer, then it is there and available for use by all consumers. In a sense, these are goods that create super-externalities. For example, a judicial system is a public good. If the laws, courts, and police are in place to protect person i, they are there to protect person j as well. (Obviously a public good may be valued differently by the different people; i might be a shopper, happy to have the police around to protect her, and j might be a shop-lifter!)

A non-public good is sometimes called a private good. A pair of socks is a private good. If i is wearing a pair of socks, then j is not wearing that pair of socks. But if i has that judicial system, then j has it also. We know that markets should provide efficient or Pareto optimal quantities of things like socks. However, as we will see, markets provide inefficient quantities of public goods. The presence of public goods creates another important type of market failure.

In this chapter, we will first provide some examples of public goods. Next we will describe a simple model of public goods.
The model makes it clear why private market provision of a public good is inefficient. That is, it makes clear why public goods result in market failure. Then we will turn to the Samuelson optimality condition, the condition that must hold for the quantity of a public good to be Pareto optimal or efficient. After that we will discuss the free rider problem: the problem of consumer i’s taking advantage of consumer j’s decision to produce some of the public good, which, since it is available for i to use, causes i to take a free ride on j’s good citizenship. In the same section, we will discuss provision of the public good through voluntary contributions. Finally, we will close the chapter with some possible solutions to the problem of efficient provision of public goods. The solutions include Wicksell/Lindahl taxes and demand-revealing taxes.

18.2 Examples of Public Goods

Public goods create externalities on steroids. If person i is consuming the public good, then it is available for consumption by everyone. We have already mentioned one example, a judicial system. Here are some other examples:

1. National defense. For the people in one country, national defense is a public good. If the armed forces are protecting i from the threat of attack by dangerous outsiders, then they are also protecting j from that attack. Of course, if i lives in one country and j lives in another, national defense in i’s country may not be viewed as a public good by j. In short, national defense is an excellent example of a public good in the sense that it is truly non-exclusive in use, but one has to be careful about the identity of the group that is using it.

2. Lighting on a city street. City streets are often brightly lit at night to deter crime. If i and j live on the street and the street is lit up for i, then it is lit up for j as well. The light is clearly non-exclusive in use.

3. Fire departments.
If a city has a fire department to fight fires, it will almost surely try as hard to put out a fire on i’s property as on j’s property. It would be impractical to provide fire fighting services to one person in the city without providing the same services to others.

4. Broadcast television. In days gone by, television was broadcast from towers, and anyone within range of the broadcast signal could set up an antenna to capture that signal and watch programs on her TV. If the signal was in the air for i, it was also in the air for j. (This is still how television is delivered to a small fraction of its audience in the U.S.) Television stations in the U.S. didn’t charge for delivery of the signal to watchers; their revenue came from the sale of advertisements.

Broadcast television is interesting because it shows how public goods might be transformed into private goods by excluding potential users. In Great Britain, BBC television has historically charged a television license fee that households with a TV must pay. Therefore if i and j are British BBC watchers, and if the signal is in the air for i, it is also in the air for j (non-exclusivity in use). However if either i or j watches TV without having paid the licensing fee, he or she is fined (producing exclusivity in use). Moreover, broadcast television has now been largely replaced by cable television, or other methods of delivery of television signal via paid subscriber services. If you are a cable TV subscriber, there is nothing public about the delivery of your TV service to your house; you pay a monthly fee if you want the service, and if you stop paying, you are shut off. Satellite television is carried on a signal broadcast from a satellite; everybody in your area gets the signal, but you must pay a monthly fee to get a box to unscramble it.

5. Public parks, public libraries, public monuments. If the Grand Canyon National Park is there for i, it is also there for j.
If the New York Public Library is there for one New Yorker, it is there for all; if the Eiffel Tower is available for your viewing pleasure, it is available for all of us. Note, however, that things like parks, libraries, and monuments may have the potential of privatization if access can be restricted. Public parks and libraries might charge for access, for example. And of course there are many private parks (like Disney World), and many private libraries.

6. Public education, public health. In modern countries, an educated population is considered generally beneficial, a good thing for society as a whole. Therefore education is commonly provided by the government, at least education through secondary school. A similar comment can be made about medical care. But note that education and medical care were historically privately provided, and both could be privatized now. Education and medical services can be restricted and made contingent on fees, as college students in the U.S. know. Both education and medical care are what we might call quasi-public goods, goods whose provision is largely public and non-exclusive, but is not necessarily so.

7. Scientific and technological knowledge. Knowledge in the public domain is an exceptionally important type of public good, and it is really public in the sense that if the laws of physics and chemistry (for example) are available and known to i, they are also available and known to j. The significant exception to non-exclusivity of use of knowledge is created by patent law, which privatizes some knowledge for a period of time. (For example, the use of the formula for a patented drug is exclusive to the owner of the patent, but only for the limited life of the patent.)

In the next section of this chapter, we will describe a simple model of public goods. This model makes it clear why private market provision of a public good is inefficient.
18.3 A Simple Model of an Economy with a Public Good

We assume there are only two goods: a public good and a private good. The quantity of the public good is x. Think of this as a composite of things that are non-exclusive in use, like public parks, highways, schools, courts, and armed forces. Rather than defining units of the public good and discussing a production function for that good, we will measure it by expenditure in dollars. So one unit of the public good is one dollar spent on this composite of parks, etc. That is, the price of the public good is $1 per unit.

We assume that whatever quantity of the public good is available, is available to all. Therefore x will enter every consumer’s utility function, although different consumers will value it differently. Generally the consumers like the public good, but some may be indifferent to it and some may actually dislike it.

The private good is also a composite, but of things that are exclusive in use, like food, clothing, housing, and so on. It is private in the sense that person i’s consumption of that good benefits person i alone. We will let \(y_i\) represent person i’s consumption of the private good. Person i’s private good consumption enters i’s utility function, but no other utility function. We will measure the private good in dollar units also, so the price of the private good is also $1 per unit.

Production in this simple model involves transforming a private good into a public good, or vice versa. We will assume for simplicity that a unit of private good can be transformed by firms into a unit of public good, or vice versa. This means an additional unit of either the public good or the private good costs $1 to produce; that is, both marginal costs equal $1. The firms doing the transforming are competitive and make zero profits.

We will assume there are n people. Person i’s income is \(M_i\) and his utility function is \(u_i(x, y_i)\).
To make our analysis simpler, we will assume our consumers’ utility functions satisfy the special property of quasilinearity. The reader may recall this assumption was made in the context of the discussion of consumers’ surplus, back in Chapter 7. We say our consumers have quasilinear preferences if their utility functions can be written as

\[ u_i(x, y_i) = v_i(x) + y_i. \]

Note that under the assumption of quasilinearity, each person’s private good consumption \(y_i\) enters his utility function as a simple additive term. The function \(v_i(x)\) is called i’s benefit from the public good, and its derivative is his marginal benefit from the public good, abbreviated \(MB_i(x)\). We assume that \(v_i(x)\) is an increasing and concave function.

Let us turn to consumer i’s budget. We will assume for now that person i buys his consumption of the public good; that is, he pays for it himself just as he pays for his consumption of the private good. Of course this assumption is a little strange, because if i is buying x units of the public good, that public good is then available to everybody. Nonetheless we make the assumption, because it is important to see what happens when public goods are privately provided by the market. Since we are assuming that both goods are measured in dollar units, if person i is buying x units of the public good, and if he is consuming \(y_i\) units of private good, then his budget constraint is

\[ x + y_i = M_i. \]

In short, we are now assuming that person i solves the following problem:

\[ \max \; u_i(x, y_i) = v_i(x) + y_i \quad \text{subject to} \quad x + y_i = M_i. \]

Consumer i’s indifference curves are convex under our assumptions about the \(v_i(x)\) functions. To maximize his utility subject to his budget constraint, our consumer finds the point on his budget line where the marginal rate of substitution of the private good for the public good equals the price ratio, which is 1 given our definitions of the units. That is,

\[ MRS_{x,y_i} = \frac{dv_i(x)/dx}{dy_i/dy_i} = \frac{v_i'(x)}{1} = MB_i(x) = 1. \]
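As a concrete illustration of the condition \(MB_i(x) = 1\), here is a minimal numerical sketch. The benefit function \(v(x) = 4\sqrt{x}\), the income of 10, and the function name `public_good_demand` are assumptions chosen for the example; they do not come from the model itself. The code finds the quantity at which marginal benefit equals the price of 1 by bisection and then reads private consumption off the budget constraint.

```python
import math


def public_good_demand(marginal_benefit, income, tol=1e-9):
    """Return (x, y) with marginal_benefit(x) = 1 and x + y = income.

    Assumes marginal_benefit is decreasing (v is concave). If marginal
    benefit is still at least 1 when all income is spent on the public
    good, or already at most 1 near zero, the corner bundle is returned.
    """
    lo, hi = tol, income
    if marginal_benefit(hi) >= 1:
        return income, 0.0
    if marginal_benefit(lo) <= 1:
        return 0.0, income
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if marginal_benefit(mid) > 1:
            lo = mid   # marginal benefit still above the price: buy more
        else:
            hi = mid
    x = (lo + hi) / 2
    return x, income - x


# Assumed benefit function v(x) = 4 * sqrt(x), so MB(x) = 2 / sqrt(x).
x, y = public_good_demand(lambda q: 2 / math.sqrt(q), income=10.0)
print(x, y)
```

With this assumed benefit function the condition \(2/\sqrt{x} = 1\) can also be solved by hand, which gives a check on the bisection.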
Note that the assumption of quasilinearity implies that the marginal rate of substitution of the private good for the public good equals the marginal benefit of the public good, which depends only on x. This makes our public good analysis relatively simple.

Figure 18.1 below shows how a consumer chooses his utility-maximizing bundle if he is buying quantities of both the public good and the private good. Note that the figure ignores possible complications created by other consumers’ having already bought and provided some public good. We will face those complications shortly.

INSERT FIGURE 18.1 HERE

Caption of Fig. 18.1: How consumer i maximizes his utility, when he is buying both the public good x and his own private good \(y_i\). He chooses x where his marginal benefit from the public good equals 1.

We will now let \((x_i^M, y_i^M)\) represent the market quantities of the public good and the private good that consumer i buys when he maximizes his utility subject to his budget constraint, as described above. The subscript i on the quantity x needs some explanation, since x is the public good. We simply mean that \(x_i^M\) is the quantity of the public good that i buys and pays for. But the good is still public, and consumed by everybody. We let \(x^M = x_1^M + x_2^M + \ldots + x_n^M\) represent the total amount of public good, bought by all the consumers, when each consumer is paying the full cost of each unit of public good he is buying.

A market equilibrium, or private market equilibrium, or private market outcome, is what results when the various consumers do what we have described above. That is, \((x^M, y_1^M, y_2^M, \ldots, y_n^M)\) is a market equilibrium. We will let \((x^*, y_1^*, y_2^*, \ldots, y_n^*)\) represent an efficient combination of public and private goods.

We claim that the market equilibrium cannot be efficient. In order to explain why, let’s assume that there are only two consumers, with identical tastes and identical income levels.
Our argument is illustrated in Figure 18.2 below. The figure shows consumer 1’s budget line, and two indifference curves, which could belong to either consumer since we have assumed they have the same tastes. Assume that consumer 1 goes through the exercise above, decides that \((x_1^M, y_1^M)\) maximizes his utility subject to his budget constraint, and buys those quantities. Note that \((x_1^M, y_1^M)\) is a tangency point of consumer 1’s budget line with an indifference curve, and that \(x_1^M\) is the quantity of the public good for which consumer 1’s marginal benefit equals 1.

Now consumer 2 steps up, and thinks about what to do. He sees that consumer 1 has already bought and paid for \(x_1^M\) units of the public good, which are available for him to use because \(x\) is public. In other words, consumer 2 is getting \(x_1^M\) units of the public good free of charge, thanks to consumer 1. If he wanted to, he could buy an additional amount of the public good, call it \(x_2^M\), but he would have to pay $1 per unit for any additional amount. His budget constraint therefore is not a standard straight line with slope \(-1\). If he spent all his money \(M_2\) on private consumption, he could still consume up to \(x_1^M\) units of the public good, because it is there thanks to consumer 1. So the first part of his budget line is a horizontal line, starting at \(M_2\) on the vertical axis, and extending out to \(x_1^M\). The second part of his budget line starts at the point \((x_1^M, M_2)\) and extends toward the right. On this part the budget line is a straight line with slope \(-1\), because if he wants to consume more public good than \(x_1^M\), he has to consume less private good, and the tradeoff is a unit of private good per additional unit of public good. Consumer 2’s budget line is shown in Figure 18.2 as the dashed line, with a horizontal section at the top, and then a straight line section with slope \(-1\).

Now let’s consider this question: how much additional public good does consumer 2 want to buy?
If he wants to be at some point on the downward-sloping part of his budget line, he must look for a point of tangency between an indifference curve and the budget line, which means he looks for a point where \(MB_2(x) = 1\). But we assumed consumer 2’s utility function was the same as consumer 1’s. Therefore the marginal benefit functions are the same: \(MB_2(x) = MB_1(x)\). Therefore if \(MB_1(x) = 1\) at \(x_1^M\), \(MB_2(x)\) also equals 1 at \(x_1^M\). In short, consumer 2 is happy with \(x = x_1^M\), and he wants to add nothing to it; that is, he chooses \(x_2^M = 0\). Note that Figure 18.2 shows an indifference curve for consumer 2 touching the downward-sloping part of 2’s budget line, at the point \((x_1^M, M_2)\). (Saying that consumer 2’s indifference curve is tangent to the budget line at the point \((x_1^M, M_2)\) would be slightly inaccurate, because the budget line has a kink at that point, making its slope undefined. However, \((x_1^M, M_2)\) is clearly the point that maximizes consumer 2’s utility subject to his budget constraint.)

INSERT FIGURE 18.2 HERE

Caption of Fig. 18.2: There are two consumers with identical preferences and incomes. Consumer 1 moves first, and buys some of the public good. Consumer 2 moves second, takes advantage of consumer 1’s having bought \(x_1^M\), and buys no additional public good.

We now have the private market equilibrium. Consumer 1 purchases \((x_1^M, y_1^M)\) as shown in Figure 18.2; consumer 2 purchases \((x_2^M = 0, \; y_2^M = M_2)\) and consumes \((x_1^M, y_2^M)\) as shown in the figure. The total amount of the public good produced and consumed by the two consumers is \(x^M = x_1^M + x_2^M = x_1^M\), also shown in the figure.

Let us consider the market equilibrium further. Consumer 1’s marginal benefit from the public good at his consumption point \((x^M, y_1^M)\) is 1. Consumer 2’s marginal benefit from the public good at his consumption point \((x^M, y_2^M)\) is also 1. Adding the two marginal benefits together, we have

\[ MB_1(x^M) + MB_2(x^M) = 2. \]
But this cannot be efficient (that is, Pareto optimal), because the sum of the marginal benefits is 2, while the cost of an additional unit of the public good is only 1. The consumers could easily rearrange things so that both would be better off. For instance, each consumer could contribute $0.50 to pay for one additional unit of \(x\). The result would be a change in consumer 1’s utility; it would go up by 1.0 because of the increase in \(x\) and simultaneously fall by 0.5 because of the decrease in private good consumption. The net change in \(u_1\) would be \(1.0 - 0.5 = 0.5\). Similarly, consumer 2’s utility would rise by 1.0 because of the increase in \(x\) and simultaneously fall by 0.5 because of his decrease in private good consumption. The net change in \(u_2\) would be \(1.0 - 0.5 = 0.5\). Since we have shown a way to make both consumers better off, \((x^M, y_1^M, y_2^M)\) is not Pareto optimal.

18.4 The Samuelson Optimality Condition

Let us formalize the optimality ideas we used at the end of the last section. Suppose again that there are n people with quasilinear preferences in our public good economy. As a group, they are choosing \((x, y_1, y_2, \ldots, y_n)\) subject to the constraint that the total cost of \((x, y_1, y_2, \ldots, y_n)\) must not exceed the total amount of money they start with, \(M_1 + M_2 + \ldots + M_n\).

Under the quasilinearity assumption, \(u_i(x, y_i) = v_i(x) + y_i\). Since the term \(y_i\) appears in every consumer’s utility function, for every consumer, an extra dollar is equivalent to an extra unit of the private good, and equivalent to an extra unit of utility. Therefore we can legitimately add together the utilities of the n consumers, much as we added together the utilities (or willingnesses-to-pay) of n consumers when we discussed consumers’ surplus. It follows that \((x, y_1, y_2, \ldots, y_n)\) is Pareto optimal if (and only if) it maximizes

\[ u_1(x, y_1) + u_2(x, y_2) + \ldots + u_n(x, y_n) = v_1(x) + y_1 + v_2(x) + y_2 + \ldots + v_n(x) + y_n \]

subject to society’s overall budget constraint

\[ x + y_1 + y_2 + \ldots + y_n \le M_1 + M_2 + \ldots + M_n. \]

We can clearly replace the less-than-or-equal-to sign in the constraint with an equal sign. Therefore Pareto optimality requires that society maximize

\[ v_1(x) + v_2(x) + \ldots + v_n(x) + y_1 + y_2 + \ldots + y_n \]

subject to

\[ y_1 + y_2 + \ldots + y_n = -x + M_1 + M_2 + \ldots + M_n. \]

Noting that \(M_1 + M_2 + \ldots + M_n\) is a constant and can therefore be ignored in the maximization problem, and substituting from the constraint, we conclude that the condition for Pareto optimality in our public good model is that

\[ v_1(x) + v_2(x) + \ldots + v_n(x) - x \]

must be maximized.

The function \(v_1(x) + v_2(x) + \ldots + v_n(x) - x\) has a simple and intuitive interpretation. It is the total benefit of the public good, to all consumers in the economy, net of its cost. We’ll call it the total net benefit from the public good for short. In our public good model, optimality means maximization of total net benefit from the public good.

We conclude with an efficiency condition called the Samuelson optimality condition, after the great 20th century American economist Paul Samuelson (1915-2009). Maximizing the total net benefit function above gives the following necessary condition:

\[ v_1'(x) + v_2'(x) + \ldots + v_n'(x) = 1, \]

or

\[ MB_1(x) + MB_2(x) + \ldots + MB_n(x) = 1. \]

In words, the sum of the marginal benefits from the public good must equal the marginal cost. This is the Samuelson optimality condition.

To understand the intuition of the Samuelson condition think of this example. Suppose there are 100 people, and suppose an extra unit of the public good costs $1. Suppose every person would get a 2 cent benefit from another unit of the public good. Then more of the public good should be produced, because the cost of another unit is only $1, but the total benefit to society from another unit is $2.

Our market equilibrium of the last section failed the Samuelson optimality condition. That is why it was not efficient.
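As a hedged numerical illustration of the Samuelson condition, the sketch below assumes logarithmic benefit functions \(v_i(x) = a_i \ln x\), so that \(MB_i(x) = a_i / x\). This is the same functional form used in the solved problem of Section 18.7, but the weights `a` and the helper names are illustrative assumptions. The code solves \(MB_1(x) + \ldots + MB_n(x) = 1\) by bisection and compares the result with the quantity a single buyer would choose when paying the full price himself.

```python
from typing import Callable, Sequence


def solve_decreasing(f: Callable[[float], float], target: float,
                     lo: float = 1e-9, hi: float = 1e6,
                     tol: float = 1e-10) -> float:
    """Bisection for f(x) = target, assuming f is continuous and
    strictly decreasing on [lo, hi] with f(lo) > target > f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid   # f is still too high, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0


def samuelson_quantity(a: Sequence[float]) -> float:
    """Quantity where the SUM of marginal benefits a_i / x equals the marginal cost of 1."""
    return solve_decreasing(lambda x: sum(a_i / x for a_i in a), 1.0)


def private_quantity(a_i: float) -> float:
    """Quantity where ONE buyer's own marginal benefit a_i / x equals the price of 1."""
    return solve_decreasing(lambda x: a_i / x, 1.0)


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]                    # hypothetical benefit weights
    x_star = samuelson_quantity(a)         # efficient quantity
    x_market = private_quantity(max(a))    # what the keenest buyer purchases alone
    print(f"efficient x* ~ {x_star:.4f}, privately purchased x ~ {x_market:.4f}")
```

With these assumed weights the efficient quantity is about 6 while the private purchase is about 3, so the privately provided quantity falls short of the efficient one, as the argument above predicts.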
To wrap up this section, let’s think about the intuitive contrast between the market provision of a private good and the market provision of a public good. Whether he is buying a private good or a public good on the market, a consumer will buy it up to the point where the marginal benefit to him from the good equals its price, $1 in our model. If the good is private, this makes sense, because the consumer doing the buying is the only one getting that benefit; from a social perspective, the benefit of that last unit just equals $1. But if our consumer is buying another unit of the public good, it creates a marginal benefit for him and for others as well. Therefore, from the social perspective, that last unit of the public good purchased by one consumer results in an aggregate benefit that far exceeds its cost. This means that if a public good is bought on a private market like a private good, the market equilibrium quantity of the good will be too low.

18.5 The “Free Rider” Problem and Voluntary Contribution Mechanisms

In Section 18.3 above we presented an example to show that the standard market mechanism results in a non-optimal or inefficient amount of public good. In our example, we had two identical consumers. Consumer 1 went first and bought \(x_1^M\) units of the public good. Consumer 2 went second, and he bought no public good at all. He saw that consumer 1 had already bought \(x_1^M\), which was there for consumer 2 to enjoy because the good is public. In other words, consumer 2 took a free ride on consumer 1’s prior purchase of the public good. This was an example of a widespread problem called the free rider problem.

The equilibrium we described in that section was slightly special because it obviously depended on which consumer moved first. In this section we will tell a somewhat different story about the provision of the public good, a story that is slightly more general, and more in line with the game theoretic Nash equilibrium concept.
However, the new story has the same moral as the old; the provision of public goods creates free riders, resulting in non-optimality or inefficiency. This is the story of a voluntary contribution mechanism to supply the public good.

We will again assume there are two consumers with quasilinear preferences. We now allow that their \(v_i(x)\) functions and their income levels \(M_1\) and \(M_2\) might be different. We assume as before that the public good and the private good can be transformed into each other at a rate of 1 to 1, and that the price of a unit of public good or a unit of private good is $1.

The consumers buy and consume the private good as before. Now, however, instead of going to the market and buying units of the public good, they are brought together in a room, and they are asked by a kind intermediary named Mr. Rational Fundraiser to make a voluntary contribution of money to be used to purchase the public good. Let \(x_1\) represent consumer 1’s voluntary contribution, and let \(x_2\) represent consumer 2’s voluntary contribution. Mr. Fundraiser will take \(x_1 + x_2\) to the market and buy \(x = x_1 + x_2\) units of the public good, to be used by consumers 1 and 2. Mr. Fundraiser tells them: “I want you to be virtuous but I know you are rational. Therefore, when you decide on your contribution \(x_i\), choose it in anticipation of \(x_j\), and choose it in a way that maximizes your utility subject to what you think \(x_j\) will be.”

An equilibrium of the voluntary contribution mechanism is a pair of contributions \((x_1, x_2)\) such that each consumer is maximizing his utility, given what the other is doing. That is, given what i anticipates j will do, \(x_i\) maximizes \(v_i(x_i + x_j) + y_i\) subject to i’s budget constraint \(x_i + y_i = M_i\). Note that this is now a game, similar to the games we discussed in Chapter 14 (also similar to the duopoly analysis of Chapter 13), and we will be looking for Nash equilibria of the game.

Now let’s focus on consumer 1.
He chooses \(x_1\) to maximize his utility \(u_1(x, y_1) = v_1(x_1 + x_2) + y_1\), subject to his budget constraint \(x_1 + y_1 = M_1\), and in anticipation of some contribution \(x_2\) by consumer 2. He can easily solve this maximization problem, by using the budget constraint to substitute \(y_1 = M_1 - x_1\) into the function he is maximizing. Also, he can ignore \(M_1\), which is a constant. This means he wants to maximize \(v_1(x_1 + x_2) - x_1\). He treats \(x_2\) as a constant, differentiates, and sets the derivative equal to zero. This gives

\[ v_1'(x_1 + x_2) = 1. \]

Consumer 1 uses this equation to solve for his \(x_1\), contingent on the \(x_2\) anticipated from consumer 2.

Now we move on to consumer 2. He goes through a process very much like what we described for consumer 1 above. He ends up with an analogous equation,

\[ v_2'(x_1 + x_2) = 1. \]

He uses this equation to solve for his \(x_2\), contingent on the \(x_1\) anticipated from consumer 1.

Let’s define an equilibrium of the voluntary contribution mechanism. We say that \((x^e, x_1^e, x_2^e)\) is an equilibrium if \(x^e\) is the sum of \(x_1^e\) and \(x_2^e\); if \(x_1^e\) is the \(x_1\) that maximizes \(v_1(x_1 + x_2^e) - x_1\); and if \(x_2^e\) is the \(x_2\) that maximizes \(v_2(x_1^e + x_2) - x_2\). We now see that \((x^e, x_1^e, x_2^e)\) is an equilibrium if and only if the following hold:

1. \(x^e = x_1^e + x_2^e\),
2. \(v_1'(x_1^e + x_2^e) = 1\),
3. \(v_2'(x_1^e + x_2^e) = 1\).

In the special case where the marginal benefit functions are the same, so \(v_1'(x_1 + x_2) = v_2'(x_1 + x_2) = v'(x_1 + x_2)\), it is easy to find all the voluntary contribution mechanism equilibria (of which there are many). First, find the \(x^e\) for which \(v'(x^e) = 1\). Then let \(x_1^e\) and \(x_2^e\) be any pair of contributions that sum up to \(x^e\). They are all voluntary contribution mechanism equilibria.

But whether it is the special case or the more general case, we can be sure of one thing. An equilibrium of this mechanism is not Pareto optimal. It obviously fails the Samuelson optimality condition, because

\[ v_1'(x_1^e + x_2^e) + v_2'(x_1^e + x_2^e) = 1 + 1 = 2 > 1. \]
Voluntary contributions as we have described them result in too little of the public good. Each consumer i is taking a free ride on the anticipated contribution of the other consumer j, although perhaps not as big a free ride as consumer 2 took on consumer 1’s public good purchase in Section 18.3 above.

18.6 How To Get Efficiency in Economies With Public Goods

We have seen that public goods may create market failure. A private market does not produce enough of the public good. Nor does a system of voluntary contributions, if consumers are acting rationally. Here are some possible solutions to the problem of market failure resulting from public goods:

Command policies. If the government has enough information, and if it is interested in Pareto optimality, it can calculate the optimal quantity of the public good \(x^*\), and then arbitrarily impose taxes on the various consumers, adding up to \(x^*\), in order to pay for it. This may sometimes be possible; however, it will not work if the government lacks information about the marginal benefit functions (or more generally, about the individuals’ marginal rates of substitution of the private good for the public good). It will not work if the government imposes the taxes in some objectionable, offensive, unfair, or discriminatory way. It will not work if the government simply does not care about efficiency. Moreover, command policies may be bad simply because intelligent people don’t like to be told “Uncle Sam (or Uncle Vladimir) has decided that our society needs \(x^*\) units of armed forces (or scientific research or public schools or street lights), so you need to pay us $10,000 (or 10,000 rubles).”

Wicksell/Lindahl taxes and the Lindahl equilibrium. The government might set up a lovely taxation scheme for financing the public good, one that was developed early in the 20th century by Swedish economists Knut Wicksell (1851-1926) and Erik Lindahl (1891-1960).
One of Knut Wicksell’s important contributions to economics was to develop the principle of just taxes. As long as there have been taxes, people have grappled with the issues of who should pay, and how much should they pay. Loosely speaking, there are three main schools of thought.

(1) The first school believes that taxes should be calculated as a more-or-less flat percentage of income (or wealth). Making taxes proportional to income (or wealth) seems fair and simple. This position dates at least to the time of the Old Testament and tithing.

(2) The second school believes that taxes should be based on the payer’s ability to pay. This implies progressive taxes, with the wealthy paying a higher proportion of their income than the poor. This is the position of utilitarians, who believe that the marginal utility of income declines as income increases. Therefore if two taxpayers are to make the same sacrifice in utility when they pay their taxes, the rich man has to pay a higher proportion of his income than the poor man.

(3) The third school believes in taxation according to benefit. According to this principle, it is unjust to tax Ms. i to pay for something that she doesn’t care about and may not want. It is just to make Mr. j pay a high tax for what is very useful, beneficial, or profitable to him. This is the position that Knut Wicksell took; the taxes a person has to pay should be based on the benefit he receives from the public good.

We now turn to a system of taxation to finance the public good based on Wicksell’s principle of taxation according to benefit. More precisely, we will use the principle of taxation according to marginal benefit. We return to our model with n consumers, and we continue to assume quasilinear preferences. Person i’s utility function is \(u_i(x, y_i) = v_i(x) + y_i\). The government collects taxes and uses what it collects to pay for the public good.
We assume that the government initially decides on a list of tax shares, one for each consumer. Consumer i’s tax share, written \(t_i\), is the fraction of the total cost of the public good that will be placed on i. The government’s revenue must equal what it spends on the public good; so we require that \(t_1 + t_2 + \ldots + t_n = 1\).

We assume the government sends each consumer i the following message: “Dear Mr./Ms. i. We are going to provide the public good. We will tax everybody to pay the cost. Your tax share, that is, your share of the total cost, is going to be \(t_i\). In other words, if we end up providing \(x\) units of the public good, at a cost of \(x\) dollars, you will be billed \(t_i x\). Please tell us: Given this policy, how much of the public good do you want us to provide?”

Mr. i gets the message, sits, and thinks. He can think about this in an honest and straightforward way, or a devious and not-so-straightforward way. We’ll get to the not-so-straightforward way later; here is his honest and straightforward line of thought. He wants to maximize his utility subject to his budget constraint. Suppose the government produces the quantity that he tells them. Call it \(x\). His utility is \(v_i(x) + y_i\) and his budget constraint is now \(t_i x + y_i = M_i\), given what the government has told him about how he will be taxed. Using his budget constraint, he substitutes for \(y_i\) in his utility function, and he drops the \(M_i\) from his maximization problem since it is a constant. So he wants to maximize \(v_i(x) - t_i x\). He takes the derivative of this function and sets it equal to zero, which gives

\[ v_i'(x) = MB_i(x) = t_i. \]

He solves this equation for \(x\); we will call the solution \(x_i(t_i)\) since it is the public good output that Mr. i wants, and it is contingent on the tax share that the government assigned him. He now responds to the government, and tells them the quantity he wants, \(x_i(t_i)\). (He only reports the number, not the whole function.)
At this point, the government has a list of desired amounts of the public good, one for each person. They will most likely differ. If they do differ, the government sends out a new set of messages. In the transition from the old set of messages to the new set of messages, the government raises the tax shares of people who wanted a lot of the public good, and lowers the tax shares of people who wanted very little of it. The new messages say: “Dear Mr./Ms. i. Please disregard our last message. We have assigned you a new tax share. Use the new tax share, shown on this message, to recalculate the level of public good that you want. Then please tell us, once again, how much of the public good you think we should provide.”

This process goes on until everyone agrees on the quantity of the public good that should be produced. The result is called a Lindahl equilibrium. This is a set of tax shares \((t_1, t_2, \ldots, t_n)\), and a quantity of the public good \(x^e\), such that every consumer, when told his tax share is \(t_i\), reports back a desired public good quantity \(x_i(t_i)\), and they all agree:

\[ x_1(t_1) = x_2(t_2) = \ldots = x_n(t_n) = x^e. \]

The Wicksell/Lindahl scheme we have described has two virtues. First, it bases a person’s tax on his marginal benefit. If i gets a lot of utility from the marginal unit of the public good, his tax share is going to be high; if not, it’s going to be low. Second, at least as we have described it so far, with everybody reporting honestly, it results in an efficient or Pareto optimal level of output for the public good. Here is why. Each i calculated his desired quantity based on the equation \(MB_i(x) = t_i\). And they all got their desired quantities, because the definition of the Lindahl equilibrium requires that the desired quantities of the public good all agree. Therefore

\[ MB_1(x^e) + MB_2(x^e) + \ldots + MB_n(x^e) = t_1 + t_2 + \ldots + t_n = 1. \]

Therefore \(x^e\) satisfies the Samuelson optimality condition.
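A minimal sketch of a Lindahl equilibrium, assuming logarithmic benefits \(v_i(x) = a_i \ln x\) (an illustrative choice; the weights and function names are hypothetical). With this form, consumer i’s honest reply to tax share \(t_i\) solves \(a_i / x = t_i\), so \(x_i(t_i) = a_i / t_i\), and the replies agree exactly when the shares are proportional to the \(a_i\). The code uses that closed form rather than simulating the rounds of messages, and then checks the two properties claimed above: agreement on the quantity, and the Samuelson condition.

```python
from typing import List, Sequence, Tuple


def desired_quantity(a_i: float, t_i: float) -> float:
    """Honest reply x_i(t_i): solves MB_i(x) = a_i / x = t_i (requires t_i > 0)."""
    return a_i / t_i


def lindahl_equilibrium(a: Sequence[float]) -> Tuple[List[float], float]:
    """Tax shares and quantity at which all honest replies coincide.

    Assumes v_i(x) = a_i * ln(x) with every a_i > 0; for this form the
    equilibrium shares are t_i = a_i / sum(a).
    """
    total = sum(a)
    shares = [a_i / total for a_i in a]
    x_e = desired_quantity(a[0], shares[0])
    return shares, x_e


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]                      # hypothetical benefit weights
    shares, x_e = lindahl_equilibrium(a)
    replies = [desired_quantity(a_i, t_i) for a_i, t_i in zip(a, shares)]
    sum_mb = sum(a_i / x_e for a_i in a)     # should be close to 1
    print("tax shares:", shares)
    print("replies:", replies)
    print("sum of marginal benefits:", sum_mb)
```

For benefit functions without such a closed form, one would instead have to iterate on the tax shares as the messages in the text describe; this sketch does not attempt that.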
In Figure 18.3 below, we show how a Lindahl equilibrium looks in a graph when there are only two people. This graph is slightly different from what we are used to. The horizontal axis shows both \(t_1\) and \(t_2\). This axis is a line segment 1 unit long. A point on the line shows \(t_1\) and \(t_2\); we measure \(t_1\) as the distance from the left end of the line segment to the given point, and \(t_2\) as the distance from the right end of the line segment to the given point. The graph has two vertical axes, one at the left end of the horizontal line segment, and one at the right end. Both vertical axes show quantities of the public good. Consumer 1’s desired quantity of the public good \(x_1(t_1)\) can be read off the left vertical axis, and consumer 2’s desired quantity \(x_2(t_2)\) can be read off the right. Note, however, that the height of any point represents a quantity of the public good, whether read off the left or the right vertical axis.

As \(t_i\) increases for either consumer, that consumer wants to see less of the public good produced, and so there are two downward sloping curves in the figure, representing the functions \(x_i(t_i)\) for \(i = 1, 2\). These can be interpreted as public good demand curves, contingent on tax shares. Consumer 1’s curve looks downward sloping as it should, but consumer 2’s curve may confuse the reader because it looks upward sloping. The explanation is that \(t_2\) is measured from the right origin, and so \(t_2\) increases as you move left in the figure. Note that the public good demand curves may be asymptotic to their respective axes, because a consumer whose tax share is zero may want an infinite amount of the public good. Note also that the height of the point where i’s demand curve reaches j’s vertical axis shows the amount of the public good that i wants when \(t_i = 1\), which is analogous to i’s having to buy it (and pay 100 percent of the cost) on the private market.
The Lindahl equilibrium in Figure 18.3 is found by looking for the point where the two public good demand curves cross. For the tax shares \((t_1, t_2)\) at that point, the two demands are the same. The height of the crossing point is the Lindahl equilibrium public good output. This is the efficient public good output.

INSERT FIGURE 18.3 HERE

Caption of Fig. 18.3: A Lindahl equilibrium in an economy with two people.

We will conclude this discussion by turning to the fatal flaw in the Wicksell/Lindahl scheme. When we were describing Mr. i’s thought process after he was asked by the government to reveal his desired quantity of the public good, we said there was a straightforward way for him to think about the question, and a not-so-straightforward way. We then gave the straightforward line of thought. We now turn to the devious line of thought. We can see this vividly in Figure 18.3, and so we now assume there are just two people.

Suppose consumer 1 is a straight-arrow Boy Scout type who never lies. Suppose consumer 2 isn’t. Consumer 2 is willing to stretch the truth a little bit. When the government asks him how much public good he wants them to produce, contingent on any \(t_2\), he says: “Public good? I don’t care much for it. If you’re going to tax me for the lousy stuff (\(t_2 > 0\)), my answer is always zero. If it’s free to me (\(t_2 = 0\)), then produce whatever amount the straight-arrow wants, I don’t care.”

In Figure 18.3, this makes consumer 2’s demand curve for the public good a horizontal line at zero, for all \(t_2 > 0\), with a (discontinuous) jump up when \(t_2 = 0\). The new Lindahl equilibrium quantity is at \(x^e = x_1(1)\), which is inefficient. This is exactly the outcome we saw under the private market provision of the public good, with 1 moving first. The free rider problem strikes again, and with a vengeance.

Demand-revealing taxes. In the 1960s and 1970s, various American economists, including Edward H.
Clarke (1939-) and Theodore Groves (1942-), devised an ingenious scheme to provide the public good and to tax people to pay for it. This scheme was designed to (1) guarantee that the Pareto optimal quantity of the public good would be produced, (2) raise enough money to pay for that quantity, and (3) not create incentives for taxpayers to lie about their demands for the public good.

In the demand-revealing mechanism, or demand-revealing tax scheme, each consumer sends a message to the government. The message is his public good benefit function, \(v_i(x)\). Of course, if i’s message is false (like consumer 2’s message in our discussion of the Wicksell/Lindahl scheme above), the government will be tripped up and the mechanism will fail. The virtue of the demand-revealing scheme, however, is that the consumers will have no incentives to reply with false messages. Therefore they will all reveal their true demands, or more precisely, their true \(v_i(x)\) functions.

When the government gets all the \(v_i(x)\) functions, it proceeds with the exercise of maximizing total net benefit from the public good. It solves the Samuelson optimality condition, and obtains \(x^*\). Then it sends out tax bills. The tax bills are carefully crafted so that consumers do not have any incentives to lie about their \(v_i(x)\) functions. The bills have two parts, a main part which we will describe below, and a secondary part. We will only say this about the secondary part: it is designed to ensure that enough money is collected, and it creates no incentives for consumers to lie.

The main part of the tax bill says that consumer i must pay a dollar amount \(T_i\) which equals the total cost \(x^e\), minus the sum over everyone except i of the benefit \(v_j(x^e)\). That is,

\[ T_i = x^e - \sum_{j \neq i} v_j(x^e). \]

It is not difficult to prove, although we will not do so here, that with this tax assessment i has nothing to gain by lying to the government about his \(v_i(x)\) function.
Moreover, i gains nothing from lying whether the other people are telling the truth or not. In other words, telling the truth is a dominant strategy for the demand-revealing mechanism. Therefore the scheme truly is demand-revealing, or incentive compatible: it causes people to honestly reveal their benefit functions. It results in the optimal output of the public good, since the government uses the (true) \(v_i(x)\) functions along with the Samuelson optimality condition to find \(x^e\).

In short, the demand-revealing tax scheme is a clever mechanism for financing the public good. It should cause people to be truthful about their demands for, or benefits from, the public good, and it should result in Pareto optimality.

We will conclude this section with a numerical example. For the sake of simplicity, the example is discrete (either the public good is provided or not), so there are no real \(v_i(x)\) functions, and there certainly aren’t any derivatives of functions. The public good is a bridge. The government either builds it (\(x = 1\)) at a cost of $1000, or doesn’t build it (\(x = 0\)) at a cost of $0. There are five people. If the bridge is not built, they get no benefit from it. If it is built, each person gets a different amount of benefit, as follows:

\[ v_1(1) = 0, \quad v_2(1) = 500, \quad v_3(1) = 100, \quad v_4(1) = 200, \quad v_5(1) = 300. \]

With this discrete example, we cannot use the Samuelson optimality condition, but we can easily see that the way to maximize total net benefit from the public good is to build the bridge. (With the bridge, total net benefit is \(0 + 500 + 100 + 200 + 300 - 1000 = 100\); without the bridge, total net benefit is 0.)

Now recall how the demand-revealing scheme works. The government asks each person for his \(v_i(1)\). (\(v_i(0)\), or the value to i of no bridge, is obviously 0 for everybody, which the government knows.)
Each i knows exactly how \(T_i\) will be calculated, and that the \(T_i\)’s will be

\[ T_1 = -100, \quad T_2 = 400, \quad T_3 = 0, \quad T_4 = 100, \quad T_5 = 200, \]

providing all report their true \(v_i(1)\)’s.

Now we ask: is the mechanism demand-revealing? Or does someone have an incentive to lie? The reader should choose one of these five people at random and think about the following. Person i’s net benefit when the bridge is built is \(v_i(1) - T_i\). If i exaggerated the value of the bridge, would that change the government’s decision to build it? Would it change his tax \(T_i\)? Would he gain by exaggerating the value? If i said the bridge was worth less to him than it truly was, the government’s decision might or might not change. If the government’s decision does not change, would i’s tax \(T_i\) change? Would i gain from the lie? If the government’s decision does change, and the bridge was not built, would i gain from the lie? The reader is encouraged to do this mental exercise. If you do, you will quickly realize that no one can gain by reporting a false value for the bridge. That is the remarkable virtue of the demand-revealing tax scheme.

Unfortunately the demand-revealing scheme is too complex to be useful in most situations. There are related mechanisms used in the provision of private goods (particularly what are called second-price auctions) which are very important in some modern business applications. In a second-price auction, or Vickrey auction, named after Canadian economist William Vickrey (1914-1996), an object is put up for bids at an auction. It is sold to the highest bidder, but at the price bid by the second-highest bidder. Like the demand-revealing tax scheme we have discussed, the second-price auction is incentive compatible: it causes the auction bidders to reveal their true valuations. And it is much simpler and therefore much more usable than the demand-revealing scheme.
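Returning to the bridge example, the mental exercise can also be run by brute force. The sketch below is an illustration under stated assumptions, not the full mechanism: it implements only the main part of the tax bill (cost minus the sum of the other people’s reported benefits, charged only if the bridge is built), it builds the bridge whenever reported benefits at least cover the cost (ties broken in favor of building, which is an assumption), and it tries one person’s misreports on a coarse grid while the others report truthfully. All function names are hypothetical.

```python
from typing import List, Sequence

COST = 1000
TRUE_VALUES = [0, 500, 100, 200, 300]   # v_1(1), ..., v_5(1) from the example


def bridge_built(reports: Sequence[float], cost: float = COST) -> bool:
    """Build when total reported benefit covers the cost (tie -> build, an assumption)."""
    return sum(reports) >= cost


def main_tax(i: int, reports: Sequence[float], cost: float = COST) -> float:
    """Main part of i's bill: cost minus the others' reported benefits, if built."""
    if not bridge_built(reports, cost):
        return 0
    return cost - sum(r for j, r in enumerate(reports) if j != i)


def payoff(i: int, true_value: float, reports: Sequence[float],
           cost: float = COST) -> float:
    """Person i's net benefit v_i(1) - T_i if the bridge is built, else 0."""
    if not bridge_built(reports, cost):
        return 0
    return true_value - main_tax(i, reports, cost)


def best_gain_from_lying(i: int, true_values: Sequence[float] = TRUE_VALUES,
                         cost: float = COST) -> float:
    """Largest payoff improvement i can get from a misreport on a 0..2000 grid,
    holding the other reports at their true values."""
    honest = payoff(i, true_values[i], true_values, cost)
    best = honest
    for lie in range(0, 2001, 50):
        reports: List[float] = list(true_values)
        reports[i] = lie
        best = max(best, payoff(i, true_values[i], reports, cost))
    return best - honest


if __name__ == "__main__":
    print([main_tax(i, TRUE_VALUES) for i in range(5)])      # honest tax bills
    print([best_gain_from_lying(i) for i in range(5)])       # gains from lying
```

Under these assumptions the honest tax bills reproduce the \(T_i\) listed above, and no misreport on the grid raises anyone’s payoff. The main parts add up to 600, less than the $1000 cost, which is the kind of gap the secondary part of the tax bill is meant to cover.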
In short, the demand-revealing scheme is a cousin of the second-price auction, but it’s the unpopular cousin.

18.7 A Solved Problem

The Problem

There are three people who consume public and private goods. As usual, the public good is \(x\), and \(y_i\) represents person i’s consumption of the private good. The prices of both the public good and the private good are $1 per unit. The initial endowments of the private good are \((M_1, M_2, M_3) = (10, 10, 10)\). The three people have the following utility functions:

\[ u_1(x, y_1) = \ln x + y_1, \]
\[ u_2(x, y_2) = 2 \ln x + y_2, \]
\[ u_3(x, y_3) = 3 \ln x + y_3. \]

(a) Assume that the public good is privately purchased, and that person 3 is the first to go to the market and buy the public good. Assume he does not act strategically; he ignores persons 1 and 2 when he buys \(x\), and thinks only of his own utility maximization problem. What is the outcome? How much of the public good does person 3 buy? How much do persons 1 and 2 buy?

(b) Use the Samuelson optimality condition to find the Pareto optimal quantity of the public good \(x^*\).

(c) Describe the Lindahl equilibrium.

(d) Show how any of the three people could gain by misrepresenting his preferences to the government.

The Solution

(a) Person 3 maximizes \(u_3(x, y_3) = 3 \ln x + y_3\), subject to his budget constraint \(x + y_3 = 10\). Using the budget constraint to substitute for \(y_3\), and ignoring the constant in the function being maximized, gives

\[ u_3(x) = 3 \ln x - x. \]

Differentiating and setting the result equal to zero gives \(3/x - 1 = 0\), so \(x = 3\). So person 3 buys 3 units of \(x\) with his own money; but the public good is public and available for all.

Now persons 1 and 2 have to decide whether or not to buy any additional amounts of it. The marginal benefit function for person 1 is \(MB_1(x) = 1/x\). At \(x = 3\), the marginal benefit to person 1 of the public good is 1/3. But the price is $1, and so person 1 buys no additional public good. The same argument holds for person 2, whose marginal benefit at \(x = 3\) is 2/3. The market result is therefore

\[ (x_1^M, x_2^M, x_3^M) = (0, 0, 3). \]
Three units of the public good are purchased, all by person 3; persons 1 and 2 are free riders.

(b) The Samuelson optimality condition says that the sum of the marginal benefits from the public good must equal the marginal cost. In our example, the marginal benefit functions are MB1(x) = 1/x, MB2(x) = 2/x, and MB3(x) = 3/x. Therefore the optimality condition says

\[ \frac{1}{x} + \frac{2}{x} + \frac{3}{x} = 1. \]

This gives 6/x = 1, or x∗ = 6.

(c) Assuming that no one is misrepresenting his preferences, at the Lindahl equilibrium, person i’s tax share ti ends up equal to his marginal benefit from the public good. That is, MBi(x) = ti. Also, assuming that everybody is honest, at the Lindahl equilibrium, the public good quantity is efficient, x∗ = 6. Using the marginal benefit functions above, and setting x = 6, gives t1 = 1/6, t2 = 2/6, and t3 = 3/6. With these tax shares, each person wants the government to provide 6 units of the public good. Person 1 pays t1x = 1; person 2 pays t2x = 2; and person 3 pays t3x = 3. The government collects taxes totaling 6 units of the private good, and produces the 6 units of the public good that everybody wants.

(d) Suppose person 1 claims that he always gets zero marginal benefit from the public good; for any t1 > 0, he wants x = 0. But continue to assume that persons 2 and 3 report honestly. The government then finds a Lindahl equilibrium where the tax shares are (t1, t2, t3) = (0, 2/5, 3/5); and it produces 5 units of the public good. Person 1’s tax bill is now zero. Person 1’s utility is now u1(x, y1) = ln 5 + y1 = ln 5 + 10 = 11.61. When he was honest in part (c) above, his utility was u1(x, y1) = ln 6 + y1 = ln 6 + 10 − t1x = ln 6 + 10 − 1 = 10.79 < 11.61. Therefore it pays for person 1 to lie.

Exercises

1. Fabio and Paolo are housemates. They are thinking of buying a TV that costs $300. Each of them would receive a $200 benefit from having the TV. If neither of them pays, the TV is not purchased.
There are a couple of ways they could make the purchase. They could simultaneously chip in $150 each, and buy the TV. Or, one housemate could pay $300 upfront, in the hope that his housemate would reimburse him $150. However, the housemates are irresponsible jerks, and in fact never reimburse each other.
(a) What two conditions must hold in order for the TV to be considered a public good?
(b) Draw the payoff matrix, and find all Nash equilibria.
Now suppose the World Cup is approaching, and they will each receive a $400 benefit from having the TV.
(c) Draw the new payoff matrix, and find all Nash equilibria.

2. Consider Fabio and Paolo again. Now assume each consumes two goods, movie streaming and beer. Assume for now that they watch their movies separately. They have identical utility functions: ui = √(xi yi), where xi is the amount of movie streaming for person i and yi the amount of beer consumed by person i. A unit of either good costs $1. Fabio has an income of $10 a week, and Paolo has an income of $20 a week.
(a) How much money does Fabio spend on movie streaming and on beer? How about Paolo? Calculate each of their utility levels.
After a month, Fabio and Paolo realize that they could share a movie streaming account instead of having two separate accounts, and watch their movies together, instead of separately. Movie streaming is now a public good, instead of a private good, i.e., their utility functions are now ui = √(x yi), for i = f, p, and x = xf + xp. The price of movie streaming is still $1 per unit.
(b) Suppose Fabio and Paolo agree to maintain their allocation from part (a). Calculate each of their utility levels and compare them to their utility levels in part (a).
(c) Is the allocation from part (a) Pareto optimal?
(d) Show that Pareto optimality requires that Fabio spend all his income on beer and that Paolo spend $10 on beer and $10 on movie streaming.

3. Three little pigs are looking to buy a house of stone.
Their utility functions are as follows: u1 = (3 + H)y1, u2 = (2 + H)y2, and u3 = (1 + H)y3, respectively, where H = 1 if they buy a stone house and H = 0 otherwise, and yi is the amount of money pig i spends on private consumption. They have each saved up M1, M2, and M3, respectively.
(a) What is the maximum amount each pig is willing to contribute towards buying a stone house?
(b) Suppose M1 = $2,800, M2 = $1,800, and M3 = $1,000. There is a house of stone on sale for $1,800. Do the pigs buy it?

4. There are 1,000 people in the village of Fasching. The villagers enjoy only two goods, festivals and food. Festivals are a public good while food is a private good. The villagers have identical preferences for the two goods, denoted by ui = √x/5 + yi, where x is euros spent on festivals and yi is euros spent on food.
(a) Calculate the marginal rate of substitution of food for festivals, MRSx,yi.
(b) What is the Pareto optimal amount of money spent on festivals?

5. Consider the voluntary contributions mechanism in the free rider section. Suppose that u1(x, y1) = √x + y1 and u2(x, y2) = 2√x + y2. Show that agent 2 will never free ride on agent 1. Can you interpret that?

6. The mayor of Keukenhof would like to build a new public park, if he can raise enough taxes to cover the cost, which is 1,000 euros. There are four families in the town. Each family’s benefit from a public park is as follows: v1 = 100, v2 = 200, v3 = 300, and v4 = 400. The mayor wants to use the demand-revealing tax mechanism described in section 6 above.
(a) If each family reports its true benefit, vi, how much will each family have to pay, Ti?
(b) Suppose family 4 misreports its benefit as 500, while the other families report their true benefits. Does the park get built? If so, how much does family 4 have to pay? Is family 4’s net benefit different from its net benefit in part (a)?
(c) Suppose family 4 misreports its benefit as 300, while the other families report their true benefits.
Does the park get built? If so, how much does family 4 have to pay? Is family 4’s net benefit different from its net benefit in part (a)?
(d) Show that the Samuelson optimality condition is satisfied.

19 Uncertainty and Expected Utility

19.1 Introduction and Examples

In most of this book we have assumed perfect information. That is, we have assumed that every buyer and every seller of every good and service in the market has complete information about all the relevant facts. All the buyers and all the sellers know the market prices, and they all know the characteristics of the things being bought and sold. This is a reasonable assumption in the markets for many goods, whose characteristics are easily observed. But it is often unreasonable. The following are some examples of things we buy and sell, or consume and pay for, whose properties or qualities may be quite uncertain. The buyers and sellers, or consumers and providers of these things face substantial uncertainty or randomness.

1. Used cars. When you buy a used car, you are very unsure about its condition. You may bring it to an independent mechanic to check it over, and you may check it out on CarFax, but you can never be sure that the previous owner changed the oil anytime in the last 25,000 miles. In other words, you might be lucky and get a beautifully maintained car with no mechanical problems, or you might get a lemon.

2. New cars. Of course you cannot be 100 percent sure about a new car either. Even if you are a thoughtful and careful consumer and you buy a Toyota Prius, you may have mechanical problems, perhaps even the much dreaded problem (in 2010, and probably exaggerated) of unintended acceleration.

3. College/university. You chose a university for an undergraduate degree. Your university is very expensive. When you choose which university (or universities) to apply to, how much do you know about it (or them)? Do you know what your major will be?
Do you know what you might learn in your classes? Do you know how many professors are interesting and informative, and how many are deadly dull and uninterested in teaching? Do you know what a blessing or what a curse your classmates and roommates might be?

4. Dangerous activities. You travel between home and school by car, or by train, or by plane. Each mode of travel creates some very small risk of a fatal accident. Do you know what the odds are? And if you do know, how does this knowledge enter into your utility maximization calculation and your decision process?

5. Life insurance and annuities. You fear you may die too young, and you want to buy a life insurance policy to protect your spouse and children in the event of your premature death. What kind of policy should you buy, and how much insurance should you have? Alternatively, you think you may live too long, and you may run out of savings before you go. You don’t want to be a burden to your children. You’ve heard you can buy insurance against this possibility also. Should you buy an annuity, and how big an annuity should you get?

6. Investments. You have some money that you are going to invest either in bank certificates of deposit (with minimal risk and also minimal reward), or in shares of stock (with considerable risk but greater rewards). What should you do?

7. Gambling. You can play a game in a casino or buy a state lottery ticket. The game or the state lottery requires that you put in $1, and in exchange, you have a certain chance of winning nothing, a chance of winning $50, a chance of winning $1,000, and a chance of winning $1 million. If you know the odds for each outcome, what do you do?

We could go on and on. The point is that many of the things we buy, or invest in, or receive from our governments, friends, and relatives, have a good measure of uncertainty attached to them.
Also, many things that we do increase the uncertainty in our lives, while other things reduce it. Sometimes we pay money to buy risk; at other times we pay money to avoid risk. The modeling we have done so far has ignored all of this. We will now turn to modeling that explicitly addresses uncertainty.

In the next two sections of this chapter we will lay out the basic model of consumer behavior under uncertainty, developed by John von Neumann and Oskar Morgenstern. This model gives a utility function that can be used to analyze uncertainty. Then we will turn to some examples, which show how the von Neumann-Morgenstern utility function approach can be applied to consumers who want to reduce or avoid risk, and to consumers who like risk and want to increase it. The examples also show how people can trade risk in ways which make everybody better off.

19.2 Von Neumann-Morgenstern Expected Utility: Preliminaries

In the initial chapters of this book, we modeled consumer choice, in what’s called the standard consumer choice model. In the standard model, the consumer chooses among bundles of goods, which have known quantities of various goods in them. The consumer knows what the goods are and how much (or how little) she likes them. There is nothing random or uncertain about such a bundle. The consumer is choosing among certain alternatives. We assume the consumer has complete and transitive preferences over these alternatives. We also assume monotonicity and convexity, to ensure that the indifference curves have the usual shape. Then we define the consumer’s utility function. The utility function is based on, and perfectly represents, the consumer’s preferences.

Utility functions in the standard consumer choice model are ordinal utility functions. For such functions, only relative values matter; saying ui(X) = 15, ui(Y) = 10, and ui(Z) = 5 only means that consumer i prefers X to Y to Z.
There is no intrinsic meaning to the utility numbers, and if the function ui represents i’s preferences, then any order-preserving transformation of ui works just as well to represent i’s preferences. For instance, if ui is always positive, then vi = √ui works just as well to represent i’s preferences as ui does.

We will now develop a model of consumer choice over alternatives which are random or uncertain. Based on our model of choice over uncertain alternatives, we will develop a new type of utility function. This new utility function will no longer be purely ordinal; it will have some properties that make it similar to (but certainly not identical to) the cardinal utility functions of 19th century utilitarian economists. Most importantly, it will provide a basis for the rigorous analysis of choice under uncertainty.

In order to proceed, we need to introduce a few concepts and terms from probability and statistics. These concepts and terms may be familiar to you if you have taken any kind of statistics or econometrics course. For this purpose, we start by considering a simple kind of gamble or game, with two possible outcomes. (In this chapter, we will use the word “game” in the usual vernacular fashion; this is actually simpler than a “game” in the sense of Chapter 14, as it involves only one decision maker.)

The possible outcomes in this game are dollar (or money) prizes; one is X dollars and the other is Y dollars. These are the only possible outcomes, and so the person playing this game must end up with either X or Y. Each outcome will happen with a given likelihood or probability. We will write px and py for the probabilities. (Note that in this chapter, symbols like px and pi will represent the probabilities of outcome X or Xi, not the price of good X or good i, as in earlier chapters.) Since X and Y are the only possible outcomes in this game, the probabilities must sum to 1; that is, px + py = 1.
A sensible person looking at this will want to know the (weighted) average outcome if she plays the game. That weighted average outcome is called the expected value or expectation of the game. We will write E for expected value. Note that the weighted average outcome, the expected value, is E = pxX + pyY. For example, if X = $5 and Y = $10, and if px = 0.9 and py = 0.1, then E = $5.50. However, if px = 0.1 and py = 0.9, then E = $9.50. (The moral of this chapter so far: if you play this gamble, know the odds!)

The idea of expected value is easily generalized to the case of n outcomes. Consider a game with n outcomes, (X1, X2, . . . , Xn), all of which are dollar prizes. Each outcome Xi occurs with a given probability pi. We call this a lottery, written L. (Note that a state-sponsored lottery like Powerball is a special example of what we call a lottery, although the state-sponsored version may try to conceal the bad odds. The simple two-outcome gamble discussed above is a lottery. Other more complex lotteries are discussed below.) The expected value or expectation of the lottery with n outcomes is given by

E(L) = p1X1 + p2X2 + . . . + pnXn.

A lottery may have outcomes which are other lotteries. For example, a coin is tossed. If it is heads, the game is over and you win $10. If it is tails, it is tossed again. If it is heads on the second toss, you win $10. If it is tails on the second toss, you lose $40. (Would you buy this game?)

Let us now consider how we can calculate the expected value of a lottery on lotteries. We will again assume that the ultimate prizes are amounts of money, in dollars. We will illustrate with another example. Let L be a compound lottery whose possible outcomes are other lotteries L1, L2, and L3. Assume the probabilities of these outcomes are (p1, p2, p3) = (1/3, 1/3, 1/3). (A note about notation: p12, for example, is the probability of outcome X2, given that lottery L1 was the outcome at the first stage of the compound lottery.)
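Before the compound-lottery example continues, the expected values computed so far can be reproduced with a short Python sketch. The helper name `expected_value` is illustrative, not from the text. The coin-toss game is treated here by reducing it to its three ultimate outcomes, assuming a fair coin: heads on the first toss (probability 1/2), tails then heads (1/4), and tails then tails (1/4).

```python
def expected_value(prizes, probs):
    """Expected value of a lottery: the sum of p_i * X_i over all outcomes."""
    if len(prizes) != len(probs):
        raise ValueError("need one probability per prize")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for p, x in zip(probs, prizes))


# Two-outcome gamble with X = $5 and Y = $10.
e_mostly_x = expected_value([5, 10], [0.9, 0.1])
e_mostly_y = expected_value([5, 10], [0.1, 0.9])

# Coin-toss game: win $10, win $10, or lose $40 (fair coin assumed).
e_coin_game = expected_value([10, 10, -40], [0.5, 0.25, 0.25])

print(round(e_mostly_x, 2), round(e_mostly_y, 2), round(e_coin_game, 2))
```

Under the fair-coin assumption the coin-toss game has a negative expected value, which bears on the question of whether you would buy it.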
Suppose the Li’s are as follows: L1 pays X1 = $0 and X2 = $50 with probabilities (p11, p12) = (1, 0); L2 pays X1 = $0 and X2 = $50 with probabilities (p21, p22) = (1/2, 1/2); and L3 pays X1 = $0 and X2 = $50 with probabilities (p31, p32) = (1/3, 2/3). How is E(L) calculated? The expected value of the compound lottery is the expectation of the expectations. That is,

\[ E(L) = p_1 E(L_1) + p_2 E(L_2) + p_3 E(L_3) = \frac{1}{3}(0) + \frac{1}{3}\left(\frac{1}{2} \times 50\right) + \frac{1}{3}\left(\frac{2}{3} \times 50\right) \approx 19.44. \]

More generally, suppose there are n different ultimate outcomes, indexed by i, and m different possible outcomes of compound lottery L, each one itself a lottery, indexed by j. Assume lottery Lj has the ultimate prizes (for instance, amounts of money) as its outcomes, and let pji be the probability of outcome Xi from lottery Lj. Then the expected value of the compound lottery is again the expectation of the expectations, or

\[ E(L) = \sum_{j=1}^{m} p_j E(L_j) = \sum_{j=1}^{m} p_j \sum_{i=1}^{n} p_{ji} X_i = \sum_{j=1}^{m} \sum_{i=1}^{n} p_j p_{ji} X_i = \sum_{i=1}^{n} \left( \sum_{j=1}^{m} p_j p_{ji} \right) X_i. \]

The term in the far right summation, within the parentheses, is the ultimate probability of getting prize Xi, after the two stages of the compound lottery have played out.

19.3 Von Neumann-Morgenstern Expected Utility: Assumptions and Conclusion

We are now ready to describe the model of consumer behavior under uncertainty. This model was first developed by John von Neumann (1903-1957) and Oskar Morgenstern (1902-1977), in their 1944 book Theory of Games and Economic Behavior. Von Neumann was a Hungarian-American genius, a mathematician who made major contributions to mathematics, nuclear physics, nuclear weapons, game theory, and other fields. Morgenstern was a noted German-born Austrian-American economist.

We assume the consumer is considering some set of alternatives. The alternatives include various certain things, like bundles of goods, amounts of money, and so on. For now we will continue to use notation like X1 and X2 for the certain outcomes.
In addition to these certain outcomes, there are risky alternatives. We will use notation like X, Y, and Z for arbitrary alternatives, both certain and risky. A risky alternative is a lottery which attaches probabilities to all its outcomes. When we want to emphasize the fact that something is a lottery, we will use notation like L, L1, and L2. The consumer has complete and transitive preferences over all the alternatives.

If the possible outcomes of a lottery are the certain outcomes, and if we assume a finite number n of them, a risky alternative is simply a lottery over (X1, X2, . . . , Xn). That is, it is just a list of probabilities, summing to 1, over all the certain outcomes. (If there are infinitely many certain outcomes, the model requires more complex mathematical description. For the purposes of exposition, we will stick to the easier finite n story.) Note that any certain outcome can also be viewed as a (degenerate) lottery, since getting X1 for certain is the same as having a lottery with p1 = 1 and p2 = p3 = . . . = pn = 0.

Risky alternatives also include lotteries whose outcomes are other risky alternatives (that is, lotteries over lotteries), or some of whose outcomes are certain and some risky. (Note that there are many gambles in the real world just like this. For example, you might buy a state lottery ticket for $1 with three possible outcomes: you win $0 with probability 0.95, you win $10 with probability 0.04, or you win a lottery ticket in a different lottery with probability 0.01.)

Von Neumann-Morgenstern utility theory is based on the following five assumptions:

1. Completeness and transitivity. The consumer’s preferences over all the alternatives, certain and uncertain, are complete and transitive.

2. Continuity. Suppose that the consumer prefers X over Y over Z. That is, Y is somewhere between X and Z in the consumer’s preference ranking.
Then there must exist a probability px, with 0 < px < 1, such that the consumer is indifferent between the middle alternative Y, and a lottery with the best alternative X with probability px and the worst alternative Z with probability 1 − px.

3. Independence. Suppose the consumer is indifferent between alternatives X and Y, and suppose that Z is any alternative. Consider two lotteries. One has outcomes X and Z, and the other has outcomes Y and Z. Suppose both these lotteries assign the same probability to the “indifferent” alternative (that is, X or Y), and therefore the same probability to the “other” alternative Z. Then the consumer must be indifferent between these two lotteries.

4. Unequal probabilities. Suppose the consumer prefers X to Y. Consider two lotteries; both have only X and Y as possible outcomes. Suppose the two lotteries attach different probabilities to the two outcomes. Then the consumer prefers the lottery that assigns a higher probability to her preferred outcome X.

5. Compound lotteries (complexity). A consumer is given a choice between two lotteries. Lottery L1 is a straightforward lottery that provides the certain outcomes (X1, X2, . . . , Xn) with probabilities (p1, p2, . . . , pn). Lottery L2 is a lottery whose outcomes are other lotteries. However, after the intermediate lotteries play out, it ultimately ends up with the same certain outcomes, and with the same probabilities (p1, p2, . . . , pn). (These probabilities are calculated using formulas similar to the probability expression in our equation above for the expected value of a compound lottery.) Then the consumer is indifferent between L1 and L2.

Before proceeding to the von Neumann-Morgenstern result, we should make a few comments about these assumptions. Assumption 1, completeness and transitivity, is an obvious extension of the assumptions we made in the standard consumer theory model.
Assumption 2, continuity, seems obvious and intuitive. However, we shall see that this assumption causes the von Neumann-Morgenstern utility function to be more than ordinal, although not quite cardinal. Assumption 3, independence, seems very intuitive, and assumption 4, unequal probabilities, seems like it has to be true. Surely the consumer should prefer the gamble that gives her better odds of the preferred prize. Assumption 5, compound lotteries, is very plausible for a “rational” consumer who should focus on the ultimate probabilities of the ultimate outcomes. But some consumers may like or dislike intervening lotteries (even though in our model they take no time and cost nothing to run). Some theorists have investigated what happens to the von Neumann-Morgenstern model without some of these assumptions.

Now we can turn to the remarkable theorem:

Von Neumann-Morgenstern Expected Utility Theorem. Suppose consumer i faces a set of alternatives, certain and uncertain. Suppose her preferences satisfy assumptions 1 through 5. Then there exists a utility function ui such that:

1. It represents the consumer’s preferences. That is, the function assigns utility numbers to all the alternatives, and for any pair of alternatives X and Y, ui(X) > ui(Y) if and only if consumer i prefers X to Y, and ui(X) = ui(Y) if and only if consumer i is indifferent between them.

2. It satisfies the expected utility property. Let L be any risky alternative, that is, any lottery. Suppose its outcomes are X, Y, . . . , Z. These may be certain outcomes, or other lotteries. (Note that there may be any number of such outcomes; although for simplicity we assume a finite number.) Suppose the corresponding probabilities are px, py, . . . , pz. Then the utility of the risky alternative L is the expectation of the utilities of its possible outcomes. That is,

ui(L) = pxui(X) + pyui(Y) + . . . + pzui(Z).

19.4 Von Neumann-Morgenstern Expected Utility.
Examples

We will now turn to some examples of von Neumann-Morgenstern utility functions. In this section, when we write u(X) or ui(X) or uj(x) we are referring to such functions.

1. An example showing that von Neumann-Morgenstern utility is not ordinal. Suppose that consumer i prefers X to Y to Z. Suppose her utilities from X and Z are ui(X) = 15 and ui(Z) = 5. If her preferences were ordinal, the utility she assigns Y could be any number greater than 5 and less than 15. But the continuity assumption ties down her ui(Y). Here is why. Given that assumption, she can be asked the following question: “Consider a lottery L that gives you alternative X (your favorite) with probability px and alternative Z (your least favorite) with probability 1 − px. What probability px would make you indifferent between L and your middle alternative Y?” She must answer the question with a single number px. Then her utility from Y must equal her utility from L, which in turn must equal the expectation of the utilities of the outcomes of L. This gives

ui(Y) = ui(L) = pxui(X) + (1 − px)ui(Z) = 15px + 5(1 − px) = 5 + 10px.

Once she says what px is, her utility from Y is uniquely defined. Let’s assume she answers the question: “My px = 0.4.” Then her utility from Y must be 9.

For her, the fact that ui(X) = 15, ui(Y) = 9, and ui(Z) = 5 means more than “I like X best, Y second best, and Z least.” It also reveals something about her attitude toward uncertainty. Moreover, the utility function generated in this fashion is not preserved by an order-preserving transformation of ui. For example, if ui(X) = 15, ui(Y) = 9, and ui(Z) = 5, and if we let vi = √ui, then vi(X) = 3.87, vi(Y) = 3.00, and vi(Z) = 2.24. Now the utility of the lottery L would be

vi(L) = pxvi(X) + (1 − px)vi(Z) = 3.87px + 2.24(1 − px) = 2.24 + 1.63px.

Setting px = 0.4 then gives vi(L) = 2.89 ≠ 3 = vi(Y).
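The arithmetic of this example can be reproduced in a few lines of Python. This is only an illustrative sketch; the function name `lottery_utility` and the variable names are not from the text. It computes ui(Y) from the stated px = 0.4, then applies the square-root transformation and evaluates the same lottery with the transformed numbers.

```python
import math


def lottery_utility(p_best, u_best, u_worst):
    """Expected utility of a lottery over the best and worst alternatives."""
    return p_best * u_best + (1 - p_best) * u_worst


u_X, u_Z = 15.0, 5.0
p_x = 0.4  # the probability that makes her indifferent between L and Y

# Continuity plus the expected utility property pin down u_i(Y).
u_Y = lottery_utility(p_x, u_X, u_Z)

# Apply the order-preserving transformation v = sqrt(u) to every utility level.
v_X, v_Y, v_Z = math.sqrt(u_X), math.sqrt(u_Y), math.sqrt(u_Z)
v_L = lottery_utility(p_x, v_X, v_Z)

print("u_i(Y) =", round(u_Y, 2))
print("v_i(L) =", round(v_L, 2), "but v_i(Y) =", round(v_Y, 2))
```

The transformed numbers keep the ranking of X, Y, and Z, but the lottery and Y no longer receive the same utility.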
In other words, if we were to transform ui into vi in this fashion, consumer i would no longer be indifferent between L with px = 0.4 and Y; the transformation would destroy this property of her preferences.

The upshot of this is that a von Neumann-Morgenstern utility function, like a purely ordinal utility function, allows utility levels for two alternatives to be set arbitrarily. That is, for some pair X and Y, where consumer i prefers X to Y, utility levels can be set in any way, as long as ui(X) > ui(Y). Once these two levels are selected, however, all other utility levels are tied down, and are not subject to arbitrary order-preserving transformations. This is the sense in which von Neumann-Morgenstern utility is more than ordinal, and is sometimes called, somewhat carelessly, a cardinal utility measure.

2. A risk averse consumer. Let us now assume the certain outcomes are amounts of money, measured as usual in dollars. We will let x represent dollars going to consumer i. Assume that her utility function is ui(x) = 10√x. Note that for this utility function, ui($0) = 0 and ui($100) = 100. This means that we have chosen our two arbitrary utility levels in a way that makes i’s utility function go through two convenient points: (0, 0) and (100, 100). In Figure 19.1 below, we have graphed this utility function, with amounts of money x on the horizontal axis and i’s utility ui(x) on the vertical axis. Note that the graph of the utility function is a parabola, passing through (0, 0) and (100, 100), and “opening to the right.” Note that it is a concave function, meaning that its slope dui(x)/dx decreases as x increases. The derivative dui(x)/dx is i’s marginal utility of money (or marginal utility of income), and so this individual has declining marginal utility of income.

We will assume that consumer i starts out with $50 for sure. This gives her utility ui($50) = 10√50 = 70.71. Now let’s introduce some risk.
Suppose she discovers that she might suffer a loss of $25, because of a possible auto accident, a theft, or some other random event. We assume the probability of having no loss is 1/2, and the probability of a $25 loss is 1/2. The expected loss is $12.50. How much would she be willing to pay to insure against this loss?

Suppose she pays P for an insurance policy that fully reimburses her if she suffers a loss. Then she has traded a random situation, which we will call lottery L, for a certain amount of money, equal to what she started with minus the insurance premium, or $50 − P. The lottery L is what she has with no insurance; it is $50 − $25 = $25 with probability 1/2, and $50 with probability 1/2. Her maximum willingness to pay for the insurance policy is the P that would make her indifferent between having no insurance and having insurance:

ui(L) = ui(50 − P), or
(1/2) · ui(25) + (1/2) · ui(50) = 10√(50 − P), or
(1/2) · 10√25 + (1/2) · 10√50 = 10√(50 − P).

The left-hand side equals (1/2) · 50 + (1/2) · 70.71 = 60.36. Solving for P gives $13.57.

All this is illustrated in Figure 19.1 below. Note that in the figure, the utility outcomes of L are shown on the vertical axis; these are ui(25) = 50 and ui(50) = 70.71. The utility level of L is exactly halfway between these points, because the probabilities of the two outcomes are 1/2 and 1/2. The fact that ui(L) is halfway between ui(25) and ui(50) forces P in the figure to be more than 1/2 × ($50 − $25) = $12.50.

INSERT FIGURE 19.1 HERE

Caption of Fig. 19.1: The risk averse consumer who starts with $50 and faces a $25 loss with probability 1/2. She has a lottery L, which leaves her with $25 with probability 1/2, and $50 with probability 1/2. The utility levels from these two outcomes are shown on the vertical axis, as is ui(L), which must be midway between them because of the assumed probabilities. The certainty equivalent of the loss is P; the expectation of the loss is $12.50. The concavity of the ui(x) function implies that P > $12.50.
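The willingness-to-pay calculation for the risk averse consumer can be sketched in Python as follows. The names `utility`, `utility_inverse`, and `max_premium` are illustrative and not from the text. The sketch assumes ui(x) = 10√x, so the sure amount of money that yields a given utility level is (level/10)².

```python
import math


def utility(x):
    """Consumer i's utility of money: u_i(x) = 10 * sqrt(x)."""
    return 10 * math.sqrt(x)


def utility_inverse(level):
    """Sure amount of money that gives the stated utility level."""
    return (level / 10) ** 2


wealth, loss, p_loss = 50.0, 25.0, 0.5

# Expected utility of the uninsured position (lottery L).
expected_utility = p_loss * utility(wealth - loss) + (1 - p_loss) * utility(wealth)

# The largest premium P solves u_i(wealth - P) = u_i(L).
max_premium = wealth - utility_inverse(expected_utility)
expected_loss = p_loss * loss

print("u_i(L) =", round(expected_utility, 2))
print("maximum premium P =", round(max_premium, 2))
print("expected loss =", expected_loss)
```

Because the utility function is concave, the premium found this way exceeds the expected loss of $12.50.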
We have found that our risk averse consumer is willing to pay as much as $13.57 to insure against a random loss with expectation $12.50. The random loss, to her, is the equivalent of a certain loss of $13.57, and so she is willing to pay up to that amount to get rid of it. The $13.57 is called consumer i’s certainty equivalent of the loss. It is greater than the expected value of the loss because she is risk averse.

3. A risk loving consumer. For some consumers, risk is a good thing, something to pay extra for, rather than a bad thing to be avoided. A consumer who seeks risk instead of avoiding it is called a risk loving consumer, or a risk lover for short. Here is an example. Again, we assume the certain outcomes are amounts of money, measured in dollars. We assume that consumer j’s utility function is uj(x) = x²/100. This utility function, like risk averse consumer i’s in the last example, goes through the points (0, 0) and (100, 100). The utility function is shown in Figure 19.2 below. Like consumer i’s utility function, uj(x) is a parabola, but this one “opens upwards,” instead of “opening to the right.” Therefore it is a convex function, and consumer j’s marginal utility of money duj(x)/dx is increasing rather than decreasing.

In the preceding example, we assumed consumer i started with $50 in certain money, but faced a random loss of $25, with probability 1/2. We calculated how much she would pay, at most, to get rid of the risk. We will change the story a little bit for our risk lover j. The change may make it slightly easier to see the certainty equivalent in our figure. We now assume that consumer j starts out with a very risky position. In fact, we will assume that she starts with this L: $0 with probability 1/2, and $100 with probability 1/2. The expectation of L is E(L) = (1/2) · $0 + (1/2) · $100 = $50.

Now j is approached by someone who wants to trade her cash for L. How much cash would she require?
We will now let P represent the minimum amount she would need to be paid for L. How can we find P? If she trades away L, all she has is P, a certain amount of money. The utility of this alternative to her is uj(P) = P²/100. If she keeps L, her utility is

uj(L) = (1/2) · uj(0) + (1/2) · uj(100) = (1/2) · 0 + (1/2) · 100 = 50.

If P is the minimum amount she would accept in exchange for her lottery L, she must be indifferent between accepting the sure-thing P and holding onto the random L:

uj(P) = uj(L), or P²/100 = 50.

Solving this equation gives P = $70.71. This amount is consumer j’s certainty equivalent of the lottery L. This risk lover, who owns a risky L with expectation $50, would need to be paid at least $70.71 to part with it.

All this is illustrated in Figure 19.2 below. Note that in the figure, the utility outcomes of L are shown on the vertical axis; these are uj($0) = 0 and uj($100) = 100. The utility level of L is at 50, exactly halfway between these points, because the probabilities of the two outcomes are 1/2 and 1/2. The fact that uj(L) is halfway between uj($0) and uj($100) forces P in the figure to be more than E(L) = $50.

INSERT FIGURE 19.2 HERE

Caption of Fig. 19.2: The risk loving consumer, who starts with lottery L that gives her $100 with probability 1/2, and $0 with probability 1/2. The utility of the lottery is 50, and the certainty equivalent is P = $70.71. Note that the convexity of the uj(x) function implies P > E(L) = $50.

4. A risk neutral consumer. A consumer is called risk neutral if she neither avoids risk nor seeks it. This means her utility function doesn’t curve downward (as in the risk averse case), or upward (as in the risk loving case). If consumer k is risk neutral, the graph of her utility function is an upward-sloping straight line. The simplest version of such a utility function is uk(x) = x. A risk neutral consumer values a lottery at its expected value.
For instance, the possible loss that our risk averse consumer was contemplating, $0 with probability 1/2 and $25 with probability 1/2, would be viewed as equivalent to a certain loss of $12.50 by the risk neutral consumer.

Let’s conclude this discussion by illustrating an extremely important point. There are, in reality, all sorts of people and all sorts of firms with all sorts of attitudes toward risk. If two parties with different attitudes toward risk get together, they can often make great gains by trading risk. For example, suppose our risk loving consumer j has $70.71 in her pocket, but has no lottery. Suppose she meets risk neutral k, who runs a casino. Risk neutral k can make up games, and sell them to people who want them. So here is what k does: she offers a gamble to consumer j: “You pay me $69.95, and I will flip a coin. If it’s heads, you win $100. If it’s tails, you get nothing.” Consumer j will accept this gamble because she loves risk, and her maximum willingness to pay is $70.71. She ends up slightly better off than she was with $70.71 in her pocket. Consumer k is now much better off. We can measure k’s gain in utility units, or in expected dollars, since her utility function is \(u_k(x) = x\). Let’s measure in dollars. Her revenue is $69.95 and her expected cost is $50. Therefore her expected profit is $19.95. In short, both parties are better off after the exchange, especially risk neutral k who has made almost $20.

We will leave it to the reader to show that our risk averse consumer i could trade her chance of a $25 loss, either to our risk lover j or our risk neutral k, in a swap that makes both parties better off.

19.5 A Solved Problem

The Problem

Suppose consumer i has utility function \(u_i(x) = \sqrt{x}\). There are two financial assets; both cost $100. Consumer i is going to buy one of these. They will both pay off after a short time. Asset 1 pays $103 with certainty.
Asset 2 pays $110 with probability 0.95, and $0 with probability 0.05. Which asset does she buy? Consumer j has utility function \(u_j(x) = x^2\). Which asset would she buy?

The Solution

Consumer i’s utility from A1 = $103 is
\[u_i(103) = \sqrt{103} = 10.149.\]
Her expected utility from A2 is
\[E(u_i(A2)) = 0.95\,u_i(110) + 0.05\,u_i(0) = 0.95\sqrt{110} + 0.05\sqrt{0} = 9.964.\]
Therefore consumer i buys the sure thing, A1.

Consumer j’s utility from A1 = $103 is
\[u_j(103) = 103^2 = 10{,}609.\]
Her expected utility from A2 is
\[E(u_j(A2)) = 0.95\,u_j(110) + 0.05\,u_j(0) = 0.95 \times 110^2 + 0.05 \times 0^2 = 11{,}495.\]
Since 11,495 > 10,609, consumer j buys the risky thing, A2.

Exercises

1. The risk averse consumer of section 4 above meets up with the risk neutral consumer of that section. Describe a simple trade that they can make which makes both better off.

2. George is on a game show. He has equal probabilities of winning $5, $50, and $500. His utility function is \(u_g(x) = 12x^2\), where x is the dollar amount he wins.
(a) Calculate the expected value of the lottery.
(b) Calculate George’s expected utility from the lottery.
(c) Calculate his certainty equivalent of the lottery. Explain what it means.
(d) Is George risk averse, risk neutral, or risk loving? How can you tell?

3. Jack and Jill go up a hill to fetch a pail of water. There is a 25 percent chance that Jack will trip and fall, a 25 percent chance that Jill will trip and fall, a 25 percent chance that both will trip and fall, and a 25 percent chance that neither will trip and fall. If neither trips and falls, the pail will contain 6 gallons of water. If one of them trips and falls, the pail will contain 2 gallons of water. If both of them trip and fall, the pail will contain no water. Jack’s utility from fetching a pail of water is \(u_a(w) = w^2\), and Jill’s utility is \(u_i(w) = 2w\), where w is gallons of water in the pail.
(a) How many gallons of water do you expect Jack and Jill to fetch?
(b) Calculate the expected utility for each of them.
Suppose there is a small well at the foot of the hill. If they draw water from the small well, they will obtain 3 gallons of water with certainty (and will not have to hike up the hill).
(c) Who prefers to draw water at the foot of the hill? Who prefers to hike up the hill to draw water?
(d) Is Jack risk averse, risk neutral, or risk loving? How about Jill? Explain.

4. Adam, Michael, and Stella each have $24 to spend on beer at the pub. Their utility functions are u_a = 3√b_a, u_m = b_m, and u_s = b_s^3, respectively, where b_i is the number of glasses of beer consumed by person i. The price of a glass of beer is $3. Suppose glasses of beer can be consumed in fractions, e.g., you can order half a glass. Assume there is a 50 percent chance that they will all get mugged on the way to the pub. If so, they all lose all their money.
(a) Calculate the expected beer consumption for each of them.
(b) Calculate expected utility for each of them.
Suppose the neighborhood thug is selling protection for $6. An individual who pays for protection does not get mugged.
(c) Who will buy protection?
(d) If the neighborhood thug wants to sell protection to all three, and he can charge only one price, what price would he charge?

5. Ko is offered a lottery. There is a jar with six balls in it: one blue, two reds, and three yellows. His outcome will be determined by the color of the ball he draws. If he draws a blue ball, he receives $10. If he draws a red ball, he loses $5. If he draws a yellow ball, he loses $1. Given Ko’s utility function, \(u_k(x) = x\), where x is the dollar amount won, does Ko accept the lottery?

6. Will buys a lottery ticket for $6. The ticket has equal probabilities of being type A, type B, or type C.
A type A ticket pays $20 with a 30 percent probability and nothing with a 70 percent probability; a type B ticket pays $15 with a 40 percent probability and nothing with a 60 percent probability; and a type C ticket pays $10 with a 50 percent probability and nothing with a 50 percent probability. Calculate the expected value of Will’s lottery ticket.

20 Uncertainty and Asymmetric Information

20.1 Introduction

In the last chapter, we discussed von Neumann-Morgenstern utility functions, which are used to represent people’s preferences in situations where there is uncertainty, that is, where information is imperfect or missing. We will continue the analysis of decision making under uncertainty in this chapter. But now we will focus on the problems that arise when information is unequally distributed, in the sense that some people in the market know more than other people.

More precisely, we are now considering markets for goods or services where there is uncertainty, and the uncertainty is more on one side of the market (e.g., the buyers’ side) than on the other side of the market (e.g., the sellers’ side). These are called markets with asymmetric information; the information is “asymmetric” because people on one side know more than people on the other side. In a world of perfect certainty there would be no asymmetric information, but in this chapter we will allow uncertainty. It turns out that asymmetric information may create serious market failures, failures that may need remedies.

In particular, what happens when the sellers of some risky or uncertain good or service know more than the buyers? For example, what happens in the market for used cars, where the sellers often know much more about the condition of the cars they are selling than the buyers? We will look at the used car market in Section 2 below.
What happens in insurance markets if insurance companies cannot distinguish between low-risk clients and high-risk clients? In Section 3, we will examine the problems created by asymmetric information in insurance markets. Of course, the very existence of insurance may cause some people to take greater risks than they would take if they weren’t insured. This is called the moral hazard problem, and we will examine it in Section 4. Finally, there are many economic situations where the people in charge (called principals) cannot directly observe the effort of the people who work for them (called agents). What problems are caused by the information asymmetry between principals and agents? We will look at this problem in Section 5. In Section 6, we will conclude this chapter with a brief discussion of what might be done to fix market failures caused by asymmetric information.

20.2 When Sellers Know More Than Buyers: The Market for “Lemons”

This story is based on a paper that was published in 1970 by the American economist George Akerlof (1940-). (George Akerlof, Michael Spence (1943-), and Joseph Stiglitz (1943-) won the 2001 Nobel prize in economics for their work on markets with asymmetric information.)

Consider the market for used cars. There is considerable uncertainty attached to any used car. It may break down and require a new engine or transmission tomorrow, or it may run perfectly, needing only occasional routine maintenance, for the next ten years. When people are buying and selling used cars, the sellers usually have better information about their reliability than the buyers. The sellers know much more about the probabilities of nasty and costly mechanical failures that may happen next month or next year.

We will assume there are two kinds of used cars: “quality cars” and “lemons.” We will assume they look exactly alike, and even a mechanic cannot tell the difference.
But a person who has owned a car for a reasonable period of time knows whether her car is a quality car or a lemon. The owner of a car knows its history of past repairs, and she therefore knows the probabilities of future repairs. A person who has not owned that car doesn’t know those probabilities. For simplicity, we will assume all people in the car market, both buyers and sellers, are risk neutral. We will also assume that quality cars and lemons deliver the same utility to any owner; they only differ in expected repair costs.

Given her knowledge of the expected costs of future repairs, the owner of a lemon has a reservation price for her car. This is the minimum she would take for it, and it reflects both how much she likes it as a car, and the expected repair costs she will have to incur if she holds on to it. We will assume the reservation price for the lemon owner is $1,000. Similarly the owner of a quality car has a reservation price, which reflects her knowledge of expected repair costs for the quality car. We will assume the reservation price for the quality car owner is $2,000. And remember, the owner of a car knows which type of car it is.

We will assume that any potential car buyer has a willingness to pay for either type of car, contingent on the type. We assume buyers would be willing to pay $1,200 for a lemon, and $2,400 for a quality car. But when a potential buyer looks at a car, she does not know the type.

What would happen if potential buyers could distinguish between quality cars and lemons? A market price for lemons, somewhere between $1,000 and $1,200, would be established, and a market price for quality cars, somewhere between $2,000 and $2,400, would be established. The markets would clear, and the result would be efficient or Pareto optimal.

But buyers cannot distinguish between quality cars and lemons. What then happens in this market?
Let’s assume, for simplicity, that there are roughly equal numbers of quality cars and lemons “out there.” If this is the case, buyers will think that buying a car is a lottery L, with a possible outcome of a lemon, with probability 0.5, and a possible outcome of a quality car, also with probability 0.5. Given that we have assumed risk neutrality, what potential buyers think about L simply depends on the dollar expectation of the lottery. This is
\[E(L) = 0.5 \times \$1{,}200 + 0.5 \times \$2{,}400 = \$1{,}800.\]
It follows that buyers are willing to pay up to $1,800 for a used car. Soon sellers realize that buyers are only willing to pay $1,800 for used cars, and so sellers with quality cars, whose reservation price is $2,000, disappear. As sellers of quality cars disappear, buyers become less and less willing to pay as much as $1,800. In equilibrium, there are no quality cars being bought or sold; the market for quality cars disappears. The market price for used cars ends up somewhere between $1,000 and $1,200, but only lemons trade. This outcome is clearly inefficient because potential buyers and sellers of quality cars are unable to trade. This is a market failure due to asymmetric information.

20.3 When Buyers Know More Than Sellers: A Market for Health Insurance

Often buyers of insurance know more about the risks they face than the insurance companies who insure them. This is true for auto insurance, since drivers know more about how carefully they drive and how much they drive than the insurance companies know; similarly, property owners know more about how careful or careless they are with fires than their insurance companies know; and consumers often know more about the state of their health than their health insurers know. The following example illustrates what might happen in health insurance markets as a result of this kind of asymmetric information.

Health insurance/adverse selection example. Suppose there are two groups of health care consumers.
In both groups there is a risk of some serious but non-fatal illness, which costs $100,000 to treat.

Group 1 consumers have a low risk of the illness. Their probability of having it is 0.01, or 1 percent. A consumer in this group is willing to pay up to $1,200 for an insurance policy which would provide her with free treatment if she gets the illness. Note that if she were to pay for treatment herself, her expected treatment cost would be \(0.01 \times \$100{,}000 = \$1{,}000\). The fact that she is willing to pay up to $1,200 for the insurance policy indicates that she is risk averse. If an insurance company (which is risk neutral) can tell that a consumer is in group 1, it is clearly advantageous for it to arrange a policy with that consumer at a price somewhere between $1,000 and $1,200.

Group 2 consumers have a high risk of the same illness. Their probability of having it is 0.05, or 5 percent. The treatment cost is the same as for consumers in group 1. A consumer in this group is willing to pay up to $6,000 for an insurance policy that would pay for treatment. If she were to pay for treatment herself, her expected treatment cost would be \(0.05 \times \$100{,}000 = \$5{,}000\). She is willing to pay up to $6,000 to insure against this risk because she is risk averse. If an insurance company can tell that a consumer is in group 2, it is clearly advantageous for it to arrange a policy with that consumer at a price somewhere between $5,000 and $6,000.

Under the condition of complete information, that is, if insurance companies know who belongs to which group, members of the two groups can be offered insurance policies with different prices. In a competitive market for insurance contracts, profits would be competed away in equilibrium.
It follows that there would be two kinds of insurance contracts in a competitive equilibrium: insurance policies would be sold to consumers in group 1 at a price of $1,000, and to consumers in group 2 at a price of $5,000. By the first fundamental theorem of welfare economics, this outcome would be efficient. And all the consumers, who are all risk averse, would be insured.

We now assume there are 9,000 people in group 1, and 1,000 people in group 2. The total population is 10,000. Over the total population, the probability of the illness is the total expected number of people who will get it, divided by 10,000. The total expected number of people who will get the illness is \(0.01 \times 9{,}000 + 0.05 \times 1{,}000 = 140\). The probability of the illness over the entire population is therefore 140/10,000 = 0.014. If all cases are treated, the expected treatment cost per person, over the entire population, is \(0.014 \times \$100{,}000 = \$1{,}400\). All of these assumptions are summarized in Table 20.1 below.

                     Number of People    Exp. Cost per Person    Willingness to Pay
Group 1                    9,000               $1,000                  $1,200
Group 2                    1,000               $5,000                  $6,000
Total Population          10,000               $1,400

Table 20.1

Now let’s see what happens in this market when there is asymmetric information. The asymmetry is this: the potential insurance buyers, members of groups 1 and 2, know which group they are in, but the insurance companies do not.

Since the insurance companies cannot distinguish between group 1 members and group 2 members, they must offer policies to everybody at the same price. We will let P represent the insurance policy price. In a competitive equilibrium, insurance companies must be making zero expected profits. That is, P must be equal to the expected cost of treatment for the insured population. As calculated above, the expected cost of treating the illness over the entire population is $1,400. But P = $1,400 will not work. The entire population cannot be insured at this price. The reason is simple.
At a price of $1,400, consumers in group 1, who are willing to pay at most $1,200, will not buy the insurance. They will drop out of the market. The only consumers who will buy the insurance are members of group 2, for whom the expected cost of treatment is $5,000. But if only members of group 2 are buying insurance, the insurance companies will need a price of at least $5,000. It follows that the competitive equilibrium is at a price of $5,000; group 2 members are insured, but group 1 members are uninsured.

This market outcome is inefficient: Group 1 members cannot buy a product at a price which would be advantageous to them, and to the insurance companies who would sell it to them. They are excluded from this market, where there is a single price for the two very different groups of consumers.

This is an example of adverse selection in insurance, an outcome in which insurance companies, unable to distinguish between groups with very different risks and therefore charging everyone the same price, cannot insure an entire population of risk averse consumers. All the consumers would like to purchase insurance at actuarially fair prices, and even prices that are above actuarially fair. However there are so many high risk consumers that the single market price is too high for the low risk consumers. The low risk consumers are in effect pushed out of the market. This is another market failure due to asymmetric information.

20.4 When Insurance Encourages Risk Taking: Moral Hazard

In the example of the last section, we assumed that two groups of people had different probabilities of illness, and the insurance companies couldn’t distinguish between group 1 members and group 2 members. The inability to distinguish between the two types led to a market failure. Another important insurance-based market failure occurs when insurance buyers change their behavior, and increase their probabilities of losses, because they have insurance.
This leads to a market failure if the insurance companies fail to observe the changes in behavior and to charge premiums reflecting those changes. When insurance buyers take more risks because they have insurance, we say the insurance creates moral hazard. The very existence of insurance coverage causes increased losses, and increased social costs.

The term “moral hazard” is now commonly used by economists (and by others who want to sound impressive) to describe bad behavior encouraged by the existence of compensation for losses. Economists (and many others) believe in incentives; positive incentives to encourage good behavior and negative incentives to discourage bad behavior. This is the idea of the carrot and the stick. Moral hazard is what results when you take away the stick, the negative incentives for bad behavior. For example, according to some analysts, government programs that rescued Wall Street firms after the crash of 2008 created moral hazard. Losses to big Wall Street firms were mitigated, some firms were saved from bankruptcy, and the negative consequences of risky financial decisions were reduced or eliminated. As a consequence, in the future Wall Street firms may not be as careful as they would have been, absent the rescue.

Other, more obvious examples include the following:

1. Liability and collision insurance for motor vehicle operators. If motorists had to bear the full costs of the accidents they cause, they would likely drive more carefully. Of course insurance companies try hard to discover which of their insured drivers are taking risks, by monitoring accident histories, speeding tickets, drunk driving arrests, and so on, but they cannot be completely successful.

2. Fire and theft insurance for homeowners.
If homeowners had no insurance, they would probably be more careful with fires in barbecue grills, fireplaces, and wood stoves; and they would be more careful about defective wiring, locking doors at night, and so on. 3. Flood insurance for homeowners. Flood insurance compensates the homeowner for flood losses. Flood losses mainly occur in flood plains, next to rivers, and in low-lying coastal areas. In the U.S., the federal government actually subsidizes flood insurance, which encourages people to build in flood-prone areas, exacerbating the moral hazard problem. Driving while on cell phone/moral hazard example. Let us now turn to a numerical example. Assume there are 1,000 identical people, who like to drive while talking on their cell phones. Assume each driver has a 0.04 probability of one accident (per year), if she does not use her cell phone, and a 0.08 probability of one accident (per year), if she does. Any accident would result in $10,000 in damages to the driver’s own car. (If accidents harm other drivers or their cars, the analysis is more complicated.) Expected losses are $400 per year if the driver is not a cell phone user, and $800 per year if she is a cell phone user. We assume that the insurance market is competitive, and the insurance companies end up charging premiums just sufficient to cover expected losses. If nobody uses a cell phone, the insurance premium will be around $400 per year; if everybody uses a cell phone, the insurance premium will be around $800 per year; and if it’s a mix, the insurance premium will be somewhere between those extremes. The drivers like their cell phones, but not enough to pay the extra $400 in expected losses created by their use. We assume that using the cell phone while driving gives each driver $300 worth of convenience and pleasure per year. 
We also assume that the drivers are risk averse; each driver would be willing to pay $600 per year to insure against a 0.04 probability of a $10,000 accident, and would be willing to pay $1,200 per year to insure against a 0.08 probability of a $10,000 accident. Finally, we assume that the insurance companies cannot tell whether or not a customer uses her cell phone while driving.

What’s the equilibrium in this example? Since insurance companies cannot tell whether a driver is talking on her cell phone while driving, if a driver is insured, she will use her cell phone. This results in the 0.08 probability of one accident (per year), with expected losses of $800 per year. Drivers are willing to pay up to $1,200 per year to insure against this risk. However, since the insurance market is competitive, with profits at or close to zero, the insurance premium is $800 per year. Therefore everybody will buy insurance, and everybody will talk on their cell phones while driving. The net benefit to each driver will equal the benefit of driving, say D, minus the $800 cost of insurance, plus the $300 value of being able to talk on the cell phone while driving, or \(D - \$800 + \$300 = D - \$500\).

If a driver didn’t buy insurance and continued to talk on her cell phone, her expected utility from this lottery would equal the utility of the certainty equivalent net benefit, which would be \(D - \$1{,}200 + \$300 = D - \$900\), which is much worse than driving with insurance. If a driver didn’t buy insurance and stopped talking on her cell phone, her net benefit would be \(D - \$600\), which is still worse than driving with insurance and talking on the cell phone. All these assumptions are summarized in Table 20.2 below.

                         Benefit of    Benefit of    Willingness    Insurance    Net
                         Driving       Cell Use      to Pay         Premium      Benefit
Insured, Use Cell        D             $300          $1,200         $800         D - $500
Insured, No Cell         D             –             $1,200         $800         D - $800
Not Insured, Use Cell    D             $300          $1,200         –            D - $900
Not Insured, No Cell     D             –             $600           –            D - $600

Table 20.2

It follows that the equilibrium in this example is one where everyone buys insurance, at a price of $800 per year, and they all use their cell phones while driving. However, this is an inefficient equilibrium. It is inefficient because if everyone stopped talking on their cell phones, they could buy insurance for $400 per year instead of $800 per year, and the net benefit for each driver would be \(D - \$400 > D - \$500\). Finally, the availability of insurance creates moral hazard, because the insurance protection causes all the drivers to take an extra risk. The cost to society of that extra risk is $400 (the increase in expected accident losses per driver per year), while the benefit to the driver is only $300. In short, this is an example of a market failure created by moral hazard.

20.5 The Principal-Agent Problem

In the example of the previous section, all the drivers were identical, but there was an information asymmetry between the insurance companies and the drivers. The insurance companies could not see whether or not the drivers were talking on their cell phones while driving. The availability of insurance created a moral hazard; it encouraged drivers to misbehave, and that misbehavior resulted in a market failure, an inefficient equilibrium.

We will now turn to another information asymmetry that may lead to market failure. In many economic contexts, there are two (or more) people who are working on some project, more-or-less together, but with somewhat different goals. One is in charge; he is called the principal. The other is working for the principal; he is called the agent.
Examples include an employer and an employee in an office or a factory; a general contractor and a subcontractor on a construction project; a patient and a doctor; a property owner who wants to sell his house and his real estate agent; a plaintiff in a lawsuit and his lawyer; a farmer and his farm laborer; and a legislature and the bureaucrats who write regulations and implement the law.

Generally a principal can observe some of what the agent does, but not all of it, and the principal’s observation of the agent’s effectiveness is confounded by random events. (For instance, the patient gets sicker after the surgery, but he does not know whether this is because the surgeon didn’t prepare for the operation enough, or because his cancer was intractable. The property owner doesn’t manage to sell his house, but he doesn’t know whether this was because his real estate agent didn’t schedule enough showings, or because his only potential buyer was just laid off.) In short, there is an information asymmetry between the principal and the agent, compounded by random noise.

The principal has a goal. The agent is on the principal’s side but may not have the same goal. The principal can observe the outcome, but cannot observe all that the agent does or fails to do. The principal-agent problem is the (principal’s) problem of maximizing his expected payoff, given the information asymmetry, and given the randomness inherent in the process leading from effort to outcome.

Principal-agent example, introduced. We’ll now set up an example. We’ll assume there is a farmer, who is the principal. He employs a farm worker, who is the agent. The farmer grows corn, and for simplicity we will assume there are only two possible crop yields: 5 tons and 10 tons. Also for simplicity, we will assume both revenue and costs for the farmer, as well as wages for the farm worker, are measured in tons of corn.
Therefore the number of tons grown also equals the farmer’s revenue. The output from the farm depends on effort of the farm worker and on random events. The farm worker can exert high effort or low effort. (Of course he prefers low effort, all else equal.) The farmer can observe the output. But he cannot observe the worker’s effort. Moreover, because of random noise, the farmer cannot conclude that the worker must have put in high effort if the output is 10 tons, or that the worker must have put in low effort if the output is 5 tons. We let e represent the farm worker’s effort level, and we assume the two effort levels are e = 1 (low effort) and e = 2 (high effort). High effort of course causes the worker more disutility than low effort. In particular, we assume disutility levels equal to the effort levels; low effort causes disutility of 1, and high effort causes disutility of 2. To get the worker to work requires compensation that provides enough utility to offset the disutility of the work. The worker requires compensation that gives him at least 1 unit of utility for low effort, and at least 2 units of utility for high effort. The connection between effort of the worker and output from the farm is as follows: We let p(e) represent the probability of the high (10 ton) crop yield, a function of effort e, and we let 1−p(e) represent the probability of the low (5 ton) crop yield. (As in other parts of this chapter, p stands for probability, not price.) We assume that if effort is low, then p(e) = p(1) = 0.1. That is, low effort implies a 10 percent chance of high output, and a 90 percent chance of low output. And we assume that if effort is high, then p(e) = p(2) = 0.9. That is, high effort implies a 90 percent chance of high output, and a 10 percent chance of low output. We assume that our farmer, the principal, is risk neutral, and we assume that our farm worker, the agent, is risk averse. 
The principal offers the agent a contract, which specifies a wage to be paid contingent on the output level, high (10 tons) or low (5 tons). The wages depend only on output, which is observable, and not on effort, which is unobservable. We let \(w_h\) be the (contingent) wage if the output is high, and \(w_l\) be the (contingent) wage if the output is low; both are measured in tons. If output is high (10 tons), the farmer’s profit (measured in tons) is \(10 - w_h\). If output is low (5 tons), the farmer’s profit is \(5 - w_l\). Since the principal is risk neutral, he simply wants to maximize expected profit. Expected profit is
\[E(\pi) = p(e)(10 - w_h) + (1 - p(e))(5 - w_l).\]
It depends on the agent’s effort e, and on the contingent wages \(w_h\) and \(w_l\).

We assume the agent, who is risk averse, has a square root utility function. That is, for a given w, his utility is \(u = \sqrt{w}\), and as a function of his effort e, his expected utility is
\[E(u|e) = p(e)\sqrt{w_h} + (1 - p(e))\sqrt{w_l}.\]
(A note about notation: \(E(u|e)\) means “Expected utility, contingent on the effort level e.”) This gives
\[E(u|e = 1) = 0.1\sqrt{w_h} + 0.9\sqrt{w_l} \quad \text{and} \quad E(u|e = 2) = 0.9\sqrt{w_h} + 0.1\sqrt{w_l},\]
for low effort and high effort, respectively. If the wages contingent on output are set at the point where the worker is just (barely) willing to work, expected utility based on the wages has to just offset the disutility of working. This gives
\[0.1\sqrt{w_h} + 0.9\sqrt{w_l} = 1 \quad \text{and} \quad 0.9\sqrt{w_h} + 0.1\sqrt{w_l} = 2,\]
for low effort and high effort, respectively.
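The expected profit and expected utility formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names and the trial contract (w_h = 9, w_l = 1) are hypothetical choices made only for illustration, while the probabilities, yields, and disutility levels are the ones assumed in the example.

```python
import math

# Values taken from the example in the text.
P_HIGH = {1: 0.1, 2: 0.9}          # probability of the 10-ton yield, by effort level
DISUTILITY = {1: 1.0, 2: 2.0}      # disutility of low (e=1) and high (e=2) effort
YIELD_HIGH, YIELD_LOW = 10.0, 5.0  # crop yields in tons


def expected_profit(e, w_h, w_l):
    """Principal's expected profit E(pi) for effort e and contingent wages."""
    p = P_HIGH[e]
    return p * (YIELD_HIGH - w_h) + (1 - p) * (YIELD_LOW - w_l)


def expected_utility(e, w_h, w_l):
    """Agent's expected utility E(u|e) with square root utility of wages."""
    p = P_HIGH[e]
    return p * math.sqrt(w_h) + (1 - p) * math.sqrt(w_l)


def willing_to_work(e, w_h, w_l, tol=1e-9):
    """True if expected utility at least offsets the disutility of effort e."""
    return expected_utility(e, w_h, w_l) >= DISUTILITY[e] - tol


if __name__ == "__main__":
    # Hypothetical trial contract, chosen only to exercise the formulas.
    w_h, w_l = 9.0, 1.0
    for e in (1, 2):
        print(
            f"effort={e}: E(u|e)={expected_utility(e, w_h, w_l):.3f}, "
            f"E(pi)={expected_profit(e, w_h, w_l):.3f}, "
            f"willing={willing_to_work(e, w_h, w_l)}"
        )
```

Any other pair of contingent wages can be passed in the same way; the sketch only evaluates the two formulas and does not search for the principal's best contract.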
We lay out some of our example’s assumptions, notation, and preliminary conclusions in Table 20.3 below:

                                        Low Effort by Agent                        High Effort by Agent
Crop Yield Probabilities (High, Low)    (0.1, 0.9)                                 (0.9, 0.1)
Agent’s Wages if Low Yield              \(w_l\)                                    \(w_l\)
Agent’s Wages if High Yield             \(w_h\)                                    \(w_h\)
Wages Needed to Get Agent to Work       \(0.1\sqrt{w_h} + 0.9\sqrt{w_l} = 1\)      \(0.9\sqrt{w_h} + 0.1\sqrt{w_l} = 2\)
Principal’s Expected Profit             \(0.1(10 - w_h) + 0.9(5 - w_l)\)           \(0.9(10 - w_h) + 0.1(5 - w_l)\)

Table 20.3: Assumptions, notation, wages needed to get the agent to work at given effort levels, and the principal’s expected profits.

The principal’s first-best outcome. Now we will calculate the first-best outcome for the principal. This is the maximum expected profit the principal could achieve if he paid the worker just enough to get him to work, and contrary to our basic assumption about the principal-agent model, if the principal could actually observe and choose the worker’s effort level. Note that the principal has to offer the worker contingent wages as shown above, in order to get him to work at the indicated effort levels. (The wages are still contingent on output.) We will use the two necessary contingent wage equations to calculate the principal’s maximum profits based on low effort and high effort. Then we will select the effort level that results in a higher profit level for the principal.

Assuming low effort, the principal’s objective function is:
\[E(\pi) = 0.1(10 - w_h) + 0.9(5 - w_l) = 5.5 - 0.1w_h - 0.9w_l.\]
We use the agent’s low-effort constraint \(0.1\sqrt{w_h} + 0.9\sqrt{w_l} = 1\). Solving it for \(\sqrt{w_h} = 10 - 9\sqrt{w_l}\) and squaring both sides gives
\[w_h = 100 - 180\sqrt{w_l} + 81w_l.\]
Substituting the constraint into the objective function, we get
\[E(\pi) = -4.5 + 18\sqrt{w_l} - 9w_l.\]
Maximizing \(E(\pi)\) is now straightforward; it is equivalent to maximizing \(2\sqrt{w_l} - w_l\).
The objective function is maximized at \(w_l = 1\); substituting back in the constraint equation gives \(w_h = 1\); and substituting the contingent wage values back into the expected profit function gives

\[ E(\pi) = 0.1(10 - 1) + 0.9(5 - 1) = 4.5. \]

This is the highest expected profit the principal can get if he chooses low effort by the agent.

Now let’s analyze high effort. Assuming the principal chooses high effort by the agent, and pays him just enough to get him to do the work, the objective function is:

\[ E(\pi) = 0.9(10 - w_h) + 0.1(5 - w_l) = 9.5 - 0.9w_h - 0.1w_l. \]

We use the agent’s high-effort constraint \(0.9\sqrt{w_h} + 0.1\sqrt{w_l} = 2\). Solving it for \(\sqrt{w_l}\) gives \(\sqrt{w_l} = 20 - 9\sqrt{w_h}\), and squaring gives

\[ w_l = 400 - 360\sqrt{w_h} + 81w_h. \]

(It’s a little easier to use the constraint to solve for \(w_l\) as a function of \(w_h\) in this case, rather than the reverse.) Substituting the constraint into the objective function, we get

\[ E(\pi) = -30.5 + 36\sqrt{w_h} - 9w_h. \]

Maximizing \(E(\pi)\) is now straightforward; it is equivalent to maximizing \(4\sqrt{w_h} - w_h\). The first-order condition \(2/\sqrt{w_h} - 1 = 0\) gives \(\sqrt{w_h} = 2\), so the objective function is maximized at \(w_h = 4\); substituting back in the constraint equation gives \(w_l = 4\); and substituting the contingent wage values back into the expected profit function gives

\[ E(\pi) = 0.9(10 - 4) + 0.1(5 - 4) = 5.5. \]

This is the highest expected profit the principal can get if he chooses high effort by the agent.

The results of all these calculations are shown in Table 20.4 below.

                                        Low Effort by Agent    High Effort by Agent
Crop Yield Probabilities (High, Low)    (0.1, 0.9)             (0.9, 0.1)
Agent’s Wages if Low Yield              1                      4
Agent’s Wages if High Yield             1                      4
Principal’s Expected Profit             4.5                    5.5

Table 20.4: Payoffs to principal and agent, assuming the principal can observe and choose the agent’s effort, to maximize the principal’s expected profit.

The principal notes these results. He wants to maximize his expected profit.
Therefore the first-best outcome for the principal, the outcome he opts for if he can observe and choose the agent’s effort, is the following: high effort by the agent, contingent wages of \(w_l = 4\) and \(w_h = 4\), and expected profit for himself of \(E(\pi) = 5.5\).

The calculations above are hypothetical, and depend on the principal’s being able to choose the effort level of the agent. But the essential difficulty of the principal-agent problem is that the principal cannot see the agent’s effort level. Let us now consider how this affects the analysis.

When the principal cannot observe the agent’s efforts. The first thing to note is that if the agent chooses low effort, the best possible outcome for the principal is \(E(\pi) = 4.5\), as we figured above. If the contingent wages are equal, or close to equal, the agent is probably going to choose low effort. On the other hand, if there is a big enough difference between \(w_h\) and \(w_l\), the agent will put in the extra effort, whether or not the principal can observe that effort. The next thing we’ll do is to calculate the wage difference needed to “incentivize” the agent, that is, to induce him to work hard. The required wage difference is called an incentive compatibility constraint.

We derive the incentive compatibility constraint as follows. The agent’s utility, net of his disutility from effort, is \(0.9\sqrt{w_h} + 0.1\sqrt{w_l} - 2\) for high effort, and \(0.1\sqrt{w_h} + 0.9\sqrt{w_l} - 1\) for low effort. The necessary condition for getting the agent to choose high effort, when he cannot be observed, is that the former be greater than or equal to the latter. This gives

\[ 0.9\sqrt{w_h} + 0.1\sqrt{w_l} - 2 \ge 0.1\sqrt{w_h} + 0.9\sqrt{w_l} - 1. \]

Collecting terms gives \(0.8\sqrt{w_h} - 0.8\sqrt{w_l} \ge 1\), or

\[ \sqrt{w_h} - \sqrt{w_l} \ge 1.25. \]

Note that the strict inequality would imply the agent would choose high effort for sure; the equality would imply the agent is indifferent between low effort and high effort.
(The best possible outcome for the principal would have the agent indifferent, but still choosing high effort.)

The best conceivable outcome for the principal, based on the agent’s choosing high effort, now requires two equations. The first (the incentive compatibility constraint) is needed for the agent to choose high effort instead of low effort:

\[ \sqrt{w_h} - \sqrt{w_l} = 1.25, \]

and the second is needed for the agent to choose work instead of no work:

\[ 0.9\sqrt{w_h} + 0.1\sqrt{w_l} = 2. \]

Solving these two equations simultaneously gives \(\sqrt{w_l} = 7/8\) and \(\sqrt{w_h} = 17/8\). Squaring the terms yields \(w_l = 0.7656\) and \(w_h = 4.5156\). Finally, we can substitute \(w_l\) and \(w_h\) into the principal’s expected profit equation. This gives

\[ E(\pi) = 0.9(10 - w_h) + 0.1(5 - w_l) = 0.9(10 - 4.5156) + 0.1(5 - 0.7656) = 5.36. \]

The results of all these calculations are shown in Table 20.5 below.

                                        Low Effort by Agent    High Effort by Agent
Crop Yield Probabilities (High, Low)    (0.1, 0.9)             (0.9, 0.1)
Agent’s Wages if Low Yield              0.7656                 0.7656
Agent’s Wages if High Yield             4.5156                 4.5156
Principal’s Expected Profit             5.36                   5.36

Table 20.5: Payoffs to principal and agent, assuming the principal cannot observe the agent’s effort, to maximize the principal’s expected profit.

We conclude as follows. If our farmer, the principal in this story, could choose the effort level of our farm worker, the agent, the best possible outcome for the farmer would be an expected profit level of 5.5. In that case, the risk neutral principal would be able to fully insure the risk averse agent by offering him a constant wage, independent of the output. However, given that the principal cannot choose the worker’s effort level, or even observe it, he needs to incentivize the agent by offering a non-constant wage, and the best possible outcome for the farmer is 5.36. The principal cannot see the agent’s effort, nor infer it from the ultimate outcome, because of random noise.
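The first-best and second-best numbers can be reproduced with a short Python sketch. It is meant only as a check on the arithmetic: the function names are hypothetical, the first-best calculation takes from the text the result that the optimal wage is flat (so \(\sqrt{w} = e\)), and the disutility of effort is assumed to equal the effort level \(e\).

```python
import math

# Probability of the high (10-ton) output for each effort level, from the text.
P_HIGH = {1: 0.1, 2: 0.9}


def expected_profit(e, w_h, w_l):
    """Principal's expected profit, in tons, given effort e and contingent wages."""
    p = P_HIGH[e]
    return p * (10 - w_h) + (1 - p) * (5 - w_l)


def net_utility(e, w_h, w_l):
    """Agent's expected utility minus the disutility of effort (assumed equal to e)."""
    p = P_HIGH[e]
    return p * math.sqrt(w_h) + (1 - p) * math.sqrt(w_l) - e


def first_best(e):
    """Observable effort: flat wage w with sqrt(w) = e, as derived in the text."""
    w = float(e) ** 2
    return w, w, expected_profit(e, w, w)


def second_best_high_effort():
    """Unobservable effort: solve the two binding constraints for high effort.

    sqrt(w_h) - sqrt(w_l) = 1.25            (incentive compatibility)
    0.9 sqrt(w_h) + 0.1 sqrt(w_l) = 2       (willingness to work)
    Substituting the first into the second gives sqrt(w_l) = 2 - 0.9 * 1.25.
    """
    root_l = 2 - 0.9 * 1.25
    root_h = root_l + 1.25
    w_h, w_l = root_h ** 2, root_l ** 2
    return w_h, w_l, expected_profit(2, w_h, w_l)


_, _, profit_low = first_best(1)
_, _, profit_high = first_best(2)
w_h, w_l, profit_sb = second_best_high_effort()

print("first-best profit, low effort: ", round(profit_low, 4))
print("first-best profit, high effort:", round(profit_high, 4))
print("second-best wages (w_h, w_l):  ", round(w_h, 4), round(w_l, 4))
print("second-best profit:            ", round(profit_sb, 4))
print("gap between first and second best:", round(profit_high - profit_sb, 4))
```

At the second-best wages, `net_utility(2, w_h, w_l)` and `net_utility(1, w_h, w_l)` are both zero up to rounding, which confirms that both constraints bind: the agent is indifferent between high effort, low effort, and not working.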
Therefore the absence of information creates an efficiency loss equal to the difference between 5.5 and 5.36, or about 0.14 tons. In short, the principal-agent relationship creates another market failure due to information asymmetry.

20.6 What Should Be Done About Market Failures Caused By Asymmetric Information

In the sections above, we have seen that imperfect information, and in particular asymmetrically distributed imperfect information, will lead to inefficiency or market failure. So what should be done? The answer to this question, in a nutshell, is to devise some way to make the information flow from those who have it to those who don’t, or to develop schemes that give the correct incentives to people. Remedies include:

1. Signaling devices. These are rules or mechanisms that cause the side with information to reveal that information. In the used car market, for instance, there are various possible signaling devices, including inspection stickers and dealer used car certifications. Car sellers might be required to provide repair histories. In the health insurance market, buyers of insurance policies might be required to undergo physical examinations, or to answer a series of questions about their health status on an application form. In real estate markets, where there are “lemon” houses, many states and cities require disclosure statements in which a seller answers a long list of questions about plumbing, wiring, heating costs, susceptibility to flooding, and so on. Firms hiring employees often require proof of certification to establish a job candidate’s ability to be an electrician, a welder, or a phlebotomist, or proof of licensure to be a doctor or lawyer, or proof of a college degree to be a teacher or a stock analyst.

2. Screening contracts.
With screening contracts, as opposed to signaling devices, the side of the market with poor information tries to get private information from the other side by drafting contracts that create incentives for the privately informed agents to self-select into groups with different risk characteristics. By choosing the contracts that they prefer, the buyers in effect reveal their information. For example, in the health sector, where the actual health status of a person may be private information, health insurance contracts with different levels of premiums, copays, and deductibles might be designed by insurance companies to separate the buyers. Under certain conditions, it is possible to offer the buyers contracts which will successfully separate the different risk types. Each type will have incentives to choose a different coverage-copay combination, and insurance companies will be able to infer a buyer’s risk characteristics by observing which contract she chooses. In the principal-agent relationship, it might be possible to design contracts or compensation schemes which better align the goals of the parties.

3. Monitoring. The moral hazard problems created by insuring drivers or homeowners can be reduced by requiring periodic updates of information on speeding violations, drunk driving arrests, and so on for drivers, and on building condition, repairs, additions, local real estate prices, and so on for homeowners. Agents who might exert low effort when their principals expect high effort can be subjected to monitoring. For example, we have had time punch clocks since the 19th century, and now we have GPS systems and a thousand other exotic devices which allow employers to watch their employees, literally and figuratively.

4. The legal system. Legal contracts (and the common law) often provide incentives for sellers (or buyers) to inform buyers (or sellers). Many car sellers bundle warranties with the used cars they sell.
If the car needs a repair within the next X months, the dealer will provide it for free. Insurance companies may be able to cancel a health insurance contract if the person who buys it misrepresents her health history. An auto liability insurance policy may be canceled if the driver misrepresents her driving record. Many goods (other than cars) are sold with guarantees or warranties attached. These are legal contracts which commit the seller to repair or replace the product in case of a defect. Legal rules of implied warranty may find that the seller of a defective product has obligations to the buyer, even if such obligations were never written or spoken. Legal liability rules often place costs on the seller of a defective product, if that product results in an accident or an injury. Liability rules might also bear on moral hazard problems, and a principal, in some circumstances, might be able to sue an agent for low effort. In fact, there is a whole world out there of guarantees and warranties, explicit and implicit; legal remedies when explicit or implicit promises are broken; litigation in which the party who was misled or not informed attempts to recover from the party who misled and hid the truth; and litigation in which the party which expected great effort sues the party that didn’t work hard enough.

20.7 A Solved Problem

The Problem

Consider the market for health insurance with two populations with different probabilities of illness, as described in Section 3 above. Follow all the assumptions about probabilities of illness, treatment costs, willingness to pay, and so on, as in the text, except for the assumption about the numbers of people in the two groups. Assume instead that the number of people in group 1 is 9,800 and the number of people in group 2 is 200.

(a) Suppose the insurance companies cannot tell the difference between group 1 consumers and group 2 consumers. Describe the equilibrium.
(b) What will happen if the insurance companies devise a test that allows them to tell the difference between a group 1 consumer and a group 2 consumer?

(c) If the government steps in and says the insurance companies are not allowed to discriminate between group 1 consumers and group 2 consumers, what happens?

The Solution

We are assuming group 1 consumers face a 0.01 probability of illness and group 2 consumers face a 0.20 probability of illness. An illness in either group costs $100,000 to treat. The willingness to pay for an insurance policy is $1,200 in group 1 and $10,000 in group 2. The total population is 10,000.

Over the entire population, the expected number of people who will get sick is 0.01 × 9,800 + 0.20 × 200 = 98 + 40 = 138. Therefore the probability of illness, over the entire population, is 138/10,000 = 0.0138. It costs $100,000 to treat the illness. If all cases were treated, the expected treatment cost per person over the entire population would be 0.0138 × $100,000 = $1,380.

(a) If insurance companies cannot tell the difference between group 1 consumers and group 2 consumers, they must charge everybody the same price. Assume the insurance companies try to charge a price P.

(1) If P ≤ $1,200, everybody is willing to buy the policy, but the insurance companies lose money because the expected treatment cost per person over the entire population is $1,380. Result: the insurance companies must raise their rates.

(2) If $1,200 < P ≤ $10,000, group 1 consumers all drop the insurance, leaving only high risk group 2 consumers. But the expected payout for group 2 consumers is 0.20 × $100,000 = $20,000. Result: collapse.

(3) If $10,000 < P, nobody buys insurance.

(b) If the insurance companies devise a test to discriminate between consumers in different groups, they will want to charge different prices, say P1 for group 1 consumers and P2 for group 2 consumers.
Each price would have to at least cover the expected payout in that group. Therefore, for low risk consumers, the price would have to be at least 0.01 × $100,000 = $1,000, and for high risk consumers, the price would have to be at least 0.20 × $100,000 = $20,000. There would be an equilibrium at any pair of prices (P1, P2) satisfying $1,000 ≤ P1 ≤ $1,200 and $20,000 ≤ P2. All group 1 consumers would end up with insurance, and all group 2 consumers would end up without insurance. This is the efficient outcome.

(c) If insurance companies are told that they are not allowed to discriminate between the two groups, one of two things happens:

(i) The market collapses and no one gets insurance.

(ii) The government subsidizes the insurance company losses for group 2 consumers. (Various rules incorporated in the 2010 Patient Protection and Affordable Care Act in the U.S., called “ObamaCare” by its opponents, have this effect.) For example, if insurance companies charged a single price to all consumers of P = $1,200, everybody would buy insurance. Insurance companies would cover their expected payout of $1,000 on each group 1 member, with $200 to spare, and those consumers would have no gains from the arrangement, since they are paying their maximum willingness to pay. On average, insurance companies would lose $20,000 − $1,200 = $18,800 for each group 2 consumer. The government would have to subsidize the companies with $18,800 for each group 2 consumer. Each one of those consumers would gain $10,000 − $1,200 = $8,800. The net loss to society, for each group 2 consumer, would equal the government’s subsidy less the consumer’s gain, or $18,800 − $8,800 = $10,000.

Exercises

1. Harry has decided to start a used car dealership. There are three types of used cars: type A cars are valued at $3,000, type B cars are valued at $2,000, and type C cars are valued at $1,000. The true type of the car is known to the owners but not to Harry.
Type A, type B, and type C cars are worth $2,400, $1,600, and $800, respectively, to their owners.

(a) Suppose Harry believes that the three types exist with equal probabilities. How much is he willing to pay for a used car? What types of cars will be bought and sold in equilibrium?

(b) After a month in business, Harry realizes he is overpaying for the types of cars he is getting. He revises his prior beliefs and now thinks that he has zero probability of getting a type A car, and equal probability of getting a type B car and a type C car. How much is he now willing to pay for a used car? What types of cars will be bought and sold in equilibrium?

2. There are 1,000 individuals in the city of Lincoln wishing to sell their used cars. The value of a car, V, ranges between $0 and $3,000. The distribution of values is such that the number of used cars worth less than $V is V/3. The true value of the car is known only to the owner. Potential buyers are risk neutral and value a car at its expected value. An owner may choose to have his car inspected for a fee of $300, and will then be able to sell his car for the true value.

(a) Suppose nobody has his car inspected. What would the market price for used cars be?

(b) Now suppose every car worth more than $X gets inspected, while every car worth less than $X does not get inspected. What would the market price of uninspected used cars be, as a function of X?

(c) In equilibrium, the owner of a car worth $X is indifferent between getting an inspection and not getting an inspection. What is the equilibrium value of X?

(d) How many cars will not get inspected? How much will each car sell for?

3. Placido, Jose, and Luciano are singers. Their probabilities of laryngitis, known to each of them, are \(p_p = 0.6\), \(p_j = 0.2\), and \(p_l = 0.1\), respectively. An individual who has laryngitis forgoes income of 1 million dollars, from a big concert.
(a) Calculate each singer’s expected loss from laryngitis.

They are contemplating buying insurance, which will provide a payment of $1 million in the event of laryngitis. Each singer’s willingness to pay for insurance is as follows: \(WTP_p\) = $500,000, \(WTP_j\) = $175,000, and \(WTP_l\) = $125,000. The insurance company knows that the probabilities of laryngitis are 0.6, 0.2, and 0.1, but does not know which tenor has which probability.

(b) Is Placido risk averse, risk neutral, or risk loving? How about Jose and Luciano? How can you tell?

(c) Calculate the expected value of the insurance company’s total payout, assuming the insurance company insures all three.

(d) Suppose the insurance company sets the price to equal the expected payout above. Who will buy insurance? Will the insurance company make a profit or a loss?

4. There are 1,000 identical homeowners living in a small town. Let the benefit of home ownership be H. One of them, Kevin, is debating whether to get homeowners’ insurance. If he locks the door 100 percent of the time, there is a 2 percent probability that his house will be burgled. If he locks the door 80 percent of the time, the probability of burglary rises to 6 percent. However, locking the door 80 percent of the time is more convenient, and gives him a utility equivalent to $100. If his house is burgled, Kevin loses $5,000. Kevin is willing to pay $250 to insure against a 2 percent probability of a $5,000 loss, and $750 to insure against a 6 percent probability of a $5,000 loss. Suppose the insurance market is competitive, and charges premiums sufficient to cover expected losses. Assume that insurance companies are unable to monitor how often a homeowner locks his door.

(a) If Kevin does not buy insurance, how often will he lock his door?

(b) If everybody locks his door 100 percent of the time, how much will an insurance policy cost?
(c) If everybody locks his door 80 percent of the time, how much will an insurance policy cost?

(d) How much will insurance policies cost in equilibrium? How often will Kevin lock his door? What is his net benefit?

(e) Explain why the outcome in part (d) is not socially optimal.

5. In the principal-agent model discussed in section 5 above, we assumed the agent was paid \(w_h\) if the farm output was high (10 tons), and was paid \(w_l\) if the farm output was low (5 tons). Suppose that instead of assuming contingent payments of \(w_h\) tons and \(w_l\) tons in the high output and low output cases, respectively, we simply assumed the agent was paid a fixed fraction \(c\) of the output. (The symbol \(c\) stands for “commission.”) That is, if the output is 10, the agent gets \(10c\), and if the output is 5, the agent gets \(5c\). (There are many principal-agent relationships like this; for instance, a real estate agent selling a property may be paid 5 percent of the sales price, and so \(c = 0.05\) for such an agent.)

(a) Solve for the \(c\) that would make the agent indifferent between working at low effort and not working at all. Find the expected profit for the principal if the agent did low effort work and was paid based on that \(c\).

(b) Solve for the \(c\) that would make the agent indifferent between working at high effort and not working at all. Find the expected profit for the principal if the agent did high effort work and was paid based on that \(c\).

(c) If the agent is free to work at low effort or at high effort, how high would \(c\) have to be to guarantee that he works at high effort? (This is the incentive compatibility part of this exercise.) Comment on the \(c\) you have calculated.

6. There are two types of workers in Nephilim. Low productivity workers produce $900 worth of output a month and high productivity workers produce $3,000 worth of output a month. There are twice as many low productivity workers as there are high productivity workers.
Both low and high productivity workers have utility function \(u(w) = 2\sqrt{w}\), where \(w\) is the monthly wage.

(a) Suppose there is no way of distinguishing between the two types of workers, so everyone is paid the same wage. Assuming a competitive labor market, what is this wage?

Firm A has hired a consultant to solve this information asymmetry. The consultant suggests offering an elective training course. Workers who take the course will earn an extra $600 a month. The training course has no effect on productivity, but is extremely boring. Taking the course is equivalent to an $X monthly wage cut for low productivity workers, and a $Y monthly wage cut for high productivity workers.

(b) Suppose both high and low productivity workers choose to take the course. What can you say about X and Y?

(c) The consultant revises the $600 incentive down to $400. Now only high productivity workers choose to take the course. What can you say about X and Y?
Index adverse selection, 385, 387 Akerloff, G., 384 allocation competitive equilibrium allocation, 286 feasible allocation, 278, 279, 286 non-feasible allocation, 280 Arrow, K., 283 asymmetric information, 383, 385, 388 average cost, 135–137 average cost function, 136 average product, 140, 171 average total cost, 174 average variable cost, 174 backward induction, 265 bankruptcy, 150 battle of the sexes, 251, 257 expanded battle of the sexes, 251, 262, 263 Bertrand competition, 240 Bertrand duopoly model, 229 Bertrand equilibrium, 242, 258 Bertrand, J., 229, 240 Bowley, A., 276 budget constraint, 35 intertemporal budget constraint, 38 standard budget constraint, 35 budget line, 35 budget set, 35 bundle, 16 cap and trade, 320, 332, 333, 336 capital market, 83 carrot and stick, 388 cartel, 236 certainty equivalent, 377, 378 change in consumer’s surplus, 119 Clarke, E., 359 classical school, 12 Coase Theorem, 330 Coase, R., 328 Coasian property rights, 320, 328 Cobb, C., 47, 69, 166 Cobb-Douglas production function, 166, 177 Cobb-Douglas utility function, 47 collusion, 235, 238 command policy, 355 comparative statics, 55 compensating variation/equivalent variation para- dox, 103, 114, 116 competitive equilibrium, 232, 233, 275, 283, 286, 305, 311 competitive firm, 183 competitive market, 133, 144, 183, 184, 221, 283, 287, 302 competitive market equilibrium, 183, 190, 191 complements, 60, 61 completeness, 16, 18, 372 concavity, 132 conditional factor demand, 135, 160 406 Index 407 conditional input demand, 135, 142, 160 consumer’s surplus, 113, 116, 118, 119, 191 consumers’ surplus, 113, 121–123, 191, 195, 210, 211, 277 consumption bundle, 16 consumption/leisure model, 74, 78, 299 continuity, 372 contract curve, 282 convexity, 16, 21, 153 core, 283, 288 corner solution, 43 cost, 134 cost minimization, 158–160, 172 coupon rationing, 38 Cournot competition, 229 Cournot equilibrium, 231–233, 258 Cournot, A., 229 deadweight loss, 183, 195, 196, 198, 199, 210, 211, 220, 233 
Debreu, G., 283 Defoe, D., 275 demand curve, 55, 58, 207 compensated demand curve, 56, 65, 125 inverse demand curve, 55, 207 demand function, 55 demand-revealing mechanism, 359–361 differentiated goods, 243, 245 diminishing returns, 132 discount factor, 40 distributional fairness, 279 dominant strategy, 253 dominant strategy equilibrium, 251, 254, 258 Douglas, P., 47, 69, 166 Dresher, M., 252 duopoly, 228 economic problem, 12 economy exchange economy, 275 Edgeworth box diagram, 276, 277, 281 Edgeworth, F., 12, 276 efficiency, 277, 298, 325, 334, 343, 357 elasticity, 66 price elasticity of demand, 67, 208 endowment initial endowment, 276 Engel curve, 55, 57 Engel, E., 57 equity, 279 excess demand, 190, 304 excess supply, 190, 305 expected utility property, 374 expected value, 370 experimental evidence on games, 255 externality, 319, 320, 343 feasible, 17 fixed cost, 172 Flood, M., 252 free entry, 184 free exit, 184 free rider, 343, 352 Index 408 gamble, 369 game, 251, 369 battle of the sexes, 251, 257 centipede game, 272 chess, 264 duopoly sequential competition, 266 expanded battle of the sexes, 251, 262 matching pennies, 264 mutual assured destruction, 268 one shot game, 255 prisoners’ dilemma, 251, 252, 254 repeated game, 255 sequential move game, 264 simultaneous move game, 264 tic-tac-toe, 264 tit for tat, 251, 256 game theory, 229, 251 game tree, 265 general equilibrium, 279 general equilibrium analysis, 14 general equilibrium model, 298 Giffen good, 58 Giffen, R., 58 Groves, T., 359 health insurance market, 385 Hicks, J., 62 homogeneity, 241 homogeneous goods, 184, 241, 242 implied warranty, 399 incentive compatibility constraint, 396 incentive compatible, 360 income effect, 55, 62–64 income expansion path, 57 independence, 372 indifference, 18 indifference curve, 16, 19, 26 indifference relation, 18 individual demand function, 55 industry supply curve, 194 horizontal industry supply curve, 189 long run industry supply curve, 187 short run industry supply 
curve, 186 upward-sloping long-run industry supply curve, 188 inefficiency, 277 inferior good, 55, 57 inflation rate, 84 input demand curve, 142 insurance auto insurance, 388 flood insurance, 389 homeowner’s insurance, 389 interest rate, 84 nominal interest rate, 84 real interest rate, 86 internalizing the externality, 324 intertemporal budget constraint, 38, 83–85 intertemporal budget line, 85 inverse demand, 59 inverse production function, 134, 143 invisible hand, 254 Index 409 isocost line, 158 isofactor curve, 148 isoquant, 151, 152 isorevenue line, 148 Jevons, W., 12 Kaldor, N., 64 Keynes, J., 150 labor, 73, 299 labor market, 83 labor supply curve, 76 labor supply function, 76 Lagrange function, 45, 52 Lagrange function method, 45, 52 Lagrange, J., 52 law of demand, 58 legal contract, 398 leisure, 73, 74, 299 Leontief, W., 71 liability rules, 399 Lindahl equilibrium, 355, 357 Lindahl, E., 355 long run, 150, 151, 185 long run cost curve, 160 long run cost function, 160 long run market theory, 185 lottery, 370 compound lottery, 370, 373 Luce, R., 257 Malthus, T., 132 marginal benefit, 347 marginal cost, 135–137, 174, 208, 215, 217, 220 marginal cost function, 136 marginal external cost, 323 marginal product, 140, 152, 171 marginal rate of substitution, 16, 22, 23, 26, 27, 301 marginal revenue, 208, 215, 220 marginal unit, 118 marginal utility, 16, 27, 375 market cap and trade market, 320, 332, 333, 336 market for pollution rights, 320, 331 market constraints, 131, 133 market demand, 68 market demand curve, 56 market equilibrium, 190, 275, 348 market failure, 319, 326, 343, 385, 388, 390, 397 market failures, 196 market for lemons, 384 market supply curve, 194 horizontal market supply curve, 189 long run market supply curve, 187 short run market supply curve, 186 upward-sloping long-run market supply curve, 188 markup, 209 Marshall, A., 12, 58 Marx, K., 12 McKenzie, L., 283 microeconomics, 13 Index 410 Mill, J., 12 money good, 115 monitoring, 398 monopolist, 184, 
205 monopolistic competition, 205, 218 monopoly, 206, 221, 232, 233 legal monopoly, 206 natural monopoly, 205 monopoly firm, 205 monopoly market, 205 monopoly profit maximization, 207 monopsony, 206 monotonicity, 16, 19, 131, 153 moral hazard, 383, 388, 390 Morgenstern, O., 251, 368, 371 mutual assured destruction, 268 Nash equilibrium, 251, 258, 353 mixed strategy Nash equilibrium, 261 pure strategy Nash equilibrium, 261 Nash, J., 258, 261 neoclassical school, 12 non-labor income, 78 normal good, 55, 56 numeraire, 289, 302 numeraire good, 39, 115 ObamaCare, 401 oligopoly, 228 opportunity cost of leisure, 75 optimal choice, 41, 44 overtime pay, 80 Pareto domination, 278, 300, 312 Pareto efficiency, 94, 275, 278, 300, 312 Pareto efficient, 279 Pareto move, 278, 283 Pareto optimal, 354 Pareto optimality, 94, 275, 278, 279, 281, 283, 287, 298, 300, 301, 312, 334, 343, 357, 359, 360 Pareto, V., 12, 278 partial derivative, 27 partial equilibrium, 278 partial equilibrium analysis, 14 Patient Protection and Affordable Care Act of 2010, 401 perfect competition, 183 perfect complements, 43 perfect information, 184, 367 perfect substitutes, 32, 43 Pigou, A., 327 pollution rights, 320, 331 Polonius, 84 preference relation, 17 strict preference relation, 17 weak preference relation, 18 preferences, 16 well behaved preferences, 21 present value of consumption stream, 40 of income stream, 40 present value of income stream, 85 price constraints, 133 Index 411 price discrimination, 205, 212 common price discrimination, 212, 213 first degree price discrimination, 213, 216 perfect price discrimination, 213, 216 second degree price discrimination, 212 third degree price discrimination, 212, 213 price-taking behavior, 184 principal-agent problem, 383, 391, 397 prisoners’ dilemma, 251, 252, 254 private good, 343, 346 probability, 370 producer’s surplus, 183, 193, 210, 211 producers’ surplus, 183, 194, 195, 277 production, 130 production function, 131, 149 production model 
multiple-input/single-output production model, 130, 149 single-input/single-output production model, 130 production technique, 151 profit, 130, 134 profit maximization, 137, 138, 141, 159, 163, 175, 177, 217, 220 first order condition for profit maximiza- tion, 138 joint profit maximization, 324 second order condition for profit maximiza- tion, 138 public good, 343 public good model, 346 quasi-public good, 345 quasilinear preferences, 113, 115, 119, 347 quasilinearity, 113, 117, 125, 347 Raiffa, H., 257 randomness, 367, 369 rationality, 12, 259 real wage, 75 real-world convexity/concavity, 133 repeated games, 251 reservation price, 384 returns to scale, 155 constant returns to scale, 155, 157, 160 decreasing returns to scale, 156, 162 increasing returns to scale, 156, 161 revealed preference, 94, 110 strong axiom of revealed preference, 111 weak axiom of revealed preference, 110 revenue, 134 Ricardo, D., 12, 132 risk averse, 375 risk loving, 377 risk neutral, 378 Robinson Crusoe, 275 Russell, B., 73 Samuelson optimality condition, 343, 350, 351, 354, 357 Samuelson, P., 110, 351 savings, 39 income effect on savings, 88 substitution effect on savings, 87 Index 412 screening contract, 398 second-price auction, 361 set less preferred set, 20 more preferred set, 20 Shakespeare, W., 84 short run, 170, 185 short run cost function, 173 short run demand function, 172 short run market theory, 185 short run production function, 171 short run total cost, 173 signaling device, 397 Slutsky, E., 62 Smith, A., 12, 94, 132, 195, 254, 287 social surplus, 191, 194, 195 social surplus maximization, 195 Spence, M., 384 Stackelberg competition, 239 Stackelberg duopoly model, 229 Stackelberg follower, 229, 239 Stackelberg leader, 229, 239 Stackelberg, H., 229, 239 Stiglitz, J., 384 strategic behavior, 228, 251 strategy, 253 mixed strategy, 261 pure strategy, 261 subsidy Pigouvian subsidy, 320, 326, 327 substitutes, 60, 61 substitution effect, 55, 62–64 Hicks substitution effect, 99 
Kaldor substitution effect, 99 Slutsky substitution effect, 112 supply function, 144 supply of labor, 73 supply of savings, 73, 83, 86 tatonnement process, 286 tax ad valorem tax, 95 carbon emissions tax, 98 demand-revealing tax, 344, 359 federal income tax, 81 flat tax, 73, 81, 83 gasoline tax, 95 income tax, 80 just tax, 355 lump sum tax, 94–96 lump-sum tax, 289 payroll tax, 80 per head tax, 96 per unit tax, 94–97, 196–199 percentage tax, 95 Pigouvian tax, 320, 326, 327 poll tax, 96 progressive tax, 73, 80, 81, 83, 355 regressive tax, 81 sales tax, 95 Social Security tax, 81 specific tax, 95 taxation according to ability to pay, 355 taxation according to benefit, 355 taxation according to marginal benefit, 356 value added tax, 95 Wicksell/Lindahl tax, 343, 355 tax rebate per unit tax rebate, 97 technical rate of substitution, 154 technological constraints, 131, 149 technological efficiency, 151, 300, 301 technological inefficiency, 300 theory of the firm long run theory of the firm, 150 short run theory of the firm, 150, 170 threat, 267 tit for tat, 251, 256 total cost, 135, 172 total cost function, 135 transfer lump-sum transfer, 289 transitivity, 16, 18, 372 Tucker, A., 252 U-shaped average cost curve, 133 uncertainty, 367–369 unemployment benefits, 79 unequal probabilities, 372 utility cardinal utility, 25, 369, 375 expected utility, 374 ordinal utility, 25, 113, 369 utility function, 16, 24 Cobb-Douglas utility function, 124 quasilinear utility function, 119 utility maximization, 44 value of average product, 141 value of marginal product, 141 variable cost, 172 variation compensating variation, 100–102, 113, 119 equivalent variation, 100–102, 113, 119 Veblen good, 59 Veblen, T., 59 Vickrey auction, 361 Vickrey, W., 361 voluntary contribution mechanism, 352 von Neumann, J., 251, 368, 371 von Neumann-Morgenstern expected utility, 369, 371 von Neumann-Morgenstern expected utility theorem, 373 von Neumann-Morgenstern utility, 383 von
Neumann-Morgenstern utility function, 368 wage rate, 74 Walras’ Law, 292 Walras, L., 12, 283, 292 Walrasian equilibrium, 283, 286, 305, 311 Walrasian process, 286 welfare change for many people, 121 welfare change for one person, 98 welfare economics, 94 first fundamental theorem of welfare economics, 275, 287, 288, 312 second fundamental theorem of welfare economics, 275, 288, 290, 312 Wicksell, K., 355, 356 willingness-to-pay, 104, 118 naive willingness-to-pay, 104, 118 zero savings point, 84