Econ 121b: Intermediate Microeconomics

Dirk Bergemann, Spring 2012

1 Introduction

1.1 What’s Economics?

This is an exciting time to study economics, even though it may not be so exciting to be part of this economy. We have faced the largest financial crisis since the Great Depression. $787 billion has been pumped into the economy in the form of a stimulus package by the US Government. $700 billion has been spent on the Troubled Asset Relief Program for the banks. The unemployment rate has been high for a long time; the August unemployment rate was 9.7%. At the same time there have been big debates on health care reform, government deficits, climate change, and so on. We need answers to all of these big questions and many others. And all of these come under the purview of the discipline of economics (along with other fields of study). But then how do we define this field of study? In terms of subject matter it can be defined as the study of the allocation of scarce resources. A more pragmatic definition might be: economics is what economists do! In terms of methodology, optimization theory, statistical analysis, game theory, and so on characterize the study of economics. One of the primary goals of economics is to explain human behavior in various contexts: humans as consumers of commodities, decision makers in firms, heads of families, or politicians holding political office. The areas of research extend from international trade, taxes, economic growth, and antitrust to crime, marriage, war, law, media, corruption, and more. There are many opportunities for us to bring our way of thinking to these issues. Indeed, one of the most active areas of the subject is to push this frontier.

Economists like to think that the discipline follows Popperian methods, moving from stylized facts to hypothesis formation to hypothesis testing. The Popperian tradition tells you that hypotheses can only be proven false empirically, never proven true. Hence an integral part of economics is to gather information about the real world in the form of data and test whatever hypotheses economists propose. What this course builds up, however, is how to come up with sensible hypotheses that can be tested. Thus economic theory is the exercise of hypothesis formation, using the language of mathematics to formalize assumptions (about certain fundamentals of human behavior, market organization, the distribution of information among individuals, etc.). Some critics of economics say our models are too simplistic. We leave too many things out. Of course this is true: we do leave many things out, but for a useful purpose. It is better to be clear about an argument, and focusing on specific things in one model helps us achieve that. Failing to formalize a theory does not necessarily imply that the argument is generic and holistic; it just means that the requirement of specificity in the argument is not as high.

Historically most economists have relied on maximization as a core tool of economics, and as a matter of good practice most of what we will discuss in this course follows this tradition: maximization is much easier to work with than the alternatives. But philosophically I don't think that maximization is necessary for work to be considered part of economics. You will have to decide on your own. My own view is that there are three core tools:

• The principle that people respond to incentives

• An equilibrium concept that assumes the absence of free lunches

• A welfare criterion saying that more choices are better

One last methodological point: Milton Friedman divided the field into positive and normative economics:

• Positive economics – why the world is the way it is and looks the way it does

• Normative economics – how the world can be improved

Both areas are necessary and sometimes merge perfectly. But there are often

tensions. We will return to this throughout the rest of the class. What I hope you

will get out of the course are the following:

• Ability to understand basic microeconomic mechanisms

• Ability to evaluate and challenge economic arguments

• Appreciation for economic way of looking at the world

We now describe a very simple form of human interaction in an economic context, namely trade: the voluntary exchange of goods or objects between two people, one called the seller (the current owner of the object) and the other the buyer (someone who has a demand or want for that object). This is referred to as bilateral trading.

1.2 Gains from Trade

1.2.1 Bilateral Trading

Suppose that a seller values a single, homogeneous object at c (opportunity cost),

and a potential buyer values the same object at v (willingness to pay). Trade could

occur at a price p, in which case the payoff to the seller is p − c and to the buyer

is v − p. We assume for now that there is only one buyer and one seller, and only


one object that can potentially be traded. If no trade occurs, both agents receive

a payoff of 0.

Whenever v > c there is the possibility for a mutually beneficial trade at some

price c ≤ p ≤ v. Any such allocation results in both players receiving non-negative

returns from trading and so both are willing to participate (p − c and v − p are

non-negative).
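The payoff logic above can be sketched in code; the numbers c = 3, v = 8, and the prices tried are hypothetical:

```python
# Hypothetical numbers: seller's cost c, buyer's value v, candidate price p.
def trade_payoffs(c, v, p):
    """Payoffs (seller, buyer) if trade occurs at price p."""
    return (p - c, v - p)

def mutually_beneficial(c, v, p):
    """Both agents weakly prefer trading at p to the no-trade payoff of 0."""
    seller, buyer = trade_payoffs(c, v, p)
    return seller >= 0 and buyer >= 0

c, v = 3, 8                          # v > c, so gains from trade exist
print(trade_payoffs(c, v, 5))        # (2, 3): the surplus v - c = 5 is split
print(mutually_beneficial(c, v, 5))  # True: c <= 5 <= v
print(mutually_beneficial(c, v, 9))  # False: price exceeds the buyer's value
```

Any price p with c ≤ p ≤ v passes the check; how the surplus is split between the two agents depends on which such p is agreed upon.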

There are many prices at which trade is possible. And each of these allocations,

consisting of whether the buyer gets the object and the price paid, is efficient in

the following sense:

Definition 1. An allocation is Pareto efficient if there is no other allocation that

makes at least one agent strictly better off, without making any other agent worse

off.

1.2.2 Experimental Evidence

This framework can be extended to consider many buyers and sellers, and to allow

for production. One of the most striking examples comes from international trade.

We are interested, not only in how specific markets function, but also in how

markets should be organized or designed.

There are many examples of markets, such as the NYSE, NASDAQ, E-Bay and

Google. The last two consist of markets that were recently created where they did

not exist before. So we want to consider not just existing markets, but also the

creation of new markets.

Before elaborating on the theory, we will consider three experiments that illustrate how these markets function. We can then interpret the results in relation

to the theory. Two types of cards (red and black) with numbers between 2 and

10 are handed out to the students. If the student receives a red card they are a

seller, and the number reflects their cost. If the student receives a black card they

are a buyer, and this reflects their valuation. The number on the card is private

information. Trade then takes place according to the following three protocols.

1. Bilateral Trading: One seller and one buyer are matched before receiving

their cards. The buyer and seller can only trade with the individual they are

matched with. They have 5 minutes to make offers and counter offers and

then agree (or not) on the price.

2. Pit Market: Buyer and seller cards are handed out to all students at the

beginning. Buyers and sellers then have 5 minutes to find someone to trade

with and agree on the price to trade.

3. Double Auction: Buyer and seller cards are handed out to all students at

the beginning. The initial price is set at 6 (the middle valuation). All buyers


and sellers who are willing to trade at this price can trade. If there is a

surplus of sellers the price is decreased, and if there is a surplus of buyers

then the price is increased. This continues for 5 minutes until there are no

more trades taking place.
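The price-adjustment rule of the third protocol can be sketched as follows. The card draws and the unit price step below are made up for illustration; the classroom version runs on live offers rather than a mechanical rule:

```python
# A stylized double auction: sellers trade if price >= cost, buyers trade if
# price <= valuation; the price moves against the long side of the market.
def excess_demand(price, costs, values):
    sellers = sum(1 for c in costs if c <= price)   # willing to sell
    buyers = sum(1 for v in values if v >= price)   # willing to buy
    return buyers - sellers

def double_auction(costs, values, price=6, step=1, max_rounds=20):
    for _ in range(max_rounds):
        z = excess_demand(price, costs, values)
        if z == 0:
            break
        price += step if z > 0 else -step  # raise price if buyers are long
    return price

costs = [2, 3, 4, 8, 10]    # red cards (hypothetical draw)
values = [5, 6, 7, 9, 10]   # black cards (hypothetical draw)
print(double_auction(costs, values))  # price rises from 6 to 7 here
```

Starting from the middle valuation 6, three sellers but four buyers are willing to trade, so the price is pushed up until the two sides balance.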

2 Choice

In the decision problem in the previous section, the agents had a binary decision: whether to buy (sell) the object. However, there are usually more than two

alternatives. The price at which trade could occur, for example, could take on

a continuum of values. In this section we will look more closely at preferences,

and determine when it is possible to represent preferences by “something handy,”

which is a utility function.

Suppose there is a set of alternatives X = {x1, x2, . . . , xn} for some individual

decision maker. We are going to assume, in a manner made precise below, that

two features of preferences are true.

• There is a complete ranking of alternatives.

• “Framing” does not affect decisions.

We refer to X as a choice set consisting of n alternatives, and each alternative x ∈ X is a consumption bundle of k different items. For example, the first element of the bundle could be food, the second element could be shelter, and so on. We will denote preferences by ⪰, where x ⪰ y means that "x is weakly preferred to y." All this means is that when a decision maker is asked to choose between x and y they will choose x. Similarly, x ≻ y means that "x is strictly preferred to y," and x ∼ y indicates that the decision maker is "indifferent between x and y." The preference relation ⪰ defines an ordering on X × X. We make the following three assumptions about preferences.

Axiom 1. Completeness. For all x, y ∈ X either x ⪰ y, y ⪰ x, or both.

This first axiom simply says that, given two alternatives the decision maker

can compare the alternatives, and will weakly prefer one of the alternatives to the

other, or will be indifferent, in case both are weakly preferred to each other.

Axiom 2. Transitivity. For all triples x, y, z ∈ X, if x ⪰ y and y ⪰ z then x ⪰ z.

Very simply, this axiom imposes some level of consistency on choices. For example, suppose there were three potential travel locations: Tokyo (T), Beijing (B), and Seoul (S). If a decision maker, when offered the choice between Tokyo and Beijing, weakly prefers to go to Tokyo, and when given the choice between Beijing and Seoul weakly prefers to go to Beijing, then this axiom simply says that if she were offered a choice between a trip to Tokyo and a trip to Seoul, she would weakly prefer to go to Tokyo. This is because she has already demonstrated that she weakly prefers Tokyo to Beijing, and Beijing to Seoul, so weakly preferring Seoul to Tokyo would mean that her preferences are inconsistent.

[Figure 1: Indifference curve — axes good 1 and good 2, showing a bundle x, a bundle b, its indifference curve, and the set of bundles preferred to b]

But it is conceivable that people might violate transitivity in certain circumstances. One of these is the "framing effect": the idea that the way the choice alternatives are framed may affect decisions, and hence may ultimately violate transitivity. The idea was made explicit by an experiment due to Daniel Kahneman and Amos Tversky (1984). In the experiment, students visiting the MIT Coop to purchase a stereo for $125 and a calculator for $5 were informed that the calculator is on sale for 5 dollars less at the Harvard Coop. The question is: would the students make the trip? Suppose instead the students were informed that the stereo is 5 dollars less at the Harvard Coop. Kahneman and Tversky found that the fraction of respondents who would travel for the cheaper calculator is much higher than for the cheaper stereo. But when the students were instead told that there is a stockout, that they have to go to the Harvard Coop anyway, and that they will get 5 dollars off either item as compensation, and were asked which item they would like the money off, most of them said that they were indifferent. Let x = go to Harvard and get 5 dollars off the calculator, y = go to Harvard and get 5 dollars off the stereo, and z = get both items at MIT. We have x ≻ z and z ≻ y, but the last question implies x ∼ y. Transitivity would imply x ≻ y, which is the contradiction.

[Figure 2: Convex preferences — axes good 1 and good 2, showing the mixture αy + (1 − α)y′ of two indifferent bundles y and y′]

For the purposes of this course we will assume away any such framing effects in the mind of the decision maker.

Axiom 3. Reflexivity. For all x ∈ X, x ⪰ x (equivalently, x ∼ x).

The final axiom is made for technical reasons, and simply says that a bundle cannot be strictly preferred to itself. Such preferences would not make sense.

These three axioms allow for bundles to be ordered in terms of preference. In

fact, these three conditions are sufficient to allow preferences to be represented by

a utility function.

Before elaborating on this, we consider an example. Suppose there are two goods, wine and cheese, and four consumption bundles z = (2, 2), y = (1, 1), a = (2, 1), b = (1, 2), where the two elements of the vector represent the amounts of wine and cheese. Most likely z ≻ y, since it provides more of everything (i.e., wine and cheese are "goods"). It is not clear how to compare a and b. What we can do is consider which bundles are indifferent to b. This is an indifference curve (see Figure 1). We can define it as

I_b = {x ∈ X : b ∼ x}

We can then (if we assume that more is better) compare a and b by considering

which side of the indifference curve a lies on: bundles above and to the right are

more preferred, bundles below and to the left are less preferred. This reduces

the dimensionality of the problem. We can speak of the “better than b” set as

the set of points weakly preferred to b. These preferences are "ordinal": we can ask whether x is in the better-than set, but this does not tell us how much x is preferred to b. It is common to assume that preferences are monotone: more of a good is better.

[Figure 3: Perfect substitutes (left) and perfect complements (right) — axes good 1 and good 2]

Definition 2. The preferences ⪰ are said to be monotone if x ≥ y ⇒ x ⪰ y, and strictly monotone if x ≥ y, x ≠ y ⇒ x ≻ y.¹

Suppose I want to increase my consumption of good 1 without changing my level of well-being. The rate at which I must change x2 to keep utility constant, dx2/dx1, is the marginal rate of substitution. Most of the time we believe that individuals like moderation. This desire for moderation is reflected in convex preferences: a mixture of two bundles, between which the agent is indifferent, is strictly preferred to either of the initial bundles (see Figure 2).

Definition 3. A preference relation is convex if for all y and y′ with y ∼ y′ and all α ∈ [0, 1] we have αy + (1 − α)y′ ⪰ y ∼ y′.

While convex preferences are usually assumed, there could be instances where

preferences are not convex. For example, there could be returns to scale for some

good.

Examples: perfect substitutes, perfect complements (see Figure 3). Both of

these preferences are convex.

Notice that indifference curves cannot intersect. If they did, we could take two points x and y, each lying to the right of the indifference curve the other lies on. We would then have x ≻ y ≻ x, and hence by transitivity x ≻ x, which contradicts reflexivity. So every bundle is associated with one, and only one, welfare level.

Another important property of a preference relation is continuity.

Definition 4. Let {xn}, {yn} be two sequences of choices. If xn ⪰ yn for all n, xn → x, and yn → y, then x ⪰ y.

¹If x = (x1, . . . , xN) and y = (y1, . . . , yN) are vectors of the same dimension, then x ≥ y if and only if xi ≥ yi for all i; x ≠ y means that xi ≠ yi for at least one i.


This property guarantees that there is no jump in preferences. When X is no

longer finite, we need continuity to ensure a utility representation.

2.1 Utility Functions

What we want to consider now is whether we can take preferences and map them

to some sort of utility index. If we can somehow represent preferences by such a

function we can apply mathematical techniques to make the consumer’s problem

more tractable. Working with preferences directly requires comparing each of

a possibly infinite number of choices to determine which one is most preferred.

Maximizing an associated utility function is often just a simple application of

calculus. If we take a consumption bundle x ∈ R^N_+, we can take a utility function as a mapping from R^N_+ into R.

Definition 5. A utility function (index) u : X → R represents a preference profile ⪰ if and only if, for all x, y ∈ X: x ⪰ y ⇔ u(x) ≥ u(y).

We can think about a utility function as an “as if”-concept: the agent acts “as

if” she has a utility function in mind when making decisions.

Is it always possible to find such a function? The following result shows that

such a function exists under the three assumptions about preferences we made

above.

Proposition 1. Suppose that X is finite. Then the assumptions of completeness, transitivity, and reflexivity imply that there is a utility function u such that u(x) ≥ u(y) if and only if x ⪰ y.

Proof. We define an explicit utility function. Let us introduce some notation:

B(x) = {z ∈ X : x ⪰ z}

Thus B(x) is the set of "all items below x." Let the utility function be defined as

u(x) = |B(x)|

where |B(x)| is the cardinality of the set B(x), i.e., the number of elements in B(x). There are two steps to the argument.

First part: u(x) ≥ u(y) ⇒ x ⪰ y.

Second part: x ⪰ y ⇒ u(x) ≥ u(y).

First part of the proof: By definition, u(x) ≥ u(y) ⇒ |B(x)| ≥ |B(y)|. If y ∈ B(x), then x ⪰ y by definition of B(x) and we are done. Otherwise, y ∉ B(x), and we will work towards a contradiction. Since y ∉ B(x), we have

|B(x) − {y}| = |B(x)|

Since y ∈ B(y) (by reflexivity), we have

|B(y)| − 1 = |B(y) − {y}|

Since |B(x)| ≥ |B(y)|, we have |B(x)| > |B(y)| − 1 and hence

|B(x) − {y}| > |B(y) − {y}|

Therefore there must be some z ∈ X − {y} such that x ⪰ z but not y ⪰ z. By completeness, z ⪰ y. By transitivity, x ⪰ y. But this implies that y ∈ B(x), a contradiction.

Second part of the proof: We want to show x ⪰ y ⇒ u(x) ≥ u(y). Suppose x ⪰ y and z ∈ B(y). Then x ⪰ y and y ⪰ z, so by transitivity x ⪰ z. Hence z ∈ B(x). This shows that when x ⪰ y, anything in B(y) must also be in B(x):

B(y) ⊂ B(x) ⇒ |B(x)| ≥ |B(y)| ⇒ u(x) ≥ u(y)

This completes the proof.
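The construction in the proof can be illustrated directly in code; the three-alternative ranking below is a made-up example:

```python
# Sketch of the proof's construction: given a complete, transitive, reflexive
# weak preference on a finite set, define u(x) = |B(x)|, the number of
# alternatives weakly below x.
X = ["a", "b", "c"]
order = {"a": 3, "b": 2, "c": 1}   # made-up ranking: a over b over c

def weakly_prefers(x, y):
    """True iff x is weakly preferred to y under the ranking above."""
    return order[x] >= order[y]

def u(x):
    """u(x) = |B(x)|, the cardinality of the 'below x' set."""
    return len([z for z in X if weakly_prefers(x, z)])

print([u(x) for x in X])  # [3, 2, 1]
# Representation check: u(x) >= u(y) exactly when x is weakly preferred to y.
assert all((u(x) >= u(y)) == weakly_prefers(x, y) for x in X for y in X)
```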

In general the following proposition holds:

Proposition 2. Every (continuous) preference ranking can be represented by a

(continuous) utility function.

This result can be extended to environments with uncertainty, as was shown

by Leonard Savage. Consequently, we can say that individuals behave as if they

are maximizing utility functions, which allows for marginal and calculus arguments. There is, however, one qualification. The utility function that represents

the preferences is not unique.

Remark 1. If u represents preferences, then for any increasing function f : R → R, f(u(x)) also represents the same preference ranking.
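Remark 1 can be checked numerically; the bundles and the utility index below are hypothetical:

```python
import math

# Composing u with any strictly increasing f preserves the ordinal ranking.
bundles = [(1, 1), (2, 1), (1, 2), (2, 2)]   # hypothetical bundles

def u(x):
    return x[0] * x[1]       # a simple utility index

def f(t):
    return math.exp(t) + 5   # strictly increasing on R

ranking_u = sorted(bundles, key=u)
ranking_fu = sorted(bundles, key=lambda x: f(u(x)))
print(ranking_u == ranking_fu)  # True: same ordinal ranking
```

This is why only the ordering induced by u matters, not its numerical values: utility here is ordinal, not cardinal.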

In the previous section, we claimed that preferences usually reflect the idea

that “more is better,” or that preferences are monotone.


Definition 6. The utility function (preferences) are monotone increasing if x ≥ y

implies that u(x) ≥ u(y) and x > y implies that u(x) > u(y).

One feature that monotone preferences rule out is (local) satiation, where one

point is preferred to all other points nearby. For economics the relevant decision is maximizing utility subject to limited resources. This leads us to consider

constrained optimization.

3 Maximization

Now we take a look at the mathematical tool that will be used with the greatest intensity in this course. Let x = (x1, x2, . . . , xn) be an n-dimensional vector where each component xi, i = 1, 2, . . . , n, is a non-negative real number. In mathematical notation we write x ∈ R^n_+. We can think of x as a description of different characteristics of a choice that the decision maker faces. For example, while choosing which college to attend (among the ones that have offered admission), a decision maker, who is a student in this case, looks into different aspects of a university: the quality of instruction, the diversity of courses, the location of the campus, and so on. The components of the vector x can be thought of as each of these characteristics when the choice problem faced by the decision maker (i.e., the student) is to choose which university to attend. Usually when people go to the grocery store they face the problem of buying not just one commodity but a bundle of commodities, so it is the combination of quantities of different commodities that needs to be decided; again the components of x can be thought of as the quantities of each commodity purchased. Whatever the specific context, utility is defined over the set of such bundles. Since x ∈ R^n_+, we take X = R^n_+, so the utility function is a mapping u : R^n_+ → R.

For the time being let x be one-dimensional, i.e., x ∈ R. Let f : R → R be a continuous and differentiable function that takes a real number and maps it to another real number. Continuity is assumed to avoid any jumps in the function, and differentiability is assumed to avoid kinks. The slope of the function f is defined as the first derivative of the function, and the curvature of the function as the second derivative. So the slope of f at x is formally defined as

df(x)/dx ≜ f′(x)

and the curvature of f at x is formally defined as

d²f(x)/dx² ≜ f″(x)

In order to find the maximum of f we must first look at the slope of f. If the slope is positive, then raising the value of x increases the value of f, so to find the maximum we must keep increasing x. Similarly, if the slope is negative, then reducing the value of x increases the value of f, and therefore to find the maximum we should reduce the value of x. The maximum is therefore reached when the slope is exactly equal to 0. This condition is referred to as the First Order Condition (F.O.C.), or the necessary condition:

df(x)/dx = 0

But this in itself does not guarantee that a maximum is reached, as a perfectly flat slope may also mean that we are at a trough, i.e., at a minimum. The F.O.C. therefore finds the extremum points of the function. We need to look at the curvature to determine whether an extremum is actually a maximum. If the second derivative is negative, then moving x a little bit to either side of the extremum point makes f(x) fall, and therefore the extremum is a maximum. If the second derivative is positive, then by a similar argument it is a minimum. This condition is referred to as the Second Order Condition (S.O.C.), or the sufficient condition:

d²f(x)/dx² ≤ 0
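The two conditions can be checked numerically with finite differences; the function below is a made-up concave example with its peak at x = 2:

```python
# Numerical sketch of the F.O.C. and S.O.C. using central differences.
def f(x):
    return -(x - 2) ** 2 + 3   # hypothetical concave function, maximum at x = 2

def d(f, x, h=1e-6):
    """Slope f'(x) by central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Curvature f''(x) by second central difference."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

x_star = 2.0
print(abs(d(f, x_star)) < 1e-6)   # True: F.O.C., the slope is (numerically) zero
print(d2(f, x_star) < 0)          # True: S.O.C., negative curvature -> a maximum
```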

Now we look at the definitions of two important kinds of functions:

Definition 7. (i) A continuous and twice differentiable function f : R → R is (strictly) concave if d²f(x)/dx² ≤ (<) 0. (ii) f is (strictly) convex if d²f(x)/dx² ≥ (>) 0.

Therefore for a concave function the F.O.C. is both a necessary and a sufficient condition for maximization. We can also define concavity or convexity of functions with the help of convex combinations.

Definition 8. A convex combination of any two points x′, x″ ∈ R^n is defined as x_λ = λx′ + (1 − λ)x″ for any λ ∈ (0, 1).

A convex combination of two points represents a point on the straight line joining those two points. We now define concavity and convexity of functions using this concept.

Definition 9. f is concave if for any two points x′, x″ ∈ R, f(x_λ) ≥ λf(x′) + (1 − λ)f(x″), where x_λ is a convex combination of x′ and x″ for λ ∈ (0, 1). f is strictly concave if the inequality is strict.

Definition 10. Similarly, f is convex if f(x_λ) ≤ λf(x′) + (1 − λ)f(x″). f is strictly convex if the inequality is strict.
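Definitions 9 and 10 suggest a simple numerical check over a grid of points and mixing weights; the sample points and weights below are arbitrary:

```python
import math

# Grid check of Definition 9: f is concave iff
# f(l*a + (1-l)*b) >= l*f(a) + (1-l)*f(b) for all pairs a, b and weights l.
def is_concave_on(f, points, lambdas):
    return all(
        f(l * a + (1 - l) * b) >= l * f(a) + (1 - l) * f(b) - 1e-12
        for a in points for b in points for l in lambdas
    )

points = [0.5 * k for k in range(1, 9)]    # arbitrary sample points in (0, 4]
lambdas = [0.1 * k for k in range(1, 10)]  # arbitrary mixing weights in (0, 1)
print(is_concave_on(math.sqrt, points, lambdas))         # True: sqrt is concave
print(is_concave_on(lambda x: x ** 2, points, lambdas))  # False: x^2 is convex
```

A grid check of this kind can only refute concavity, not prove it, but it makes the convex-combination definition concrete.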

If an individual's utility function is concave then, given this definition, we can see that she would prefer a certain consumption of x_λ to an uncertain prospect of consuming either x′ or x″. Such individuals are called risk averse. We shall explore these concepts in full detail later in the course, where we will need these definitions of concavity and convexity.

4 Utility Maximization

4.1 Multivariate Function Maximization

Let x = (x1, x2, . . . , xn) ∈ R^n_+ be a consumption bundle and f : R^n_+ → R be a multivariate function. The multivariate function that we are interested in here is the utility function u : R^n_+ → R, where u(x) is the utility of the consumption bundle x.

The F.O.C. for the maximization of f is given by:

∂f(x1, x2, . . . , xn)/∂xi = 0 for all i = 1, 2, . . . , n

This is a direct extension of the F.O.C. for univariate functions as explained in Lecture 3. The S.O.C., however, is a little different from the single-variable case. Let us look at a bivariate function f : R² → R and first define the following notation:

fi(x) ≜ ∂f(x)/∂xi,  fii(x) ≜ ∂²f(x)/∂xi²,  i = 1, 2,  fij(x) ≜ ∂²f(x)/∂xi∂xj,  i ≠ j

The S.O.C. for the maximization of f is then given by:

(i) f11 < 0

(ii) det [ f11 f12 ; f21 f22 ] > 0

The first of the S.O.C.s is analogous to the S.O.C. for the univariate case. If we write out the second one we get

f11 f22 − f12 f21 > 0

But we know that f12 = f21. So

f11 f22 > f12² ≥ 0

⇒ f22 < 0 (since f11 < 0)

Therefore the S.O.C. for the bivariate case is stronger than the analogous condition from the univariate case. This is because in the bivariate case, to make sure that we are at a peak of the function, it is not enough to check that the function is concave in the directions of x1 and x2: it might not be concave along a diagonal, hence the need to introduce cross derivatives into the condition. For the purposes of this class we will assume that the S.O.C. is satisfied for the given utility function, unless you are specifically asked to check it.
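Both conditions can be verified numerically; the bivariate function below is a made-up concave example with its critical point at (0, 0):

```python
# Checking the bivariate S.O.C. with finite-difference second partials.
def f(x1, x2):
    return -x1 ** 2 - x2 ** 2 + x1 * x2   # hypothetical concave function

def second_partials(f, x1, x2, h=1e-4):
    """Central-difference estimates of f11, f22, and the cross partial f12."""
    f11 = (f(x1 + h, x2) - 2 * f(x1, x2) + f(x1 - h, x2)) / h ** 2
    f22 = (f(x1, x2 + h) - 2 * f(x1, x2) + f(x1, x2 - h)) / h ** 2
    f12 = (f(x1 + h, x2 + h) - f(x1 + h, x2 - h)
           - f(x1 - h, x2 + h) + f(x1 - h, x2 - h)) / (4 * h ** 2)
    return f11, f22, f12

f11, f22, f12 = second_partials(f, 0.0, 0.0)  # analytically: -2, -2, 1
print(f11 < 0)                    # True: first condition
print(f11 * f22 - f12 ** 2 > 0)   # True: determinant 4 - 1 = 3 > 0
```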

4.2 Budget Constraint

A budget constraint is a constraint on how much money (income, wealth) an agent can spend on goods. We denote the amount of available income by I ≥ 0. Let x1, . . . , xN be the quantities of the goods purchased and p1, . . . , pN the corresponding prices. Then the budget constraint is

Σ_{i=1}^N pi xi ≤ I.

As an example, consider the case with two goods, where the budget constraint is p1x1 + p2x2 ≤ I. The budget line is the set of bundles at which the agent spends her entire income, p1x1 + p2x2 = I. The points where the budget line intersects the axes are x1 = I/p1 and x2 = I/p2, since these are the points where the agent spends her income on only one good. Solving for x2, we can express the budget line as a function of x1:

x2(x1) = I/p2 − (p1/p2) x1,

where the slope of the budget line is given by

dx2/dx1 = −p1/p2

The budget line, then, is the equation in x1 and x2 such that the decision maker exhausts all her income. The set of consumption bundles (x1, x2) that are feasible given the income, i.e., the (x1, x2) for which p1x1 + p2x2 ≤ I holds, is defined as the budget set.
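The budget-line algebra above can be sketched with hypothetical numbers (p1 = 2, p2 = 4, I = 20):

```python
# Two-good budget line with made-up prices and income.
p1, p2, I = 2.0, 4.0, 20.0

x1_intercept = I / p1    # spend everything on good 1
x2_intercept = I / p2    # spend everything on good 2
slope = -p1 / p2         # slope of the budget line

def x2_on_line(x1):
    """The bundle on the budget line with first coordinate x1."""
    return I / p2 - (p1 / p2) * x1

def affordable(x1, x2):
    """Membership in the budget set."""
    return p1 * x1 + p2 * x2 <= I

print(x1_intercept, x2_intercept, slope)   # 10.0 5.0 -0.5
print(x2_on_line(4))                       # 3.0: the bundle (4, 3) exhausts income
print(affordable(4, 3), affordable(6, 3))  # True False
```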


4.3 Indifference Curve

The indifference curve (IC) is defined as the locus of consumption bundles (x1, x2) such that utility is held fixed at some level. Therefore the equation of the IC is given by

u(x1, x2) = ū

To get the slope of the IC we differentiate this equation with respect to x1:

∂u(x)/∂x1 + (∂u(x)/∂x2)(dx2/dx1) = 0

⇒ dx2/dx1 = −(∂u(x)/∂x1)/(∂u(x)/∂x2) = −MU1/MU2

where MUi refers to the marginal utility of good i. So the slope of the IC is the (negative of the) ratio of the marginal utilities of goods 1 and 2. This ratio is referred to as the Marginal Rate of Substitution, or MRS. It tells us the rate at which the consumer is ready to substitute between goods 1 and 2 to remain at the same utility level.
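The slope formula can be checked numerically; the Cobb-Douglas utility and the value α = 0.3 below are a hypothetical example:

```python
# MRS computed two ways for u(x1, x2) = x1**alpha * x2**(1 - alpha):
# analytically, MU1/MU2 = (alpha / (1 - alpha)) * (x2 / x1);
# numerically, via central-difference marginal utilities.
alpha = 0.3   # hypothetical preference parameter

def u(x1, x2):
    return x1 ** alpha * x2 ** (1 - alpha)

def mrs_numeric(x1, x2, h=1e-6):
    mu1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)   # marginal utility of good 1
    mu2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)   # marginal utility of good 2
    return mu1 / mu2

def mrs_analytic(x1, x2):
    return (alpha / (1 - alpha)) * (x2 / x1)

print(round(mrs_numeric(2.0, 3.0), 6))   # 0.642857
print(round(mrs_analytic(2.0, 3.0), 6))  # 0.642857
```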

4.4 Constrained Optimization

Consumers are typically endowed with money I, which determines which consumption bundles are affordable. The budget set consists of all consumption bundles such that Σ_{i=1}^N pi xi ≤ I. The consumer's problem is then to find the point on the highest indifference curve that is in the budget set. At this point the indifference curve must be tangent to the budget line. The slope of the budget line is given by

dx2/dx1 = −p1/p2

which defines how much x2 must decrease, if consumption of good 1 is increased by dx1, for the bundle to still be affordable. It reflects the opportunity cost, as money spent on good 1 cannot be used to purchase good 2 (see Figure 4).

The marginal rate of substitution, on the other hand, reflects the relative benefit of consuming different goods. The slope of the indifference curve is −MRS. So the relevant optimality condition, where the slope of the indifference curve equals the slope of the budget line, is

p1/p2 = (∂u(x)/∂x1)/(∂u(x)/∂x2).

[Figure 4: Indifference curve and budget set — axes good 1 and good 2, showing the budget set and the optimal choice at the tangency]

We could equivalently talk about equating marginal utility per dollar. If

(∂u(x)/∂x2)/p2 > (∂u(x)/∂x1)/p1

then one dollar spent on good 2 generates more utility than one dollar spent on good 1, so shifting consumption from good 1 to good 2 would result in higher utility. So, to be at an optimum, the marginal utility per dollar must be equated across goods.

Does this mean that we must have ∂u(x)/∂xi = pi at the optimum? No. Such a condition would not make sense, since we could rescale the utility function. We could instead rescale the equation by a factor λ ≥ 0 that converts "money" into "utility," and write ∂u(x)/∂xi = λpi. Here λ reflects the marginal utility of money. More on this in the subsection on optimization using the Lagrange approach.

4.4.1 Optimization by Substitution

The consumer's problem is to maximize utility subject to a budget constraint. There are two ways to approach this problem. The first approach involves writing the last good as a function of the other goods, and then proceeding with an unconstrained maximization. Consider the two-good case, where the budget set consists of the bundles satisfying p1x1 + p2x2 ≤ I. The problem is

max_{x1,x2} u(x1, x2) subject to p1x1 + p2x2 ≤ I

But notice that whenever u is (locally) non-satiated, the budget constraint holds with equality, since there is no reason to hold money that could have been used for additional valued consumption. So p1x1 + p2x2 = I, and we can write x2 as a function of x1 from the budget equation:

x2 = (I − p1x1)/p2

Now we can treat the maximization of u(x1, (I − p1x1)/p2) as a standard single-variable maximization problem:

max_{x1} u(x1, (I − p1x1)/p2)

The F.O.C. is then given by

∂u/∂x1 + (∂u/∂x2)(dx2(x1)/dx1) = 0

⇒ ∂u/∂x1 − (p1/p2)(∂u/∂x2) = 0

where the second line substitutes dx2(x1)/dx1 = −p1/p2 from the budget line equation. We can further rearrange terms to get

(∂u/∂x1)/p1 = (∂u/∂x2)/p2

⇒ (∂u/∂x1)/(∂u/∂x2) = p1/p2

This is exactly the condition we obtained by arguing in terms of the budget line and indifference curves. In the following lecture we shall look at a specific example where we maximize a particular utility function using this substitution method, and then move on to the Lagrange approach.

5 Utility Maximization Continued

5.1 Application of Substitution Method

Example 1. We consider a consumer with Cobb-Douglas preferences. Cobb-Douglas preferences are easy to use and therefore commonly used. The utility function is defined as (with two goods)

u(x1, x2) = x1^α x2^(1−α), α ∈ (0, 1)

The goods' prices are p1, p2 and the consumer is endowed with income I. Hence the constrained optimization problem is

max_{x1,x2} x1^α x2^(1−α) subject to p1x1 + p2x2 = I.

We solve this maximization by substituting the budget constraint into the utility function so that the problem becomes an unconstrained optimization with one choice variable:

u(x1) = x1^α ((I − p1x1)/p2)^(1−α). (1)

In general, we take the total derivative of the utility function

du(x1, x2(x1))/dx1 = ∂u/∂x1 + (∂u/∂x2)(dx2/dx1) = 0

which gives us the condition for optimal demand

dx2/dx1 = −(∂u/∂x1)/(∂u/∂x2).

The right-hand side is the negative of the marginal rate of substitution (MRS).

In order to calculate the demand for both goods, we go back to our example. Taking the derivative of the utility function (1),

u′(x1) = α x1^(α−1) ((I − p1x1)/p2)^(1−α) + (1 − α) x1^α ((I − p1x1)/p2)^(−α) (−p1/p2)

= x1^(α−1) ((I − p1x1)/p2)^(−α) [α (I − p1x1)/p2 − (1 − α) x1 (p1/p2)]

so the F.O.C. is satisfied when

α(I − p1x1) − (1 − α) x1 p1 = 0

which holds when

x1*(p1, p2, I) = αI/p1. (2)

Hence we see that the budget spent on good 1, p1x1*, equals the budget share αI, where α is the preference parameter associated with good 1.

Plugging (2) into the budget constraint yields

x2*(p1, p2, I) = (I − p1x1*)/p2 = (1 − α)I/p2.

These are referred to as the Marshallian demand or uncompensated demand.

17
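As a sanity check, the Marshallian demands just derived can be verified numerically. The sketch below (the parameter values $\alpha = 0.3$, $p_1 = 2$, $p_2 = 5$, $I = 100$ are illustrative choices of mine, not from the notes) maximizes the substituted utility function (1) over $x_1$ and compares the maximizer with the closed form $x_1^{*} = \alpha I/p_1$.

```python
from scipy.optimize import minimize_scalar

alpha, p1, p2, I = 0.3, 2.0, 5.0, 100.0  # illustrative values (not from the notes)

def u_of_x1(x1):
    # utility after substituting the budget constraint, as in equation (1)
    return x1 ** alpha * ((I - p1 * x1) / p2) ** (1 - alpha)

# maximize u by minimizing -u over the feasible interval (0, I/p1)
res = minimize_scalar(lambda x1: -u_of_x1(x1),
                      bounds=(1e-9, I / p1 - 1e-9), method="bounded")

x1_star = res.x
print(x1_star)          # numerical maximizer
print(alpha * I / p1)   # closed-form Marshallian demand: 15.0
```

The same check works for any $\alpha \in (0,1)$ and any positive prices and income.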

Several important features of this example are worth noting. First of all, $x_1$
does not depend on $p_2$ and vice versa. Also, the share of income spent on each good,
$p_ix_i/I$, does not depend on prices or wealth. What is going on here? When the price of
one good, $p_2$, increases there are two effects. First, the price increase makes good 1
relatively cheaper ($p_1/p_2$ decreases). This will cause consumers to "substitute" toward
the relatively cheaper good. There is also another effect. When the price increases
the individual becomes poorer in real terms, as the set of affordable consumption
bundles becomes strictly smaller. The Cobb-Douglas utility function is a special
case where this "income effect" exactly cancels out the substitution effect, so the
consumption of one good is independent of the price of the other goods.

To recap: the Cobb-Douglas utility function $u(x_1, x_2) = x_1^{\alpha}x_2^{1-\alpha}$ is maximized
subject to the budget constraint $p_1x_1 + p_2x_2 = I$. Therefore we get
$$\max_{x_1} \; x_1^{\alpha}\left(\frac{I - p_1x_1}{p_2}\right)^{1-\alpha}$$
The F.O.C. is then given by
$$\alpha x_1^{\alpha-1}\left(\frac{I - p_1x_1}{p_2}\right)^{1-\alpha}
+ (1-\alpha)x_1^{\alpha}\left(\frac{I - p_1x_1}{p_2}\right)^{-\alpha}\left(-\frac{p_1}{p_2}\right) = 0$$
$$\Rightarrow\; \alpha\,\frac{I - p_1x_1}{p_2} = (1-\alpha)x_1\frac{p_1}{p_2}$$
$$\Rightarrow\; x_1^{*}(p_1, p_2, I) = \frac{\alpha I}{p_1}$$
$$\Rightarrow\; x_2^{*}(p_1, p_2, I) = \frac{(1-\alpha)I}{p_2}$$
This is referred to as the Marshallian demand or uncompensated demand.

5.2 Elasticity

When calculating price or income effects, the result depends on the units used.
For example, when considering the own-price effect for gasoline, we might express
quantity demanded in gallons or liters and the price in dollars or euros. The
own-price effects would differ even if consumers in the U.S. and Europe had the
same underlying preferences. In order to make price or income effects comparable
across different units, we need to normalize them. This is the reason why we use
the concept of elasticity. The own-price elasticity of demand is defined as the
percentage change in demand for each percentage change in its own price and is
denoted by $\epsilon_i$:
$$\epsilon_i = -\frac{\partial x_i}{\partial p_i}\Big/\frac{x_i}{p_i} = -\frac{\partial x_i}{\partial p_i}\,\frac{p_i}{x_i}.$$
It is common to multiply the price effect by $-1$ so that $\epsilon_i$ is a positive number since
the price effect is usually negative. Of course, the cross-price elasticity of demand
is defined similarly:
$$\epsilon_{ij} = -\frac{\partial x_i}{\partial p_j}\Big/\frac{x_i}{p_j} = -\frac{\partial x_i}{\partial p_j}\,\frac{p_j}{x_i}.$$
Similarly the income elasticity of demand is defined as
$$\epsilon_I = \frac{\partial x_i}{\partial I}\Big/\frac{x_i}{I} = \frac{\partial x_i}{\partial I}\,\frac{I}{x_i}.$$
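As a quick application of these definitions, one can compute the elasticities of the Cobb-Douglas demand $x_1^{*} = \alpha I/p_1$ derived earlier; a sympy sketch (symbol names are mine):

```python
import sympy as sp

alpha, p1, p2, I = sp.symbols("alpha p1 p2 I", positive=True)
x1 = alpha * I / p1  # Marshallian demand for good 1 from the Cobb-Douglas example

# own-price elasticity with the sign convention eps_i = -(dx_i/dp_i)(p_i/x_i)
eps_own = sp.simplify(-sp.diff(x1, p1) * p1 / x1)
# cross-price elasticity eps_ij = -(dx_i/dp_j)(p_j/x_i)
eps_cross = sp.simplify(-sp.diff(x1, p2) * p2 / x1)
# income elasticity eps_I = (dx_i/dI)(I/x_i)
eps_income = sp.simplify(sp.diff(x1, I) * I / x1)

print(eps_own, eps_cross, eps_income)  # 1 0 1
```

The unit own-price and income elasticities and the zero cross-price elasticity are exactly the special Cobb-Douglas features noted in the previous section.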

5.2.1 Constant Elasticity of Substitution

The elasticity of substitution for a utility function is defined as the elasticity of the
ratio of consumption of two goods to the MRS. Therefore it is a measure of how
easily the two goods are substitutable along an indifference curve. In terms of
mathematics, it is defined as
$$S = \frac{d(x_2/x_1)}{d\,\text{MRS}}\,\frac{\text{MRS}}{x_2/x_1}$$
For a class of utility functions this value is constant for all $(x_1, x_2)$. These utility
functions are called Constant Elasticity of Substitution (CES) utility functions.
The general form looks like the following:
$$u(x_1, x_2) = \left(\alpha_1 x_1^{-\rho} + \alpha_2 x_2^{-\rho}\right)^{-\frac{1}{\rho}}$$
It is easy to show that for CES utility functions,
$$S = \frac{1}{\rho + 1}$$
The following utility functions are special cases of the general CES utility function:

Linear Utility: Linear utility is of the form
$$U(x_1, x_2) = ax_1 + bx_2, \quad a, b \text{ constants}$$
which is a CES utility with $\rho = -1$.

Leontief Utility: Leontief utility is of the form
$$U(x_1, x_2) = \min\left\{\frac{x_1}{a}, \frac{x_2}{b}\right\}, \quad a, b > 0$$
and this is also a CES utility function (in the limit) with $\rho = \infty$.
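The claim $S = 1/(\rho + 1)$ can be verified symbolically. For the CES function above the MRS works out to $(\alpha_1/\alpha_2)(x_2/x_1)^{\rho+1}$ (a small side calculation not shown in the notes), so the consumption ratio can be written as a function of the MRS and its elasticity computed directly; a sympy sketch:

```python
import sympy as sp

rho, a1, a2, m = sp.symbols("rho a1 a2 m", positive=True)  # m stands for the MRS

# For u = (a1*x1^(-rho) + a2*x2^(-rho))^(-1/rho), the MRS is (a1/a2)*(x2/x1)^(rho+1).
# Inverting: the consumption ratio r = x2/x1 as a function of the MRS m.
r = (a2 * m / a1) ** (1 / (rho + 1))

# Elasticity of substitution: S = (dr/dm) * (m / r)
S = sp.simplify(sp.diff(r, m) * m / r)
print(S)  # 1/(rho + 1)
```

Setting $\rho = -1$ or letting $\rho \to \infty$ recovers the linear and Leontief limits of $S$.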

5.3 Optimization Using the Lagrange Approach

While the approach using substitution is simple enough, there are situations where
it will be difficult to apply. The procedure requires that we know, before the
calculation, that the budget constraint actually binds. In many situations there may be
other constraints (such as a non-negativity constraint on the consumption of each
good) and we may not know whether they bind before demands are calculated.
Consequently, we will consider a more general approach of Lagrange multipliers.
Again, we consider the (two good) problem of
$$\max_{x_1,x_2} \; u(x_1, x_2) \quad \text{s.t. } p_1x_1 + p_2x_2 \leq I$$
Let's think about this problem as a game. The first player, let's call him the
kid, wants to maximize his utility, $u(x_1, x_2)$, whereas the other player (the parent)
is concerned that the kid violates the budget constraint, $p_1x_1 + p_2x_2 \leq I$, by
spending too much on goods 1 and 2. In order to induce the kid to stay within
the budget constraint, the parent can punish him by an amount $\lambda$ for every dollar
the kid exceeds his income. Hence, the total punishment is
$$\lambda(I - p_1x_1 - p_2x_2).$$
Adding the kid's utility from consumption and the punishment, we get
$$L(x_1, x_2, \lambda) = u(x_1, x_2) + \lambda(I - p_1x_1 - p_2x_2). \quad (3)$$
Since, for any function, we have $\max f = -\min(-f)$, this game is a zero-sum game:
the payoff for the kid is $L$ and the parent's payoff is $-L$ so that the total payoff
will always be 0. Now, the kid maximizes expression (3) by choosing optimal levels
of $x_1$ and $x_2$, whereas the parent minimizes (3) by choosing an optimal level of $\lambda$:
$$\min_{\lambda}\,\max_{x_1,x_2} \; L(x_1, x_2, \lambda) = u(x_1, x_2) + \lambda(I - p_1x_1 - p_2x_2).$$
In equilibrium, the optimally chosen level of consumption, $x^{*}$, has to be the
best response to the optimal level of $\lambda^{*}$ and vice versa. In other words, when we
fix a level of $x^{*}$, the parent chooses an optimal $\lambda^{*}$, and when we fix a level of $\lambda^{*}$,
the kid chooses an optimal $x^{*}$. In equilibrium, no one wants to deviate from their
optimal choice. Could it be an equilibrium for the parent to choose a very large
$\lambda$? No, because then the kid would not spend any money on consumption, and the
maximized expression (3) would equal $\lambda I$.

Since the first-order conditions for minima and maxima are the same, we have
the following first-order conditions for problem (3):
$$\frac{\partial L}{\partial x_1} = \frac{\partial u}{\partial x_1} - \lambda p_1 = 0 \quad (4)$$
$$\frac{\partial L}{\partial x_2} = \frac{\partial u}{\partial x_2} - \lambda p_2 = 0 \quad (5)$$
$$\frac{\partial L}{\partial \lambda} = I - p_1x_1 - p_2x_2 = 0.$$
Here, we have three equations in three unknowns that we can solve for the optimal
choice $x^{*}, \lambda^{*}$.

Before solving this problem for an example, we can think about it in more
formal terms. The basic idea is as follows: Just as a necessary condition for a
maximum in a one variable maximization problem is that the derivative equals 0
($f'(x) = 0$), a necessary condition for a maximum in multiple variables is that all
partial derivatives are equal to 0 ($\frac{\partial f(x)}{\partial x_i} = 0$). To see why, recall that the partial
derivative reflects the change as $x_i$ increases and the other variables are all held
constant. If any partial derivative were positive, then holding all other variables
constant while increasing $x_i$ would increase the objective function (similarly, if the
partial derivative were negative we could decrease $x_i$). We also need to ensure that
the solution is in the budget set, which typically won't happen if we just try to
maximize $u$. Basically, we impose a "cost" on consumption (the punishment in the
game above), proceed with unconstrained maximization for the induced problem,
and set this cost so that the maximum lies in the budget set.

Notice that the first-order conditions (4) and (5) imply that
$$\frac{\partial u/\partial x_1}{p_1} = \lambda = \frac{\partial u/\partial x_2}{p_2}$$
or
$$\frac{\partial u/\partial x_1}{\partial u/\partial x_2} = \frac{p_1}{p_2}$$
which is precisely the "MRS = price ratio" condition for optimality that we saw
before.

Finally, it should be noted that the FOCs are necessary for optimality, but

they are not, in general, sufficient for the solution to be a maximum. However,

whenever u(x) is a concave function the FOCs are also sufficient to ensure that the

solution is a maximum. In most situations, the utility function will be concave.

Example 2. We can consider the problem of deriving demands for a Cobb-Douglas
utility function using the Lagrange approach. The associated Lagrangian is
$$L(x_1, x_2, \lambda) = x_1^{\alpha}x_2^{1-\alpha} + \lambda(I - p_1x_1 - p_2x_2),$$
which yields the associated FOCs
$$\frac{\partial L}{\partial x_1} = \alpha x_1^{\alpha-1}x_2^{1-\alpha} - \lambda p_1 = \alpha\left(\frac{x_2}{x_1}\right)^{1-\alpha} - \lambda p_1 = 0 \quad (6)$$
$$\frac{\partial L}{\partial x_2} = (1-\alpha)x_1^{\alpha}x_2^{-\alpha} - \lambda p_2 = (1-\alpha)\left(\frac{x_1}{x_2}\right)^{\alpha} - \lambda p_2 = 0 \quad (7)$$
$$\frac{\partial L}{\partial \lambda} = I - p_1x_1 - p_2x_2 = 0. \quad (8)$$
We have three equations with three unknowns $(x_1, x_2, \lambda)$ so that this system should
be solvable. Notice that since it is not possible that $\frac{x_2}{x_1}$ and $\frac{x_1}{x_2}$ are both 0 we cannot
have a solution to equations (6) and (7) with $\lambda = 0$. Consequently we must have
that $p_1x_1 + p_2x_2 = I$ in order to satisfy equation (8). Solving for $\lambda$ in the above
equations tells us that
$$\lambda = \frac{\alpha}{p_1}\left(\frac{x_2}{x_1}\right)^{1-\alpha} = \frac{1-\alpha}{p_2}\left(\frac{x_1}{x_2}\right)^{\alpha}$$
and so
$$p_2x_2 = \frac{1-\alpha}{\alpha}\,p_1x_1.$$
Combining with the budget constraint this gives
$$p_1x_1 + \frac{1-\alpha}{\alpha}\,p_1x_1 = \frac{1}{\alpha}\,p_1x_1 = I.$$
So the Marshallian² demand functions are
$$x_1^{*} = \frac{\alpha I}{p_1} \quad \text{and} \quad x_2^{*} = \frac{(1-\alpha)I}{p_2}.$$
So we see that the result of the Lagrangian approach is the same as from the approach
that uses substitution. Using equation (6) or (7) again along with the optimal
demands $x_1^{*}$ and $x_2^{*}$ gives us the following expression for $\lambda$:
$$\lambda^{*} = \alpha^{\alpha}(1-\alpha)^{1-\alpha}\,p_1^{-\alpha}\,p_2^{-(1-\alpha)}.$$
Hence, $\lambda^{*}$ equals the derivative of the Lagrangian $L$ with respect to income $I$. We
call this derivative, $\frac{\partial L}{\partial I}$, the marginal utility of money.

²After the British economist Alfred Marshall.
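The system (6)–(8) can also be handed to a numerical root finder. The sketch below (parameter values $\alpha = 0.3$, $p_1 = 2$, $p_2 = 5$, $I = 100$ are illustrative choices of mine) solves the three FOCs with scipy's fsolve and recovers the closed-form demands:

```python
import numpy as np
from scipy.optimize import fsolve

alpha, p1, p2, I = 0.3, 2.0, 5.0, 100.0  # illustrative values (not from the notes)

def focs(v):
    # solve in logs so that consumption stays positive during the iterations
    x1, x2, lam = np.exp(v[0]), np.exp(v[1]), v[2]
    return [
        alpha * x1 ** (alpha - 1) * x2 ** (1 - alpha) - lam * p1,  # eq. (6)
        (1 - alpha) * x1 ** alpha * x2 ** (-alpha) - lam * p2,     # eq. (7)
        I - p1 * x1 - p2 * x2,                                     # eq. (8)
    ]

sol = fsolve(focs, [np.log(10.0), np.log(10.0), 0.1])
x1, x2, lam = np.exp(sol[0]), np.exp(sol[1]), sol[2]
print(x1, x2)  # approx. alpha*I/p1 = 15 and (1-alpha)*I/p2 = 14
```

The recovered multiplier also matches the expression for $\lambda^{*}$ above.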

6 Value Function and Comparative Statics

6.1 Indirect Utility Function

The indirect utility function is defined as
$$V(p_1, p_2, I) \triangleq u\left(x_1^{*}(p_1, p_2, I),\, x_2^{*}(p_1, p_2, I)\right)$$
Therefore $V$ is the maximum utility that can be achieved given the prices and
the income level. We shall show later that $\lambda$ is the same as
$$\frac{\partial V(p_1, p_2, I)}{\partial I}$$

6.2 Interpretation of λ

From the F.O.C.s of the maximization we get
$$\frac{\partial L}{\partial x_1} = \frac{\partial u}{\partial x_1} - \lambda p_1 = 0$$
$$\frac{\partial L}{\partial x_2} = \frac{\partial u}{\partial x_2} - \lambda p_2 = 0$$
$$\frac{\partial L}{\partial \lambda} = I - p_1x_1 - p_2x_2 = 0$$
From the first two equations we get
$$\lambda = \frac{\partial u/\partial x_i}{p_i}$$
This means that $\lambda$ can be interpreted as the per-dollar marginal utility from any
good. It also implies, as we have argued before, that the benefit to cost ratio is
equalized across goods. We can also interpret $\lambda$ as the shadow value of money, but
we explain this concept later. Before that let's solve an example and find out the
value of $\lambda$ for that problem.

Let's work with the utility function:
$$u(x_1, x_2) = \alpha \ln x_1 + (1-\alpha) \ln x_2$$
The F.O.C.s are then given by
$$\frac{\partial L}{\partial x_1} = \frac{\alpha}{x_1} - \lambda p_1 = 0 \quad (9)$$
$$\frac{\partial L}{\partial x_2} = \frac{1-\alpha}{x_2} - \lambda p_2 = 0 \quad (10)$$
$$\frac{\partial L}{\partial \lambda} = I - p_1x_1 - p_2x_2 = 0 \quad (11)$$
From the first two equations (9) and (10) we get
$$x_1 = \frac{\alpha}{\lambda p_1} \quad \text{and} \quad x_2 = \frac{1-\alpha}{\lambda p_2}$$
Plugging these into the F.O.C. equation (11) we get
$$I = \frac{\alpha}{\lambda} + \frac{1-\alpha}{\lambda} \;\Rightarrow\; \lambda^{*} = \frac{1}{I}$$

6.3 Comparative Statics

Let $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a function which depends on an endogenous variable,
say $x$, and an exogenous variable $a$. Therefore we have
$$f(x, a)$$
Let's define the value function as the maximized value of $f$ w.r.t. $x$, i.e.
$$v(a) \triangleq \max_{x} f(x, a)$$
Let $x^{*}(a)$ be the value of $x$ that maximizes $f$ given the value of $a$. Therefore,
$$v(a) = f(x^{*}(a), a)$$
To find out the effect of changing the value of the exogenous variable $a$ on the
maximized value of $f$ we differentiate $v$ w.r.t. $a$. Hence we get
$$v'(a) = \frac{\partial f}{\partial x}\,\frac{dx^{*}}{da} + \frac{\partial f}{\partial a}$$
But from the F.O.C. of the maximization of $f$ we know that
$$\frac{\partial f}{\partial x}(x^{*}(a), a) = 0$$
Therefore we get that
$$v'(a) = \frac{\partial f}{\partial a}(x^{*}(a), a)$$
Thus the effect of a change in the exogenous variable on the value function is only
its direct effect on the objective function. This is referred to as the Envelope
Theorem.

In the case of utility maximization the value function is the indirect utility function.
We can also write the indirect utility function as
$$V(p_1, p_2, I) \triangleq u(x_1^{*}, x_2^{*}) + \lambda^{*}[I - p_1x_1^{*} - p_2x_2^{*}] = L(x_1^{*}, x_2^{*}, \lambda^{*})$$
Therefore,
$$\frac{\partial V}{\partial I} = \frac{\partial u}{\partial x_1}\frac{\partial x_1^{*}}{\partial I} + \frac{\partial u}{\partial x_2}\frac{\partial x_2^{*}}{\partial I}
- \lambda^{*}p_1\frac{\partial x_1^{*}}{\partial I} - \lambda^{*}p_2\frac{\partial x_2^{*}}{\partial I}
+ [I - p_1x_1^{*} - p_2x_2^{*}]\frac{\partial \lambda^{*}}{\partial I} + \lambda^{*}$$
$$= \lambda^{*} \quad \text{(by the Envelope Theorem)}$$
Therefore we see that $\lambda^{*}$ is the marginal value of money at the optimum. So
if the income constraint is relaxed by a dollar, it increases the maximum utility of
the consumer by $\lambda^{*}$, and hence $\lambda^{*}$ is interpreted as the shadow value of money.
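The shadow-value interpretation is easy to check by finite differences. For the log utility example above, the indirect utility is $V(I) = \alpha \ln(\alpha I/p_1) + (1-\alpha)\ln((1-\alpha)I/p_2)$, and its derivative in $I$ should equal $\lambda^{*} = 1/I$; a sketch with illustrative parameter values of my own:

```python
import math

alpha, p1, p2 = 0.3, 2.0, 5.0  # illustrative values (not from the notes)

def V(I):
    # indirect utility for u = alpha*ln(x1) + (1-alpha)*ln(x2),
    # evaluated at the demands x1 = alpha*I/p1 and x2 = (1-alpha)*I/p2
    return alpha * math.log(alpha * I / p1) + (1 - alpha) * math.log((1 - alpha) * I / p2)

I0, h = 100.0, 1e-5
dV_dI = (V(I0 + h) - V(I0 - h)) / (2 * h)  # central finite difference
print(dV_dI)     # approx. 0.01
print(1 / I0)    # lambda* = 1/I = 0.01
```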

7 Expenditure Minimization

Instead of maximizing utility subject to a given income we can also minimize
expenditure subject to achieving a given level of utility $\bar{u}$. In this case, the consumer
wants to spend as little money as possible to enjoy a certain utility. Formally, we
write
$$\min_{x} \; p_1x_1 + p_2x_2 \quad \text{s.t. } u(x) \geq \bar{u}. \quad (12)$$
We can set up the Lagrange expression for this problem as the following:
$$L(x_1, x_2, \lambda) = p_1x_1 + p_2x_2 + \lambda[\bar{u} - u(x_1, x_2)]$$
The F.O.C.s are now:
$$\frac{\partial L}{\partial x_1} = p_1 - \lambda\frac{\partial u}{\partial x_1} = 0$$
$$\frac{\partial L}{\partial x_2} = p_2 - \lambda\frac{\partial u}{\partial x_2} = 0$$
$$\frac{\partial L}{\partial \lambda} = \bar{u} - u(x_1, x_2) = 0$$
Comparing the first two equations we get
$$\frac{\partial u/\partial x_1}{\partial u/\partial x_2} = \frac{p_1}{p_2}$$

This is the exact relation we got in the utility maximization program. Therefore
these two programs are equivalent exercises; in the language of mathematics this is
called duality. But the values of $x_1$ and $x_2$ that minimize expenditure are functions
of the utility level $\bar{u}$ instead of income, as in the case of utility maximization.
The result of this optimization problem is a demand function again, but in
general it is different from $x^{*}(p_1, p_2, I)$. We call the demand function derived from
problem (12) compensated demand or Hicksian demand.³ We denote it by
$$h_1(p_1, p_2, \bar{u}) \quad \text{and} \quad h_2(p_1, p_2, \bar{u})$$
Note that compensated demand is a function of prices and the utility level
whereas uncompensated demand is a function of prices and income. Plugging
compensated demand into the objective function ($p_1x_1 + p_2x_2$) yields the expenditure
function as a function of prices and $\bar{u}$:
$$E(p_1, p_2, \bar{u}) = p_1h_1(p_1, p_2, \bar{u}) + p_2h_2(p_1, p_2, \bar{u}).$$
Hence, the expenditure function measures the minimal amount of money required
to buy a bundle that yields a utility of $\bar{u}$.

Uncompensated and compensated demand functions usually differ from each
other, which is immediately clear from the fact that they have different arguments.
There is a special case where they are identical. First, note that the indirect utility
and expenditure functions are related by the following relationships
$$V(p_1, p_2, E(p_1, p_2, \bar{u})) = \bar{u}$$
$$E(p_1, p_2, V(p_1, p_2, I)) = I.$$
That is, if income is exactly equal to the expenditure necessary to achieve utility
level $\bar{u}$, then the resulting indirect utility is equal to $\bar{u}$. Similarly, if the required
utility level is set equal to the indirect utility function when income is $I$, then
minimized expenditure will be equal to $I$. Using these relationships, we have that
uncompensated and compensated demand are equal in the following two cases:
$$x_i^{*}(p_1, p_2, I) = h_i^{*}(p_1, p_2, V(p_1, p_2, I))$$
$$x_i^{*}(p_1, p_2, E(p_1, p_2, \bar{u})) = h_i^{*}(p_1, p_2, \bar{u}) \quad \text{for } i = 1, 2. \quad (13)$$
Now we can express income and substitution effects analytically. Start with
one component of equation (13):
$$h_i^{*}(p_1, p_2, \bar{u}) = x_i^{*}(p_1, p_2, E(p_1, p_2, \bar{u}))$$
and take the derivative with respect to $p_j$ using the chain rule
$$\frac{\partial h_i^{*}}{\partial p_j} = \frac{\partial x_i^{*}}{\partial p_j} + \frac{\partial x_i^{*}}{\partial I}\,\frac{\partial E}{\partial p_j}. \quad (14)$$

³After the British economist Sir John Hicks, co-recipient of the 1972 Nobel Prize in Economic Sciences.

Now we have to find an expression for $\frac{\partial E}{\partial p_j}$. Start with the Lagrangian associated
with problem (12) evaluated at the optimal solution $(h^{*}(p_1, p_2, \bar{u}), \lambda^{*}(p_1, p_2, \bar{u}))$:
$$L(h^{*}, \lambda^{*}) = p_1h_1^{*}(p_1, p_2, \bar{u}) + p_2h_2^{*}(p_1, p_2, \bar{u})
+ \lambda^{*}(p_1, p_2, \bar{u})\left[\bar{u} - u(h^{*}(p_1, p_2, \bar{u}))\right].$$
Taking the derivative with respect to any price $p_j$ and noting that $\bar{u} = u(h^{*}(p, \bar{u}))$
at the optimum we get
$$\frac{\partial L(h^{*}(p, \bar{u}), \lambda^{*}(p, \bar{u}))}{\partial p_j}
= h_j^{*} + \sum_{i=1}^{2} p_i\frac{\partial h_i^{*}}{\partial p_j} - \lambda^{*}\sum_{i=1}^{2}\frac{\partial u}{\partial x_i}\frac{\partial h_i^{*}}{\partial p_j}
= h_j^{*} + \sum_{i=1}^{2}\left(p_i - \lambda^{*}\frac{\partial u}{\partial x_i}\right)\frac{\partial h_i^{*}}{\partial p_j}.$$
But the first-order conditions for this Lagrangian are
$$p_i - \lambda\frac{\partial u}{\partial x_i} = 0 \quad \text{for all } i.$$
Hence
$$\frac{\partial E}{\partial p_j} = \frac{\partial L}{\partial p_j} = h_j^{*}(p_1, p_2, \bar{u}).$$
This result also follows from the Envelope Theorem. Moreover, from equation (13)
it follows that $h_j^{*} = x_j^{*}$. Hence, using these two facts and bringing the second term
on the RHS to the LHS we can rewrite equation (14) as
$$\frac{\partial x_i^{*}}{\partial p_j} = \underbrace{\frac{\partial h_i^{*}}{\partial p_j}}_{\text{SE}} - \underbrace{x_j^{*}\,\frac{\partial x_i^{*}}{\partial I}}_{\text{IE}}.$$
This equation is known as the Slutsky Equation⁴ and shows formally that the price
effect can be separated into a substitution (SE) and an income effect (IE).

⁴After the Russian statistician and economist Eugen Slutsky.
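The Slutsky equation can be verified for the Cobb-Douglas example. The sketch below uses the standard Cobb-Douglas expenditure function $E(p,\bar{u}) = \bar{u}\,(p_1/\alpha)^{\alpha}(p_2/(1-\alpha))^{1-\alpha}$ (a closed form not derived in these notes) together with Shephard's lemma $h_1^{*} = \partial E/\partial p_1$, and checks the decomposition at illustrative numbers of my own:

```python
import sympy as sp

alpha = sp.Rational(3, 10)               # illustrative preference parameter
p1, p2, I, ubar = sp.symbols("p1 p2 I ubar", positive=True)

# Marshallian demand and indirect utility for u = x1^a * x2^(1-a)
x1 = alpha * I / p1
V = (alpha * I / p1) ** alpha * ((1 - alpha) * I / p2) ** (1 - alpha)

# Expenditure function (standard Cobb-Douglas closed form); Shephard's lemma gives h1
E = ubar * (p1 / alpha) ** alpha * (p2 / (1 - alpha)) ** (1 - alpha)
h1 = sp.diff(E, p1)

# Slutsky: dx1/dp1 should equal dh1/dp1 (at ubar = V) minus x1 * dx1/dI
lhs = sp.diff(x1, p1)
rhs = sp.diff(h1, p1).subs(ubar, V) - x1 * sp.diff(x1, I)

gap = (lhs - rhs).subs({p1: 2, p2: 5, I: 100})
print(float(gap))  # 0 up to rounding
```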

8 Categories of Goods and Elasticities

Definition 11. A normal good is a commodity whose Marshallian demand is
positively related to income, i.e. as income goes up the uncompensated demand
for that good goes up as well. Therefore good $i$ is normal if
$$\frac{\partial x_i^{*}}{\partial I} > 0$$
Definition 12. An inferior good is a commodity whose Marshallian demand is
negatively related to income, i.e. as income goes up the uncompensated demand
for that good goes down. Therefore good $i$ is inferior if
$$\frac{\partial x_i^{*}}{\partial I} < 0$$
Definition 13. Two goods are gross substitutes if a rise in the price of one good
raises the uncompensated demand for the other good. Therefore goods $i$ and $j$ are
gross substitutes if
$$\frac{\partial x_i^{*}}{\partial p_j} > 0$$
Definition 14. Two goods are net substitutes if a rise in the price of one good
raises the compensated or Hicksian demand for the other good. Therefore goods $i$
and $j$ are net substitutes if
$$\frac{\partial h_i^{*}}{\partial p_j} > 0$$
Definition 15. Two goods are net complements if a rise in the price of one good
reduces the compensated or Hicksian demand for the other good. Therefore goods
$i$ and $j$ are net complements if
$$\frac{\partial h_i^{*}}{\partial p_j} < 0$$

8.1 Shape of Expenditure Function

The expression for the expenditure function in an $n$-commodity case is given by
$$E(p_1, p_2, \ldots, p_n, \bar{u}) \triangleq \sum_{i=1}^{n} p_ih_i^{*}(p_1, p_2, \ldots, p_n, \bar{u})$$
Now let's look at the effect of changing price $p_i$ on the expenditure. By the Envelope
Theorem we get that
$$\frac{\partial E}{\partial p_i} = h_i^{*}(p_1, p_2, \ldots, p_n, \bar{u}) > 0$$
Therefore the expenditure function is positively sloped, i.e. when prices go up
the minimum expenditure required to meet a certain utility level also goes up. Now
to find out the curvature of the expenditure function we take the second-order
derivative:
$$\frac{\partial^2 E}{\partial p_i^2} = \frac{\partial h_i^{*}}{\partial p_i} < 0$$
This implies that the expenditure function is concave in prices.

Definition 16. A Giffen good is one whose Marshallian demand is positively
related to its price. Therefore good $i$ is Giffen if
$$\frac{\partial x_i^{*}}{\partial p_i} > 0$$
But from the Hicksian demand we know that
$$\frac{\partial h_i^{*}}{\partial p_i} < 0$$
Hence from the Slutsky equation
$$\frac{\partial x_i^{*}}{\partial p_i} = \frac{\partial h_i^{*}}{\partial p_i} - x_i^{*}\,\frac{\partial x_i^{*}}{\partial I}$$
we get that for a good to be Giffen we must have
$$\frac{\partial x_i^{*}}{\partial I} < 0$$
and $x_i^{*}$ needs to be large to overcome the substitution effect.

Definition 17. A luxury good is defined as one for which the income elasticity
is greater than one. Therefore for a luxury good $i$,
$$\epsilon_{i,I} = \frac{dx_i^{*}}{dI}\Big/\frac{x_i^{*}}{I} > 1$$
We can also define a luxury good in the following alternative way.

Definition 18. If the budget share of a good is increasing in income then it is a
luxury good.

Before we explain the equivalence of the two definitions let us first define the
concept of budget share. The budget share of good $i$, denoted by $s_i(I)$, is the fraction
of income $I$ that is devoted to the expenditure on that good. Therefore,
$$s_i(I) = \frac{p_ix_i^{*}(p, I)}{I}$$
Now to see how the two definitions are related we take the derivative of $s_i(I)$ w.r.t.
$I$:
$$\frac{ds_i(I)}{dI} = \frac{\frac{dx_i^{*}}{dI}\,p_iI - p_ix_i^{*}}{I^2}$$
Now if good $i$ is a luxury then we know that
$$\frac{ds_i(I)}{dI} > 0
\;\Longleftrightarrow\; \frac{dx_i^{*}}{dI}\,p_iI - p_ix_i^{*} > 0
\;\Longleftrightarrow\; \frac{dx_i^{*}}{dI}\,\frac{I}{x_i^{*}} > 1
\;\Longleftrightarrow\; \epsilon_{i,I} > 1$$
Therefore we see that the two definitions of a luxury good are equivalent. Hence a
luxury good is one on which a consumer spends proportionally more as her income
goes up.

8.2 Elasticities

Definition 19. Revenue from a good $i$ is defined as the following:
$$R_i(p_i) = p_ix_i^{*}$$
Differentiating $R_i(p_i)$ w.r.t. $p_i$ we get
$$R_i'(p_i) = x_i^{*} + p_i\frac{dx_i^{*}}{dp_i}
= x_i^{*}\left[1 + \frac{p_i}{x_i^{*}}\frac{dx_i^{*}}{dp_i}\right]
= x_i^{*}\left[1 + \epsilon_{i,i}\right]$$
(note that here $\epsilon_{i,i}$ denotes the own-price elasticity without the sign change, so it
is usually negative). We say that:

Demand is inelastic if
$$R_i'(p_i) > 0 \;\Rightarrow\; \epsilon_{i,i} \in (-1, 0)$$
Demand is elastic if
$$R_i'(p_i) < 0 \;\Rightarrow\; \epsilon_{i,i} \in (-\infty, -1)$$
Demand is unit-elastic if
$$R_i'(p_i) = 0 \;\Rightarrow\; \epsilon_{i,i} = -1$$
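These three cases can be illustrated with a linear demand $x(p) = a - bp$, an example of my own (the Cobb-Douglas demand above is unit-elastic at every price, so its revenue $p \cdot x^{*} = \alpha I$ is constant in the price):

```python
import sympy as sp

a, b, p = sp.symbols("a b p", positive=True)
x = a - b * p                              # illustrative linear demand (not from the notes)

R = p * x                                  # revenue
eps = sp.simplify(sp.diff(x, p) * p / x)   # own-price elasticity (with its sign)

# R'(p) = x*(1 + eps): revenue rises while demand is inelastic, falls when elastic
assert sp.simplify(sp.diff(R, p) - x * (1 + eps)) == 0

# unit elasticity (eps = -1) occurs at p = a/(2b), exactly where revenue peaks
p_unit = sp.solve(sp.Eq(eps, -1), p)[0]
print(p_unit)  # a/(2*b)
```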

9 Welfare Measurement

In order to make welfare comparisons between different price situations it is
important that we move out of utility space and deal with money, since money
gives us an objective measure that, unlike utility, we can compare across individuals.
Let the initial price vector be given by
$$p^0 = (p_1^0, p_2^0, \ldots, p_n^0)$$
and the new price vector be
$$p^1 = (p_1^1, p_2^1, \ldots, p_n^1)$$

9.1 Compensating Variation

The notion of compensating variation asks how much additional income is
required to maintain the initial level of utility under the new prices:
$$CV = E(p^1, u_0) - E(p^0, u_0)$$
where $E(p^0, u_0)$ is the expenditure function evaluated at price $p^0$ and utility level
$u_0$. This gives us a measure of the loss or gain of welfare of one individual in terms
of money due to the change in prices.

9.2 Equivalent Variation

The notion of equivalent variation asks how much additional income is
required to raise the level of utility from the initial level to a specified new level
at the same prices:
$$EV = E(p^0, u_1) - E(p^0, u_0)$$
Suppose the change in price from $p^0$ to $p^1$ is only through the change in the price
of commodity 1. Let $p_1^1 > p_1^0$ and $p_i^1 = p_i^0$ for all other $i = 2, 3, \ldots, n$. Then we can
write
$$CV = E(p^1, u_0) - E(p^0, u_0)
= \int_{p_1^0}^{p_1^1} \frac{\partial E(p_1, u_0)}{\partial p_1}\,dp_1
= \int_{p_1^0}^{p_1^1} h_1^{*}(p_1, u_0)\,dp_1$$
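The integral formula can be checked numerically. Using the standard Cobb-Douglas expenditure function $E(p_1,\bar{u}) = \bar{u}\,(p_1/\alpha)^{\alpha}(p_2/(1-\alpha))^{1-\alpha}$ (a closed form not derived in these notes; the parameter values below are illustrative choices of mine), integrating the Hicksian demand $h_1^{*} = \partial E/\partial p_1$ over the price change reproduces $E(p^1, u_0) - E(p^0, u_0)$:

```python
from scipy.integrate import quad

alpha, p2, ubar = 0.3, 5.0, 2.5     # illustrative values (not from the notes)
p1_old, p1_new = 2.0, 3.0           # only the price of good 1 changes

def E(p1):
    # Cobb-Douglas expenditure function (standard closed form)
    return ubar * (p1 / alpha) ** alpha * (p2 / (1 - alpha)) ** (1 - alpha)

def h1(p1):
    # compensated demand via Shephard's lemma: h1 = dE/dp1
    return ubar * (p1 / alpha) ** (alpha - 1) * (p2 / (1 - alpha)) ** (1 - alpha)

cv_integral, _ = quad(h1, p1_old, p1_new)   # CV as the integral of Hicksian demand
cv_direct = E(p1_new) - E(p1_old)           # CV as a difference of expenditures
print(cv_integral, cv_direct)               # the two coincide
```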

9.3 Introduction of New Product

Let's think of a scenario where a new product is introduced. Let that be commodity
$k$. This can be thought of as a reduction of the price of that product from $p_k = \infty$
to $p_k = \bar{p}$, where $\bar{p}$ is the price of the new product. Then one can measure the
welfare gain of introducing a new product by calculating the CV with the change
in the price of the new product from infinity to $\bar{p}$:
$$CV = -\int_{\bar{p}}^{\infty} h_k^{*}(p_k, u_0)\,dp_k$$

9.4 Inflation Measurement

Let the reference consumption bundle be denoted by
$$x^0 = (x_1^0, x_2^0, \ldots, x_n^0)$$
and the reference price be
$$p^0 = (p_1^0, p_2^0, \ldots, p_n^0)$$
Then one measure of inflation is the Laspeyres Price Index,
$$I_L = \frac{p^1 \cdot x^0}{p^0 \cdot x^0}$$
The other measure is the Paasche Price Index,
$$I_P = \frac{p^1 \cdot x^1}{p^0 \cdot x^1}$$
where $x^1$ is the consumption bundle purchased at the new price $p^1$. Here the
reference bundle $x^0$ is the optimal bundle under the price situation $p^0$. Therefore
we can say
$$p^0 \cdot x^0 = E(p^0, u_0)$$
where $u_0$ is the utility level achieved with price $p^0$ and income $p^0 \cdot x^0$. Now given
the new price situation $p^1$ we know that
$$p^1 \cdot x^0 \geq E(p^1, u_0)$$
$$\Rightarrow\; I_L = \frac{p^1 \cdot x^0}{p^0 \cdot x^0} \geq \frac{E(p^1, u_0)}{E(p^0, u_0)}$$
Hence we see that the Laspeyres Price Index is an overestimate of the price change.

10 Pareto Efficiency and Competitive Equilibrium

We now consider a model with many agents where we make prices endogenous
(initially) and later incomes as well. Let there be $I$ individuals, each denoted by $i$,
$$i = 1, 2, \ldots, I$$
$K$ commodities, each denoted by $k$,
$$k = 1, 2, \ldots, K$$
a consumption bundle of agent $i$ be denoted by $x^i$,
$$x^i = (x_1^i, x_2^i, \ldots, x_K^i)$$
the utility function of individual $i$ be denoted by
$$u^i : \mathbb{R}_{+}^{K} \to \mathbb{R}$$
and society has an endowment of commodities denoted by $e$,
$$e = (e_1, e_2, \ldots, e_K)$$
A social allocation is a vector of consumption bundles for all the individuals,
$$x = (x^1, x^2, \ldots, x^i, \ldots, x^I)$$
The total consumption of commodity $k$ by all the individuals cannot exceed the
endowment of that commodity, which is referred to as the feasibility constraint.
We say a social allocation is feasible if
$$\sum_{i=1}^{I} x_k^i \leq e_k \quad \forall\, k = 1, 2, \ldots, K$$
which represents the $K$ feasibility constraints.

[Figure: Edgeworth box. Friday's bananas and coconuts are measured from one corner, Robinson's bananas and coconuts from the opposite corner; a Pareto efficient allocation is marked.]

Definition 20. An allocation $x$ is Pareto efficient if it is feasible and there
exists no other feasible allocation $y$ such that nobody is worse off and at least one
individual is strictly better off, i.e. there is no $y$ such that for all $i$:
$$u^i(y^i) \geq u^i(x^i)$$
and for some $i'$:
$$u^{i'}(y^{i'}) > u^{i'}(x^{i'})$$
We say that an allocation $y$ is Pareto superior to another allocation $x$ if for
all $i$:
$$u^i(y^i) \geq u^i(x^i)$$
and for some $i'$:
$$u^{i'}(y^{i'}) > u^{i'}(x^{i'})$$
and we say that $y$ Pareto dominates $x$ if, for all $i$:
$$u^i(y^i) > u^i(x^i)$$

In a 2 agent (say, Robinson and Friday), 2 goods (say, coconuts and bananas)
economy we can represent the allocations in an Edgeworth box. Note that we have
a total of four axes in the Edgeworth box. The origin for Friday is in the south-west
corner and the amount of bananas he consumes is measured along the lower
horizontal axis whereas his amount of coconuts is measured along the left vertical
axis. For Robinson, the origin is in the north-east corner, the upper horizontal
axis depicts Robinson's banana consumption, and the right vertical axis measures
his coconut consumption. The height and width of the Edgeworth box are one
each since there are one banana and one coconut in this economy. Hence, the
endowment bundle is the south-east corner where the amount of Friday's bananas
and Robinson's coconuts are both equal to one. This also implies that Friday's
utility increases as he moves up and right in the Edgeworth box, whereas Robinson
is better off the further down and left he gets. Any point inside the two ICs is
an allocation that gives both Robinson and Friday higher utility. Hence any point
inside is Pareto superior to the initial allocation. The point where the two
ICs are tangent to each other is a Pareto efficient point, as starting from that point
or allocation it is not possible to raise one individual's utility without reducing the
other's. Hence the set of Pareto efficient allocations in this economy is the set of
points in the Edgeworth box where the two ICs are tangent to each other. This
is depicted as the dotted line in the box. It is evident from the picture that there
can be many Pareto efficient allocations. Specifically, allocations that give all the
endowment of the society to either Robinson or Friday are also Pareto efficient, as
any other allocation would reduce that person's utility.

10.1 Competitive Equilibrium

A competitive equilibrium is a pair $(p, x)$, where $p$ is the price vector for the $K$
commodities:
$$p = (p_1, \ldots, p_k, \ldots, p_K)$$
and $x$ is the allocation:
$$x = (x^1, x^2, \ldots, x^i, \ldots, x^I),$$
such that markets clear for all commodities $k$:
$$\sum_{i=1}^{I} x_k^i \leq e_k,$$
the allocation is affordable for each individual $i$:
$$p \cdot x^i \leq p \cdot e^i,$$
and for each individual $i$ there is no $y^i$ such that
$$p \cdot y^i \leq p \cdot e^i \quad \text{and} \quad u^i(y^i) > u^i(x^i)$$

11 Social Welfare

Here we are trying to formalize the problem from the point of view of a social
planner. The social planner has endowments given by the endowment vector $e =
(e_1, e_2, \ldots, e_K)$ and attaches weight $\alpha^i$ to individual $i$'s utility. So for him the
optimization problem is given by:
$$\max_{x^1, x^2, \ldots, x^I} \; \sum_{i=1}^{I} \alpha^i u^i(x^i), \quad \alpha^i \geq 0, \; \sum_{i=1}^{I} \alpha^i = 1$$
$$\text{subject to } \sum_{i=1}^{I} x_k^i \leq e_k \quad \forall\, k = 1, 2, \ldots, K$$
The Lagrangian is given by
$$L(x, \lambda) = \sum_{i=1}^{I} \alpha^i u^i(x^i) + \sum_{k=1}^{K} \lambda_k\left(e_k - \sum_{i=1}^{I} x_k^i\right)$$
The first-order conditions for individual $i$ and for any two goods $k$ and $l$ are:
$$x_k^i: \quad \alpha^i\,\frac{\partial u^i(x^i)}{\partial x_k^i} - \lambda_k = 0,$$
$$x_l^i: \quad \alpha^i\,\frac{\partial u^i(x^i)}{\partial x_l^i} - \lambda_l = 0.$$
If we consider the ratio for any two commodities, we get for all $i$ and for any pair
$k, l$ of commodities:
$$\frac{\alpha^i\,\partial u^i(x^i)/\partial x_k^i}{\alpha^i\,\partial u^i(x^i)/\partial x_l^i} = \frac{\lambda_k}{\lambda_l}$$
This means that the MRS between two goods $k$ and $l$ is the same across individuals,
which is the condition for Pareto optimality. Hence a specific profile of weights
$\alpha = (\alpha^1, \alpha^2, \ldots, \alpha^I)$ will give us a specific allocation among the set of Pareto
efficient allocations. Therefore we have the following powerful theorem:

Theorem 1. The set of Pareto efficient allocations and the set of welfare maximizing
allocations across all possible vectors of weights are identical.

Below we solve a particular example with Cobb-Douglas preferences.

Example 3. Let Ann and Bob have the following preferences:
$$u^A(x_1^A, x_2^A) = \alpha \ln x_1^A + (1-\alpha) \ln x_2^A$$
$$u^B(x_1^B, x_2^B) = \beta \ln x_1^B + (1-\beta) \ln x_2^B$$
Let the weight on Ann's utility function be $\gamma$, so the weight on Bob's
utility function is $(1-\gamma)$. The Lagrange expression is then given by
$$L(x, \lambda) = \gamma u^A + (1-\gamma)u^B + \lambda_1[e_1 - x_1^A - x_1^B] + \lambda_2[e_2 - x_2^A - x_2^B]$$
The F.O.C.s are then given by
$$\frac{\partial L(x, \lambda)}{\partial x_1^A} = \frac{\gamma\alpha}{x_1^A} - \lambda_1 = 0$$
$$\frac{\partial L(x, \lambda)}{\partial x_2^A} = \frac{\gamma(1-\alpha)}{x_2^A} - \lambda_2 = 0$$
$$\frac{\partial L(x, \lambda)}{\partial x_1^B} = \frac{(1-\gamma)\beta}{x_1^B} - \lambda_1 = 0$$
$$\frac{\partial L(x, \lambda)}{\partial x_2^B} = \frac{(1-\gamma)(1-\beta)}{x_2^B} - \lambda_2 = 0$$
Hence we get
$$(1-\gamma)\beta x_1^A = \gamma\alpha x_1^B \quad \text{and} \quad (1-\gamma)(1-\beta)x_2^A = \gamma(1-\alpha)x_2^B$$
Thus from the feasibility conditions we get that the allocations are
$$x_1^A = \frac{\gamma\alpha}{\gamma\alpha + (1-\gamma)\beta}\,e_1, \quad
x_2^A = \frac{\gamma(1-\alpha)}{\gamma(1-\alpha) + (1-\gamma)(1-\beta)}\,e_2$$
$$x_1^B = \frac{(1-\gamma)\beta}{\gamma\alpha + (1-\gamma)\beta}\,e_1, \quad
x_2^B = \frac{(1-\gamma)(1-\beta)}{\gamma(1-\alpha) + (1-\gamma)(1-\beta)}\,e_2$$
Here by varying the value of $\gamma$ in its range $[0, 1]$ we can generate the whole set of
Pareto efficient allocations.
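The closed-form allocations can be checked by maximizing the weighted welfare numerically (the parameter values $\alpha = 0.3$, $\beta = 0.6$, $\gamma = 0.5$, $e_1 = e_2 = 1$ are illustrative choices of mine):

```python
import math
from scipy.optimize import minimize

alpha, beta, gamma = 0.3, 0.6, 0.5   # illustrative values (not from the notes)
e1, e2 = 1.0, 1.0

def neg_welfare(v):
    xA1, xA2 = v
    xB1, xB2 = e1 - xA1, e2 - xA2    # feasibility binds at the optimum
    uA = alpha * math.log(xA1) + (1 - alpha) * math.log(xA2)
    uB = beta * math.log(xB1) + (1 - beta) * math.log(xB2)
    return -(gamma * uA + (1 - gamma) * uB)

res = minimize(neg_welfare, [0.5, 0.5], bounds=[(1e-6, 1 - 1e-6)] * 2)
xA1, xA2 = res.x

# closed-form planner allocations derived above
xA1_formula = gamma * alpha * e1 / (gamma * alpha + (1 - gamma) * beta)
xA2_formula = gamma * (1 - alpha) * e2 / (gamma * (1 - alpha) + (1 - gamma) * (1 - beta))
print(xA1, xA1_formula)   # approx. 1/3
print(xA2, xA2_formula)   # approx. 0.636
```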

12 Competitive Equilibrium Continued

Example 4. We now consider a simple example, where Friday is endowed with the
only (perfectly divisible) banana and Robinson is endowed with the only coconut.
That is, $e^F = (1, 0)$ and $e^R = (0, 1)$. To keep things simple suppose that both
agents have the same utility function
$$u(x_B, x_C) = \alpha\sqrt{x_B} + \sqrt{x_C}$$
and we consider the case where $\alpha > 1$, so there is a preference for bananas over
coconuts that both agents share. We can determine the indifference curves for
both Robinson and Friday that correspond to the same utility level that the initial
endowments provide. The indifference curves are given by
$$u^F(e_B^F, e_C^F) = \alpha\sqrt{e_B^F} + \sqrt{e_C^F} = \alpha = u^F(1, 0)$$
$$u^R(e_B^R, e_C^R) = \alpha\sqrt{e_B^R} + \sqrt{e_C^R} = 1 = u^R(0, 1)$$
All the allocations between these two indifference curves are Pareto superior to
the initial endowment.

We can define the net trade for Friday (and similarly for Robinson) by
$$z_B^F = x_B^F - e_B^F$$
$$z_C^F = x_C^F - e_C^F$$
Notice that since initially Friday had all the bananas and none of the coconuts,
$$z_B^F \leq 0, \quad z_C^F \geq 0$$

There could be many Pareto efficient allocations (e.g., Friday gets everything,

Robinson gets everything, etc.), but we can calculate which allocations are Pareto

optimal. If the indifference curves at an allocation are tangent then the marginal

rates of substitution must be equated. In this case, the resulting condition is

∂uF

∂xF

B

∂uF

∂xF

C

=

α

2

√

x

F

B

1

2

√

x

F

C

=

α

2

√

xR

B

1

2

√

xR

C

=

∂uR

∂xR

B

∂uR

∂xR

C

which simplifies to

p

x

F

p

C

x

F

B

=

p

x

R

p

C

x

R

B

and, of course, since there is a total of one unit of each commodity, for market

clearing we must have

x

R

C = 1 − x

F

C

x

R

B = 1 − x

F

B

so

p

x

F

p

C

x

F

B

=

p

1 − x

F

p

C

1 − x

F

B

and squaring both sides

x^F_C / x^F_B = (1 − x^F_C) / (1 − x^F_B)

which implies that

x^F_C − x^F_C x^F_B = x^F_B − x^F_C x^F_B

and so

x^F_C = x^F_B

x^R_C = x^R_B.

What are the conditions necessary for an equilibrium? First we need the conditions

for Friday to be optimizing. We can write Robinson’s and Friday’s optimization

problems as the corresponding Lagrangian, where we generalize the endowments

to any e^R = (e^R_B, e^R_C) and e^F = (e^F_B, e^F_C):

L(x^F_B, x^F_C, λ^F) = α√(x^F_B) + √(x^F_C) + λ^F (p_B e^F_B + e^F_C − p_B x^F_B − x^F_C),   (15)

where we normalize pC = 1 without loss of generality. A similar Lagrangian can

be set up for Robinson’s optimization problem. The first-order conditions for (15)

are

∂L/∂x^F_B = α / (2√(x^F_B)) − λ^F p_B = 0   (16)

∂L/∂x^F_C = 1 / (2√(x^F_C)) − λ^F = 0   (17)

∂L/∂λ^F = p_B e^F_B + e^F_C − p_B x^F_B − x^F_C = 0.   (18)

Solving as usual by taking the ratio of equations (16) and (17) we get the following

expression for the relative (to coconuts) price of bananas

p_B = α√(x^F_C) / √(x^F_B)

so that we can solve for x^F_C as a function of x^F_B:

x^F_C = (p_B / α)² x^F_B.

Plugging this into the budget constraint from equation (18) we get

p_B x^F_B + (p_B / α)² x^F_B = p_B e^F_B + e^F_C.

Then we can solve for Friday’s demand for bananas

x^F_B = (p_B e^F_B + e^F_C) / (p_B + (p_B / α)²)

and for coconuts

x^F_C = (p_B / α)² (p_B e^F_B + e^F_C) / (p_B + (p_B / α)²).

The same applies to Robinson’s demand functions, of course.

Now we have to solve for the equilibrium price pB. To do that we use the

market clearing condition for bananas, which says that demand has to equal supply

(endowment):

x^F_B + x^R_B = e^F_B + e^R_B.

Inserting the demand functions yields

(p_B e^F_B + e^F_C) / (p_B + (p_B / α)²) + (p_B e^R_B + e^R_C) / (p_B + (p_B / α)²) = e^F_B + e^R_B = e_B,

where e_B is the social endowment of bananas and we define e_C = e^F_C + e^R_C. We

solve this equation to get the equilibrium price of bananas in the economy:

p*_B = α√(e_C / e_B).

So we have solved for the equilibrium price in terms of the primitives of the economy. This price makes sense intuitively. It reflects relative scarcity in the economy

(when there are relatively more bananas than coconuts, bananas are cheaper) and

preferences (when consumers value bananas more, i.e., when α is larger, they cost

more). We can then plug this price back into the previously found equations both

for agents’ consumption and have an expression for consumption in terms of the

primitives.
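The derivation above can be checked numerically. The following is a minimal Python sketch, using α = 2 and the unit endowments from Example 4 (the function name friday_demand and the particular value of α are our own choices):

```python
from math import sqrt, isclose

def friday_demand(pB, eB, eC, alpha):
    # Demand derived above: x_B = (pB*eB + eC) / (pB + (pB/alpha)^2),
    # x_C = (pB/alpha)^2 * x_B, with the coconut price normalized to 1.
    wealth = pB * eB + eC
    xB = wealth / (pB + (pB / alpha) ** 2)
    xC = (pB / alpha) ** 2 * xB
    return xB, xC

alpha = 2.0
# endowments: Friday has the banana, Robinson the coconut
eF, eR = (1.0, 0.0), (0.0, 1.0)
eB, eC = eF[0] + eR[0], eF[1] + eR[1]

pB = alpha * sqrt(eC / eB)                # equilibrium price p*_B = α√(eC/eB)
xFB, xFC = friday_demand(pB, *eF, alpha)
xRB, xRC = friday_demand(pB, *eR, alpha)  # same utility, so same demand function

assert isclose(xFB + xRB, eB)             # banana market clears
assert isclose(xFC + xRC, eC)             # coconut market clears (Walras' law)
print(pB, xFB, xFC)
```

With these numbers p*_B = 2, Friday consumes (2/3, 2/3), and Robinson consumes (1/3, 1/3), so both markets clear.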

Now we state the two fundamental welfare theorems, which lay the foundation for treating competitive markets as the benchmark for any study of markets and prices. The first states that competitive equilibrium allocations are always Pareto efficient; the second states that any Pareto efficient allocation can be achieved as the outcome of a competitive equilibrium.

Theorem 2. (First Welfare Theorem) Every competitive equilibrium allocation x* is Pareto efficient.

Proof. Suppose not. Then there exists another allocation y, which is feasible, such that

for all i: u^i(y) ≥ u^i(x*)

for some i′: u^{i′}(y) > u^{i′}(x*).

If u^i(y) ≥ u^i(x*), then the budget constraint (and monotone preferences) implies that

Σ_{k=1}^K p_k y^i_k ≥ Σ_{k=1}^K p_k x^{*i}_k   (19)

and for some i′

Σ_{k=1}^K p_k y^{i′}_k > Σ_{k=1}^K p_k x^{*i′}_k.   (20)

Equations (19) and (20) imply that

Σ_{i=1}^I Σ_{k=1}^K p_k y^i_k > Σ_{i=1}^I Σ_{k=1}^K p_k x^{*i}_k = Σ_{k=1}^K p_k e_k,

where the left-most term is the aggregate expenditure and the right-most term the value of the social endowment. This is a contradiction because feasibility of y means that

Σ_{i=1}^I y^i_k ≤ Σ_{i=1}^I e^i_k = e_k

for each commodity k, and hence

Σ_{i=1}^I Σ_{k=1}^K p_k y^i_k ≤ Σ_{k=1}^K p_k e_k.

Theorem 3. (Second Welfare Theorem) Every Pareto efficient allocation can be

decentralized as a competitive equilibrium. That is, every Pareto efficient allocation

is the equilibrium for some endowments.


13 Decision Making under Uncertainty

So far, we have assumed that decision makers have all the needed information. This

is not the case in real life. In many situations, individuals or firms make decisions

before knowing what the consequences will be. For example, in financial markets

investors buy stocks without knowing future returns. Insurance contracts exist

because there is uncertainty. If individuals were not uncertain about the possibility of having an accident in the future, there would be no need for car insurance.

Definition 21. π = (π_1, π_2, . . . , π_N) represents a probability distribution if

π_n ≥ 0 ∀ n = 1, 2, . . . , N, and Σ_{n=1}^N π_n = 1.

Now, to conceptualize uncertainty, we define the concept of a lottery.

Definition 22. A lottery L is defined as follows:

L = (x; π) = (x_1, x_2, . . . , x_N; π_1, π_2, . . . , π_N)

where x = (x_1, x_2, . . . , x_N) ∈ ℝ^N is a profile of money awards (positive or negative) to be gained in N different states and π = (π_1, π_2, . . . , π_N) is the probability distribution over the N states.

13.1 St. Petersburg Paradox

Here we discuss a well-known lottery known as the St. Petersburg paradox, analyzed by Daniel Bernoulli in 1738. A fair coin is tossed until heads comes up for the first time. Then the reward paid out is equal to 2^(n−1), where n is the number of coin tosses that were necessary for heads to come up once. This lottery is described formally as

L_SP = (1, 2, 4, . . . , 2^(n−1), . . . ; 1/2, 1/4, 1/8, . . . , 1/2^n, . . .).

Its expected value is

E [LSP ] = X∞

n=1

πnxn =

X∞

n=1

1

2

n

2

n−1 =

X∞

n=1

1

2

= ∞.

Hence the expected payoff from this lottery is infinitely large, and an individual offered this lottery should be willing to pay an arbitrarily large amount for the right to play it. This is not, however, what people do; hence the paradox.
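The paradox is easy to see in simulation: even over many plays, the sample mean of the St. Petersburg lottery stays modest although the expectation is infinite. A small Python sketch (the seed and trial counts are arbitrary choices of ours):

```python
import random

random.seed(0)  # arbitrary seed, for reproducibility

def play_once():
    # toss a fair coin until heads; payoff is 2^(n-1) after n tosses
    n = 1
    while random.random() < 0.5:  # tails with probability 1/2
        n += 1
    return 2 ** (n - 1)

for trials in (100, 10_000, 200_000):
    mean = sum(play_once() for _ in range(trials)) / trials
    print(trials, mean)
```

The sample mean grows only very slowly with the number of trials (roughly like log of the sample size), which is why no one would pay a huge entry fee.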


13.2 Expected Utility

The St. Petersburg paradox emphasizes that expected value may not be the right way to describe an individual's preferences over lotteries. In general, utility over lotteries is a function:

U: ℝ^N × [0, 1]^N → ℝ

Expected Utility is a particular formulation that says that there is another utility

function defined over money

u: ℝ → ℝ

such that the utility over the lottery is of the following form:

U(x_1, x_2, . . . , x_N; π_1, π_2, . . . , π_N) = Σ_{n=1}^N u(x_n) π_n

Definition 23. A decision maker is called risk averse if the utility function u: ℝ → ℝ_+ is concave, and she is called risk loving or a risk seeker if u is convex.

Suppose a lottery is given by

L = (x_1, x_2; π_1, π_2).

Then the individual is risk averse if

π_1 u(x_1) + π_2 u(x_2) < u(π_1 x_1 + π_2 x_2),

risk loving if

π_1 u(x_1) + π_2 u(x_2) > u(π_1 x_1 + π_2 x_2),

and risk neutral if

π_1 u(x_1) + π_2 u(x_2) = u(π_1 x_1 + π_2 x_2).
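These definitions can be checked directly for a concrete lottery. In the sketch below we pick a 50/50 lottery over 0 and 100 (our own numbers) and compare expected utility to the utility of the expected value for a concave and a convex u:

```python
from math import sqrt

# lottery: win 0 or 100 with equal probability
x1, x2, pi1, pi2 = 0.0, 100.0, 0.5, 0.5
ev = pi1 * x1 + pi2 * x2            # expected value = 50

for name, u in [("sqrt (concave)", sqrt), ("square (convex)", lambda x: x * x)]:
    eu = pi1 * u(x1) + pi2 * u(x2)  # expected utility of the lottery
    ue = u(ev)                      # utility of the expected value
    print(name, eu, ue, "risk averse" if eu < ue else "risk loving")
```

For u = √x we get eu = 5 < √50 ≈ 7.07 (risk averse); for u = x² we get eu = 5000 > 2500 (risk loving), matching Definition 23.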

13.3 Risky Investment

Consider an individual with wealth w deciding how much to invest in a risky asset which pays return r_1 in state 1 and return r_2 in state 2, such that

(1 + r_1) < 1

(1 + r_2) > 1

Therefore state 1 is the bad state which gives a negative return while state 2 is

the good state which gives a positive return. If z is the amount that’s invested in

this risky asset then the individual’s expected utility is given by,

U(z) = π_1 u((1 + r_1)z − z + w) + π_2 u((1 + r_2)z − z + w).

So the individual solves the following problem:

max_z  π_1 u((1 + r_1)z − z + w) + π_2 u((1 + r_2)z − z + w)

The marginal utility of investment is given by

dU(z)/dz = π_1 u′((1 + r_1)z − z + w) r_1 + π_2 u′((1 + r_2)z − z + w) r_2.

Therefore the marginal utility at z = 0 is given by

dU(z)/dz |_{z=0} = π_1 r_1 u′(w) + π_2 r_2 u′(w) = [π_1 r_1 + π_2 r_2] u′(w).

Therefore

dU(z)/dz |_{z=0} ⋛ 0 according as π_1 r_1 + π_2 r_2 ⋛ 0.

Hence whether the individual will invest anything in the asset will depend on the expected value of the asset: if the expected value is positive, he will invest a positive amount irrespective of his degree of risk aversion. The actual value of z will of course depend on the concavity of his utility function.
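A quick numerical illustration of this result, using a square-root utility and returns of our own choosing with positive expected value (note that (1+r)z − z + w = w + rz):

```python
from math import sqrt

w = 100.0
r1, r2 = -0.5, 0.6          # bad-state and good-state returns (our numbers)
pi1, pi2 = 0.5, 0.5         # expected return per unit: 0.5*(-0.5)+0.5*0.6 = 0.05 > 0
u = sqrt                    # a concave (risk-averse) utility, our choice

def EU(z):
    # final wealth in each state is w + r*z
    return pi1 * u(w + r1 * z) + pi2 * u(w + r2 * z)

# since the expected return is positive, some z > 0 should beat z = 0
zs = [z / 10 for z in range(0, 1001)]   # grid on [0, 100]
z_star = max(zs, key=EU)
print(z_star, EU(z_star) > EU(0.0))
```

The grid optimum is about z* ≈ 33.3, which matches the FOC 0.25/√(100 − 0.5z) = 0.3/√(100 + 0.6z) solved by hand: positive but finite, with the size governed by the curvature of u.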

14 Theory of Production

We can use tools similar to those we used in the consumer theory section of the class

to study firm behaviour. In that section we assumed that individuals maximize

utility subject to some budget constraint. In this section we assume that firms

will attempt to maximize their profits given a demand schedule and production

technology.

Firms use inputs or commodities x1, . . . , xI to produce an output y. The

amount of output produced is related to the inputs by the production function

y = f(x1, . . . , xI ), which is formally defined as follows:

Definition 24. A production function is a mapping f: ℝ^I_+ → ℝ_+.

The prices of the inputs/commodities are p1, . . . , pI and the output price is py.

The firm takes prices as given and independent of its decisions.

Firms maximize their profits by choosing the optimal amount and combination

of inputs.

max_{x_1,...,x_I}  p_y f(x_1, . . . , x_I) − Σ_{i=1}^I p_i x_i.   (21)

Another way to describe firms’ decision making is by minimizing the cost necessary

to produce an output quantity ¯y.

min_{x_1,...,x_I}  Σ_{i=1}^I p_i x_i   s.t. f(x_1, . . . , x_I) ≥ ȳ.

The minimized cost of production, C (¯y), is called the cost function.

We make the following assumptions on the production function: positive marginal product

∂f/∂x_i ≥ 0

and declining marginal product

∂²f/∂x_i² ≤ 0.

The optimality conditions for the profit maximization problem (21) are the FOCs, for all i:

p_y ∂f/∂x_i − p_i = 0.

In other words, optimal production requires equality between marginal benefits

and marginal cost of production. The solution to the profit maximization problem

then is

x*_i(p_1, . . . , p_I, p_y), i = 1, . . . , I

y*(p_1, . . . , p_I, p_y),

i.e., optimal demand for inputs and optimal output/supply.
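The FOC p_y ∂f/∂x_i = p_i can be verified numerically for a concrete decreasing-returns technology. The function f(x_1, x_2) = x_1^(1/3) x_2^(1/3) and the unit prices below are illustrative choices of ours:

```python
# Check the FOC  p_y * df/dx_i = p_i  at the closed-form optimum of
# f(x1, x2) = x1^(1/3) * x2^(1/3) with p_y = p1 = p2 = 1 (our numbers).

py, p1, p2 = 1.0, 1.0, 1.0

def f(x1, x2):
    return x1 ** (1 / 3) * x2 ** (1 / 3)

# by symmetry x1 = x2 = x, and the FOC (1/3) x^(-1/3) = 1 gives x = (1/3)^3
x = (1 / 3) ** 3

h = 1e-6
mp1 = (f(x + h, x) - f(x - h, x)) / (2 * h)   # numerical marginal product
assert abs(py * mp1 - p1) < 1e-4              # marginal benefit = input price
print(x, py * f(x, x) - p1 * x - p2 * x)      # optimal input and profit
```

The optimal input is x = 1/27 in each factor, and the resulting profit 1/9 − 2/27 = 1/27 is strictly positive because returns to scale are decreasing.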

The solution of the cost minimization problem, on the other hand, is

x*_i(p_1, . . . , p_I, ȳ), i = 1, . . . , I,

where ȳ is the firm's production target.

Example 5. One commonly used production function is the Cobb-Douglas production function, where

f(K, L) = K^α L^(1−α).

The interpretation is the same as before, with α reflecting the relative importance of capital in production. The marginal product of capital is ∂f/∂K and the marginal product of labor is ∂f/∂L.

In general, we can change the scale of a firm by multiplying both inputs by a

common factor: f(tK, tL) and compare the new output to tf(K, L). The firm is

said to have constant returns to scale if

tf(K, L) = f(tK, tL),

it has decreasing returns to scale if

tf(K, L) > f(tK, tL),

and increasing returns to scale if

tf(K, L) < f(tK, tL).

Example 6. The Cobb-Douglas function in our example has constant returns to scale since

f(tK, tL) = (tK)^α (tL)^(1−α) = t K^α L^(1−α) = tf(K, L).

Returns to scale have an impact on market structure. With decreasing returns

to scale we expect to find many small firms. With increasing returns to scale, on

the other hand, there will be few large firms (or only a single one). No clear prediction

can be made in the case of constant returns to scale. Since increasing returns to

scale limit the number of firms in the market, the assumption that firms are price

takers only makes sense with decreasing or constant returns to scale.

15 Imperfect Competition

15.1 Pricing Power

So far, we have considered market environments where a single agent cannot control prices. Instead, each agent was infinitesimally small and firms acted as price takers. This was the case in competitive equilibrium. There are, however, many markets with few firms (oligopoly) or a single firm (monopoly). In that case firms can control prices to some extent. Moreover, when there are a few firms in a market, firms make interactive decisions. In other words, they take their competitors' actions into account. In Section ??, we will use game theory to analyse this type of market structure. First, we cover monopolies, i.e., markets with a single producer.


15.2 Monopoly

If a firm produces a non-negligible amount of the overall market then the price at

which the good sells will depend on the quantity sold. Examples for firms that

control the overall market include the East India Trading Company, Microsoft

(software in general because of network externalities and increasing returns to

scale), telecommunications and utilities (natural monopolies), Standard Oil, and

De Beers.

For any given price there will be some quantity demanded by consumers; this is known as the demand curve x: ℝ_+ → ℝ_+, or simply x(p). We assume that consumers demand less as the price increases: the demand function is downward sloping, or x′(p) < 0. We can invert this relationship to get the inverse demand function p(x), which reveals the price that will prevail in the market if the output is x.

If the firm is a monopolist that takes the demand data p(x) as given then its

goal is to maximize

π(x) = p(x)x − c(x)   (22)

by choosing the optimal production level. For the cost function we assume c′(x) > 0 and c″(x) ≥ 0, i.e., we have positive and weakly increasing marginal costs. For example, c(x) = cx satisfies these assumptions (a Cobb-Douglas production function provides this, for example). The monopolist maximizes its profit function (22) over x, which leads to the following FOC:

p(x) + x p′(x) − c′(x) = 0.   (23)

Here, in addition to the familiar p(x), which is the marginal return from the marginal consumer, the monopolist also has to take the term x p′(x) into account, because a change in quantity also affects the inframarginal consumers. For example, when it increases the quantity supplied, the monopolist gets positive revenue from the marginal consumer, but the inframarginal consumers pay less due to the downward-sloping demand function. At the optimum, the monopolist equates marginal revenue and marginal cost.

Example 7. A simple example used frequently is p(q) = a − bq, and we will also assume that a > c, since otherwise the cost of producing is higher than any consumer's valuation, so it will never be profitable for the firm to produce and the market will cease to exist. Then the firm wants to maximize the objective

π(x) = (a − bx − c)x.

The efficient quantity is produced when p(x) = a − bx = c, because then a consumer buys an object if and only if they value it more than the cost of producing, resulting in the highest possible total surplus. So the efficient quantity is

x* = (a − c) / b.

[Figure 5: Monopoly. Inverse demand p(x), marginal revenue MR, and marginal cost MC, with the monopoly outcome (x_M, p_M), the competitive outcome (x*, p*), and welfare areas A, B, C.]

The monopolist's maximization problem, however, has FOC

a − 2bx − c = 0,

where a − 2bx is the marginal revenue and c is the marginal cost. So the quantity set by the monopolist is

x^M = (a − c) / (2b) < x*.

The price with a monopoly can easily be found since

p^M = a − bx^M = a − (a − c)/2 = (a + c)/2 > c.
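Plugging in illustrative numbers (a = 10, b = 1, c = 2, our own choices) makes the comparison concrete:

```python
a, b, c = 10.0, 1.0, 2.0      # illustrative demand and cost parameters

x_star = (a - c) / b          # efficient quantity: price = marginal cost
x_M = (a - c) / (2 * b)       # monopoly quantity from the FOC a - 2bx - c = 0
p_M = a - b * x_M             # monopoly price, equal to (a + c) / 2

print(x_star, x_M, p_M)       # 8.0 4.0 6.0
assert x_M < x_star and p_M > c
```

The monopolist produces half the efficient quantity and sells it at 6 rather than at the marginal cost of 2.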

Figure 5 illustrates this.

A monopoly has different welfare implications than perfect competition. In

Figure 5, consumers in a monopoly lose the areas A and B compared to perfect


competition. The monopolist loses area C and wins area A. Hence, there are

distributional implications (consumers lose and the producer gains) as well as

efficiency implications (overall welfare decreases).

We can write the monopolist's FOC (23) in terms of the demand elasticity ε_p introduced in Section ?? as follows:

p(x*) + x* p′(x*) = c′(x*)

⇐⇒ p(x*) [1 + x* p′(x*) / p(x*)] = c′(x*)

⇐⇒ p(x*) = c′(x*) / (1 + 1/ε_p).

Since ε_p < 0, we have that p(x*) > c′(x*); in other words, the monopolist charges more than the marginal cost. This also means that if demand is very elastic, ε_p → −∞, then p(x*) ≈ c′(x*). On the other hand, if demand is very inelastic, ε_p ≈ −1, then p(x*) ≫ c′(x*).

16 Imperfectly Competitive Market

16.1 Price Discrimination

In the previous section we saw that the monopolist sets an inefficient quantity and

total welfare is decreased. Is there a mechanism, which allows the monopolist to

offer the efficient quantity and reap the entire possible welfare in the market? The

answer is yes if the monopolist can set a two-part tariff, for example. In general,

the monopolist can extract consumer rents by using price discrimination.

First degree price discrimination (perfect price discrimination) means discrimination by the identity of the person or the quantity ordered (non-linear pricing). It

will result in an efficient allocation. Suppose there is a single buyer and a monopoly

seller where the inverse demand is given by p = a − bx. If the monopolist were

to set a single price it would set the monopoly price. As we saw in the previous

section, however, this does not maximize the joint surplus, so the monopolist can

do better. Suppose instead that the monopolist charges a fixed fee F that the

consumer has to pay to be allowed to buy any positive amount at all, and then

sells the good at a price p, and suppose the monopolist sets the price p = c. The

fixed fee will not affect the quantity that a participating consumer will choose, so

if the consumer participates then they will choose quantity equal to x*. The firm

can then set the entry fee to extract all the consumer surplus and the consumer

will still be willing to participate. This maximizes the joint surplus, and gives the

entire surplus to the firm, so the firm is doing as well as it could under any other


mechanism. Specifically, using the functional form from Example 7, the firm sets

F = (a − c) x* / 2 = (a − c)² / (2b).

In integral notation this is

F = ∫_0^{x*} (p(x) − c) dx.

This pricing mechanism is called a two-part tariff, and was famously used at Disneyland (entry fee followed by a fee per ride), greatly increasing revenues.
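With the same illustrative numbers as before (a = 10, b = 1, c = 2, our own choices), the fee formula and its integral form can be checked against each other, and against the single-price monopoly profit:

```python
a, b, c = 10.0, 1.0, 2.0
x_star = (a - c) / b                  # efficient quantity

# fixed fee = consumer surplus at p = c: closed form (a - c)^2 / (2b)
F = (a - c) ** 2 / (2 * b)

# numerical check of the integral of (p(x) - c) from 0 to x* (midpoint sum)
n = 100_000
dx = x_star / n
integral = sum((a - b * (i + 0.5) * dx - c) * dx for i in range(n))
assert abs(F - integral) < 1e-6

profit_M = ((a + c) / 2 - c) * (a - c) / (2 * b)   # single-price monopoly profit
print(F, profit_M)                                 # 32.0 vs 16.0
assert F > profit_M
```

The two-part tariff earns 32, double the 16 a single monopoly price would earn, because it captures the entire surplus under the demand curve above marginal cost.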

Now, let’s assume that there are two different classes of consumers, type A

with utility function u(x) and type B with βu(x), β > 1, so that the second class

of consumers has a higher valuation of the good. If the monopolist structures a

two-part tariff (F, p = c) to extract all surplus from type B consumers, type A

consumers would not pay the fixed fee F since they could not recover the utility

lost from using the service. On the other hand, if the firm offers two two-part

tariffs (FA, p = c) and (FB, p = c) with FA < FB, all consumers would pick the

cheaper contract (FA, p = c). A solution to this problem would be to offer the

contracts (FA, pA > c) and (FB, p = c). Type A consumers pick the first contract

and consume less of the good and type B consumers pick the second contract,

which allows them to consume the efficient quantity. This is an example of second

degree price discrimination, which means that the firm varies the price by quantity

or quality only. It offers a menu of choices and lets the consumers self-select into

their preferred contract.

In addition, there is third degree price discrimination, in which the firm varies

the price by market or identity of the consumers. For example, Disneyland can

charge different prices in different parks. Let’s assume there are two markets,

i = 1, 2. The firm is a monopolist in both markets and its profit maximization

problem is

max_{x_1,x_2}  x_1 p_1(x_1) + x_2 p_2(x_2) − c(x_1 + x_2).

The FOC for each market is

p_i(x_i) + x_i p_i′(x_i) = c′(x_1 + x_2),

which leads to the optimal solution

p_i(x*_i) = c′(x*_1 + x*_2) / (1 + 1/ε^p_i)   for i = 1, 2.

Hence, the solution depends on the demand elasticity in market i. The price will

be different as long as the structure of demand differs.
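A sketch with two linear markets (intercepts and marginal cost are our own choices) shows the elasticity logic: the market with more elastic demand gets the lower price, and each price satisfies the markup formula p = c′/(1 + 1/ε):

```python
# Two linear markets p_i(x) = a_i - x with constant marginal cost c.
c = 2.0
for a_i in (10.0, 6.0):               # market 1 has the stronger demand
    x_i = (a_i - c) / 2               # FOC: a_i - 2*x_i = c
    p_i = a_i - x_i
    eps = -p_i / x_i                  # elasticity (dx/dp)(p/x) with p'(x) = -1
    assert abs(p_i - c / (1 + 1 / eps)) < 1e-12   # markup formula holds
    print(a_i, x_i, p_i, eps)
```

Market 1 charges p = 6 at elasticity −1.5; market 2, with the more elastic demand (ε = −2), charges only p = 4.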


16.2 Oligopoly

Oligopoly refers to environments where there are few large firms. These firms are

large enough that their quantity influences the price and so impacts their rivals.

Consequently each firm must condition its behavior on the behavior of the other

firms. This strategic interaction is modeled with game theory. The most important

model of oligopoly is the Cournot model or the model of quantity competition. The

general model is described as follows. Let there be I firms, indexed by i = 1, 2, . . . , I,

each producing one homogeneous good, where firm i produces the amount q_i of that good. Each firm i has cost function c_i(q_i).

The total production is given by

q = Σ_{i=1}^I q_i.

We also denote total production by firms other than i by

q_{−i} = Σ_{j≠i} q_j.

The profit of firm i is given by

π_i(q_i, q_{−i}) = p(q_i, q_{−i}) q_i − c_i(q_i).

So firm i solves

max_{q_i}  π_i(q_i, q_{−i}).

Hence the F.O.C. is given by

p(q_i, q_{−i}) + [∂p(q_i, q_{−i})/∂q_i] q_i − c_i′(q_i) = 0.

It is important to note that the optimal production of firm i depends on the production by other firms, i.e. q_{−i}. This is the strategic aspect of this model. Therefore, in order to decide how much of the good to produce, firm i must anticipate what others might be doing, and every firm thinks the same way. So we need an equilibrium concept here which tells us that, given the production level of every firm, no firm wants to move away from its current production.


16.2.1 Example

We here consider the duopoly case, where there are only two firms. Suppose the inverse demand function is given by p(q) = a − bq, and the cost of producing is constant and the same for both firms: c_i(q) = cq. The quantity produced in the market is the sum of what both firms produce, q = q_1 + q_2. The profits for each firm are then a function of the market price and their own quantity:

π_i(q_i, q_j) = q_i (p(q_i + q_j) − c).

The strategic variable that the firm is choosing is the quantity to produce, q_i. Suppose that the firms' objective was to maximize their joint profit

π_1(q_1, q_2) + π_2(q_1, q_2) = (q_1 + q_2)(p(q_1 + q_2) − c);

then we know from before that this is maximized when q_1 + q_2 = q^M. We could refer to this as the collusive outcome. One way the two firms could split production would be q_1 = q_2 = q^M / 2.

If the firms could write binding contracts then they could agree on this outcome.

However, that is typically not possible (such an agreement would be price fixing),

so we would not expect this outcome to occur unless it is stable/self-enforcing. If

either firm could increase its profits by setting another quantity, then they would

have an incentive to deviate from this outcome. We will see below that both firms

would in fact have an incentive to deviate and increase their output.

Suppose now that firm i is trying to choose q_i to maximize its own profits, taking the other firm's output as given. Then firm i's optimization problem is

max_{q_i}  π_i(q_i, q_j) = q_i (a − b(q_i + q_j) − c),

which has the associated FOC

∂π_i(q_i, q_j)/∂q_i = a − b(2q_i + q_j) − c = 0.

Then the optimal level q*_i given any level of q_j is

q*_i(q_j) = (a − bq_j − c) / (2b).

This is firm i's best response to whatever firm j plays. In the special case when q_j = 0, firm i is a monopolist, and the observed quantity q_i corresponds to the monopoly case. In general, when the rival has produced q_j, we can treat the firm as a monopolist facing a "residual demand curve" with intercept a − bq_j. We can write firm i's best response function as

q*_i(q_j) = (a − c) / (2b) − q_j / 2.

[Figure 6: Cournot equilibrium. The best response functions q_1(q_2) and q_2(q_1) on the (q_1, q_2) plane cross at the Nash equilibrium.]

Hence,

dq_i/dq_j = −1/2.

This has two important implications. First, the quantity player i chooses is decreasing in its rival's quantity. This means that quantities are strategic substitutes. Second, if player j increases their quantity, player i decreases their quantity by less than player j increased it (player i decreases his quantity by exactly 1/2 for every unit by which player j's quantity is increased). So we would expect that the output in a duopoly would be higher than in a monopoly.

We can depict the best response functions graphically. Setting a = b = 1 and c = 0, Figure 6 shows the best response functions. Here, the best response functions are q*_i(q_j) = (1 − q_j) / 2.

We are at a “stable” outcome if both firms are producing a best response to

their rivals’ production. We refer to such an outcome as an equilibrium. That is,

when

q_i = (a − bq_j − c) / (2b)   (24)

q_j = (a − bq_i − c) / (2b).   (25)

Since the best responses are symmetric we will have qi = qj and so we can calculate

the equilibrium quantities from the equation

q_i = (a − bq_i − c) / (2b)

and so

q_i = q_j = (a − c) / (3b)

and hence

q = q_i + q_j = 2(a − c) / (3b) > (a − c) / (2b) = q^M.

There is a higher output (and hence lower price) in a duopoly than in a monopoly.
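With the earlier illustrative parameters (a = 10, b = 1, c = 2, our own choices), the symmetric Cournot quantities can be verified against the best-response conditions (24) and (25):

```python
a, b, c = 10.0, 1.0, 2.0

q_i = q_j = (a - c) / (3 * b)        # symmetric Cournot equilibrium quantity
# verify both best-response conditions hold simultaneously
assert abs(q_i - (a - b * q_j - c) / (2 * b)) < 1e-12
assert abs(q_j - (a - b * q_i - c) / (2 * b)) < 1e-12

q_M = (a - c) / (2 * b)              # monopoly quantity for comparison
print(q_i + q_j, q_M)
assert q_i + q_j > q_M               # duopoly output exceeds monopoly output
```

Here each firm produces 8/3, total output 16/3 exceeds the monopoly output of 4, and the market price is correspondingly lower.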

More generally, both firms are playing a best response to their rival's action because for all i

π_i(q*_i, q*_j) ≥ π_i(q_i, q*_j) for all q_i.

That is, the profits from the equilibrium quantity are (weakly) higher than the profits from any other output. This motivates the following definition for an equilibrium in a strategic setting.

Definition 25. A Nash Equilibrium in the duopoly game is a pair (q*_i, q*_j) such that for all i

π_i(q*_i, q*_j) ≥ π_i(q_i, q*_j) for all q_i.

This definition implicitly assumes that agents hold (correct) expectations or

beliefs about the other agents’ strategies.

A Nash Equilibrium is ultimately a stability property: there is no profitable deviation for any of the players. In order to be at equilibrium we must have that

q_i = q*_i(q_j)

q_j = q*_j(q_i)

and so we must have that

q_i = q*_i(q*_j(q_i)),

so equilibrium corresponds to a fixed point of the mapping q*_1(q*_2(·)). This idea can

also be illustrated graphically. In Figure 6, firm 1 initially sets q_1 = 1/2, which is not the equilibrium quantity. Firm 2 then optimally picks q_2 = q*_2(1/2) = 1/4 according to its best response function. Firm 1, in turn, chooses a new quantity according to its best response function: q_1 = q*_1(1/4) = 3/8. This process goes on and ultimately converges to q_1 = q_2 = 1/3.
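This best-response iteration takes only a few lines to reproduce (with a = b = 1 and c = 0 as in Figure 6):

```python
# Iterate the best responses q*(q) = (1 - q) / 2, starting from q1 = 1/2
br = lambda q: (1 - q) / 2

q1 = 0.5                      # firm 1's initial (non-equilibrium) quantity
for _ in range(50):
    q2 = br(q1)               # firm 2 best-responds to firm 1
    q1 = br(q2)               # then firm 1 best-responds to firm 2
print(q1, q2)                 # both converge to 1/3
```

Each round cuts the distance to 1/3 by a factor of four (since dq_i/dq_j = −1/2 applied twice), so convergence to the fixed point is very fast.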

16.3 Oligopoly: General Case

Now, we consider the case with I competitors. The inverse demand function (setting a = b = 1) is

p(q) = 1 − Σ_{i=1}^I q_i

and firm i's profit function is

π(q_i, q_{−i}) = (1 − Σ_{i=1}^I q_i − c) q_i,   (26)

where the vector q_{−i} is defined as q_{−i} = (q_1, . . . , q_{i−1}, q_{i+1}, . . . , q_I), i.e., all quantities excluding q_i.

Again, we can define an equilibrium in this market as follows:

Definition 26. A Nash Equilibrium in the oligopoly game is a vector q* = (q*_1, . . . , q*_I) such that for all i

π_i(q*_i, q*_{−i}) ≥ π_i(q_i, q*_{−i}) for all q_i.

We simply replaced the quantity q_j by the vector q_{−i}.

Definition 27. A Nash equilibrium is called symmetric if q*_i = q*_j for all i and j.

The FOC for maximizing the profit function (26) is

1 − Σ_{j≠i} q_j − 2q_i − c = 0

and the best response function for all i is

q_i = (1 − Σ_{j≠i} q_j − c) / 2.   (27)

Here, only the aggregate supply of firm i's competitors matters, but not the specific amounts single firms supply. It would be difficult to solve for I separate values of q_i, but due to symmetry of the profit function we get that q*_i = q*_j for all i and j, so that equation (27) simplifies to

q*_i = (1 − (I − 1)q*_i − c) / 2,

which leads to the solution

q*_i = (1 − c) / (I + 1).

As I increases (more firms), the market becomes more competitive. Market supply is equal to

Σ_{i=1}^I q*_i = I q*_i = (I / (I + 1)) (1 − c).

As the number of firms becomes larger, I → ∞, q*_i → 0 and

Σ_{i=1}^I q*_i → 1 − c,

which is the supply in a competitive market. Consequently,

p* → c.

As each player plays a less important strategic role in the market, the oligopoly outcome converges to the competitive market outcome.
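The convergence to the competitive outcome is easy to tabulate for an illustrative marginal cost (c = 0.5 below is our own choice, with a = b = 1 as in the text):

```python
c = 0.5   # illustrative marginal cost, with a = b = 1 as in the text

for I in (1, 2, 10, 100, 1000):
    q_i = (1 - c) / (I + 1)    # symmetric equilibrium quantity per firm
    Q = I * q_i                # market supply I(1 - c)/(I + 1)
    p = 1 - Q                  # market price
    print(I, round(Q, 4), round(p, 4))
```

As I grows, total supply rises toward the competitive level 1 − c = 0.5 and the price falls toward the marginal cost c = 0.5.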

Note that we used symmetry in deriving the market outcome from the firms' best response function. We cannot invoke symmetry when deriving the FOC. One might think that instead of writing the profit function as (26) one could simplify it to

π(q_i, q_{−i}) = (1 − I q_i − c) q_i.

This is wrong, however, because it implies that firm i controls the entire market supply (acts as a monopolist). Instead, in an oligopoly market, firm i takes the other firms' output as given.

17 Game Theory

17.1 Basics

Game theory is the study of the behavior of individuals in a strategic scenario, where a strategic scenario is defined as one in which the actions of one individual affect the payoff or utility of other individuals. In the previous section we introduced game theory in the context of firm competition. In this section, we will generalize the methods used above and introduce some specific language. The specification of a (static) game consists of three elements:

1. The players, indexed by i = 1, . . . , I. In the duopoly games, for example,

the players were the two firms.

2. The strategies available: each player chooses strategy a_i from the available strategy set A_i. We can write a_{−i} = (a_1, . . . , a_{i−1}, a_{i+1}, . . . , a_I) to represent the strategies of the other I − 1 players. Then, a strategy profile of all players is defined by a = (a_1, . . . , a_I) = (a_i, a_{−i}). In the Cournot game, the players' strategies were the quantities chosen, hence A_i = ℝ_+.


3. The payoffs for each player as a function of the strategies of the players. We use game theory to analyze situations where there is strategic interaction, so the payoff function will typically depend on the strategies of other players as well. We write the payoff function for player i as u_i(a_i, a_{−i}). The payoff function is the mapping

u_i: A_1 × · · · × A_I → ℝ.

Therefore we can define a game the following way:

Definition 28. A game (in normal form) is a triple

Γ = ({1, 2, . . . , I}, {A_i}_{i=1}^I, {u_i(·)}_{i=1}^I).

We now define the concept of best response, i.e. the action for a player i which is best for him (in the sense of maximizing the payoff function). But since we are studying a strategic scenario, what is best for player i potentially depends on what others are playing, or what player i believes others might be playing.

Definition 29. An action a_i is a best response for player i against a profile of actions of others a_{−i} if

u_i(a_i, a_{−i}) ≥ u_i(a′_i, a_{−i}) ∀ a′_i ∈ A_i.

We say that

a_i ∈ BR_i(a_{−i}).

Now we define the concept of Nash Equilibrium for a general game.

Definition 30. An action profile

a* = (a*_1, a*_2, . . . , a*_I)

is a Nash equilibrium if

for all i, u_i(a*_i, a*_{−i}) ≥ u_i(a_i, a*_{−i}) ∀ a_i ∈ A_i

or, stated otherwise,

for all i, a*_i ∈ BR_i(a*_{−i}).

We know that

BR_i: ×_{j≠i} A_j → A_i.

Now let's define the following function

BR: ×_{i=1}^I A_i → ×_{i=1}^I A_i

as

BR = (BR_1, BR_2, . . . , BR_I).

Then we can redefine Nash equilibrium as follows:


Definition 31. An action profile

a* = (a*_1, a*_2, . . . , a*_I)

is a Nash equilibrium if

a* ∈ BR(a*).

17.2 Pure Strategies

We can represent games (at least those with a finite choice set) in normal form. A normal form game consists of the matrix of payoffs for each player from each possible strategy profile. If there are two players, 1 and 2, then the normal form game consists of a matrix where the (i, j)th entry consists of the tuple (player 1's payoff, player 2's payoff) when player 1 plays their ith strategy and player 2 plays their jth strategy. We will now consider the most famous examples of games.

Example 8. (Prisoner's Dilemma) Suppose two suspects, Bob and Rob, are arrested for a crime and questioned separately. The police can prove they committed a minor crime, and suspect they have committed a more serious crime but can't

prove it. The police offer each suspect that they will let them off for the minor

crime if they confess and testify against their partner for the more serious crime. Of

course, if the other criminal also confesses the police won’t need his testimony but

will give him a slightly reduced sentence for cooperating. Each player then has two

possible strategies: Stay Quiet (Q) or Confess (C) and they decide simultaneously.

We can represent the game with the following payoff matrix:

Rob

Q C

Bob Q 3, 3 −1, 4

C 4, −1 0, 0

Each entry represents (Bob, Rob)’s payoff from each of the two strategies. For

example, if Rob stays quiet while Bob confesses Bob’s payoff is 4 and Rob’s is

−1. Notice that both players have what is known as a dominant strategy; they

should confess regardless of what the other player has done. If we consider Bob,

if Rob is Quiet then confessing gives payoff 4 > 3, the payoff from staying quiet.

If Rob confesses, then Bob should confess since 0 > −1. The analysis is the same

for Rob. So the only stable outcome is for both players to confess. So the only

Nash Equilibrium is (Confess, Confess). Notice that, from the perspective of the

prisoners this is a bad outcome. In fact it is Pareto dominated by both players

staying quiet, which is not a Nash equilibrium.
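The mutual-best-response reasoning above can be brute-forced for any small two-player game. The following is my own sketch, not part of the notes: the `pure_nash` helper tests every action profile for mutual best responses and, applied to the payoff matrix above, finds only (C, C).

```python
# A minimal sketch (not from the notes): brute-force search for pure-strategy
# Nash equilibria in a two-player normal form game, applied to the
# Prisoner's Dilemma payoff matrix from the text.
from itertools import product

def pure_nash(payoffs, actions1, actions2):
    """payoffs[(a1, a2)] = (u1, u2); returns all pure-strategy Nash equilibria."""
    equilibria = []
    for a1, a2 in product(actions1, actions2):
        u1, u2 = payoffs[(a1, a2)]
        # a1 must be a best response to a2, and a2 a best response to a1
        if all(payoffs[(b1, a2)][0] <= u1 for b1 in actions1) and \
           all(payoffs[(a1, b2)][1] <= u2 for b2 in actions2):
            equilibria.append((a1, a2))
    return equilibria

pd = {("Q", "Q"): (3, 3), ("Q", "C"): (-1, 4),
      ("C", "Q"): (4, -1), ("C", "C"): (0, 0)}
print(pure_nash(pd, ["Q", "C"], ["Q", "C"]))  # → [('C', 'C')]
```

The same function applied to the coordination game below would find both (R, R) and (N, N).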


The above example has a dominant strategy equilibrium, where both players

have a unique dominant strategy.

Definition 32. A strategy ai is dominant if

ui(ai, a−i) > ui(ai′, a−i) for all ai′ ∈ Ai, a−i ∈ A−i.

If each player has a dominant strategy, then the only rational thing for them

to do is to play that strategy no matter what the other players do. Hence, if a

dominant strategy equilibrium exists it is a relatively uncontroversial prediction of

what will happen in the game. However, in most strategic situations a dominant strategy will not exist. Consequently, the most commonly used solution

concept is Nash Equilibrium, which does not require dominant strategies.

Note the difference between Definitions 30 and 32: a Nash Equilibrium only requires a best response against the other players' equilibrium strategies a∗−i, whereas a dominant strategy must be a best response against all strategies a−i ∈ A−i. A strategy profile is a Nash Equilibrium if

each player is playing a best response to the other players’ strategies. So a Nash

Equilibrium is a stable outcome where no player could profitably deviate. Clearly

when dominant strategies exist it is a Nash Equilibrium for all players to play

a dominant strategy. However, as we see from the Prisoner’s Dilemma example

the outcome is not necessarily efficient. The next example shows that the Nash

Equilibrium may not be unique.

Example 9. (Coordination Game) We could represent a coordination game where

Bob and Ann are two researchers, both of whose input is necessary for a project.

They decide simultaneously whether to do research (R) or not (N).

Bob

R N

Ann R 3, 3 −1, 0

N 0, −1 1, 1

Here (R,R) and (N,N) are both equilibria. Notice that the equilibria in this

game are Pareto ranked with both players preferring to coordinate on doing research. Both players not doing research is also an equilibrium, since if both players

think the other will play N they will play N as well.

A famous example of a coordination game is from traffic control. It doesn’t

really matter if everyone drives on the left or right, as long as everyone drives on

the same side.

Example 10. Another example of a game is a “beauty contest.” Everyone in

the class picks a number on the interval [1, 100]. The goal is to guess as close


as possible to 2/3 of the class average. An equilibrium of this game is for everyone to guess 1. This is in fact the only equilibrium. Since no one can guess more than 100, 2/3 of the mean cannot be higher than 66 2/3, so all guesses above this are dominated. But since no one will guess more than 66 2/3, the mean cannot be higher than (2/3)(66 2/3) = 44 4/9, so no one should guess higher than 44 4/9. Repeating this n times, no one should guess higher than (2/3)^n · 100, and taking n → ∞ all players should guess 1. Of course, this isn't necessarily what will happen in practice if people solve the game incorrectly or expect others to. Running this experiment

in class the average guess was approximately 12.
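The iterated-elimination argument can be checked numerically. A small sketch of my own (not in the notes): it tracks the upper bound (2/3)^n · 100 on surviving guesses and counts how many rounds of elimination it takes for the bound to fall to 1, the equilibrium guess.

```python
# Upper bound on rationalizable guesses after n rounds of elimination
# in the 2/3-of-the-average beauty contest: (2/3)**n * 100.
bound, n = 100.0, 0
while bound > 1:
    bound *= 2 / 3
    n += 1
print(n, bound)  # after 12 rounds the bound has dropped below 1
```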

17.3 Mixed Strategy

So far we have considered only pure strategies: strategies where the players do

not randomize over which action they take. In other words, a pure strategy is

a deterministic choice. The following simple example demonstrates that a pure

strategy Nash Equilibrium may not always exist.

Example 11. (Matching Pennies) Consider the following payoff matrix:

Bob

H T

Ann H 1, −1 −1, 1

T −1, 1 1, −1

Here Ann wins if both players play the same strategy, and Bob wins if they

play different ones. Clearly there cannot be a pure strategy equilibrium, since Bob would have an incentive to deviate whenever they play the same strategy and Ann would have an incentive to deviate if they play different ones. Intuitively, the only equilibrium is to randomize between H and T with probability 1/2 each.

While the idea of a matching pennies game may seem contrived, it is merely

the simplest example of a general class of zero-sum games, where the total payoff

of the players is constant regardless of the outcome. Consequently gains for one

player can only come from losses of the other. For this reason, zero-sum games

will rarely have a pure strategy Nash equilibrium. Examples would be chess, or

more relevantly, competition between two candidates or political parties. Cold

War power politics between the US and USSR was famously (although probably

not accurately) modelled as a zero-sum game. Most economic situations are not

zero-sum since resources can be used inefficiently.

Example 12. A slight variation is the game of Rock-Paper-Scissors.


Bob

R P S

R 0, 0 −1, 1 1, −1

Ann P 1, −1 0, 0 −1, 1

S −1, 1 1, −1 0, 0

Definition 33. A mixed strategy by player i is a probability distribution σi = (σi(s_i^1), . . . , σi(s_i^K)) such that

σi(s_i^k) ≥ 0 for all k and Σ_{k=1}^K σi(s_i^k) = 1.

Here we refer to si as an action and to σi as a strategy, which in this case is a

probability distribution over actions. The action space is Si = {s_i^1, . . . , s_i^K}.

Expected utility from playing action si when the other player plays strategy

σj is

ui(si, σj) = Σ_{k=1}^K σj(s_j^k) ui(si, s_j^k).

Example 13. Consider a coordination game (also known as "battle of the sexes") similar to the one in Example 9 but with different payoffs:

Bob

σB 1 − σB

O C

Ann σA O 1, 2 0, 0

1 − σA C 0, 0 2, 1

Hence Bob prefers to go to the opera (O) and Ann prefers to go to a cricket match (C), but both players would rather go to an event together than alone. There

are two pure strategy Nash Equilibria: (O, O) and (C, C). We cannot predict which equilibrium the players will pick. Moreover, it could be the case

that there is a third Nash Equilibrium, in which the players randomize.

Suppose that Ann plays O with probability σA and C with probability 1 − σA.

Then Bob’s expected payoff from playing O is

2σA + 0(1 − σA) (28)

Figure 7: Three Nash Equilibria in the battle of the sexes game (Ann's and Bob's best responses in the (σB, σA) plane intersect at the two pure equilibria and at σB = 2/3, σA = 1/3)

and his expected payoff from playing C is

0σA + 1(1 − σA). (29)

Bob is only willing to randomize between his two pure strategies if he gets the

same expected payoff from both. Otherwise he would play the pure strategy that

yields the highest expected payoff for sure. Equating (28) and (29) we get that

σ∗A = 1/3.

In other words, Ann has to play O with probability 1/3 to induce Bob to play a

to induce Bob to play a

mixed strategy as well. We can calculate Bob’s mixed strategy similarly to get

σ∗B = 2/3.

Graphically, we can depict Ann's and Bob's best response functions in Figure 7.

The three Nash Equilibria of this game are the three intersections of the best

response functions.
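The indifference conditions behind the mixed equilibrium are simple linear equations. A sketch of my own using exact arithmetic (the payoffs are those of the battle-of-the-sexes matrix above):

```python
# Each player's mixing probability is pinned down by making the OTHER
# player indifferent between O and C (my own check, not from the notes).
from fractions import Fraction

# Bob indifferent given Ann plays O with prob sigma_A:
#   2*sigma_A = 1*(1 - sigma_A)  =>  sigma_A = 1/3
sigma_A = Fraction(1, 3)
assert 2 * sigma_A == 1 * (1 - sigma_A)

# Ann indifferent given Bob plays O with prob sigma_B:
#   1*sigma_B = 2*(1 - sigma_B)  =>  sigma_B = 2/3
sigma_B = Fraction(2, 3)
assert 1 * sigma_B == 2 * (1 - sigma_B)

print(sigma_A, sigma_B)  # → 1/3 2/3
```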

18 Asymmetric Information: Adverse Selection

and Moral Hazard

Asymmetric information simply refers to situations where some of the players

have relevant information that other players do not. We consider two types of


asymmetric information: adverse selection, also known as hidden information,

and moral hazard or hidden action.

A leading example for adverse selection occurs in life or health insurance. If

an insurance company offers actuarially fair insurance it attracts insurees with

above average risk whereas those with below average risk decline the insurance.

(This assumes that individuals have private information about their risk.) In

other words, individuals select themselves into insurance based on their private

information. Since only the higher risks are in the risk pool the insurance company

will make a loss. In consequence of this adverse selection the insurance market

breaks down. Solutions to this problem include denying or mandating insurance

and offering a menu of contracts to let insurees self-select thereby revealing their

risk type.

Moral hazard is also present in insurance markets when insurees’ actions depend

on having insurance. For example, they might exercise less care when being covered

by fire or automobile insurance. This undermines the goal of such insurance,

which is to provide risk sharing in the case of property loss. With moral hazard,

property loss becomes more likely because insurees do not install smoke detectors,

for example. Possible solutions to this problem are copayments and punishment

for negligence.

18.1 Adverse Selection

The following model goes back to George Akerlof’s 1970 paper on “The market

for lemons.” The used car market is a good example for adverse selection because

there is variation in product quality and this variation is observed by sellers, but

not by buyers.

Suppose there is a potential buyer and a potential seller for a car. Suppose that

the quality of the car is denoted by θ ∈ [0, 1]. Buyers and sellers have different

valuations/willingness to pay vb and vs, so that the value of the car is vbθ to the

buyer and vsθ to the seller. Assume that vb > vs so that the buyer always values

the car more highly than the seller. So we know that trade is always efficient.

Suppose that both the buyer and seller know θ. Then we have seen in the bilateral

trading section that trade can occur at any price p ∈ [vsθ, vbθ] and at that price

the efficient allocation (buyer gets the car) is realized (the buyer has a net payoff

of vbθ − p and the seller gets p − vsθ, and the total surplus is vbθ − vsθ).

The assumption that the buyer knows the quality of the car may be reasonable

in some situations (new car), but in many situations the seller will be much better

informed about the car’s quality. The buyer of a used car can observe the age,

mileage, etc. of a car and so have a rough idea as to quality, but the seller has

presumably been driving the car and will know more about it. In such a situation

we could consider the quality θ as a random variable, where the buyer knows


only the distribution but the seller knows the realization. We could consider a

situation where the buyer knows the car is of a high quality with some probability,

and low quality otherwise, whereas the seller knows whether the car is high quality.

Obviously the car could have a more complicated range of potential qualities. If

the seller values a high quality car more, then their decision to participate in the

market potentially reveals negative information about the quality, hence the term

adverse selection. This is because if the car had higher quality the seller would

be less willing to sell it at any given price. How does this type of asymmetric

information change the outcome?

Suppose instead that the buyer only knows that θ ∼ U[0, 1], i.e., the quality is uniformly distributed between 0 and 1. Then the seller is willing to trade

if

p − vsθ ≥ 0 (30)

and the buyer, who does not know θ, but forms its expected value, is willing to

trade if

E[θ]vb − p ≥ 0. (31)

However, the buyer can infer the car’s quality from the price the seller is asking.

Using condition (30), the buyer knows that

θ ≤ p/vs

so that condition (31) becomes

E[θ | θ ≤ p/vs] vb − p = (p/(2vs)) vb − p ≥ 0, (32)

where we use the conditional expectation of a uniform distribution:

E[θ | θ ≤ a] = a/2.

Hence, simplifying condition (32), the buyer is only willing to trade if

vb ≥ 2vs.

In other words, the buyer’s valuation has to exceed twice the seller’s valuation for

a trade to take place. If

2vs > vb > vs

trade is efficient, but does not take place if there is asymmetric information.

In order to reduce the amount of private information the seller can offer a

warranty or have a third party certify the car's quality.


If we instead assumed that neither the buyer nor the seller knows the realization

of θ then the high quality cars would not be taken out of the market (sellers

cannot condition their actions on information they do not have) and so we could

have trade. This indicates that it is not the incompleteness of information that

causes the problems, but the asymmetry.
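A minimal numerical sketch of the lemons logic above (my own code, following the uniform-quality model in the text):

```python
# With theta ~ U[0,1], a seller asking price p reveals theta <= p/v_s, so the
# buyer's expected value of a car offered at p is v_b * (p/v_s)/2. Trade is
# possible only if that exceeds p, i.e. only if v_b >= 2*v_s.
# (My own illustration; parameter values are hypothetical.)
def buyer_willing(p, v_b, v_s):
    expected_quality = (p / v_s) / 2   # E[theta | theta <= p/v_s]
    return expected_quality * v_b - p >= 0

# v_b = 1.5*v_s: trade would be efficient, yet the buyer refuses this price
assert not buyer_willing(p=0.5, v_b=1.5, v_s=1.0)
# v_b = 2.5*v_s: the buyer accepts
assert buyer_willing(p=0.5, v_b=2.5, v_s=1.0)
```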

18.2 Moral Hazard

Moral hazard is similar to adverse selection except that instead of hidden information it involves a hidden action. The distinction between the

two concepts can be seen in an insurance example. Those who have pre-existing

conditions that make them more risky (that are unknown to the insurer) are more

likely, all else being equal, to buy insurance. This is adverse selection. An individual who has purchased insurance may become less cautious since the costs of

any damage are covered by the insurance company. This is moral hazard. There is a

large literature in economics on how to structure incentives to mitigate moral hazard. In the insurance example these incentives often take the form of deductibles

and partial insurance, or the threat of higher premiums in response to accidents.

Similarly an employer may structure a contract to include a bonus/commission

rather than a fixed wage to induce an employee to work hard. Below we consider

an example of moral hazard, and show that a high price may signal an ability to

commit to providing a high quality product.

Suppose a cook can choose between producing a high quality meal (q = 1) and

a low quality meal (q = 0). Assume that the cost of producing a high quality meal

is strictly higher than a low quality meal (c1 > c0 > 0). For a meal of quality q,

and price p the benefit to the customer is q − p and to the cook is p − ci. So the

total social welfare is

q − p + p − ci = q − ci

and assume that 1−c1 > 0 > −c0 so that the high quality meal is socially efficient.

We assume that the price is set beforehand, and the cook’s choice variable is the

quality of the meal. Assume that fraction α of the consumers are repeat clients

who are informed about the meal’s quality, whereas 1 − α of the consumers are

uninformed (visitors to the city perhaps) and don't know the meal's quality. The

informed customers will only go to the restaurant if the meal is good (assume

p ∈ (0, 1)). These informed customers allow us to consider a notion of reputation

even though the model is static.

Now consider the decision of the cook as to what quality of meal to produce.

If they produce a high quality meal then they sell to the entire market so their

profits (per customer) are

p − c1


Conversely, by producing the low quality meal, and selling to only 1 − α of the

market they earn profit

(1 − α)(p − c0)

and so the cook will provide the high quality meal if

p − c1 ≥ (1 − α)(p − c0)

or

αp ≥ c1 − (1 − α)c0

where the LHS is the additional revenue from producing a high quality instead of

a low quality meal and the RHS is the associated cost. This corresponds to the

case

α ≥ (c1 − c0)/(p − c0).

So the cook will provide the high quality meal if the fraction of the informed

consumers is high enough. So informed consumers provide a positive externality

on the uninformed, since the informed consumers will monitor the quality of the

meal, inducing the chef to make a good meal.

Finally notice that price signals quality here: the higher the price the smaller

the fraction of informed consumers necessary to ensure the high quality meal. If the

price is low (p ≈ c1) then the cook knows he will lose p − c1 from each informed

consumer by producing a low quality meal instead, but gains c1 − c0 from each

uninformed consumer (since the cost is lower). So only if almost every consumer is

informed will the cook have an incentive to produce the good meal. As p increases

so does p − c1, so the more is lost for each meal not sold to an informed consumer,

and hence the lower the fraction of informed consumers necessary to ensure that

the good meal will be provided. An uninformed consumer, who also may not know

α, could then consider a high price a signal of high quality since it is more likely

that the fraction of informed consumers is high enough to support the good meal

the higher the price.
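The threshold just derived can be computed directly. A sketch of my own, using exact fractions and hypothetical cost and price values:

```python
# The cook produces the high quality meal iff the informed share satisfies
#   alpha >= (c1 - c0) / (p - c0),
# as derived in the text. (My own illustration; numbers are hypothetical.)
from fractions import Fraction

def min_informed_share(p, c0, c1):
    return (c1 - c0) / (p - c0)

lo_p = min_informed_share(p=Fraction(6, 10), c0=Fraction(1, 10), c1=Fraction(5, 10))
hi_p = min_informed_share(p=Fraction(9, 10), c0=Fraction(1, 10), c1=Fraction(5, 10))
# Higher price => smaller informed share needed (price signals quality):
assert hi_p < lo_p
print(hi_p, lo_p)  # → 1/2 4/5
```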

18.3 Second Degree Price Discrimination

Earlier we considered first and third degree price discrimination where the

seller can identify the type of potential buyers. In contrast, second degree price

discrimination occurs when the firm cannot observe the consumers' willingness to

pay directly. Consequently they elicit these preferences by offering different quantities or qualities at different prices. The consumer’s type is revealed through

which option they choose. This is known as screening.

Suppose there are two types of consumers. One with high valuation of the good

θh, and one with low valuation θl. θ is also called the buyers' marginal willingness


to pay. It tells us how much a buyer would be willing to pay for an additional unit

of the good. Each buyer’s type is his private information. That means the seller

does not know ex ante what type of buyer he is facing. Let α denote the fraction

of consumers who have the high valuation. Suppose that the firm can produce a

product of quality q at cost c(q) and assume that c′(q) > 0 and c′′(q) > 0.

First, we consider the efficient or first best solution, i.e., the case where the firm

can observe the buyers’ types. If the firm knew the type of each consumer they

could offer a different quality to each consumer. The condition for a consumer of

type i = h, l buying an object of quality q for price p voluntarily is

θiq − p(q) ≥ 0

and for the firm to participate in the trade we need

p(q) − c(q) ≥ 0.

Hence maximizing joint payoff is equivalent to

max_q θiq − p(q) + p(q) − c(q)

or

max_q θiq − c(q).

The FOC for each quality level is

θi − c′(q) = 0,

from which we can calculate the optimal level of quality for each type, q∗(θi). Since marginal cost is increasing by assumption we get that

q∗(θl) < q∗(θh),

i.e., the firm offers a higher quality to buyers who have a higher willingness to

pay in the first best case. In the case of complete information we are back to first

degree price discrimination and the firm sets the following prices to extract the

entire gross utility from both types of buyers:

p∗h = θhq∗(θh) and p∗l = θlq∗(θl)

so that buyers’ net utility is zero. In Figure 8, the buyers’ gross utility, which is

equal to the price charged, is indicated by the rectangles θiq∗i.

In many situations, the firm will not be able to observe the valuation/willingness

to pay of the consumers. That is, the buyers’ type is their private information. In


Figure 8: Price discrimination when types are known to the firm (quality q plotted against type θ; the rectangles θlq∗l and θhq∗h show each type's gross utility, and the marked area is the high type's information rent)

such a situation the firm offers a schedule of price-quality pairs and lets the consumers self-select into contracts. Thereby, the consumers reveal their type. Since

there are two types of consumers the firm will offer two different quality levels, one

for the high valuation consumers and one for the low valuation consumers. Hence

there will be a choice of two contracts (ph, qh) and (pl, ql) (also called a menu of choices). The firm wants high valuation consumers to buy the first contract and

low valuation consumers to buy the second contract. Does buyers’ private information matter, i.e., do buyers just buy the first best contract intended for them?

High type buyers get zero net utility from buying the high quality contract, but

positive net utility of θhq∗(θl) − pl > 0. Hence, high type consumers have an incentive to pose as low type consumers and buy the contract intended for the

low type. This is indicated in Figure 8 as “information rent,” i.e., an increase in

high type buyers’ net utility due to asymmetric information.

The firm, not knowing the consumers’ type, however, can make the low quality

bundle less attractive to high type buyers by decreasing ql or make the high quality contract more attractive by increasing qh or decreasing ph. The firm’s profit

maximization problem now becomes

max_{ph, pl, qh, ql} α (ph − c(qh)) + (1 − α) (pl − c(ql)). (33)

There are two types of constraints. The consumers have the option of walking away,

so the firm cannot demand payment higher than the value of the object. That is,


we must have

θhqh − ph ≥ 0 (34)

θlql − pl ≥ 0. (35)

These are known as the individual rationality (IR) or participation constraints that

guarantee that the consumers are willing to participate in the trade. The other

type of constraints are the self-selection or incentive compatibility (IC) constraints

θhqh − ph ≥ θhql − pl (36)

θlql − pl ≥ θlqh − ph, (37)

which state that each consumer type prefers the menu choice intended for him to

the other contract. Not all four of these constraints can be binding, since four binding constraints would fully pin down the prices and quality levels and leave the firm nothing to optimize. The IC for the low type (37) will not be binding because low types have no incentive to pretend to be

high types: they would pay a high price for quality they do not value highly. On

the other hand high type consumers’ IR (34) will not be binding either because

we argued above that the firm has to incentivize them to pick the high quality

contract. This leaves constraints (35) and (36) as binding and we can solve for the

optimal prices

pl = θlql

using constraint (35) and

ph = θh(qh − ql) + θlql

using constraints (35) and (36). Substituting the prices into the profit function

(33) yields

max_{qh, ql} α [θh(qh − ql) + θlql − c(qh)] + (1 − α) (θlql − c(ql)).

The FOC for qh is simply

α (θh − c′(qh)) = 0,

which is identical to the FOC in the first best case. Hence, the firm offers the high

type buyers their first best quality level q∗R(θh) = q∗(θh). The FOC for ql is

α(θl − θh) + (1 − α) (θl − c′(ql)) = 0,

which can be rewritten as

θl − c′(ql) − (α/(1 − α)) (θh − θl) = 0.


The third term on the LHS, (α/(1 − α))(θh − θl), is positive and is subtracted like an additional cost that arises

because the firm has to make the low quality contract less attractive for high

type buyers. Because of this additional cost we get that q∗R(θl) < q∗(θl): the quality level for low types is lower than in the first best situation. This is depicted in Figure . The low type consumers' gross utility and the high type buyers' information rent are both decreased. The optimal level of quality offered to

dq∗R(θl)/dα < 0

since the more high types there are the more the firm has to make the low quality

contract unattractive to them.

This analysis indicates some important results about second degree price discrimination:

1. The low type receives no surplus.

2. The high type receives a positive surplus of ql(θh − θl). This is known as an

information rent, that the consumer can extract because the seller does not

know his type.

3. The firm should set the efficient quality for the high valuation type.

4. The firm will degrade the quality for the low type in order to lower the rents

the high type consumers can extract.
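To make the quality distortion concrete, here is a sketch under an assumed quadratic cost c(q) = q²/2 (so c′(q) = q and the first best is q∗(θ) = θ); the cost function and parameter values are my own choices, not from the notes:

```python
# Screening with c(q) = q**2/2: the high type gets the first best quality
# q_h = theta_h, while the low type's quality solves the distorted FOC
#   q_l = theta_l - alpha/(1-alpha) * (theta_h - theta_l).
# (My own sketch; cost function and parameters are hypothetical.)
def screening_qualities(theta_l, theta_h, alpha):
    q_h = theta_h                                          # efficient at the top
    q_l = theta_l - alpha / (1 - alpha) * (theta_h - theta_l)
    return q_h, max(q_l, 0.0)   # quality cannot be negative (low type shut out)

q_h, q_l = screening_qualities(theta_l=1.0, theta_h=2.0, alpha=0.25)
assert q_h == 2.0    # high type gets the first best quality
assert q_l < 1.0     # low type's quality is distorted below first best
print(q_h, q_l)
```

Raising α pushes q_l further down, matching the comparative static dq∗R(θl)/dα < 0 derived above.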

19 Auctions

Auctions are an important application of games of incomplete information. There

are many markets where goods are allocated by auctions. Besides obvious examples such as auctions of antique furniture there are many recent applications. A leading example is Google's sponsored search auctions. Google matches advertisers

to readers of websites and auctions advertising space according to complicated

rules.

Consider a standard auction with I bidders, and each bidder i from 1 to I has a

valuation vi for a single object which is sold by the seller or auctioneer. If the bidder wins the object at price pi then he receives utility vi − pi. Losing bidders receive a payoff of zero. The valuation is often the bidder's private information so that

we have to analyze the uncertainty inherent in such auctions. This uncertainty is

captured by modelling the bidders’ valuations as draws from a random distribution:

vi ∼ F(vi).


We assume that bidders are symmetric, i.e., their valuations come from the same

distribution, and we let bi denote the bid of player i.

There are many possible rules for auctions. They can be either sealed bid or

open bid. Examples of sealed bid auctions are the first price auction (where the

winner is the bidder with the highest bid and they pay their bid), and the second

price auction (where the bidder with the highest bid wins the object and pays

the second highest bid as a price). Open bid auctions include English auctions

(the auctioneer sets a low price and keeps increasing the price until all but one

player has dropped out) and the Dutch auction (a high price is set and the price is

gradually lowered until someone accepts the offered price). Another type of auction

is the Japanese button auction, which resembles an open bid ascending auction,

but every time the price is raised all bidders have to signal their willingness to

increase their bid. Sometimes, bidders hold down a button as long as they want

to increase their bid and release when they want to exit the auction.

Let’s think about the optimal bidding strategy in a Japanese button auction,

denoted by bi(vi) = ti, where ti = pi is the price the winning bidder pays for the good. At any time, the distribution of valuations F and the number of remaining

bidders are known to all players. As long as the price has not reached a bidder’s

valuation it is optimal for him to keep the button pressed because he gets a positive

payoff if all other players exit before the price reaches his valuation. In particular,

the bidder with the highest valuation will wait longest and therefore receive the

good. He will only have to pay the second highest bidder’s valuation, however,

because he should release the button as soon as he is the only one left. At that

time the price will have exactly reached the second highest valuation. Hence, it

is optimal for all bidders to bid their true valuation. If the price exceeds vi they

release the button and get 0 and the highest valuation bidder gets a positive payoff.

In other words, the optimal strategy is

b∗i(vi) = vi.

What if the button auction is played as a descending auction instead? Then it

is no longer optimal to bid one's own valuation. Instead, b∗i(vi) < vi, because waiting until the price reaches one's own valuation would mean a missed chance to get a strictly positive payoff.

In many situations (specifically when the other players' valuations do not

affect your valuation) the optimal behavior in a second price auction is equivalent

to an English auction, and the optimal behaviour in a first price auction is equivalent to a Dutch auction. This provides a motivation for considering the second

price auction which is strategically very simple, since the English auction is commonly used. It’s the mechanism used in the auction houses, and is a good first

approximation of how auctions are run on eBay.


How should people bid in a second price auction? Typically a given bidder will

not know the bids/valuations of the other bidders. A nice feature of the second

price auction is that the optimal strategy is very simple and does not depend on

this information: each bidder should bid their true valuation.

Proposition 3. In a second price auction it is a Nash Equilibrium for all players

to bid their valuations. That is, b∗i = vi for all i is a Nash Equilibrium.

Proof. Without loss of generality, we can assume that player 1 has the highest

valuation. That is, we can assume v1 = maxi{vi}. Similarly, we can assume

without loss of generality that the second highest valuation is v2 = maxi>1{vi}.

Define

µi(vi, bi, b−i) = vi − pi if bi = maxj{bj}, and 0 otherwise

to be the surplus generated from the auction for each player i. Then under the

given strategies (b = v)

µi(vi, vi, v−i) = v1 − v2 if i = 1, and 0 otherwise.

So we want to show that no bidder has an incentive to deviate.

First we consider player 1. The payoff from bidding b1 is

µ1(v1, b1, v−1) = v1 − v2 if b1 > v2, and 0 otherwise, so that

µ1(v1, b1, v−1) ≤ v1 − v2 = µ1(v1, v1, v−1)

so player 1 cannot benefit from deviating.

Now consider any other player i > 1. They win the object only if they bid more

than v1 and would pay v1. So the payoff from bidding bi is

µi(vi, bi, v−i) = vi − v1 if bi > v1, and 0 otherwise, so that

µi(vi, bi, v−i) ≤ 0 = µi(vi, vi, v−i)

since vi − v1 ≤ 0. So player i has no incentive to deviate either.

We have thus verified that all players are choosing a best response, and so the

strategies are a Nash Equilibrium.

Note that this allocation is efficient. The bidder with the highest valuation

gets the good.
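Proposition 3 can be sanity-checked by simulation. A sketch of my own: fix truthful rivals and verify that no unilateral deviation raises a bidder's payoff.

```python
# Second price auction payoff (my own check of Proposition 3, not from the
# notes): with rivals bidding truthfully, deviating from one's valuation
# never pays. Ties are broken in favor of the lowest index.
def second_price_payoff(i, bids, values):
    winner = max(range(len(bids)), key=lambda j: bids[j])
    if winner != i:
        return 0.0
    price = max(b for j, b in enumerate(bids) if j != i)  # second highest bid
    return values[i] - price

values = [0.9, 0.6, 0.3]   # hypothetical valuations
for i, v in enumerate(values):
    truthful = second_price_payoff(i, values[:], values)
    for dev in [0.0, 0.1, 0.45, 0.75, 1.0, 1.5]:   # alternative bids
        bids = values[:]
        bids[i] = dev
        assert second_price_payoff(i, bids, values) <= truthful
print("truthful bidding is a best response for every bidder")
```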

Finally, we consider a first price sealed bid auction. There, we will see that it

is optimal for bidders to bid below their valuation, b∗i(vi) < vi, a strategy called bid shading. Bidder i's expected payoff is

max_bi (vi − bi) Pr(bi > bj for all j ≠ i) + 0 · Pr(bi < maxj≠i{bj}). (38)


Consider the bidding strategy

bi(vi) = cvi

i.e., bidders bid a fraction of their true valuation. Then, if all players play this

strategy,

Pr(bi > bj) = Pr(bi > cvj) = Pr(vj < bi/c). (39)

With valuations having a uniform distribution on [0, 1], (39) becomes

Pr(vj < bi/c) = bi/c

and (38) becomes

max_bi (vi − bi)(bi/c)

with FOC

(vi − 2bi)/c = 0

or

b∗i = vi/2.

Hence, we have verified that the optimal strategy is to bid a fraction of one's valuation; in particular, with two bidders, c = 1/2.
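The first price derivation can be verified numerically. A sketch of my own for the two-bidder uniform case, assuming the rival bids vj/2:

```python
# With two bidders and v_j ~ U[0,1], if the rival bids b_j = v_j/2, bidder i's
# expected payoff from bidding b is (v_i - b) * Pr(b > v_j/2) =
# (v_i - b) * min(2b, 1). A grid search confirms the maximizer is b = v_i/2.
# (My own numerical check; the valuation v = 0.8 is hypothetical.)
def expected_payoff(b, v, c=0.5):
    return (v - b) * min(b / c, 1.0)

v = 0.8
grid = [k / 1000 for k in range(1001)]
best = max(grid, key=lambda b: expected_payoff(b, v))
assert abs(best - v / 2) < 1e-3
print(best)  # → 0.4
```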

