Identification of dynamic models with aggregate shocks with an application to mortgage default in Colombia

Juan Esteban Carranza, Salvador Navarro∗

May 23, 2011

Abstract

We describe identification conditions for dynamic discrete choice models that include unobserved state variables that are correlated across individuals and across time periods. The proposed framework extends the standard literature on the structural estimation of dynamic models by incorporating unobserved serially correlated common shocks. The shocks affect all individuals’ static payoffs and the dynamic continuation payoffs associated with different decisions. We show that even in the simple binary choice optimal-stopping problem we study, the model is not identified when only market level data are used. That is, the unobserved aggregate states and their transition are only identified as long as there is cross-sectional variation in the observed states. We use the framework to estimate a model of mortgage default for a cohort of Colombian debtors between 1997 and 2004. Finally, we use the estimated model to study the effects on default of a class of policies and shocks that affected the evolution of mortgage balances in Colombia during the 1990’s.

∗ ICESI (Colombia) and University of Western Ontario, respectively. We thank Gadi Barlevy, Mariacristina De Nardi, Amit Gandhi, Jean-François Houde, Alvin Murphy, Krishna Pendakur, Chris Taber, Ken Wolpin and seminar participants at the Federal Reserve Bank of Chicago, Penn, Simon Fraser, Wisconsin, WUSTL, Yale, the 2009 North American meetings of the Econometric Society, the 2010 meeting of the SED and the 2010 Econometric Society World Congress. Direct correspondence to Carranza: [email protected] and Navarro: [email protected]

1 Introduction

In this paper we develop a framework for identifying dynamic structural models in the presence of unobservable states that are correlated both across individuals and over time. The correlation is caused by unobserved common states that are serially correlated.¹ The early literature on the estimation of dynamic discrete choice models was based on the assumption that all the unobserved heterogeneity is independent across individuals and over time.² More recent papers incorporate unobserved states that vary systematically across individuals but stay constant (as in Keane and Wolpin, 1994), are independent over time (as in Arcidiacono and Miller, 2008) or are correlated over time but independent across individuals (as in Erdem, Imai, and Keane, 2004 or Norets, 2009).³

The literature on the estimation of dynamic structural models with correlated unobserved common states (or shocks) is scarce. In the approach proposed by Altug and Miller (1998), the structure of the aggregate shocks is estimated separately and used as an input in the dynamic model. Such an approach is only practical when the aggregate shocks can be estimated from a separate model (e.g. a macroeconomic model or Euler equations). A paper closer to ours is Lee and Wolpin (2006), in which the aggregate shocks and their transition are computed throughout the estimation algorithm using a general equilibrium model, which is computationally demanding.

We prove that, even in a simple binary choice optimal-stopping model, it is not possible to identify the unobserved common states separately from their transition from market level data alone. This is important because in the Industrial Organization literature static discrete choice models of demand are identified using only market level data. Our result implies that extending this approach to dynamic models will require either micro level data or additional restrictions.
We show how to incorporate, identify and estimate unobserved common states that generate correlation both in the cross section and over time, using a standard micro data set. In contrast to Altug and Miller (1998) and Lee and Wolpin (2006), we show that estimating a dynamic model with aggregate unobserved shocks does not require the solution of an aggregate model. The identification and estimation of the common correlated states exploits the variation of the observed aggregate behavior implied by the microeconomic model, which is a piece of information that is not used directly by the existing literature.

Our specification of the dynamic model is based on a Markovian decision problem with finite horizon. Each period’s payoff depends on observed and unobserved state variables that vary systematically across individuals. We show that in this particular formulation of the dynamic model, the micro data contain enough information to infer the aggregate shocks and their transition separately. The aggregate shocks and their transitions can be separately identified if and only if there is individual-level variation in the observables.

We use the framework to estimate a dynamic mortgage default model using micro-level Colombian data spanning the years between 1997 and 2004. During this time mortgage default rates in Colombia were unusually high, due to an unprecedented economic downturn that was accompanied by a dramatic fall in home prices. The extent to which the fall in household incomes, the fall in home prices, and the fall in equity contributed separately to the unprecedented rates of default is a relevant policy question that can be answered with the proposed model. In addition, we use the estimated model to evaluate the impact of counterfactual policies, which cannot be evaluated with a model that does not account for the dynamic concerns of debtors. We show that, in the context of our data, the expectations of individuals regarding the evolution of equity had a substantial impact on default behavior. Moreover, we show that neither the level nor the expected evolution of income contributed significantly to default.

In the next section of the paper we describe our methodological framework. We formulate an optimal stopping problem with unobserved heterogeneity, describe our estimation approach, and discuss the identification of the different components of the model. In Section 3 of the paper we present the application of the model to the Colombian mortgage market. We describe the data, the estimation and the results. We then perform counterfactual simulations to illustrate the importance of income shocks, and evaluate the impact of the policies adopted by the Central Bank and the Colombian government in the mid-1990s. The paper concludes with a discussion of the limitations of the proposed framework.

¹ For simplicity, we refer to these unobserved correlated common states as aggregate shocks.
² Early papers include Rust (1987), Wolpin (1984), Pakes (1986) and Hotz and Miller (1993).
³ For a comprehensive review of the literature see, for example, Aguirregabiria and Mira (2002).

2 The methodological framework

2.1 A generic optimal stopping problem

Consider the standard optimal stopping problem of an individual $i \in \{1, ..., N\}$ at period $t$ who has to choose between two actions $d_{i,t} = j$: “stopping” ($j = 0$), which is an absorbing state, or “continuing” ($j = 1$) to face the same decision next period. For simplicity, we assume a finite horizon $T_i$ which may be different across individuals. For example, in our application $T_i$ is different across individuals, because mortgages have different term lengths across debtors.

Each choice generates a static payoff $u_{i,j,t} = u_j(X_{i,t}) + \varepsilon_{i,j,t}$. The payoff consists of a component $u_j(X_{i,t})$ that depends on a vector of observable (to the econometrician) states $X_{i,t}$, and of an additive unobserved (to the econometrician) state variable $\varepsilon_{i,j,t}$ that may be correlated across individuals and time periods.

We assume that the vector of observed states $X_{i,t}$ follows a first order Markov process. This process is assumed to be independent of the contemporaneous unobserved states, a common assumption in the literature. That is, if we let $\Lambda(X_{i,t} \mid \cdot)$ be the conditional cdf of $X$, then $\Lambda(X_{i,t} \mid X_{i,t-1}, \varepsilon_{i,0,t}, \varepsilon_{i,1,t}) = \Lambda(X_{i,t} \mid X_{i,t-1})$. The unobserved states $\{\varepsilon_{i,0,t}, \varepsilon_{i,1,t}\}$ are also assumed to be Markovian as described below. Hence, $\tilde S_{i,t} = \{X_{i,t}, \varepsilon_{i,0,t}, \varepsilon_{i,1,t}\}$ is the set of decision-relevant state variables for individual $i$ at time $t$.

Consider the Bellman representation of the problem of individual $i$ who, as of time $t - 1 < T_i - 1$, has not chosen $j = 0$ in the past:

$$\tilde V_t(\tilde S_{i,t}) = \max\left\{ u_1(X_{i,t}) + \varepsilon_{i,1,t} + \beta E\left[\tilde V_{t+1}(\tilde S_{i,j,t+1}) \mid \tilde S_{i,j,t}\right],\; u_0(X_{i,t}) + \varepsilon_{i,0,t} \right\}, \tag{1}$$

where $\beta$ is an exogenous discount factor that is assumed to be known. To simplify notation we assume that once an individual stops, he gets the static payoff of stopping and nothing else.⁴ At $t = T_i$ the continuation payoff of the problem is zero, so that:

$$E\left[\tilde V_{T_i+1}(\tilde S_{i,j,T_i+1}) \mid \tilde S_{i,j,T_i}\right] = 0. \tag{2}$$
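The recursion implied by (1) and (2) can be illustrated with a minimal Python sketch. It is not the specification estimated in the paper: it assumes, purely for illustration, that the payoff of stopping is normalized to zero, that the net unobserved shock is standard logistic (so the expected maximum has the closed form log(1 + exp(v))), that the observed net utility is constant over time, and that the path of common shocks is known to the agent. The function and argument names are hypothetical.

```python
import math


def softplus(v):
    """Return log(1 + exp(v)), i.e. E[max(v + e, 0)] for a standard logistic e."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def logistic_cdf(v):
    """Standard logistic CDF, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    z = math.exp(v)
    return z / (1.0 + z)


def solve_stopping_problem(net_utility, xi_path, beta):
    """Backward induction for a finite-horizon binary stopping problem.

    net_utility : assumed constant net flow utility of continuing
    xi_path     : assumed known common shocks, one per period
    beta        : discount factor
    Returns (values, continue_probs), both indexed by period 0..T-1.
    """
    horizon = len(xi_path)
    values = [0.0] * (horizon + 1)  # values[horizon] = 0 is the terminal condition
    probs = [0.0] * horizon
    for t in range(horizon - 1, -1, -1):
        # Net value of continuing before the idiosyncratic shock is added.
        index = net_utility + xi_path[t] + beta * values[t + 1]
        values[t] = softplus(index)     # expected value, shock integrated out
        probs[t] = logistic_cdf(index)  # probability of continuing
    return values[:horizon], probs
```

For instance, `solve_stopping_problem(0.0, [0.0, 0.0], 0.9)` gives a last-period continuation probability of 0.5 and a higher first-period probability, because continuing in the first period also preserves the option of continuing later.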

It has been shown before that this model is generically not identified non-parametrically even with uncorrelated unobserved states.⁵ Therefore, the mapping of the model into data has to be based partly on assumptions on the net observable utility, $u(X_{i,t}) = u_1(X_{i,t}) - u_0(X_{i,t})$, and/or on the stochastic properties of the unobserved states $\varepsilon$. In order to allow for a rich pattern of unobserved correlation, we decompose the unobserved states as follows:

$$\varepsilon_{i,1,t} - \varepsilon_{i,0,t} = \xi_t + \mu_i + \epsilon_{i,t}, \tag{3}$$

where $\epsilon_{i,t}$ is an idiosyncratic random variable distributed iid across individuals, choices and time periods. The term $\mu_i$, which we call individual heterogeneity, is an individual-specific unobserved state that stays constant over time. It is assumed to be distributed among the population of individuals according to a distribution $\Upsilon(\mu)$. The term $\xi_t$, which we will refer to as the aggregate shock, is a common unobserved state assumed to follow a first order Markov process. Both the distribution of the individual heterogeneity and the transition of the aggregate shocks have to be estimated simultaneously with the whole model.

Under this specification individual choices are correlated over time and across individuals, even after conditioning on the observed states. In addition, the unobserved heterogeneity can be allowed to depend on $X_{i,t}$, which would be equivalent to a parametric model with heterogeneous coefficients. The model is similar to standard dynamic discrete choice models, except for the presence of the shock $\xi_t$ which is allowed to be serially correlated. The importance of including this form of heterogeneity is that it permits individual choices to be correlated (in unobservable ways) in a given cross-section (since all individuals face the same shock), allowing this correlation to persist over time. We will refer to these shocks as aggregate shocks, but they can be understood more generally as the common component of the unobserved heterogeneity.

The model we specify nests the standard models in the literature. Specifically, if we set $\mu_i = \xi_t = 0$, all the unobserved heterogeneity in the model is iid and the model is similar to the models in Rust (1987), Wolpin (1987) and Hotz and Miller (1993). If we assume away the aggregate shocks so that $\xi_t = 0$, but account for a correlated individual shock $\mu_i \neq 0$, the model is similar to Keane and Wolpin (1994). In addition, if we allow $\mu_i$ to depend on $X$, the model is equivalent to a model with random coefficients.

⁴ Since $j = 0$ is an absorbing state, this is equivalent to simply redefining the reward function to be the expected present value of continuing in state $j = 0$ until $T_i$.
⁵ Rust (1994); see also Taber (2000) and Heckman and Navarro (2007) for conditions under which these models are semiparametrically identified.
In contrast to the models by Altug and Miller (1998) and Lee and Wolpin (2006), we do not specify where the aggregate shocks come from. In Section 2.3 we show that it is not necessary to do so; micro data alone are enough to identify the aggregate shocks and their transition separately. In a general equilibrium setup, the specification of a model for the determination of the aggregate shocks $\xi$ and their transition would be necessary for the computation of counterfactual equilibria, but not for the estimation of the model.

Let $S_{i,t} = \{X_{i,t}, \mu_i, \xi_t\}$ be the set of state variables excluding the idiosyncratic iid error. Define the expected value function as the expectation of the value function in (1) with respect to the idiosyncratic iid shock:

$$V_t(S_{i,t}) = E\left[\tilde V_t(S_{i,t}, \epsilon_{i,t})\right]. \tag{4}$$

Conditional on survival, the predicted probability that individual $i$ chooses $j = 1$ at time $t$ is given by:

$$\mathrm{Pr}_{i,t}(S_{i,t}) = \mathrm{Prob}\left[\epsilon_{i,t} < u(X_{i,t}) + \mu_i + \xi_t + \beta E\left[V_{t+1}(S_{i,t+1}) \mid S_{i,t}\right]\right], \tag{5}$$

where the continuation payoffs correspond to the conditional expectation of (4). Notice that this probability is not observed by the econometrician, since it depends on the realization of both the individual heterogeneity $\mu_i$ and the aggregate shock $\xi_t$. Given (5), let $\widetilde{\mathrm{Pr}}_i$ denote the probability of an individual history, which can be computed as the product of probabilities over the given sequence of choices, conditional on the realization of the individual heterogeneity and the aggregate shocks:

$$\widetilde{\mathrm{Pr}}_i(S_i) = \prod_{t=1}^{\bar T_i} \mathrm{Pr}_{i,t}(S_{i,t})^{d_{i,t}} \left[1 - \mathrm{Pr}_{i,t}(S_{i,t})\right]^{(1-d_{i,t})}, \tag{6}$$

where $S_i = \{S_{i,t}\}_{t=1,...,\bar T_i}$ is the matrix of states and $\bar T_i$ is the last time period at which an individual is observed making choices. In other words, $\bar T_i$ is either the time when individual $i$ first chooses $d_{i,t} = 0$, or the final period $T_i$ if he always chooses $d_{i,t} = 1$.
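The product in (6) can be written as a short Python helper. This is only a sketch: it takes the per-period continuation probabilities as given (in the model they depend on the unobserved heterogeneity and the aggregate shocks), and the function name is hypothetical.

```python
def history_probability(continue_probs, choices):
    """Probability of one observed choice history, as a product over periods.

    continue_probs : probability of choosing d = 1 in each observed period
    choices        : observed decisions (1 = continue, 0 = stop)
    """
    if len(continue_probs) != len(choices):
        raise ValueError("need one probability per observed choice")
    if 0 in choices[:-1]:
        # Stopping is absorbing, so no choices are observed after a 0.
        raise ValueError("a stopping decision can only be the last choice")
    probability = 1.0
    for p, d in zip(continue_probs, choices):
        probability *= p if d == 1 else 1.0 - p
    return probability
```

A debtor who continues twice and then stops, with continuation probabilities 0.9, 0.8 and 0.5, contributes `history_probability([0.9, 0.8, 0.5], [1, 1, 0])`, which is the product of 0.9, 0.8 and 1 - 0.5.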

2.2 Recovering the aggregate shocks

Before discussing identification, we first consider the problem of recovering the aggregate shocks, taking as given all remaining aspects of the model. That is, we provide conditions under which the aggregate shocks can be inferred from data on individual choices without the specification of an aggregate model. In Section 2.3 we discuss which of the remaining aspects of the model can potentially be recovered non- or semi-parametrically.

Assume that we have a population of $i = 1, ..., N$ individuals. For notational convenience, we assume that all individuals start to solve the problem simultaneously but then have potentially different non-random problem horizons $T_i$. The individuals are observed solving the described optimal stopping problem during a sequence of $\bar T = \max\{\bar T_1, ..., \bar T_N\}$ time periods, where $\bar T_i \le T_i$. In addition to the time horizons, the econometrician observes states $X_i = \{X_{i,1}, ..., X_{i,\bar T_i}\}$ and decisions $d_i = \{d_{i,1}, ..., d_{i,\bar T_i}\}$. Given the assumed exogeneity of the observed states, their transition can be recovered directly from the data.

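The likelihood, as a function of the aggregate shocks, is thus given by:

$$\ell(\xi) = \prod_{i=1}^{N} \int \left[ \prod_{\tau=1}^{\bar T_i} \mathrm{Pr}_{i,\tau}(S_{i,\tau})^{d_{i,\tau}} \left(1 - \mathrm{Pr}_{i,\tau}(S_{i,\tau})\right)^{1-d_{i,\tau}} \right] d\Upsilon(\mu) = \prod_{i=1}^{N} \int \widetilde{\mathrm{Pr}}_i(\mu, X_i, \xi)\, d\Upsilon(\mu), \tag{7}$$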

where the choice probabilities are integrated with respect to the initial distribution of the individual heterogeneity $\Upsilon(\mu)$.

Consider recovering the vector of aggregate shocks, $\xi$, using (7) and taking all the other elements of the model as given. Depending on the case, maximizing the likelihood can be difficult, especially if the number of periods $\bar T$ is large. Moreover, because the aggregate shocks enter both the static payoffs and the continuation payoffs, it is not clear whether the maximization problem has a unique solution.

The key insight for the identification results we derive is that the properties of the aggregate shocks can be inferred from the observed aggregate behavior. Let the survival probability of an individual $i$ up until time $\tau$ be $G_\tau(\cdot, \xi_{<\tau}) = \prod_{t=1}^{\tau-1} \mathrm{Pr}_{i,t}(S_{i,t})$, where $\xi_{<\tau} = \{\xi_1, ..., \xi_{\tau-1}\}$. Let also $F(x)$ be the distribution of observed states. In the following lemma we show that, at the population level, the predicted share of individuals who choose each alternative is identical to the observed shares.

Lemma 1. Consider the model described by the likelihood function (7). Given the remaining elements of the model, the following condition holds:

$$s_t = \int \mathrm{Pr}_{i,t}(x, \mu, \xi_t)\, \frac{G_t(x, \mu, \xi_{<t})}{\int G_t(x, \mu, \xi_{<t})\, d\Upsilon(\mu)\, dF(x)}\, d\Upsilon(\mu)\, dF(x), \tag{8}$$

where $s_t$ is the observed “share”, i.e. the proportion of individuals choosing $d_{i,t} = 1$ at each time $t$ among active agents.

Proof. See Appendix 1.

The proof of the lemma follows directly from the first order conditions of the likelihood maximization problem with respect to the aggregate shocks. Since it is an identification argument, it relies on equating observed densities to their population densities. Hence, using sample probabilities when evaluating the right hand side of (8) leads to consistent but potentially inefficient estimates of $\xi_t$. At the same time, the use of sample shares as opposed to population ones on the left hand side of (8) may lead to biased estimates.

Roughly speaking, this lemma implies that the aggregate shocks $\xi$ can be inferred from the aggregate choice behavior, provided that the transition of the unobserved aggregate shocks $\xi$ is known. An implication of the lemma is that, when the population shares $s_t$ are known exactly so that the data set is a combination of micro-level and market-level information, Lemma 1 can be used to “concentrate out” the estimation of the aggregate shocks $\xi$ from the estimation algorithm using the aggregate choice probabilities. Specifically, at each time $t$ the model generates a vector of aggregate predicted shares $\tilde s_t(\xi)$. If the model is correctly specified, (8) must hold:

$$s_t = \tilde s_t(\xi) \quad \forall t. \tag{9}$$
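To make the inversion in (9) concrete, the following Python sketch solves the share equations period by period in a deliberately simplified case. The simplifications are assumptions made for illustration, not the paper's model: agents are myopic (no continuation value), there is no individual heterogeneity, the idiosyncratic shock is standard logistic, and each observed type has a time-invariant net utility. Under these assumptions the predicted share among survivors is strictly increasing in the period's shock, so a bisection search finds the unique solution. All names are hypothetical.

```python
import math


def logistic_cdf(v):
    """Standard logistic CDF, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    z = math.exp(v)
    return z / (1.0 + z)


def predicted_share(xi_t, utilities, weights):
    """Survival-weighted average continuation probability across types."""
    total = sum(weights)
    return sum(w * logistic_cdf(u + xi_t) for u, w in zip(utilities, weights)) / total


def solve_shock(share, utilities, weights, lo=-30.0, hi=30.0, tol=1e-12):
    """Bisection on xi_t; valid because predicted_share is increasing in xi_t."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predicted_share(mid, utilities, weights) < share:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def recover_shocks(shares, utilities):
    """Recover one shock per period from shares observed among active agents."""
    weights = [1.0] * len(utilities)  # every type is fully active in period 1
    shocks = []
    for share in shares:
        xi_t = solve_shock(share, utilities, weights)
        shocks.append(xi_t)
        # Update survival weights: only those who continued remain active.
        weights = [w * logistic_cdf(u + xi_t) for u, w in zip(utilities, weights)]
    return shocks
```

In the dynamic model the shock also enters the continuation payoff, so the mapping from shocks to shares is no longer separable across periods; the sketch only illustrates the monotone inversion that the argument relies on.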

The expression in (9) generates a system of $\bar T$ non-linear equations, so that $\xi$ can be solved for directly under the conditions that we discuss below. Equation (9) is similar to the market share equations used to estimate demand models for differentiated goods, as in Berry, Levinsohn, and Pakes (1995). In these models the aggregate shocks $\xi$ correspond to the unobserved product attributes common to consumers who purchase the same good. Lemma 1 suggests that if demand is dynamic (e.g. if goods are durable or storable), the common shocks could still be inferred from market-level shares. The only difference is that the market shares in the dynamic model include the continuation payoffs associated with each choice. Notice, however, that this result is obtained under the assumption that the transition of the aggregate shocks is known. As we show below, when only market level data are observed (as in BLP), one cannot separate $\xi$ from their transition. To the best of our knowledge, the literature has not yet established general conditions under which this class of models is identified.

In the context of our model, identification requires that the vector $\xi'$ that solves (9) be always defined. Moreover, the vector $\xi'$ should be unique, at least locally, around the true vector $\xi^*$. The following lemma establishes sufficient conditions under which the solution to (9) exists and is unique. The proof of this lemma, shown in Appendix 1, relies on the monotonicity of the average predicted default rates (8) with respect to the aggregate shock.

Lemma 2. Assume that (i) $E_t[\xi_{t+1} \mid \xi_t] = h(\xi_t)$ and that (ii) $h'(\xi_t) > -(1/\beta)$ for all $\xi_t$. Then, the system of $\bar T$ equations implied by $s_t = \tilde s_t(\xi)$ for $t = 1, ..., \bar T$ has a unique solution $\xi$.

Proof. See Appendix 1.

The conditions for the lemma to be true are far from necessary. They are sufficient to guarantee that (9) is bounded and monotone for every point on the support of $X$ and $\mu$. They imply restrictions that are usually natural in empirical environments. For example, if the aggregate shocks follow a linear autoregressive process, a sufficient condition for the lemma to hold is that the process be stationary.

Lemmas 1 and 2 will be used to show our main identification result in Section 2.3. In practice they imply that the model can be estimated using standard techniques. Moreover, if there is aggregate information on the market-level shares of each decision, then the estimation of the aggregate shocks $\xi$ can be concentrated out from the likelihood maximization algorithm. In this case, (9) would be used as a separate restriction, thereby reducing the computational dimension of the estimation algorithm. Specifically, under a parametric specification of the model, it can be estimated by maximizing the likelihood (7) over the parameters $\theta$, solving for $\xi$ from (9) along the estimation algorithm:

$$\max_{\theta}\; \ell(\theta, \xi(\theta)), \tag{10}$$

where $\xi(\theta)$ solves (9).

2.3 Identification of the model

We now discuss the conditions under which the identification of the model is possible. The main problem lies in the separate identification of the aggregate shocks and their transition. Our main contribution is to show that identification from choice data is not possible when only market-level data are available. When micro-level information is available so that there is variation in the observed states across individuals, identification follows.

Except for the presence of the aggregate shocks $\xi$ and their transition probabilities, the choice probabilities in (5) are similar to the choice probabilities in standard empirical dynamic models with unobserved heterogeneity. Therefore, the identification of the utility function and the distribution of $\mu$ is based on arguments similar to those in the standard literature. We provide a brief discussion of their identification and then discuss in detail the identification of the aggregate shocks $\xi$ and their transition probabilities.

As pointed out by Taber (2000) and Heckman and Navarro (2007), the finite horizon of the problem facilitates the nonparametric identification of dynamic discrete choice models. We briefly describe next how their argument works. Notice that at $T_i$ the continuation payoffs of the problem are zero. Therefore, the probability that individual $i$ chooses $d_{i,T_i} = 1$, obtained from (5),

$$\mathrm{Pr}_{i,T_i}(S_{i,T_i}) = \mathrm{Prob}\left(u(X_{i,T_i}) + \xi_{T_i} + (\mu_i + \epsilon_{i,T_i}) > 0\right), \tag{11}$$

does not contain a continuation value. In this terminal period, $\xi_{T_i}$ is simply the constant in the model for individuals sharing the same terminal period $T_i$. In limit sets, where one can control for the dynamic selection (survival up to $T_i$), one can use standard arguments to identify the utility function $u(\cdot)$, the constant $\xi_{T_i}$ and the nonparametric distribution of $(\mu_i + \epsilon_{i,T_i})$. For example, if utility is linear the arguments in Manski (1988) can be used for identification. Alternatively, if the utility function belongs to the "Matzkin" class of functions, then the analysis in Matzkin (1992) would allow us to recover the utility function nonparametrically.⁶ Since $T_i$ represents different periods for different individuals, one can identify the distribution of $(\mu_i + \epsilon_{i,T_i})$ looking separately at groups of individuals with different terminal periods $T_i$. From the repeated observations of the marginal distribution of $(\mu_i + \epsilon_{i,T_i})$ over time, one can use deconvolution arguments (Kotlarski, 1967) to recover the distribution of $\mu_i$.

The novel part of this paper is the separate identification of the aggregate shocks from their transition. Intuitively, the identification of the aggregate shocks comes from the variation in the data of the aggregate behavior, a feature which is not fully exploited in the standard literature. In practice, our estimation approach is equivalent to a standard estimation of a Markovian decision model, with the “addition” of the “aggregate” restriction (9), which directly identifies the aggregate shocks.

The separate identification of the levels $\xi$ of the aggregate shocks and their transition probabilities needs to be explained in detail. From inspecting (5) it can be seen that both the aggregate shocks $\xi$ and their transition probabilities enter the continuation payoffs.
Moreover, since $\xi$ enters the flow utility additively, it can potentially happen that changes in $\xi$ are offset by changes in their expected continuation values and hence generate identical predictions, so that they would not be separately identified.

We have two sources for the separate identification of these two sets of unobservables. On one hand, notice from (11) that as we go over groups of individuals with different terminal periods $\{T_1, ..., T_N\}$, the transition probabilities for the aggregate shocks do not enter the choice probabilities. Therefore, the aggregate shocks are identified from the constant of the binary choice problem. If we observe individuals who face their terminal period at each time period of our sample, $\xi$ will be identified. Since we can identify $\xi$ for different periods we can, in principle, recover their transition probabilities, $f(\xi_t \mid \xi_{t-1})$, nonparametrically in the domain of the recovered $\xi$.

The second, and more general, source of identification comes from the choice probabilities themselves. Assume that the Markov process of the aggregate shocks can be characterized by a finite parameter vector $\rho_\xi$. We show that $\xi$ and $\rho_\xi$ will be separately identified, even in a sample of individuals who all face the same terminal period and/or a very short panel. To see this, notice that for all $t$, at any value $\rho_\xi$ of the transition parameters, our estimation algorithm looks for the unique vector $\xi'$ that satisfies (9), which we can rewrite as follows:

$$s_t = \int \mathrm{Pr}_{i,t}(x, \mu, \xi'_t; \rho_\xi)\, \frac{G_t(x, \mu, \xi'_{<t}; \rho_\xi)}{\int G_t(x, \mu, \xi'_{<t}; \rho_\xi)\, d\Upsilon(\mu)\, dF(x)}\, d\Upsilon(\mu)\, dF(x) \quad \text{for } t = 1, ..., \bar T. \tag{12}$$

⁶ We describe these conditions in Appendix 3.

Consider trying to estimate the model from market-level data alone by matching the predictions of the model to the observed data. For any value of $\rho_\xi$, Lemma 2 shows that we can find a unique vector $\xi$ that solves (12) exactly. But since no additional data are available, as we change $\rho_\xi$, the algorithm will simply find other vectors $\xi$ that match the data perfectly and the model would not be identified.

This is the main difficulty for the separate identification of the shocks and their transitions. According to Lemmas 1 and 2, for any value of $\rho_\xi$ that satisfies the conditions in Lemma 2, a corresponding vector of aggregate shocks $\xi$ that uniquely solves the aggregate condition can be found. Hence, the aggregate condition (12), which corresponds to the data commonly employed for demand estimation in industrial organization, does not contain enough information to separately identify them without further restrictions.

In Proposition 1 below we formally show that access to individual level data allows us to separately identify $\xi$ from its transition $\rho_\xi$. Intuitively, as we change $\rho_\xi$, $\xi$ is uniquely determined by (12). Hence the “average choice probability” in (12) does not change. The sample likelihood, however, depends on the product of the individual choice probabilities. Hence, provided that there is individual variation in the observed states, the value of the likelihood will necessarily change as $\rho_\xi$ and $\xi$ change and the model is (at least locally) identified.

Proposition 1. Take the utility function, the transition of the observed states and the distribution of $\mu$ as given from our arguments before. Consider the model with likelihood given by (7) as a function of the aggregate shocks and their transition. Assume that the conditions in Lemmas 1 and 2 hold. Let $\xi^*$ and $\rho_\xi^*$ be the true values of the aggregate shocks and their transition parameters. Then the parameter vectors $\rho_\xi$ and $\xi$ are generally identified around $\rho_\xi^*$ and $\xi^*$ if and only if there exist individuals $i$ and $i'$ such that $X_i \neq X_{i'}$.

Proof. See Appendix 1.

The proposition establishes the identification of $\rho_\xi$, conditional on the utility function and the distribution of individual unobserved heterogeneity, whose identification was explained before. Let $\hat\rho_\xi$ and $\hat\xi$ be the estimates of $\rho_\xi$ and $\xi$ obtained from the model. The proposition implies that the identification of $\rho_\xi$ is formally independent from the identification of $\xi$. Therefore, if one were to use $\hat\xi$ to estimate $\rho_\xi$, the result might be substantially different from $\hat\rho_\xi$, especially in short panels. Additional restrictions can be added to (10) to guarantee the consistency of the transition implied by $\hat\xi$ and the one estimated from the choice probabilities, $\hat\rho_\xi$, which might be desirable in long panels. More importantly, however, the result also implies that the choice probabilities contain enough information to distinguish the perception of individuals about how the aggregate shocks evolve from the actual transition implied by the realized $\xi$.

The identification of the model from individual level data, while not trivial, is not completely surprising. The more important result is the non-identification of the model without micro level data (i.e. with no individual variation in the choice probabilities). There is a growing literature on the estimation of structural dynamic models of demand using market-level data. Our result highlights the identification limits for this general class of models and the need for imposing additional restrictions on the unobservables.⁷

Given that our model is a partial equilibrium model, the aggregate shocks and their transition are taken as given. They are identified from the micro-data, regardless of the aggregate model that generated them. Depending on the case, in order to compute counterfactual equilibria, the specification of a macroeconomic model tying together the determination of the aggregate shocks and the observed states might be necessary (as in Lee and Wolpin, 2006). Our result implies that such computation is not necessary for the estimation of the model, but only for the simulation of counterfactual equilibria after the model is estimated.

⁷ For example, the papers by Carranza (2007) and Gowrisankaran and Rysman (2006) estimate demand for durable goods using market-level data imposing additional restrictions on the distribution of the common shocks. In the paper by Hendel and Nevo (2006) the demand for a storable good is estimated with micro level data and additional structural assumptions.
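The non-identification argument of this section can be illustrated numerically with a stylized two-period example in Python. The setup is an assumption made for illustration and is much simpler than the model above: there is no individual heterogeneity, the idiosyncratic shock is standard logistic, each type has a time-invariant net utility, and agents expect the second-period shock to equal `rho` times the first-period shock with certainty. For any candidate `rho`, `fit_shocks` finds shocks that reproduce the aggregate shares exactly; what differs across values of `rho` is how the first-period choice probabilities are distributed across types. All names are hypothetical.

```python
import math


def logistic_cdf(v):
    """Standard logistic CDF, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    z = math.exp(v)
    return z / (1.0 + z)


def softplus(v):
    """log(1 + exp(v)): expected value of the terminal-period problem."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def continue_probs(xi1, xi2, rho, utilities, beta):
    """Per-type continuation probabilities in periods 1 and 2."""
    # Period 1 includes the perceived continuation value, which depends on rho.
    p1 = [logistic_cdf(u + xi1 + beta * softplus(u + rho * xi1)) for u in utilities]
    # Period 2 is terminal, so only the realized shock matters.
    p2 = [logistic_cdf(u + xi2) for u in utilities]
    return p1, p2


def aggregate_shares(xi1, xi2, rho, utilities, beta):
    """Shares of continuers among active agents in each period."""
    p1, p2 = continue_probs(xi1, xi2, rho, utilities, beta)
    share1 = sum(p1) / len(p1)
    share2 = sum(a * b for a, b in zip(p1, p2)) / sum(p1)  # survivors only
    return share1, share2


def bisect_increasing(func, target, lo=-30.0, hi=30.0, tol=1e-12):
    """Solve func(x) = target for an increasing func by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_shocks(shares, rho, utilities, beta):
    """Shocks that match both aggregate shares for a given candidate rho."""
    xi1 = bisect_increasing(
        lambda x: aggregate_shares(x, 0.0, rho, utilities, beta)[0], shares[0])
    xi2 = bisect_increasing(
        lambda x: aggregate_shares(xi1, x, rho, utilities, beta)[1], shares[1])
    return xi1, xi2
```

With a single type (`utilities = [0.5]`), the fitted choice probabilities coincide with the shares for every `rho`, so nothing in the data distinguishes one transition from another. With two types (for example `utilities = [-1.0, 2.0]`), shares generated under one `rho` are still matched exactly under another, but the type-specific first-period probabilities returned by `continue_probs` differ, which is the variation a micro-level likelihood can use.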

3 An application to the Colombian mortgage market

We now present an empirical dynamic model of mortgage default that we estimate with Colombian data. The data cover the years 1997 to 2004, during which the Colombian economy experienced economic and financial turmoil. Moreover, during this time mortgage default rates were very high and the mortgage financing system was put under considerable stress. We follow our framework and model mortgage default as an optimal stopping problem, in which debtors choose each period whether to default or to keep on making the mortgage payments. We presume that default behavior was affected by unobserved common shocks related to the macroeconomic environment, which are difficult to capture entirely with a set of observable variables. Therefore, including them in the model is crucial. To understand the roots of the high default rates observed in Colombia in these years, in the next subsection we describe the history and some institutional details of the Colombian mortgage financing system. Next, we describe the data. In the following sections, we describe our model and its estimation. Finally, we present some counterfactual simulations that evaluate the impact of policy decisions and income shocks on observed default.

3.1 The Colombian mortgage financing system during the 1990’s

The Colombian mortgage financing system was established in the 1970’s. Its centerpiece was the set of mortgage banks whose only purpose, by law, was to fund housing construction projects. In order to guarantee enough funding, these banks were the only institutions allowed by the government to issue interest-bearing savings accounts.8 In addition, mortgage loans were denominated in a constant value unit called “UPAC”,whose value changed over time according to a regulated rate.9 This rate, called the “monetary correction”, was set each month by the Central Bank and was supposed to reflect the inflation rate. The goal of the UPAC was to protect banks and debtors against inflation risk and to facilitate the long-run financing of housing projects. Each month, debtors had to pay a portion of the outstanding balance of their debt. The remaining balance was updated according to the “monetary correction”. In addition, each month debtors made an interest payment on the balance. This additional interest rate was fixed for the lifetime of the loan and was not set on a debtor-by-debtor basis. Instead, before individual homes were sold, it was negotiated between the mortgage bank and the developer 8 9

8 Regular commercial banks had exclusive rights to issue checking accounts bearing no interest.
9 UPAC stands for Unidad de Poder Adquisitivo Constante: Constant Purchasing Power Unit.


of the construction of the housing project. Until the early 1990’s the monetary correction tracked the inflation rate closely. This changed when the government decided to liberalize the financial sector and allowed commercial banks to offer savings accounts, which until then could only be offered by the mortgage banks. The government decided to tie the “monetary correction” to a market interest rate, which in practice meant that interest was added over time to the balance of the debts.10 During these years, the Colombian exchange rate was fixed and the interest rate was low. Then, in the mid 1990’s, the country experienced substantial capital outflows, as did many other emerging economies. Interest rates increased to unprecedented levels. As home prices and household incomes fell, mortgage balances, which were now tied to the interest rate, ballooned. By the end of the decade, and due to the default rates observed in the data, mortgage financing in Colombia came to a halt and was only reestablished several years later under a different regulatory framework. One of the key policy questions raised by the 1990’s housing crisis is the extent to which the observed default rates were caused by the change in mortgage financing policies, and the extent to which they were a consequence of the fall in income. Our model allows us to measure the effect of changes in each variable on the default probabilities. Moreover, it permits the simulation of the effect of counterfactual policies.

3.2 Description of the data

Our analysis is based mainly on a data set containing information on a random sample of mortgages that were outstanding between 1997 and 2004. We observe the term length, the monthly payment history of each mortgage, and the original and current value of the mortgaged home. The total number of loans contained in the main data set is 16,000. However, this set of mortgages includes loans that started at many different points in time, most of them before 1997. To avoid the problem of length-biased sampling, we work only with those loans started in or after the year 1997.11 After eliminating from our sample those loans with incomplete or inconsistent payment histories, there is a total of 2,486 loans, which are observed from the time they start in 1997/1998 until 2004.12

10 The rationale behind this measure was to compensate the mortgage banks for the increased competition for deposits among financial institutions. It was also meant to help the mortgage banks fund their outstanding long-run mortgage liabilities.
11 See Heckman (1987) for an explanation of the problems associated with a length-biased sample, which is also known as “sampling from a stock”.
12 For a total of 14,250 observations. For a study of the default behavior observed in the whole sample see Carranza and Estrada (2007).


The data set contains the price of each home as reported by the bank at the time the loan started. We use regional housing price indices constructed by the Colombian Central Bank to update the expected prices of individual homes at any point in time. In addition, all data is aggregated into quarters, so that default observations are not confounded with missed payments or coding errors. All variables are expressed in constant 1997 real Colombian pesos. In the data it is observed that some debtors stop making their payments, sometimes only temporarily and sometimes definitively. We assume that loans that accumulate past due payments of more than 3 months are defaulted. Default is thus defined as the event in which the number of past due payments in a loan history changes from 3 or less to more than 3. After a loan is defined to be defaulted, it is dropped from the sample.13 Table 1 contains summary statistics of the main data set.14 The number of loans in the data set increases during the first four quarters of 1997 as new loans are initiated, until reaching 2,486 which is the total number of loans in the sample. As a reflection of the high number of defaults observed in the sample, the total number of non-defaulted loans (shown in column 3) decreases gradually over time after 1998. The default rate is shown in column 4. It is defined as the number of defaults per quarter over the total number of outstanding loans. It reaches its peak at a rate higher than 6% during the fourth quarter of 1999. By the end of the sample, more than half of the loans were defaulted. To give a sense of the characteristics of the defaulted loans, we show the average price of homes with outstanding loans (column 5) and the average price of all homes in the sample (column 6). Notice that up until the middle of 1999, the average price of homes with outstanding mortgages was similar to the average price of all homes in the sample. 
This suggests that until then defaults were spread across homes of all prices. After 1999, the price of homes with outstanding loans was lower than the average price of all homes in the sample, implying that later defaults were concentrated among mortgages of relatively more expensive homes. Since debtors are not required to report their income to the banks over time, the data set contains no information on the income of debtors. This is a common shortcoming among similar datasets since, in general, banks do not collect information on income of debtors over

13 The default rate based on this definition is highly correlated with default rates based on longer default periods. The 3-month threshold was chosen because it coincides with the regulatory threshold that forces banks to set aside funds to cover the likely loss generated by the default.
14 Since default is inferred from the change in the number of past due mortgage payments, no default is reported during the first period of the sample.


time. In order to control for the unobserved variation in income, we obtained survey data from the Colombian national statistics agency (DANE) on income and mortgage payments of a random sample of Colombian households.15 Specifically, we selected households in the sample that reported having a home loan. For each household, we simulate several income draws from the data to integrate out this part of the unobserved heterogeneity (i.e. the unobserved income). The draws are taken from the corresponding quintile of the distribution of income, ordered according to the monthly mortgage payments. We assume that the distribution of income, conditional on monthly payments, matches the distribution of income, conditional on the ratio of balance to remaining term. By comparing estimates of our model with and without income data, we are also able to assess the effectiveness of our model in capturing the effect of unobserved variables on default.
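The default definition used in this section (past-due payments moving from 3 or fewer to more than 3, with defaulted loans dropped afterwards) can be illustrated with a minimal sketch. The function names and the toy payment histories are hypothetical, and the sketch assumes that all loans start in the same period and that each history is a list of past-due counts per period.

```python
from typing import List, Optional

DEFAULT_THRESHOLD = 3  # number of past-due payments, as in the definition in the text


def default_period(past_due: List[int]) -> Optional[int]:
    """Index of the first period in which past-due payments move from
    3 or fewer to more than 3; None if that never happens.
    No default can be recorded in the first period (index 0)."""
    for t in range(1, len(past_due)):
        if past_due[t - 1] <= DEFAULT_THRESHOLD < past_due[t]:
            return t
    return None


def default_rates(histories: List[List[int]]) -> List[float]:
    """Per-period default rate: defaults in period t divided by the loans
    still outstanding in t. Loans are dropped after they default."""
    stops = [default_period(h) for h in histories]
    n_periods = max(len(h) for h in histories)
    rates = []
    for t in range(n_periods):
        at_risk = sum(
            1 for h, s in zip(histories, stops)
            if t < len(h) and (s is None or s >= t)
        )
        defaults = sum(1 for s in stops if s == t)
        rates.append(defaults / at_risk if at_risk else 0.0)
    return rates


# Toy histories (illustrative numbers only).
toy = [[0, 1, 4, 5], [0, 0, 0, 0], [2, 4, 4]]
toy_rates = default_rates(toy)
```

In the toy example the third loan defaults in period 1 and the first in period 2, so the denominator shrinks as defaulted loans leave the sample.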

3.3 An empirical model of default

Consider the discrete choice problem of a mortgage debtor who is deciding whether to default or continue making the mortgage payments on his home. The choice of defaulting generates a payoff associated with the increased probability of foreclosure, a more restrictive access to the credit markets in the future, etc. Continuing to make the mortgage payments generates a static payoff associated with the continued enjoyment of the home, plus the option of making the same decision the next period (i.e. the continuation value). We study the behavior of mortgage holders (“debtors”), who live in a mortgaged piece of real estate (“home”). Let $t$ index calendar time, and let $T_i$ be the date at which debtor $i$’s mortgage ends. The utility that debtor $i$ gets from the home each date $t$ is given by the following function:

$$\tilde{u}(q_{i,t}, y_{i,t}, m_{i,t}, \varepsilon^{u}_{i,t}) = \lambda_0 + \lambda_q q_{i,t} + \lambda_y y_{i,t} + \lambda_m m_{i,t} + \varepsilon^{u}_{i,t}, \tag{13}$$

where $q_{i,t}$ is the subjective home quality, $y_{i,t}$ is the household income and $m_{i,t}$ are the mortgage payments. The state variable $\varepsilon^{u}_{i,t}$ is unobserved (to the econometrician) and incorporates time- and consumer-specific variables that affect default, e.g. the opportunity cost of each debtor’s resources.
15 The data correspond to the quarterly household survey collected by the DANE. The survey collects demographic and economic information of a random sample of households. All households are asked their household income. In addition, once a year they are asked whether they have a mortgage or not and the corresponding monthly payments.


Since no home attributes are observed in our dataset, we further assume that the unobserved “quality” of homes $q_{i,t}$ is:

$$q_{i,t} = \kappa + \varepsilon^{q}_{i,t}, \tag{14}$$

where $\varepsilon^{q}_{i,t}$ is a random variable that captures unobserved home attributes and that is potentially correlated over time and across debtors. Any systematic differences in the subjective home quality across debtors will be captured by the correlation structure of the error, which we describe in detail in Section 3.4 below. In our data set we have no information on the required payments $m_{i,t}$ of each debtor. However, it is known that the required payments are a function of mortgage balances $b_{i,t}$ and the remaining term $L_{i,t} = T_i - t$:

$$m_{i,t} = \gamma_0 + \gamma_b b_{i,t} + \gamma_L L_{i,t} + \varepsilon^{m}_{i,t}, \tag{15}$$

where $\varepsilon^{m}_{i,t}$ is an unobserved random term that captures the differences in the fixed interest rates across mortgages. We assume that “default” leads to an absorbing state. Let $W_{i,t}$ denote the value for individual $i$ of defaulting on her mortgage at time $t$. Specifically, the individual may be waiting to see whether the following period she can pay back her dues; she may try to sell the home and cash the difference between price and loan balance; she may let the bank take over the property to cover her obligation; finally, she could also just stop making payments indefinitely and face forfeiture or a renegotiation with the bank. The resulting value of default $W_{i,t}$ is the weighted sum of payoffs across the random scenarios just described. We assume that $W_{i,t}$ has the following linear reduced form:

$$W_{i,t} = \omega_0 + \omega_y y_{i,t} + \omega_\pi \bar{\pi}_{i,t} + \omega_b b_{i,t} + \varepsilon^{w}_{i,t}, \tag{16}$$

where $\bar{\pi}_{i,t}$ is the expected price of the home at time $t$, $b_{i,t}$ is the balance of the debt, $y_{i,t}$ is the debtor’s income and $\varepsilon^{w}_{i,t}$ are other unobserved (to the econometrician) attributes. These variables directly enter the payoffs of the individual scenarios arising after a default decision, as discussed above. Group the unobserved components into one error term $\bar{\varepsilon}_{i,t} = \lambda_q \varepsilon^{q}_{i,t} - \lambda_m \varepsilon^{m}_{i,t} + \varepsilon^{u}_{i,t} - \varepsilon^{w}_{i,t}$. Assume that the vector of state variables, $\tilde{S}_{i,t} = \{\bar{\pi}_{i,t}, y_{i,t}, b_{i,t}, L_{i,t}, \bar{\varepsilon}_{i,t}\}$, follows a first-order Markov process. We can obtain the value of the debtor’s problem at each point in time as a function of variables that can be mapped to the data and of unobserved random variables:

$$\tilde{V}_{i,t}(\tilde{S}_{i,t}) = \max_{\{\text{continue},\,\text{default}\}} \left\{ \tilde{u}(\tilde{S}_{i,t}) + \beta E\left[\tilde{V}_{i,t+1}(\tilde{S}_{i,t+1}) \mid \tilde{S}_{i,t}\right],\; W(\tilde{S}_{i,t}) \right\}. \tag{17}$$

We normalize the continuation payoff of the problem at time $T_i$ to zero:

$$E\left[\tilde{V}_{i,T_i+1}(\tilde{S}_{i,T_i+1}) \mid \tilde{S}_{i,T_i}\right] = 0. \tag{18}$$

If all the state variables and their transitions are known, this function can be computed recursively starting from the last period. The probability that debtor $i$ does not default at time $t$ is:

$$\begin{aligned} &\mathrm{Prob}\left[\tilde{u}(\tilde{S}_{i,t}) + \beta E\left[\tilde{V}_{i,t+1}(\tilde{S}_{i,t+1}) \mid \tilde{S}_{i,t}\right] - W(\tilde{S}_{i,t}) > 0\right] \\ &\quad = \mathrm{Prob}\left[\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 b_{i,t} + \zeta_3 L_{i,t} + \zeta_4 y_{i,t} + \bar{\varepsilon}_{i,t} + \beta E\left[\tilde{V}_{i,t+1}(\tilde{S}_{i,t+1}) \mid \tilde{S}_{i,t}\right] > 0\right], \end{aligned} \tag{19}$$

where the parameters to be estimated $\zeta = \{\zeta_0, \zeta_1, \zeta_2, \zeta_3, \zeta_4\}$ are linear combinations of the underlying preference parameters. Since both price $\bar{\pi}_{i,t}$ and balance $b_{i,t}$ enter the specification of the model, equity is already implicitly accounted for. In order to analyze in more detail the role played by equity, we also estimate models with an additional effect when equity $(\bar{\pi}_{i,t} - b_{i,t})$ becomes negative. We do this by estimating specifications of the model that include an additional term $\zeta_5 \mathbf{1}(\bar{\pi}_{i,t} - b_{i,t} < 0)$ in (19), where $\mathbf{1}(A)$ takes value one when $A$ is true, and zero otherwise. This specification of the optimal default problem highlights the importance of expectations in determining default decisions. By making mortgage payments a debtor is also purchasing an option to default in the future. The value of the option depends on the expected evolution of the relevant state variables, hence expectations play a key role in default decisions. If the value of the option is high enough, debtors may choose not to default, even if they have negative equity. In other words, even if the value of the underlying asset is below its price, the value of the option associated with being able to choose to default in the future may be enough to compensate for the current equity loss.16 The structural model allows for the evaluation of policies and shocks that cannot be

16 In Appendix 2 we use a very simple two-period income-maximizing model to illustrate the point we are making: that the dynamic nature of the problem introduces an option value that makes negative equity only a necessary but not a sufficient condition for default. Furthermore, as we show, it is easy to design a policy that keeps the value of equity constant but changes the default incentives by changing the timing of payments.


evaluated with “reduced form” methods. In particular, we can evaluate policies that affect the expected evolution of the states, but that do not affect their current values.17 It is also important that the model accounts for the potential correlation of the unobserved aggregate shocks affecting everyone’s decisions. As documented in earlier work by Carranza and Estrada (2007), most of the variation of default over time in Colombia cannot be explained by cross-sectional variation in micro-level factors.18
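To make the option-value argument concrete, the recursion in (17) with the terminal normalization (18) can be sketched on a toy problem. This is only an illustration under stated assumptions: the two-state Markov chain, the payoff numbers and the name `solve_stopping` are hypothetical and are not part of the estimated model, which has continuous states and unobserved shocks.

```python
import numpy as np


def solve_stopping(u, w, P, beta, T):
    """Backward induction for V_t(s) = max{u(s) + beta * E[V_{t+1} | s], W(s)}
    over periods t = 0, ..., T, with the continuation value after T set to
    zero (the normalization in (18)). States live on a finite grid with
    transition matrix P, which is an assumption of this sketch."""
    u, w, P = np.asarray(u, float), np.asarray(w, float), np.asarray(P, float)
    V = np.zeros((T + 1, u.size))
    cont = np.zeros((T + 1, u.size))
    ev_next = np.zeros(u.size)           # E[V_{T+1} | s] = 0
    for t in range(T, -1, -1):
        cont[t] = u + beta * ev_next     # value of continuing to pay
        V[t] = np.maximum(cont[t], w)    # default whenever W is larger
        ev_next = P @ V[t]               # E[V_t(s') | s], used at t - 1
    return V, cont


# Toy numbers (hypothetical): state 0 has a negative flow payoff from paying.
u = [-1.0, 2.0]                 # flow payoff of continuing in each state
w = [0.0, 0.0]                  # payoff of defaulting in each state
P = [[0.5, 0.5], [0.1, 0.9]]    # Markov transition between the two states
V, cont = solve_stopping(u, w, P, beta=0.97, T=10)
```

In this toy example the debtor in state 0 defaults in the last period, where only the negative flow payoff remains, but keeps paying in the first period because the option of deciding again next period is worth more than the current loss.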

3.4 Estimation

In order to estimate the model, we decompose the unobserved state $\bar{\varepsilon}_{i,t}$ as follows:

$$\bar{\varepsilon}_{i,t} = \xi_t + \mu_i + \epsilon_{i,t}, \tag{20}$$

where $\epsilon_{i,t}$ is an iid idiosyncratic disturbance, which we assume follows a logistic distribution. The term $\xi_t$ is a common aggregate unobserved state variable with a transition indexed by the vector $\rho^{\xi} = \{\rho^{\xi}_0, \rho^{\xi}_1, \rho^{\xi}_2\}$ as follows:

$$\xi_{t+1} = \rho^{\xi}_0 + \rho^{\xi}_1 \xi_t + \eta^{\xi}_t, \tag{21}$$

where $\eta^{\xi}_t$ is an iid error with a distribution characterized by the parameter $\rho^{\xi}_2$. The states $\xi$ capture common unobserved states, such as the average quality of the housing stock, the average opportunity costs of debtors’ resources and the common perceptions of debtors about the ability of banks to foreclose on their homes.19 The individual-specific state $\mu$ is distributed according to a mixture of three normal distributions with parameters $\Sigma_\mu = \{\bar{\mu}, \sigma^2_\mu, w_\mu\}$, such that $\bar{\mu}$, $\sigma^2_\mu$ and $w_\mu$ are $3 \times 1$ vectors containing the means, the variances and the probabilities of each distribution, respectively. We normalize the mean of the mixture to zero and denote this distribution as $\Upsilon(\mu; \Sigma_\mu)$. The vector $\Sigma_\mu$ of parameters of the mixture distribution is estimated jointly with the other

17 For example, the introduction of adjustable rate mortgages and other types of “innovative” mortgage contracts in the U.S. introduced a dynamic feature into mortgage contracts. Given that these policies had not been observed in the past, their effect could not be anticipated by reduced form models.
18 Even if one accounts for the presence of aggregate shocks by using time dummies, ignoring the potential serial correlation of the aggregate shocks leads to estimation bias. For example, if individuals expect the unobserved benefits of defaulting to increase over time, they might choose to delay default even if current payoffs are negative. A researcher who ignores such unobserved correlation would then overestimate the current payoffs relative to the continuation payoffs to rationalize this behavior.
19 As we explain below, we estimate models without income. In these cases, $\xi$ would also capture the common component of income across individuals.


parameters of the model. We assume that $\mu$ is correlated with the initial loan-to-value ratio (LTV) of each loan. LTV is regarded as a good predictor of the risk attitude of debtors in the literature (see Deng, Quigley, and Van Order, 2000 for the description of the standard econometric model of mortgage default). We assume that this underlying correlation is determined by the following loading equation:

$$LTV_i = \alpha_0 + \alpha_1 \mu_i + \nu_i, \tag{22}$$

where $\nu_i \sim N(0, \alpha^2_\nu)$. Let $X_{i,t} = (\bar{\pi}_{i,t}, b_{i,t}, L_{i,t})$ contain the observed (to the econometrician) states and let $X_t = \{X_{1,t}, \ldots, X_{N_t,t}\}$. We estimate the transition of $X_t$ directly from the data according to:

$$\begin{aligned} \ln(b_{i,t+1}) &= \rho^{b}_0 + \rho^{b}_1 \ln(b_{i,t}) + \rho^{b}_2 L_{i,t} + \rho^{b}_3 L^2_{i,t} + \rho^{b}_4 L^3_{i,t} + \rho^{b}_5 L^4_{i,t} + \eta^{b}_{i,t}, \\ \ln(\bar{\pi}_{i,t+1}) &= \rho^{\pi}_0 + \rho^{\pi}_1 \ln(\bar{\pi}_{i,t}) + \eta^{\pi}_{i,t}, \\ \ln(y_{i,t+1}) &= \rho^{y}_0 + \rho^{y}_1 \ln(y_{i,t}) + \eta^{y}_{i,t}, \end{aligned} \tag{23}$$

where $\{\eta^{y}_{i,t}, \eta^{\pi}_{i,t}, \eta^{b}_{i,t}\}$ are iid errors that can be correlated across equations, and $\rho^{X} = \{\rho^{y}, \rho^{b}, \rho^{\pi}\}$ are parameters to be estimated. The transition of $b_{i,t}$ is autoregressive and also depends on the remaining term of the mortgage, reflecting the idiosyncrasies of the Colombian mortgage financing system described in Section 3.1. It is estimated using only active mortgages so that it reflects the expected evolution of the balance for debtors who have not yet defaulted. We assume that the innovations of the transitions (23) are independent of the error $\bar{\varepsilon}_{i,t}$, so that they can be estimated separately. The Colombian mortgage dataset we use does not contain income data tracking the evolution of income for individual debtors. Therefore, we treat income as an unobserved state variable. In our simplest models we do not control for income so that it is absorbed by the unobserved states (20). We also estimate models controlling explicitly for the changing variation in income. To do so, we first recover the distribution of income conditional on mortgage payments for every period, $H^{y}_t(y \mid b/L)$. This distribution is obtained from the additional household survey data containing information on debtors’ income and mortgage payments as described in Section 3.2. For each individual $i$, we take 10 income draws from this observed distribution of income $H^{y}_t(y \mid b/L)$ and take the average likelihood contribution to integrate out the unobserved


income by Monte Carlo. Let $S_{i,t} = \{X_{i,t}, \mu_i, \xi_t\}$ be the set of state variables, excluding the idiosyncratic iid error. Define the expected value function as the expectation of the value function in (17) with respect to the idiosyncratic iid shock, conditional on the current states:

$$V_{i,t}(S_{i,t}) = E\left[\tilde{V}_{i,t}(S_{i,t}, \epsilon_{i,t}) \mid S_{i,t}\right] = \ln\left(1 + e^{u(X_{i,t}) + \xi_t + \mu_i + \beta E\left[V_{i,t+1}(S_{i,t+1}) \mid S_{i,t}\right]}\right), \tag{24}$$

where the second equality is the standard “social surplus” equation which follows from the logit assumption. For notational convenience, we write the expectation of (24) as a function of the conditioning states as follows:

$$\Psi(S_{i,t}) = E\left[V_{i,t+1}(S_{i,t+1}) \mid S_{i,t}\right], \tag{25}$$

where the expectation is taken with respect to the dynamic states, given their realization and their transition probabilities. For given state variables and transition probabilities, this value can be computed using standard numerical techniques, starting at the terminal period. Let $d_{i,t} = 0$ if individual $i$ defaults at time $t$ and $d_{i,t} = 1$ if she does not. Let $T^{*}_i$ be either the time when $i$ defaults, the last period of the mortgage, or the last period at which she is observed. Let $\theta = \{\zeta, \rho^{\xi}, \Sigma_\mu, \alpha, \xi\}$ be the vector of parameters to be estimated, including the aggregate shocks $\xi$. Under the given assumptions, the model above generates the following non-default probability for debtor $i$ at time $t$, conditional on not having defaulted on the mortgage up to $t - 1$, and conditional on the realization of the random states:

$$\mathrm{Pr}_{i,t}(\theta) := \mathrm{Pr}_{i,t}(d_{i,t} = 1 \mid \bar{\pi}_{i,t}, b_{i,t}, L_{i,t}, y_{i,t}, \mu_i; \theta) = \frac{e^{\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 b_{i,t} + \zeta_3 L_{i,t} + \zeta_4 y_{i,t} + \zeta_5 \mathbf{1}(\bar{\pi}_{i,t} - b_{i,t} < 0) + \xi_t + \mu_i + \beta \Psi(S_{i,t})}}{1 + e^{\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 b_{i,t} + \zeta_3 L_{i,t} + \zeta_4 y_{i,t} + \zeta_5 \mathbf{1}(\bar{\pi}_{i,t} - b_{i,t} < 0) + \xi_t + \mu_i + \beta \Psi(S_{i,t})}}, \tag{26}$$

where $\Psi(S_{i,t})$ is computed using the specified transition probabilities. The likelihood is the product across debtors of individual default/non-default histories, integrated over the distribution of the unobservables:

$$\ell(\theta) = \prod_{i \in N} \int \left[ \prod_{t}^{T^{*}_i} \mathrm{Pr}_{i,t}(\theta)^{d_{i,t}} \left(1 - \mathrm{Pr}_{i,t}(\theta)\right)^{(1 - d_{i,t})} \right] dH^{Y}_t\!\left(Y \mid \tfrac{b}{L}\right) \phi_\nu\!\left(LTV_i - \alpha_0 - \alpha_1 \mu\right) d\Upsilon(\mu; \Sigma_\mu), \tag{27}$$

where the likelihood also accounts for the LTV loading equation. Estimates of $\theta$ are obtained by finding the vector that maximizes (27). Following our identification arguments we could have estimated the model concentrating out the estimation of $\xi$. However, since $\max_i \{\bar{T}^{*}_i\} = 30$ is low enough, we chose instead to maximize over the whole set of parameters, including $\xi$. Therefore, our estimator is a standard maximum likelihood estimator.
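The recursion behind (24) to (26) can be sketched for a single debtor as follows. This is a simplified illustration, not the estimation code: it assumes that the index path $u(X_{i,t}) + \xi_t + \mu_i$ is known and deterministic, so that $\Psi(S_{i,t})$ reduces to next period's expected value function, and it omits the integration over $\mu$ and income. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def log1pexp(x: float) -> float:
    """Numerically stable ln(1 + e^x), the 'social surplus' term in (24)."""
    return x + math.log1p(math.exp(-x)) if x > 0 else math.log1p(math.exp(x))


def sigmoid(z: float) -> float:
    """Numerically stable logistic function e^z / (1 + e^z), as in (26)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def nondefault_probs(index: List[float], beta: float) -> Tuple[List[float], List[float]]:
    """index[t] stands for u(X_t) + xi_t + mu along an assumed deterministic
    path. Returns the non-default probabilities and expected value functions,
    computed backwards from the last period, where continuation is zero."""
    T = len(index)
    V = [0.0] * (T + 1)                  # V[T] = 0 is the terminal normalization
    pr = [0.0] * T
    for t in range(T - 1, -1, -1):
        z = index[t] + beta * V[t + 1]   # index plus discounted continuation
        V[t] = log1pexp(z)
        pr[t] = sigmoid(z)
    return pr, V[:T]


def log_likelihood(index: List[float], d: List[int], beta: float) -> float:
    """Log-likelihood of one observed history; d[t] = 1 means no default."""
    pr, _ = nondefault_probs(index, beta)
    return sum(math.log(p) if di == 1 else math.log(1.0 - p)
               for p, di in zip(pr, d))


# Toy history (illustrative): pays in the first period, defaults in the second.
ll = log_likelihood([0.0, 0.0], [1, 0], beta=0.97)
```

With a zero index in both periods, the last-period non-default probability is one half, while the first-period probability is higher because continuing also buys the option of deciding again.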

3.5 Computation and results

We estimate models with and without explicit income effects (the latter impose $\zeta_4 = 0$), and models with and without additional equity effects (the latter impose $\zeta_5 = 0$). The estimation of each model requires the repeated evaluation of the likelihood function (27). For any value of $\theta$ along the estimation algorithm, the computation of (27) requires the use of numerical techniques to integrate out the unobserved heterogeneity and to compute the expected value functions. Given any $\theta$ and an assumed discount factor $\beta = 0.97$, each evaluation of (27) requires the computation of several integrals. When computing the value function, we use a quadrature method to compute the expected evolution of $\xi$. To compute the expected evolution of the remaining dynamic states $\bar{\pi}_{i,t}$, $b_{i,t}$ and $y_{i,t}$, we use Monte Carlo integration by sampling from the residuals obtained when estimating (23). When computing the likelihood, we integrate out the individual heterogeneity $\mu$ using a quadrature method and, when appropriate, Monte Carlo integration for income as described above. The computation of the expected value functions (24) is done recursively, starting from the last period. Given the transition of the observed states and the assumed transition of the aggregate shocks, the expected value functions are computed by backwards induction. To ease the computational burden, we calculate the exact expected value functions (25) on a grid for $S_{i,t}$ and interpolate elsewhere. In order to preserve monotonicity of the value function with respect to $\xi_t$, we use multilinear interpolation. In the models with income, we simulate income draws from the observed sample of households in the auxiliary survey data. We proceed as follows: for each mortgage $i$ at time $t$, a set of $R_i$ income draws $\{Y_{r,t}\}_{r=1,\ldots,R_i}$ is obtained from the corresponding quintile of $H^{Y}_t$, contained in the survey data. The computation of the likelihood function requires that we compute the whole model for each of the simulated households.
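The quadrature step for the expected evolution of $\xi$ can be sketched as follows. The text does not specify the quadrature rule or the distribution of the innovation in (21); this sketch assumes a Gauss-Hermite rule and a normal innovation with standard deviation `sigma`, and the function name `expect_next` is hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def expect_next(f, xi, rho0, rho1, sigma, n_nodes=9):
    """Approximate E[f(xi') | xi] for xi' = rho0 + rho1 * xi + eta, with
    eta ~ N(0, sigma^2) (an assumption of this sketch), using Gauss-Hermite
    quadrature: E[f(m + sigma*Z)] ~ sum_k w_k f(m + sqrt(2)*sigma*x_k) / sqrt(pi)."""
    nodes, weights = hermgauss(n_nodes)
    xi_next = rho0 + rho1 * xi + np.sqrt(2.0) * sigma * nodes
    return float(np.sum(weights * f(xi_next)) / np.sqrt(np.pi))


# Usage: expected value of a smooth function of next period's aggregate shock.
# np.logaddexp(0, x) is ln(1 + e^x); the parameter values are illustrative.
ev = expect_next(lambda x: np.logaddexp(0.0, x), xi=0.5, rho0=0.1, rho1=0.8, sigma=0.3)
```

A small number of nodes is usually enough for smooth integrands such as the log-sum in (24), which is one reason to prefer quadrature over simulation for a one-dimensional state like $\xi$.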
We have already pointed out that the estimation of the aggregate shocks can be concentrated out from the estimation of the remaining model parameters. In our case, we have a relatively short sample, and we do not have reliable market-level default rate information. Therefore, we instead directly maximize the likelihood function (27) with respect to

all parameters, including the aggregate shocks $\xi$. In Table 2, we show the estimated parameters for the four models. Models I and II are restricted models with no explicit income effects ($\zeta_4 = 0$). The difference between Model I and Model II is that Model I does not allow for a direct effect of negative equity on default ($\zeta_5 = 0$). Since neither of these two models incorporates explicit income effects, all effects of income are being captured by the unobserved heterogeneity. In particular, the common component of all income shocks and their correlation structure are being captured by the aggregate shocks and their transition. Models III and IV incorporate explicit income effects. Again, the difference between Model III and Model IV is that Model III does not allow for a direct effect of negative equity on default ($\zeta_5 = 0$). In these models, the aggregate shocks capture the common components of the unobserved heterogeneity that are orthogonal to income and the other observed states. The comparison of the estimates of the two pairs of models (models I and II vs. models III and IV) illustrates the ability of the model to control for the presence of common serially correlated shocks. For each model, we show the estimated coefficients and the estimated marginal effects with their corresponding standard errors. The marginal effects are computed as the average marginal effect across all debtors with a 15-year mortgage, one year after the mortgage started, with the aggregate shock evaluated at its mean. We compute the marginal effect of a 10% increase in each of price, balance and income, and of a one-quarter change in term length. In all cases, the marginal effects include the effects of changes in the state variables on the continuation payoffs. We do not show marginal effects for the equity variable, because these are implicitly included in the marginal effects of prices and balances. That is, we cannot change the equity variable without changing the balance or the price.
In addition, we show the estimates of the loading parameters $\alpha$ and the parameters of the transition of the aggregate shocks $\rho^{\xi}$. As indicated above, the distribution of the individual heterogeneity is assumed to be a mixture of three normal random variables. As a summary measure, in Table 2 we show the variance of the individual heterogeneity $\mathrm{var}(\mu)$, computed from the underlying distribution estimates. The most salient feature of the results is that the main factors affecting default behavior are home prices and loan balances. The equity indicator has a negative effect on non-default, but its estimate is not statistically significant in the models with income. Moreover, the addition of the negative equity effect has no impact on the estimated marginal effects of either price or balance. A 10% increase in home prices leads to an average increase of around


0.26 percentage points in the non-default probability for models without income, and 0.42 percentage points for models with income. Similarly, a 10% increase in loan balances leads to an average decrease in the non-default probability of between 0.33 and 0.35 percentage points, depending on whether income is included or not. Both effects are statistically significant. The asymmetry between the effects of both factors is not surprising, because prices can go up or down but balances never decrease, except when paid down. The negative coefficient of $L_{i,t}$ means that each quarter as the loan ages (i.e. $L_{i,t}$ decreases) the non-default probability increases by around 0.1 percentage points in models without income and 0.2 percentage points in models with income. In other words, as the remaining term decreases, the probability of default decreases. The estimates of the $\alpha$ parameters capture the correlation of the individual heterogeneity with the initial LTV ratio. In contrast with the standard literature on mortgage default behavior, we find that unobserved heterogeneity is not correlated with initial LTV. The estimate of $\alpha_1$ is insignificant, which means that the model cannot capture any statistically significant correlation between the individual heterogeneity and the leverage of the loans, once we control for aggregate shocks. The addition of the simulated income in models III and IV does not have any significant effect on the estimates of the underlying structural parameters, beyond a small reduction in the price coefficient. The inclusion of income roughly doubles the marginal effect of all observed states. The estimated effect of income on default is statistically significant but economically insignificant. A 10% increase in income is estimated to increase the probability of non-default by 0.06 percentage points. This negligible effect of income on default, after controlling for prices and balances, is not surprising from an economic point of view.
It is still interesting because income is often invoked as an important determinant of default in policy discussions. The only estimates that change substantially between the models without income and the models with income are the estimates of the ρξ parameters, which capture the serial correlation of the aggregate shocks. In models I and II, which are the models without income, the autocorrelation parameter ρξ1 is significant and negative, whereas in the models with income III and IV, the estimate of ρξ1 is significant and positive. Moreover, in the models without income the variance of the aggregate shocks estimated by ρξ2 is much bigger than in the models with income. In models I and II, income is treated as an unobserved state variable contained in the unobserved states ξ, µ and ε. Therefore, in the models without income (I and II), the aggregate variation in income is being captured by the aggregate shocks and their perceived


transition, whereas in the models with income (III and IV) we are explicitly controlling for this unobserved variation and the estimated transition of the aggregate shocks is less noisy. The relative stability of all remaining parameters across specifications illustrates the ability of the model to control for the unobserved aggregate heterogeneity.

3.6 Counterfactual experiments

We use the estimated structural model to simulate the behavior of debtors under counterfactual assumptions. We are interested in evaluating the policy change undertaken by the Colombian government. As we indicated in Section 3.1, in the early 1990’s the formula used to adjust the value of mortgage balances was changed to reflect the market interest rate instead of the inflation rate. The observed default rates were driven both by an economic slowdown and by this exogenous policy decision, which drove up the mortgage balances when interest rates increased during the financial crisis. To evaluate the effect of the policy, we compute the counterfactual default behavior of debtors under a natural policy alternative. Specifically, we assume that the “monetary correction” rate which was set by the Central Bank was tied to the inflation rate (as it was originally) instead of being tied to the market interest rate. Under the counterfactual policy assumption, each debtor pays a proportion of their real balance each period depending on the number of periods left in the mortgage. Therefore the evolution of real balances can be perfectly anticipated by debtors. Under the counterfactual assumption, the transition of real balances is given by:

$$b_{i,t+1} = b_{i,t} - b_{i,t}/L_{i,t} = b_{i,t}\left(1 - 1/L_{i,t}\right). \tag{28}$$
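The transition (28) implies equal real installments that exhaust the balance when the remaining term reaches zero. A minimal sketch, with a hypothetical function name and illustrative numbers, and using the fact that the remaining term $L_{i,t} = T_i - t$ falls by one each period:

```python
from typing import List


def counterfactual_balances(b0: float, term: int) -> List[float]:
    """Path of real balances under b_{t+1} = b_t * (1 - 1/L_t), where the
    remaining term L_t falls by one each period until the loan is paid off."""
    path = [b0]
    b, remaining = b0, term
    while remaining > 0:
        b = b * (1.0 - 1.0 / remaining)   # pay 1/L_t of the current balance
        remaining -= 1
        path.append(b)
    return path


# Illustrative loan: a balance of 100 with four periods remaining, which
# declines in equal steps (up to floating-point rounding) and ends at zero.
path = counterfactual_balances(100.0, 4)
```

Because the path contains no error term, a debtor who knows the initial balance and term can compute the whole sequence in advance.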

This transition approximates the initial spirit of the UPAC system as an institutional arrangement to protect banks and debtors against inflationary risks. Notice that this new transition does not contain an error term, so that we are doing more than just changing the policy: we are also eliminating all uncertainty regarding the evolution of real balances. We also compute the counterfactual default behavior of debtors assuming that the distribution of income did not change over the time span of the sample. This way we can give proper context to the simulated effect of the policy. We perform two counterfactual income simulations. First, we keep the distribution of income constant over time, but keep the original transition of income estimated from the observed data. Second, we keep the distribution of income constant and update the evolution of income, so that expectations are consistent

with income. The two income simulations allow us to discern whether whatever effect we find is driven by the realizations of income or by the debtors’ expectations. We perform our counterfactual analysis using the estimates of model IV. Given that our sample size falls rapidly over time as debtors default on their loans, we first compute a baseline simulation using the transitions we estimate from the data. We take all debtors in our sample and have them start their mortgages simultaneously in the first quarter of 1997. For each debtor we draw ten simulated histories of observed states and unobserved heterogeneity using the estimated distribution of states. The analog of the default rate in the simulation is the hazard rate, which we can average across simulated debtors as we follow their survival and default probabilities over time. We then calculate counterfactual default rates by performing the same computation on the simulated sample, except that we now use the counterfactual transition of balances (28) instead of the one estimated from the data. In the income simulations we use the counterfactual distribution of income and its transition, as explained above. In Table 3 we show the results of the baseline computation and the three simulations over the 30 periods of the sample. In Figure 1 we show the baseline default rates and the simulated default rates under the transition (28) of balances implied by the counterfactual policy. Notice first in Table 3 that the counterfactual income simulations are virtually identical to the baseline scenario. This is not surprising, given the low estimated effect of income in the determination of default. The simulated default is identical, whether we use the initial transition of income or the updated one. On the other hand, the default rates under the counterfactual transition of balances (28) are consistently lower than the baseline rates. Moreover, because default is an absorbing state, these differences accumulate over time.
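The link between per-period hazard rates and the cumulative share of defaulted loans can be sketched as follows. The survival-weighted averaging across simulated debtors is one plausible reading of the procedure described above, not a statement of the exact computation; the function names and the toy probabilities are hypothetical.

```python
from typing import List


def average_hazard(nondefault: List[List[float]]) -> List[float]:
    """nondefault[i][t] is simulated debtor i's non-default probability in
    period t, conditional on survival. Returns the aggregate hazard rate,
    weighting each debtor by the probability of surviving up to period t."""
    n_periods = len(nondefault[0])
    survival = [1.0] * len(nondefault)
    hazards = []
    for t in range(n_periods):
        at_risk = sum(survival)
        exits = sum(s * (1.0 - p[t]) for s, p in zip(survival, nondefault))
        hazards.append(exits / at_risk if at_risk > 0 else 0.0)
        survival = [s * p[t] for s, p in zip(survival, nondefault)]
    return hazards


def cumulative_default(hazards: List[float]) -> List[float]:
    """Share of the cohort that has defaulted by the end of each period,
    using the fact that default is an absorbing state."""
    survival, out = 1.0, []
    for h in hazards:
        survival *= 1.0 - h
        out.append(1.0 - survival)
    return out


# Two toy debtors: one risky, one who never defaults.
h = average_hazard([[0.5, 0.5], [1.0, 1.0]])
cum = cumulative_default(h)
```

Since survival probabilities multiply, even a modest per-period gap between baseline and counterfactual hazards compounds into a large gap in cumulative default by the end of the sample.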
At the end of the sample around 80% of debtors have defaulted under the baseline simulation but only around 40% of debtors default under the counterfactual balance transition. In other words, the policy of tying the balances to a market interest rate was the cause of at least 1/2 of the observed defaults among the loans in this cohort. This difference is substantial and it is only a lower bound estimate of the impact of the counterfactual policy, because we have kept all other variables at their observed levels. Specifically, we would expect home prices to be affected negatively by the observed default rates. If we allowed for general equilibrium effects, the home prices would be higher in the counterfactual simulation and the equilibrium counterfactual default rates would be even lower.

In fairness, we should mention again that there is no uncertainty in the counterfactual transition of real balances. This assumption may be close to reality in stable environments with fixed interest rate mortgages, but it might not be a realistic assumption if debtors know (or expect) that the policy can be changed at any time in the future.

Notice also that the change of the policy has an effect on the default behavior of debtors through its effect on both the realization of the mortgage balances and their expected evolution over time. In fact the announcement of the policy has an immediate effect on default, even before the states change, due to its effect on the continuation payoffs. To illustrate the importance of expectations and evaluate their significance, we compute the effect of announcing the policy change at any point in time. Specifically, at each time $t \le \bar{T}$ we assume that the expected evolution of the balances changes to (28). This change has no effect on the current states, but has an immediate effect on the continuation payoffs.

Figure 2 shows the results of this simulation. To obtain the results we show in the Figure, we assume that debtors make their decisions under the observed transitions until time $t$, when the government suddenly announces the change in the policy. We show the baseline default and the counterfactual default at each $t$. Therefore, the difference in the two rates is the immediate effect of the policy announcement on default. The displayed counterfactual rates are significantly lower than the baseline rates, even though the current states have not changed at all. The average difference between the two rates is more than four percentage points. This highlights the fact that policies that affect the expectations of debtors can have a substantial effect on current default rates, even if they do not have any effect on the observed relevant state variables (e.g. equity). We should finally point out that a "reduced form" estimation, by definition, would predict that such a policy has no effect on current default.

4 Concluding remarks

In this paper we show the limits of identification for dynamic discrete choice models with unobserved common shocks. The proposed framework identifies the aggregate heterogeneity by exploiting the aggregate variation of choices over time. We show that, as long as there is micro-level variation in the observed states, the aggregate shocks are separately identified from their transition. This result is important because it highlights the limitations of identification of dynamic models when only market-level information is available.

Our model belongs to a class of models that have wide applicability, from Macroeconomics to Industrial Organization. In many setups, the aggregate shocks that we identify correspond to macroeconomic shocks. We show that such shocks are identified in micro-level data, regardless of the equilibrium mechanism that generates them. Furthermore, the use of the aggregate implications of the micro-level model is enough to identify the aggregate unobservables, without the need to specify an aggregate equilibrium model that generates them.

In other setups, the aggregate shocks that we identify correspond to market-level unobservables, such as unobserved product characteristics in dynamic demand models. Micro-level data are not generally available for the estimation of these models. Our results justify the use of additional moment restrictions to overcome the nonidentification results we prove. The type of restrictions that guarantee the identification of such models with only market-level data, however, is not clear.

We use our framework to estimate a model of optimal mortgage default with Colombian data. We estimated the full dynamic model of mortgage default and identified separately the unobserved aggregate shocks and their transition. In other words, we were able to estimate the aggregate shocks and the beliefs that debtors had about their evolution. Our empirical results highlight the usefulness of fully specified dynamic models. For example, our simulations show that policies that affect the expectations of mortgage debtors have an immediate effect on default, before they have any effect on the current variables.

The framework we use is a restrictive one, because it only allows the choice set to include two alternatives, one of which leads to an absorbing state. While popular, it is not general enough to accommodate many interesting problems. How our results can be generalized is the subject of ongoing research.


Appendix 1: Proofs

Proof of Lemma 1

For the purposes of this proof, let $D_{1,t} = \{i : d_{i,t} = 1\}$ and $D_{0,t} = \{i : d_{i,t} = 0\}$ denote the sets of individuals who choose $d_{i,t} = 1$ and $d_{i,t} = 0$, respectively. The first order condition of the log of the likelihood function (7) maximization problem with respect to each $\xi_\tau$ is:

$$\sum_{i \in D_{1,\tau}} \int \frac{\widetilde{Pr}_i(., \xi)}{\int \widetilde{Pr}_i(., \xi)\, d\Upsilon(\mu)} \, \frac{\partial Pr_{i,\tau}(., \xi_\tau)/\partial \xi_\tau}{Pr_{i,\tau}(., \xi_\tau)} \, d\Upsilon(\mu) \;-\; \sum_{i \in D_{0,\tau}} \int \frac{\widetilde{Pr}_i(., \xi)}{\int \widetilde{Pr}_i(., \xi)\, d\Upsilon(\mu)} \, \frac{\partial Pr_{i,\tau}(., \xi_\tau)/\partial \xi_\tau}{1 - Pr_{i,\tau}(., \xi_\tau)} \, d\Upsilon(\mu) = 0. \tag{A1}$$

Let $H_t = \{\{d_{i,1}, .., d_{i,\bar{T}_i}\} : \sum_{\tau=1}^{t} d_{i,\tau} = t\}$ be the set of possible decision histories, such that an agent is still active at time $t$ and chooses $d_{i,t} = 1$. For simplicity, we assume that $X_i = \{X_{i,1}, ..., X_{i,\bar{T}}\}$ is discrete with support $\mathcal{X}$. A similar proof for continuous $X$ can be written at considerable notational cost.

Let $N^1_{x,h,t}$ be the number of individuals with characteristics $x$ and history $h \in H_t$ who choose $d = 1$ at time $t$. From this, we can define $N^1_{x,t} = \sum_h N^1_{x,h,t}$ as the total number of individuals with characteristics $x$ who are active at time $t$ and choose $d = 1$. $N^1_t = \sum_x N^1_{x,t}$ is the total number of individuals who reached time $t$ and choose $d = 1$. Let $N_x$ be the total number of individuals who have characteristics $x$ at the beginning of the sample.

Similarly, let $N^0_{x,t}$ be the number of individuals with characteristics $x$ who choose $d = 0$ at time $t$. $N^0_t = \sum_x N^0_{x,t}$ is then the total number of individuals who choose $d = 0$ at time $t$. Finally, let $N_t = N^0_t + N^1_t$ be the total number of individuals who are active at time $t$.

Let $\widetilde{Pr}^0_t(x, \mu, \xi)$ be the probability of the history that ends in the individual choosing $d = 0$ at $t$. Let also $\widetilde{Pr}^h_t(x, \mu, \xi)$ be the probability of a particular history $h \in H_t$. We can rewrite the sum over individuals in (A1) as a sum of probabilities over "types" weighted by their relative shares, as follows:

$$\frac{N^1_t}{N_t} \sum_x \frac{N^1_{x,t}}{N^1_t} \sum_{h \in H_t} \frac{N^1_{x,h,t}}{N^1_{x,t}} \int \frac{\widetilde{Pr}^h_t(x, \mu, \xi)}{\int \widetilde{Pr}^h_t(x, \mu, \xi)\, d\Upsilon(\mu)} \, \frac{\partial Pr_t(x, \mu, \xi_t)/\partial \xi_t}{Pr_t(x, \mu, \xi_t)} \, d\Upsilon(\mu) \;-\; \frac{N^0_t}{N_t} \sum_x \frac{N^0_{x,t}}{N^0_t} \int \frac{\widetilde{Pr}^0_t(x, \mu, \xi)}{\int \widetilde{Pr}^0_t(x, \mu, \xi)\, d\Upsilon(\mu)} \, \frac{\partial Pr_t(x, \mu, \xi_t)/\partial \xi_t}{1 - Pr_t(x, \mu, \xi_t)} \, d\Upsilon(\mu) = 0. \tag{A2}$$

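If the sample includes the whole population the observed shares are identical to the probabilities predicted by the model and the following conditions hold:

$$\frac{N^1_{x',h',t}}{N^1_{x',t}} = \frac{\int \widetilde{Pr}^{h'}_t(x', \mu, \xi)\, d\Upsilon(\mu)}{\sum_{h \in H_t} \int \widetilde{Pr}^h_t(x', \mu, \xi)\, d\Upsilon(\mu)},$$

$$\frac{N^0_{x',t}}{N^0_t} = \frac{N_{x'} \int \widetilde{Pr}^0_t(x', \mu, \xi)\, d\Upsilon(\mu)}{\sum_x N_x \int \widetilde{Pr}^0_t(x, \mu, \xi)\, d\Upsilon(\mu)},$$

$$\frac{N^1_{x',t}}{N^1_t} = \frac{\sum_{h \in H_t} \int \widetilde{Pr}^h_t(x', \mu, \xi)\, d\Upsilon(\mu)\, N_{x'}}{\sum_x \sum_{h \in H_t} \int \widetilde{Pr}^h_t(x, \mu, \xi)\, d\Upsilon(\mu)\, N_x}.$$

Replacing these expressions in (A2) yields:

$$\frac{N^1_t}{N_t} \left[ \frac{\sum_x \sum_h \int \frac{\partial Pr_t(x, \mu, \xi_t)/\partial \xi_t}{Pr_t(x, \mu, \xi_t)} \widetilde{Pr}^h_t(x, \mu, \xi)\, d\Upsilon(\mu)\, N_x}{\sum_x \sum_h \int \widetilde{Pr}^h_t(x, \mu, \xi)\, d\Upsilon(\mu)\, N_x} \right] - \frac{N^0_t}{N_t} \left[ \frac{\sum_x \int \frac{\partial Pr_t(x, \mu, \xi_t)/\partial \xi_t}{1 - Pr_t(x, \mu, \xi_t)} \widetilde{Pr}^0_t(x, \mu, \xi)\, d\Upsilon(\mu)\, N_x}{\sum_x \int \widetilde{Pr}^0_t(x, \mu, \xi)\, d\Upsilon(\mu)\, N_x} \right] = 0. \tag{A3}$$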

P h N0 N1 Note that Ntt = 1− Ntt and that h P˜rt (x, µ, ξ) = Gt (x, µ, ξ
P ´

Gt (x, µ, ξ
(A4)

which completes the proof.

Proof of Lemma 2

The lemma assumes: (i) the aggregate shocks follow a Markov process $\xi_{t+1} = h(\xi_t) + \upsilon_{t+1}$, such that $E_t[\xi_{t+1} | \xi_t] = h(\xi_t)$, and (ii) $\frac{\partial h(\xi_t)}{\partial \xi_t} > -\frac{1}{\beta}$, where $1 > \beta > 0$ is the discount factor. We need to show that the system of equations implied by (9) has a unique solution $\xi^*$. We prove existence and uniqueness by showing that under the given conditions, the mapping $\tilde{s}_t(\xi_t)$ is bounded by zero and one and is strictly monotone with respect to $\xi_t$. Recall equation (9): $\tilde{s}_t(\xi_t) = \int$

P ri,t (x, µ, ξt ) ´

Gt (x, µ, ξ
(A5)

Notice first that $\tilde{s}_t(\xi_t)$ is an average of well defined probabilities, therefore it is clear that it is bounded above by one and below by zero. Let $\xi_t, \xi_t' \in \mathrm{supp}(\xi)$ such that $(\xi_t - \xi_t') > 0$. To show monotonicity, we need to show that $(\tilde{s}_t(\xi_t) - \tilde{s}_t(\xi_t')) > 0$. A sufficient condition for (A5) to be monotone is that for all $\mu$, $x$ and $t$, the following condition holds: $(Pr_{i,t}(., \xi_t) - Pr_{i,t}(., \xi_t')) > 0$, which from (5) is equivalent to:

$$\xi_t - \xi_t' > \beta \left[ E[V_{t+1}(., \xi_{t+1}) | ., \xi_t'] - E[V_{t+1}(., \xi_{t+1}) | ., \xi_t] \right]. \tag{A6}$$

Let

$$\Delta\max(\xi_t, \xi_t') = \beta \max\{u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1} + E[V_{t+2}(., \xi_{t+2}) | ., h(\xi_t') + \upsilon_{t+1}],\, 0\} - \beta \max\{u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1} + E[V_{t+2}(., \xi_{t+2}) | ., h(\xi_t) + \upsilon_{t+1}],\, 0\}.$$

A sufficient condition for (A6) to be true is that for every realization of $\mu_i$, $x_{i,t}$, $\upsilon_{i,t}$ and $\varepsilon_{i,t}$, the following condition holds:

$$\xi_t - \xi_t' > \Delta\max(\xi_t, \xi_t'), \tag{A7}$$

since:

$$\Delta\max(\xi_t, \xi_t') \ge \beta \left[ E[V_{t+1}(., \xi_{t+1}) | ., \xi_t'] - E[V_{t+1}(., \xi_{t+1}) | ., \xi_t] \right]. \tag{A8}$$

We show now that given (i) and (ii), (A7) holds for every period $t$ and hence the mapping is monotone. To prove it, notice first that assumption (ii) implies that:

$$\frac{h(\xi_t) - h(\xi_t')}{\xi_t - \xi_t'} > -\frac{1}{\beta},$$

which is equivalent to:

$$(\xi_t - \xi_t') > \beta [h(\xi_t') - h(\xi_t)]. \tag{A9}$$

At the terminal period $t = T_i$, $\Delta\max(\xi_t, \xi_t') = 0$, whereas $\xi_t - \xi_t'$ is positive by assumption. Therefore (A7) holds trivially. At $t = T_i - 1$, $E[V(., \xi_{t+2}) | ., \xi_{t+1}] = 0$. Therefore, the following four cases arise:

• $\Delta\max(\xi_t, \xi_t') = 0$: In this case (A7) holds trivially, because $\xi_t - \xi_t' > 0$ by assumption.

• $\Delta\max(\xi_t, \xi_t') = -\beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1})$: In this case $\Delta\max(\xi_t, \xi_t') < 0 < \xi_t - \xi_t'$, so that (A7) holds.

• $\Delta\max(\xi_t, \xi_t') = \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1}) - \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1}) = \beta(h(\xi_t') - h(\xi_t))$: In this case $\Delta\max(\xi_t, \xi_t') < \xi_t - \xi_t'$ from (A9), therefore (A7) holds.

• $\Delta\max(\xi_t, \xi_t') = \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1})$: In this case, $\beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1}) < 0$. Therefore, $\beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1}) - \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1}) > \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1})$. From the case above we know that $\xi_t - \xi_t' > \beta(h(\xi_t') - h(\xi_t)) = \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1}) - \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1}) > \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1})$. Therefore, (A7) holds.

We will show now that if $\xi_t - \xi_t' > 0$, (A7) holds for $t < T_i - 1$. Abusing notation, let $\Psi_t(\xi_t) = E[V(., \xi_{t+1}) | \xi_t]$. At $t = T_i - 2$:

$$\Delta\max(\xi_t, \xi_t') = \beta \max\{u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1} + \beta\Psi_{t+1}(h(\xi_t') + \upsilon_{t+1}),\, 0\} - \beta \max\{u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1} + \beta\Psi_{t+1}(h(\xi_t) + \upsilon_{t+1}),\, 0\}.$$

Again, four cases arise:

• $\Delta\max(\xi_t, \xi_t') = 0$: In this case (A7) holds again trivially, because $\xi_t - \xi_t' > 0$ by assumption.

• $\Delta\max(\xi_t, \xi_t') = -\beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1} + \beta\Psi_{t+1}(h(\xi_t) + \upsilon_{t+1}))$: In this case $\Delta\max(\xi_t, \xi_t') < 0 < \xi_t - \xi_t'$, so that (A7) holds.


• $\Delta\max(\xi_t, \xi_t') = \beta(h(\xi_t') - h(\xi_t) + \beta(\Psi_{t+1}(h(\xi_t') + \upsilon_{t+1}) - \Psi_{t+1}(h(\xi_t) + \upsilon_{t+1}))) \ge -\beta(h(\xi_t) + \upsilon_{t+1} - h(\xi_t') - \upsilon_{t+1} + \Delta\max(h(\xi_t) + \upsilon_{t+1}, h(\xi_t') + \upsilon_{t+1}))$. From (A9), $\xi_t - \xi_t' > \beta(h(\xi_t') - h(\xi_t))$. Since we proved that (A6) is true for $t = T_i - 1$, each element of the integral $\beta(\Psi_{t+1}(h(\xi_t') + \upsilon_{t+1}) - \Psi_{t+1}(h(\xi_t) + \upsilon_{t+1}))$ is negative. Therefore, $\xi_t - \xi_t' > \Delta\max(\xi_t, \xi_t')$ and (A7) is true.

• $\Delta\max(\xi_t, \xi_t') = \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1} + \beta\Psi_{t+1}(h(\xi_t') + \upsilon_{t+1}))$: In this case, $\beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1} + \beta\Psi_{t+1}(h(\xi_t) + \upsilon_{t+1})) < 0$. Using again (A9): $\xi_t - \xi_t' > \beta(h(\xi_t') - h(\xi_t)) = \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t') + \upsilon_{t+1} + \varepsilon_{i,t+1} + \beta\Psi_{t+1}(h(\xi_t') + \upsilon_{t+1})) - \beta(u(X_{i,t+1}) + \mu_i + h(\xi_t) + \upsilon_{t+1} + \varepsilon_{i,t+1} + \beta\Psi_{t+1}(h(\xi_t) + \upsilon_{t+1})) > \Delta\max(\xi_t, \xi_t')$, which is what we wanted to show.

Proceeding recursively, we can show that (A7) holds for any $t$. Therefore $\tilde{s}(\xi_t)$ is monotone and the proof is complete.
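The practical content of the lemma is that a bounded, strictly monotone share mapping can be inverted for the aggregate shock. The sketch below illustrates that inversion by bisection under strong simplifying assumptions that are not in the paper: a static logistic choice probability stands in for $Pr_{i,t}$ (no continuation value), and the list of `(x, mu, weight)` types and all function names are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def predicted_share(xi, types):
    """Weighted average choice probability over (x, mu, weight) types.

    Assumption: Pr = logistic(x + mu + xi), which is strictly increasing in xi.
    """
    total_weight = sum(w for _, _, w in types)
    return sum(w * logistic(x + mu + xi) for x, mu, w in types) / total_weight


def invert_share(observed_share, types, lo=-30.0, hi=30.0, tol=1e-12):
    """Bisection for the xi at which the predicted share equals the observed one.

    Monotonicity of predicted_share in xi guarantees a unique root in [lo, hi]
    whenever the observed share lies strictly between the shares at lo and hi.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if predicted_share(mid, types) < observed_share:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical population of three types and a "true" aggregate shock.
types = [(-1.0, -0.5, 0.2), (0.0, 0.0, 0.5), (1.5, 0.7, 0.3)]
xi_true = 0.4
share = predicted_share(xi_true, types)
xi_hat = invert_share(share, types)
print(f"observed share {share:.4f}, recovered xi {xi_hat:.6f}")
```

Bisection is used only because it relies on nothing beyond boundedness and monotonicity, the two properties established in the proof; any root-finding routine would serve.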

Proof of Proposition 1

To show necessity, we show first that whenever $X_{i,t} = X_{j,t}\ \forall i, j, t$ the model is not identified by showing that the likelihood function of the sample is flat. For simplicity we assume that $\rho$ is a scalar. The proof can easily be generalized to the vector case by changing derivatives for jacobians and determinants of these jacobians (to equate to zero) where needed. We can rewrite the log-likelihood function (7) of the sample as follows:

$$\ln(\ell) = \sum_{i=1}^{N} \sum_{t=1}^{\bar{T}_i} \ln \int Pr_{i,t}(S_{i,t}; \rho)^{d_{i,t}} (1 - Pr_{i,t}(S_{i,t}; \rho))^{1 - d_{i,t}}\, d\Upsilon_t(\mu) \tag{A10}$$

where the integrals are taken with respect to the distribution of $\mu$ conditional on survival until time $t$, $\Upsilon_t(\mu)$. The derivative of the log-likelihood w.r.t. $\rho$ is:

$$\frac{d \ln(\ell)}{d\rho} = \sum_{i=1}^{N} \sum_{t=1}^{\bar{T}_i} \left[ \frac{d_{i,t}}{\int Pr_{i,t}(.)\, d\Upsilon_t(\mu)} \frac{d \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{d\rho} - \frac{1 - d_{i,t}}{\int Pr_{i,\bar{T}_i}(.)\, d\Upsilon_{\bar{T}_i}(\mu)} \frac{d \int Pr_{i,\bar{T}_i}(.)\, d\Upsilon_{\bar{T}_i}(\mu)}{d\rho} \right]$$

$$= \sum_{i=1}^{N} \sum_{t=1}^{\bar{T}_i} \frac{(2 d_{i,t} - 1)}{\int Pr_{i,t}(.)\, d\Upsilon_t(\mu)} \frac{d \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{d\rho} \tag{A11}$$


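Let $X_{i,t} = X_{j,t}\ \forall i, j, t$. Therefore:

$$Pr_{i,t}(\mu, \xi_t; \rho) = Pr_{j,t}(\mu, \xi_t; \rho)\ \ \forall \mu$$

$$\Rightarrow \int Pr_{i,t}(.)\, d\Upsilon_t(\mu) = \int Pr_{j,t}(.)\, d\Upsilon_t(\mu) = s_t(\xi, \rho)$$

$$\Rightarrow \frac{d \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{d\rho} = \frac{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{\partial \rho} + \frac{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{\partial \xi_t} \frac{d\xi_t}{d\rho}\ \ \forall i \tag{A12}$$

We know from lemmas 1 and 2 that $\xi_t$ is uniquely defined by $s_t(\rho, \xi) \equiv \int Pr_{i,t}(.)\, d\Upsilon_t(\mu) = s^0_t$, where $s^0_t$ is the observed time $t$ share. We can obtain its derivative using the implicit function theorem:

$$\frac{d\xi_t}{d\rho} = -\frac{\partial s_t/\partial \rho}{\partial s_t/\partial \xi_t} = -\frac{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)/\partial \rho}{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)/\partial \xi_t}$$

$$\Rightarrow \frac{d \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{d\rho} = \frac{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{\partial \rho} - \frac{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{\partial \xi_t} \, \frac{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)/\partial \rho}{\partial \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)/\partial \xi_t}$$

$$\Rightarrow \frac{d \int Pr_{i,t}(.)\, d\Upsilon_t(\mu)}{d\rho} = 0\ \ \forall \rho, i$$

which implies that (A11) is zero for all $\rho$ and therefore the model is not identified.

To show sufficiency, we show that whenever $X_{i,t} \ne X_{j,t}$ the model is generically identified. First, notice that for any given $t$, $\frac{d\xi_t}{d\rho}$ will not vary with $X$ (since we are integrating against the distribution of $X$). Conditional on a particular value $X = x$ we have that

$$\frac{d \int Pr_{i,t}(x, .)\, d\Upsilon_t(\mu)}{d\rho} = \frac{\partial \int Pr_{i,t}(x, .)\, d\Upsilon_t(\mu)}{\partial \rho} + \frac{\partial \int Pr_{i,t}(x, .)\, d\Upsilon_t(\mu)}{\partial \xi_t} \frac{d\xi_t}{d\rho}. \tag{29}$$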

Since $\frac{d\xi_t}{d\rho}$ does not vary with $X$, this derivative cannot, generically, be zero for all values of $\rho$. This in turn implies that (A11), which consists of a weighted sum of terms like the ones above, will generically not be zero. Therefore, the model can only be identified when there is cross sectional variation in $X$.

Now consider the "true" value of $\rho^*$ such that $\frac{d \ln(\ell(\rho^*))}{d\rho} = 0$. From (A11) we can show that the second derivative of the likelihood function evaluated at $\rho^*$ is:

$$\frac{d^2 \ln(\ell(\rho^*))}{d\rho^2} = \sum_{i=1}^{N} \sum_{t=1}^{\bar{T}_i} \frac{(2 d_{i,t} - 1)}{\int Pr_{i,t}(X_{i,t}, .)\, d\Upsilon_t(\mu)} \frac{d^2 \int Pr_{i,t}(X_{i,t}, .)\, d\Upsilon_t(\mu)}{d\rho^2} \tag{A13}$$

From (A12) we can show that

$$\frac{d^2 \int Pr_{i,t}(X_{i,t}, .)\, d\Upsilon_t(\mu)}{d\rho^2} = \frac{\partial^2 \int Pr_{i,t}(X_{i,t}, .)\, d\Upsilon_t(\mu)}{\partial \rho^2} \left( 1 - \left( \frac{d\xi_t}{d\rho} \right)^2 \right).$$

Since $\left( 1 - \left( \frac{d\xi_t}{d\rho} \right)^2 \right)$ is the same for all individuals, $X_{i,t} \ne X_{j,t}$ implies that $\frac{d^2 \int Pr_{i,t}\, d\Upsilon_t(\mu)}{d\rho^2} \ne \frac{d^2 \int Pr_{j,t}\, d\Upsilon_t(\mu)}{d\rho^2}$.

Therefore, if $X_{i,t} \ne X_{j,t}$, $\frac{d^2 \ln(\ell(\rho^*))}{d\rho^2}$ can only be zero around $\rho^*$ if the terms in (A13) add up to zero, which is generically not true. Therefore, the model is (locally) identified.
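The necessity argument can be illustrated numerically. The sketch below is an assumption-laden toy, not the paper's model: a one-period logistic choice probability $\Lambda(\rho x + \xi)$ without $\mu$ heterogeneity or continuation values, an expected (population) log-likelihood in place of a sampled one, and hypothetical function names. For each $\rho$ the aggregate shock is concentrated out by matching the aggregate share; the resulting profile is flat when every individual has the same $x$ and curved when $x$ varies.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve_xi(rho, xs, target_share):
    """Bisection for xi such that the mean of logistic(rho*x + xi) hits target_share."""
    lo, hi = -30.0, 30.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        share = sum(logistic(rho * x + mid) for x in xs) / len(xs)
        if share < target_share:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def profile_loglik(rho, xs, rho_true=1.0, xi_true=0.0):
    """Expected log-likelihood at rho once xi is chosen to match the aggregate share.

    Assumption: data are generated by logistic(rho_true*x + xi_true).
    """
    p_true = [logistic(rho_true * x + xi_true) for x in xs]
    target = sum(p_true) / len(xs)
    xi = solve_xi(rho, xs, target)
    loglik = 0.0
    for x, p0 in zip(xs, p_true):
        p = logistic(rho * x + xi)
        loglik += p0 * math.log(p) + (1.0 - p0) * math.log(1.0 - p)
    return loglik


grid = [0.0, 0.5, 1.0, 1.5, 2.0]
same_x = [0.7] * 5                       # no cross-sectional variation
varied_x = [-2.0, -1.0, 0.0, 1.0, 2.0]   # cross-sectional variation

flat = [profile_loglik(r, same_x) for r in grid]
curved = [profile_loglik(r, varied_x) for r in grid]
print("same x:  ", [round(v, 6) for v in flat])
print("varied x:", [round(v, 6) for v in curved])
```

With identical $x$, any change in $\rho$ is absorbed one-for-one by the share-matching $\xi$, so every grid point yields the same fitted probabilities; with varied $x$ the profile peaks at the data-generating value.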


Appendix 2

In this appendix we lay down a very simple 2-period model with income maximizing agents to illustrate the point that negative equity is, at best (i.e., when default is costless and agents are income maximizers), a necessary condition for default.

Consider an income maximizing agent who lives for two periods and who (for simplicity) does not discount the future. The agent owns an asset (i.e., a house) that has a price of $p$ today. The next period the house can be worth either $p_L$ or $p_H > p_L$, each with probability 0.5. Each period $t = 1, 2$ the individual has to pay the amount $c_t$ if he lives in the house (i.e., the mortgage payment). We further assume that the price of the asset is below the cost in the bad state, $p_L \le c_2$, but above cost in the good state, $p_H \ge c_2$. In equilibrium, the price of the house (i.e. the value of the asset) today will be given by

$$p = d + 0.5 p_L + 0.5 p_H \tag{A15}$$

where $d$ is the "dividend" paid by the asset today (which can be understood as the flow "utility" the agent gets from living in the house today). We further assume that the house has negative equity today, so $p < c_1 + c_2$, where $c_1 + c_2$ is the mortgage balance. We assume that default is costless, so if an individual defaults, he simply walks out of the house and gets nothing (i.e. he does not sell the house) and pays nothing.20 In this case, an individual will choose not to default today if the expected value of holding the house is larger than the value of defaulting. That is, if

$$d - c_1 + 0.5 (p_H - c_2) > 0. \tag{A16}$$

Notice that, because the individual can (and will) default tomorrow in the bad state, he simply gets zero in that case and so the bad state does not directly affect the decision of whether to default today.

A simple way of illustrating the importance of accounting for the dynamic incentives facing the individual is to consider the following direct implication of this model. Suppose that a policy is implemented that changes both $c_1$ and $c_2$ while keeping $c_1 + c_2$ constant. In this case, the individual's incentive to default will change even though the equity the individual holds in the house is not affected (see condition (A16)).

We now show that, even if there is negative equity ($p < c_1 + c_2$), an individual may still decide not to default. To see why, notice that negative equity implies that

$$d + 0.5 p_L + 0.5 p_H - c_1 - c_2 < 0$$

which can be rewritten as

$$d - c_1 + 0.5 (p_H - c_2) < 0.5 (c_2 - p_L),$$

where $c_2 - p_L \ge 0$ by assumption. If the left hand side of this condition is positive (i.e. if (A16) holds) the individual will not default today even though the house has negative equity. So, a necessary condition for the individual not to default is that $0.5 (c_2 - p_L)$ is strictly positive. But $0.5 (c_2 - p_L)$ is exactly the option value for the individual. To see why this is the case, consider what an individual gets today (in expected terms) if he is not allowed to default tomorrow,

$$\max (d - c_1 + 0.5 p_L + 0.5 p_H - c_2,\ 0),$$

and compare it to what he gets when he has the option of defaulting tomorrow,

$$\max (d - c_1 + 0.5 (p_H - c_2),\ 0).$$

How much will an individual be willing to pay to have the right to default tomorrow? He will be willing to pay up to the difference, which is

$$\max (0,\ 0.5 (c_2 - p_L)) = 0.5 (c_2 - p_L).$$

So, provided the value of the default option is positive, an individual may choose not to default even when the equity in the house is negative.

20 Notice that default is costless in the sense that there are no direct expenses associated with default. Default is costly because we are ruling out the possibility that the debtor walks out and purchases again a similar home at the lower market price. If this was the case, the equilibrium price of homes would not be (A15) and the analysis would be more complicated. The general point remains valid in the sense that negative equity is not sufficient for default. We thank an anonymous referee for pointing this out.
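A small numerical illustration of the two-period model may help. The parameter values below are hypothetical and chosen only so that the arithmetic is exact; the function names are likewise illustrative. The first case shows negative equity without default; the second reshuffles $c_1$ and $c_2$ with $c_1 + c_2$ unchanged, which leaves equity untouched but flips the decision.

```python
def house_price(d, p_low, p_high):
    """Equation (A15): today's dividend plus the expected price tomorrow."""
    return d + 0.5 * p_low + 0.5 * p_high


def value_of_staying(d, c1, c2, p_high):
    """Left-hand side of (A16): the bad state contributes zero because the
    agent defaults tomorrow whenever the price is below c2."""
    return d - c1 + 0.5 * (p_high - c2)


def summarize(d, c1, c2, p_low, p_high):
    """Return (equity, value of staying, defaults today?)."""
    assert p_low <= c2 <= p_high, "model assumes p_L <= c2 <= p_H"
    equity = house_price(d, p_low, p_high) - (c1 + c2)
    stay = value_of_staying(d, c1, c2, p_high)
    return equity, stay, not (stay > 0)


# Hypothetical parameters: d = 1, p_L = 2, p_H = 10.
case_a = summarize(d=1.0, c1=3.5, c2=4.0, p_low=2.0, p_high=10.0)
# Same total balance c1 + c2 = 7.5, but more of it is due today.
case_b = summarize(d=1.0, c1=5.0, c2=2.5, p_low=2.0, p_high=10.0)

print("case A (equity, stay value, default):", case_a)
print("case B (equity, stay value, default):", case_b)
```

In both cases equity equals $-0.5$, yet only the second debtor defaults, which is the sense in which equity alone does not determine default in this model.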


Appendix 3

In this appendix we describe the Matzkin class of functions that allow for non-parametric identification of the binary choice model.21 We begin with her main result. Consider a binary choice model, $D = \mathbf{1}(U(X) > \upsilon)$, where $X$ is observed and $\upsilon$ is unobserved. Let $U^*$ denote the "true" function $U$ and let $F^*_\upsilon$ denote the true cdf of $\upsilon$. Let $\Omega$ denote the set of monotone increasing functions from $\mathbb{R}$ into $[0, 1]$. Assume

(i) $X \in \mathcal{X} \subset \mathbb{R}^K$, $U^* \in \mathcal{U}$, where $\mathcal{U}$ is a set of functions mapping $\mathcal{X}$ into $\mathbb{R}$ that are continuous and strictly increasing in their $K$th coordinate.
(ii) $X \perp\!\!\!\perp \upsilon$.
(iii) The conditional distribution of the $K$th coordinate of $X$ has a Lebesgue density that is everywhere positive conditional on the other coordinates of $X$.
(iv) $F^*_\upsilon$ is strictly increasing.
(v) The support of the marginal distribution of $X$ is included in $\mathcal{X}$.

Then $(U^*, F^*_\upsilon)$ is identified within $\mathcal{U} \times \Omega$ if and only if $\mathcal{U}$ is a set of functions such that no two functions in $\mathcal{U}$ are strictly increasing transformations of each other.

The following functional forms are examples of functions satisfying her conditions for exact identification of $U(X)$.

1. $U(X) = X\gamma$, $\|\gamma\| = 1$ or $\gamma_1 = c$ for a known constant $c$. This is the same class considered by Manski (1988).
2. $U(X)$ homogeneous of degree one such that $U(\tilde{x}) = \alpha$ for a known $\alpha$ and some $\tilde{x} \in \mathcal{X}$.
3. Least concave functions that attain common values at two points in their domain.
4. Functions additively separable into a continuous and monotone increasing function and a continuous monotone increasing, concave and homogeneous of degree one function, e.g. $U(X) = X_1 + \tau(X_2, ..., X_K)$.

21 See Matzkin (1992) and the appendix in Heckman and Navarro (2007), from where we borrow heavily.
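To see why a normalization such as $\|\gamma\| = 1$ is needed in the linear-index example, the sketch below checks that rescaling $\gamma$ and the scale of $\upsilon$ by the same constant leaves every choice probability unchanged. It assumes, purely for illustration, that $\upsilon$ is normal with standard deviation `sigma`; the theorem itself is non-parametric, and the function names here are hypothetical.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def choice_prob(x, gamma, sigma):
    """P(D = 1 | X = x) when U(X) = X*gamma and upsilon ~ N(0, sigma^2)."""
    index = sum(xk * gk for xk, gk in zip(x, gamma))
    return normal_cdf(index / sigma)


def normalize(gamma, sigma):
    """Impose ||gamma|| = 1 by dividing gamma and sigma by the norm of gamma."""
    norm = math.sqrt(sum(g * g for g in gamma))
    return [g / norm for g in gamma], sigma / norm


points = [[0.5, -0.2], [1.0, 1.0], [-2.0, 0.3]]
params_a = ([3.0, 4.0], 2.0)
params_b = ([6.0, 8.0], 4.0)   # params_a scaled by 2: observationally equivalent

probs_a = [choice_prob(x, *params_a) for x in points]
probs_b = [choice_prob(x, *params_b) for x in points]
norm_a = normalize(*params_a)
norm_b = normalize(*params_b)
print(probs_a, probs_b)
print(norm_a, norm_b)
```

Both parameterizations generate the same data, so only the normalized pair can be recovered; the normalization selects one representative from each class of strictly increasing transformations.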


References

Aguirregabiria, V., and P. Mira (2002): "Swapping the Nested Fixed Point Algorithm: A Class of Estimators for Discrete Markov Decision Models," Econometrica, 70(4), 1519–1543.

Altug, S., and R. A. Miller (1998): "The Effect of Work Experience on Female Wages and Labour Supply," Review of Economic Studies, 65(1), 45–85.

Arcidiacono, P., and R. A. Miller (2008): "CCP Estimation of Dynamic Discrete Choice Models with Unobserved Heterogeneity," Unpublished manuscript.

Berry, S., J. Levinsohn, and A. Pakes (1995): "Automobile Prices in Market Equilibrium," Econometrica, 60(4), 889–917.

Carranza, J. E. (2007): "Product innovation and adoption in market equilibrium: The case of digital cameras," Unpublished manuscript, University of Wisconsin-Madison, Department of Economics.

Carranza, J. E., and D. Estrada (2007): "An empirical characterization of mortgage default in Colombia between 1997 and 2004," Unpublished manuscript, University of Wisconsin-Madison, Department of Economics.

Deng, Y., J. M. Quigley, and R. Van Order (2000): "Mortgage Terminations, Heterogeneity and the Exercise of Mortgage Options," Econometrica, 68(2), 275–307.

Erdem, T., S. Imai, and M. Keane (2004): "Brand and Quantity Choice Dynamics Under Price Uncertainty," Quantitative Marketing and Economics.

Gowrisankaran, G., and M. Rysman (2006): "Dynamics of Consumer Demand for New Durable Goods," Working paper, John M. Olin School of Business.

Heckman, J. J. (1987): "Selection Bias and Self-Selection," in The New Palgrave: A Dictionary of Economics, ed. by J. Eatwell, M. Milgate, and P. Newman, pp. 287–297. Palgrave Macmillan Press, London.

Heckman, J. J., and S. Navarro (2007): "Dynamic Discrete Choice and Dynamic Treatment Effects," Journal of Econometrics, 136(2), 341–396.

Hendel, I., and A. Nevo (2006): "Measuring the Implications of Sales and Consumer Inventory Behavior," Econometrica, 74(6), 1637–1673.

Hotz, V. J., and R. A. Miller (1993): "Conditional Choice Probabilities and the Estimation of Dynamic Models," Review of Economic Studies, 60(3), 497–529.

Keane, M. P., and K. I. Wolpin (1994): "The Solution and Estimation of Discrete Choice Dynamic Programming Models by Simulation and Interpolation: Monte Carlo Evidence," The Review of Economics and Statistics, 76(4), 648–672.

Kotlarski, I. I. (1967): "On Characterizing the Gamma and Normal Distribution," Pacific Journal of Mathematics, 20, 69–76.

Lee, D., and K. I. Wolpin (2006): "Intersectoral Labor Mobility and the Growth of the Service Sector," Econometrica, 74(1), 1–40.

Manski, C. F. (1988): "Identification of Binary Response Models," Journal of the American Statistical Association, 83(403), 729–738.

Matzkin, R. L. (1992): "Nonparametric and Distribution-Free Estimation of the Binary Threshold Crossing and the Binary Choice Models," Econometrica, 60(2), 239–270.

Norets, A. (2009): "Inference in Dynamic Discrete Choice Models with Serially Correlated Unobserved State Variables," Econometrica, 77(5), 1665–1682.

Pakes, A. (1986): "Patents as Options: Some Estimates of the Value of Holding European Patent Stocks," Econometrica, 54(4), 755–784.

Rust, J. (1987): "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica, 55(5), 999–1033.

Rust, J. (1994): "Structural Estimation of Markov Decision Processes," in Handbook of Econometrics, Volume, ed. by R. Engle, and D. McFadden, pp. 3081–3143. North-Holland, New York.

Taber, C. R. (2000): "Semiparametric Identification and Heterogeneity in Discrete Choice Dynamic Programming Models," Journal of Econometrics, 96(2), 201–229.

Wolpin, K. I. (1984): "An Estimable Dynamic Stochastic Model of Fertility and Child Mortality," Journal of Political Economy, 92(5), 852–874.

Wolpin, K. I. (1987): "Estimating a Structural Search Model: The Transition from School to Work," Econometrica, 55(4), 801–817.


Table 1
Summary Statistics (Main Dataset)

                                                      Mean House Price*
(1)        (2)          (3)           (4)          (5)           (6)
Quarter    Number of    Outstanding   Default      Outstanding   All
           Loans        Loans         Rate         Loans         Loans
1997:1        93           93         0.00%        167.98        167.98
1997:2       355          351         1.14%         85.69         85.35
1997:3       591          575         2.09%         87.28         86.32
1997:4       925          892         1.91%         85.12         84.02
1998:1     1,435        1,366         2.64%         73.97         73.04
1998:2     1,878        1,775         1.92%         71.17         70.41
1998:3     2,224        2,078         2.07%         65.57         64.83
1998:4     2,486        2,267         3.22%         62.81         61.68
1999:1     2,486        2,153         5.29%         65.05         63.07
1999:2     2,486        2,022         6.48%         63.09         62.87
1999:3     2,486        1,946         3.91%         55.94         59.31
1999:4     2,486        1,837         5.93%         52.37         63.56
2000:1     2,486        1,738         5.70%         50.72         60.75
2000:2     2,486        1,699         2.30%         55.54         66.33
2000:3     2,486        1,616         5.14%         52.90         65.53
2000:4     2,486        1,565         3.26%         53.63         65.94
2001:1     2,486        1,532         2.15%         60.29         73.96
2001:2     2,486        1,496         2.41%         54.85         67.14
2001:3     2,486        1,458         2.61%         59.13         71.87
2001:4     2,486        1,425         2.32%         62.29         75.04
2002:1     2,486        1,404         1.50%         56.83         68.17
2002:2     2,486        1,389         1.08%         64.03         76.52
2002:3     2,486        1,369         1.46%         60.11         71.51
2002:4     2,486        1,338         2.32%         68.76         81.35
2003:1     2,486        1,321         1.29%         63.46         74.80
2003:2     2,486        1,307         1.07%         66.76         78.52
2003:3     2,486        1,296         0.85%         65.94         77.41
2003:4     2,486        1,289         0.54%         67.34         79.43
2004:1     2,486        1,279         0.78%         65.94         78.06
2004:2     2,486        1,270         0.71%         73.50         86.83

*1997 Millions of Colombian Pesos

Table 2
Estimation Results: Probability of Non-Default

                       Model I                Model II               Model III              Model IV
Coefficient            Estimate   Marginal    Estimate   Marginal    Estimate   Marginal    Estimate   Marginal
                                  Effect                 Effect                 Effect                 Effect
                                  % Points               % Points               % Points               % Points
Utility:
ζ1 (Price)              0.061      0.266       0.061      0.269       0.049      0.418       0.049      0.428
  (Std. Error)         (0.005)    (0.034)     (0.005)    (0.029)     (0.004)    (0.080)     (0.004)    (0.077)
ζ2 (Balance)           -0.328     -0.673      -0.329     -0.714      -0.355     -1.090      -0.355     -1.181
  (Std. Error)         (0.027)    (0.079)     (0.025)    (0.080)     (0.025)    (0.174)     (0.025)    (0.178)
ζ3 (Term)              -0.015     -0.102      -0.015     -0.102      -0.015     -0.228      -0.015     -0.230
  (Std. Error)         (0.002)    (0.018)     (0.002)    (0.017)     (0.002)    (0.045)     (0.002)    (0.043)
ζ4 (Income)              -          -           -          -          0.001      0.061       0.001      0.062
  (Std. Error)                                                       (0.000)    (0.029)     (0.000)    (0.029)
ζ5 (Equity)              -          -         -0.101       -           -          -         -0.168       -
  (Std. Error)                                (0.059)                                       (0.124)

Loan to Value:
α0 (Constant)           0.539       -          0.539       -          0.539       -          0.539       -
  (Std. Error)         (0.005)                (0.005)                (0.005)                (0.005)
α1 (Heterogeneity)      0.003       -          0.004       -          0.004       -          0.004       -
  (Std. Error)         (0.002)                (0.004)                (0.005)                (0.002)
α2 (Variance)           0.036       -          0.036       -          0.036       -          0.036       -
  (Std. Error)         (0.001)                (0.001)                (0.001)                (0.001)

Transition of ξ:
ρ0 (Constant)          -1.321       -         -1.309       -         -0.035       -         -0.041       -
  (Std. Error)         (0.231)                (0.223)                (0.023)                (0.024)
ρ1 (Lagged ξ)          -0.896       -         -0.895       -          0.123       -          0.160       -
  (Std. Error)         (0.156)                (0.154)                (0.068)                (0.075)
ρ2 (Variance)           0.249       -          0.281       -          0.001       -          0.001       -
  (Std. Error)         (0.065)                (0.067)                (0.001)                (0.001)

Variance(µ)             1.964       -          1.977       -          1.862       -          1.871       -

*The marginal effects are computed as the average marginal effect across all debtors with a 15 year mortgage one year after the mortgage started. We evaluate the aggregate shock at its mean. We compute the marginal effect of a 10% increase in each variable and a 1 quarter increase in term left.

Table 3
Counterfactual Policy Simulations: Default Rate

                                            Income Distribution Fixed for all t.
                                            Set Equal to Initial Distribution
Quarter    Baseline1    New Transition      Original        Updated
                        for Balances2       Transition3     Transition4
1997:2      3.64%        3.15%               3.64%           3.65%
1997:3      3.07%        2.50%               3.07%           3.08%
1997:4      2.29%        1.75%               2.30%           2.34%
1998:1      3.47%        2.52%               3.45%           3.50%
1998:2      2.16%        1.44%               2.15%           2.16%
1998:3      2.02%        1.28%               2.02%           2.04%
1998:4      2.96%        1.78%               2.95%           2.99%
1999:1      4.45%        2.58%               4.42%           4.50%
1999:2      5.97%        3.29%               5.96%           6.06%
1999:3      3.83%        1.89%               3.82%           3.89%
1999:4      6.50%        3.05%               6.51%           6.55%
2000:1      7.03%        2.98%               7.04%           7.14%
2000:2      3.72%        1.36%               3.72%           3.80%
2000:3      8.37%        2.93%               8.38%           8.52%
2000:4      6.42%        1.92%               6.44%           6.54%
2001:1      5.28%        1.41%               5.28%           5.40%
2001:2      5.69%        1.37%               5.68%           5.80%
2001:3      7.13%        1.54%               7.13%           7.20%
2001:4      9.22%        1.93%               9.20%           9.36%
2002:1      4.84%        0.65%               4.84%           4.85%
2002:2      4.65%        0.54%               4.65%           4.66%
2002:3      5.55%        0.57%               5.54%           5.60%
2002:4     10.77%        1.01%              10.74%          10.80%
2003:1      6.23%        0.43%               6.20%           6.26%
2003:2      5.85%        0.33%               5.85%           5.85%
2003:3      5.14%        0.24%               5.15%           5.08%
2003:4      4.14%        0.15%               4.15%           4.09%
2004:1      5.56%        0.18%               5.55%           5.51%
2004:2      6.25%        0.16%               6.26%           6.30%

1 Average default rate across debtors using the estimated transitions.

2 Let $b_t$ be the balance at time $t$ and let $L_t$ be the remaining periods until the mortgage ends. We show the average default rates across debtors when the transition for balances is replaced with the "standard" transition: $b_t = b_{t-1}(1 - 1/L_t)$.

3 Average default rate across debtors when the distribution of income is fixed at the initial one for all periods but consumers use the original estimated transition to form expectations.

4 Average default rate across debtors when the distribution of income is fixed at the initial one for all periods, consumers know this and adjust their transition accordingly.
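The "standard" transition in note 2 is deterministic, so the whole balance path follows from the initial balance and the term. The sketch below traces it with exact rational arithmetic. It assumes, as an interpretation not stated in the source, that $L_t$ counts the periods remaining including period $t$; the function name and the example loan are hypothetical.

```python
from fractions import Fraction


def balance_path(initial_balance, n_periods):
    """Apply b_t = b_{t-1} * (1 - 1/L_t) for a loan with n_periods left.

    Assumption: L_t is the number of periods remaining including period t,
    so it runs n_periods, n_periods - 1, ..., 1.
    """
    balances = [Fraction(initial_balance)]
    for remaining in range(n_periods, 0, -1):
        balances.append(balances[-1] * (1 - Fraction(1, remaining)))
    return balances


path = balance_path(100, 4)
print([int(b) for b in path])  # [100, 75, 50, 25, 0]
```

Under this reading the balance falls by the same amount every period and reaches zero at maturity, with no error term anywhere in the path.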

Figure 1
Default Rates: Baseline1 and Counterfactual Balances Transition2

[Line plot. Vertical axis: Default Rate, 0.00% to 12.00%. Horizontal axis: Quarter, 1 to 29. Series: Baseline, Counterfactual.]

1 Average default rate across debtors using the estimated transitions.

2 Let bt be the balance at time t and let Lt be the remaining periods until the mortgage ends. We show the average default rates across debtors when the tran

Figure 2
Default Rates: Baseline1 and Counterfactual Balances Transition Announced in Current Period2

[Line plot. Vertical axis: Default Rate, 0.00% to 12.00%. Horizontal axis: Quarter, 1 to 29. Series: Baseline, Counterfactual.]

1 Average default rate across debtors using the estimated transitions.

2 Let bt be the balance at time t and let Lt be the remaining periods until the mortgage ends. We show the average default rate across debtors when the estim
