Online Appendix A to the paper "Evidence for Relational Contracts in Sovereign Bank Lending"

Peter Benczur∗

Cosmin L. Ilut†

June 2014

1

Some further issues related to the construction of our default indicators

First we discuss the cross-default clauses that bank loan contracts included. These clauses stated that once a country enters into default, or into any repayment problem that constitutes a breach of the contract with one lender, this is treated as a default by the other creditors as well. According to Das et al. (2012),¹ during the 1980s, cross-default clauses in sovereign loan contracts protected banks from selective defaults on parts of a syndicated loan. For example, if a sovereign defaulted on a loan tranche towards a small bank, this triggered remedies on the loans towards other, possibly larger banks. According to Hurlock (1984),² one form of cross-default clause that was particularly troublesome to borrowers was a provision giving lenders the right to accelerate their debt if other lenders are capable of declaring a default, even if they have not done so. The inclusion of such a “capable of” clause makes the renegotiation agreement subject to the most restrictive provisions of any of the sovereign’s many loan agreements. Because of its restrictiveness, the “capable of” clause has been nominated as the worst clause in the Euromarkets. At the same time, as described by Buchheit and Reisner (1988),³ there were almost always limits to cross-default clauses in sovereign loan agreements.

Next we address the issue that some instances of our future default indicator are negative (see Table 1 of the main text). If we had the proper allocation of the arrear fragments (equation 9 of the main text), this would never happen. It can happen, however, whenever we attribute too little of an increase or too much of a decrease in arrears to a particular loan contract. In general, we view our measure as an approximation of creditor losses from non-repayment, and the fact that it can capture the impact of almost all fundamentals in the structural form shows that it is a successful measure. We nevertheless explored the frequency and role of such negative values in our exercise. Out of the 154 observations, 19 have a negative value. Their mean is -0.0186, with a standard deviation of 0.02. To check the robustness of our results, we reran our main structural-form regression with the negative values replaced by zero. The findings were very similar.

∗ Joint Research Center of the European Commission, Central European University and IE-CERS, HAS. email: [email protected]
† Duke University and NBER. Corresponding author, e-mail: [email protected]
¹ Das, Udaibir, Papaioannou, Michael and Trebesch, Christoph, Sovereign Debt Restructurings 1950-2010: Literature Survey, Data, and Stylized Facts, IMF Working Paper, 2012/203.
² Hurlock, James B., Advising Sovereign Clients on the Renegotiation of Their External Indebtedness, Columbia Journal of Transnational Law, 1984.
³ Buchheit, Lee C. and Reisner, Ralph, The Effect of the Sovereign Debt Restructuring Process on Inter-Creditor Relationships, University of Illinois Law Review, 1988.

2

Estimation issues

The reduced form in a panel framework is

$$s_{it} = \alpha + \beta R_t + \Gamma Z_{it} + c_i + \varepsilon_{it},$$

where $\varepsilon_{it}$ is the idiosyncratic error term, $Z_{it}$ are the economic fundamentals for country $i$ known at time $t$, and $c_i$ is the unobservable individual effect. Pooled OLS estimates are incompatible with individual country effects. The usual procedure to correct for these effects is a fixed or random effects estimator. A key assumption for both methods is strict exogeneity, which requires that the idiosyncratic error terms, conditional on the individual effect, are uncorrelated with past, present and future values of the regressors. If this fails, then all the classic panel data methods and specification tests are inconsistent. Formally, the strict exogeneity assumption means

$$E(\varepsilon_{it} \mid Z_{is}, c_i) = 0 \quad \text{for all } t \text{ and } s.$$

There are reasons to suspect that the assumption might fail, as any pricing error ($\varepsilon_{it}$) could affect the future values of certain indicators, like reserves, debt to GDP, etc. The strict exogeneity assumption is even more problematic and crucial in the structural form (equation (1) of the main text with the country effects included) than in the reduced form. As Keane and Runkle (1992) strongly point out,⁴ in this type of model there are never any strictly exogenous variables or instruments. This formal result comes from the link between the prediction error and the future values of the variables.

When there are concerns about strict exogeneity, the general approach is to use a transformation to remove the country effects $c_i$, and then search for instrumental variables, assuming only sequential exogeneity (Wooldridge, 2002).⁵ According to this assumption, the idiosyncratic errors, conditional on $c_i$, should be uncorrelated with the contemporaneous and past values of the regressors (instruments), but not necessarily with future values. In this respect, a first-difference (FD) estimator is attractive:

$$s_{it} - s_{i(t-1)} = \beta (R_t - R_{t-1}) + \Gamma (Z_{it} - Z_{i(t-1)}) + \varepsilon_{it} - \varepsilon_{i(t-1)}.$$

⁴ Keane, Michael P. and Runkle, David E., On the Estimation of Panel-Data Models with Serial Correlation When Instruments Are Not Strictly Exogenous, Journal of Business & Economic Statistics, 1992.
⁵ Wooldridge, Jeffrey, Econometric Analysis of Cross Section and Panel Data, 2002.


One can notice that if strict exogeneity fails, then there is a problem here in the reduced form as well, since $E(\varepsilon_{i(t-1)} \mid Z_{it}) \neq 0$. The sequential exogeneity assumption, however, implies that all the lags of $Z$ (or their linear combinations) can be used as potential instruments for $Z_{it} - Z_{i(t-1)}$, and then the estimation is consistent. In practice, this bias turned out to be negligible, while instrumenting decreased the precision of the estimates. For this reason, we report estimates without this extra instrumenting step.

For the structural form, FD would mean estimating

$$s_{it} - s_{i(t-1)} = \beta (R_t - R_{t-1}) + \bar{\lambda} (d_{it} - d_{i(t-1)}) + \Theta (X_{it} - X_{i(t-1)}) - \bar{\lambda} (\varepsilon_{2it} - \varepsilon_{2i(t-1)}) + \varepsilon_{1it} - \varepsilon_{1i(t-1)}.$$

However, the first-difference specification causes further complications. The rational expectations assumption guarantees that the prediction error $\varepsilon_{2it}$ is orthogonal to time $t$ information, but this is not true of $\varepsilon_{2i(t-1)}$. The remedy is to use $Z_{i(t-1)}$ or $Z_{i(t-2)}$ as instruments, as those variables are not correlated with any error at time $t$ or $t-1$.
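The instrument-validity argument for the reduced form in first differences can be written out as a short worked derivation. It is a sketch under the sequential exogeneity assumption stated above, $E(\varepsilon_{it} \mid Z_{is}, c_i) = 0$ for $s \le t$, using the law of iterated expectations to pass from conditional to unconditional moments.

```latex
\begin{align*}
E\left[ Z_{i(t-1)} \left( \varepsilon_{it} - \varepsilon_{i(t-1)} \right) \right]
  &= E\left[ Z_{i(t-1)} \varepsilon_{it} \right]
   - E\left[ Z_{i(t-1)} \varepsilon_{i(t-1)} \right]
   = 0 - 0 = 0, \\
E\left[ Z_{it} \left( \varepsilon_{it} - \varepsilon_{i(t-1)} \right) \right]
  &= E\left[ Z_{it} \varepsilon_{it} \right]
   - E\left[ Z_{it} \varepsilon_{i(t-1)} \right]
   = - E\left[ Z_{it} \varepsilon_{i(t-1)} \right].
\end{align*}
```

The first line shows why lagged levels are valid instruments for the differenced regressors. The second shows that the contemporaneous level is valid only if $Z_{it}$ is uncorrelated with the previous period's error, which is exactly what strict exogeneity would add and sequential exogeneity does not guarantee.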

3

Estimation in levels

Unfortunately, estimation in levels is plagued by several problems which render it unreliable. The presence of fixed effects calls for an FE or RE specification, where the strict exogeneity assumption fails, and indeed both suffer from a very significant rejection of the overidentification test. The instrumented future default risk appears very insignificant and the point estimate is much lower than the one resulting from the correctly specified first-differenced estimation. In the RE case, the point estimate on the future default indicator is closer to the first-difference estimate, even though it is much less significant. We attribute these findings to the failure of the strict exogeneity assumption. If instead we ignore fixed effects and run the pooled level IV, the results are broadly close to our benchmark first-difference specification: the instruments have a similar, though somewhat lower, first-stage explanatory power, and there is a significant extra effect of recent default. One significant difference is that now the overidentification test rejects the null with a p-value of 0.05. When the past default dummy is included as an extra right hand side variable, its coefficient is negative and not significant. The p-value of the overidentification test becomes even lower. Thus, this specification would suggest that the reduced-form effect of past distant default on the spread can be attributed entirely to future default risk, and not to a punishment channel. However, given the problems outlined earlier, we do not trust this level specification enough to draw a strong conclusion.

4

Supplementary tables

In the structural form, when we use the quantity-weighted average maturity instead of the spread as the dependent variable (Column 2), we find that a shorter maturity is predicted by: (i) a higher future default probability; (ii) a lower benchmark rate; and (iii) a larger US corporate spread. The effect of our recent default variable on maturity is negative, but not statistically significant. These results suggest that while maturity is another margin that is affected by the risk profile of the borrower, it is not significantly affected by the country’s recent default experience.


Table 1: Supplementary structural-form resultsᵃ

| | (1) Spreadᵇ | (2) Maturityᶜ | (3) Spreadᵈ | (4) Spreadᵉ | (5) Spreadᶠ |
|---|---|---|---|---|---|
| Future default | 0.38 (1.92)* | −2.492 (−2.19)** | 0.448 (1.97)* | 0.45 (1.23) | 0.325 (1.80)* |
| Benchmark yield | −0.11 (−12.38)*** | 0.44 (8.38)*** | −0.11 (−12.23)*** | −0.114 (−6.72)*** | −0.12 (−12.32)*** |
| US BAA/BBB spread | 0.223 (7.48)*** | −1.714 (−8.31)*** | 0.228 (7.70)*** | 0.216 (7.17)*** | 0.35 (7.29)*** |
| Recent default | 7.27 (1.74)* | −16.37 (−1.42) | 7.27 (1.80)* | 10.35 (1.80)* | 6.91 (1.96)* |
| First stage relevance:ᵍ | | | | | |
| Partial R² for future default | 0.258 | 0.258 | 0.255 | 0.249 | 0.252 |
| Kleibergen-Paap rk Wald F statisticʰ | 3.991 | 3.991 | 3.333 | 3.923 | 3.67 |
| Kleibergen-Paap rk Wald stat p-valⁱ | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Structural form: | | | | | |
| Overidentification test p-valʲ | 0.739 | 0.342 | 0.74 | 0.22 | 0.688 |
| Anderson-Rubin Wald test p-valᵏ | 0.07 | 0.000 | 0.061 | 0.002 | 0.048 |
| Number of observations | 154 | 154 | 154 | 120 | 154 |

ᵃ The dependent and explanatory variables are first differenced. All columns refer to Fuller(1). The future default variable is instrumented by the first lag of the following variables: debt/GDP, reserves to imports, GDP growth, investment growth, GDP per capita, proportion of countries with arrears in the region, experience, new sovereign dummy and a distant default dummy. The t statistics are in parentheses; the standard errors are corrected for clustering at the country level. *, **, *** denote 0.1, 0.05 and 0.01 significance levels, respectively.
ᵇ Column 1 uses only quantities as weights for the spreads.
ᶜ The dependent variable in Column 2 is maturity.
ᵈ Column 3 includes a 3 year grace period in computing the future default.
ᵉ Column 4 uses as instruments the second lag of the variables used in the benchmark specification.
ᶠ Column 5 uses the US BBB-rated corporate spread computed by Altman (1989).
ᵍ The reduced-form regression of the instrumented indicator on the full set of instruments.
ʰ The F statistic of the joint significance of excluded instruments.
ⁱ The Kleibergen-Paap rk Wald test of the null hypothesis that the equation is underidentified.
ʲ The Hansen J-statistic test of the null hypothesis that the instruments are uncorrelated with the error term and that the excluded instruments are correctly excluded from the structural equation.
ᵏ The weak-instrument robust Anderson-Rubin test of the null hypothesis that the coefficient on the endogenous regressor in the structural equation is zero.


Table 2: Detailed country-default indicators

| Country | Nᵃ | Future default: Meanᶜ | Future default: Stdᵈ | Recent defaultᵇ: Mean | Recent defaultᵇ: Std |
|---|---|---|---|---|---|
| Algeria | 4 | 0.002 | 0.001 | 0 | 0 |
| Argentina | 6 | 0.322 | 0.198 | 0 | 0 |
| Bolivia | 4 | 0.573 | 0.313 | 0.27 | 0.16 |
| Brazil | 8 | 0.037 | 0.038 | 4.09 | 2.58 |
| Cameroon | 2 | 0.531 | 0.44 | 0 | 0 |
| Chile | 2 | 0 | 0 | 0 | 0 |
| Colombia | 6 | 0.03 | 0.044 | 0.01 | 0.01 |
| Costa Rica | 5 | 0.685 | 0.603 | 0.1 | 0.2 |
| Ecuador | 6 | 0.228 | 0.29 | 0.01 | 0.04 |
| Gabon | 7 | 0.024 | 0.054 | 0.02 | 0.03 |
| India | 4 | 0 | 0 | 0 | 0 |
| Indonesia | 6 | 0.001 | 0.001 | 0 | 0 |
| Ivory Coast | 8 | 0.193 | 0.293 | 0 | 0 |
| Jamaica | 5 | 0.053 | 0.107 | 0.53 | 0.5 |
| Jordan | 3 | 0.012 | 0.015 | 0 | 0 |
| Liberia | 2 | 0.958 | 0.287 | 0 | 0 |
| Malawi | 2 | 0.051 | 0.1 | 0 | 0 |
| Malaysia | 7 | 0 | 0 | 0 | 0 |
| Mauritius | 1 | 0 | . | 0 | . |
| Mexico | 6 | 0.001 | 0.001 | 0.01 | 0.01 |
| Morocco | 6 | 0.049 | 0.13 | 0.4 | 0.08 |
| Nicaragua | 2 | 0.112 | 0.145 | 0 | 0 |
| Nigeria | 1 | 0.69 | . | 0 | . |
| Oman | 3 | −0.02 | 0.038 | 0 | 0 |
| Pakistan | 3 | 0.001 | 0.001 | 0 | 0 |
| Panama | 8 | 0.291 | 0.474 | 0.04 | 0.1 |
| Papua NG | 2 | 0 | 0 | 0 | 0 |
| Peru | 5 | 0.814 | 0.971 | 0.01 | 0.01 |
| Philippines | 8 | 0.054 | 0.073 | 0.04 | 0.03 |
| Senegal | 2 | −0.001 | 0.006 | 0 | 0 |
| Sri Lanka | 2 | 0.011 | 0.004 | 0 | 0 |
| Thailand | 6 | 0 | 0 | 0 | 0 |
| Tunisia | 2 | 0.002 | 0.001 | 0.22 | 0.07 |
| Turkey | 3 | 0.001 | 0.001 | 0.01 | 0.02 |
| Uruguay | 4 | 0 | 0 | 0 | 0 |
| Venezuela | 3 | −0.024 | 0.001 | 2.4 | 2.1 |

ᵃ N refers to the number of observations for each country. Some countries have only one observation because this sample is obtained after first-differencing.
ᵇ The numbers for recent default are reported as 100 times the underlying values.
ᶜ The mean for each variable is calculated across the N observations for each particular country.
ᵈ The standard deviation for each variable is calculated across the N observations for each particular country.


A Simulation Exercise

Online Appendix B to the paper "Evidence for Relational Contracts in Sovereign Bank Lending"

Peter Benczur∗

Cosmin L. Ilut†

June 2014

1

Strategy

We generate artificial fundamental data, default behavior and sovereign spreads, matching our empirical setup as closely as possible:

• The spread is priced by potentially risk-averse investors, who form conditional expectations about future repayments.
• Future 'repayment difficulties' are driven by some random fundamentals, potentially recent default (representing the signaling hypothesis), and the current pricing error ('endogeneity').
• The proportion of debt that might be in default is uncertain, and it increases with the repayment difficulty. This allows for a continuous realized repayment indicator.
• There is also a potential ad hoc extra ('punishment') component, simply added to the spread (priced in line with the previous point).

The true spread is in general a highly nonlinear function of the distribution of future repayment difficulties, which would be hard to estimate. Instead, we utilize approximate pricing equations. We work with three cases: a linear case, which can be derived as an 'arch-approximation' of the nonlinear pricing equation (see details later); the full nonlinear pricing equation; and (just briefly) an interim case of a nonlinear approximation (2-piece arch approximation, see details later). Our strategy is to generate spreads using one of these three cases, which we then estimate by a linear regression. In the third (interim) case, we can also estimate the true data generating process by adding various (conditional) expectations as right hand side regressors.

We do not derive a full blown theory which would lead to such a punishment component (as we argue in many places, that is still up for the literature) or a strategic default decision of the borrower. Our 'statistical model' is also purely illustrative: we do not seriously calibrate it to any data moments or properties, though in some cases certain parameters are chosen in accordance with the data (repayment difficulty frequency, realized haircut, importance of the recent default effect, slope coefficient of realized default, etc.). In our specification, we strive for the minimum necessary complexity (a single fundamental factor, for example), while still allowing us to demonstrate the main methodological properties of our errors-in-variables approach:

• Our method can estimate a meaningful slope coefficient of (expected) future default.
• Our method can tell apart punishment from signaling, even when we use only a subset of available instruments.
• Strict/sequential exogeneity is indeed an issue: in first differences, one needs to apply precisely those instruments that we do (second lags), and their level is indeed more useful than their difference.¹
• Our method can handle the endogeneity of the pricing error term.
• Our method produces good estimates of the impact of future default when risk aversion fluctuates, even in an endogenous fashion (driven by recent default). It might be difficult, however, to separate an increase in risk aversion in response to a recent default from a direct punishment effect.
• Our linear approximation performs well, in the following two senses: (1) when the true pricing equation is the 2-piece arch version, its linearized version (1-piece arch) leads to an estimated expected future repayment difficulty indicator very close to its true value, and (2) when we run a linear regression specification on spreads generated by the nonlinear pricing equation, the implied risk-aversion coefficient coming from our estimates is close to the true value.
• Finally, the finding of an economically meaningful extra effect of recent default is not the consequence of the approximation errors of our linear specification.

2

Setup

2.1

Realized default

There is a single country, and its fundamental evolves as

$$x_t = \rho x_{t-1} + \sigma_x \epsilon_t. \tag{1}$$

The proportion ($z$) of debt that might be in default is uncertain, and it decreases with the fundamental.² The haircut (one minus the recovery rate) is $\lambda$. Realized default at $t+1$ is then

$$d_{t+1} = \lambda z_{t+1}, \tag{2}$$

where $z_{t+1}$ has a distribution

$$z_{t+1} = \begin{cases} 1 & \text{if } x_t \le \bar{x}_t - \frac{1}{\alpha_2}, \\ -\alpha_2 (x_t - \bar{x}_t) & \text{if } \bar{x}_t - \frac{1}{\alpha_2} < x_t < \bar{x}_t, \\ 0 & \text{if } x_t \ge \bar{x}_t, \end{cases} \tag{3}$$

$\bar{x}_t$ is the default cutoff point and $\alpha_2$ is a parameter. Here we take $x$ as a fundamental that positively influences the no-default probability (like GDP, or reserves). Whenever it is very high, there is no repayment difficulty. Whenever it is very low, there is “full non-repayment”, which still involves a potentially nonzero recovery rate. For intermediate values of $x$, there is some interior value of repayment difficulties. Then

$$\begin{aligned} E_t[z_{t+1} \mid x_{t-1}] &= \Pr\left(x_t \le \bar{x}_t - \tfrac{1}{\alpha_2} \,\middle|\, x_{t-1}\right) \\ &\quad - \alpha_2 E_t\left[(x_t - \bar{x}_t) \,\middle|\, \bar{x}_t - \tfrac{1}{\alpha_2} < x_t < \bar{x}_t,\ x_{t-1}\right] \Pr\left(\bar{x}_t - \tfrac{1}{\alpha_2} < x_t < \bar{x}_t \,\middle|\, x_{t-1}\right) \\ &= F\left(\bar{x}_t - \tfrac{1}{\alpha_2}\right) - \alpha_2 E_t\left[(x_t - \bar{x}_t) \,\middle|\, \bar{x}_t - \tfrac{1}{\alpha_2} < x_t < \bar{x}_t,\ x_{t-1}\right] \left(F(\bar{x}_t) - F\left(\bar{x}_t - \tfrac{1}{\alpha_2}\right)\right), \end{aligned}$$

where $F$ is the cumulative distribution function of $x_t$ (conditional on $x_{t-1}$).

¹ Notice that we do not demonstrate the fixed effect argument, as that would require the generation of multiple country Monte Carlo samples in a parallel fashion, which would become too complicated. This means that there is no true reason for using first differences; we only do it to illustrate the issue of sequential exogeneity.
² The negative of the fundamental can be viewed as repayment difficulty. With more fundamentals, repayment difficulty would be some (potentially linear) function of fundamentals.
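Equations (1) to (3) and the conditional expectation above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code. The function names are hypothetical, the defaults follow the baseline values in the parameter table of this appendix ($\bar{x} = 0$, $\alpha_2 = 1$, $\rho = 0.4$, $\sigma_x = 1$), and the closed form assumes that $\epsilon_t$ is standard normal, which the text does not state explicitly.

```python
import math
import random


def normal_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def repayment_difficulty(x_t, x_bar=0.0, alpha2=1.0):
    """Equation (3): share z_{t+1} of debt in default, given the fundamental x_t."""
    return min(1.0, max(0.0, -alpha2 * (x_t - x_bar)))


def expected_difficulty(x_prev, x_bar=0.0, alpha2=1.0, rho=0.4, sigma_x=1.0):
    """E_t[z_{t+1} | x_{t-1}] when the shock in (1) is N(0, 1) (an assumption).

    With u = alpha2 * (x_bar - x_t) ~ N(m, s^2), z = min(1, max(0, u)), so
    E[z] = E[max(u, 0)] - E[max(u - 1, 0)].
    """
    m = alpha2 * (x_bar - rho * x_prev)
    s = alpha2 * sigma_x
    part_above_0 = m * normal_cdf(m / s) + s * normal_pdf(m / s)
    part_above_1 = (m - 1.0) * normal_cdf((m - 1.0) / s) + s * normal_pdf((m - 1.0) / s)
    return part_above_0 - part_above_1


def simulated_difficulty(x_prev, draws=100_000, seed=0, rho=0.4, sigma_x=1.0):
    """Monte Carlo counterpart of expected_difficulty (baseline x_bar and alpha2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        x_t = rho * x_prev + sigma_x * rng.gauss(0.0, 1.0)
        total += repayment_difficulty(x_t)
    return total / draws
```

Comparing `expected_difficulty(x)` with `simulated_difficulty(x)` for a few values of `x` is a quick way to check that the piecewise definition and the closed-form expectation agree.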

2.2

Pricing at issue (time t)

The investor has a total wealth of $W + 1$, of which he has to hold one unit in the form of a risky loan. Let us first assume that there is either a partial default on the principal ($z = 1$, with a haircut of $\lambda$) or full repayment. The risky loan thus pays either the riskless rate plus a premium, $1 + r + s = 1 + R$, with probability $1 - p$, or $(1 - \lambda) + R$, with probability $p$. Expected utility is then

$$p\,E\,U(W + 1 - \lambda + r + s) + (1 - p)\,U(W + 1 + r + s).$$

The spread $s$ has to be such that the investor is indifferent between this portfolio and the riskless one:

$$p\,E\,U(W + 1 - \lambda + r + s) + (1 - p)\,U(W + 1 + r + s) = U(W + 1 + r).$$

Let us work with a CARA utility function of the form $-e^{-aW}$. Then the indifference equation becomes

$$e^{-a(W + 1 + r)} = p\,E\,e^{-a(W + r + s + 1 - \lambda)} + (1 - p)\,e^{-a(W + 1 + r + s)},$$

$$e^{as} = p\,E\,e^{a\lambda} + (1 - p),$$

$$s = \frac{\ln\left(1 - p + p\,E\,e^{a\lambda}\right)}{a} = \frac{1}{a} f(p).$$

This gives us back $s = p\lambda$ in the limiting case of $a = 0$ (risk neutrality). Now let us assume that repayment difficulty $z$ is a continuous variable, there is a haircut (or loss given default) $\lambda$, and risk aversion can fluctuate over time. The same calculation then yields

$$s_t = \frac{1}{a_t} \ln\left(E_t\left(e^{a_t z_{t+1} \lambda}\right)\right). \tag{4}$$

Figure 1: The 1- and 2-piece arch approximations

Summing up:

• Realized losses can be computed from (2) and (3).
• The lender observes the history of fundamentals $x^{t-1}$, default realizations $z^t$ and the realization of risk aversion $a_t$. The information set is thus $I_t = \{x^{t-1}, z^t, a_t\}$.
• Using (1) and (3), the lender forms the conditional probability distribution $m(z_{t+1})$ for $z_{t+1}$, denoted for future reference by $m_t$.
• The loan is priced based on the distribution $m_t$ and the realized value of risk aversion $a_t$, with a pricing error $\nu_t \sim N(0, \sigma_p)$ added.
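The indifference condition and the resulting spread formulas of this subsection can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the haircut $\lambda$ is treated as non-random (so the expectation over $\lambda$ drops out), and the distribution of $z_{t+1}$ in equation (4) is assumed to be discrete for simplicity.

```python
import math


def cara_utility(wealth, a):
    """CARA utility U(W) = -exp(-a W)."""
    return -math.exp(-a * wealth)


def binary_default_spread(p, lam, a):
    """Spread s = ln(1 - p + p e^{a lam}) / a; the a = 0 limit is p * lam."""
    if a == 0.0:
        return p * lam
    # log1p and expm1 keep the formula accurate for small a
    return math.log1p(p * math.expm1(a * lam)) / a


def continuous_default_spread(z_values, probs, lam, a_t):
    """Equation (4) for a discrete distribution of z_{t+1} (an assumption)."""
    expectation = sum(q * math.exp(a_t * z * lam) for z, q in zip(z_values, probs))
    return math.log(expectation) / a_t
```

Plugging the spread from `binary_default_spread` back into the two sides of the indifference equation should give equal expected utilities, and letting `a` approach zero should reproduce the risk-neutral spread `p * lam`.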

2.3

A linear pricing formula

Now let us modify the pricing equation to allow for its empirical estimation. A first-order approximation of the nonlinear formula (4) is

$$s_t = \frac{1}{a_t} \ln\left(1 + E_t\left(e^{a_t z_{t+1} \lambda} - 1\right)\right) \approx \frac{1}{a_t} E_t\left(e^{a_t z_{t+1} \lambda} - 1\right) \approx \frac{e^{a_t \lambda} - 1}{a_t} E_t[z_{t+1}]. \tag{5}$$

Here the first approximation step is $\ln(1 + x) \approx x$. The second step is obtained by replacing the nonlinear function $e^{az\lambda} - 1$ (as a function of $z$) by the arch connecting its starting point (zero) and its ending point ($e^{a\lambda} - 1$), see figure 1, and calculating its expected value using the linear function $z\left(e^{a\lambda} - 1\right)$. The virtue of this formula is that it describes the spread as a linear function of the expected repayment difficulty $z_{t+1}$, with the slope coefficient depending on risk aversion and the haircut. Then our errors-in-variables (EIV) method can replace the expected repayment difficulty with its realization, using instrumental variables: we can estimate

$$s_t = \frac{e^{a_t \lambda} - 1}{a_t} \cdot E_t[z_{t+1}] + \upsilon_t$$

by running

$$s_t = \frac{e^{a_t \lambda} - 1}{a_t} \cdot z_{t+1} + \tilde{\upsilon}_t \tag{6}$$

and using any function of $x_{t-1}$ as an instrument.
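The two approximation steps behind the linear formula can be illustrated numerically. The following sketch uses hypothetical function names and assumes a discrete distribution for $z_{t+1}$. It computes the slope $(e^{a\lambda} - 1)/a$ and compares the exact spread from equation (4) with its linear approximation.

```python
import math


def linear_slope(a, lam):
    """Slope (e^{a lam} - 1) / a of the linear pricing formula; equals lam at a = 0."""
    return lam if a == 0.0 else math.expm1(a * lam) / a


def exact_spread(z_values, probs, lam, a):
    """Nonlinear spread from equation (4) for a discrete distribution of z."""
    expectation = sum(q * math.exp(a * z * lam) for z, q in zip(z_values, probs))
    return math.log(expectation) / a


def linear_spread(z_values, probs, lam, a):
    """Linear approximation (5): slope times expected repayment difficulty."""
    expected_z = sum(q * z for z, q in zip(z_values, probs))
    return linear_slope(a, lam) * expected_z
```

For $a > 0$ the linear formula lies weakly above the exact spread, because $\ln(1 + y) \le y$ and the arch lies above the convex function $e^{az\lambda} - 1$ on $[0, 1]$. With $a = 1$ and $\lambda = 0.3$, `linear_slope` returns the value 0.3498588 quoted in the parameter discussion of this appendix.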

2.4

An interim case: a nonlinear formula where we can still estimate the true DGP

Due to its high degree of nonlinearity, it is not obvious how to estimate the general nonlinear pricing formula directly. Consequently, when estimating the linear approximation, we cannot assess the size of the approximation error. As a remedy, we can replace the full nonlinear formula with a “second order approximation” (as we will see: a two-piece arch approximation), postulate that as the true pricing equation, and compare it to its linear approximation (which is the same as before). Let us eliminate the nonlinear function outside the expectation in (4):

$$s_t \approx \frac{1}{a_t} E_t\left(e^{a_t z_{t+1} \lambda} - 1\right) + \upsilon_t,$$

and then replace it with the two-piece arch approximation $H_2(z)$ (see figure 1). In formulas:

$$H_2(z) = \begin{cases} a_1 + b_1 z & \text{in } [0, \bar{z}], \\ a_2 + b_2 z & \text{in } [\bar{z}, 1]. \end{cases}$$

The four parameters are such that $H_2(0) = e^{a_t \lambda \cdot 0} - 1 = 0$, $H_2(\bar{z}) = e^{a_t \lambda \bar{z}} - 1$ (using both segments of $H_2$), and $H_2(1) = e^{a_t \lambda} - 1$. We have four equations for four unknowns. Solving them, and absorbing the factor $1/a$ from the pricing equation into the coefficients, yields

$$a_1 = 0, \quad b_1 = \frac{e^{a \bar{z} \lambda} - 1}{a \bar{z}}, \quad a_2 = \frac{e^{a \bar{z} \lambda} - \bar{z} e^{a \lambda}}{a (1 - \bar{z})} - \frac{1}{a}, \quad b_2 = \frac{e^{a \lambda} - e^{a \bar{z} \lambda}}{a (1 - \bar{z})}.$$

For the cutoff value $\bar{z}$, we choose the median of the internal values for $z$ (in $(0, 1)$). If the true pricing formula is $H_2(z)$, then

$$s_t = b_1 E_t[z_{t+1}] + E_t\left[\left(a_2 + (b_2 - b_1) z_{t+1}\right) \chi(z_{t+1} > \bar{z})\right] + \upsilon_t. \tag{7}$$

Here $\chi(y_t < \bar{y})$ is the indicator function that the random variable $y_t$ is less than $\bar{y}$. Equation (7) can be estimated by running

$$s_t = b_1 z_{t+1} + a_2 \chi(z_{t+1} > \bar{z}) + (b_2 - b_1) z_{t+1} \chi(z_{t+1} > \bar{z}) + \tilde{\upsilon}_t \tag{8}$$

and using (various) nonlinear transformations of $x_{t-1}$ as instruments. It would be a separate paper to explore the good (or even optimal) choice of the instruments and/or the cutoff value $\bar{z}$. The terms in (8) suggest variables like $\chi(x_{t-1} < \tilde{x})$ and $x_{t-1} \chi(x_{t-1} < \tilde{x})$ for some cutoff $\tilde{x}$, but there is no a priori guarantee that those would be sufficiently correlated with the terms in (8).
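The two-piece arch can be made concrete with a short sketch. The function names are hypothetical, and the coefficients are the ones implied by the four matching conditions once the factor $1/a$ from the pricing equation is absorbed, so that $H_2$ is compared with $(e^{a\lambda z} - 1)/a$. This should be read as one consistent reading of the formulas in the text rather than as the authors' implementation.

```python
import math


def two_piece_coefficients(a, lam, z_bar):
    """Coefficients (a1, b1, a2, b2) of H2, scaled by 1/a as in the pricing equation."""
    e_mid = math.exp(a * lam * z_bar)   # e^{a * lam * z_bar}
    e_end = math.exp(a * lam)           # e^{a * lam}
    a1 = 0.0
    b1 = (e_mid - 1.0) / (a * z_bar)
    a2 = (e_mid - z_bar * e_end) / (a * (1.0 - z_bar)) - 1.0 / a
    b2 = (e_end - e_mid) / (a * (1.0 - z_bar))
    return a1, b1, a2, b2


def h2(z, a, lam, z_bar):
    """Two-piece arch approximation evaluated at z in [0, 1]."""
    a1, b1, a2, b2 = two_piece_coefficients(a, lam, z_bar)
    return a1 + b1 * z if z <= z_bar else a2 + b2 * z


def scaled_target(z, a, lam):
    """The function being approximated: (e^{a lam z} - 1) / a."""
    return math.expm1(a * lam * z) / a
```

By construction `h2` coincides with `scaled_target` at $z = 0$, $z = \bar{z}$ and $z = 1$, both segments agree at $\bar{z}$, and for $a > 0$ the approximation lies weakly above the target in between.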

2.5

Implementing various scenarios

In the benchmark case, there is no signaling. This means that the default cutoff point in formula (3) is constant:

$$\bar{x}_t = \bar{x}.$$

The pricing equation is thus (5), with $a_t = a$:

$$s_t = \frac{e^{a\lambda} - 1}{a} E_t[z_{t+1}] + \nu_t,$$

and so

$$s_t = g(x_{t-1}) + \nu_t,$$

since $x_{t-1}$ is a sufficient statistic for $E_t[z_{t+1}]$. Under risk neutrality ($a = 0$), the pricing equation becomes

$$s_t = \lambda E_t[z_{t+1}] + \nu_t.$$

In case of signaling, we postulate that the default cutoff point is a function of past default:

$$\bar{x}_t = \bar{x} + \gamma z_{t-1},$$

or

$$\bar{x}_t = \bar{x} + \gamma_2 z_{t-1}^2$$

for strong signaling. This latter scenario features a strong nonlinearity in default probabilities: large arrears (recent default) lead to much larger increases in (re-)default probabilities. In both cases, pricing becomes

$$s_t = g(x_{t-1}, z_{t-1}) + \nu_t.$$

Under constant risk aversion, we again have that

$$s_t = \frac{e^{a\lambda} - 1}{a} E_t[z_{t+1}] + \nu_t.$$

In case of risk neutrality, the coefficient becomes $\lambda$. To implement punishment, we simply write

$$s_t = g(x_{t-1}, z_{t-1}) + \nu_t + \beta z_{t-1},$$

and so

$$s_t = \frac{e^{a\lambda} - 1}{a} E_t[z_{t+1}] + \beta z_{t-1} + \nu_t.$$

The endogeneity of future default can be implemented similarly to signaling. In this case the default threshold becomes a function of the current pricing error:

$$\bar{x}_t = \bar{x} + \gamma_p \nu_t.$$

The underlying assumption here is that the investor knows the value of the pricing error and so he takes it into account when calculating expected repayment difficulties. This means that there is a correlation between the current pricing error and $E_t[z_{t+1}]$ as well, since $E_t[z_{t+1}]$ depends on $\nu_t$.

Finally, we consider two versions of time-varying risk aversion. In the first one, changes in risk aversion are purely exogenous:

$$a_t = a + \xi_t, \qquad \xi_t = \begin{cases} \bar{\xi} & \text{with probability } 0.5, \\ -\bar{\xi} & \text{with probability } 0.5. \end{cases} \tag{9}$$

In the other, risk aversion also responds to past default:

$$a_t = a + \xi_t + \kappa z_{t-1},$$

where $\xi_t$ still follows (9).

Note that our structural-form regression would always replace the expected value of future default with its realization and instrument it with time $t$ information. The obvious candidate is the fundamental $x_{t-1}$ itself, but in fact any of its (nonlinear) functions are also good candidates: they are all orthogonal to the prediction error, though they might have too little extra predictive power for future default. In the reduced form, the linear and the quadratic terms were always significant, and their first-stage fit was also sufficiently strong. Having at least two instruments means that our equation is in general overidentified, allowing a non-trivial role for the overidentification test.³ Based on these considerations, we will be using $x_{t-1}$ and $x_{t-1}^2$ as instruments.
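The scenario switches described in this subsection amount to a few one-line rules. The sketch below is illustrative: the function names are hypothetical, the defaults follow the parameter table of this appendix, and combining signaling, strong signaling and endogeneity in a single threshold function is a convenience assumption (the text treats them as separate scenarios, which corresponds to setting the unused coefficients to zero).

```python
import random


def default_threshold(x_bar=0.0, z_prev=0.0, nu_t=0.0,
                      gamma=0.0, gamma2=0.0, gamma_p=0.0):
    """Default cutoff; nonzero coefficients switch on a scenario.

    gamma: signaling, gamma2: strong signaling, gamma_p: endogeneity.
    """
    return x_bar + gamma * z_prev + gamma2 * z_prev ** 2 + gamma_p * nu_t


def risk_aversion(rng, a=1.0, xi_bar=0.1, kappa=0.0, z_prev=0.0):
    """a_t = a + xi_t + kappa * z_{t-1}, where xi_t is +xi_bar or -xi_bar, each w.p. 0.5."""
    xi_t = xi_bar if rng.random() < 0.5 else -xi_bar
    return a + xi_t + kappa * z_prev


def punished_spread(base_spread, z_prev, beta=0.1):
    """Punishment: an ad hoc term beta * z_{t-1} added to the priced spread."""
    return base_spread + beta * z_prev
```

For example, `default_threshold(z_prev=0.5, gamma=0.1)` gives the signaling cutoff after a past default of 0.5, while `default_threshold(z_prev=0.5, gamma2=0.4)` gives the strong-signaling cutoff for the same past default.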

2.6

Parameter values

Table 1 reports the numerical values for all the parameters of our simulation exercise. The parameters were chosen such that (1) we get a reasonable reduced-form (RF) fit, (2) we get sufficiently significant and precise point estimates in all structural-form (SF) regressions, and (3) future default is an important determinant of the spread, relative to the pricing error. Some of the parameters, however, also reflect certain data patterns. We chose risk aversion to be $a = 1$, which is in line with standard values. The haircut is then $\lambda = 0.3$, implying that the linear approximation yields a future default coefficient similar to the one in our data (approximately 0.379 under 2SLS, and 0.41 under Fuller(1) and LIML, while in our simulation it is 0.3498588). The value of $\lambda = 0.3$ is also similar to the average haircut of 0.37 found by Cruces and Trebesch (2013). The default threshold determinants $\bar{x}$ and $\alpha_2$ (under no signaling or endogeneity) are such that the frequency of repayment problems (measured by recent default being non-zero) is like in our data.⁴

3

Results for different scenarios

First we cover the cases where the true pricing equation is linear. We first consider constant risk aversion, and look at all four combinations with or without punishment and signaling (both normal and strong). We then repeat these exercises in first differences (FD), and also under the assumption that the pricing error influences future default (endogeneity). Finally, we look at cases when risk aversion is time varying. After that, we switch to cases when the true pricing equation is nonlinear (either completely, or just the two-piece arch approximation), but we are still estimating its linear approximation.⁵

We usually report reduced-form (RF) results, where the spread is regressed on the set of fundamentals corresponding to the case at hand, followed by structural-form (SF) results, where the right hand side contains realized future default ($z_{t+1}$) and potential additional terms. Since the F-test of the joint significance of the variables would always yield a p-value of 0.00, we do not report this statistic. All tables report the t-statistics in parentheses. Since this is a controlled experiment, we do not need to adjust for any clustering or heteroscedasticity. In general, the RF equations take the form of

$$s_t = \theta_x x_{t-1} + \theta_d^{rf} z_{t-1} + \tilde{v}_t,$$

where $x_{t-1}$ refers to any information available at the time of pricing (in general, it will contain $x_{t-1}$ and $x_{t-1}^2$), while the SF is

$$s_t = \theta z_{t+1} + \theta_d z_{t-1} + \tilde{v}_t,$$

where realized future default $z_{t+1}$ is instrumented by $x_{t-1}$. As explained earlier, the true values should be $\theta = 0.3498588$ whenever the pricing equation is linear, and $\theta_d = 0.1$ whenever there is punishment. In all tables, we use both $x_{t-1}$ and $x_{t-1}^2$ as instruments (fundamentals); results would be practically identical when using $x_{t-1}$ only. In some cases, we also report structural-form results using the analytically computed value of expected repayment difficulties; such specifications can be estimated by OLS in most cases. Notice that in many cases this means estimating the true data generating process itself. To simplify the terminology, we refer to $z_{t-1}$ as past default and to $z_{t+1}$ as future default. When referring to our empirical exercise in the main paper, we label it as "actual", in order not to confuse it with the true data generating process of the simulation.

Table 1: Parameter values

| Variable name | Value |
|---|---|
| Haircut | $\lambda = 0.3$ |
| Risk-aversion | $a = 1$ |
| Default threshold parameters (baseline) | $\bar{x} = 0$, $\alpha_2 = 1$ |
| The impact of past default on the threshold (signaling) | $\gamma = 0.1$ |
| The impact of past default on the threshold (strong signaling) | $\gamma_2 = 0.4$ |
| The impact of the pricing error on the threshold (endogeneity) | $\gamma_p = 1$ |
| The impact of past default on risk-aversion | $\kappa = 1$ |
| Exogenous shock to risk-aversion | $\bar{\xi} = 0.1$ |
| AR coefficient of the fundamental process | $\rho = 0.4$ |
| Standard deviation of the fundamental error term | $\sigma_x = 1$ |
| Standard deviation of the pricing error | $\sigma_p = 0.05$ |
| Impact of past default on the spread (punishment) | $\beta = 0.1$ |
| Sample size | $N = 20{,}000$ |

³ In our actual exercise, we had enough fundamentals for overidentification without having to consider nonlinear transforms.
⁴ In our data, the share of nonzero values is around 0.23 for recent default and around 0.8 for future default, averaging to 0.5. This is the sample frequency of recent default being non-zero in our simulation. Under signaling, this increases to 0.52 for $z_{t-1} = 0.5$ and to 0.54 for $z_{t-1} = 1$, while the corresponding values for strong signaling are 0.54 and 0.65.
⁵ Scenarios with strong signaling were added to the exercise at a later stage. This implies that the realizations of the underlying random variables are different under strong signaling than in all the other scenarios. To prevent the multiplication of scenarios, we only considered strong signaling for constant risk aversion, linear pricing, levels and first differences, and the general nonlinear DGP.


Table 2: Linear pricing, constant risk-aversion, levels; no signaling or punishment

LHS variable: $s_t$

| | RF (1) | RF (2) | SF (3) | SF (4) | SF (5) | SF (6) |
|---|---|---|---|---|---|---|
| $z_{t-1}$ | −0.001 (−0.63) | | | | | −0.003 (−0.96) |
| $x_{t-1}$ | −0.045 (−130.29) | −0.044 (−138.65) | | | | |
| $x_{t-1}^2$ | 0.004 (18.78) | 0.004 (18.77) | | | | |
| $E_t z_{t+1}$ (or $z_{t+1}$) | | | 0.349 (140.85) | 0.346 (49.16) | 0.346 (49.16) | 0.349 (45.83) |
| const | 0.110 (211.76) | 0.110 (255.42) | 0.000 (0.16) | −0.000 (−0.08) | −0.000 (−0.07) | −0.000 (−0.08) |
| $R^2$ / partial $R^2$ | 0.50 | 0.50 | 0.50 | 0.12 | 0.12 | 0.11 |
| overid | | | | 0.95 | 0.39 | 0.33 |

Notes. Column 3 uses the true expected value $E_t z_{t+1}$ as a RHS variable, and the regression is OLS. In all other columns, row 4 refers to realized default $z_{t+1}$. In Columns 4 and 6, $x_{t-1}$ and $x_{t-1}^2$ are used as instruments. In Column 5, $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ are used as instruments.
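The kind of exercise summarized in this table can be sketched with the standard library alone. The code below is a simplified illustration, not a replication: the function names are hypothetical, the shock to the fundamental is assumed to be standard normal, the baseline values $\bar{x} = 0$ and $\alpha_2 = 1$ are hard-coded, and only $x_{t-1}$ is used as an instrument (a just-identified IV, so no overidentification test is available).

```python
import math
import random


def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def expected_z(x_prev, rho, sigma_x):
    """E_t[z_{t+1} | x_{t-1}] for x_bar = 0, alpha2 = 1 and a N(0, 1) shock (assumption)."""
    m = -rho * x_prev
    upper = m * normal_cdf(m / sigma_x) + sigma_x * normal_pdf(m / sigma_x)
    lower = (m - 1.0) * normal_cdf((m - 1.0) / sigma_x) + sigma_x * normal_pdf((m - 1.0) / sigma_x)
    return upper - lower


def covariance(u, v):
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v)) / len(u)


def simulate(n=20_000, a=1.0, lam=0.3, rho=0.4, sigma_x=1.0, sigma_p=0.05, seed=0):
    """Benchmark scenario: linear pricing, no signaling, no punishment."""
    rng = random.Random(seed)
    slope = math.expm1(a * lam) / a            # true coefficient, about 0.35
    x_prev = 0.0
    for _ in range(100):                       # burn-in towards the stationary distribution
        x_prev = rho * x_prev + sigma_x * rng.gauss(0.0, 1.0)
    spreads, future_z, lagged_x = [], [], []
    for _ in range(n):
        s_t = slope * expected_z(x_prev, rho, sigma_x) + rng.gauss(0.0, sigma_p)
        x_t = rho * x_prev + sigma_x * rng.gauss(0.0, 1.0)
        z_next = min(1.0, max(0.0, -x_t))      # equation (3) with x_bar = 0, alpha2 = 1
        spreads.append(s_t)
        future_z.append(z_next)
        lagged_x.append(x_prev)
        x_prev = x_t
    return spreads, future_z, lagged_x


def iv_slope(s, z, x):
    """Just-identified IV estimate of the coefficient on z, with x as the instrument."""
    return covariance(s, x) / covariance(z, x)


def ols_slope(s, z):
    """OLS slope of s on realized z, shown only to illustrate the attenuation."""
    return covariance(s, z) / covariance(z, z)
```

Calling `iv_slope(*simulate())` should return a value near the true coefficient of roughly 0.35, while `ols_slope` on realized future default is strongly attenuated, since the realization equals the expectation plus a prediction error. That attenuation is the errors-in-variables problem the instrumenting step addresses.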

3.1

Linear pricing, constant risk-aversion, levels

We are interested in the following aspects of the results:

• RF: Is $z_{t-1}$ significant? If it needs to be included in the RF, would its omission lead to a substantial bias in other parameters?
• SF:
  – With $E[z_{t+1}]$, how close is the estimated value to the true value?
  – In general, do we get good and precise estimates of $\frac{e^{a\lambda} - 1}{a}$, and a good first-stage fit?
  – Is $z_{t-1}$ significant when it should be? Is its value close to the truth?
  – If $z_{t-1}$ should be included, can we detect its omission using the overidentification test?
  – Does the channel decomposition work?

3.1.1

No signaling, no punishment

The results are reported in Table 2. Notice that the reduced form offers a good fit, with the irrelevant past default indeed being negligible and highly insignificant. In the structural form, the point estimate of the slope coefficient is in general quite precise and close to the truth (0.3498588). Given our sample size and parametrization, even the true specification (Column 3) does not reproduce the true parameter exactly, so this is as close as we can expect to get. Since the correctly specified SF estimates with realized default are always quite close to the estimates using the true expected value, we no longer report the results from that specification. The partial $R^2$ is reasonably good. The irrelevant extra term $z_{t-1}$ is insignificant.6 Whenever applicable, the overidentification test does not reject its null. Note that we can also check the “channel decomposition” result: the RF coefficient of $x_{t-1}$ is −0.045, its first stage coefficient in predicting future default (not reported) is −0.128, the coefficient of future default is 0.349, and indeed −0.128 · 0.349 ≈ −0.045.

6 It is also negligible, which can be formally demonstrated in the following way. The sample median of the spread is 0.113, its 90th percentile is 0.207, and its 99th percentile is 0.289. The median and 90th percentile of the true expected default variable are 0.312 and 0.527. So moving expected default from its median to its 90th percentile changes the spread by 0.349 · 0.215 = 0.075, moving the spread around 80% of the way from its median towards its 90th percentile. The median and 90th percentile of past default are 0 and 1, so multiplying their difference by the point estimate gives a change that is negligible relative to the sample variation of the spread.

3.1.2

Signaling, no punishment

The results shown in Tables 3 and 4 are quite similar to the previous case (Table 2). The reduced form offers a good fit, with past default now indeed being significant, though not very large for normal signaling (the sample median and 90th percentile of past default and of the spread are almost the same as in the previous exercise, so the argument of footnote 6 applies). In the case of strong signaling, the effect is larger: increasing past default from its median to its 90th percentile would already take the spread around 36% of the way from its median to its 90th percentile. In the structural form, the point estimate of the slope coefficient is quite precise and close to the truth. The partial $R^2$ is reasonably good. Whenever applicable, the overidentification test does not reject its null. The channel decomposition works in the structural-form regression for all the significant variables from the RF. The irrelevant extra term $z_{t-1}$ is insignificant and economically negligible (again by the argument of footnote 6).

There are two further points which the results illustrate. First, the comparison of Columns 1 and 2 reveals that the RF can be subject to omitted variable bias: when omitting the relevant variable $z_{t-1}$, the point estimate of the constant increases by 4% (signaling) and 13% (strong signaling). The coefficient of $x_{t-1}$ also changes by 13% in the case of strong signaling (the change under normal signaling is less sizable and of a similar order of magnitude to that in Table 2, which is just a random change). Second, omitting $z_{t-1}$ from the instruments does not lead to any visible change in the SF estimate (see Columns 3 and 4). This latter is the “Wickens finding”, showing that we do not need to worry about not using all potential instruments, as long as we have enough good ones.

Table 3: Linear pricing, constant risk-aversion, levels; signaling, no punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)
z_{t-1}                   0.01                                            −0.003
                          (12.34)                                         (−0.94)
x_{t-1}                   −0.045      −0.046
                          (−131.21)   (−143.92)
x_{t-1}^2                 0.004       0.004
                          (18.39)     (18.89)
z_{t+1}                                           0.346       0.346       0.349
                                                  (50.98)     (51.24)     (45.96)
const                     0.110       0.114       −0.000      −0.000      −0.000
                          (210.02)    (263.78)    (−0.10)     (−0.01)     (−0.09)
R^2 / partial R^2         0.52        0.51        0.13        0.13        0.11
overid                                            0.34        0.40        0.34

Notes. In Columns 3 and 5, $x_{t-1}$ and $x_{t-1}^2$ are used as instruments. In Column 4, $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ are used as instruments.

Table 4: Linear pricing, constant risk-aversion, levels; strong signaling, no punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)
z_{t-1}                   0.043                                           0.002
                          (48.11)                                         (0.59)
x_{t-1}                   −0.045      −0.051
                          (−131.49)   (−149.32)
x_{t-1}^2                 0.004       0.004
                          (17.34)     (18.50)
z_{t+1}                                           0.349       0.350       0.347
                                                  (55.32)     (57.65)     (46.27)
const                     0.109       0.124       0.000       0.000       0.000
                          (202.04)    (269.97)    (0.30)      (0.15)      (0.28)
R^2 / partial R^2         0.58        0.53        0.15        0.16        0.11
overid                                            0.96        0.84        0.96

Notes. In Columns 3 and 5, $x_{t-1}$ and $x_{t-1}^2$ are used as instruments. In Column 4, $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ are used as instruments.

3.1.3

No signaling, punishment

When we add punishment but no signaling (Table 5), the reduced form offers a similarly good fit, with past default now being highly significant. Its value is close to its true (structural-form) coefficient. Omitting it from the regression (Column 2) changes the point estimates substantially. This is nevertheless not an illustration of the Wickens result, as the omission of past default from the structural form also biases the structural-form coefficients (Columns 3 and 4). The incorrect specification is also revealed by the overidentification test, provided that the “problematic variable” is included in the instrument set: overidentification is clearly rejected when past default is among the instruments (Column 4), but it is not rejected when we use only the two fundamentals (Column 3). Once we add past default as an extra right-hand-side variable (Column 5), we again get all point estimates close to the true values and overidentification is not rejected. The partial $R^2$ is reasonably good.

3.1.4

Signaling, punishment

Table 5: Linear pricing, constant risk-aversion, levels; no signaling, punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)
z_{t-1}                   0.10                                            0.10
                          (108.35)                                        (36.75)
x_{t-1}                   −0.045      −0.057
                          (−130.29)   (−142.00)
x_{t-1}^2                 0.004       0.005
                          (18.78)     (19.09)
z_{t+1}                                           0.447       0.451       0.349
                                                  (49.36)     (49.41)     (45.83)
const                     0.110       0.142       −0.000      −0.002      −0.000
                          (211.76)    (261.40)    (−0.12)     (−0.52)     (−0.08)
R^2 / partial R^2         0.69        0.51        0.12        0.12        0.11
overid                                            0.35        0.00        0.33

Notes. In Columns 3 and 5, $x_{t-1}$ and $x_{t-1}^2$ are used as instruments. In Column 4, $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ are used as instruments.

Table 6: Linear pricing, constant risk-aversion, levels; signaling, punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)
z_{t-1}                   0.11                                            0.10
                          (122.32)                                        (35.53)
x_{t-1}                   −0.045      −0.059
                          (−131.21)   (−140.03)
x_{t-1}^2                 0.004       0.005
                          (18.39)     (18.08)
z_{t+1}                                           0.444       0.469       0.349
                                                  (51.19)     (51.78)     (45.96)
const                     0.111       0.147       0.001       −0.008      −0.000
                          (210.02)    (258.34)    (0.18)      (−2.36)     (−0.09)
R^2 / partial R^2         0.71        0.50        0.13        0.13        0.11
overid                                            0.40        0.00        0.35

Notes. In Columns 3 and 5, $x_{t-1}$ and $x_{t-1}^2$ are used as instruments. In Column 4, $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ are used as instruments.

Table 7: Linear pricing, constant risk-aversion, levels; strong signaling, punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)
z_{t-1}                   0.143                                           0.102
                          (159.82)                                        (34.07)
x_{t-1}                   −0.045      −0.065
                          (−131.49)   (−131.67)
x_{t-1}^2                 0.004       0.005
                          (17.34)     (16.31)
z_{t+1}                                           0.440       0.500       0.347
                                                  (55.45)     (58.59)     (46.27)
const                     0.109       0.159       0.004       −0.018      0.000
                          (202.04)    (242.96)    (1.31)      (−5.18)     (0.28)
R^2 / partial R^2         0.77        0.47        0.15        0.16        0.11
overid                                            0.96        0.00        0.96

Notes. In Columns 3 and 5, $x_{t-1}$ and $x_{t-1}^2$ are used as instruments. In Column 4, $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ are used as instruments.

Tables 6 and 7 contain results with both signaling (normal and strong) and punishment. The reduced form shows both the significance and the relevance of past default. Columns 3 and 4 show again that omitting the relevant extra right-hand-side variable $z_{t-1}$ from the structural form leads to biased estimates (for future default). The overidentification test is indicative when the missing variable is used as an instrument (Column 4). When we estimate the proper specification (Column 5), the point estimates are quite precise and close to the truth. In this case, we can also look at the channel decomposition in detail: in the first stage and prediction equations (both reduced form)
\[ s_t = \theta_x x_{t-1} + \theta_d^{rf} z_{t-1} + \tilde{v}_t \]
\[ z_{t+1} = \theta_x^z x_{t-1} + \theta_d^z z_{t-1} + \tilde{v}_t, \]
we should have $\hat{\theta}_d^{rf} = \beta + \theta_d^z \hat{\theta}$. Now $\hat{\theta}_d^{rf} = 0.11$, $\beta = 0.1$, $\theta_d^z = 0.0395235$ and $\hat{\theta} = 0.349$. So from the RF coefficient of 0.11, 0.1 is coming from punishment and 0.0395 · 0.349 = 0.0138 ≈ 0.01 from signaling.

Summing up, our approach performs well in estimating the structural-form coefficients. When using the correct set of instruments, we get precise and accurate point estimates. Past default is found to be significant in the structural form, and overidentification is rejected, exactly when they should be. The channel decomposition is also perfectly in line with the true data generating process.
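The arithmetic behind the channel decomposition can be reproduced in a few lines. The sketch below only re-uses coefficients quoted in the text (the variable names are ours): it checks the decomposition of the RF coefficient of the fundamental reported for the case without signaling or punishment, and the split of the RF coefficient of past default into a punishment part and a signaling part.

```python
# Coefficients quoted in the text; the variable names are illustrative.
theta_hat = 0.349          # SF coefficient of future default

# Fundamental: RF coefficient = first-stage coefficient * SF slope.
first_stage_x = -0.128     # first-stage coefficient of x_{t-1} (not tabulated)
rf_x_implied = first_stage_x * theta_hat

# Past default: RF coefficient = punishment + signaling channel.
beta = 0.1                 # true punishment coefficient
theta_d_z = 0.0395235      # coefficient of past default in predicting future default
signaling_part = theta_d_z * theta_hat
rf_z_implied = beta + signaling_part

print(f"implied RF coefficient of x_(t-1): {rf_x_implied:.4f}")
print(f"signaling part of past default   : {signaling_part:.4f}")
print(f"implied RF coefficient of z_(t-1): {rf_z_implied:.4f}")
```

Rounded to the precision used in the tables, the implied values match the reported RF coefficients of −0.045 and 0.11.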

3.2

Linear pricing, constant risk-aversion, first differences

Here we are running the SF equation in the form of
\[ s_t - s_{t-1} = \theta (z_{t+1} - z_t) + \theta_d (z_{t-1} - z_{t-2}) + (\tilde{v}_t - \tilde{v}_{t-1}), \]
where the change in realized future default $z_{t+1} - z_t$ is instrumented. The objective of the exercise is to show that all of the results we had so far remain valid even if we run everything in first differences (FD), even though in our simulation there is only one country, so there is no true need to do FD. Moreover, we can demonstrate that (1) the FD of the valid instruments ($x_{t-1} - x_{t-2}$) is no longer valid; rather, (2) we should go to the FD of their lags ($x_{t-2} - x_{t-3}$), or anything that contains information up to time $t-2$; and (3) using the second lag in levels ($x_{t-2}$) yields more precise estimates than the FD of the first lag.

Table 8: First differences; no signaling, no punishment
LHS variable: s_t

                          RF, FD      RF, FD      SF, FD      SF, FD      SF, FD      SF, FD      SF, FD
                          (1)         (2)         (3)         (4)         (5)         (6)         (7)
z_{t-1}                   −0.002                                                                  −0.005
                          (−1.81)                                                                 (−1.75)
x_{t-1}                   −0.045      −0.045
                          (−104.15)   (−107.71)
x_{t-1}^2                 0.004       0.004
                          (16.07)     (16.42)
z_{t+1}                                           −0.40       0.40        0.353       0.351       0.338
                                                  (−36.69)    (10.09)     (20.46)     (20.46)     (18.36)
const                     −0.000      −0.000      −0.000      0.000       0.000       0.000       0.000
                          (−0.00)     (−0.00)     (−0.00)     (0.01)      (0.00)      (0.00)      (0.00)
R^2 / partial R^2         0.37        0.37        0.07        0.01        0.03        0.03        0.03
overid                                            0.00        0.46        0.81        0.21        0.67

Notes. Instruments: $x_{t-1} - x_{t-2}$ and $x_{t-1}^2 - x_{t-2}^2$ in Column 3, $x_{t-2} - x_{t-3}$ and $x_{t-2}^2 - x_{t-3}^2$ in Column 4, $x_{t-2}$ and $x_{t-2}^2$ in Columns 5 and 7, and $x_{t-2}$, $x_{t-2}^2$ and $z_{t-1} - z_{t-2}$ in Column 6. The p-value of the first stage partial F-statistic (weak identification test) is 0.00 in Columns 3-7.

Table 9: First differences; signaling, no punishment
LHS variable: s_t

                          RF, FD      RF, FD      SF, FD      SF, FD      SF, FD      SF, FD
                          (1)         (2)         (3)         (4)         (5)         (6)
z_{t-1}                   0.01                                                        −0.006
                          (9.02)                                                      (−1.85)
x_{t-1}                   −0.045      −0.044
                          (−104.77)   (−105.52)
x_{t-1}^2                 0.0035      0.003
                          (15.54)     (14.56)
z_{t+1}                                           −0.39       0.434       0.346       0.337
                                                  (−37.56)    (6.79)      (19.06)     (18.36)
const                     −0.000      −0.000      −0.000      0.000       0.000       0.000
                          (−0.00)     (−0.01)     (−0.00)     (0.01)      (0.00)      (0.00)
R^2 / partial R^2         0.37        0.36        0.07        0.003       0.03        0.03
overid                                            0.00        0.55        0.18        0.69

Notes. Instruments: $x_{t-1} - x_{t-2}$ and $x_{t-1}^2 - x_{t-2}^2$ in Column 3, $x_{t-2} - x_{t-3}$ and $x_{t-2}^2 - x_{t-3}^2$ in Column 4, $x_{t-2}$, $x_{t-2}^2$ and $z_{t-1} - z_{t-2}$ in Column 5, and $x_{t-2}$ and $x_{t-2}^2$ in Column 6. The p-value of the first stage partial F-statistic (weak identification test) is 0.00 in Columns 3-6.

First we look at the case when there is no signaling or punishment (Table 8). Relative to Table 2, the reduced-form results (Columns 1 and 2) are quite similar.7 Column 3 shows that one cannot use the first difference of $x_{t-1}$ directly as an instrument: the point estimate of future default becomes significantly negative! Using the first difference of its lagged value ($x_{t-2} - x_{t-3}$) as an instrument is already admissible and leads to a point estimate much closer to the truth, though it is imprecise and the instruments are very weak (Column 4). The level of the second lag, with or without the lag of past default (Columns 5 and 6), is much better: the estimate is closer to the truth, more precise, and the instruments are less weak (the first stage partial F-statistics are 256.53 in Column 7 versus 69.62 in Column 4, with p-values of 0.00 in both cases). When adding past default as an extra right-hand-side variable, it is not entirely insignificant, but it is economically small. Overidentification is not rejected in Columns 4-7. In the case of the first difference (Column 3), the test clearly shows that the instruments are not appropriate.

7 The only exception is the marginal significance of past default in Column 1, which seems to be a sample anomaly: the variable would completely lose its significance in any random subsample. Moreover, its size is economically insignificant, based on the argument in footnote 6.

Next we look at the case where there is signaling but no punishment (Table 9). Strong signaling would yield the same conclusions, so we decided not to report its results. To save space, we no longer report the case where only $x_{t-2}$ and $x_{t-2}^2$ are used as instruments. The results show exactly the same pattern as without signaling (Table 8) and in levels (Table 3). Omitted variable bias in the reduced form is even a bit more visible, since the point estimates in Column 2 differ more from Column 1, in relative terms, than they did in levels (Table 3). Notice that the presence of signaling impacts the structural-form estimate of past default only slightly (Column 6).

Next we look at the case where there is punishment but no signaling (Table 10). To save space, we no longer report the columns which were demonstrating the proper and best set of instruments (levels of second lags). The results are again similar to, or at times starker than, those in levels: the overidentification test strongly rejects even when using a proper set of instruments ($x_{t-2}$ and $x_{t-2}^2$) but omitting a necessary right-hand-side variable $z_{t-1}$ (Column 3), and even the point estimate of future default is very different from the truth. Once we estimate the correct specification (Column 5), we get quite accurate point estimates. This is even more surprising given that the partial $R^2$ is not particularly strong; though again, the first stage partial F would be highly significant.

Finally, we look at the case where there is both signaling8 and punishment (Table 11). The results show exactly the same pattern as without signaling (and in levels): the overidentification test strongly rejects when a necessary right-hand-side variable $z_{t-1}$ is omitted (Columns 3 and 4). In these columns, even the point estimate of future default is very different from the truth. Once we estimate the correct specification (Column 5), we get quite accurate point estimates. The channel decomposition works in this case as well.

When comparing estimates using first differences to levels, we get in general similar results, though first differencing leads to a huge loss in efficiency and accuracy, as expected. First differences of the instruments which we could use in levels no longer work. We need to go to second lags, preferably in levels (to get stronger instruments). The overidentification test clearly rejects improper specifications (even when we use proper instruments but omit a relevant extra right-hand-side variable), and it does not reject proper specifications and instruments. Given that there is no true need for doing FD in our simulation (no country effects), we switch back to levels, since that can better demonstrate the capabilities of our error-in-variables method to estimate the pricing equation properly. In our exercise with actual data, we nevertheless need to work with first differences.
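Why the first difference of a valid level instrument stops being valid can be seen in a stylized simulation. The sketch below is an assumption-laden illustration, not the design used in this appendix: the fundamental follows an AR(1), the default variable is continuous and depends on the current fundamental, the spread is priced with information up to $t-1$, and all parameter values and names are hypothetical. In this setting the differenced error contains the lagged forecast error, which is correlated with $x_{t-1}$ but not with anything dated $t-2$ or earlier.

```python
import random


def cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def iv_slope(y, d, w):
    """Just-identified IV slope of y on d, using w as the instrument."""
    return cov(y, w) / cov(d, w)


random.seed(1)
THETA, A, B, RHO = 0.35, 0.3, 0.1, 0.5   # all values are illustrative
T = 200_000

# AR(1) fundamental.
x = [0.0] * T
for t in range(1, T):
    x[t] = RHO * x[t - 1] + random.gauss(0.0, 1.0)

# z[t] depends on x[t-1]; the spread s[t] prices E[z[t+1] | x[t-1]].
z = [None] + [A - B * x[t - 1] + random.gauss(0.0, 0.1) for t in range(1, T)]
s = [None] + [THETA * (A - B * RHO * x[t - 1]) + random.gauss(0.0, 0.01)
              for t in range(1, T)]

ts = range(3, T - 1)
d_s = [s[t] - s[t - 1] for t in ts]              # change in the spread
d_z = [z[t + 1] - z[t] for t in ts]              # change in future default
fd_first_lag = [x[t - 1] - x[t - 2] for t in ts]  # FD of the level instrument
fd_second_lag = [x[t - 2] - x[t - 3] for t in ts]
level_second_lag = [x[t - 2] for t in ts]

theta_fd_first_lag = iv_slope(d_s, d_z, fd_first_lag)
theta_fd_second_lag = iv_slope(d_s, d_z, fd_second_lag)
theta_level_second_lag = iv_slope(d_s, d_z, level_second_lag)

print(f"instrument x(t-1) - x(t-2): {theta_fd_first_lag:.3f}")
print(f"instrument x(t-2) - x(t-3): {theta_fd_second_lag:.3f}")
print(f"instrument x(t-2)         : {theta_level_second_lag:.3f}")
```

In this stylized setting the first instrument should give a negative slope, while the two instruments dated $t-2$ or earlier should give slopes near the assumed 0.35, with the level of the second lag being the more strongly correlated with the differenced regressor.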

3.3

Linear pricing, constant risk-aversion, levels, endogeneity

Here we repeat the regressions from Tables 2-6, but under the assumption that the default threshold depends on the current pricing error (and maybe also on past default), which we label as endogeneity. Our objective is to demonstrate that our results remain unchanged even in the presence of endogeneity.

8 Again, strong signaling would yield the same qualitative conclusions, so we decided not to report its results.


Table 10: First differences; no signaling, punishment
LHS variable: s_t

                          RF, FD      RF, FD      SF, FD      SF, FD      SF, FD
                          (1)         (2)         (3)         (4)         (5)
z_{t-1}                   0.10                                            0.10
                          (91.43)                                         (29.47)
x_{t-1}                   −0.045      −0.035
                          (−104.15)   (−71.16)
x_{t-1}^2                 0.004       0.001
                          (16.07)     (4.06)
z_{t+1}                                           0.09        0.109       0.338
                                                  (9.62)      (11.14)     (18.36)
const                     −0.000      −0.000      0.000       0.000       0.000
                          (−0.00)     (−0.00)     (0.00)      (0.00)      (0.00)
R^2 / partial R^2         0.44        0.20        0.03        0.03        0.03
overid                                            0.00        0.00        0.67

Notes. Instruments: $x_{t-2}$ and $x_{t-2}^2$ in Columns 3 and 5, $x_{t-2}$, $x_{t-2}^2$ and $z_{t-1} - z_{t-2}$ in Column 4. The p-value of the first stage partial F-statistic (weak identification test) is 0.00 in Columns 3-5.

Table 11: First differences; signaling, punishment
LHS variable: s_t

                          RF, FD      RF, FD      SF, FD      SF, FD      SF, FD
                          (1)         (2)         (3)         (4)         (5)
z_{t-1}                   0.110                                           0.09
                          (102.42)                                        (31.50)
x_{t-1}                   −0.045      −0.034
                          (−104.77)   (−66.74)
x_{t-1}^2                 0.003       0.001
                          (15.54)     (2.98)
z_{t+1}                                           0.07        0.19        0.337
                                                  (6.68)      (14.84)     (18.36)
const                     −0.000      −0.000      0.000       0.000       0.000
                          (−0.00)     (−0.00)     (0.00)      (0.00)      (0.00)
R^2 / partial R^2         0.44        0.18        0.03        0.03        0.03
overid                                            0.00        0.00        0.69

Notes. Instruments: $x_{t-2}$ and $x_{t-2}^2$ in Columns 3 and 5, $x_{t-2}$, $x_{t-2}^2$ and $z_{t-1} - z_{t-2}$ in Column 4. The p-value of the first stage partial F-statistic (weak identification test) is 0.00 in Columns 3-5.


Table 12: Endogeneity; no signaling, no punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)         (6)
z_{t-1}                   −0.000                                                      −0.0027
                          (−0.66)                                                     (−1.04)
x_{t-1}                   −0.045      −0.044
                          (−117.13)   (−124.58)
x_{t-1}^2                 0.004       0.004
                          (16.99)     (16.98)
E_t z_{t+1} (or z_{t+1})                          0.39        0.349       0.346       0.349
                                                  (158.31)    (140.40)    (49.12)     (45.80)
const                     0.110       0.110       −0.01       0.000       −0.000      −0.000
                          (190.51)    (229.73)    (−14.39)    (0.17)      (−0.06)     (−0.07)
R^2 / partial R^2         0.44        0.44        0.56        0.98        0.12        0.11
overid                                                        0.12        0.34        0.31

Notes. Column 3 is estimated by OLS, using expected default as the RHS variable. In Column 4, expected default is instrumented by $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$. Finally, in Columns 5 and 6, the corresponding RHS variable is realized default, instrumented by $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ in Column 5 and by $x_{t-1}$ and $x_{t-1}^2$ in Column 6.

Table 13: Endogeneity; signaling, no punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)         (6)
z_{t-1}                   0.01                                                        −0.003
                          (10.94)                                                     (−0.98)
x_{t-1}                   −0.045      −0.046
                          (−117.89)   (−129.31)
x_{t-1}^2                 0.004       0.004
                          (16.63)     (17.08)
E_t z_{t+1} (or z_{t+1})                          0.384       0.349       0.346       0.350
                                                  (164.09)    (146.76)    (51.21)     (45.96)
const                     0.110       0.114       −0.01       0.000       0.000       −0.000
                          (188.83)    (237.26)    (−13.77)    (0.25)      (0.02)      (−0.07)
R^2 / partial R^2         0.46        0.46        0.57        0.98        0.13        0.11
overid                                                        0.11        0.38        0.32

Notes. Column 3 is estimated by OLS, using expected default as the RHS variable. In Column 4, expected default is instrumented by $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$. Finally, in Columns 5 and 6, the corresponding RHS variable is realized default, instrumented by $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ in Column 5 and by $x_{t-1}$ and $x_{t-1}^2$ in Column 6.

Table 14: Endogeneity; no signaling, punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)         (6)
z_{t-1}                   0.10                    0.09        0.10                    0.10
                          (97.47)                 (103.54)    (108.51)                (36.69)
x_{t-1}                   −0.045      −0.057
                          (−117.13)   (−132.33)
x_{t-1}^2                 0.004       0.005
                          (16.99)     (17.80)
E_t z_{t+1} (or z_{t+1})                          0.393       0.349       0.451       0.349
                                                  (150.97)    (131.87)    (49.50)     (45.80)
const                     0.110       0.142       −0.013      0.000       −0.002      −0.000
                          (190.51)    (243.94)    (−14.37)    (0.17)      (−0.52)     (−0.07)
R^2 / partial R^2         0.64        0.47        0.72        0.98        0.12        0.11
overid                                                        0.06        0.00        0.31

Notes. Column 3 is estimated by OLS, using expected default and past default as the RHS variables. In Column 4, expected default is instrumented by $x_{t-1}$ and $x_{t-1}^2$. In Columns 5 and 6, the corresponding RHS variable is realized default, instrumented by $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ in Column 5, and by $x_{t-1}$ and $x_{t-1}^2$ in Column 6.

Table 15: Endogeneity; signaling, punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)         (6)
z_{t-1}                   0.11                    0.09        0.10                    0.10
                          (109.89)                (99.03)     (105.35)                (35.55)
x_{t-1}                   −0.045      −0.059
                          (−117.89)   (−131.25)
x_{t-1}^2                 0.004       0.005
                          (16.63)     (16.95)
E_t z_{t+1} (or z_{t+1})                          0.393       0.349       0.469       0.349
                                                  (152.04)    (132.83)    (51.89)     (45.96)
const                     0.111       0.148       −0.01       0.000       −0.008      −0.000
                          (188.83)    (242.47)    (−14.29)    (0.18)      (−2.33)     (−0.07)
R^2 / partial R^2         0.67        0.47        0.74        0.98        0.13        0.11
overid                                                        0.06        0.00        0.32

Notes. Column 3 is estimated by OLS, using expected default and past default as the RHS variables. In Column 4, expected default is instrumented by $x_{t-1}$ and $x_{t-1}^2$. In Columns 5 and 6, the corresponding RHS variable is realized default, instrumented by $x_{t-1}$, $x_{t-1}^2$ and $z_{t-1}$ in Column 5, and by $x_{t-1}$ and $x_{t-1}^2$ in Column 6.

Under no signaling or punishment (Table 12), the reduced-form results (Columns 1 and 2) remain almost identical to the specification without endogeneity (Table 2). Column 3 shows that endogeneity creates a problem even when we use the true expected value of future default (and not its realization), since it is correlated with the pricing error. The point estimate is significantly different from the true value. One can nevertheless instrument expected default with fundamentals and recover the true parameter (Column 4). Using realized default and instrumenting it with fundamentals also identifies the specification and yields estimates very close to the truth (Columns 5 and 6). Past default is not relevant. The overidentification test never rejects (though it has a p-value of 0.12 in Column 4, when we are instrumenting the true expected value of default).

When adding signaling but still no punishment (Table 13), the results are very similar. Past default is now significant in the RF and is a better instrument, but it is not significant in the SF and does not lead to any major reduction in the p-value of the overidentification test.

Results under punishment but no signaling (Table 14) are similar to those before: past default is much more significant and important now in the RF (Columns 1-2). Expected default also needs to be instrumented (Columns 3-4). When using realized default, the inclusion of past default in the instrument set leads to inconsistent point estimates and a clear rejection of overidentification (instrumenting only with $x_{t-1}$ and $x_{t-1}^2$ would lead to inconsistent estimates but the overidentification test would not reject). Adding $z_{t-1}$ as an extra RHS variable leads to the true parameters and restores the non-rejection of overidentification.

Finally, results under both signaling and punishment (Table 15) are again similar to those before (and also to those in levels, in Table 6): past default is significant and important in the RF (Columns 1-2). Expected default also needs to be instrumented (Columns 3-4). When using realized default, the inclusion of past default in the instrument set leads to inconsistent point estimates and a clear rejection of overidentification (instrumenting only with $x_{t-1}$ and $x_{t-1}^2$ would lead to inconsistent estimates but the overidentification test would not reject). Adding $z_{t-1}$ as an extra RHS variable leads to the true parameters and restores the non-rejection of overidentification. The channel decomposition remains valid in this scenario.

Overall, this subsection demonstrates that our method is able to handle the endogeneity of the pricing error quite well. In the subsequent simulations, we thus no longer consider endogeneity.
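The role of the overidentification test in flagging an omitted right-hand-side variable can be sketched with a Sargan-type statistic. The code below is a minimal illustration under our own assumptions, not the test implementation used for the tables: the DGP, the parameter values and every function name are hypothetical, past default is drawn independently of the fundamental, and the statistic is computed for the special case of two instruments and one endogenous regressor (one overidentifying restriction).

```python
import math
import random


def ols(y, cols):
    """OLS of y on a constant and the given columns, via normal equations."""
    rows = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    g = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for i in range(k):  # Gaussian elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p], g[i], g[p] = a[p], a[i], g[p], g[i]
        for r in range(i + 1, k):
            f = a[r][i] / a[i][i]
            a[r] = [arj - f * aij for arj, aij in zip(a[r], a[i])]
            g[r] -= f * g[i]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (g[i] - sum(a[i][j] * b[j] for j in range(i + 1, k))) / a[i][i]
    return b


def fitted(b, cols, n):
    """Fitted values from coefficients b (constant first) and columns."""
    return [b[0] + sum(bj * c[i] for bj, c in zip(b[1:], cols)) for i in range(n)]


def sargan(y, endog, instruments):
    """2SLS slope and Sargan p-value for two instruments, one endogenous regressor."""
    n = len(y)
    endog_hat = fitted(ols(endog, instruments), instruments, n)
    const, theta = ols(y, [endog_hat])
    resid = [yi - const - theta * di for yi, di in zip(y, endog)]
    resid_hat = fitted(ols(resid, instruments), instruments, n)
    ssr = sum((e - f) ** 2 for e, f in zip(resid, resid_hat))
    sst = sum(e * e for e in resid)  # 2SLS residuals have mean zero here
    stat = max(n * (1.0 - ssr / sst), 0.0)
    return theta, math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail


def simulate(beta, n=50_000, seed=0):
    """Hypothetical DGP: beta is the direct effect of past default on the spread."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    past = [1.0 if rng.random() < 0.3 else 0.0 for _ in range(n)]
    prob = [1.0 / (1.0 + math.exp(xi)) for xi in x]
    future = [1.0 if rng.random() < p else 0.0 for p in prob]
    spread = [0.35 * p + beta * zp + rng.gauss(0.0, 0.01)
              for p, zp in zip(prob, past)]
    return spread, future, [x, past]


# Past default is used as an instrument but left out of the right-hand side.
theta_omitted, p_omitted = sargan(*simulate(beta=0.1))
theta_valid, p_valid = sargan(*simulate(beta=0.0))
print(f"punishment omitted: slope {theta_omitted:.3f}, Sargan p-value {p_omitted:.3f}")
print(f"no punishment     : slope {theta_valid:.3f}, Sargan p-value {p_valid:.3f}")
```

When past default has a direct effect on the spread but enters only as an instrument, the statistic should be very large and the p-value essentially zero. Without a direct effect, the statistic follows its null distribution, so no rejection is expected on average.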

3.4

Linear pricing, time-varying risk-aversion, levels (and no endogeneity)

In this scenario, we are mostly interested in checking whether the finding of a punishment effect can also be the consequence of (exogenous or endogenous) variations in risk-aversion. Exogenous variation simply means a mean-zero random shock to the risk-aversion parameter $a$, while endogenous variation adds a feedback from past default to risk-aversion as well. We first explore in depth the case when there is signaling but no punishment, and then also check briefly the performance of our model in the other three main scenarios. In some cases, we also allow the econometrician to observe the mean-zero shock $\xi_t$ to risk-aversion. We thus estimate
\[ s_t = \theta z_{t+1} + \theta_a a_t + \theta_d z_{t-1} + \tilde{v}_t. \]

3.4.1

Signaling, no punishment

First we consider exogenous changes in risk-aversion (Table 16). Here past default and the risk-aversion shock are both significant in the RF (Columns 1-2) but irrelevant for the SF (Column 4). Since the estimated coefficient of future default remains close to the true value under constant risk aversion, it means that we can nicely estimate the mean value of the risk-aversion parameter.

Table 16: Exogenous changes in risk-aversion; signaling, no punishment
LHS variable: s_t

                          RF          RF          SF          SF
                          (1)         (2)         (3)         (4)
z_{t-1}                   0.01        0.01                    −0.003
                          (12.35)     (12.34)                 (−0.94)
x_{t-1}                   −0.045      −0.045
                          (−131.21)   (−131.21)
x_{t-1}^2                 0.004       0.004
                          (18.39)     (18.40)
ξ_t                                   0.02                    −0.01
                                      (6.34)                  (−0.77)
z_{t+1}                                           0.346       0.350
                                                  (51.34)     (45.91)
const                     0.111       0.088       −0.000      0.007
                          (209.88)    (24.81)     (−0.00)     (0.72)
R^2 / partial R^2         0.52        0.52        0.13        0.11
overid                                            0.50        0.35

Notes. The instruments are $x_{t-1}$, $x_{t-1}^2$, $\xi_t$ and $z_{t-1}$ in Column 3, and $x_{t-1}$, $x_{t-1}^2$ in Column 4.

Next we consider an endogenous shift in the risk-aversion parameter (Table 17). Past default and the risk-aversion shock are both significant in the RF, as they should be. Since they are highly correlated, their significance is relatively low in the reduced form. In the structural form, using only future default as a right-hand-side variable leads to slightly inconsistent estimates and a clear rejection of overidentification (Columns 3 and 4). When adding either past default or the risk-aversion shock as a right-hand-side variable, the point estimate of future default moves very close to the truth and the extra right-hand-side variable is significant, though overidentification is still rejected. Adding both terms to the right-hand side is inconclusive: past default becomes marginally significant and the risk-aversion shock insignificant, but overidentification is still rejected. It means that one might not be able to properly estimate a specification with an endogenous shift in risk-aversion by adding the risk-aversion shock as an extra variable. The more complicated nature of the true data generating process, however, is indicated by the rejection of overidentification.

Table 17: Endogenous changes in risk-aversion; signaling, no punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)         (6)         (7)
z_{t-1}                   0.03        0.01                                0.02                    0.03
                          (35.92)     (2.84)                              (6.83)                  (2.52)
x_{t-1}                   −0.045      −0.045
                          (−131.68)   (−131.68)
x_{t-1}^2                 0.005       0.005
                          (22.70)     (22.72)
ξ_t + κ z_{t-1}                       0.02                                            0.02        −0.008
                                      (6.31)                                          (6.37)      (−0.76)
z_{t+1}                                           0.371       0.376       0.353       0.354       0.353
                                                  (51.32)     (51.75)     (46.10)     (45.99)     (45.95)
const                     0.104       0.090       −0.007      −0.008      −0.006      −0.02       −0.002
                          (196.71)    (37.63)     (−2.54)     (−3.09)     (−2.68)     (−6.02)     (−0.28)
R^2 / partial R^2         0.56        0.56        0.13        0.13        0.11        0.11        0.11
overid                                            0.02        0.00        0.04        0.00        0.02

Notes. The instruments are the following. Column 3: $x_{t-1}$, $x_{t-1}^2$. Columns 4 and 7: $x_{t-1}$, $x_{t-1}^2$, $z_{t-1}$, $\xi_t + \kappa z_{t-1}$. Column 5: $x_{t-1}$, $x_{t-1}^2$, $\xi_t + \kappa z_{t-1}$. Column 6: $x_{t-1}$, $x_{t-1}^2$, $z_{t-1}$.

Notice, however, that a shift in risk-aversion means that the slope coefficient of expected future default should change. This calls for adding the interaction of past default and/or the risk-aversion shift with future default to the RHS:
\[ s_t = \theta z_{t+1} + \theta_a a_t + \theta_2 z_{t-1} z_{t+1} + \theta_d z_{t-1} + \tilde{v}_t \]
\[ s_t = \theta z_{t+1} + \theta_a a_t + \theta_2' a_t z_{t+1} + \theta_d z_{t-1} + \tilde{v}_t. \]
Note that we need at least one additional instrument here, but $x_t a_t$ and $x_t z_{t-1}$ are available.

The results are reported in Table 18. In the RF, the interaction terms are significant when we only use interactions with past default (Column 1), while using interactions with both past default and the risk-aversion shock makes many estimates lose their significance (Column 2). When adding the interaction of future default with one of the extra terms to the structural form (Columns 3 and 4), both terms are indeed significant, and the overidentification test does not reject. When adding both terms as extra right-hand-side variables (Columns 5-6), they are significant in one case (interaction with risk-aversion, Column 6) but not the other (interaction with past default, Column 5). When adding both interaction terms (Column 7), only future default remains significant.

Overall, it seems that the approximate regression specifications we were considering have a hard time uncovering the true underlying data generating process. The fact that a risk-aversion shock term is significant in the estimates signals that changing risk-aversion might be present. If past default is significant, one cannot necessarily rule out that it reflects the impact of a change in risk-aversion after a default episode. But if a good proxy for the risk-aversion shock is available, and its inclusion makes the results less stable, it points to the presence of an endogenous shift in risk-aversion. The rejection of overidentification is also indicative of a more complex interaction between past default and the spread.

Notice that if a default episode leads to higher risk-aversion (and thus higher spreads), that is already a punishment itself for the borrower. As for repayment incentives, it is thus not central to separate these two cases. It is more important for modeling punishment. We also argue in the main paper, however, that even such a default-driven increase in risk-aversion requires some form of borrower-lender lock-in.

Since in our simulated data the interaction regressions perform poorly, we no longer consider those specifications for the next cases. We do not report results for the exogenous shock either, since they do not seem to influence the results (and the performance of our methodology).

3.4.2

No signaling, no punishment

Table 18: Endogenous changes in risk-aversion; signaling, no punishment; interaction terms
LHS variable: s_t

                                RF          RF          SF          SF          SF          SF          SF
                                (1)         (2)         (3)         (4)         (5)         (6)         (7)
z_{t-1}                         0.03        0.01                                0.001       0.02        0.002
                                (31.56)     (3.12)                              (0.11)      (2.27)      (0.08)
x_{t-1}                         −0.042      −0.04
                                (−90.02)    (−17.77)
x_{t-1} · z_{t-1}               −0.01       −0.005
                                (−11.41)    (−1.40)
x_{t-1} · (ξ_t + κ z_{t-1})                 −0.006
                                            (−1.98)
x_{t-1}^2                       0.004       0.003
                                (14.98)     (2.17)
x_{t-1}^2 · z_{t-1}             −0.003      −0.005
                                (−5.34)     (−2.39)
x_{t-1}^2 · (ξ_t + κ z_{t-1})               0.002
                                            (1.08)
ξ_t + κ z_{t-1}                             0.02                                −0.007      −0.02       −0.008
                                            (4.46)                              (−0.74)     (−2.44)     (−0.32)
z_{t+1}                                                 0.331       0.30        0.33        0.29        0.326
                                                        (37.14)     (24.97)     (31.95)     (15.14)     (7.00)
z_{t+1} · z_{t-1}                                       0.05                    0.07                    0.06
                                                        (7.75)                  (3.78)                  (0.88)
z_{t+1} · (ξ_t + κ z_{t-1})                                         0.05                    0.06        0.002
                                                                    (7.33)                  (3.68)      (0.03)
const                           0.104       0.091       −0.000      −0.001      0.006       0.02        0.006
                                (184.28)    (31.77)     (−0.29)     (−0.42)     (0.79)      (2.10)      (0.36)
R^2 / partial R^2               0.57        0.57        0.13/0.46   0.13/0.29   0.11/0.12   0.11/0.11   0.11/0.11/0.11
overid                                                  0.67        0.59        0.94        0.81        0.85

Notes. The instruments are the following. Column 3: $z_{t-1}$, $x_{t-1}$, $x_{t-1}^2$ and their interactions with $z_{t-1}$. Column 4: $x_{t-1}$, $x_{t-1}^2$ and interactions with $\xi_t + \kappa z_{t-1}$. Columns 5-7: $z_{t-1}$, $\xi_t + \kappa z_{t-1}$, $x_{t-1}$, $x_{t-1}^2$ and their interactions with $\xi_t + \kappa z_{t-1}$ and $z_{t-1}$.

Table 19: Endogenous changes in risk-aversion; no signaling, no punishment
LHS variable: s_t

                          RF          RF          SF          SF          SF          SF          SF
                          (1)         (2)         (3)         (4)         (5)         (6)         (7)
z_{t-1}                   0.02        −0.001                              0.02                    0.03
                          (21.58)     (−0.45)                             (6.72)                  (2.49)
x_{t-1}                   −0.045      −0.045
                          (−130.54)   (−130.54)
x_{t-1}^2                 0.005       0.005
                          (23.02)     (23.04)
ξ_t + κ z_{t-1}                       0.02                                            0.02        −0.008
                                      (6.07)                                          (6.28)      (−0.82)
z_{t+1}                                           0.370       0.370       0.351       0.352       0.352
                                                  (49.48)     (49.58)     (45.96)     (45.82)     (45.82)
const                     0.104       0.089       −0.01       −0.007      −0.01       −0.02       −0.001
                          (198.61)    (37.89)     (−2.49)     (−2.53)     (−2.55)     (−5.56)     (−0.19)
R^2 / partial R^2         0.53        0.53        0.12        0.12        0.11        0.11        0.11
overid                                            0.02        0.00        0.04        0.00        0.02

Notes. The instruments are the following. Column 3: $x_{t-1}$, $x_{t-1}^2$. Columns 4 and 7: $x_{t-1}$, $x_{t-1}^2$, $z_{t-1}$, $\xi_t + \kappa z_{t-1}$. Column 5: $x_{t-1}$, $x_{t-1}^2$, $\xi_t + \kappa z_{t-1}$. Column 6: $x_{t-1}$, $x_{t-1}^2$, $z_{t-1}$.

When there is neither signaling nor punishment (Table 19), the results look mostly similar to the previous case (Table 17). Past default is significant in the RF (Column 1), but its significance is eliminated by the inclusion of the risk-aversion shock (Column 2). Omitting interaction terms from the SF (Columns 3 and 4) leads to slightly inconsistent point estimates but also a clear rejection of overidentification. Adding past default and/or the risk-aversion shock as extra terms to the right-hand side (Columns 5-7) leads to better estimates for the future default coefficient, a (marginally) significant past default coefficient and at most a slight improvement in the rejection level of the overidentification test.

3.4.3

No signaling, punishment

When allowing for punishment but no signaling (Table 20), the estimates look almost the same as the results without punishment with a highly significant 0.1 added to the past default coefficient. Past default is significant in the RF (Column 1), and adding the risk-aversion shift only makes it a bit smaller and much less significant. Omitting past default from the SF leads to highly inconsistent estimates for future default (Columns 3-4). When adding extra terms, this inconsistency is reduced. Of the two extra terms, past default seems stronger now than before (its significance is higher in Column 7 of Table 20 than in Column 7 of Table 19). Overidentification is still rejected in all cases.

3.4.4 Signaling and punishment

When we consider both signaling and punishment (Table 21), the findings show exactly the same patterns as without signaling.

Summing up, we find that our methodology is not necessarily able to separate the impact of an endogenous shift in risk-aversion (in response to a default) from a direct punishment effect. There are some econometric signs which may point towards one or the other. If there is a proxy for risk-aversion and its inclusion has a strong impact on the significance of past default, this points to the possibility of an endogenous risk-aversion shift. A stubborn rejection of overidentification can be a sign of a more complex link between past default and the spread, which is more characteristic of an endogenous risk-aversion shift

Table 20: Endogenous changes in risk-aversion; no signaling, punishment LHS variable: st RF RF SF SF SF SF (1) (2) (3) (4) (5) (6) 0.12 0.10 0.12 zt−1 (130.27) (6.07) (44.19) −0.045 −0.045 xt−1 (−130.54) (−130.54) 0.005 0.005 x2t−1 (23.02) (23.04) 0.02 0.11 ξt + κzt−1 (6.07) (42.20) 0.471 0.475 0.351 0.353 zt+1 (49.21) (49.35) (45.96) (45.68) 0.104 0.09 −0.007 −0.008 −0.01 −0.08 const (198.61) (37.89) (−2.00) (−2.38) (−2.55) (−24.99) R2 / partial R2 0.73 0.73 0.12 0.12 0.11 0.11 overid 0.05 0.00 0.04 0.00

SF (7) 0.12 (11.97)

−0.008 (−0.42) 0.352 (45.82) −0.001 (−0.19) 0.11 0.02

Notes. The instruments are the following. Column 3: xt−1 , x2t−1 . Columns 4 and 7: xt−1 , x2t−1 , zt−1 , ξt + κzt−1 . Column 5: xt−1 , x2t−1 , ξt + κzt−1 . Column 6: xt−1 , x2t−1 , zt−1 .

Table 21: Endogenous changes in risk-aversion; signaling, punishment LHS variable: st RF RF SF SF SF SF (1) (2) (3) (4) (5) (6) 0.13 0.11 0.12 zt−1 (145.51) (30.18) (42.99) −0.045 −0.045 xt−1 (−131.68) (−131.68) 0.005 0.005 x2t−1 (22.70) (22.72) 0.02 0.11 ξt + κzt−1 (6.31) (40.76) 0.47 0.499 0.352 0.357 zt+1 (51.05) (51.77) (46.10) (45.88) 0.104 0.089 −0.006 −0.02 −0.007 −0.008 const (196.71) (37.63) (−1.76) (−4.50) (−2.68) (−25.83) R2 / partial R2 0.75 0.75 0.13 0.13 0.11 0.11 overid 0.06 0.00 0.04 0.00 Notes. The instruments are the following. Column 3: xt−1 , x2t−1 . Columns 4 and 7: xt−1 , x2t−1 , zt−1 , ξt + κzt−1 . Column 5: xt−1 , x2t−1 , ξt + κzt−1 . Column 6: xt−1 , x2t−1 , zt−1 .


SF (7) 0.13 (11.94)

−0.008 (−0.76) 0.353 (45.95) −0.002 (−0.28) 0.11 0.02

than that of punishment. This rejection, however, is not necessarily conclusive for realistic sample sizes: such tests tend to overreject in small samples (see Hayashi, 2000, for example). Given that we do not get a rejection in our actual empirical exercise, the evidence there is rather indicative of a simple additive punishment specification.

Finally and most importantly, if a default episode leads to higher risk-aversion (and thus higher spreads), that is already a punishment in itself for the borrower. As for repayment incentives, it is thus not central to separate these two cases. For modeling punishment, however, it is more important. In short, we can safely say that our method can separate signaling from either a direct punishment or an endogenous response of risk-aversion, but it might have difficulties in separating the latter two. We argue in the main paper, however, that even such a default-driven increase in risk-aversion requires some form of borrower-lender lock-in, still implying a 'relational contract' mechanism.

3.5 Nonlinear pricing

Here we are running regressions with constant risk-aversion, no endogeneity, in levels and with all combinations of signaling and punishment. We are interested in the following: (1) When calculating the estimated risk-aversion parameter from the linear regression, how close is that to the true parameter? (2) Can it happen that past default (or some other extra term) is picking up an approximation error term, thus it is significant although it does not enter the true (nonlinear) pricing equation? (3) How good is the linear approximation, both under general and “controlled” nonlinearity?

3.5.1 General nonlinear DGP

First we look at the case with no punishment or signaling (Table 22). The reduced form (Columns 1 and 2) still offers a good fit (an R2 of 0.43, as compared to 0.50 in Table 2), with point estimates quite similar to those under the linear pricing assumption. Past default is never significant (Columns 1 and 4) nor pivotal for overidentification (Columns 3 and 4). This also means that zt−1 does not pick up any approximation error in this case. Similarly, when including xt−1 or x2t−1 as extra terms on the right hand side (and using the other as a single instrument), neither of them would be significant or lead to any change in the other point estimates, and standard errors would increase (results are not reported). Future default is significant. The implied estimate of the risk-aversion parameter is close to the true value, since the point estimate is close to

\[ \frac{e^{a\lambda} - 1}{a} = 0.35 \quad (\text{under } \lambda = 0.3 \text{ and } a = 1). \]
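The mapping between the risk-aversion parameter and the future default coefficient can be illustrated with a short sketch. It evaluates $(e^{a\lambda}-1)/a$ and inverts it by bisection. The function names, the bracketing interval and the tolerance are our own illustrative assumptions, not part of the estimation procedure of the paper.

```python
import math

def future_default_coef(a, lam):
    """Coefficient (e^{a*lam} - 1) / a; its limit as a -> 0 is lam."""
    if abs(a) < 1e-12:
        return lam
    return math.expm1(a * lam) / a

def implied_risk_aversion(coef, lam, lo=1e-9, hi=50.0, tol=1e-10):
    """Invert the coefficient for a by bisection.

    Assumes a lies in [lo, hi] (a hypothetical bracket); the mapping is
    increasing in a for a > 0 and lam > 0, so bisection applies.
    """
    if not future_default_coef(lo, lam) <= coef <= future_default_coef(hi, lam):
        raise ValueError("coefficient outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if future_default_coef(mid, lam) < coef:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Forward mapping at the parameter values quoted in the text.
coef = future_default_coef(a=1.0, lam=0.3)
# Round trip: recover a from the coefficient.
a_back = implied_risk_aversion(coef, lam=0.3)
```

Because the coefficient is fairly flat in $a$ near these values, small changes in an estimated coefficient translate into larger changes in the implied $a$.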

Next we add signaling but still no punishment (Table 23). Results are again very similar to those under the linearized data generating process (Table 3). Past default is now significant in the reduced form (Column 2) but not in the structural form (Column 4). It is not pivotal for overidentification either. The approximation error is not picked up by xt−1 or x2t−1 either (not reported). The coefficient of future default is the same as without signaling. Strong signaling would yield the same conclusions.

Under punishment but no signaling (Table 24), past default is again highly significant, important and sizable in the RF (Columns 1-2). Omitting past default from the SF (Column 3) leads to much less accurate implied estimates of the risk-aversion parameter (the coefficient of future default is further away from 0.35), and a rejection of overidentification. Adding past default as an extra term (Column 4) restores the non-rejection of overidentification, and all parameter estimates are in line with expectations:


Table 22: Nonlinear pricing equation; no signaling, no punishment
LHS variable: st
                  RF          RF          SF          SF
                  (1)         (2)         (3)         (4)
zt−1              −0.001                              0.001
                  (−1.35)                             (0.32)
xt−1              −0.039      −0.039
                  (−115.95)   (−123.04)
x2t−1             0.003       0.003
                  (14.39)     (14.34)
zt+1                                      0.31        0.31
                                          (47.69)     (45.33)
const             0.102       0.102       0.005       0.005
                  (197.35)    (236.14)    (2.19)      (2.05)
R2 / partial R2   0.43        0.43        0.12        0.11
overid                                    0.74        0.31
Notes. Instruments: xt−1 , x2t−1 and zt−1 in Column 3, xt and x2t−1 in Column 4.

Table 23: Nonlinear pricing equation; signaling, no punishment
LHS variable: st
                  RF          RF          SF          SF
                  (1)         (2)         (3)         (4)
zt−1              0.009                               0.001
                  (10.19)                             (0.38)
xt−1              −0.040      −0.041
                  (−116.45)   (−127.53)
x2t−1             0.003       0.003
                  (13.86)     (14.37)
zt+1                                      0.31        0.31
                                          (49.91)     (45.40)
const             0.102       0.105       0.005       0.005
                  (195.86)    (243.67)    (2.12)      (2.15)
R2 / partial R2   0.45        0.45        0.13        0.11
overid                                    0.53        0.28
Notes. Instruments: xt−1 , x2t−1 and zt−1 in Column 3, xt and x2t−1 in Column 4.


Table 24: Nonlinear pricing equation; punishment, no signaling
LHS variable: st
                  RF          RF          SF          SF          SF          SF
                  (1)         (2)         (3)         (4)         (5)         (6)
zt−1              0.10                                0.10
                  (107.72)                            (43.05)
xt−1              −0.040      −0.052                              −0.04
                  (−115.95)   (−129.77)                           (−10.47)
x2t−1             0.003       0.004                                           −0.000
                  (14.39)     (16.17)                                         (−0.57)
zt+1                                      0.40        0.31        0.11        0.40
                                          (48.30)     (45.33)     (4.13)      (47.91)
const             0.102       0.133       0.006       0.005       0.100       0.006
                  (197.35)    (245.57)    (2.04)      (2.05)      (11.02)     (2.11)
R2 / partial R2   0.66        0.46        0.12        0.11        0.003       0.12
overid                                    0.00        0.31        0.00        0.00

Notes. Instruments: xt−1 , x2t−1 and zt−1 in Column 3, xt and x2t−1 in Column 4, x2t−1 and zt−1 in Column 5, and xt−1 and zt−1 in Column 6.

the coefficient of zt−1 is close to its true value, while the coefficient of zt+1 is close to its value without punishment or signaling (Table 22). Here we also report SF specifications in which we (erroneously) add the fundamental xt−1 or x2t−1 as an extra right hand side variable, in order to check whether they can pick up the impact of past default. As Columns 5-6 show, this is not the case: though xt−1 is significant, it still leads to a clear rejection of overidentification (Column 5).

Under both signaling and punishment (Table 25), we have similar findings as before. Again, neither xt−1 nor x2t−1 can entirely pick up the extra effect coming from past default, and neither leads to a non-rejection of overidentification. Now the inclusion of the linear term xt−1 leads to an estimated future default coefficient that is even further away from the truth, and the remaining two instruments are very weak. Adding past default, on the other hand, leads to meaningful estimates, a non-rejection of overidentification and a correct channel decomposition. Strong signaling would yield the same conclusions.

Summing up, our method performs remarkably well under the fully nonlinear data generating process. There is no sign that past default or any of the fundamentals would be picking up approximation error terms. None of the fundamentals is able to replace past default as a necessary extra term in the structural form. The coefficient of future default is highly significant, and its value implies a risk-aversion parameter that is very close to the truth. Though one cannot rule out that the linear approximation would perform less well in some other parametrizations, we view these results as quite reassuring.

3.5.2 Controlled nonlinearity: using the two-piece arch approximation

In these scenarios we use a nonlinear data generating process, but in such a way that we can in principle estimate the true data generating process directly. As explained in (7) and (8) and recapped here, the spread is defined by

\[ s_t = b_1 E_t[z_{t+1}] + E_t\left[\left(a_2 + (b_2 - b_1) z_{t+1}\right) \chi(z_{t+1} > \bar z)\right] + \upsilon_t, \]

Table 25: Nonlinear pricing equation; signaling, punishment
LHS variable: st
                  RF          RF          SF          SF          SF          SF
                  (1)         (2)         (3)         (4)         (5)         (6)
zt−1              0.10                                0.10
                  (120.32)                            (41.91)
xt−1              −0.040      −0.054                              0.10
                  (−116.45)   (−128.30)                           (5.25)
x2t−1             0.003       0.004                                           −0.000
                  (13.86)     (15.35)                                         (−0.97)
zt+1                                      0.43        0.31        1.18        0.43
                                          (50.55)     (45.40)     (8.10)      (50.18)
const             0.102       0.138       −0.001      0.005       −0.256      −0.00
                  (195.86)    (243.47)    (−0.18)     (2.15)      (−5.19)     (−0.02)
R2 / partial R2   0.68        0.45        0.13        0.11        0.003       0.13
overid                                    0.00        0.28        0.00        0.00

Notes. Instruments: xt−1 , x2t−1 and zt−1 in Column 3, xt and x2t−1 in Column 4, x2t−1 and zt−1 in Column 5, and xt−1 and zt−1 in Column 6.

which can be estimated by running

\[ s_t = b_1 z_{t+1} + a_2 \chi(z_{t+1} > \bar z) + (b_2 - b_1) z_{t+1} \chi(z_{t+1} > \bar z) + \tilde\upsilon_t, \]

using (various) nonlinear transformations of $x_{t-1}$ as instruments. Here $\chi(y_t < \bar y)$ is the indicator function that the random variable $y_t$ is less than $\bar y$. The terms in (8) suggest instruments like $\chi(x_{t-1} < \tilde x)$ and $x_{t-1}\chi(x_{t-1} < \tilde x)$ for some cutoff $\tilde x$, but there is no a priori guarantee that those would be sufficiently correlated with the terms in (8). Our particular choice will be $\chi(x_{t-1} < \tilde x)$ and $x_{t-1}\chi(x_{t-1} < \tilde x)$, with the cutoff $\tilde x$ being the value of $x$ that leads to a value of $z = \bar z$. The parameter values of the true data generating process are as follows (under $a = 1$, $\lambda = 0.3$ and $\bar z = 0.3213$):

\[
\begin{aligned}
b_1 &= \frac{e^{a\lambda\bar z} - 1}{a\bar z} = 0.3213, \\
b_2 &= \frac{e^{a\lambda} - e^{a\lambda\bar z}}{a(1 - \bar z)} = 0.3735, \\
b_2 - b_1 &= 0.0521, \\
a_2 &= -\frac{1}{a} + \frac{e^{a\lambda\bar z} - \bar z e^{a\lambda}}{a(1 - \bar z)} = -0.0236.
\end{aligned}
\]
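The closed-form expressions for the two-piece arch can be checked with a small sketch. It evaluates $b_1$, $b_2$ and $a_2$ for given $(a, \lambda, \bar z)$ and confirms that the two linear pieces meet at the kink and hit the endpoint of the arch. The function names and the kink location used in the example call are illustrative assumptions of ours.

```python
import math

def two_piece_arch(a, lam, zbar):
    """Return (a1, b1, a2, b2) of the two-piece linear approximation.

    The pieces are a1 + b1*z on [0, zbar] and a2 + b2*z on [zbar, 1],
    using the closed-form expressions quoted in the text.
    """
    if not 0.0 < zbar < 1.0:
        raise ValueError("zbar must lie strictly between 0 and 1")
    e_kink = math.exp(a * lam * zbar)   # e^{a*lam*zbar}
    e_one = math.exp(a * lam)           # e^{a*lam}
    a1 = 0.0
    b1 = (e_kink - 1.0) / (a * zbar)
    b2 = (e_one - e_kink) / (a * (1.0 - zbar))
    a2 = -1.0 / a + (e_kink - zbar * e_one) / (a * (1.0 - zbar))
    return a1, b1, a2, b2

def h2(z, a, lam, zbar):
    """Evaluate the two-piece approximation at z in [0, 1]."""
    a1, b1, a2, b2 = two_piece_arch(a, lam, zbar)
    return a1 + b1 * z if z <= zbar else a2 + b2 * z

# Illustrative call; zbar = 0.4 is a hypothetical kink, not the paper's value.
a1, b1, a2, b2 = two_piece_arch(a=1.0, lam=0.3, zbar=0.4)
```

Since the underlying function is convex in $z$, the second slope exceeds the first, which is why $b_2 - b_1$ is positive.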

We will be checking first the true specification (with our a priori selected set of instruments). Then we add past default as an extra right hand side variable. Then we remove the two nonlinear terms from the right hand side but keep the full set of instruments (both with and without past default on the right hand side). Then we reestimate the previous two specifications but only with the original instruments xt−1 and x2t−1 . To save space, we no longer report RF results.

Results under no punishment or signaling are presented in Table 26. Unfortunately, even though we do know the form of the true specification, we do not manage to estimate it. As noted before, it might be


Table 26: 2-piece arch approximation; no signaling, no punishment SF SF SF SF SF (1) (2) (3) (4) (5) −0.003 −0.003 zt−1 (−0.80) (−0.97) −0.29 −0.03 χ (zt+1 > z¯) (−0.72) (−0.06) 0.14 0.20 χ (zt+1 > z¯) zt+1 (0.57) (0.79) 0.51 0.18 0.34 0.35 0.34 zt+1 (0.99) (0.27) (49.17) (45.84) (49.12) 0.001 0.003 −0.002 −0.002 −0.002 const (0.10) (0.44) (−0.60) (−0.61) (−0.59) R2 / partial R2 0.10/0.11/0.11 0.09/0.11/0.10 0.12 0.11 0.12 overid 0.44 0.36 0.57 0.58 0.30 LHS variable: st

SF (6) −0.003 (−0.96)

0.35 (45.79) −0.002 (−0.60) 0.11 0.23

Notes. The instruments contain the full set in Columns 1-4, and only xt−1 , x2t−1 and zt−1 in Columns 5-6.

the consequence of our instrument selection: though not reported, the Shea partial F statistics would be very low in both Column 1 and 2. We nevertheless restrain ourselves from a full-blown search for better instruments. Another potential reason for the dismal results is a high degree of multicollinearity between the right hand side variables, as indicated by their very low t-statistics combined with an F statistic of joint significance reaching 780 (Column 1) and 570 (Column 2), with p values of 0.00 in both cases. Once we drop the extra terms relative to our benchmark linear regression specification (6) and keep only future default (Columns 3-6), we get the same findings as in the case of the general nonlinear data generating process (Table 22). More importantly, now we can compare the estimated future default coefficient to its true value b1 = 0.3213, and conclude that the linear approximation estimates the true parameter quite well.

Under signaling but no punishment (Table 27), results remain the same as in the previous case (Table 26). Again, past default is neither significant, nor pivotal for overidentification not to be rejected.

Next we consider punishment but no signaling (Table 28). Though this case also features problems in identifying the three expected value (or probability) terms (Columns 1-2), the identification of the extra effect of past default works just as usual: it is highly significant in the SF (Columns 2, 4 and 6), and not adding it as an extra right hand side variable leads to a (near-) rejection of overidentification (Columns 1, 3 and 5). The linear approximation again gets quite close to the true value of b1 (in Columns 4 and 6), and the omitted probability and expected value terms do not lead to a clear rejection of overidentification.

Finally, we add both signaling and punishment (Table 29). Results are almost identical to those in the previous table. Overidentification is rejected even more strongly in Column 1, where we again do not manage to estimate the true data generating process well.

Summing up, the linear approximation performs remarkably well under the constrained nonlinear data generating process as well. There is no sign that past default or any of the fundamentals would be picking up approximation error terms. None of the fundamentals is able to replace past default as a necessary extra term in the structural form. The coefficient of future default is highly significant, and its value is close to the true parameter value. Unfortunately, we do not manage to identify the true data generating

LHS variable: st zt−1 χ (zt+1 > z¯) χ (zt+1 > z¯) zt+1 zt+1 const R2 / partial R2 overid

Table 27: 2-piece arch approximation; signaling, no punishment SF SF SF SF SF (1) (2) (3) (4) (5) −0.003 −0.003 (−0.91) (−0.94) 0.09 −0.002 (0.15) (−0.00) 0.16 0.16 (0.75) (0.78) 0.10 0.19 0.34 0.35 0.34 (0.16) (0.32) (51.26) (45.99) (51.20) 0.004 0.003 −0.001 −0.002 −0.001 (0.50) (0.42) (−0.57) (−0.64) (−0.56) 0.11/0.13/0.12 0.09/0.11/0.10 0.13 0.11 0.13 0.33 0.26 0.50 0.49 0.31

SF (6) −0.003 (−0.94)

0.35 (45.92) −0.002 (−0.64) 0.11 0.92

Notes. The instruments contain the full set in Columns 1-4, and only xt−1 , x2t−1 and zt−1 in Columns 5-6.

LHS variable: st zt−1 χ (zt+1 > z¯) χ (zt+1 > z¯) zt+1 zt+1 const R2 / partial R2 overid

Table 28: 2-piece arch approximation; no signaling, punishment SF SF SF SF SF (1) (2) (3) (4) (5) 0.10 0.10 (27.98) (36.97) 9.08 −0.03 (1.90) (−0.06) 2.49 0.20 (2.82) (0.79) −11.45 0.18 0.45 0.35 0.45 (−1.88) (0.27) (49.43) (45.84) (49.38) 0.10 0.003 −0.003 −0.002 −0.003 (1.20) (0.44) (−0.91) (−0.61) (−0.93) 0.10/0.12/0.11 0.09/0.11/0.09 0.12 0.11 0.12 0.12 0.36 0.00 0.58 0.00

Notes. The instruments contain the full set in Columns 1-4, and only xt−1 , x2t−1 and zt−1 in Columns 5-6.


SF (6) 0.10 (36.97)

0.35 (45.79) −0.002 (−0.60) 0.11 0.23

LHS variable: st zt−1 χ (zt+1 > z¯) χ (zt+1 > z¯) zt+1 zt+1 const R2 / partial R2 overid

Table 29: 2-piece arch approximation; signaling SF SF SF (1) (2) (3) 0.10 (34.52) −3.33 −0.002 (−1.64) (−0.00) 0.31 0.16 (0.40) (0.78) 3.66 0.19 0.47 (1.68) (0.32) (51.81) −0.02 0.003 −0.009 (−0.81) (0.42) (−2.76) 0.11/0.13/0.12 0.09/0.11/0.10 0.13 0.00 0.26 0.00

and punishment SF SF (4) (5) 0.10 (35.73)

SF (6) 0.10 (35.73)

0.35 (45.99) −0.002 (−0.64) 0.11 0.49

0.35 (45.92) −0.002 (−0.64) 0.11 0.23

0.47 (51.76) −0.009 (−2.78) 0.13 0.00

Notes. The instruments contain the full set in Columns 1-4, and only xt−1 , x2t−1 and zt−1 in Columns 5-6.

process, due to a combination of potential multicollinearity and the weakness of our a priori selected instruments. In this last respect, we believe that it would be a worthwhile but separate project to explore the estimation of such a nonlinear structural-form pricing equation.

4 Summary of the results

Our objective was to explore the performance of our error-in-variables method using a simulation exercise. We generated artificial data for fundamentals, default behavior and loan spreads, under various assumptions. Our benchmark regression was always a linear regression in expected default, replacing it with its realization and then instrumenting with fundamentals. We considered the following cases:

• Linear pricing equation, estimation in levels, a full combination of signaling and/or punishment.
• Linear pricing equation, estimation in first differences, a full combination of signaling and/or punishment.
• Linear pricing equation, estimation in levels, endogeneity of the pricing error term, a full combination of signaling and/or punishment.
• Exogenous and endogenous fluctuations in risk-aversion.
• Nonlinear pricing equation but still estimated by our linear specification.

Our results were in general quite reassuring. In all cases, we managed to get a meaningful and (near) correct point estimate of future default. Our method was able to tell apart punishment from signaling, both by looking at the significance of past default as an extra right hand side variable, and its role in the overidentification test’s rejection. Its performance was not influenced by changing the degree of nonlinearity of the (default) prediction equation. Even when the true pricing equation was nonlinear,

our linear structural-form regression was meaningful, it led to the correct identification of a punishment effect, and the point estimate of future default was either close to its true value, or the implied underlying parameter of risk-aversion was close to the truth. The approximation error never led to the significance of any additional right hand side variable in the SF. The channel decomposition also worked in all cases. We could also illustrate three methodological points: (1) the importance of strict/sequential exogeneity and the proper choice of instruments under first differencing, (2) the immunity of our approach to the endogeneity of the pricing error terms, and (3) the property that one does not have to use all available instruments to identify the SF correctly, while missing fundamentals might cause some bias in the RF. One important caveat was the limited separability of the impact of past default from an endogenous shift in risk-aversion. The two effects differ in that a shift in risk-aversion leads to a more complicated, nonlinear true data generating process, which, as we find, can be indicated by a rejection of overidentification. This rejection however may not be necessarily conclusive for realistic sample sizes. Such tests tend to overreject in small samples (see Hayashi, 2000, for example). Given that we do not get a rejection in our actual empirical exercise, it is rather indicative of a simple additive punishment specification. Though this lack of clear separability might be an issue from a sovereign risk theory/modeling perspective, the implication for a borrowing country is the same. No matter whether a default leads to a higher future spread because of increased risk-aversion or a punishment effect, there is a cost of default for the country. Moreover, even such a default-driven increase in risk-aversion requires some form of borrower-lender lock-in, pointing to relational contracts.
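The benchmark idea summarized in this section (replace expected default by its realization, then instrument with fundamentals) can be illustrated with a deliberately stylized sketch. Everything in it is a hypothetical assumption of ours: a single fundamental, a linear default prediction equation, the noise sizes, and a just-identified IV estimator. It is not the simulation design used for the tables.

```python
import random

def slope(y, x, w):
    """Slope of y on x using w as instrument: cov(y, w) / cov(x, w).

    Passing w = x gives the ordinary least squares slope.
    """
    n = len(y)
    my, mx, mw = sum(y) / n, sum(x) / n, sum(w) / n
    cov_yw = sum((yi - my) * (wi - mw) for yi, wi in zip(y, w))
    cov_xw = sum((xi - mx) * (wi - mw) for xi, wi in zip(x, w))
    return cov_yw / cov_xw

def simulate(n=20000, b=0.35, seed=0):
    """Generate (fundamental, realized default, spread) triples.

    Hypothetical DGP: expected default is linear in the fundamental,
    realized default equals expected default plus a forecast error,
    and the spread prices expected (not realized) default.
    """
    rng = random.Random(seed)
    fundamentals, realized, spreads = [], [], []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        expected = 0.3 - 0.2 * x               # expected default given x
        z = expected + rng.gauss(0.0, 0.2)     # realization = expectation + error
        s = b * expected + rng.gauss(0.0, 0.02)
        fundamentals.append(x)
        realized.append(z)
        spreads.append(s)
    return fundamentals, realized, spreads

x, z, s = simulate()
ols = slope(s, z, z)  # attenuated: realized default is a noisy proxy
iv = slope(s, z, x)   # instrumenting with the fundamental removes the bias
```

In this sketch the forecast error in realized default plays the role of measurement error, so the OLS slope is biased towards zero while the IV slope is close to the pricing coefficient `b`.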

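Appendix

Here we collect and present the calculation details, which describe the analytical derivations of the pricing equations. Let us start with the necessary formulas for the linear (1-piece arch) and the controlled nonlinear (2-piece arch) approximations. Remember that

\[ H_2(z) = \begin{cases} a_1 + b_1 z & \text{in } [0, \bar z] \\ a_2 + b_2 z & \text{in } [\bar z, 1] \end{cases}. \]

The four parameters are such that $H_2(0) = e^{a_t\lambda 0} - 1 = 0$, $H_2(\bar z) = e^{a_t\lambda\bar z} - 1$, $H_2(1) = e^{a_t\lambda} - 1$. Similarly, $H_1(z)$ was a fully linear function, spanning the arch between 0 and 1 of the function $e^{az\lambda} - 1$. We have four equations for four unknowns, yielding

\[ a_1 = 0, \quad b_1 = \frac{e^{a\bar z\lambda} - 1}{a\bar z}, \quad a_2 = \frac{e^{a\bar z\lambda} - \bar z e^{a\lambda}}{a(1 - \bar z)} - \frac{1}{a}, \quad b_2 = \frac{e^{a\lambda} - e^{a\bar z\lambda}}{a(1 - \bar z)}. \]

If the distribution of $z_{t+1}$ is given by

\[ z_{t+1} = \begin{cases} 1 & \text{if } x_t \le \bar x_t - \frac{1}{\alpha_2} \\ -\alpha_2 (x_t - \bar x_t) & \text{if } -\frac{1}{\alpha_2} + \bar x_t < x_t < \bar x_t \\ 0 & \text{if } x_t \ge \bar x_t \end{cases}, \]

then

\[
\begin{aligned}
E_t(H_1(z)) &= (e^{a_t\lambda} - 1)\, E[z] \\
&= (e^{a_t\lambda} - 1) \left( F\!\left(\bar x_t - \tfrac{1}{\alpha_2}\right) + E_t\!\left[ -\alpha_2 (x_t - \bar x_t) \,\Big|\, -\tfrac{1}{\alpha_2} + \bar x_t < x_t < \bar x_t \right] \left( F(\bar x_t) - F\!\left(\bar x_t - \tfrac{1}{\alpha_2}\right) \right) \right),
\end{aligned}
\]

\[
\begin{aligned}
E_t[H_2(z)] &= (a_2 + b_2)\, F\!\left(\bar x_t - \tfrac{1}{\alpha_2}\right) + b_1 E_t\!\left[ -\alpha_2 (x_t - \bar x_t) \,\Big|\, -\tfrac{\bar z}{\alpha_2} + \bar x_t < x_t < \bar x_t \right] \left( F(\bar x_t) - F\!\left(\bar x_t - \tfrac{\bar z}{\alpha_2}\right) \right) \\
&\quad + E_t\!\left[ a_2 - b_2 \alpha_2 (x_t - \bar x_t) \,\Big|\, -\tfrac{1}{\alpha_2} + \bar x_t < x_t < \bar x_t - \tfrac{\bar z}{\alpha_2} \right] \left( F\!\left(\bar x_t - \tfrac{\bar z}{\alpha_2}\right) - F\!\left(\bar x_t - \tfrac{1}{\alpha_2}\right) \right) \\
&= b_1 E[z] + E\left[ \left( a_2 + (b_2 - b_1) z \right) \chi(z > \bar z) \right],
\end{aligned}
\]

where $\bar x_t = E_t[x_t] = A x_{t-1}$.

Now let us move on to the general nonlinear case. A nonlinear pricing formula that takes into account the distribution for $z$:

\[ s_t = \frac{1}{a_t} \ln\left( E_t\left( e^{a_t z_{t+1} \lambda} \right) \right). \]

Using again the link between $z_{t+1}$ and $x_t$, we can write

\[ e^{a_t z_{t+1} \lambda} = \begin{cases} e^{a_t\lambda} & \text{if } x_t \le \bar x_t - \frac{1}{\alpha_2} \\ e^{-a_t\lambda\alpha_2 (x_t - \bar x_t)} & \text{if } -\frac{1}{\alpha_2} + \bar x_t < x_t < \bar x_t \\ 1 & \text{if } x_t \ge \bar x_t \end{cases}. \]

Let us denote the set of default by $D_{t+1}$. Then

\[ E_t\left( e^{a_t z_{t+1} \lambda} \right) = e^{a_t\lambda} F\!\left(\bar x_t - \tfrac{1}{\alpha_2}\right) + E_t\!\left[ e^{-a_t\lambda\alpha_2 (x_t - \bar x_t)} \,\big|\, D_{t+1} \right] \left( F(\bar x_t) - F\!\left(\bar x_t - \tfrac{1}{\alpha_2}\right) \right) + \left( 1 - F(\bar x_t) \right), \]

\[ E_t\!\left[ e^{-a_t\lambda\alpha_2 (x_t - \bar x_t)} \,\big|\, D_{t+1} \right] = \exp\left\{ -a_t\lambda\alpha_2 E_t\left[ (x_t - \bar x_t) \,|\, D_{t+1} \right] + \frac{a_t^2\lambda^2\alpha_2^2}{2} Var_t\left[ (x_t - \bar x_t) \,|\, D_{t+1} \right] \right\}, \]

so:

\[
\begin{aligned}
s_t &= \frac{1}{a_t} \ln\left( E_t\left( e^{a_t z_{t+1} \lambda} \right) \right) \\
&= \frac{1}{a_t} \ln\Big\{ e^{a_t\lambda} F\!\left(\bar x_{t+1} - \tfrac{1}{\alpha_2}\right) + \left( 1 - F(\bar x_{t+1}) \right) \\
&\qquad + \left( F(\bar x_{t+1}) - F\!\left(\bar x_{t+1} - \tfrac{1}{\alpha_2}\right) \right) \exp\left\{ -a_t\lambda\alpha_2 E_t\left[ (x_{t+1} - \bar x_{t+1}) \,|\, D_{t+1} \right] + \frac{a_t^2\lambda^2\alpha_2^2}{2} Var_t\left[ (x_{t+1} - \bar x_{t+1}) \,|\, D_{t+1} \right] \right\} \Big\} \\
&= \frac{1}{a_t} \ln\Big\{ 1 + e^{a_t\lambda} F\!\left(\bar x_{t+1} - \tfrac{1}{\alpha_2}\right) - F(\bar x_{t+1}) + \left[ F(\bar x_{t+1}) - F\!\left(\bar x_{t+1} - \tfrac{1}{\alpha_2}\right) \right] \cdot \\
&\qquad \exp\left\{ -a_t\lambda\alpha_2 E_t\left[ (x_{t+1} - \bar x_{t+1}) \,|\, D_{t+1} \right] + \frac{a_t^2\lambda^2\alpha_2^2}{2} Var_t\left[ (x_{t+1} - \bar x_{t+1}) \,|\, D_{t+1} \right] \right\} \Big\}
\end{aligned}
\tag{10}
\]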

Here the expected value and the variance correspond to a variable with normal distribution, so (10) can be computed analytically.
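As a cross-check on the nonlinear pricing formula, the spread $\frac{1}{a}\ln E[e^{a z \lambda}]$ can also be evaluated by brute-force numerical integration instead of the analytical expression. The sketch below does this under assumptions of our own: the fundamental is normal with mean `mu` and standard deviation `sigma`, the default threshold is a separate parameter `threshold`, and the expectation is approximated on a grid of `n` points. It then compares the result with the first-order bound $\lambda E[z]$ and the one-piece arch value $\frac{e^{a\lambda}-1}{a}E[z]$.

```python
import math

def default_share(x, threshold, alpha2):
    """z = 1 below threshold - 1/alpha2, 0 above threshold, linear in between."""
    return min(1.0, max(0.0, -alpha2 * (x - threshold)))

def nonlinear_spread(a, lam, threshold, alpha2, mu, sigma, n=20001):
    """Approximate (1/a) * ln E[exp(a*lam*z)] and E[z] on an x-grid.

    The grid weights are normal density values renormalized to sum to one,
    so the expectations are taken under a discrete approximation of the
    assumed N(mu, sigma^2) distribution of the fundamental.
    """
    lo, hi = mu - 8.0 * sigma, mu + 8.0 * sigma
    step = (hi - lo) / (n - 1)
    xs = [lo + i * step for i in range(n)]
    weights = [math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in xs]
    total = sum(weights)
    weights = [w / total for w in weights]
    zs = [default_share(x, threshold, alpha2) for x in xs]
    mgf = sum(w * math.exp(a * lam * z) for w, z in zip(weights, zs))
    expected_z = sum(w * z for w, z in zip(weights, zs))
    return math.log(mgf) / a, expected_z

# Hypothetical parameter values, chosen only for illustration.
spread, ez = nonlinear_spread(a=1.0, lam=0.3, threshold=0.0,
                              alpha2=2.0, mu=0.5, sigma=0.5)
lower = 0.3 * ez                       # lam * E[z], by Jensen's inequality
upper = (math.exp(0.3) - 1.0) * ez     # one-piece arch value for a = 1
```

The exact spread always lies between the two linear expressions, which gives a rough sense of how much room there is for approximation error in a linear specification.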
