CREATING CONSISTENT PRICE SERIES

Prepared for the Office of National Drug Control Policy by

BOTEC Analysis Corporation June 8, 1993


By: Jonathon Caulkins, Andrew H. Chalsma

With assistance from: Mark A.R. Kleiman, David A. Boyum

TABLE OF CONTENTS

Executive Summary ........ iii
Introduction ........ 1
I. Improving the Reporting of Domestic Monitor Program Data ........ 2
    2. Estimating Price in One Place at One Time ........ 3
    3. Finding the National Average Price ........ 8
    4. Finding National Price by Region of Origin ........ 10
    5. Estimating Trends in National Price ........ 11
    6. Estimating Average Purity and Trends in Average Purity ........ 12
    7. Adjusting for Transaction Size ........ 13
    8. Using Pure Quantity Purchased As an Alternative Measure of Price ........ 14
    9. Suggestions for Improving Sampling ........ 15
    10. Expanding the Number of Observations ........ 17
    11. Computing Retail Heroin Price and Purity Series ........ 20
    12. Summary ........ 21
II. Creating Price Series from STRIDE ........ 23
    1. The STRIDE Database ........ 23
    2. Outliers and Other Data Issues ........ 24
    3. Adjusting for Quantity Discounts ........ 26
    4. Adjusting for Differences in Purity ........ 27
    5. Alternative Ways of Constructing Price Series ........ 30
    6. Application to STRIDE's Cocaine Data ........ 35
    7. Summary ........ 42
III. Existence and Implications of Variability Between Purchases ........ 43
    1. Variability in Purchases of Similar Quantity ........ 44
    2. Variability in Purchases of Similar Price ........ 53
    3. Variability in Standardized Prices ........ 53
    4. Implications for Detecting Variation in Prices ........ 55
    5. Summary ........ 59
References ........ 61
Appendix ........ 63

Executive Summary

Scope of This Report

This report describes methods of creating price series for illicit drugs from available data. This topic is important both because price series have the potential to play an important role in policy planning, analysis, and evaluation and because currently available data are not reported in the most informative manner.

For two decades price series have been used to address policy questions ranging from the link between prices and property crime (Brown and Silverman, 1974), to the elasticity of demand (Silverman and Spruill, 1977; DiNardo, 1993), to the effect of changing prices on emergency room mentions of illicit drugs, medical examiner mentions, and the fraction of arrestees testing positive for illicit drugs (ONDCP, 1992). These studies, however, only scratch the surface of what could be done. For example, price series could be examined in relation to large seizures and/or important arrests to determine whether it is possible to create "spot shortages" for drugs, or to measure the effectiveness of various enforcement techniques in reducing supply. Examining retail and wholesale price movements might also shed light on how price increases at one level are passed on to lower levels, a key question in assessing the efficacy of source country control, interdiction, and high-level enforcement (Caulkins 1990, Chapter 3). Monitoring prices could also give earlier warning of changes in market conditions than is provided by other available indicators (BOTEC 1991; Kleiman and Caulkins 1992).

There are a number of reasons why price series have not been better used. One of the more significant is simply that challenging technical issues arise when working with price data for illicit drugs. All data associated with illicit activities have "warts"; drug price data are no exception. Seemingly sensible, straightforward approaches to analyzing these data can lead one astray.
This report seeks to make price series more available for policy purposes by directly addressing some of these difficult technical issues, explicitly describing procedures for creating price series from available data, and illustrating those procedures by applying them to data from the Drug Enforcement Administration.

Broadly speaking, there are two ways to produce better price series. One is to collect better data; the other is to use the available data more effectively. The second is the focus of this report. That does not mean that the way data are collected is not important; nor that current data collection procedures are perfect. Designing better data collection procedures is an important and challenging task worthy of its own study; here we take data collection as given and consider how best to sift it for insight.

The discussion in this document is broken down into three sections. The first directly addresses the Domestic Monitor Program, but most of the techniques and insights developed are relevant to any data set containing transaction-level data drawn from one particular market level. Often, however, price data span several market levels. Such is the case, for example, with the Drug Enforcement Administration's STRIDE database. The second section parallels the first in that it directly addresses the STRIDE database, but the discussion readily generalizes to any transaction-level data base with data drawn from several market levels.

Typically one is interested not just in creating price series, but also in interpreting them. For example, one might wish to know whether a recent price decline was truly significant or whether it might not simply be attributable to random "noise." Such interpretations must be grounded in knowledge of the nature of the variability in price data. The third section investigates this variability and uses the results to comment on the power and significance of statistical tests as a function of the amount of data collected.

The focus of this report is on methodology, but it includes actual price series for cocaine and heroin. It is worth commenting on one striking feature of those price series, before summarizing the methodological findings by chapter. The national price series estimated for cocaine looks familiar. It shows steep declines through 1989, an increase in 1990, followed by another downturn in 1991. What may be more surprising, at least to people who do not regularly study price data, is that the decline in heroin prices is of comparable dimensions. Cocaine prices fell by about 70% between 1983 and 1989. Heroin prices fell by 50% over the same period and have fallen by another third since then (for a total decline of about 66%).

These facts can be interpreted in several ways. On the one hand, the decline in cocaine prices is often blamed for substantially contributing to the cocaine-related problems of the 1980s. Inasmuch as that is true, the decline in heroin prices should serve as a warning that heroin should not be ignored in the 1990s. On the other hand, since heroin use does not appear to have grown by as much as cocaine use, this may suggest that the causal links are more complicated.
For example, the explosive growth in the size of drug markets in general may have created economies of scale for both heroin and cocaine that led to similar declines in price for both drugs.

Conducting the analyses necessary to distinguish between these and other interpretations is not the subject of this report. Rather, this report strives to provide the fodder (accurate, consistent, reliable price series) to support those analyses.

Chapter I: Improving the Reporting of Domestic Monitor Program Data

The Drug Enforcement Administration's Domestic Monitor Program (DMP) reports price, purity, and source area information for retail heroin markets on a quarterly basis. DEA has invested great effort in reporting the results in a clear and effective manner, and the reports produced since the end of 1990 stand head and shoulders above those produced earlier. The first chapter seeks to continue this trend by suggesting what may be better ways of analyzing and reporting the DMP data.

The chapter begins by discussing the problem of estimating the price in a particular city at a particular time. At first glance this might seem a trivial exercise, but there are subtleties which undermine straightforward approaches. In particular, there is so much variability across purchases that simply averaging the price paid per pure milligram, as the DMP now does, yields a biased and noisy measure of the average market price.


This report suggests instead using an average which weights observations by transaction quantity. There are two reasons to favor the quantity-weighted average. First, for most purposes it is actually the more relevant average. Suppose, for example, one had an estimate of the quantity consumed and wanted to estimate total spending, either to estimate dealers' revenues or to relate drug consumption to consumption-induced property crime. Then one would want to multiply the consumption estimate by the quantity-weighted average, not the transaction-weighted average. One can similarly show that it is changes in the quantity-weighted average which reveal how an addict's spending will change, on average, if the addict wishes to maintain a particular habit size. The transaction-weighted average is the correct one to use if one wants to get a sense of the price most likely to characterize any individual transaction, but most policy decisions rest on information about aggregates, not individual transactions. So, for most purposes the quantity-weighted average is more useful.

Furthermore, the quantity-weighted average is more stable statistically. For example, if one or more of the purchase observations has zero purity, a common occurrence, then the transaction-weighted average "blows up" because it entails dividing by zero. The problem is almost as severe (and is harder to detect) with very low purity observations. Currently the DMP avoids these problems by excluding zero-purity observations and "outliers." This "fix" is problematic, however. In the first place it discards potentially valuable data. More subtly, the choice of the cutoff price per pure unit, above which observations are discarded, can substantially affect the price per pure unit measured by the transaction-weighted average. This truncation can mask movements in the underlying price.

Hence, one of the primary contentions of the first chapter is that the quantity-weighted average should be used instead of the transaction-weighted average.

One is, of course, interested in estimating quantities other than the average price in one city at one time. Statistical issues of comparable importance and subtlety arise in these estimations as well. So, the chapter goes on to discuss finding a nationally representative price, finding a nationally representative price by region of origin, estimating trends in prices, estimating levels and trends in purity, and so on. The lessons learned in this discussion are then applied to a subset of STRIDE data that are comparable to typical DMP data to generate time series for the retail price and purity of heroin between 1981 and 1992. The results are displayed on page 20.

Chapter II: Creating Price Series from STRIDE

Section II describes how one might create price series for cocaine from the Drug Enforcement Administration's STRIDE (System to Retrieve Information from Drug Evidence) database. STRIDE differs from the Domestic Monitor Program (DMP) in a number of ways, but from a methodological perspective the main difference is that STRIDE data are collected over a wide range of market levels. STRIDE's cocaine purchases, for example, range over more than seven orders of magnitude in size. Substantial quantity discounts exist for illicit drugs: the price per unit decreases as the transaction size increases. Hence, one cannot directly compare the price per unit of observations taken at different market levels. Fortunately, a simple log-linear model is a surprisingly effective way of adjusting for these quantity discounts and, thereby, allowing one to compare observations from different market levels.


One would similarly like to adjust for differences in quality across observations, where for drugs the principal measure of quality is their purity. Adjusting prices for quality is tricky, but Subsection 4 seeks to resolve some long-standing confusion on this issue. The earliest studies used a log-linear adjustment for differences in quality as well as quantity. Other studies have assumed a priori that quality premia were complete and, hence, collapsed the quality and quantity adjustments into a single factor. The traditional log-linear adjustment is problematic because it produces price series that conflict with conventional wisdom and suggests that drug market participants are almost indifferent to quality. Making the a priori assumption that pure quantity drives transaction-level prices, however, is also problematic, because it degrades the fit of the model and produces implausible estimates for the coefficient describing the quantity discount.

Fortunately, there is a resolution of this conundrum: the realization that it is the expected, not the actual, purity which governs transaction-level prices. Obviously it is impossible to reconstruct buyers' expectations on a transaction-by-transaction basis, but that is not necessary. Simple proxies, such as the median purity observed in the same location and time for similar size transactions, work well enough.

Price series can be constructed in a variety of ways. The most straightforward method is simply to report average prices for a narrow range of quantities, but that approach does not make efficient use of the data. A better approach is to use some form of regression. Price is regressed on a variety of regressors including time variables, and the coefficients estimated for the time variables are used to construct a price series. Section 5 of Chapter II reviews these methods and suggests an alternative, standardizing individual purchase observations.

The reason quantity discounts and variations in purity make it difficult to monitor prices is that they prevent one from directly comparing different observations. Comparing a one gram purchase to a seven gram purchase is like comparing an apple to an orange. Hence, what one would truly like is a way to convert apples into oranges, and vice versa. Fortunately, in this context such alchemy is possible. Formulas are given for converting individual observations into standard units so that they can be compared. The significant advantage of this approach is that once all of the data have been converted to common units, the price data can be analyzed just like any other cross-sectional time series.

Price series are then created for cocaine from STRIDE data by standardizing individual observations for their varying expected pure quantities. Price series are created at the gram, ounce, and kilogram level for a variety of cities and then synthesized into national price series. The results are displayed on pages 39 to 42.

Chapter III: Existence and Implications of Variability Between Purchases

By their very nature, price series rise and fall over time. The price series created in the first two chapters are no exception. Sometimes these variations reflect true changes in underlying market conditions; sometimes they reflect nothing more than random variation, i.e. "the luck of the draw." Distinguishing between these cases is vitally important. Making such distinctions is the subject of the third and final chapter.

In order to make such distinctions, one must first characterize the "natural" variability which occurs in the absence of any underlying, systematic changes. Ideally, to do this one would collect a large number of observations under identical conditions. Practically speaking, there are no such data. One can, however, for a particular city and time period, examine purchases which are (1) of the same or similar weight, (2) of the same or similar dollar value, and (3) of differing weight, but which have been standardized for those differences. In each case, we found a parametric probability distribution which describes the data reasonably well.

For observations of the same or similar weight, to a first-order approximation, prices are roughly normally distributed, at least after outliers have been removed. Such normality, or near-normality, is fortuitous because it is relatively easy to design statistical tests of significance for differences in normally distributed random variables. The design of these tests is discussed, as are procedures for describing these tests' power, or ability to detect true changes. The power of a statistical test depends on (1) the number of observations in each sample, (2) how variable repeated observations are, and (3) the size of the true difference one is trying to detect. Graphs relating these factors, known as power curves, are derived and presented. The results are displayed on pages 58 and 59. The power curves can be used for a variety of purposes, including deciding how many observations should be collected by intelligence-oriented purchase programs.

The distribution and variability of the purity of purchases of a similar size in a particular city and year is also examined. It is found that they can reasonably be described by a statistical distribution known as the beta distribution. Doing so allows one to obtain better intuition into the nature of changes in purity and suggests better ways of reporting purity ranges than did previous methods. Parallel results are obtained for repeated observations of the same price and for observations that have been standardized for transaction size.
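The three determinants of power named in this summary (sample size, variability of repeated observations, and the size of the true difference) can be illustrated with a minimal sketch. It assumes a two-sided, two-sample z-test with a known common standard deviation, which is a textbook approximation and not necessarily the test design developed in Chapter III. The function name `two_sample_power` and its parameters are hypothetical.

```python
from statistics import NormalDist


def two_sample_power(n_per_sample, sigma, true_difference, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes (for illustration only) normally distributed observations with
    a known common standard deviation `sigma` and equal sample sizes.
    """
    standard_normal = NormalDist()
    # Standard error of the difference between the two sample means.
    standard_error = sigma * (2.0 / n_per_sample) ** 0.5
    # Critical value for a two-sided test at significance level alpha.
    z_crit = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    # True difference expressed in standard-error units.
    shift = abs(true_difference) / standard_error
    # Probability that the test statistic lands in either rejection region.
    return standard_normal.cdf(shift - z_crit) + standard_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # One point on a power curve per sample size, for a true difference
    # equal to one standard deviation.
    for n in (5, 10, 20, 40):
        print(n, round(two_sample_power(n, sigma=1.0, true_difference=1.0), 3))
```

Evaluating the function over a grid of sample sizes and true differences traces out curves of the kind the summary calls power curves; with no true difference the function returns the significance level itself.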


Introduction

Price series for illicit drugs are of great potential value for policy purposes, but that potential has never been fully exploited. This document seeks to remedy that situation by addressing some of the challenging technical issues that arise when working with drug price data, explicitly setting forth the procedures for creating price series, and applying those procedures to data from the Drug Enforcement Administration. The document is divided into three chapters. The first directly addresses the Drug Enforcement Administration's Domestic Monitor Program, but most of the techniques and insights are relevant to any data set containing transaction-level data drawn from one particular market level. Often, however, the price data available span several market levels. Since quantity discounts for illicit drugs are substantial, one must adjust, in some way, for the fact that data drawn from different market levels are not directly comparable. The second chapter of this report examines such issues in the context of the Drug Enforcement Administration's STRIDE (System to Retrieve Information from Drug Evidence) data base. Typically one is interested in more than point estimates. For example, one might wish to know whether a recent price decline was statistically significant or whether it was merely random "noise" around a stable, but imperfectly measured quantity. Moving beyond point estimates requires that one know something about the underlying random variation in the data and address questions about the power and significance of statistical tests as a function of the amount of data collected. Some of these issues are explored in Chapter III.


I. Improving the Reporting of Domestic Monitor Program Data

The Drug Enforcement Administration's Domestic Monitor Program (DMP) reports price, purity, and source area information for retail heroin markets on a quarterly basis (DEA, various years). The program has been in operation, off and on, for more than a decade, but it has been substantially expanded in recent years. Likewise, greater effort has gone into reporting the results in a clear and effective manner; the reports produced starting at the end of 1990 stand head and shoulders above those produced previously. This chapter seeks to continue this trend by suggesting what may be better ways of analyzing and reporting the data.

Past and future efforts to improve DMP are well justified. Early warning of trends in drug markets is vital to effective policy response, and heroin is the most difficult mass-market drug to monitor because its users form a relatively small, highly deviant, and often inaccessible population. Traditional indicators such as DAWN and DUF¹ are of significant value, but alone are not always sufficient, because they tend to lag changes in market conditions. If falling heroin prices induce an individual to initiate heroin consumption, it may be some months, or perhaps even years, before that person is arrested, hospitalized, or dies with the drug in his or her system (or that person may never show up in those statistics at all).

Broadly speaking there are two ways of improving DMP. One is by collecting better data; the other is by using the collected data more efficiently. Collecting data on retail heroin markets is both difficult and dangerous. Recommendations concerning these activities should only be made by people who intimately understand the risks involved. Quite naturally most of this expertise lies within the enforcement agencies. Given the limited scope of this project, we could not begin to achieve a comparable understanding of the relevant issues. Hence, we confine our comments on data collection to a few thoughts in Sections 9 and 10.

The bulk of this chapter is devoted instead to suggesting ways in which those data which are collected could be used more effectively. We begin by discussing the problem of estimating the price in a particular city at a particular time. At first glance this might seem a trivial exercise, but there are subtleties which undermine straightforward approaches. In particular, there is so much variability across purchases that simply averaging the price paid per pure milligram yields a biased and noisy measure of the average market price. The discussion is then generalized in succeeding sections to finding a nationally representative price, estimating trends in the national price, and so on. Most of the insights obtained carry over to the analysis of purity data, and a few to the analysis of source region as well. An alternative way of monitoring price, studying trends in the pure amount obtained for a fixed number of dollars, is also discussed. In Section 11 the various suggestions are applied to produce retail heroin price and purity series for 10 cities from 1981 through the third quarter of 1992.

¹ DAWN measures mentions of illicit substances in emergency rooms and medical examiner offices. DUF monitors the fraction of arrestees whose urine tests positive for an illicit substance.


2. Estimating Price in One Place at One Time

Estimating the price at a single place and time is harder than it appears. For one thing, there is not a single price. Heroin is not sold in standard units, so one needs to think about the price per unit, not the total price.² But even the price per unit, typically reported as the price per pure milligram, can vary across locations within the city; among regions of origin (Southwest Asian, Southeast Asian, and Mexican); and even among essentially identical, repeated observations of the same substance, at the same location, on the same day. Hence, one is not trying to find a single number, but rather trying to represent a distribution of numbers.

In such situations it is common to report the mean (average) of the distribution, also known as the distribution's expected value. This number represents the central tendency of the distribution. The mean of an unknown distribution is, of course, not known, but it can be estimated, typically by taking an average of repeated observations. One must, however, always be careful that one takes the right average.

With respect to heroin prices, two averages are relevant. The first is the average price of a retail transaction; it is, in some sense, the price one would expect to observe if one selected a retail transaction at random. The second is the average price at which a unit of heroin is sold; it is, in a parallel sense, the price one would expect to observe if one selected a unit of heroin at random from among all the heroin consumed, and asked at what price it was sold. The first method gives each transaction equal "weight" in determining the average, no matter how much heroin is involved; the second treats every milligram of pure heroin as equally important, no matter how many milligrams are in the transaction. Below these two averages are referred to as the transaction-weighted average and the quantity-weighted average, respectively.
At present the DMP reports the first, or transaction-weighted, average. It lists the price per pure milligram of all the purchases in a city and reports as the overall price the average of those individual observations. One of the primary contentions of this report, however, is that the second, or quantity-weighted, average is of greater interest for policy and can be estimated more reliably.³

² As is discussed in the second half of this report, even the price per unit varies with transaction size, but that issue is better addressed separately.
³ The general issues involved, discussed in many probability texts, are random incidence (see, e.g., Drake, 1967) and ratio estimators (see, e.g., Cochran, 1977).

The difference between these two averages can be stated more formally. Readers uninterested in details are invited to skip to the next paragraph. Let

$P_i$ = the price paid in the ith transaction,
$Q_i$ = the pure quantity purchased in the ith transaction, and
$N$ = the number of transactions;

then

$$\text{Transaction-weighted Average} = \overline{\left(\frac{P}{Q}\right)} = \frac{1}{N}\sum_{i=1}^{N}\frac{P_i}{Q_i} \tag{1}$$

and

$$\text{Quantity-weighted Average} = \frac{\overline{P}}{\overline{Q}} = \frac{\frac{1}{N}\sum_{i=1}^{N} P_i}{\frac{1}{N}\sum_{i=1}^{N} Q_i} = \frac{\sum_{i=1}^{N} P_i}{\sum_{i=1}^{N} Q_i} \tag{2}$$

where the overbar denotes average. The quantity-weighted average weights each observation by its size and, as a result, reduces to the total money paid divided by the total quantity purchased.

Broadly speaking there are two reasons to favor the quantity-weighted average. First, for most purposes it is actually the quantity one wants to know. Suppose, for example, one had an estimate of the quantity consumed and wanted to estimate total spending, either to estimate dealers' revenues or to relate drug consumption to consumption-induced property crime. Then one would want to multiply the consumption estimate by the quantity-weighted average, not the transaction-weighted average.

The easiest way to understand this is to consider a highly stylized example. Suppose there were a city in which all retail heroin sales were $50 for a 100 milligram bag of heroin which was equally likely to be 5% or 25% pure. Then in half the transactions the price per pure milligram would be

$$\frac{\$50}{5\% \times 100\ \text{mg}} = \$10/\text{pure mg},$$

and in the other half of the transactions the price per pure milligram would be

$$\frac{\$50}{25\% \times 100\ \text{mg}} = \$2/\text{pure mg}.$$

Hence, the average price according to the transaction-weighted average is just ($10 + $2)/2 = $6/pure milligram. But suppose that annual retail sales of heroin in the city were known to be about $600 million. Does that mean that annual consumption is

$$\frac{\$600\ \text{million/year}}{\$6/\text{pure mg}} = 100\ \text{kg/year}?$$


No; it would be much higher. $600 million would buy twelve million, retail purchases of$50 each. Roughly six million would be 5% pure and hence contain 5 milligrams, and the other six million would be 25% pure and contain 25 milligrams of pure heroin. Thus, the total quantity consumed would be 6 million purchases/year * 5 mg + 6 million purchases/year * 25 mg = 180 kg/year. The average price according to the quantity-weighted formula is $50+$50 - - - - - - - - - = $ 3.33/pure mg. 5%* 100mg + 25%* 100mg and $600 million/year divided by $3.33 per milligram is 180 kilograms/year. So, the quantityweighted formula gives the correct answer. One can similarly show that it is changes in the quantity-weighted average which reveal how an addict's spending will change, on average, if the addict wishes to maintain a particular habit size. The transaction-weighted average is the correct one to use if you want a sense of the price likely to be paid for any individual purchase, but most policy decisions rest on information about aggregates, not individual transactions. So, for most purposes the quantity-weighted average is more useful. Furthermore, the quantity-weighted average is a more stable quantity statistically. The simplest way to illustrate this is to consider what happens if one or more of the purchase observations has zero purity. (This is not an unusual occurrence; 56 of the 616 DMP exhibits purchased in 1991 did not contain any heroin.) The transaction-weighted average "blows up" because it entails dividing by zero. The quantity-weighted does not-unless every one of the exhibits has zero purity. The problem is almost as severe (and is harder to detect) with very low purity observations. Suppose there were 10 exhibits of 100 milligrams, each purchased for $50, and that 9 of the 10 were 25% pure ($2/pure milligram) and one was 0.1 % pure ($500/pure milligram). 
Then the transaction-weighted average would report an average price of$51 .80 per pure milligram, even though in almost all cases the price per pure milligram was only $2/pure milligram. (The quantityweighted formula yields a price of $2.22 per pure milligram.) Currently DMP avoids these problems by excluding zero-purity-observations and "outliers." Outliers are not formally defined, but appear to be all observations with unreasonably high prices per pure milligram (e.g., above $20/pure milligram); such high prices per pure unit are observed only for very low purity observations. This "fix" is problematic. In the first place it discards potentially valuable data. It would, for example, produce the same price per pure milligram in two cities that each had thirty, 100 milligram purchases for $50, in one of which every purchase was 25% pure and in the other city one purchase was 25% pure and the rest were all 0.1% pure. CREATING CONSISTENT PRICE SERIES


More subtly, the choice of the cutoff price per pure unit, above which observations are discarded, can substantially affect the price per pure unit measured by the transaction-weighted average. This argument can be made analytically for simple distributions of price and purity and more generally through simulation.

Consider first a simple example. Suppose the purity of observations in a city were uniformly distributed between 0 and 50% and prices were independently uniformly distributed between $0.25 and $0.75 per milligram (per total number of milligrams, not per pure milligram). Then the total price paid divided by the total pure quantity sold would be $2/pure milligram. If the transaction-weighted average is applied, discarding observations whose individual price per pure milligram is greater than $M/pure milligram, then the estimated price will be an increasing function of the cutoff value M, as is shown in Figure 1.1.⁴

Figure 1.1: Expected Average Price per Pure Milligram Using Transaction-weighted Averaging as a Function of the Cutoff Value for Outliers

[Plot not reproduced. Curve: f(M). Horizontal axis: Price/Pure Mg for Individual Observations Above Which They Are Excluded, from 0 to 50.]
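The curve in Figure 1.1 can be tabulated directly from the relation given in footnote 4. The sketch below assumes that relation applies for cutoffs of at least $1.50/pure milligram (the point at which every price in the example can still pass the cutoff at 50% purity); that restriction is an assumption of this sketch and is not stated in the footnote. The function name is illustrative.

```python
import math


def expected_truncated_average(cutoff):
    """f(M) from footnote 4: expected transaction-weighted price with cutoff M."""
    if cutoff < 1.5:
        # Assumption of this sketch: the closed form is used only for M >= 1.5.
        raise ValueError("cutoff must be at least 1.5 for this closed form")
    constant = 0.5 - 0.125 * math.log(3) - math.log(3 / 2)
    return cutoff / (cutoff - 1) * (constant + math.log(cutoff))


cutoffs = [2, 5, 10, 20, 50]
values = [expected_truncated_average(m) for m in cutoffs]
for m, value in zip(cutoffs, values):
    print(f"M = {m:>2}: f(M) = {value:.2f}")
```

Under these assumptions the estimate keeps rising with the cutoff and passes the $2/pure milligram aggregate price somewhere between M = 5 and M = 10.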

Thus the expected price estimated by the transaction-weighted average is affected quite dramatically by the choice of cutoff value M. Furthermore, this truncation can mask movements in the underlying price. As the true price rises, more observations are discarded, so the observed increase is smaller than the true increase. Likewise, observed decreases are smaller than true decreases because fewer high price

⁴ The relation is f(M) = [M/(M-1)] · [0.5 - 0.125 · ln(3) - ln(3/2) + ln(M)].


observations are discarded. This is undesirable because one of the principal objectives of the DMP is detecting such price changes.

The actual value estimated by the transaction-weighted average is quite variable. The value estimated by the quantity-weighted average also varies with the particular sample drawn, but less dramatically. There is not a simple expression showing the relative variability of the two formulas, but we performed a simulation study to explore the variability of the two methods as a function of the sample size. The results are discussed further in Section 10, but for present purposes it suffices to say that for any number of data points, the quantity-weighted average gives a less variable or less noisy estimate than does the transaction-weighted average.

Unnecessary variability can exaggerate and even mask trends in the data. Figure 1.2 below compares plots of the average retail heroin price (from exhibits marked as DMP purchases in STRIDE⁵) in New York since the fourth quarter of 1987 using the transaction-weighted and quantity-weighted averages.

Figure 1.2: Comparison of Average Price per Pure Milligram in New York City Using the Transaction-weighted and Quantity-weighted Averages.

[Plot not reproduced. Series: Transaction-weighted Average, Quantity-weighted Average. Horizontal axis: quarters from Q1 88 to Q1 92. Vertical axis: $0.00 to $3.50.]

While both plots show a fall, a rise, then a gradual downward trend in prices, as they must since they are based on exactly the same data, the plot corresponding to the transaction-weighted

⁵ While STRIDE has a variable that is supposed to indicate whether an exhibit was purchased for the DMP, in reality some exhibits are marked as purchases for DMP that never appear in DMP reports, and some exhibits that are in the reports are not in STRIDE. In addition, numerical inconsistencies exist between the two datasets. As a result, examples drawn from STRIDE will not agree precisely with published DMP figures.


average jumps around quite wildly. Sometimes the transaction-weighted average exaggerates a trend, as it does between the third quarter of 1989 and the first quarter of 1990. More problematic are the times the transaction-weighted average shows an incorrect trend. For example, in the first quarter of 1989, the transaction-weighted average gives a slight price drop, while the quantity-weighted average shows a rise that is repeated (and picked up on by the transaction-weighted average) in the second quarter. Hence, if transaction-weighted averages are computed, quarter-to-quarter variations are less well-measured, both in magnitude and direction, than they would be if quantity-weighting were employed.

3. Finding the National Average Price

In every quarterly and annual report the DMP shows the average price in dollars/pure milligram (discarding outliers) for each participating site by source area and for all source areas combined. Also given is the DMP average price for all sites by source area and for all source areas. This average price is simply the average of all exhibits' prices per pure milligram computed by the transaction-weighted average method. While the DMP average is not specifically meant to be an estimate of national retail price, it should at least attempt to represent an estimate of the price across the cities sampled. That is, even though the DMP data should not be construed as giving a nationally representative price, it would be desirable if the reports were based on a consistent set of locations so that inferences about price trends could be drawn. Currently the DMP average fails to do this.

In addition to the problems with using the transaction-weighted average, since the DMP average is just the average of all prices per pure milligram, it gives equal weight to each exhibit. One problem with this approach is that it fails to take into account variations over time in the number of non-outlier exhibits that are purchased. For example, if agents in New York (a lower-than-average price city) made half as many purchases in quarter two as they did in quarter one, but prices were unchanged, the DMP average price would rise, in spite of the fact that each city's average price (including New York's) stayed the same. To further illustrate this point, consider Table 1.1. The price data and number of purchases in the left column are from the first quarter of 1990. Now suppose that agents in Atlanta and Baltimore made three more purchases in each city and agents in Los Angeles and Phoenix made three fewer purchases in each city. Even if the average price/pure mg. in every city remained the same, the DMP average price would increase from $4.09/pure mg. to $5.58/pure mg. Thus, variations in sampling can dramatically affect the estimated average price.


Table 1.1: Illustration of How Weighting by Observation Can Cause Problems

                              ACTUAL                     WITH SAMPLING CHANGES
CITY               Price/Pure Mg.  # of Purchases    Price/Pure Mg.  # of Purchases
Atlanta                $19.64             3              $19.64             6
Baltimore              $12.09             8              $12.09            11
Chicago                 $5.92             4               $5.92             4
Los Angeles             $0.70            10               $0.70             7
Miami                   $4.76             2               $4.76             2
New York                $1.94            19               $1.94            19
Phoenix                 $0.84            10               $0.84             7
Washington              $1.69             5               $1.69             5
Average Weighted
by Observation          $4.09                             $5.58
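The two averages in Table 1.1 can be reproduced from the city prices and purchase counts. The following is a minimal sketch; the dictionary and function names are chosen here for illustration.

```python
# City average price per pure mg and purchase counts, from Table 1.1.
price = {
    "Atlanta": 19.64, "Baltimore": 12.09, "Chicago": 5.92, "Los Angeles": 0.70,
    "Miami": 4.76, "New York": 1.94, "Phoenix": 0.84, "Washington": 1.69,
}
n_actual = {
    "Atlanta": 3, "Baltimore": 8, "Chicago": 4, "Los Angeles": 10,
    "Miami": 2, "New York": 19, "Phoenix": 10, "Washington": 5,
}
n_changed = {
    "Atlanta": 6, "Baltimore": 11, "Chicago": 4, "Los Angeles": 7,
    "Miami": 2, "New York": 19, "Phoenix": 7, "Washington": 5,
}


def observation_weighted(price, counts):
    """Average in which every exhibit counts equally, so cities count by sample size."""
    total = sum(price[city] * counts[city] for city in price)
    return total / sum(counts.values())


actual = observation_weighted(price, n_actual)
changed = observation_weighted(price, n_changed)
print(f"actual sampling:       ${actual:.2f}/pure mg")
print(f"with sampling changes: ${changed:.2f}/pure mg")
```

Both runs use identical city prices and the same total of 61 purchases; only the allocation of purchases across cities differs.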

A second, distinct problem with weighting all exhibits equally is that it fails to account for differences in the size of the heroin market in different cities. For the DMP average to estimate the average pure price for the cities covered (and act as a proxy for a national estimate) it should take into account the relative size of the heroin markets in the different cities. For example, if City A and City B both had the same number of exhibits for a period, but the heroin market in City A was known to be twice as large as the heroin market in City B, then City A's exhibits should count twice as much when calculating the average of A and B.

Unfortunately there is no obviously best method of weighting the data. Ideally we would weight by the number of transactions similar to DMP purchases made by heroin users, or the amount consumed in each city, for the period the DMP purchases were made; these are impossible things to know for an illicit market. Some alternatives are to weight by:

• City (i.e., by the reciprocal of the number of observations in each city)
• City population (i.e., by the city's population divided by the number of observations in that city)
• Indicators of consumption (e.g., by the number of DAWN emergency room mentions divided by the number of observations in the city)

Weighting by city is unsuitable since it fails to take into account the widely differing sizes of heroin markets in the different cities. Weighting by population is somewhat better because city populations are known and market size is at least loosely correlated with city size. Weighting by an indicator of consumption such as heroin emergency room mentions in DAWN data may be the best alternative, although using DAWN data to draw inter-city inferences gives a rough first-order approximation at best. Table 1.2 below is based on the same prices as were used in Table 1.1, but the average price is weighted by DAWN emergency room mentions. It should also be noted that the average price is lower in Table 1.2 than in Table 1.1, because weighting by DAWN mentions (properly) gives more weight to the low prices in larger cities such as New York and Los Angeles.


Table 1.2: Data From Table 1.1 with Transaction Average Weighted by DAWN Emergency Room Mentions

CITY                  # DAWN ER MENTIONS    PRICE/PURE MG. (TRANSACTION AVERAGE)
Atlanta                        14                       $19.64
Baltimore                     298                       $12.09
Chicago                       469                        $5.92
Los Angeles                   773                        $0.70
Miami                          17                        $4.76
New York                    1,005                        $1.94
Phoenix                        96                        $0.84
Washington                    287                        $1.69
Average Weighted
by DAWN Mentions                                         $3.31
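The DAWN-weighted figure in Table 1.2 is an ordinary weighted mean of the city averages. A short sketch follows, with the equal-weight ("by city") alternative shown for comparison; the helper name `weighted_average` is illustrative.

```python
# City average price per pure mg and DAWN emergency room mentions, from Table 1.2.
price = {
    "Atlanta": 19.64, "Baltimore": 12.09, "Chicago": 5.92, "Los Angeles": 0.70,
    "Miami": 4.76, "New York": 1.94, "Phoenix": 0.84, "Washington": 1.69,
}
dawn_mentions = {
    "Atlanta": 14, "Baltimore": 298, "Chicago": 469, "Los Angeles": 773,
    "Miami": 17, "New York": 1005, "Phoenix": 96, "Washington": 287,
}


def weighted_average(values, weights):
    """Weighted mean of city values; the weights need not sum to one."""
    total_weight = sum(weights[city] for city in values)
    return sum(values[city] * weights[city] for city in values) / total_weight


by_dawn = weighted_average(price, dawn_mentions)
by_city = weighted_average(price, {city: 1 for city in price})
print(f"weighted by DAWN mentions: ${by_dawn:.2f}/pure mg")
print(f"weighted equally by city:  ${by_city:.2f}/pure mg")
```

Weighting each exhibit by its city's mentions divided by that city's number of observations, as in the third alternative listed above, leads to the same city-level weighted mean when each city is represented by its average.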

In addition to weighting the average by some estimate of heroin consumption for each city, a city's average should be included in the national average only if the number of exhibits from it meets a certain threshold, and the quantity-weighted average should be used when computing city averages.

4. Finding National Price by Region of Origin

A still more challenging problem is finding the national average price by source of origin.⁶ Currently DMP calculates a national average pure price for each region of origin just as it calculates it for all regions of origin. That is, it simply averages all cases (except outliers) for a particular region of origin in a particular period. This is problematic because it may appear that heroin from a particular source is expensive simply because it is found most often in markets where all heroin is expensive and rarely in markets where all heroin is inexpensive. This problem is most dramatically exhibited by the data for the first quarter of 1992. In all three of the cities in which there were both Southwest Asian and Mexican heroin purchases, the Mexican heroin was less expensive, but the national average price of Mexican heroin ($1.58) is higher than that for Southwest Asian heroin ($1.47). Such extreme examples are rare, but less visible distortions always exist.
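The reversal described above can arise purely from where each source area is sampled. The numbers in the sketch below are HYPOTHETICAL, invented only to illustrate the mechanism; they are not the first quarter 1992 DMP data.

```python
# HYPOTHETICAL (price per pure mg, number of exhibits) by city and source area.
exhibits = {
    "City A": {"Mexican": (2.50, 10), "Southwest Asian": (3.00, 2)},
    "City B": {"Mexican": (0.80, 2), "Southwest Asian": (1.00, 10)},
}


def pooled_average(source):
    """Average over all exhibits from one source area, ignoring city."""
    total = sum(city[source][0] * city[source][1] for city in exhibits.values())
    count = sum(city[source][1] for city in exhibits.values())
    return total / count


cheaper_everywhere = all(
    city["Mexican"][0] < city["Southwest Asian"][0] for city in exhibits.values()
)
pooled_mexican = pooled_average("Mexican")
pooled_swa = pooled_average("Southwest Asian")
print("Mexican cheaper in every city:", cheaper_everywhere)
print(f"pooled Mexican:         ${pooled_mexican:.2f}/pure mg")
print(f"pooled Southwest Asian: ${pooled_swa:.2f}/pure mg")
```

In this made-up case the Mexican exhibits are cheaper in both cities, yet their pooled average is higher because most of them come from the expensive city.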

In estimating national price by region of origin, we must weight observations not only by indicators of consumption, but also by a measure of the market share that each source area has in each city. Since information about market share is just as difficult to obtain as data on quantities consumed, we

⁶ The advent of Colombian heroin has made this problem still more difficult, but distinguishing between Colombian and Southwest Asian heroin is a problem of chemical analysis, not statistical reporting, so it is not addressed here.


might choose to estimate a market share weight based on DMP source area data itself (perhaps augmented with data from the heroin signature program). Another strategy would be to estimate relative prices of various sources by estimating a regression model with the price of an observation as the dependent variable and dummy variables for city and source region on the right-hand side. If there are real and consistent differences in price between source regions and there are not significant interaction effects between source region and point of sale, then in principle the coefficients of the source region variables should give an indication of the price by source region. The limited number of data in any one quarter will put a strain on ordinary regression methods; a variety of approaches can be used to reduce variability due to "noise" while still measuring quarter-to-quarter changes. Until more data are collected on a regular basis it is best to recognize the limitations inherent in estimating prices by source region and to be cautious in drawing any inferences based on those prices.

5. Estimating Trends in National Price

At present, trends in national price are estimated simply by stringing together a series of estimated national prices for individual years. This is vulnerable to a serious sampling bias, however. To illustrate what is meant by sampling bias, consider the hypothetical example in Table 1.3. In each of the three cities the price is falling by $1/pure milligram each year. Nevertheless, the national average remains constant because the distribution of observations across cities changes. Note that this result has nothing to do with randomness; the result is the same whether the prices in the individual cities in each year represent the average price or the actual price of each observation.

Table 1.3: Hypothetical Example to Illustrate Sampling Bias

CITY                    1990             1991             1992
                    Price    n       Price    n       Price    n
#1                  $7.00    5       $6.00   10       $5.00   20
#2                  $5.00    5       $4.00   10       $3.00    5
#3                  $3.00   20       $2.00   10       $1.00    5
National Average    $4.00   30       $4.00   30       $4.00   30
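The arithmetic behind Table 1.3 is easy to verify. The sketch below also computes, for contrast, an average that gives each city the same weight in every year; equal weighting is used here only as the simplest example of a fixed weighting scheme and is an assumption of this sketch.

```python
# (price per pure mg, number of observations) by city and year, from Table 1.3.
table = {
    "#1": {1990: (7.0, 5), 1991: (6.0, 10), 1992: (5.0, 20)},
    "#2": {1990: (5.0, 5), 1991: (4.0, 10), 1992: (3.0, 5)},
    "#3": {1990: (3.0, 20), 1991: (2.0, 10), 1992: (1.0, 5)},
}
years = [1990, 1991, 1992]


def observation_weighted(year):
    """National average in which each observation counts equally."""
    total = sum(price * n for price, n in (city[year] for city in table.values()))
    return total / sum(city[year][1] for city in table.values())


def fixed_weight(year):
    """National average with the same (equal) weight on each city every year."""
    return sum(city[year][0] for city in table.values()) / len(table)


by_observation = [observation_weighted(y) for y in years]
by_fixed_weight = [fixed_weight(y) for y in years]
print("weighted by observation:", by_observation)
print("fixed city weights:     ", by_fixed_weight)
```

With fixed weights the $1 per year decline in every city shows up in the national figure; with observation weighting it is hidden by the shift in sampling.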

There are two ways to deal with this issue. The first is simply to report price series based on a consistent set of cities and apply weighting as mentioned above. In addition, computing the quantity-weighted average rather than the transaction-weighted average will make recognizing national trends easier (for the same reasons given above on page 7 for trends in price in one city). Table 1.4 uses actual data to show effects similar in spirit to those in Table 1.3 and to show the benefits of having a consistent set of cities. In 1990, DMP expanded to a number of cities that are dominated by relatively high-priced Southwest Asian heroin. Hence the DMP average price more than doubled, whereas the average based on a consistent set of cities increased by a more modest and more plausible 25%.

Table 1.4: Southwest Asian Average Pure Price, Consistent Cities and DMP

$/PURE MG.                  FY 1988   FY 1989   FY 1990   CY 1991   1992 Q1-Q2
Avg of Consistent Cities      2.46      1.55      1.93      1.31       1.05
DMP Avg                       2.46      1.53      3.18      1.60       1.12

From 1988 on, Atlanta, Chicago, Detroit, Los Angeles, New York, and Phoenix have all made DMP purchases in most quarters of the year. In examining national trends in DMP data, only these cities should be combined (when going back to 1988 and 1989⁷).⁸ From now on this set of cities is referred to as the "consistent set."

The second, more complicated way to get a better national price trend is by using regression to estimate a model with the price of an observation as the dependent variable and dummy variables for city and time on the right-hand side. Such models may be somewhat sensitive to outliers, but there are three ways to deal with this potential problem: (1) conventional outlier diagnostics; (2) using the quantity-weighted average price in a city as the dependent variable; and (3) making the dependent variable the log of the price rather than the price.

Finally, it should be noted that when intertemporal price comparisons are made across more than a few years, it is important to adjust for inflation by reporting results in constant dollars.
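To illustrate the regression approach, the sketch below fits a model with city and year dummy variables to the hypothetical data of Table 1.3, expanded to one row per observation. It is a bare-bones ordinary least squares fit written in plain Python for self-containment; the price level rather than its log is used as the dependent variable so that the result can be checked by hand, and all function and variable names are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# (city, year, price per pure mg, number of observations) from Table 1.3.
cells = [
    (1, 1990, 7.0, 5), (1, 1991, 6.0, 10), (1, 1992, 5.0, 20),
    (2, 1990, 5.0, 5), (2, 1991, 4.0, 10), (2, 1992, 3.0, 5),
    (3, 1990, 3.0, 20), (3, 1991, 2.0, 10), (3, 1992, 1.0, 5),
]

# Columns: intercept (city #1 in 1990), city #2, city #3, year 1991, year 1992.
rows, y = [], []
for city, year, price, n in cells:
    x = [1.0, float(city == 2), float(city == 3),
         float(year == 1991), float(year == 1992)]
    rows += [x] * n
    y += [price] * n

k = len(rows[0])
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
coef = solve(xtx, xty)  # normal equations: (X'X) b = X'y

names = ["baseline", "city #2", "city #3", "year 1991", "year 1992"]
for name, value in zip(names, coef):
    print(f"{name:>10}: {value:+.2f}")
```

The year coefficients recover the $1 and $2 declines relative to 1990 even though the pooled national average in Table 1.3 is flat. Real data would not fit exactly, which is where the outlier precautions listed above come in.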

6. Estimating Average Purity and Trends in Average Purity

Efforts to estimate the national average purity and trends in that purity are affected by the same issues described above for pure price. For example, expanding DMP to include low-purity cities such as Dallas, Miami, and St. Louis would have damped the estimated increase in national purity during 1991, if it had not been for the concurrent inclusion of such high-purity cities (at least for 1991) as Boston, Newark, and Philadelphia. Hence error was avoided as much by happenstance as by design. The hypothetical example in Table 1.5 shows how adding low-purity cities can reverse an overall rise in purity. In the original three cities and the two added cities purity is rising each year. However, measured overall purity falls when the new cities are added in.

⁷ Miami, Puerto Rico, San Francisco, Seattle, and Washington, DC were DMP sites in 1990 and 1991.

⁸ One can also synthesize a price series by computing each year's price change from a consistent set of cities, but allowing the set used for different pairs of years to vary. For example, the change from 1989 to 1990 might be based on the change in prices in Atlanta, Chicago, Detroit, Los Angeles, New York, and Phoenix, while the change between 1990 and 1991 could be based on the change in prices in those cities plus Miami, Puerto Rico, San Francisco, Seattle, and Washington, DC.


Table 1.5: Hypothetical Example Illustrating How Adding New Cities To Overall Average Can Adversely Affect the Meaning of Trends in the Data

CITY                          1990              1991              1992
                          % Purity    n     % Purity    n     % Purity    n
#1                           35      15        42      15        44      15
#2                           27      15        35      15        40      15
#3                           24      20        29      20        32      20
New City                                        3      15         5      30
New City                                        3      15         5      30
Average Based on
Consistent Set of Cities     28      50        35      50        38      50
Overall Average              28      50        23      80        20     110
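The two summary rows of Table 1.5 follow directly from the city-level entries, as the sketch below confirms. The labels "New A" and "New B" are placeholders for the two unnamed new cities.

```python
# (% purity, number of observations) by city and year, from Table 1.5.
consistent = {
    "#1": {1990: (35, 15), 1991: (42, 15), 1992: (44, 15)},
    "#2": {1990: (27, 15), 1991: (35, 15), 1992: (40, 15)},
    "#3": {1990: (24, 20), 1991: (29, 20), 1992: (32, 20)},
}
new_cities = {
    "New A": {1991: (3, 15), 1992: (5, 30)},
    "New B": {1991: (3, 15), 1992: (5, 30)},
}
years = [1990, 1991, 1992]


def average_purity(cities, year):
    """Observation-weighted purity over the cities that have data for the year."""
    cells = [city[year] for city in cities.values() if year in city]
    return sum(p * n for p, n in cells) / sum(n for _, n in cells)


everything = {**consistent, **new_cities}
consistent_avg = [round(average_purity(consistent, y)) for y in years]
overall_avg = [round(average_purity(everything, y)) for y in years]
print("consistent set:", consistent_avg)
print("overall:       ", overall_avg)
```

Purity rises in every city, yet the overall average falls once the low-purity cities enter the sample.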

Methods similar to those used with national price trends can be applied to purity trends. Simply using a consistent set of cities is part of the solution. Weighting the data as described in the price section above can also be performed.

7. Adjusting for Transaction Size

It is well known that the price per pure unit decreases and the purity increases, on average, as one moves to larger transaction sizes. Since not all DMP samples are of the same size (no matter whether measured by weight, pure weight, or purchase price) there is a possibility of distortion.

For example, most of the New York City DMP purchases are for $100. In Phoenix the average price is less than $100. In cities such as Houston, Miami, and New Orleans, on the other hand, average purchase prices are closer to $200. If price per pure unit, all other things equal, decreases with increasing purchase price, then these sampling patterns would tend to overestimate the retail price in Phoenix and New York relative to Houston, Miami, and New Orleans. Generally speaking the cities with the larger purchase prices are the ones with the higher prices per pure unit, so the differences in market level may be less extreme than is suggested by looking only at prices. Nevertheless, the possibility of bias remains. A simple correction for this is to compute not the price per pure milligram but the standardized price per pure milligram using the log-linear adjustment for transaction size discussed in the second half of this report. As is discussed there, this raises the issue of how exactly to deal with variations in purity: whether to identify transaction size with the raw quantity or the pure quantity involved in the transaction. For simplicity, and in keeping with the tradition of the DMP which has always emphasized pure quantity, we will consider the pure quantity model. Then the formula relating price and transaction size is simply

\[ P = \alpha Q^{\beta} \tag{3} \]

where P is the price, Q is the pure quantity, β is a parameter whose numeric value is roughly 0.8, and α is the standardized price for one unit. Now instead of averaging prices (P's) one averages standardized prices (P/Q^β's). In particular, the quantity-weighted average becomes:

\[ \text{Discounted Quantity-weighted Average} = \frac{\sum_{i=1}^{N} Q_i \, \dfrac{P_i}{Q_i^{\beta}}}{\sum_{i=1}^{N} Q_i} = \frac{\sum_{i=1}^{N} P_i Q_i^{1-\beta}}{\sum_{i=1}^{N} Q_i} = \frac{\overline{P Q^{1-\beta}}}{\overline{Q}} \tag{4} \]

When there is no quantity discount (i.e., when β = 1) this equation reduces to the one given above on page 4.

Adjusting for quantity is only essential when the observations span a wide range of transaction sizes. Since most of the DMP observations are for retail purchases of roughly the same size, making this adjustment is of secondary importance.
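A small numerical sketch of the quantity-discount adjusted, quantity-weighted average may help. The three purchases are HYPOTHETICAL and are constructed to lie exactly on a log-linear price curve with a standardized unit price of $2 and a discount parameter of 0.8, so the adjusted average should return $2; the names used are illustrative.

```python
BETA = 0.8  # rough value of the quantity discount parameter quoted in the text


def discounted_quantity_weighted(prices, pure_qtys, beta=BETA):
    """Quantity-weighted average of standardized prices P / Q**beta."""
    numerator = sum(p * q ** (1 - beta) for p, q in zip(prices, pure_qtys))
    return numerator / sum(pure_qtys)


# HYPOTHETICAL purchases of 20, 50 and 200 pure mg priced at P = 2 * Q**0.8.
alpha = 2.0
pure_qtys = [20.0, 50.0, 200.0]
prices = [alpha * q ** BETA for q in pure_qtys]

unadjusted = sum(prices) / sum(pure_qtys)
adjusted = discounted_quantity_weighted(prices, pure_qtys)
no_discount = discounted_quantity_weighted(prices, pure_qtys, beta=1.0)
print(f"unadjusted quantity-weighted average: ${unadjusted:.2f}/pure mg")
print(f"standardized (one-unit) price:        ${adjusted:.2f}")
print(f"with beta = 1:                        ${no_discount:.2f}/pure mg")
```

Setting the discount parameter to 1 reproduces the unadjusted quantity-weighted average, in line with the remark above.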

8. Using Pure Quantity Purchased As an Alternative Measure of Price

For most goods in most markets prices are commonly discussed in terms of dollars or dollars per unit obtained. Occasionally, however, it makes sense to think of the quantity obtained for a fixed dollar amount. The classic example, in this regard, is gum balls. For many years gum balls were sold for a penny and an infrastructure (namely, gum ball machines) was built up around penny gum balls. Eventually, inflationary pressures forced a price increase of some kind. Because the gum ball machines were designed to accept a single penny, and it was believed that customers would balk at paying more than a penny, the nominal purchase price was not increased. Instead, the gum balls were made hollow. Effectively the price increased because children got less candy for their money, but the nominal price did not change. In this circumstance, it might make sense to monitor the price increase in terms of the quantity of gum obtained for a penny, rather than in terms of pennies per gum ball.

Drugs are, in this respect, sold like gum balls. In many drug markets, when street dealers need to adjust prices, they adjust the quantity and/or purity of the contents of a bag rather than the dollar price. New York City heroin markets are particularly well known in this regard, with "dime bags" always selling for ten dollars, but the phenomenon is by no means confined to New York or heroin. Many cities have a standard price for a rock of cocaine. For example, the Western States Information Network reports the existence of $20 rocks and $40 rocks for most of the west coast.


This suggests that, as an alternative to the methods described above, sometimes it might make sense to monitor drug prices by tracking the average pure quantity obtained for a fixed dollar amount. The strongest case can be made when most of the data available are for a single purchase price. All of the 1991 New York City DMP data, for example, are for purchases of $100. An advantage of reporting prices this way is that the denominator is stable; for $100 purchases, it is always $100. Hence, one never encounters the problem of average prices being artificially inflated by one or a few observations that have a very small denominator. One potential drawback to this approach is that, with conspicuous exceptions such as the New York City DMP data, observations are not usually clustered on a single purchase price. Round number prices are certainly more common than odd prices, but it is often the case that there are several popular dollar amounts even at the retail level, and the dollar amounts become more spread out as one moves to larger quantities. There is a remedy for this, however. One can standardize quantities obtained for different dollar amounts in a manner entirely analogous to the way one standardizes prices paid for different quantities. The appropriate formula for converting an observation of Pi dollars paid for the pure amount Qi into a standardized pure amount Ki for a Po dollar reference purchase is:

\[ K_i = Q_i \left( \frac{P_0}{P_i} \right)^{1/\beta} \tag{5} \]

where β is the quantity discount parameter referred to in the previous section. One can then average and track trends in these K_i's just as one can average and track trends in the price paid per unit obtained. When there is no quantity discount (i.e., β = 1) and the prices are all the same or the average of the K_i's is weighted by the price P_i, then the average obtained is exactly the reciprocal of that obtained by taking the quantity-weighted average discussed above. In other circumstances, however, the relationship between these price measures is not so simple. At any rate, it is equally valid to think of prices in terms of dollars paid per unit obtained or units obtained per dollar spent. The first is, in general, more natural, but when one is tracking prices in markets with very standard purchase dollar amounts, the second may be more sensible to the people in the field and relate more directly to how dealers actually adjust their effective price.
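The standardization of quantities can be sketched as follows. The conversion used here, K_i = Q_i (P_0 / P_i)^(1/β), is what the log-linear price model of the previous section implies; treat it as an assumption of this sketch. The purchases are HYPOTHETICAL, and the reciprocal relation is checked per reference purchase of P_0 dollars.

```python
def standardized_quantity(price, pure_qty, reference_price, beta):
    """Pure quantity a reference_price purchase would buy at this purchase's terms."""
    return pure_qty * (reference_price / price) ** (1 / beta)


# HYPOTHETICAL purchases: dollars paid and pure milligrams obtained.
prices = [50.0, 100.0, 100.0]
pure_qtys = [20.0, 45.0, 60.0]
P0 = 100.0  # reference purchase size in dollars

# With no quantity discount (beta = 1), weight each K_i by the price paid.
k_values = [standardized_quantity(p, q, P0, beta=1.0)
            for p, q in zip(prices, pure_qtys)]
price_weighted_k = sum(p * k for p, k in zip(prices, k_values)) / sum(prices)
quantity_weighted_price = sum(prices) / sum(pure_qtys)

print(f"pure mg per ${P0:.0f} (price-weighted): {price_weighted_k:.1f}")
print(f"quantity-weighted price: ${quantity_weighted_price:.2f}/pure mg")
print(f"${P0:.0f} / price: {P0 / quantity_weighted_price:.1f} pure mg")
```

In this example $100 buys 50 pure milligrams on average, which is the reciprocal of the $2/pure milligram quantity-weighted price scaled to a $100 purchase.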

9. Suggestions for Improving Sampling

As was discussed in the introduction, this report primarily addresses ways of reporting those data which are collected, but this section offers a few comments on the data collection itself.

One issue concerns how much should be spent on an individual purchase. There are several arguments favoring smaller purchases. First, DMP is supposed to be monitoring retail heroin market conditions; heroin users typically spend much less than $100 on an individual purchase, so many DMP observations are from purchase sizes that are somewhat above retail. A second argument for smaller purchases is simply that they would save money (and, hence, give dealers

less money). The principal reason for not making smaller purchases is that smaller quantities are more difficult to analyze in the laboratory. However, since the price per pure gram has fallen in recent years, somewhat smaller purchase sizes (in dollar value) may now be adequate. According to the DMP reports, the minimum required pure sample to do signature analysis is 45 milligrams. Since prices per pure milligram are known in the various cities, it would be possible to roughly determine what the best purchase size should be in a given city. For example, since the average price per pure milligram in San Diego in 1991 was $1.08, a $50 purchase would yield about 46 pure milligrams on average, so one would expect that about half of all $50 purchases could be analyzed for source area. In 1991, purchases in San Diego were between $70 and $200, so it may be that the purchase size could be reduced slightly without significantly reducing the chance that the source area can be identified. A great many factors affect the determination of source, however, so this brief analysis is intended to be suggestive, not definitive. What can be stated with greater confidence is that there are advantages to maintaining a stable purchase size within a given city. This facilitates comparison across samples within the city and helps prevent distortion of trends over time that might occur if the purchase size were to vary significantly.

Another issue is how many observations should be taken in each city. Of course, as is discussed in the next section, more observations are always better, but suppose a fixed pool of money is available to make purchases. How should these fixed resources be allocated across cities? Again, various arguments can be given. Spreading the samples over many cities would enhance the ability to track the spread of new market phenomena as they move from city to city, but if the samples are spread too thin they will not give reliable information for any city.
The optimal resolution of this and related trade-offs depends greatly on how the data will ultimately be used, but one recommendation does flow directly from considerations of how the data should be reported. That is, there are considerable advantages to being consistent over time both in terms of the number of observations taken in a city and the set of cities which are included in the program. Trends often contain more information than absolute values, but trend data can only be interpreted if they are formed from consistent samples.

Similar questions arise with respect to when samples should be taken within a city. It would clearly be a mistake, for example, to make ten purchases on the same corner on the same day; much of the information obtained would be redundant. On the other hand, it is less obvious what is the best way to distribute the purchases. In principle one would like to scatter the purchases made in a given quarter around the city to obtain broad coverage, and to return to those same locations in successive periods to make the successive observations comparable. Street markets are constantly evolving, however, so this may be an imperfect control for local variations. Hence, the purchase locations may have to be revised periodically.

A final comment is that it is important to make sure that the agents collecting the data know how valuable their efforts are. If collecting data is viewed as bureaucratic "make-work," even the most dedicated professionals can make mistakes. If the data collection is recognized as providing valuable information which is used to make policy and good performance is rewarded, there is a much better chance that collection protocols will be both carefully designed and faithfully and consistently executed.


10. Expanding the Number of Observations

Monitoring programs, particularly those, such as DMP, whose samples are subject to considerable random variation, can always be made more reliable by increasing the number of data points collected (unless collection quality declines with quantity to negate this effect). When collecting a data point entails risking the safety of a drug enforcement agent or confidential informant, however, increasing sample sizes is quite expensive.

The principal benefit of increasing sample size is that the variability of the resulting estimate of average price is reduced. It is difficult to obtain closed-form expressions for the variance of averages produced by the two methods described in Section 2, so we ran a simple simulation comparing the two. The results support what intuition suggests must be the case. Namely, for any given number of data points, averages obtained by transaction-weighted averaging are "noisier," or more variable, than those obtained by computing the quantity-weighted average.

Specifically, we generated 7,560 observations with price uniformly distributed between $25 and $75 and purity independently, uniformly distributed between 0 and 50%. These were intended to represent 100 milligram heroin purchases. Hence the "true" expected average price per pure milligram is $50/(25% * 100 milligrams) = $2/pure milligram. We then successively divided these observations into groups of two, three, ..., ten observations. The average price for each group was computed in several different ways, and then the variance of these averages was measured. For example, there were 2,520 groups of three observations. An average price was computed for each triplet, and the mean and standard deviation of these 2,520 averages were found.

Table 1.6 shows the results of computing the transaction- and quantity-weighted averages based on groups of between 2 and 10 observations. In both cases individual observations with prices exceeding $20/pure milligram were excluded. Figures 1.3 and 1.4 plot the means and means plus and minus one standard deviation for both methods of averaging, respectively. The figures show clearly that, for any size grouping, the quantity-weighted averages are less variable. Note also that the transaction-weighted averages are consistently biased upward by roughly 50%. In contrast, although the quantity-weighted averages also overestimate the average price for small groupings, the extent of the overestimate is always less than that for the transaction-weighted averages and decreases steadily as the size of the grouping increases. The quantity-weighted average dips slightly below two dollars for large group sizes because the unusually expensive observations are discarded, but the extent of this underestimate is never more than a few percent.


Table 1.6: Stability of Averages Produced By Transaction-weighted and Quantity-weighted Averaging

              TRANSACTION-WEIGHTED AVERAGE     QUANTITY-WEIGHTED AVERAGE
GROUP SIZE        Mean       Std. Dev.             Mean       Std. Dev.
 2                3.05         2.32                2.38         1.74
 3                3.04         1.86                2.15         1.08
 4                3.04         1.59                2.06         0.76
 5                3.04         1.41                2.02         0.64
 6                3.04         1.31                2.00         0.57
 7                3.04         1.19                1.98         0.51
 8                3.04         1.11                1.96         0.45
 9                3.04         1.05                1.96         0.44
10                3.04         1.01                1.95         0.40

Figure I.3: Estimated Price per Pure Milligram, Using the Transaction-weighted Average

[Figure: Mean, +1 Std Dev, and -1 Std Dev of the estimated price per pure milligram (vertical axis, 0.00 to 6.00) plotted against Size of Grouping (horizontal axis, 2 to 10).]


Figure I.4: Estimated Price per Pure Milligram, Using the Quantity-weighted Average

[Figure: Mean, +1 Std Dev, and -1 Std Dev of the estimated price per pure milligram (vertical axis, 0.00 to 6.00) plotted against Size of Grouping (horizontal axis, 2 to 10).]

These figures also give a sense of how many observations one must have to obtain a reasonably reliable average. Of course they apply directly only to the simulated data, but the simulated data are not entirely unlike actual heroin data obtained from a high-purity city (such as New York), and it is a simple exercise to rerun the simulation with price and purity distributions that represent any particular city's markets to get a sense of how many data points should be collected in a given quarter in that city.

One relatively inexpensive way to increase sample sizes would be to augment DMP with data from investigation-driven retail heroin purchases. Some such data are already available in STRIDE. While DMP purchases accounted for over 70% of the retail-level heroin purchases from DMP cities in the 1991 STRIDE data set, STRIDE contains data on major heroin use cities that have not been part of DMP until recently (e.g., Philadelphia) and data for past years when DMP was less active. Even more data exist in the records of state and local agencies.

Investigation-driven purchases are less useful than monitoring purchases for many reasons. Their location, timing, and circumstances are driven by the exigencies of the investigation, not the desire to maintain consistent sampling procedures. The lab results are often not available in time to make quarterly reports. Region-of-origin information is not generally available. Nevertheless, existing DMP reports might usefully be augmented by sections reporting price and purity information based on these additional observations.


11. Computing Retail Heroin Price and Purity Series

What follows is a rough retail heroin price and purity series, created by drawing on the methods described above. The price series is based on STRIDE data that are comparable to typical DMP data points⁹ because DMP data are not available for the entire period. STRIDE, in addition to being readily available, also has the advantage, mentioned above, of including more cases and more cities. To give the best comparison to DMP data we chose to discard as outliers samples with a pure weight of 10 milligrams or lower and a price of $20/pure milligram or higher. We also discarded exhibits with no price (as these exhibits usually represent seizures or free samples). In order to obtain a consistent set of cities, we used data only from those cities that had exhibits in each year from 1981 to 1992. As shown in Figure I.5 and Table I.7 below, the price and purity series show a clear fall in pure price levels as well as a substantial rise in purity.
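The screening rules in this section can be sketched as a pair of filters. The sketch below is illustrative only. The `Exhibit` record and its field names are hypothetical and do not correspond to actual STRIDE field names. It assumes that the DMP-comparable window is a price under $250 and a pure weight above 10 milligrams and no more than 0.50 gram, that either a low pure weight or a pure price of $20/milligram or more is enough to discard an exhibit, and that the set of consistent cities is determined after the other screens have been applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Exhibit:
    """Hypothetical record layout; not the actual STRIDE field names."""
    city: str
    year: int
    price: Optional[float]  # dollars paid; None when no price was recorded
    grams: float            # gross weight in grams
    purity: float           # fraction between 0 and 1


def pure_mg(e):
    """Pure weight of the exhibit in milligrams."""
    return e.grams * e.purity * 1000.0


def is_retail_comparable(e):
    """Apply the retail-level screens assumed in the lead-in."""
    if e.price is None or e.price <= 0:
        return False                     # seizures, free samples
    mg = pure_mg(e)
    if mg <= 10.0 or mg > 500.0:
        return False                     # outside the DMP-like weight range
    if e.price >= 250.0:
        return False                     # more expensive than typical DMP buys
    return e.price / mg < 20.0           # drop pure prices of $20/mg or more


def consistent_cities(exhibits, first_year=1981, last_year=1992):
    """Cities with at least one exhibit in every year of the window."""
    years = set(range(first_year, last_year + 1))
    seen = {}
    for e in exhibits:
        seen.setdefault(e.city, set()).add(e.year)
    return {city for city, ys in seen.items() if years <= ys}


def screen(exhibits, first_year=1981, last_year=1992):
    """Keep retail-comparable exhibits from consistently covered cities."""
    kept = [e for e in exhibits if is_retail_comparable(e)]
    cities = consistent_cities(kept, first_year, last_year)
    return [e for e in kept if e.city in cities]
```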

Better methods of adjusting for transaction size and variations in purity will be discussed, using cocaine data, in the second half of this report.

Figure I.5: Rough National Retail Heroin Price (1992 Dollars) and Purity Series

[Figure: Price (left scale, 0.00 to 3.00) and Purity (right scale, 0% to 45%) plotted by year, 1981 to 1992.]

⁹ In order to isolate cases in STRIDE most likely to be at the retail level, we examined the pure weights that occurred in DMP purchases and determined that 94% of all valid cases in DMP in 1991 cost less than $250 and fell between pure weights 0.01g and 0.50g. We also discarded all purchases with a pure price of $20/milligram or higher.


Table I.7: Rough Retail Heroin Price and Purity Series for Selected Cities

                     1981   1982   1983   1984   1985   1986   1987   1988   1989   1990   1991  1992*

Chicago
  % Purity            2.6    5.1    5.3    3.2    1.7    1.1    6.6    7.1   18.5    8.4   10.9   16.5
  $/pure mg.         4.99   2.26   1.23   2.82   3.90   3.23   2.31   1.56   1.12   1.34   1.41   1.14
  DAWN ER Mentions    292    280    399    513    605    832  1,272  2,016  2,124  2,039  1,101  1,101
  n                     3      3      3      8     17      2      4     27     35     31     30     15

Detroit
  % Purity           13.5   14.3   23.0    3.1    9.9   17.0   14.1   22.5   16.2   25.4    9.4   17.0
  $/pure mg.         2.13   1.77   1.51   2.84   1.05   1.05   0.90   1.24   1.01   1.21   1.27   1.25
  DAWN ER Mentions  1,141  1,795  3,274  2,568  3,061  2,793  2,537  2,812  1,963  1,552    893    893
  n                     2     17      1     16     10      5     13     33     13     25     20     16

New York
  % Purity           10.4   16.6   16.3   28.5   17.1   36.4   36.6   40.5   39.9   40.9   53.1   61.9
  $/pure mg.         1.38   0.96   1.19   1.04   1.30   0.71   0.69   0.64   0.77   0.79   0.55   0.44
  DAWN ER Mentions  1,669  2,103  4,670  3,553  3,300  3,456  4,590  5,397  5,438  3,810  2,733  2,733
  n                    80     42     66     24     22     11     46     56     60     93     73     39

                     1981   1982   1983   1984   1985   1986   1987   1988   1989   1990   1991  1992*

Philadelphia
  % Purity            3.3    4.4    2.9    8.4    3.9   11.0   42.1   29.4   51.9   43.6   51.2   54.3
  $/pure mg.         2.86   2.73   3.56   1.96   2.54   1.96   1.39   1.07   0.62   0.97   0.85   0.70
  DAWN ER Mentions    274    412    344    464    475    466    815  1,748  3,351  2,653  1,630  1,630
  n                    42     50     18     22     20     13     20     15     16     69     52     14

San Diego
  % Purity           14.3   15.9   15.6   17.5   31.7   33.0   24.1   28.4   46.2   44.5   28.5   46.0
  $/pure mg.         1.47   1.62   1.30   1.73   0.99   0.89   0.92   0.62   0.47   0.51   0.75   0.56
  DAWN ER Mentions     90     93    129    161    177    182    148    407    600    756    352    352
  n                    32     47     76     41     57     43    144     45     13      9     51     35

Washington
  % Purity           12.0    7.1   11.7    6.6   17.8    7.7   17.3   30.5   39.2   14.5   19.4   20.0
  $/pure mg.         1.64   1.56   1.59   0.59   2.34   1.47   1.48   2.81   1.80   1.50   1.03   0.94
  DAWN ER Mentions    683    710    824  1,511  1,447  1,196  1,650  2,557  1,761  1,334    678    678
  n                    15     27     11      3     13     15      2     32      5     27     34     12

National Avg**
  % Purity           10.1   12.6   16.6   16.2   12.3   18.9   25.5   28.0   35.8   30.3   37.0   44.5
  $/pure mg.         1.90   1.47   1.46   1.78   1.46   1.36   1.10   0.91   0.91   1.02   0.87   0.75
  $/pure mg.,
    1992 Dollars     2.84   2.07   2.00   2.33   1.85   1.69   1.31   1.04   1.00   1.06   0.89   0.75

*1992 is derived from STRIDE exhibits from Q1 to Q3 of 1992.
**National average is created by weighting city averages based on DAWN emergency room mentions of heroin for that year. 1991 and 1992 are weighted based on mentions from the 1st half of 1991.

12. Summary

While the Domestic Monitor Program has made substantial progress in terms of increasing the scope of the program and in the quality of its reports, many problems, some subtle and some not, remain in the reporting of the collected data. The two main areas of concern are:

• The way in which city pure-price averages are calculated and
• The way in which nation- or DMP-wide pure price and purity averages are calculated.


The first of these problems can be ameliorated by applying the alternative averaging procedure described in Section 2. The only drawback to making such a change would be incompatibility with previous reports, but those could be retrospectively recomputed to fit the new methods.

The second problem is more complex, but it can be addressed in no small measure by presenting a DMP-wide average based on a consistent set of cities and a weighting procedure that corrects for sampling bias. DMP should not drop any cities that have been part of the program since 1988 and should report an average based on only those cities. A second average could be computed on a larger consistent set starting in 1990. The DMP should also create a DMP-wide average that is weighted by a city-by-city indicator of heroin use (in our examples we used DAWN emergency room mentions). Until these changes can be made, policy makers and analysts should take care in interpreting the price data, and especially trends in these data.
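The weighting step recommended above amounts to a weighted mean. The sketch below shows the arithmetic using the 1992 price and DAWN entries for three of the Table I.7 cities (Chicago, Detroit, and New York). Restricting the example to three cities is a simplification for illustration, so the result is not the national figure reported in the table, and the function name is a hypothetical choice.

```python
def weighted_average(values, weights):
    """Weighted mean of `values`, using `weights` (e.g., DAWN ER mentions)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# 1992 entries from Table I.7: city -> ($/pure mg, DAWN ER mentions)
CITY_1992 = {
    "Chicago": (1.14, 1101),
    "Detroit": (1.25, 893),
    "New York": (0.44, 2733),
}

prices = [price for price, _ in CITY_1992.values()]
mentions = [count for _, count in CITY_1992.values()]

unweighted = sum(prices) / len(prices)            # treats every city equally
dawn_weighted = weighted_average(prices, mentions)  # pulls toward New York
```

Because New York has both the lowest price and by far the most emergency room mentions among the three, the weighted figure falls well below the simple average of the city prices.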


II. Creating Price Series from STRIDE

This chapter describes how one might create price series from the Drug Enforcement Administration's STRIDE (System to Retrieve Information from Drug Evidence) database. As Section 1 details, STRIDE differs from the Domestic Monitor Program (DMP) in a number of ways, including the fact that the data in STRIDE are not collected for intelligence purposes; they are the byproduct of investigations. One consequence of this difference is that STRIDE contains outliers which one must remove before estimating price series. Section 2 describes how to do this.

From a methodological perspective the main difference between STRIDE and DMP is that STRIDE data are collected over a wide range of market levels. STRIDE's cocaine purchases, for example, range from 0.001 grams to over 10 kilograms, more than seven orders of magnitude. There are substantial quantity discounts for illicit drugs; that is, one generally pays more per gram when buying a gram at a time than one would buying kilograms. Section 3 discusses how one can adjust for such quantity discounts and, thereby, compare observations from different market levels.

One should also adjust for differences in quality across observations. For most illicit drugs the most important characteristic of quality is the purity of the psychoactive substance. For example, a white powder which is 80% cocaine would be perceived on the street as being of higher quality than one which is only 20% cocaine and 80% diluents. Adjusting prices for quality is tricky, whether the product in question is amphetamines or automobiles, but Section 4 seeks to resolve some long-standing confusion on this issue.

Price series can be constructed in a variety of ways. The most common method is simply to report average prices for a narrow range of quantities, but that approach does not make efficient use of the data. A better approach, and one that is becoming increasingly common, is to use some form of regression. Price is regressed on a variety of regressors including time variables, and the coefficients estimated for the time variables are used to construct a price series. Section 5 reviews these methods and suggests an alternative, standardizing individual purchase observations, which offers advantages in some situations. Finally, Section 6 summarizes a procedure for producing price series and illustrates that procedure by applying it to cocaine data from STRIDE.

1. The STRIDE Database

Frank (1987) describes STRIDE in detail, so it will be reviewed only briefly here. STRIDE records information on purchases and seizures of illicit drugs that are examined by DEA laboratories. About 75% of the records are the product of DEA and FBI investigations. Most of the remainder are from District of Columbia Metropolitan Police investigations; hence coverage of state and local agencies is not at all comprehensive. Each observation includes information on the date, city, quantity, identity, and purity of the drug. Table II.1 gives a sense of the data that are available.


Table II.1: Number of Observations in STRIDE

               COCAINE                 HEROIN
YEAR    Purchases   Seizures    Purchases   Seizures
1981      1,092       3,202       1,783       2,439
1982      1,541       4,467       1,871       2,726
1983      2,108       5,238       1,616       2,573
1984      2,675       6,285       1,418       2,622
1985      3,865       7,287       1,702       2,281
1986      4,193       9,584       1,451       2,390
1987      4,534      10,035       1,311       1,793
1988      5,374      12,484       1,338       1,985
1989      5,530      11,026       1,166       2,024
1990      4,691       9,056       1,423       1,955
1991      6,061       9,789       1,474       2,033

The database has a number of outstanding strengths and some weaknesses. Its greatest advantages are that it contains transaction-level data from across the country, and records these data in a consistent manner from 1973 to the present. Its principal disadvantage is that it contains so little data from state and local sources. Given how variable drug prices are, adding state and local agencies' observations could make STRIDE substantially more useful for market analysis.

2. Outliers and Other Data Issues

About 1-2% of STRIDE's observations need to be removed before price series are created either because they are outliers or because the purity data reported are suspect. The need to cull so many outliers may be explained, at least in part, by the fact that STRIDE's principal application is as an information system for tracking drug evidence, not for collecting intelligence on market characteristics. Since the price and purity of a drug sample rarely have any substantial bearing on an exhibit's role in a trial, there is relatively little demand for precision with respect to those variables. Whatever the reason, however, the problem is relatively simple to remedy, as is discussed next.

Price Outliers

STRIDE contains what appear to be typographical errors in data entry leading to price outliers of as many as four orders of magnitude. For example, retail prices for cocaine are typically between $100 and $300 per gram. Nevertheless there are STRIDE observations suggesting that agents paid as little as $1,045 for almost 12 kilograms of cocaine (Louisville, 11/14/87) and one-tenth of one cent for 23 grams of 84% pure cocaine (San Diego, 12/12/87). Conversely there are records indicating payments of $6,800 for one milligram of cocaine (Albany, 2/2/87) and $500,000 for 2 grams of cocaine that was only 10% pure (Boston, 10/24/84).

These outliers can be expunged by using a simple filter. If one adjusts the observed price for transaction size in the following, rudimentary way:


$$\text{adjusted price per gram} = \frac{\text{price of observation}}{\exp\left(0.77 \cdot \log(\text{quantity in grams})\right)} \qquad (1)$$

then the coefficient of variation of these adjusted prices is on the order of 0.3. Hence, it is easy to distinguish between typos which distort standardized prices by an order of magnitude and simple random variation. One can, for example, eliminate all observations whose adjusted price differs from the average adjusted price, for that time period, by a factor of five or more. Methods based on medians are somewhat more robust. Hence, one could also eliminate as outliers all observations for which:

$$\frac{\left| x_0 - \operatorname{med}(x_i) \right|}{\operatorname{med}\left[\, \left| x_i - \operatorname{med}(x_i) \right| \,\right]} > 5 \qquad (2)$$

where x_0 is the adjusted price of the observation in question and the median operation is taken over all observations to which the observation is being compared, e.g., all observations in the same city and year.
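Both filters are short enough to sketch directly. The code below is an illustration under stated assumptions rather than the procedure used to prepare this report: the function names are hypothetical, "differs by a factor of five or more" is read as five times above or one fifth below the mean, and the fallback used when the median absolute deviation is zero is an added convention that the text does not specify.

```python
import statistics

BETA = 0.77  # exponent used in equation (1)


def adjusted_price(price, grams, beta=BETA):
    """Equation (1): price / exp(beta * log(grams)), i.e. price / grams**beta."""
    return price / grams ** beta


def factor_of_five_outliers(adjusted, factor=5.0):
    """Flag values at least `factor` times above or below the mean."""
    mean = statistics.fmean(adjusted)
    return [x >= mean * factor or x <= mean / factor for x in adjusted]


def median_outliers(adjusted, threshold=5.0):
    """Equation (2): distance from the median, measured in units of the
    median absolute deviation, exceeds `threshold`."""
    med = statistics.median(adjusted)
    mad = statistics.median([abs(x - med) for x in adjusted])
    if mad == 0:
        # Assumed convention: with no spread, anything off the median is flagged.
        return [x != med for x in adjusted]
    return [abs(x - med) / mad > threshold for x in adjusted]
```

In practice either function would be applied separately to each comparison group, for example to all adjusted prices from one city and year. The mean in the first filter is itself pulled toward any extreme value in the group, which is one reason the median-based version is the more robust of the two.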

Zero-Purity-Observations¹

A second issue affects purity calculations. STRIDE contains a significant number (about 15%) of observations which are recorded as having zero purity, and, as Caulkins (1993b) illustrates, the decision to include or exclude these observations can have a substantial impact on analyses. A very large fraction of observations of less than or equal to 0.1 grams are recorded as having zero purity, suggesting that these represent trace findings or residues which are too small to be properly assayed in the lab. Hence, in the sequel, observations of less than or equal to 0.1 grams are omitted.

The question of whether to include or exclude zero-purity-observations for larger quantities is more difficult. One might likewise dismiss them on the assumption that a quantitative assay was not performed (i.e., missing observations were recorded as 0), but there are a handful of non-zero purities that are less than one percent, suggesting that very low purity samples do, in fact, exist.

On the other hand, the pattern of zero-purity-observations suggests that at least some of them are not accurate reflections of the true purity. Specifically, zero-purity-observations for larger transactions appear to be most common when average purities are high. For example, for observations of a gram or more in New York City in 1987, fully 109 of the 597 seizure observations had a zero purity, but only three others had purities of less than 30% and almost 90% of the non-zero-purity-observations had purities above 75%. One plausible explanation for this is that as the number of cocaine exhibits purchased and seized rose, the laboratories began to assay only those observations that were strictly necessary.

¹ Material in this subsection is drawn from Caulkins, 1993b.


Hence, it seems quite likely that many of the observations for which the purity was recorded as zero actually had non-zero purities. On the other hand, very low but non-zero purities are recorded and "rip-offs" do occur, so some of the zero-purity-observations may in fact have contained no cocaine. On balance, though, it seems like there is less harm in excluding a few of the latter than in including the sometimes large numbers of the former.

3. Adjusting for Quantity Discounts

STRIDE data come from transactions of many different sizes, so they must be standardized before they can be compared. This point can be illustrated with a simple example. STRIDE contains cocaine purchase observations of $27,000 for one kilogram in Boston and $100 for one gram in Dallas, both in 1991. Since a kilogram is 1,000 grams, one might be tempted to infer that the price in Boston is $27/gram and, hence, is much less than the price in Dallas. That inference would be incorrect, however, because there are substantial quantity discounts in the markets for cocaine (and for other illicit drugs). A simple and surprisingly effective way to represent these quantity discounts is with a log-linear model:

$$P = \alpha Q^{\beta}, \qquad (3)$$

where
P = the price paid for a transaction of size Q,
α = proportionality constant = standardized price of one gram,
Q = quantity or size of the transaction in grams, and
β = parameter describing the extent of the quantity discount.

If β = 1 then there is no quantity discount because the price increases linearly in the size of the transaction. Empirically it has been observed that β for cocaine is typically between 0.7 and 0.8.² For β between 0.75 and 0.80, if one Q = 1,000 gram purchase costs P = $27,000, then α is between $107 and $152 (at β = 0.70 it would be about $214), which is more, not less, than the α = $100 (standardized) price of one gram in Dallas.

This log-linear adjustment was pioneered by Brown and Silverman (1974) and was applied in studies such as Silverman and Spruill (1977). Unfortunately it was neglected for a considerable time. More recently, Caulkins and Padman (1991, forthcoming) applied a similar model to price data from the Western States Intelligence Network (WSIN) for a variety of drugs for transactions ranging from retail to low-level wholesale sizes. Independently and simultaneously, DiNardo (1991, 1993) applied a similar approach to the STRIDE data in his studies of the responsiveness of consumption to enforcement and changes in price. Researchers at Abt have also used this sort of price adjustment in their work (e.g., Rhodes and Hyatt, 1992).
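The Boston and Dallas comparison can be checked by solving equation (3) for the standardized one-gram price. The following is a minimal sketch; the function name is a hypothetical choice, and the three values of the exponent are simply points within the empirical range quoted above.

```python
def standardized_gram_price(price, grams, beta):
    """Solve P = alpha * Q**beta for alpha, the implied price of one gram."""
    if grams <= 0:
        raise ValueError("quantity must be positive")
    return price / grams ** beta


# Boston: $27,000 for 1,000 grams, evaluated at three values of beta.
boston = {b: standardized_gram_price(27000.0, 1000.0, b)
          for b in (0.70, 0.75, 0.80)}

# Dallas: $100 for one gram. The result does not depend on beta when Q = 1.
dallas = standardized_gram_price(100.0, 1.0, 0.75)
```

For every exponent in this range the standardized Boston price exceeds the Dallas price, even though the naive per-gram figure for Boston is only $27.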

² The exact value depends somewhat on the market level and how one adjusts for purity, but these details are not important for making this particular point.


4. Adjusting for Differences in Purity

The previous section described why and how one should adjust purchase observations for differences in transaction size. One would similarly like to adjust observations for differences in purity. In Brown and Silverman's (1974) original work on retail heroin prices, they adjusted for differences in purity as well as transaction size with a log-linear model. Specifically, their model was:³

(4)

where
P_i is the price of the ith buy (in dollars),
Q_i is the quantity of the ith buy (in grams),
S_i is the purity of the ith buy (in percent),
T_i is the month of the ith purchase, and
v_i is the error term.
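A log-linear specification consistent with the variable definitions above could be written as follows. This is a sketch only: the coefficient names β₀ through β₃ and the linear treatment of the month variable are assumptions made here for illustration, not a transcription of Brown and Silverman's formula.

$$\ln P_i = \beta_0 + \beta_1 \ln Q_i + \beta_2 \ln S_i + \beta_3 T_i + v_i$$

In a model of this form, β₁ plays the role of the exponent on quantity and β₂ the exponent on purity that are discussed in the remainder of this section.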

A similar model was used by Silverman and Spruill (1977), Caulkins and Padman (1991, forthcoming), and DiNardo (1991, 1993). One of the most striking things about all of these studies is that the exponent estimated for purity was much less than that estimated for quantity. This would imply that quality premia are not "complete" in the sense that price is not determined by pure quantity alone. Rather, they suggest that it costs significantly more to buy two grams of 30% pure cocaine than to buy one gram of 60% pure cocaine. It is easy to generate explanations for why the purity adjustment might be slightly less than "complete," but the magnitude of the gap is hard to believe. In fact, the estimated coefficients suggest that two grams of 30% cocaine sell for more than one and a half times as much as one gram of 60% pure cocaine.

Since cocaine is vastly more valuable than the common adulterants or diluents,⁴ it is not obvious why any dealers would be willing to sell the 60% pure cocaine. They could make more money by mixing in some (inexpensive) diluents. Or, to put it another way, it would appear that an opportunity for arbitrage exists and has existed for many years. It is hard to reconcile such an opportunity with the notion that, especially at the retail level, drug markets are highly competitive.

³ The β₀ notation has been changed slightly for expository purposes.
⁴ Diluents are pharmacologically inactive substances such as mannitol, sucrose, and starch; adulterants, such as quinine, procaine, and caffeine, while diluting the principal drug, are themselves pharmacologically active.


The results are equally perverse when viewed from the consumer's perspective. The exponent on purity estimated with STRIDE data is typically in the neighborhood of 0.05-0.10.⁵ The coefficient of 0.10 implies that a 95% pure sample would cost only 35% more than a 5% purity sample of similar size, even though the first sample contained nineteen times as much cocaine. A coefficient of 0.05 makes even less sense; it implies one would pay only 15% more for nineteen times as much cocaine.

A third way one can see that the simple, log-linear adjustment for purity is unsatisfactory is to note that adding this term as an explanatory variable does not appreciably improve the fit of regressions on price.⁶ To put it another way, once one has considered transaction size, knowing the purity of an observation gives very little additional information about the price paid.⁷
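The implications of such small purity exponents can be verified with a few lines of arithmetic. The sketch below assumes a model of the form P = a · Q^βq · S^βs. The quantity exponent of 0.8 is an assumed value in the range reported in Section 3, and the function and variable names are illustrative.

```python
def price_ratio(qty_ratio, purity_ratio, beta_q, beta_s):
    """Ratio of predicted prices under P = a * Q**beta_q * S**beta_s."""
    return qty_ratio ** beta_q * purity_ratio ** beta_s


# Two grams at 30% versus one gram at 60%: identical pure quantity.
same_pure = price_ratio(2.0, 0.5, beta_q=0.8, beta_s=0.10)

# A 95% pure sample versus a 5% pure sample of the same gross size,
# i.e. nineteen times as much cocaine.
premium_010 = price_ratio(1.0, 19.0, beta_q=0.8, beta_s=0.10) - 1.0
premium_005 = price_ratio(1.0, 19.0, beta_q=0.8, beta_s=0.05) - 1.0
```

With these inputs the two-gram purchase is predicted to cost roughly 1.6 times as much as the one-gram purchase, and the premiums for nineteen times as much cocaine come to about 34% and 16%, in line with the figures quoted in the text.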

Finally, and perhaps most importantly, price series estimated in this manner with STRIDE do not reflect the collapse in prices in the mid-1980s which is widely acknowledged to have occurred. The series show that prices declined, but not dramatically so. Hence, one might seek other models.

Some studies, such as Rhodes and Hyatt (1992) and ONDCP (1992), have simply ignored the possibility that quality premia might not be complete. They effectively force the quality premia to be complete by regressing on pure quantity rather than quantity and purity separately. This approach does give plausible price series, but unfortunately there are other noteworthy problems. One of the more significant is that the model simply does not fit the data as well. The fit is noticeably worse than even that obtained by ignoring purity altogether.

More seriously, however, the estimated coefficient for log pure quantity is implausibly small. Caulkins (1993b) evaluated four ways of adjusting for purity when creating price series for cocaine from STRIDE: (1) ignoring purity altogether, (2) the log-linear adjustment discussed above, (3) regressing on log pure quantity, and (4) regressing on log expected pure quantity, which is discussed further below. The coefficient estimated at the retail level when regressing on log pure quantity was 0.65; at the wholesale level it was 0.64. Suppose the coefficient's average value between the one gram and one kilogram level was 0.645. That would imply that if a pure gram cost $100 (which is a typical value for 1991), then a pure kilogram would cost 100 × 1000^0.645 ≈ $8,600, far below any pure kilogram prices observed within the U.S.

⁵ The exponents estimated with WSIN data (Caulkins and Padman) are larger and, hence, less perverse, probably because WSIN data are not transaction level.
⁶ One obtains very similar results if purity itself instead of log purity is used as an explanatory variable.
⁷ Since purity generally increases with transaction size this absence of an effect might be partially due to collinearity, but variation in log quantity explains less than 3% of the variation in log purity, so this is unlikely to be the entire explanation.


All of the other methods estimated coefficients of about 0.80 at the retail level and 0.77 at the wholesale level. Suppose that 0.79 is a fair representation of the coefficient's average value between 1 gram and 1 kilogram. Then, if one pure gram cost $100, one pure kilogram would cost about 100 × 1000^0.79 ≈ $23,400. This is consistent with the data in STRIDE, the DEA's Office of Intelligence price reports, and conventional wisdom about the price of a kilogram.

Another way to see that the coefficients obtained by regressing on log pure quantity are too low is to consider the interpretation of this coefficient β provided by Caulkins and Padman (1991):

$$\beta = 1 - \frac{\ln(1 + \delta)}{\ln(\phi)} \qquad (5)$$

where
δ = the percentage markup in price as one moves one level down the domestic distribution network and
φ = the branching factor as one moves one level down the domestic distribution network; that is, φ is the number of people to whom one supplier sells.
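Equation (5) can be inverted to give the branching factor implied by a given coefficient and markup, and the kilogram-price extrapolations used earlier in this section follow directly from equation (3). The sketch below is a numerical check only; the function names are hypothetical.

```python
import math


def beta_from_network(markup, branching):
    """Equation (5): beta = 1 - ln(1 + markup) / ln(branching)."""
    return 1.0 - math.log(1.0 + markup) / math.log(branching)


def branching_from_beta(beta, markup):
    """Equation (5) solved for the branching factor."""
    return math.exp(math.log(1.0 + markup) / (1.0 - beta))


def kilogram_price(gram_price, beta):
    """Price of 1,000 grams implied by P = alpha * Q**beta."""
    return gram_price * 1000.0 ** beta


if __name__ == "__main__":
    for beta in (0.64, 0.65, 0.77, 0.80):
        low = branching_from_beta(beta, 0.5)   # 50% markup per level
        high = branching_from_beta(beta, 1.0)  # 100% markup per level
        print(f"beta={beta:.2f}: branching factor {low:.1f} to {high:.1f}")
    for beta in (0.645, 0.79):
        print(f"beta={beta}: pure kilogram at ${kilogram_price(100.0, beta):,.0f}")
```

A markup of 100% combined with a coefficient of 0.80 gives a branching factor of 32, since ln(2)/0.2 = ln(32).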

Assuming that the markup at each level, δ, is between 50% and 100%, coefficients of 0.64 and 0.65 imply branching factors of only 3 to 7. In contrast, values of the coefficient estimated by the other methods imply much more realistic and less restrictive ranges for the branching factor. Specifically, the value of 0.80 for the retail level implies a branching factor in the range of 8 to 32, and a value of 0.77 at the wholesale level implies a branching factor in the range of 6 to 20.

To summarize, the traditional log-linear adjustment for purity is problematic because it produces price series which conflict with conventional wisdom and coefficient estimates for the purity adjustment which suggest that drug market participants are almost indifferent to the quality of the drug. Making the a priori assumption that pure quantity drives transaction level prices, however, is also problematic, because it degrades the fit of the model and produces implausible estimates for the coefficient describing the quantity discount.

Caulkins (1993b) suggests a simple resolution to this conundrum; namely, that it is the expected purity, not the actual purity, that, along with transaction size, governs price. A moment's reflection about the nature of illicit drug transactions suggests why the expected and actual (or observed) purity can differ and why the former is more relevant to price. The purity observed in drug transactions is highly variable, even within a single city at one time. Furthermore, customers in illicit drug transactions rarely have the means to quantitatively assay the quality of the product they are considering purchasing. In addition, drug agents rarely have long-standing relations with their suppliers; once one or a few transactions have been completed, an arrest is usually made. Because of these three factors, the business relationship upon which STRIDE data are based rarely lasts long enough for statements about the quality of the product to be trustworthy. That is, even if the seller knows the purity of the product, he or she cannot communicate that information to the agent in a credible manner. Thus, the true purity of the sample in question is not known at the time of purchase. The buyer does, however, have some expectation of what the purity might be, based on observations of other transactions of roughly the same size occurring in the same area in the recent past. It is this expectation which drives price.

The evidence discussed above is consistent with the hypothesis that it is the expected not the actual purity which governs price. If the hypothesis were true, one would not expect the actual purity of the transaction to give much information about its price. Likewise, one would expect that regressing on log pure quantity would yield a weaker fit and estimates of the quantity discount coefficient which are biased downward because it introduces extraneous noise into the independent variable. That is, the independent variable is effectively measured with error, and this fact has not been taken into account in running the regressions. On the other hand, one would expect regressing on log pure quantity to produce roughly the right price trend because the actual purity is as likely to be greater than the expected purity as it is to be lower.

Obviously it is impossible to reconstruct buyers' expectations on a transaction-by-transaction basis, but Caulkins' results suggest that this is not necessary. He simply used the median purity observed in that year for that class of transaction (retail or wholesale) as a proxy for the expected purity. The resulting model fit the data as well as any of the other models,⁸ and the price series obtained capture the collapse in prices experienced in the 1980s just as well as do the price series obtained by regressing on log pure quantity.⁹ Hence, the realization that it is expected pure quantity, not observed pure quantity, which drives the price in transaction level data both explains otherwise contradictory evidence and provides a practical basis for considering purity when producing price series.
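The median-purity proxy is straightforward to compute. The sketch below is illustrative only: the dictionary keys (`year`, `level`, `grams`, `purity`) are hypothetical names, and the rule for classifying a transaction as retail or wholesale is assumed to have been applied beforehand.

```python
import statistics
from collections import defaultdict


def median_purity_by_cell(observations):
    """Median observed purity for each (year, market level) cell."""
    cells = defaultdict(list)
    for obs in observations:
        cells[(obs["year"], obs["level"])].append(obs["purity"])
    return {cell: statistics.median(vals) for cell, vals in cells.items()}


def expected_pure_grams(observations):
    """Gross quantity times the median purity of the observation's cell.

    The observation's own assay result is deliberately not used, since
    the buyer is assumed not to know it at the time of purchase.
    """
    medians = median_purity_by_cell(observations)
    return [obs["grams"] * medians[(obs["year"], obs["level"])]
            for obs in observations]
```

The resulting expected pure quantity can then take the place of observed pure quantity as the size variable in equation (3) or in a regression.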

5. Alternative Ways of Constructing Price Series

There are a variety of ways of constructing price series. This section discusses three: aggregating observations of similar size, regression, and standardizing individual observations. All are responses to the fact that quantity discounts preclude directly comparing observations taken from different market levels.

⁸ With expected purity estimated as described, the fit is exactly the same as that obtained by regressing on log quantity alone; the only effect is on the dummy coefficients' values. Hence, the fit is marginally inferior to that obtained by regressing on log quantity and log purity separately, but that difference is easily made up by using even modestly more sophisticated methods of proxying for expected purity.
⁹ The aggregate price series produced by the two methods appear to be similar because price estimates produced by regressing on pure quantity are in some sense unbiased. On average, observed purity equals expected purity, by definition. The estimates are, however, inefficient because extraneous noise is introduced, making them less useful when there is a limited amount of data, as is the case, for example, when producing estimates for individual cities.


The first method, aggregating observations of similar size, is the traditional approach. Its principal advantage is its simplicity, but it also has a number of serious drawbacks. The second, regression, makes better use of the data but requires some advanced knowledge about the price series one is trying to find. The third, standardizing individual observations, also uses the data efficiently and is more flexible. These three approaches are discussed in turn.

Aggregating Observations of Similar Size

The most common response to the fact that quantity discounts preclude directly comparing observations of different sizes has been to restrict attention to observations of the same or similar size. The basic procedure is to divide observations into categories based on transaction size and report the average price per unit and the average purity (or some range of purity) for each category. The National Narcotics Intelligence Consumers Committee (NNICC, various years) and the DEA's Office of Intelligence (DEA, various years), for example, often report prices at the gram level, the kilogram level, and sometimes the ounce level.

This approach, while not unreasonable, has several drawbacks. These drawbacks include the inability to take advantage of data from intermediate market levels, the inevitable aggregation of observations of quantities that are not exactly the same, and the suggestion that transactions occur at discrete market levels when that is generally not the case.

The first two drawbacks work together to limit the power of this approach. The less data one omits, the greater the problems associated with comparing unlike observations, and there is no happy middle ground. For example, suppose one used only data which are within 25% of one gram, one ounce, and one kilogram to estimate gram, ounce, and kilogram prices, respectively. These ranges might seem narrow, and, indeed, they exclude almost 70% of the cocaine data in STRIDE. Nevertheless, doing so still involves aggregating observations whose prices will differ by almost 50%¹⁰ and whose prices per unit will differ by more than 10%.

Hence, if at all possible, one should construct price series in a manner which incorporates the log-linear adjustment for transaction size. Both of the methods described next do just that.
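The spread within a size band can be computed directly from the log-linear model of equation (3). The sketch below, with hypothetical function names, evaluates a band running from 25% below to 25% above a nominal size. It reports the ratio of predicted total prices at the band edges, (high/low)^β, and separately the ratio of the predicted per-gram price at the small end to that at the large end, (high/low)^(1-β).

```python
def total_price_ratio(low, high, beta):
    """Predicted total price at `high` over that at `low`, P = a * Q**beta."""
    return (high / low) ** beta


def unit_price_ratio(low, high, beta):
    """Predicted price per gram at `low` over that at `high`.

    Price per gram is a * Q**(beta - 1), which falls as Q rises, so the
    small end of the band carries the higher unit price.
    """
    return (high / low) ** (1.0 - beta)


if __name__ == "__main__":
    for beta in (0.75, 0.80):
        total = total_price_ratio(0.75, 1.25, beta)
        unit = unit_price_ratio(0.75, 1.25, beta)
        print(f"beta={beta:.2f}: total prices x{total:.2f}, unit prices x{unit:.2f}")
```

For exponents between 0.75 and 0.80 the total-price ratio is roughly 1.47 to 1.50, while the per-gram ratio is roughly 1.11 to 1.14.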

Regression

An alternative way of constructing price series is regressing price (or more commonly log price) on a variety of regressors (such as log expected pure quantity, dummy variables for location, etc.) including variables related to time. The coefficients estimated for the time variables can then be used to construct a price series.

¹⁰ The ratio of sizes 25% above to 25% below is (1 + 0.25)/(1 - 0.25) = 1.666, and 1.666^β ranges from 1.47 to 1.50 as β varies from 0.75 to 0.80.


The simplest approach is to include dummy variables for each time period of interest. This has the advantage of imposing very little structure on the data.¹¹ That is, it avoids the problem of finding in the data only what one expected to see. A disadvantage is that, in estimating the price in one time period, it ignores information potentially available in other time periods. This problem is most severe when the time periods under consideration are brief, both because that is precisely when data in adjacent periods are most likely to be relevant and because the number of data points contained in the period for which the estimate is desired falls as that period becomes shorter. One might seek to avoid these problems by defining very broad time periods, but another disadvantage with the dummy variable approach is that it gives no insight into price changes over intervals shorter than the length of the predefined period. For example, one might determine that prices in 1990 were 30% higher than they were in 1989, but one would not know whether the price increase was gradual or concentrated in a particular quarter. For that matter, one would not even be able to tell whether prices simply increased by 30% or whether they increased by 60% for the first six months of 1990 and fell back to their original levels in the second half of the year. Hence, there is an unappealing trade-off. Short time intervals make inefficient use of the data; long time intervals mask potentially interesting shorter term trends. If one has prior knowledge of the nature of the price changes, one can make better use of the data. For example, if one knew for certain that prices had increased linearly over the interval in question, then one could incorporate that into the regression model directly and simply estimate the intercept and slope of that linear increase.
Furthermore, if one had such prior knowledge, one could also make statements about prices in time intervals for which there is little or no data based on the data in other periods. This general approach is by no means limited to linear trends. As long as one knew the general shape of the time trend, one could fit it with a parametric curve, estimate the parameter values by regression, and use the functional form and parameter estimates to produce the estimated price series. The catch is that one does not always have perfect prior knowledge of the nature of the price changes. One could, of course, use exploratory data analysis to discover the general trends, posit functional forms motivated by these trends, and apply those functional forms to the data, but such "mining" of the data is treacherous. Furthermore, if one's prior beliefs, either true priors or "quasi-priors" which are informed by the data, are wrong, and one imposes them on the data, one can obtain perverse results. To illustrate this, consider the following hypothetical and stylized example. Suppose one wanted to investigate the extent to which the 20-ton cocaine seizure in September of 1989 affected prices. Suppose further that the actual impact were to immediately double the retail price per gram from $100 to $200, but that this effect decayed exponentially over time. (See Figure II.1.) Finally, suppose that one incorrectly assumed that the price increase were linear over time and fit a line to those data points. The resulting best-fit regression line (also shown on

¹¹ It does, however, assume that there is no interaction between time and the other regressors, such as log quantity, which may or may not be valid.


Figure II.1) is $114.83 + 0.56x, where x is the number of months since January 1989. It does not represent the data well at all. It suggests that the seizure had very little impact on price, and it suggests that the trend during the course of 1990 was of increasing rather than decreasing prices.

[Figure II.1: Hypothetical Example of a Misspecified Model. Horizontal axis: time (months since January 1989), 0 to 24. Vertical axis: price ($), 80 to 200. Plotted: actual price, and the regression line y = 114.83 + 0.56099x, R² = 0.019.]
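The hazard described in this example can be reproduced with a small simulation. The sketch below is illustrative only: the monthly price path (flat at $100, doubling to $200 in month 8, then decaying at a hypothetical rate of 0.25 per month) is an assumption, not the data behind Figure II.1, so the fitted coefficients will differ from those reported above. What carries over is the qualitative result: the fitted line explains almost none of the variation and badly understates the price at the time of the seizure.

```python
import math

BASE_PRICE = 100.0   # retail price per gram before the seizure
SHOCK_MONTH = 8      # September 1989, counting January 1989 as month 0
DECAY_RATE = 0.25    # hypothetical monthly decay rate (not from the report)


def true_price(month):
    """Stylized price path: flat, then a doubling that decays exponentially."""
    if month < SHOCK_MONTH:
        return BASE_PRICE
    return BASE_PRICE + BASE_PRICE * math.exp(-DECAY_RATE * (month - SHOCK_MONTH))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_resid / ss_total


months = list(range(25))
prices = [true_price(m) for m in months]
intercept, slope = fit_line(months, prices)
r2 = r_squared(months, prices, intercept, slope)

print(f"fitted line: {intercept:.2f} + {slope:.3f} * month, R^2 = {r2:.3f}")
print(f"actual price at shock: {true_price(SHOCK_MONTH):.2f}, "
      f"fitted: {intercept + slope * SHOCK_MONTH:.2f}")
```

Inspecting the residuals (actual minus fitted) month by month would reveal the misspecification even when the raw data cannot be plotted directly.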

Of course no one would actually make such a blunder with these data. The analyst would simply plot the data and spot the mistake. One drawback with estimating time series from STRIDE based on prior assumptions about the price trend, however, is that one cannot simply plot the data. One should still catch such blunders by analyzing residuals, but in general it is more difficult to develop intuition for the behavior of the prices. Since developing such intuition is desirable, the discussion will turn next to a method of standardizing individual price observations.

Standardizing Individual Price Observations

The reason quantity discounts and variations in purity make it difficult to monitor prices is that they prevent one from directly comparing different price observations. Comparing a one gram purchase to a seven gram purchase is like comparing an apple and an orange. Hence, what one would really like is a way to convert apples into oranges and vice versa. Fortunately, in this context such alchemy is possible. That is, one can standardize the prices of individual transactions. Having done this, one can then apply any of the thousands of statistical methods that have been developed to analyze time series and panel data. The concept behind standardizing individual observations is simple. Consider again the fundamental log-linear relationship described in Equation 3, but modified to reflect the discussion of expected purity in Section 4 and written in a different form:

$$\alpha = \frac{P}{Q^{\beta}} \tag{6}$$

where P = the price paid, β = parameter describing the extent of the quantity discount, Q = expected pure quantity of transaction, and α = standardized price per expected pure gram. Now suppose the expected purity for one-half ounce transactions was 70%. And suppose an agent purchased one-half ounce for $600. Then the expected pure quantity, Q, is 70% * 0.5 * 28.35 grams/ounce = 9.9225 grams. If β for this market level were 0.80, then this implies that the standardized price per gram is α = $600/(9.9225^0.80) = $95.69. What Equation 6 suggests, therefore, is that if this cocaine had been purchased by a dealer, instead of an agent, and resold at a level such that the expected pure quantity were one gram, then the price would have been $95.69 * 1.00^0.80 = $95.69. Hence, given Equation 6 and β = 0.80, the fact that an agent made a purchase whose expected pure quantity was 9.9225 grams for $600 gives essentially the same information about market prices as if the agent had paid $95.69 on a purchase whose expected pure quantity was one gram. If one defines one expected pure gram as the standard reference retail purchase size, then all retail observations can be converted into this standard unit and compared. Mechanically one could convert the price of a 2 kilogram purchase into the same standard unit, but this is inadvisable because, as mentioned above, the coefficient β can vary slightly across market levels. More generally, kilogram-level transactions and one gram street purchases are quite different phenomena, and it seems prudent not to rely on a simplistic formula to bridge such a broad gap. What one can do, however, is standardize purchase observations of roughly one kilogram to a standard reference quantity of one expected pure kilogram. This can be done with a modified form of Equation 6:
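The half-ounce example can be checked with a few lines of code. This is a minimal sketch of Equation 6; the function name and argument names are illustrative choices, not part of the report.

```python
GRAMS_PER_OUNCE = 28.35


def standardized_price(price_paid, expected_pure_grams, beta):
    """Equation 6: price per expected pure gram, alpha = P / Q**beta."""
    if price_paid <= 0 or expected_pure_grams <= 0:
        raise ValueError("price and expected pure quantity must be positive")
    return price_paid / expected_pure_grams ** beta


# Half an ounce at an expected purity of 70%, bought for $600.
expected_pure_grams = 0.70 * 0.5 * GRAMS_PER_OUNCE   # 9.9225 grams
alpha = standardized_price(600.0, expected_pure_grams, beta=0.80)
print(f"Q = {expected_pure_grams:.4f} g, standardized price = ${alpha:.2f}")
```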

$$\alpha = P \left( \frac{Q_0}{Q} \right)^{\beta} \tag{7}$$

where Q₀ is the standard reference quantity, and the other terms are as above. For example, suppose that the expected purity of one-half kilogram were 85%, an agent purchased one-half kilogram for $12,500, and the β value for wholesale purchases were 0.77. Then the standardized price per reference kilogram would be:

$$\alpha = \left( \frac{1\ \text{kilogram}}{85\% \times 1/2\ \text{kilogram}} \right)^{0.77} \times \$12{,}500 = \$24{,}157.$$
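The wholesale example can be verified in the same way. The sketch below assumes that Equation 7 has the form α = P (Q₀/Q)^β, which is the form the worked example implies; the function name is hypothetical. Setting the reference quantity to one expected pure gram recovers Equation 6.

```python
def standardized_price_at_reference(price_paid, expected_pure_qty,
                                    reference_qty, beta):
    """Assumed form of Equation 7: alpha = P * (Q0 / Q)**beta.

    expected_pure_qty and reference_qty must be in the same units.
    """
    if min(price_paid, expected_pure_qty, reference_qty) <= 0:
        raise ValueError("all inputs must be positive")
    return price_paid * (reference_qty / expected_pure_qty) ** beta


# Half a kilogram at 85% expected purity, bought for $12,500, beta = 0.77.
wholesale = standardized_price_at_reference(
    price_paid=12_500.0,
    expected_pure_qty=0.85 * 0.5,   # expected pure kilograms
    reference_qty=1.0,              # one expected pure kilogram
    beta=0.77,
)
print(f"standardized price per reference kilogram = ${wholesale:,.0f}")

# With a one-gram reference the same function reproduces the retail example.
retail = standardized_price_at_reference(600.0, 9.9225, 1.0, 0.80)
print(f"standardized price per expected pure gram = ${retail:.2f}")
```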

In a similar manner, purchases that are close to an ounce can be standardized using Q₀ = 28.35 grams and the appropriate value of β. This raises the question of how one finds the appropriate value of β. One simple way is to regress log price on log expected pure quantity and whatever other regressors are appropriate (e.g. dummy variables for location) and use the estimated coefficient for log pure quantity. Technically this is somewhat improper if the data which are to be standardized are included in the data used to estimate β. Pragmatically, however, this is not a concern for two reasons. First, the values of β estimated using the observations in question, all observations, all observations excluding the ones in question, etc. are all nearly the same, at least as long as the range of quantities included is not too narrow and the number of data points is sufficiently large. Second, Caulkins (1993b) shows that trends in price series produced in this way are very robust with respect to the exact value of β used. Changing the value of β will affect the absolute value of the estimated prices, but it affects all by a similar proportion, so the trends are stable. The only exception to this occurs when the distribution of transaction sizes differs over the aggregates being compared. For example, if one were comparing retail prices in 1989 and 1990, and most of the retail purchases in 1989 were for less than one gram while most of the retail purchases in 1990 were for more than one gram, then the value of β could influence one's perceptions of the price trend. Pragmatically, the distribution of transaction sizes within a market level does not vary systematically over time or location, so this is not an issue. In summary, it is possible to standardize all observations from a similar market level. The greatest advantage of this approach is its flexibility.
Once the individual observations have been standardized, then the price data can be analyzed just like any other cross-sectional time series. They can be plotted, adjusted for seasonal effects, analyzed with ARIMA or other moving average models, and so on. The ability to easily apply methods which are robust with respect to outliers is of particular interest in this regard. As has been discussed above, the price data in STRIDE are fairly noisy and include outliers. This suggests that, when looking for a measure of the central tendency of a set of observations, one would rather use the median than the mean, because the median does not depend directly on the exact value of non-central observations. When observations are standardized individually, the median, as well as other robust measures of the central tendency such as the trimmed mean, can be computed readily. The regression methods described above are less flexible. Likewise, one can use interquartile ranges, in place of the mean plus and minus one standard deviation, as a measure of the location and range of the bulk of the data.
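To make the point about robust measures concrete, the following sketch compares the mean with the median, a trimmed mean, and the interquartile range. The standardized prices are invented for illustration (one deliberately extreme value is included), and the 10% trimming fraction is an arbitrary choice.

```python
import statistics


def trimmed_mean(values, trim_fraction=0.10):
    """Mean after dropping the lowest and highest trim_fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)


# Hypothetical standardized prices per expected pure gram; 480 is an outlier.
prices = [88.0, 92.0, 95.0, 97.0, 101.0, 104.0, 108.0, 110.0, 115.0, 480.0]

mean_price = statistics.fmean(prices)
median_price = statistics.median(prices)
trimmed = trimmed_mean(prices)
q1, _, q3 = statistics.quantiles(prices, n=4)   # quartile cut points

print(f"mean          = {mean_price:.2f}")    # pulled up by the outlier
print(f"median        = {median_price:.2f}")
print(f"trimmed mean  = {trimmed:.2f}")
print(f"interquartile range = {q1:.2f} to {q3:.2f}")
```

The single extreme observation moves the mean far from the bulk of the data, while the median and trimmed mean barely register it.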

6. Application to STRIDE's Cocaine Data

This section summarizes the lessons discussed above into a procedure for creating price series and then applies that procedure to the STRIDE data.


Procedure

Step 1: The first step in creating a price series is to identify the geographic region for which the price series is desired. Since purity and price vary across cities, it is generally not advisable to aggregate data from more than one city, for at least two reasons: prices vary over even relatively modest distances (Caulkins, 1993a), and there is no guarantee that those variations remain constant over time. It is therefore somewhat misleading to even speak of a single price series for a larger region. Sometimes price series for larger geographic units are needed. In such cases the best approach is to compute a weighted average of local price series. Some care must be exercised, however, in choosing the weights. One would clearly not want, for instance, to weight the local price series by the number of observations in STRIDE (as is implicitly done if one simply uses all of the data). This would put too much emphasis on prices in places where, for various reasons, the number of STRIDE observations is large relative to the size of the market.¹² Furthermore, because of the vagaries of enforcement, the ratio of observations taken in different cities varies over time. If this is not adjusted for, it opens the door to paradoxical results of the sort discussed in Section 5 of the previous chapter. Ideally one would weight the local price series by the corresponding market size, but good measures of market size are not available. Hence, one must rely on proxies such as numbers of DAWN mentions or simply the population of the city. In addition, one might want to adjust the weights for the reliability of the estimate. Cities whose price series were poorly estimated would be weighted less, all other things being equal, than cities whose series were estimated from a large number of reliable data points. There also may be instances in which sparsity of data necessitates combining observations from different cities.
For cocaine, prices standardized for transaction size, even after outliers have been removed, have coefficients of variation around 0.4. Hence, if there are only a handful of observations, there simply will not be enough information to estimate modest price trends reliably. In such cases, the problems associated with comparing potentially unlike things (prices in different cities) may be less severe than the problems of estimates based on small sample sizes (high variability). Merging data in this way should only be done, however, if there is no prior basis for believing prices in the cities in question differ and the data themselves do not suggest such differences. Step 2: Having identified the geographic region for which the price series are to be created, one next needs to decide on a range of transaction sizes from which data will be used. One possibility is to

¹² For example, San Diego has by far the most retail level observations, but other cities' cocaine markets may be as large or larger.


use all the data since, by their very nature, the methods discussed above can standardize over various transaction sizes. This cavalier approach is imprudent, however, given Caulkins and Padman's (forthcoming) finding that the exponent for transaction size can vary (albeit weakly) with transaction size. Furthermore, not allowing for this possibility forces prices at different market levels to be proportional to each other, and one of the principal reasons for constructing price series is to compare the price movements at different market levels. It seems safe to combine observations taken from a range of one to two orders of magnitude. Wider ranges may also be acceptable, but should generally be avoided unless they are needed to obtain a sufficient number of data points. For example, if one wished to create a retail or "gram level" price series, one might use observations larger than 0.1 grams (because measurements made on smaller quantities may be less reliable) but no larger than 4 grams (large enough to include one-eighth ounce purchases, or "eight balls"). Likewise one might use observations of between 0.1 and 10 kilograms when estimating wholesale or "kilogram level" price series. Observations below 100 grams border on lower-level wholesale. The 10 kilogram upper limit is less important because it is rare for there to be purchase observations for more than 10 kilograms. For ounce level price series, a range of 20 - 40 grams might be appropriate.
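The size ranges suggested in Step 2 can be written down as a simple filter. In this sketch the ranges come from the examples above, but the names, the use of total weight in grams, and the treatment of the endpoints as inclusive are assumptions made for illustration.

```python
# Weight ranges in grams for each market level, from the examples in Step 2.
MARKET_LEVEL_RANGES_GRAMS = {
    "gram": (0.1, 4.0),
    "ounce": (20.0, 40.0),
    "kilogram": (100.0, 10_000.0),   # 0.1 to 10 kilograms
}


def select_observations(weights_in_grams, level):
    """Keep only the weights that fall inside the range for a market level."""
    low, high = MARKET_LEVEL_RANGES_GRAMS[level]
    return [w for w in weights_in_grams if low <= w <= high]


# Hypothetical purchase weights in grams.
weights = [0.05, 0.5, 3.5, 28.0, 500.0]
for level in MARKET_LEVEL_RANGES_GRAMS:
    print(level, select_observations(weights, level))
```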

Step 3: The next step is to eliminate outliers. As discussed above in Section 2, there are a number of simple and effective ways of doing this, but it is essential that it be done. Step 4: The next step is to find a reasonable proxy for expected purity. One might use, for example, the median purity observed for non-zero purity purchases (or perhaps purchases and seizures combined) in the quantity range, city, and time period in question. If there are not sufficient data (e.g. fewer than five data points) in one time period, then the median can be taken over the data in that and the two adjacent time periods.
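The median-purity proxy in Step 4, including the fallback to adjacent time periods, might be sketched as follows. The data layout (a list of purity lists, one per consecutive time period for a single city and quantity range), the function name, and the sample purities are assumptions; the threshold of five observations is the example figure given above.

```python
import statistics

MIN_OBSERVATIONS = 5   # example threshold from Step 4


def expected_purity(purity_by_period, period):
    """Median non-zero purity for a period, pooling neighbors if data are sparse."""
    own = [p for p in purity_by_period[period] if p > 0]
    if len(own) >= MIN_OBSERVATIONS:
        return statistics.median(own)
    # Too few points: pool this period with the two adjacent periods.
    first = max(0, period - 1)
    last = min(len(purity_by_period) - 1, period + 1)
    pooled = [p for i in range(first, last + 1)
              for p in purity_by_period[i] if p > 0]
    if not pooled:
        raise ValueError("no non-zero purity observations available")
    return statistics.median(pooled)


# Hypothetical purities (fractions) for three consecutive periods.
purities = [
    [0.62, 0.70, 0.66, 0.0, 0.71, 0.68, 0.65],
    [0.72, 0.0, 0.75],
    [0.70, 0.74, 0.69, 0.73, 0.71],
]
purity_0 = expected_purity(purities, 0)   # enough data in the period itself
purity_1 = expected_purity(purities, 1)   # sparse, so neighbors are pooled

# Expected pure quantity = total weight times expected purity.
expected_pure_grams = 14.0 * purity_1
print(purity_0, purity_1, expected_pure_grams)
```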

An alternative is to regress log purity (for observations with non-zero purity) on log quantity and dummy variables for time and location. Log of quantity is included as a regressor because purity can increase with transaction size, although often its coefficient is not significant. The regressors can be augmented by interaction terms, but the interaction terms usually are not significant, and when they are significant, their explanatory power is low. Once an expected purity has been obtained, the expected pure quantity is found simply by multiplying the total weight of an observation by its expected purity. Step 5: Finally, the price series can be produced either by regression or by standardizing individual observations, as was described in Section 5. If one chooses to standardize individual observations, then one needs an estimate of β, the measure of the extent of quantity discounts. As mentioned, one can find β by regressing log price on log expected pure quantity and appropriate ancillary variables. Typically values obtained are between 0.75 and 0.8, and Caulkins (1993b) shows that trends in standardized prices are very robust with respect to the exact value of β used.
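A stripped-down version of the regression used to estimate β is sketched below. It regresses log price on log expected pure quantity alone; the ancillary variables mentioned above (such as location dummies) are omitted for brevity, which is a simplification. The observations are synthetic and noise-free, generated from a known β of 0.8, so the fit recovers that value exactly; real STRIDE data would not behave so neatly.

```python
import math


def estimate_beta(prices, pure_quantities):
    """OLS of log(price) on log(expected pure quantity).

    Returns (beta, alpha), where alpha is the implied price at Q = 1.
    """
    xs = [math.log(q) for q in pure_quantities]
    ys = [math.log(p) for p in prices]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    return beta, math.exp(mean_y - beta * mean_x)


# Synthetic observations following P = 100 * Q**0.8 (hypothetical values).
quantities = [0.2, 0.5, 1.0, 2.0, 3.5]
prices = [100.0 * q ** 0.8 for q in quantities]

beta_hat, alpha_hat = estimate_beta(prices, quantities)
print(f"estimated beta = {beta_hat:.3f}, price at one pure gram = {alpha_hat:.2f}")
```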


Price Series for Individual Cities

Gram (0.1 - 4 gram), ounce (20 - 40 grams), and kilogram (0.1 - 10 kilograms) level price series were created for the cities indicated in Table II.2 by standardizing individual observations for expected pure quantity, where expected purity was proxied by median purity. Details of the results are presented in an appendix to this report.

Table II.2: Cities for Which Price Series Were Computed

CITY: Atlanta, Baltimore, Boston, Chicago, Detroit, Minneapolis, New Orleans, New York, Philadelphia, Saint Louis, San Diego, Washington, DC

LEVEL       YEARS
GRAM        1982-1991
OUNCE       1983-1991
KILOGRAM    1985-1992

As was discussed above, one of the virtues of standardizing individual observations is that doing so allows great flexibility in presenting and analyzing the results. To illustrate this, Figures II.2, II.3, and II.4 apply three different methods of graphing price series to the gram, ounce, and kilogram level results. Figure II.2 shows an exponentially smoothed plot of San Diego's gram level individual price observations; this gives maximum prominence to short term fluctuations in price.

[Figure II.2: Gram Level Observations in San Diego, 1982-1991 (1991 dollars), Exponentially Smoothed. Horizontal axis: date, 1/82 to 1/92. Vertical axis: price, 0 to 350.]
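Exponential smoothing of the kind used for Figure II.2 can be sketched in a few lines. The report does not state the smoothing constant used, so the value below is a placeholder, and the prices are invented; the sketch shows simple (single) exponential smoothing applied to observations in time order.

```python
def exponential_smoothing(values, smoothing=0.2):
    """Simple exponential smoothing; smoothing is the weight on the newest value."""
    if not 0.0 < smoothing <= 1.0:
        raise ValueError("smoothing must be in (0, 1]")
    smoothed = []
    level = None
    for value in values:
        # The first observation initializes the level.
        level = value if level is None else smoothing * value + (1 - smoothing) * level
        smoothed.append(level)
    return smoothed


# Hypothetical standardized gram-level prices in chronological order.
observed = [210.0, 190.0, 260.0, 180.0, 170.0, 240.0, 150.0]
print([round(v, 1) for v in exponential_smoothing(observed)])
```

A larger smoothing constant tracks short term fluctuations more closely; a smaller one yields a flatter series.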

Figure II.3 plots the median ounce-level standardized price, for each year, for each of the eleven cities for which ounce-level price series were computed.¹³ The plot dramatically shows that prices in every city follow similar trends even though there can be consistent differences in price. It does not, of course, clearly depict trends for individual cities, but such plots can readily be generated from the data in the Appendix.

¹³ Nine city-year combinations for which there were ten or fewer data points are omitted.

[Figure II.3: Median Ounce Level Standardized Prices for Eleven Cities, 1983-1991 (1991 dollars). Horizontal axis: year, 1983 to 1991. Vertical axis: price, 0 to 6,000. Series: Atlanta, Baltimore, Boston, Chicago, Detroit, Minneapolis, New Orleans, New York, St. Louis, San Diego, Washington.]

Figure II.4 plots the median, 25th percentile, and 75th percentile of New York City's standardized kilogram level prices by year for 1985-1992. This plot gives a sense not only of how prices have varied over time, but also of how they vary from purchase to purchase at a single time.

[Figure II.4: Interquartile Ranges of Standardized Prices at the Kilogram Level in New York, 1985-1992 (1991 dollars). Horizontal axis: year, 1985 to 1992. Vertical axis: price, 0 to 50,000. Plotted: 25th percentile, median (50%), and 75th percentile.]

Composite Price Series

A "national" ounce level price series was created by computing the weighted average of the eleven city specific ounce level price series described above, with weights equal to the city population.¹⁴ Figure II.5 shows the results. There was clearly a steady decline in prices through 1989, an increase in 1990, followed by a further decline in 1991.

[Figure II.5: National Ounce Level Price Series, 1983-1991 (1991 dollars). Horizontal axis: year, 1983 to 1991. Vertical axis: price, 0 to 4,000.]
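The population-weighted composite can be sketched as follows. Populations are linearly interpolated (or extrapolated) from two census years, as described in the footnote on city populations, and a city contributes to a given year only if it has a price estimate for that year. The city names, populations, and prices below are placeholders, not census figures or results from this report.

```python
def interpolate_population(pop_1980, pop_1990, year):
    """Linear interpolation (or extrapolation) from the 1980 and 1990 counts."""
    return pop_1980 + (pop_1990 - pop_1980) * (year - 1980) / 10.0


def composite_price(city_prices, city_weights):
    """Weighted average over the cities that have a price estimate."""
    cities = [c for c in city_prices if c in city_weights]
    total_weight = sum(city_weights[c] for c in cities)
    if total_weight <= 0:
        raise ValueError("no cities with positive weight")
    return sum(city_prices[c] * city_weights[c] for c in cities) / total_weight


# Hypothetical census counts (1980, 1990) and 1985 ounce-level median prices.
census = {"City A": (1_000_000, 1_200_000), "City B": (3_000_000, 2_800_000)}
prices_1985 = {"City A": 2_000.0, "City B": 3_000.0}

weights_1985 = {c: interpolate_population(p80, p90, 1985)
                for c, (p80, p90) in census.items()}
national_1985 = composite_price(prices_1985, weights_1985)
print(f"composite 1985 price = {national_1985:,.2f}")
```

Other proxies for market size, such as numbers of DAWN mentions, could be substituted for population by changing only the weights.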

Finally, sometimes one may be interested in comparing trends in prices at different market levels. When doing this, it is imperative that the geographic regions covered by the two series be identical. Suppose, for example, one wanted to compare ounce and kilogram level price series. One could compute an aggregate kilogram level series by combining the Boston, Chicago, and New York City price series in the manner described above.¹⁵ This should not, however, be compared to the price series in Figure II.5. Instead, it should be compared to a new, composite ounce level price series based on city specific series for those same three cities. This is done in Figure II.6. (To facilitate comparison, the ounce level and kilogram level price series are plotted on different axes.)

¹⁴ City population for specific years was linearly interpolated and extrapolated from the 1980 and 1990 census figures.

¹⁵ This series is referred to as an "aggregate" not a "national" price series since it is based on only three sites.

[Figure II.6: Comparable Ounce and Kilogram Level Composite Price Series, 1985-1991 (1991 dollars). Horizontal axis: year, 1985 to 1991. Left vertical axis: ounce level price, 0 to 4,000. Right vertical axis: kilogram level price, 0 to 50,000.]

7. Summary

This chapter has described and applied methods for creating price series from STRIDE data. It made several key points.

• Outliers must be removed.

• The log-linear adjustment for transaction size should be used to avoid comparing unlike objects.

• Transaction level prices are driven by expected, not actual, purity.

• Creating price series from individually standardized observations is a useful, flexible alternative to regression methods.

These lessons were applied to a number of cities at the gram, ounce, and kilogram levels. The resulting price series show some interesting fluctuations. One might reasonably ask, however, whether the fluctuations are significant. Do they reflect true changes in market conditions or merely random noise? None of the tools described above answer such questions, so they are addressed in the next chapter.


III. Existence and Implications of Variability Between Purchases

The previous two chapters discussed how one might create price series for heroin from Domestic Monitor Program data and for cocaine from STRIDE data. By their very nature, price series rise and fall over time. Sometimes these variations reflect true changes in underlying market conditions; sometimes they reflect nothing more than random variation, i.e., "the luck of the draw." Distinguishing between these cases is vitally important. Consider, for example, the problem of evaluation. It is widely recognized that, at least in principle, market prices can be used as an indicator of the success or failure of various interventions. For example, one would hope that stepped-up interdiction would reduce the supply of drugs within this country and, hence, lead to an increase in the market-clearing price. Suppose one did see such a price movement after an intervention. If there were indeed a causal relationship between the intervention and the price change, that would imply that the intervention was having an effect and might justify allocating additional resources to that program. If, on the other hand, the program were ineffective and the apparent price response was due to nothing more than sampling variability, then shifting these resources would be wasteful. It is, in general, exceedingly difficult to demonstrate causality. But it is common to demand that the measured response to an intervention at least exceed that which can be readily explained by randomness before considering the intervention a success. One can similarly think of examples of policy analysis and planning, not just evaluation, for which it is important to determine whether fluctuations in the price trends produced by the methods described in the first two chapters of this report are significant.
Hence, for a variety of reasons, one would like to distinguish between natural variability present in the absence of any trends and variability which is inconsistent with a null hypothesis that there has been no true change in the market. Making such distinctions is the subject of this chapter. In order to make such distinctions, one must first characterize the "natural" variability which occurs in the absence of any underlying, systematic changes. Ideally, to do this one would take a large number of observations under identical conditions. Practically speaking, there are no such data. One can, however, for a particular city and time period, examine purchases which are (1) of the same or similar weight, (2) of the same or similar dollar value, and (3) of differing weight, but which have been standardized for those differences. These three situations are addressed for cocaine data from STRIDE in the first three sections below. In each case, the procedure begins with a search for a parametric probability distribution which describes the data reasonably well. Having found such a distribution, one can then inquire as to how those parameters have varied over time, location, and transaction size. More generally, one can develop insight into the natural variability in repeated purchases (at least for cocaine). Then the fourth section applies the lessons learned to testing the significance of variations in the price series and designing data collection efforts. This section then closes with a summary of the results obtained.


1. Variability in Purchases of Similar Quantity

Drugs are not sold in exact multiples of regular units, the way many licit commodities, such as sugar, are. For example, what is sold as an "ounce" of cocaine may only weigh 7/8 of an ounce. Hence, it is rare for there to be many purchases of exactly the same weight. So, to examine variability of price and purity for particular sales quantities, we considered all transactions whose weight fell within a narrow band around the nominal weight to be of that weight.¹ We found 35 instances in STRIDE in which a single city in a single year had 30 or more purchase observations of essentially the same size. Most of the conclusions discussed below are based on these 35 instances and, to a lesser extent, 95 more city-year combinations that had between 18 and 29 purchase observations of a similar size.

Price

Most of the prices paid for a given quantity in a certain city in a particular year cluster around a particular value. The cumulative effect of variation in bargaining skill, geographic differences in price within the city, variations in price over the year, and variations in price by time of day and day of week, among other factors, would lead some observations to be above or below the usual price, but there is no obvious reason why these deviations should be skewed one way or another. Hence, one might expect that the prices would be roughly normally distributed. To a first-order approximation this appears to be the case, at least after outliers have been removed. Of the 35 instances with 30 or more purchase observations of the same size in one city in one year, 18 passed Lilliefors' test for normality.² After outliers were removed (discussed below), 25 passed the test. Of the remaining 10, about half fail the test of normality simply because prices cluster at convenient dollar amounts, e.g. multiples of ten dollars (also discussed below).
Such normality, or near-normality, is fortuitous because it is relatively easy to design statistical tests of significance for differences in normally distributed random variables. The design of these tests is discussed further in Section 4. For purchase observations within a narrow weight range in a particular city and year, price standardized for quantity (in any of the ways mentioned in Section 3) will also be approximately normally distributed. This is simply a consequence of the fact that standardization for observations of similar quantities amounts to little more than scaling the original prices.

¹ In particular, for multiples of an ounce, we included observations that were within 9.7% below and 4.4% above the nominal quantity. Similarly, for multiples of a kilogram we included observations that were within 4% of the nominal quantity.

² Lilliefors' test is a variation on the Kolmogorov test of whether two distributions are dissimilar. See Sprent (1990) for a description and demonstration of these tests.


Deviations from the Normal Distribution

Although prices for a given quantity in a particular location and time appear to be roughly normally distributed, strictly speaking they are not, for two reasons. First, certain prices, especially multiples of ten dollars, are favored. Second, there are more outliers, or extreme values, than a normal distribution would predict. These two differences are discussed in turn. The distribution of price data is actually discrete, not continuous. Drugs are usually bought and sold in multiples of $10, $50, $100, etc., and rarely at odd prices such as $23.³ The larger the purchase, the bigger the multiple that is likely to be used. This in and of itself is a minor problem, but it is often compounded because certain round prices are favored over others. For example, STRIDE contains 30 purchases of cocaine in New Orleans in 1984. One-third of these purchases were for $2,000. Generally, however, this effect is most pronounced for smaller purchase quantities. If the most popular price coincides with the average price this is not a problem. But if the average price falls between two of these convenient round prices, then the distribution of prices will be less peaked than a normal distribution and could possibly even be bimodal. The second major way in which the distribution of cocaine prices for a given quantity, location, and time differs from the normal distribution is the presence of outliers. An outlier is a data point that does not belong in the distribution of data. For example, an outlier might be a consequence of a typographical error, or it may come from a different distribution than the bulk of the data. Both of these circumstances exist in STRIDE data. (This may be because STRIDE is an evidence tracking system, not a drug market research tool.) The inclusion of outliers broadens and shortens the distribution, making it more difficult to determine trends and differences between data.
In addition, if the outliers are not balanced on either side of the mean, they will shift the mean of the distribution. Deciding whether to include or exclude an observation is tricky, especially in data about illicit drugs. Certainly typographical errors should be excluded, but other data points that are outliers mayor may not be a part of the distribution being studied. Recognizing that there is no foolproof method, we simply identified an observation as an outlier if it stood out from the rest of the observations significantly. Specifically, we identified as outliers those observations for which:

\[
\frac{\left| x_0 - \mathrm{med}(x_i) \right|}{\mathrm{med}\left[ \, \left| x_i - \mathrm{med}(x_i) \right| \, \right]} > 5
\]

where med(x_i) is the median of all observations in the sample. If the distribution is approximately normal this procedure will identify as outliers those data points which are over three standard deviations from the mean.4
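The outlier rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the example prices, and the handling of a zero median absolute deviation (which the text does not address) are ours, not part of the original analysis.

```python
from statistics import median

def flag_outliers(values, threshold=5.0):
    """Flag observations whose absolute deviation from the sample median
    exceeds `threshold` times the median absolute deviation."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # Assumption: the source does not say what to do when more than half
        # of the prices are identical, so this sketch refuses to guess.
        raise ValueError("median absolute deviation is zero")
    return [abs(v - center) / mad > threshold for v in values]

# Hypothetical one-ounce prices, in dollars (not STRIDE data).
prices = [900, 950, 1000, 1000, 1050, 1100, 2000]
flags = flag_outliers(prices)
kept = [p for p, is_outlier in zip(prices, flags) if not is_outlier]
print(kept)
```

For normally distributed data the median absolute deviation is roughly 0.67 standard deviations, so a threshold of 5 corresponds to a little more than three standard deviations, consistent with the statement above.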

3This phenomenon is not unknown in licit commerce. Candy bars, for example, are usually priced in multiples of five cents.

4Sprent (1990), p. 277.


Variation in Price

Normally distributed random variables are customarily described by their standard deviation (a measure of absolute variability) and mean (a measure of central tendency). The ratio of the standard deviation to the mean, known as the coefficient of variation, gives a sense of how variable the quantity in question is as a percentage of its typical values.5 For example, if the coefficient of variation is 0.1, then nine observations in ten will fall within about 16% of the average value, whereas if the coefficient of variation were 0.2 then, on average, only about six in ten would. Hence it is instructive to examine how the coefficient of variation differs across locations, time, and quantities, both for its own sake and as a step toward evaluating a monitoring system's ability to detect changes in prices.

Table III.1 shows how the coefficient of variation in the price of one-ounce cocaine purchases has varied over time for seven cities.6 The table suggests that the coefficient of variation at the one-ounce level has not varied much with location or time, typically being between 0.1 and 0.2.7

One might expect, however, that variability in price is inversely related to transaction size. Table III.2 investigates this possibility by showing the coefficient of variation for two cities in various years for several different transaction sizes. There does not appear to be any strong pattern in quantities two ounces and under, suggesting relative variability does not vary with transaction size at this level. However, in all five instances for which a coefficient of variation was estimated at the 1/8 kilogram level, it is smaller than the coefficient of variation for smaller quantities in the same year and location. This suggests that the prices of low-level wholesale transactions may be less variable than those of smaller quantities. It should be noted, though, that the variability shown in New York at the 1/8 kilogram level is no less than many of the values shown in Table III.1 at the one-ounce level.
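The "nine in ten" and "six in ten" figures quoted above follow directly from the normal distribution, and can be checked with a short sketch. The function names and the three-price sample are illustrative assumptions; only the 16% band and the coefficients 0.1 and 0.2 come from the text.

```python
from statistics import NormalDist, mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return stdev(values) / mean(values)

def share_within(fraction_of_mean, cv):
    """Share of a normal population lying within +/- fraction_of_mean of its
    mean, when the population's coefficient of variation is cv."""
    z = fraction_of_mean / cv  # half-width of the band in standard deviations
    return 2 * NormalDist().cdf(z) - 1

# Hypothetical prices: mean 100, sample standard deviation 10.
print(coefficient_of_variation([90, 100, 110]))

# Share of observations within 16% of the mean for the two cases in the text.
print(round(share_within(0.16, 0.1), 2))
print(round(share_within(0.16, 0.2), 2))
```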

5The standard deviation for a price distribution with a mean of $1,000 is usually larger than one with a mean of $10, just because higher prices are involved. If the $1,000 distribution had a standard deviation of $100, and the $10 distribution had a standard deviation of $5, the latter distribution would typically be perceived as being more variable. This intuition is reflected in the coefficient of variation. The distribution with the $1,000 mean has a coefficient of variation of 0.1, and the distribution with the $10 mean has a coefficient of variation of 0.5.

6The coefficients of variation shown were computed after outliers were removed using the process described earlier.

7An exception is that there appears to be a weak tendency for larger cities to have higher coefficients of variation. Such a tendency would make sense because larger cities simply have more room for local geographic variation in price. Likewise, the coefficients of variation seem to be smaller in earlier years. This might be because, at those times, effective price changes at lower levels were achieved in part by varying purity. More recently purities have been uniformly high, so they may vary less, and price fluctuations may be more likely to manifest themselves as changes in the dollar price. Whether either of these tendencies is real, and whether the explanations offered are correct, cannot be known without further data gathering.


Table III.1: Coefficient of Variation in Price of 1 Ounce of Cocaine, 5 or more years per city (15 or more similar City-Year-Quantity purchases after outliers removed)

CHICAGO
YEAR   Coef.   n
1981   0.069   21
1982   0.081   28
1983   0.127   21
1984   0.132   26
1985   0.130   50
1986   0.105   48
1987   0.192
1988   0.151
1989   0.151   36
1990   0.145   53
1991   0.145   43
1992   0.156   18

DETROIT Coel'.

0.055

n

18

NEW ORLEANS Coe!'.

n

NEW YORK CoeL

SAN ANTONIO

n

CoeL

ST. LOUIS

n

Coel'.

n

0.099

15

0.121

18

0.140

18

0.097

20

WASHINGTON

coer.

n

0.113

16

0.112

30

0.103

19

0.135

21

0.121

46

0.139

36

0.084

35

0.119

16

30

0.183

35

0.151

18

0.083

27

0.130

26

0.1 52

36

0.149

75

0.166

33

0.097

33

0.205

35

0.197

24

0.085

27

40

0.180

16

0.118

41

0.363

17

0.139

20

0.237

22

0.334

17

0.195

18

0.139

33

0.384

20

0.104

18

0 .164

22

0.183

15

0.203

15

0.197

21

0.117 0.106

37

0.178

19

0.201

25

16

Table III.2: Coefficient of Variation at Various Market Levels and Years for New York and Chicago (15 or more similar City-Year-Quantity purchases after outliers removed)

YEAR

CHICAGO Weight

Coef.

n 24

NEW YORK Coef.

n

35

1985 1/2 ounce

0.125

1/4 ounce

0.073

18

2 ounce

0.104

20

1986 1 ounce

0.105

48

0.183

2 ounce

0.141

38

0.315

17

0.1n

24 35

1/8 kilo

1987 1 ounce

0.192

75

0.205

2 ounce

0.178

75

0.185

18

0.151

25

0.363

17

1/8 kilo

1988 1 ounce

0.151

40

2 ounce

0.138

33

1/8 kilo

0.197

18

1 kilo

0.138

16

0.384

20

0.243

23

0 .203

15

0.135

23

1989 1 ounce

0.151

36

2 ounce

0.120

19

1/8 kilo

1990 1 ounce

0.145

53

2 ounce

0.144

17

1/8 kilo

1991 1 ounce

0.145

43

2 ounce

0.144

25


Since the coefficient of variation does not appear to differ enormously across time, location, or transaction size, a reasonable rule of thumb is that the coefficient of variation for the price of repeated purchase observations of the same size is usually between 0.1 and 0.2. This fact will be used in Section 4.

Similar investigations were performed for the coefficient of variation of the pure unit price, i.e., the unit price divided by the purity. Dividing by purity produced marked increases in variation, suggesting that price and purity are not perfectly correlated. Overall the coefficient of variation was about three times higher for pure price than for price, with the majority of the coefficients falling between 0.2 and 0.6.

Purity

One can also examine the distribution and variability of the purity observed in cocaine purchases of a similar size in a particular city and year. Since purity is always bounded between 0 and 1 and was observed to be unimodal, it was natural to try to fit the data with a beta distribution.8 The mean and standard deviation of the beta distribution are determined by two parameters, α and β, just as with the normal distribution. However, unlike the normal distribution, the beta distribution need not be symmetrical. Hence, these parameters also determine the shape of the distribution. This flexibility was important because not only do the mean and variance of purity vary across cities, years, and quantities, but so does the shape of the distribution.

We began by attempting to fit beta distributions to the 35 city-year-quantity combinations with 30 or more purchases. For 32 of the 35 combinations, we obtained parameters that fit the distribution well enough to pass Kolmogorov's test for distribution similarity. No effort was made to remove outliers since it was not necessary, and because it is not clear in a drug purity distribution when it would be valid to remove them. For example, Figure III.1 shows the observed cumulative distribution of purities and the cumulative distribution function of the beta distribution fit to those data. The figure shows that the fit is quite good for all purities.

8The beta distribution is described by the equation:

\[
f(x \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \, x^{\alpha - 1} (1 - x)^{\beta - 1}
\]

where 0 < x < 1, α > 0, β > 0, and Γ(·) denotes the gamma function.
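To make the link between the two parameters and the reported means and standard deviations concrete, the sketch below converts in both directions. The forward formulas are the standard moments of the beta distribution. The reverse step uses the method of moments, which is our assumption for illustration; the text does not say how the "best fit" parameters were actually estimated, so this should not be read as the original fitting procedure. The function names are ours.

```python
import math

def beta_mean_sd(alpha, beta):
    """Mean and standard deviation implied by beta parameters alpha, beta."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, math.sqrt(variance)

def beta_from_mean_sd(mean, sd):
    """Method-of-moments parameters (an assumed estimator, not necessarily
    the one used in the source) for a beta distribution on (0, 1)."""
    common = mean * (1 - mean) / sd ** 2 - 1
    if common <= 0:
        raise ValueError("sd is too large for a beta distribution with this mean")
    return mean * common, (1 - mean) * common

# Round trip using the parameters printed on Figure III.1.
m, s = beta_mean_sd(4.01, 1.85)
alpha_hat, beta_hat = beta_from_mean_sd(m, s)
print(alpha_hat, beta_hat)
```

A sample mean and standard deviation of purity can be passed to `beta_from_mean_sd` to obtain rough starting values before any more careful fit.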


Figure III.1: Cumulative Distribution of Purity Data, and Best Fit Beta Distribution at the 1 Ounce Level in Chicago in 1985

[Figure: cumulative distribution of the data and the cumulative best fit beta distribution (α = 4.01, β = 1.85). Horizontal axis: Purity, 0.00 to 1.00. Vertical axis: Cumulative Distribution, 0.00 to 1.00.]

The advantage of fitting a beta distribution is that it allows one to draw the probability density function, which gives a better intuitive understanding of changes in purity than does just looking at means or even means and standard deviations. What follows are three examples of beta distributions derived from STRIDE data showing differences in purities across cities, quantities, and time. For each example, it is instructive to first examine the mean and standard deviation for each distribution and try to picture what is happening, then look at the plot of the beta distributions. Examining the beta distributions gives a better sense of the changes than analyzing the mean and standard deviation and implicitly assuming that purity is distributed normally.

The first example is Figure III.2, which shows the beta distributions derived for 4 cities in 1987 at the 1 ounce level. The means and standard deviations derived from the distributions are as follows:

CITY           MEAN   STD. DEV.
New Orleans    0.71   0.14
Minneapolis    0.72   0.14
Detroit        0.85   0.09
Philadelphia   0.86   0.08

These data suggest that purity is somewhat, but not dramatically, higher in Detroit and Philadelphia than in Minneapolis and New Orleans. In Figure III.2, however, the high-purity cities jump out as having different distributions. Not only is mean purity higher than in the low-purity cities, but cocaine of very high purity is also much more common.

Figure III.2: Estimated Beta Distributions of Purity for 1987, Two High Purity and Two Low Purity Cities, 1 Ounce Level Purchases

[Figure: estimated beta probability densities for Minneapolis, Philadelphia, Detroit, and New Orleans. Horizontal axis: Purity, 0.00 to 1.00. Vertical axis: Probability Density, 0.00 to 0.14.]

One can similarly construct examples for different purchase quantities in the same city. For example, Figure III.3 shows the beta distributions derived for 3 different quantity levels in New York in 1988. The means and standard deviations derived from the distributions are as follows:

QUANTITY       MEAN   STD. DEV.
1 Ounce        0.80   0.14
1/8 Kilogram   0.86   0.09
1 Kilogram     0.88   0.07

One can see from these data that the average purity gradually increases with increasing transaction size. What is not so obvious, until one examines plots of the corresponding beta distributions, is that the bulk of the distribution shifts towards higher purities as well.


Figure III.3: Estimated Beta Distributions of Purity for 3 Different Quantity Levels in New York, 1988

[Figure: estimated beta probability densities for 1 kilogram, 1/8 kilogram, and 1 ounce purchases. Horizontal axis: Purity, 0.00 to 1.00. Vertical axis: Probability Density, 0.00 to 0.16.]

Finally, one can examine purity in one city, at one market level, but over time. For example, Figure III.4 shows the distribution of purities derived for Chicago at the 1 ounce level in 1985, 1986, and 1987. The means and standard deviations of the distributions are as follows:

YEAR   MEAN   STD. DEV.
1985   0.70   0.14
1986   0.78   0.14
1987   0.83   0.11

They suggest a steady progression. The plot below shows, however, that as purity increased in Chicago, it did so in a two stage process. In 1986, the distribution essentially shifted rightward, with not much change in shape (similar to what the mean and standard deviation indicate). Then in 1987 the mode of the distribution only shifted slightly, but the shape of the distribution changed.


Figure III.4: Estimated Beta Distributions of Purity for Three Years in Chicago, 1 Ounce Level

[Figure: estimated beta probability densities for 1985, 1986, and 1987. Horizontal axis: Purity, 0.00 to 1.00. Vertical axis: Probability Density, 0.00 to 0.12.]

In addition to making it easier to visualize how purity varies with time, place, and quantity, discovering that purity can be described by a beta distribution suggests a way of reporting the range in purity. Typically purity ranges are reported with no description of how they were obtained (e.g., what has been discarded as an outlier). One could, however, use all available observations to estimate parameters of a beta distribution, and then report the 5th and 95th percentiles (or 10th and 90th) of the distribution specified by those parameters.9

To demonstrate this we randomly divided the purity data from Chicago in 1985 at the 1 ounce level into two groups. We then estimated the best fit beta distribution parameters for each half and found the 5th and 95th percentiles of the distributions. As can be seen from Table III.3, reporting just the ranges of purities observed might lead one to conclude that purity observations were lower in Half #1 than in Half #2 even though, by construction, they were drawn from the same distribution. The percentiles of the beta distributions, in contrast, are almost identical. The explanation for this is that the 50 original observations had only one case that was below 0.29 purity, and that case happened to end up in Half #1. One could argue that the case with 0.19 purity is an outlier, and report the data ranges without it. But, by estimating beta distributions and reporting their percentiles, no exclusion of outliers is necessary, since the best fit beta distribution is much less sensitive to outliers than the data range. Furthermore, when one reports the actual observed range, the width of the interval will tend to increase as more data points are collected. The procedure proposed here, in contrast, does not suffer that bias.

9The ranges produced by this procedure would not give a 90 percent confidence interval (the intervals are too narrow) because the parameters are not known. Nevertheless, they can be instructive.

Table III.3: Alternative Procedure for Describing Purity Ranges

                       RANGE OF        5th, 95th PERCENTILES OF
                       OBSERVATIONS    BEST FIT BETA DISTRIBUTION
Original Data (n=50)   0.19 - 0.93     0.36, 0.94
Half #1                0.19 - 0.93     0.37, 0.94
Half #2                0.29 - 0.91     0.35, 0.93
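Once beta parameters are in hand, the percentile range proposed above can be computed numerically. The following is a minimal sketch using only the Python standard library: it integrates the density with Simpson's rule and inverts the cumulative distribution by bisection. The function names are ours, the sketch assumes both parameters are at least 1 (so the density stays finite at the endpoints), and the parameters shown are the ones printed on Figure III.1 rather than the fits behind Table III.3, which the text does not report.

```python
import math

def beta_pdf(x, a, b):
    """Beta density; this sketch assumes a >= 1 and b >= 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * x ** (a - 1) * (1 - x) ** (b - 1)

def beta_cdf(x, a, b, steps=2000):
    """Cumulative probability up to x by Simpson's rule (steps must be even)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * beta_pdf(i * h, a, b)
    return total * h / 3

def beta_percentile(p, a, b):
    """Value below which a fraction p of the distribution lies (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

low = beta_percentile(0.05, 4.01, 1.85)
high = beta_percentile(0.95, 4.01, 1.85)
print(f"5th percentile {low:.2f}, 95th percentile {high:.2f}")
```

A statistics library with a built-in beta quantile function would serve equally well; the hand-rolled version is shown only to keep the example self-contained.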

2. Variability in Purchases of Similar Price

The previous section considered repeated observations from the same market level, in the same city and year. One can also examine repeated observations of the same dollar value in the same city and year. Unfortunately there are simply fewer instances of 30 or more similar observations when grouping data this way, so the findings here are tentative. Our preliminary analyses suggest, though, that when groups of observations are selected this way, purity is still reasonably well described by a beta distribution, and both quantity and pure quantity are roughly normally distributed. It seems, however, that the coefficients of variation of net quantity for observations of similar price were much larger and more varied than were the coefficients of variation of price for observations of similar quantity, with the majority ranging from 0.0 up to 0.6. The coefficients of variation for pure price were similar, with the majority ranging from 0.1 to 0.7.

3. Variability in Standardized Prices10

Chapter II argued that there are advantages to creating price series by standardizing the prices of individual transactions. One is interested, therefore, in characterizing the distribution of standardized prices in a given location at a given time for several reasons. For example, one needs to understand this variability in order to decide what, if any, significance to attach to observed changes in price between locations or over time. The favored means of standardizing prices is to adjust for expected purity:

\[
\text{Standardized Price \#1} = \frac{\text{Observed Price}}{(\text{Quantity} \times \text{Expected Purity})^{\beta}} \tag{1}
\]

But one can also consider adjusting separately for quantity and purity:

\[
\text{Standardized Price \#2} = \frac{\text{Observed Price}}{\text{Quantity}^{\beta_1} \times \text{Observed Purity}^{\beta_2}} \tag{2}
\]

adjusting for pure quantity:

\[
\text{Standardized Price \#3} = \frac{\text{Observed Price}}{(\text{Quantity} \times \text{Observed Purity})^{\beta}} \tag{3}
\]

and adjusting for only quantity:

\[
\text{Standardized Price \#4} = \frac{\text{Observed Price}}{(\text{Quantity})^{\beta}} \tag{4}
\]

10Material here is based on Caulkins (1993).
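The four standardizing formulas can be compared side by side with a small sketch. Everything numeric here is hypothetical: the transactions are invented, and the exponent value 0.8 is a placeholder chosen only so the code runs, not an estimate from this report. Expected purity is taken as the average observed purity of the purchases, and the function names are ours.

```python
from statistics import mean, stdev

# Hypothetical transactions: (price in dollars, quantity in grams, observed purity).
purchases = [(100, 1.0, 0.60), (180, 2.0, 0.75), (90, 1.0, 0.50), (250, 3.0, 0.80)]
BETA = 0.8  # placeholder exponent, NOT a value from the report

expected_purity = mean([purity for _, _, purity in purchases])

def method_1(price, qty, purity):
    return price / (qty * expected_purity) ** BETA

def method_2(price, qty, purity, beta_1=BETA, beta_2=BETA):
    # Separate exponents for quantity and purity (both placeholders here).
    return price / (qty ** beta_1 * purity ** beta_2)

def method_3(price, qty, purity):
    return price / (qty * purity) ** BETA

def method_4(price, qty, purity):
    return price / qty ** BETA

def coefficient_of_variation(values):
    return stdev(values) / mean(values)

standardized = {
    name: [fn(*row) for row in purchases]
    for name, fn in [("#1", method_1), ("#2", method_2),
                     ("#3", method_3), ("#4", method_4)]
}
for name, values in standardized.items():
    print(name, round(coefficient_of_variation(values), 3))
```

Because methods #1 and #4 differ only by the constant factor (expected purity raised to the exponent), their coefficients of variation are identical for any set of purchases that share one expected purity.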

Standardized prices by all four methods11 were computed for the city with the most retail cocaine observations (San Diego) and the city with the most wholesale cocaine observations (New York)12 for the years for which there were sufficient data to perform meaningful analyses. In every case the distribution of standardized prices was unimodal and roughly centered on its mean. The distributions were not normal, however, because they were skewed right slightly (i.e., they had a large right-hand tail) and had more extreme values than a normal distribution would predict. In fact, once outliers were eliminated (defined in this case as having prices standardized by method #4 outside of the range $25 - $250), the distributions were often approximately lognormal.

Hence, strictly speaking, familiar tests designed for the normal distribution are not valid. Furthermore, since the distributions are not always lognormal (even after truncating the tails), other distribution-specific methods are also not truly correct, suggesting that one should rely on non-parametric methods such as the Kruskal-Wallis test. However, the familiar t-tests are relatively robust with respect to departures from normality. Since the observed distributions are basically bell-shaped, as long as the number of data points is not too small, the familiar tests should not lead one too far astray.

One would also like to know which formula gives the least variable standardized prices, because the less variable repeated observations are, the better the chance of detecting true underlying changes in prices. Table III.4 shows the coefficient of variation of prices standardized with each method by year. (Since the standardized prices produced by methods #1 and #4 for any given year differ by a constant, they have the same coefficient of variation.) The table shows that the third method gives the least concentrated distribution of standardized prices; the variability with each of the other three methods is similar, so any one of them might reasonably be chosen. That the third method gives the "noisiest" standardized prices is not surprising because it is a function of the purity observed in the individual transaction, a quantity which is itself quite variable. Had the variability in purity been negatively correlated with variation in total quantity for a given price, the third method might have done better. But this does not appear to be the case. Hence, by this criterion, adjusting prices based on observed pure quantity is the least desirable method of standardizing prices. Among the other methods it appears that the coefficient of variation for retail purchases is between 0.35 and 0.5. For wholesale purchases it is between 0.2 and 0.3.

11Expected purity was found simply as the average purity of purchases of that type (retail or wholesale) in that city and year.

12Retail was defined as including observations of between 0.1 and 4 grams; wholesale was defined as 100 grams to 10 kilograms. Observations were excluded if the standardized price by method #4 was more than 8 times or less than 1/8 times the typical value. San Diego had at least 75 such retail observations for every year between 1981 and 1990. New York had at least 35 such wholesale observations for every year between 1985 and 1991.

Table III.4: Coefficient of Variation of Repeated Observations Computed by Different Standardizing Formulas

RETAIL (SAN DIEGO)
YEAR   Method #1   Method #2   Method #3   Method #4
1981   0.376       0.375       0.497       0.376
1982   0.340       0.344       0.459       0.340
1983   0.466       0.458       0.464       0.466
1984   0.377       0.385       0.506       0.377
1985   0.705       0.697       0.746       0.705
1986   0.388       0.387       0.424       0.388
1987   0.498       0.475       0.596       0.498
1988   0.468       0.467       0.444       0.468
1989   0.493       0.488       0.536       0.493
1990   0.314       0.309       0.353       0.314

WHOLESALE (NEW YORK), yearly values in the order printed
Method #1   Method #2   Method #3   Method #4
0.209       0.214       0.291       0.209
0.286       0.288       0.295       0.286
0.237       0.234       0.385       0.237
0.307       0.308       0.389       0.307
0.253       0.252       0.346       0.253
0.323       0.354       0.348       0.323
0.236       0.236       0.332       0.236

4. Implications for Detecting Variation in Prices

The most important implication of showing that prices, for a given weight, location, and time, are roughly normally distributed is the ability to use simple and familiar methods to test for the statistical significance of differences between sets of price data. For instance, given data on one-ounce purchases of cocaine made in Chicago in 1989 and 1990, is the difference in average price between these two samples "significant"? That is, is it relatively unlikely one would see that large or larger a change in the observed sample average if there had not in fact been a true change in the market price? The first portion of this section describes how one would go about answering such questions. One is interested, however, not just in avoiding the mistake of seeing price changes when they are not there, but also in maximizing the probability of detecting a price change that does indeed exist. This second concept is known as the "power" of a statistical test, and it is discussed in the second part of this section.

Testing for Significance of Differences

"Significance" in statistics describes the amount of confidence one can have that an observed difference between samples does in fact reflect a true difference in the underlying distributions. In this context, the samples are sets of purchase prices from, for example, different years, and a true difference in the underlying distributions means that the market price has changed over the intervening time.13

The procedure customarily used for determining the significance of differences between two distributions that are assumed to be normal, but for which the standard deviation of the population must be estimated from the sample, is called a t-test. This test involves computing the difference between the two sample means and dividing it by the standard error to obtain a score, which is compared to the t-distribution parameterized by the number of degrees of freedom in the two samples combined. The t-distribution is similar to the normal distribution, but has heavier tails because one does not know what the standard deviations of the populations truly are. As sample sizes get very large, this difference shrinks.

To illustrate, below are the mean price, standard deviation, and number of cocaine purchase observations in STRIDE at the one-ounce level in Chicago in 1989 and 1990 (with outliers left in).

YEAR   MEAN       STD. DEV.   N
1989   $913.20    138.3       36
1990   $1259.50   227.2       57

The difference between the two means is $346.30. The standard error of the difference is simply the sum of the sample standard deviations divided by the square root of the sum of the numbers of observations, i.e.,

\[
\frac{138.3 + 227.2}{\sqrt{36 + 57}} = 37.9.
\]
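The arithmetic of the Chicago example can be reproduced with a short sketch. The first function follows the standard-error formula exactly as stated above. The second function is our addition for comparison only: it is the unequal-variance standard error found in many statistics textbooks, and is not the formula used in this report. The function names are ours.

```python
import math

def documented_t_score(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t-score using the standard error as stated in the text: the sum of the
    two sample standard deviations over the square root of the total count."""
    standard_error = (sd_a + sd_b) / math.sqrt(n_a + n_b)
    t_score = abs(mean_b - mean_a) / standard_error
    degrees_of_freedom = (n_a - 1) + (n_b - 1)
    return t_score, standard_error, degrees_of_freedom

def unequal_variance_se(sd_a, n_a, sd_b, n_b):
    """Textbook alternative (NOT from the source), shown only for comparison."""
    return math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

# Chicago, one-ounce level, 1989 versus 1990 (figures from the table above).
t, se, df = documented_t_score(913.20, 138.3, 36, 1259.50, 227.2, 57)
print(round(se, 1), round(t, 2), df)
print(round(unequal_variance_se(138.3, 36, 227.2, 57), 1))
```

The two standard errors need not agree in general; they move apart when the two samples have very different standard deviations, so it is worth checking both before relying on a borderline result.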

Dividing the difference between the two means by 37.9 gives a t-score of 9.14. The sum of the degrees of freedom of the two distributions is (36-1) + (57-1) = 91. Checking a table of values from the t-distribution shows that we can reject the null hypothesis that the distributions of price in 1989 and 1990 had the same mean at the 0.001 significance level. That is, it is very likely that the price increase observed in the data reflects a true change in the market price, not just random noise.

One can similarly ask whether prices in Baltimore and Washington in 1990 at the one-ounce level are in fact different, as their sample means suggest (again with outliers left in).

13Paradoxically, small numbers denote a high degree of significance. For example, if the difference between samples is found to be significant at the 0.001 level, one can be quite sure that there truly is a difference between the underlying distributions: only one time in a thousand (0.001) would such a difference arise if the samples had been drawn from the same underlying distribution. One would require better odds to accept a bet about the existence of a difference which was found to be significant only at the 0.05 level, reflecting a difference that would be produced by mere chance one time in twenty.


CITY         MEAN       STD. DEV.   N
Baltimore    $1275.00   130.3       20
Washington   $1497.50   634.6       20

The difference between the means is $222.50 and the standard error is 120.9, giving a t-score of 1.84. Consulting a table of t-distribution values with (20-1) + (20-1) = 38 degrees of freedom shows that one cannot reject, at the 0.05 level, the null hypothesis that the distributions of one-ounce prices in Baltimore and Washington in 1990 had the same mean. This does not, of course, mean that they were the same, but rather that the evidence for a difference is weak. Indeed, if one first removes outliers (by the method described above) and then conducts the test, the difference would appear to be significant at the 0.005 level, again illustrating the importance of dealing with outliers carefully.

The Power of Tests for Price Differences

As discussed above, the power of a statistical test is defined as the probability it will conclude that the observed difference between two samples is significant given that they did indeed come from different distributions. The power of a test is governed by three factors: (1) the number of observations in each sample, (2) how variable repeated observations from the same distribution are, and (3) the size of the true difference one is trying to detect. Not surprisingly, the more data one collects and the larger the true difference, the easier it is to find that difference. On the other hand, the more variable the data within a sample are, the harder it is to pick out differences between samples.

The best way to illustrate the power of tests is with diagrams, known as power curves. In order to draw such curves on a two-dimensional piece of paper, one needs to collapse two of the three factors affecting the power into a single dimension. One can do this by expressing the size of the true difference as a multiple (Δ) of the mean, describing the variability in terms of the coefficient of variation (Cs), and using the ratio of the two to capture their combined effect.

This concept is best explained by example. Suppose the average price of one-ounce cocaine purchases in a city were $1,000, the coefficient of variation for such purchases was typically 0.15, and 16 one-ounce purchases were made each year.14 If the true market price of an ounce of cocaine dropped by $50, to $950, what is the probability that this drop could be detected by the t-test procedure described above? Since $50 is 5% of $1,000, the true price difference measured in multiples of the average price is Δ = $50/$1000 = 0.05. Hence, the ratio of the percentage change in mean to the coefficient of variation is 0.05/0.15 = 0.33. Figure III.5 shows that the probability of detecting such a change when n = 16 purchase observations are made each year is quite small. If, however, the price change had been $150, then the percentage change would be Δ = $150/$1000 = 0.15, so the ratio of the price change to the coefficient of variation would be 1.0, and the probability of detecting such a change would have been over 70%. More generally, one can see that, given the coefficient of variation and number of purchases made in each year, it would be quite difficult to detect price changes smaller than 10% (implying ratios less than 0.67), but very easy to notice changes of 25% or more (implying ratios greater than 1.67).

Consider another example. Suppose one were working with prices that had been standardized for transaction size. It was shown above that a typical coefficient of variation for standardized prices is 0.4. It is not uncommon to have 30 purchase observations made in a city in a year if one considers all transaction sizes. With that many data points, the fact that standardized prices tend to follow a lognormal rather than a normal distribution is relatively unimportant, and the t-test is valid. The figure indicates that in such a circumstance, one would generally not be able to detect changes in prices of less than 20%, but changes of 40% or more would be easy to detect.

14For ease of explication all examples given here assume that the number of data points in each sample is the same, but it is a simple matter to derive power curves for any particular set of sample sizes.

Figure III.5: Ability of a t-test to Detect a Change in Price as a Function of the Size of the Price Change

[Figure: power curves for n = 31, n = 16, n = 11, and n = 5. Horizontal axis: Ratio of Percentage Change in Mean Over the Coefficient of Variation, 0 to 4.]

Sometimes it is useful to look at power curves in another way. Suppose one were designing a program of drug purchases specifically to monitor changing market conditions and were trying to decide how many purchases to make in each period. The more purchases, the greater the cost and the greater the risk to the agents. The fewer the purchases, the less reliable the information. There is obviously a trade-off to be made, and one might wish to know how much more information additional purchases provide. Figure III.6 attempts to provide such information.

Figure III.6: Effect of Increasing Sample Size on the Ability of a t-test to Detect a Change in Price

[Figure: power curves for Δ/Cs values of 2.00, 1.50, 1.25, 1.00, 0.75, and 0.50. Horizontal axis: Number of Data Points, 0 to 40. Vertical axis: 0 to 1.]

Suppose that the coefficient of variation for purchases at the level in question were 0.2,15 and one wanted to have at least an 80% chance of detecting price changes of 25% or more. In that case Δ = 0.25, so the ratio Δ/Cs = 1.25. Referring to the figure, this implies that one would need to collect about 11 data points (in each period) to satisfy that criterion. To have an 80% chance of detecting price changes as small as 15%, one would have to increase data collection to almost 30 observations (per period).
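Readers without access to the figures can approximate the power curves with the sketch below. It is an approximation under stated assumptions, not the calculation behind Figures III.5 and III.6: it uses the normal distribution in place of the t-distribution (so it slightly overstates power for small samples), and it assumes a two-sided test at the 0.05 level with equal sample sizes, none of which the text states explicitly. The function names are ours.

```python
import math
from statistics import NormalDist

def approx_power(ratio, n, alpha=0.05):
    """Approximate power of a two-sided, two-sample test.
    ratio: change in mean as a fraction of the mean, divided by the
           coefficient of variation (Delta / Cs in the text).
    n:     number of observations in EACH sample (assumed equal)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = ratio * math.sqrt(n / 2)  # true difference in standard-error units
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

def smallest_n(ratio, target=0.8, alpha=0.05, n_max=1000):
    """Smallest per-period sample size whose approximate power reaches target."""
    for n in range(2, n_max + 1):
        if approx_power(ratio, n, alpha) >= target:
            return n
    return None

# The $50 and $150 examples from the text (Cs = 0.15, n = 16 per year).
print(round(approx_power(0.05 / 0.15, 16), 2))
print(round(approx_power(0.15 / 0.15, 16), 2))

# Sample sizes for an 80% chance of detecting 25% and 15% changes when Cs = 0.2.
print(smallest_n(0.25 / 0.2), smallest_n(0.15 / 0.2))
```

Because of the normal approximation, the sample sizes this returns should be treated as lower bounds on what an exact t-based calculation would require.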

5. Summary

This chapter examined variability in repeated purchases. The first section examined repeated purchases of the same, or similar, size in one city and one year. Prices for such purchases were found to be roughly normally distributed, at least after outliers were removed, with coefficients of variation typically between 0.1 and 0.2. Purities could be usefully described by beta distributions; doing so gave a sense of the "shape" of the distribution of purities, as well as its central tendency and variance.
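As an illustration of describing purities with a beta distribution, the sketch below fits the two beta parameters by the method of moments. The fitting method is an assumption made for this example (the summary does not say how the distributions were fitted), the purity values are invented for illustration, and `fit_beta_moments` is a hypothetical name.

```python
from statistics import fmean, variance


def fit_beta_moments(purities):
    """Method-of-moments estimates (a, b) of a beta distribution.

    purities: fractions strictly between 0 and 1 (e.g. 0.62 for 62% pure)
    """
    m = fmean(purities)
    v = variance(purities)  # sample variance
    if not 0 < m < 1 or not 0 < v < m * (1 - m):
        raise ValueError("moments are not compatible with a beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


if __name__ == "__main__":
    # Hypothetical purities, not data from the report.
    sample = [0.55, 0.62, 0.70, 0.48, 0.66, 0.59, 0.73, 0.51]
    a, b = fit_beta_moments(sample)
    print(f"a = {a:.2f}, b = {b:.2f}, fitted mean = {a / (a + b):.3f}")
```

The two fitted parameters together determine the shape of the distribution as well as its mean and variance, which is what makes the beta family convenient for quantities bounded between 0 and 1.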

¹⁵ We chose a relatively high value on the assumption that most intelligence-driven programs would focus on street-level purchases.


Clusters of data for purchases of the same or similar dollar amount in the same city and year are less common, so only preliminary analyses were performed. Those analyses suggest that the beta distribution continues to apply for purity and that both the weight and pure weight of observations of similar price are roughly normally distributed. However, the weights of similarly priced observations seem to be much more variable than the prices of similarly sized observations.

When the prices of individual observations are standardized for differences in weight, the resulting standardized prices are not normally distributed. They appear, instead, to be roughly lognormally distributed, although they contain more outliers than a lognormal distribution would. Nevertheless, since the lognormal distribution is not entirely irregular, tests such as the t-test are relatively robust with respect to the nature of the data. When one includes purchases of all transaction sizes there are often many observations, so it may be reasonable to apply to these data statistical tests that are designed for the normal distribution. The coefficient of variation of these standardized prices is typically between 0.35 and 0.5.

Because of the near-normality of many of these price data, standard methods, such as t-tests, can be applied to assess the statistical significance of observed differences between two samples of prices. Likewise, one can construct power curves which describe the likelihood of detecting a true difference in distributions as a function of the number of data points collected, the variability of the data, and the size of the true difference. Such power curves can also be used to design programs of purchases made for intelligence




Appendix

Table A.1: Median Standardized Cocaine Prices at the Gram Level by City and Year (1991 dollars)

Year  Chicago      Detroit      New York     Philadelphia San Diego    Washington   National
      Price    n   Price    n   Price    n   Price    n   Price    n   Price    n   Price    n
1982   $334   20    $263    4    $342   10    $560   10    $247   97    $568    7    $363  148
1983    326   12     357   16     156   16     279    9     223   80     282    9     230  142
1984    314   47     320   22     148   10     314   13     203   70     462   27     231  189
1985    263   12     196   26     214   26     156    4     185   74     408    9     222  151
1986    274    6     197   67     149   55     122   30     150  104     262   17     180  279
1987    156    4     136    7     122   27     102   22     100  132     238   16     131  208
1988    121   21     167   12     136   37      70   27      94   93     122    8     124  198
1989     80   10      82   12      72   58      63   61      94   81     133   17      77  239
1990    117    6     111   12     172   88      96   13      94   27     189   52     144  198
1991     79    7     113   13      83   20      77   28      68   14     175   46      86  128

Table A.2: Median Standardized Cocaine Prices at the Ounce Level by City and Year (1991 dollars)

Year  Atlanta       Boston        Chicago       Minneapolis   Detroit       New Orleans
       Price    n    Price    n    Price    n    Price    n    Price    n    Price    n
1983  $3,221    9   $4,051   10   $3,587   24   $4,519    6   $5,193    3   $5,345   13
1984   3,957   20    3,176   17    3,114   29    5,553   11    3,742   18    4,055   32
1985   3,353   17    3,526   13    3,313   62    4,607   34    3,192   14    3,549   60
1986   2,542   35    2,517   20    2,482   62    3,712   34    2,611   39    2,790   34
1987   2,419   13    1,920   26    1,901   84    2,715   41    2,066   35    1,975   38
1988   1,385   40    1,593    8    1,323   48    2,071   17    1,210   21    1,600   50
1989   1,290   26    1,528   18    1,149   51    1,909    6      992   21    1,333   42
1990   1,593   20    2,435   25    1,727   72    2,262   10    1,240   16    1,914   18
1991   1,420   28    1,430   12    1,387   55    1,612   12    1,113   30    1,663   34

Year  New York      San Diego     St. Louis     Washington    Baltimore     National
       Price    n    Price    n    Price    n    Price    n    Price    n    Price    n
1983  $3,163   19   $3,144   26   $4,871   23   $4,319   20   $4,781    9   $3,697  142
1984   2,465   19    2,748    9    4,643   21    4,071   11    4,812   17    3,117  193
1985   2,519   47    2,435   19    4,770   40    2,813   20    2,638   42    2,924  348
1986   1,876   48    2,044   32    3,981   28    2,956   33    3,472   43    2,334  375
1987   1,576   41    1,195   43    2,879   29    2,301   17    2,792   26    1,836  376
1988   1,165   26    1,098   46    1,943   24    1,906   22    1,838   31    1,328  311
1989   1,021   35      882   30    1,527   26    1,508   26    1,508   25    1,142  280
1990   1,154   19      883    9    2,056   16    1,847   37    1,840   20    1,430  225
1991   1,165   11      812   15    2,092   18    1,616   60    1,632   15    1,279  230

Table A.3: Median Standardized Cocaine Prices at the Kilogram Level by City and Year (1991 dollars)

Year  Boston         Chicago        New York       National
        Price    n     Price    n     Price    n     Price    n
1985  $95,107    4   $47,413    6   $40,550   38   $45,322   48
1986   29,701    5    47,941    9    29,555   54    34,515   68
1987   27,297   12    39,538    5    22,649   81    27,410   98
1988   28,862   11    24,132   31    19,341   70    21,120  112
1989   27,885   33    21,640   17    17,649   65    19,247  115
1990   30,031   43    31,222   10    24,330   54    26,433  107
1991   23,477   42    24,673   22    17,862   33    19,925   97
1992   25,648   40    20,180   12    18,982   13    19,648   65
