
tensions of this basic technique that need to be made when analyzing economic data.

We start with a basic question: What is regression analysis? Regression analysis is concerned with describing and evaluating the relationship between a given variable (often called the explained or dependent variable) and one or more other variables (often called the explanatory or independent variables). We will denote the explained variable by y and the explanatory variables by x_1, x_2, ..., x_k.

The dictionary definition of "regression" is "backward movement, a retreat, a return to an earlier stage of development." Paradoxical as it may sound, regression analysis as it is currently used has nothing to do with regression as dictionaries define the term.

The term regression was coined by Sir Francis Galton (1822-1911) from England, who was studying the relationship between the height of children and the height of parents. He observed that although tall parents had tall children and short parents had short children, there was a tendency for children's heights to converge toward the average. There is thus a "regression of children's height toward the average." Galton, in his aristocratic way, termed this a "regression toward mediocrity."

Something similar to what Galton found has been noted in some other studies as well (studies of test scores, etc.). These examples are discussed in Section 3.12 under the heading "regression fallacy." For the present we should note that regression analysis as currently used has nothing to do with regression or backward movement.

Let us return to our notation, with the explained variable denoted by y and the explanatory variables denoted by x_1, x_2, ..., x_k. If k = 1, that is, there is only one of the x-variables, we have what is known as simple regression. This is what we discuss in this chapter. If k > 1, that is, there is more than one x-variable, we have what is known as multiple regression. This is discussed in Chapter 4. First we give some examples.

Example 1: Simple Regression

y = sales

x = advertising expenditures

Here we try to determine the relationship between sales and advertising expenditures.

Example 2: Multiple Regression

y = consumption expenditures of a family

x_1 = family income

x_2 = financial assets of the family

x_3 = family size




Here we try to determine the relationship between consumption expenditures on the one hand and family income, financial assets of the family, and family size on the other.

There are several objectives in studying these relationships. They can be used to:

1. Analyze the effects of policies that involve changing the individual x's. In Example 1 this involves analyzing the effect of changing advertising expenditures on sales.

2. Forecast the value of y for a given set of x's.

3. Examine whether any of the x's have a significant effect on y.

In Example 2 we would want to know whether family size has a significant effect on consumption expenditures of the family. The exact meaning of the word "significant" is discussed in Section 3.5.

The way we have set up the problem until now, the variable y and the x-variables are not on the same footing. Implicitly we have assumed that the x's are variables that influence y or are variables that we can control or change, and that y is the effect variable. There are several alternative terms used in the literature for y and x_1, x_2, ..., x_k. These are shown in Table 3.1.

Each of these terms is relevant for a particular view of the use of regression analysis. Terminology (a) is used if the purpose is prediction. For instance, sales is the predictand and advertising expenditures is the predictor. The terminology in (b), (c), and (d) is used by different people in their discussion of regression models. They are all equivalent terms. Terminology (e) is used in studies of causation. Terminology (f) is specific to econometrics. We use this terminology in Chapter 9. Finally, terminology (g) is used in control problems. For instance, our objective might be to achieve a certain level of sales (target variable) and we would like to determine the level of advertising expenditures (control variable) to achieve our objective.

In this and subsequent chapters we use the terminology in (c) and (d). Also, we consider here the case of one explained (dependent) variable and one ex-

Table 3.1 Classification of Variables in Regression Analysis

      y                       x_1, x_2, ..., x_k

(a)   Predictand              Predictors
(b)   Regressand              Regressors
(c)   Explained variable      Explanatory variables
(d)   Dependent variable      Independent variables
(e)   Effect variable         Causal variables
(f)   Endogenous variable     Exogenous variables
(g)   Target variable         Control variables



3.2 Specification of the Relationships

As mentioned in Section 3.1, we will discuss the case of one explained (dependent) variable, which we denote by y, and one explanatory (independent) variable, which we denote by x. The relationship between y and x is denoted by

y = f(x)    (3.1)

where f(x) is a function of x. At this point we need to distinguish between two types of relationships:

1. A deterministic or mathematical relationship.

2. A statistical relationship, which does not give unique values of y for given values of x but can be described exactly in probabilistic terms.

What we are going to talk about in regression analysis here is relationships of type 2, not of type 1. As an example, suppose that the relationship between sales y and advertising expenditures x is

y = 2500 + 100x - x^2

This is a deterministic relationship. The sales for different levels of advertising expenditures can be determined exactly. These are as follows:

  x       y

  0    2500
 20    4100
 50    5000
100    2500
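
The deterministic case can be checked with a few lines of Python. This is only a minimal sketch: the function name `sales` is hypothetical, and the advertising levels 0, 20, 50, and 100 are taken here as illustrative inputs.

```python
def sales(x):
    """Deterministic sales for advertising expenditure x: y = 2500 + 100x - x^2."""
    return 2500 + 100 * x - x ** 2


# Illustrative advertising levels (an assumption of this sketch).
levels = [0, 20, 50, 100]

# Each x gives exactly one value of y; nothing is random here.
table = [(x, sales(x)) for x in levels]

for x, y in table:
    print(f"x = {x:>3}   y = {y}")
```

Because the relationship is deterministic, calling `sales` twice with the same x always returns the same y.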

On the other hand, suppose that the relationship between sales y and advertising expenditures x is

y = 2500 + 100x - x^2 + u

where u = +500 with probability 1/2
        = -500 with probability 1/2

Then the values of y for different values of x cannot be determined exactly but can be described probabilistically. For example, if advertising expenditures are 50, sales will be 5500 with probability 1/2 and 4500 with probability 1/2. The values of y for different values of x are now as follows:
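
The possible sales values at each advertising level can be enumerated directly from the equation. The following is a minimal sketch under stated assumptions: the names `sales_distribution` and `expected_sales` are hypothetical, and the advertising levels 0, 20, 50, and 100 are illustrative inputs.

```python
from fractions import Fraction


def sales_distribution(x):
    """Return {sales value: probability} for y = 2500 + 100x - x^2 + u,
    where u is +500 or -500, each with probability 1/2."""
    systematic = 2500 + 100 * x - x ** 2
    half = Fraction(1, 2)
    return {systematic + 500: half, systematic - 500: half}


def expected_sales(x):
    """Probability-weighted average of the possible sales values at x."""
    return sum(y * p for y, p in sales_distribution(x).items())


# Illustrative advertising levels (an assumption of this sketch).
levels = [0, 20, 50, 100]

for x in levels:
    outcomes = ", ".join(
        f"{y} with probability {p}" for y, p in sales_distribution(x).items()
    )
    print(f"x = {x:>3}:  {outcomes}")
```

For a given x there are now two possible values of y rather than one. Averaging over u gives back the deterministic part of the equation, since +500 and -500 are equally likely.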

planatory (independent) variable. This, as we said earlier, is called simple regression. Further, as we said earlier, the variables y and x are not treated on the same footing. A detailed discussion of this issue is postponed to Chapter 11.


