# Chapter 4 Probability: Studying Randomness

## Randomness and Probability

- Random: Process where the outcome in a particular trial is not known in advance, although a distribution of outcomes may be known for a long series of repetitions
- Probability: The proportion of time a particular outcome will occur in a long series of repetitions of a random process
- Independence: When the outcome of one trial does not affect the probabilities of outcomes of subsequent trials

## Probability Models

- Probability Model:
  - Listing of possible outcomes
  - Probability corresponding to each outcome
- Sample Space (S): Set of all possible outcomes of a random process
- Event: Outcome or set of outcomes of a random process (subset of S)
- Venn Diagram: Graphic description of a sample space and events

## Rules of Probability

- The probability of an event A, denoted P(A), must lie between 0 and 1 ($0 \le P(A) \le 1$)
- For the sample space S, P(S) = 1
- Disjoint events have no common outcomes. For 2 disjoint events A and B, P(A or B) = P(A) + P(B)

- The complement of an event A is the event that A does not occur, denoted $A^c$. $P(A) + P(A^c) = 1$
- The probability of any event A is the sum of the probabilities of the individual outcomes that make up the event when the sample space is finite

## Assigning Probabilities to Events

- Assign probabilities to each individual outcome and add up the probabilities of all outcomes comprising the event
- When each outcome is equally likely, count the number of outcomes corresponding to the event and divide by the total number of outcomes
- Multiplication Rule: A and B are independent events if knowledge that one occurred does not affect the probability that the other has occurred. If A and B are independent, then P(A and B) = P(A)P(B)
- The multiplication rule extends to any finite number of independent events

## Example - Casualties at Gettysburg

Results from the Battle of Gettysburg

Counts:

| Army | Killed | Wounded | Captured/Missing | Safe (Survival) | Total |
|---|---|---|---|---|---|
| North | 3155 | 14525 | 5365 | 72324 | 95369 |
| South | 2592 | 12709 | 12227 | 49972 | 77500 |

Proportions:

| Army | Killed | Wounded | Captured/Missing | Safe (Survival) | Total |
|---|---|---|---|---|---|
| North | 0.0331 | 0.1523 | 0.0563 | 0.7584 | 1.0000 |
| South | 0.0334 | 0.1640 | 0.1578 | 0.6448 | 1.0000 |

- Killed, Wounded, and Captured/Missing are considered casualties. What is the probability that a randomly selected Northern soldier was a casualty? A Southern soldier?
- Obtain the distribution across armies

## Random Variables

- Random Variable (RV): Variable that takes on the value of a numeric outcome of a random process
- Discrete RV: Can take on a finite (or countably infinite) set of possible outcomes
- Probability Distribution: List of values a random variable can take on and their corresponding probabilities
  - Individual probabilities must lie between 0 and 1
  - Probabilities sum to 1
- Notation:
  - Random variable: X
  - Values X can take on: $x_1, x_2, \ldots, x_k$
  - Probabilities: $P(X=x_1) = p_1, \ldots, P(X=x_k) = p_k$

## Example: Wars Begun by Year (1482-1939)

- Distribution of numbers of wars started by year
- X = # of wars started in a randomly selected year
- Levels: $x_1=0,\ x_2=1,\ x_3=2,\ x_4=3,\ x_5=4$
- Probability Distribution (table and histogram):

| # Wars | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Probability | 0.5284 | 0.3231 | 0.1070 | 0.0328 | 0.0087 |
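The two requirements on a probability distribution, together with the complement rule and the addition rule for disjoint outcomes, can be checked directly on the wars distribution. The following is a minimal Python sketch, not part of the original slides; the helper names `is_valid_distribution` and `prob_event` and the tolerance `tol` are illustrative assumptions.

```python
# Distribution of X = number of wars begun in a randomly selected year (values from the table).
wars_dist = {0: 0.5284, 1: 0.3231, 2: 0.1070, 3: 0.0328, 4: 0.0087}


def is_valid_distribution(dist, tol=1e-9):
    """Return True if every probability lies in [0, 1] and the probabilities sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    sums_to_one = abs(sum(dist.values()) - 1.0) <= tol
    return in_range and sums_to_one


def prob_event(dist, event):
    """P(event): add the probabilities of the individual (disjoint) outcomes in the event."""
    return sum(p for x, p in dist.items() if x in event)


valid = is_valid_distribution(wars_dist)
# Complement rule: P(at least one war) = 1 - P(no wars)
p_at_least_one = 1.0 - wars_dist[0]
# Addition rule for disjoint outcomes: P(X >= 2) = P(X=2) + P(X=3) + P(X=4)
p_two_or_more = prob_event(wars_dist, {2, 3, 4})

print(valid, round(p_at_least_one, 4), round(p_two_or_more, 4))
```

With the tabled probabilities this gives P(at least one war) of about 0.4716 and P(two or more wars) of about 0.1485.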

[Histogram: Years (vertical axis) by Wars (horizontal axis)]

## Masters Golf Tournament 1st Round Scores

| Score | Frequency | Probability |
|---|---|---|
| 63 | 1 | 0.000288 |
| 64 | 2 | 0.000576 |
| 65 | 6 | 0.001728 |
| 66 | 16 | 0.004608 |
| 67 | 46 | 0.013249 |
| 68 | 67 | 0.019297 |
| 69 | 151 | 0.043491 |
| 70 | 238 | 0.068548 |
| 71 | 337 | 0.097062 |
| 72 | 428 | 0.123272 |
| 73 | 467 | 0.134505 |
| 74 | 498 | 0.143433 |
| 75 | 397 | 0.114343 |
| 76 | 293 | 0.084389 |
| 77 | 203 | 0.058468 |
| 78 | 125 | 0.036002 |
| 79 | 78 | 0.022465 |
| 80 | 50 | 0.014401 |
| 81 | 28 | 0.008065 |
| 82 | 17 | 0.004896 |
| 83 | 7 | 0.002016 |
| 84 | 7 | 0.002016 |
| 85 | 4 | 0.001152 |
| 86 | 3 | 0.000864 |
| 87 | 1 | 0.000288 |
| 88 | 2 | 0.000576 |

[Histogram: Frequency (vertical axis) by Score (horizontal axis)]

## Continuous Random Variables

- Variable can take on any value along a continuous range of numbers (interval)
- Probability distribution is described by a smooth density curve
- Probabilities of ranges of values for X correspond to areas under the density curve
  - Curve must lie on or above the horizontal axis
  - Total area under the curve is 1
- Special case: Normal distributions
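To make "probabilities are areas under the density curve" concrete, here is a small Python sketch for the Normal special case. It is an illustration only: the function names, the default parameters `mu=0` and `sigma=1`, and the integration range of -8 to 8 are assumptions, not values from the slides.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the Normal(mu, sigma) density curve at x (never negative)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a Normal(mu, sigma) variable, using the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a < X < b): the area under the density curve between a and b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


def area_by_midpoints(a, b, n=20000):
    """Approximate the area under the standard Normal density on [a, b] with a midpoint sum."""
    width = (b - a) / n
    return sum(normal_pdf(a + (i + 0.5) * width) for i in range(n)) * width


p_within_one_sd = normal_prob_between(-1.0, 1.0)
total_area = area_by_midpoints(-8.0, 8.0)

print(round(p_within_one_sd, 4), round(total_area, 6))
```

The first number is the area within one standard deviation of the mean (about 0.68), and the second approximates the total area under the curve, which should be very close to 1.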

## Means and Variances of Random Variables

- Mean: Long-run average a random variable will take on (also the balance point of the probability distribution)
- Expected Value is another term for the mean; however, we do not really expect that a realization of X will necessarily be close to its mean. Notation: E(X)
- Mean of a discrete random variable:

$$E(X) = \mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum x_i p_i$$

## Examples - Wars & Masters Golf

| # Wars | 0 | 1 | 2 | 3 | 4 | Sum |
|---|---|---|---|---|---|---|
| Probability | 0.5284 | 0.3231 | 0.1070 | 0.0328 | 0.0087 | 1.0000 |
| x*p | 0.0000 | 0.3231 | 0.2140 | 0.0983 | 0.0349 | 0.6703 |

$\mu = 0.67$

| Score | prob | x*p |
|---|---|---|
| 63 | 0.000288 | 0.0181 |
| 64 | 0.000576 | 0.0369 |
| 65 | 0.001728 | 0.1123 |
| 66 | 0.004608 | 0.3041 |
| 67 | 0.013249 | 0.8877 |
| 68 | 0.019297 | 1.3122 |
| 69 | 0.043491 | 3.0009 |
| 70 | 0.068548 | 4.7984 |
| 71 | 0.097062 | 6.8914 |
| 72 | 0.123272 | 8.8756 |
| 73 | 0.134505 | 9.8188 |
| 74 | 0.143433 | 10.6141 |
| 75 | 0.114343 | 8.5757 |
| 76 | 0.084389 | 6.4136 |
| 77 | 0.058468 | 4.5020 |
| 78 | 0.036002 | 2.8082 |
| 79 | 0.022465 | 1.7748 |
| 80 | 0.014401 | 1.1521 |
| 81 | 0.008065 | 0.6532 |
| 82 | 0.004896 | 0.4015 |
| 83 | 0.002016 | 0.1673 |
| 84 | 0.002016 | 0.1694 |
| 85 | 0.001152 | 0.0979 |
| 86 | 0.000864 | 0.0743 |
| 87 | 0.000288 | 0.0251 |
| 88 | 0.000576 | 0.0507 |
| Sum | 1 | 73.54 |

$\mu = 73.54$

## Statistical Estimation/Law of Large Numbers

- In practice we won't know $\mu$ but will want to estimate it
- We can select a sample of individuals and observe the sample mean: $\bar{x}$
- By selecting a large enough sample size we can be very confident that our sample mean will be arbitrarily close to the true parameter value
- Margin of error measures the upper bound (with a high level of confidence) on our sampling error. It decreases as the sample size increases

## Rules for Means

- Linear Transformations: a + bX (where a and b are constants): $E(a+bX) = \mu_{a+bX} = a + b\mu_X$
- Sums of random variables: X + Y (where X and Y are random variables): $E(X+Y) = \mu_{X+Y} = \mu_X + \mu_Y$
- Linear Functions of Random Variables: $E(a_1X_1+\cdots+a_nX_n) = a_1\mu_1+\cdots+a_n\mu_n$ where $E(X_i)=\mu_i$

## Example: Masters Golf Tournament

- Mean by Round (Note ordering): $\mu_1=73.54$, $\mu_2=73.07$, $\mu_3=73.76$, $\mu_4=73.91$
- Mean Score per hole (18) for round 1: $E((1/18)X_1) = (1/18)\mu_1 = (1/18)73.54 = 4.09$
- Mean Score versus par (72) for round 1: $E(X_1-72) = \mu_{X_1-72} = 73.54-72 = +1.54$ (1.54 over par)
- Mean Difference (Round 1 - Round 4): $E(X_1-X_4) = \mu_1 - \mu_4 = 73.54 - 73.91 = -0.37$
- Mean Total Score: $E(X_1+X_2+X_3+X_4) = \mu_1+\mu_2+\mu_3+\mu_4 = 73.54+73.07+73.76+73.91 = 294.28$ (6.28 over par)

## Variance of a Random Variable

- Variance: Measure of the spread of the probability distribution. Average squared deviation from the mean
- Standard Deviation: (Positive) Square Root of Variance

$$V(X) = \sigma_X^2 = (x_1-\mu_X)^2 p_1 + \cdots + (x_k-\mu_X)^2 p_k = \sum (x_i-\mu_X)^2 p_i$$

$$\sigma_X^2 = \sum x_i^2 p_i - \mu_X^2 = E(X^2) - \mu_X^2 \quad \text{(useful when X takes on integer values)}$$

## Rules for Variances (X, Y RVs; a, b constants)

$$V(a+bX) = \sigma_{a+bX}^2 = b^2\sigma_X^2$$

$$V(aX+bY) = \sigma_{aX+bY}^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y$$

where $\rho$ is the correlation between X and Y

Special Cases:

- X and Y are independent (outcome of one does not alter the distribution of the other): $\rho = 0$, last term drops out
- a=b=1 and $\rho = 0$: $V(X+Y) = \sigma_X^2 + \sigma_Y^2$
- a=1, b=-1 and $\rho = 0$: $V(X-Y) = \sigma_X^2 + \sigma_Y^2$
- a=b=1 and $\rho \ne 0$: $V(X+Y) = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y$
- a=1, b=-1 and $\rho \ne 0$: $V(X-Y) = \sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y$

## Wars & Masters (Round 1) Golf Scores

| Wars (x) | Prob | $x-\mu$ | $(x-\mu)^2$ | $(x-\mu)^2 p$ |
|---|---|---|---|---|
| 0 | 0.5284 | -0.6703 | 0.4493 | 0.2374 |
| 1 | 0.3231 | 0.3297 | 0.1087 | 0.0351 |
| 2 | 0.1070 | 1.3297 | 1.7681 | 0.1892 |
| 3 | 0.0328 | 2.3297 | 5.4275 | 0.1780 |
| 4 | 0.0087 | 3.3297 | 11.0869 | 0.0965 |
| Sum | 1.0000 | | | 0.7362 |

$\sigma^2 = .7362$, $\sigma = .8580$

| Score | prob | $(x-\mu)^2$ | $(x-\mu)^2 p$ |
|---|---|---|---|
| 63 | 0.000288 | 111.0916 | 0.031996 |
| 64 | 0.000576 | 91.0116 | 0.052426 |
| 65 | 0.001728 | 72.9316 | 0.126034 |
| 66 | 0.004608 | 56.8516 | 0.261989 |
| 67 | 0.013249 | 42.7716 | 0.566674 |
| 68 | 0.019297 | 30.6916 | 0.592263 |
| 69 | 0.043491 | 20.6116 | 0.896415 |
| 70 | 0.068548 | 12.5316 | 0.859021 |
| 71 | 0.097062 | 6.4516 | 0.626207 |
| 72 | 0.123272 | 2.3716 | 0.292352 |
| 73 | 0.134505 | 0.2916 | 0.039222 |
| 74 | 0.143433 | 0.2116 | 0.03035 |
| 75 | 0.114343 | 2.1316 | 0.243734 |
| 76 | 0.084389 | 6.0516 | 0.510691 |
| 77 | 0.058468 | 11.9716 | 0.699952 |
| 78 | 0.036002 | 19.8916 | 0.716143 |
| 79 | 0.022465 | 29.8116 | 0.669731 |
| 80 | 0.014401 | 41.7316 | 0.600974 |
| 81 | 0.008065 | 55.6516 | 0.448803 |
| 82 | 0.004896 | 71.5716 | 0.350437 |
| 83 | 0.002016 | 89.4916 | 0.180427 |
| 84 | 0.002016 | 109.4116 | 0.220588 |
| 85 | 0.001152 | 131.3316 | 0.151304 |
| 86 | 0.000864 | 155.2516 | 0.134146 |
| 87 | 0.000288 | 181.1716 | 0.052181 |
| 88 | 0.000576 | 209.0916 | 0.120444 |
| Sum | 1 | | 9.474503 |

$\sigma^2 = 9.47$

## Masters Scores (Rounds 1 & 4)

- $\mu_1 = 73.54$, $\mu_4 = 73.91$, $\sigma_1^2 = 9.48$, $\sigma_4^2 = 11.95$, $\rho = 0.24$
- Variance of Round 1 scores vs Par: $V(X_1-72) = \sigma_1^2 = 9.48$
- Variance of Sum and Difference of Round 1 and Round 4 Scores:

$$\text{Sum } (X_1+X_4):\quad V(X_1+X_4) = \sigma_1^2 + \sigma_4^2 + 2\rho\sigma_1\sigma_4 = 9.48 + 11.95 + 2(0.24)\sqrt{(9.48)(11.95)} = 9.48 + 11.95 + 5.11 = 26.54$$

$$\text{Difference } (X_1-X_4):\quad V(X_1-X_4) = \sigma_1^2 + \sigma_4^2 - 2\rho\sigma_1\sigma_4 = 9.48 + 11.95 - 2(0.24)\sqrt{(9.48)(11.95)} = 9.48 + 11.95 - 5.11 = 16.32$$

$$\sigma_{X_1+X_4} = \sqrt{26.54} = 5.15 \qquad \sigma_{X_1-X_4} = \sqrt{16.32} = 4.04$$
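The mean and variance formulas and the rule for the variance of a linear combination can be reproduced numerically. The following Python sketch recomputes the wars results and the Round 1 and Round 4 sum and difference variances from the values quoted above; the function names are illustrative assumptions, and small differences from the slide tables can arise because the tabled probabilities are rounded.

```python
import math


def mean(dist):
    """E(X) = sum of x * p over all values of a discrete distribution."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """V(X) = sum of (x - mu)^2 * p, the average squared deviation from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def variance_shortcut(dist):
    """V(X) = E(X^2) - mu^2, the shortcut form of the same quantity."""
    return sum(x * x * p for x, p in dist.items()) - mean(dist) ** 2


def variance_linear_combo(a, b, var_x, var_y, rho):
    """V(aX + bY) = a^2 var_x + b^2 var_y + 2 a b rho sd_x sd_y."""
    return a * a * var_x + b * b * var_y + 2 * a * b * rho * math.sqrt(var_x * var_y)


wars = {0: 0.5284, 1: 0.3231, 2: 0.1070, 3: 0.0328, 4: 0.0087}
mu_wars = mean(wars)
var_wars = variance(wars)
sd_wars = math.sqrt(var_wars)

# Round 1 and Round 4 variances and correlation as quoted on the slide.
var_sum = variance_linear_combo(1, 1, 9.48, 11.95, 0.24)
var_diff = variance_linear_combo(1, -1, 9.48, 11.95, 0.24)

print(round(mu_wars, 4), round(var_wars, 4), round(sd_wars, 4))
print(round(var_sum, 2), round(var_diff, 2))
print(round(math.sqrt(var_sum), 2), round(math.sqrt(var_diff), 2))
```

Passing `b = -1` flips the sign of the correlation term only, which is why the sum and the difference share the same first two terms.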

## General Rules of Probability

- Union of a set of events: Event that any (at least one) of the events occurs
- Disjoint events: Events that share no common sample points. If A, B, and C are pairwise disjoint, the probability of their union is: P(A)+P(B)+P(C)
- Intersection of two (or more) events: The event that both (all) events occur
- Addition Rule: P(A or B) = P(A)+P(B)-P(A and B)
- Conditional Probability: The probability B occurs given A has occurred: P(B|A)
- Multiplication Rule (generalized to conditional prob): P(A and B) = P(A)P(B|A) = P(B)P(A|B)

## Conditional Probability

- Generally interested in the case where one event precedes another temporally (but this is not necessary)
- When P(A) > 0 (otherwise is trivial), and likewise when P(B) > 0:

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} \qquad P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$

- Contingency Table: Table that cross-classifies individuals or probabilities across 2 or more event classifications
- Tree Diagram: Graphical description of cross-classification of 2 or more events

## John Snow London Cholera Death Study

2 Water Companies (Let D be the event of death):

- Southwark & Vauxhall (S): 264913 customers, 3702 deaths
- Lambeth (L): 171363 customers, 407 deaths
- Overall: 436276 customers, 4109 deaths

$$P(D) = \frac{4109}{436276} = .0094 \quad \text{(94 per 10000 people)}$$

$$P(D \mid S) = \frac{3702}{264913} = .0140 \quad \text{(140 per 10000 people)}$$

$$P(D \mid L) = \frac{407}{171363} = .0024 \quad \text{(24 per 10000 people)}$$

Note that the probability of death is almost 6 times higher for S&V customers than for Lambeth customers (this was important in showing how cholera spread).

| Cholera Death | S&V | Lambeth | Total |
|---|---|---|---|
| Yes | 3702 (.0085) | 407 (.0009) | 4109 (.0094) |
| No | 261211 (.5987) | 170956 (.3919) | 432167 (.9906) |
| Total | 264913 (.6072) | 171363 (.3928) | 436276 (1.0000) |

Contingency Table with joint probabilities (in body of table) and marginal probabilities (on edge of table)
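The joint, marginal, and conditional probabilities in the cholera table all follow from the four cell counts. Here is a minimal Python sketch under that reading; the dictionary layout and function names are illustrative assumptions rather than anything from the original study.

```python
# Cell counts from the contingency table: (water company, outcome) -> number of customers.
counts = {
    ("S&V", "death"): 3702,
    ("S&V", "no death"): 261211,
    ("Lambeth", "death"): 407,
    ("Lambeth", "no death"): 170956,
}
total = sum(counts.values())


def joint(company, outcome):
    """Joint probability P(company and outcome): one cell divided by the grand total."""
    return counts[(company, outcome)] / total


def marginal_company(company):
    """Marginal probability P(company): sum the company's cells over outcomes."""
    return sum(n for (c, _), n in counts.items() if c == company) / total


def marginal_outcome(outcome):
    """Marginal probability P(outcome): sum the outcome's cells over companies."""
    return sum(n for (_, o), n in counts.items() if o == outcome) / total


def conditional(outcome, company):
    """P(outcome | company) = P(company and outcome) / P(company)."""
    return joint(company, outcome) / marginal_company(company)


p_d = marginal_outcome("death")
p_d_given_s = conditional("death", "S&V")
p_d_given_l = conditional("death", "Lambeth")
risk_ratio = p_d_given_s / p_d_given_l

print(total, round(p_d, 4), round(p_d_given_s, 4), round(p_d_given_l, 4))
print(round(risk_ratio, 2))
```

The ratio of the two conditional probabilities is the "almost 6 times higher" comparison made above.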

Tree Diagram for the cholera study, obtaining joint probabilities by the multiplication rule (first level: Company; second level: Death):

- Water user
  - S&V (.6072)
    - D (.0140): joint probability .0085
    - $D^c$ (.9860): joint probability .5987
  - L (.3928)
    - D (.0024): joint probability .0009
    - $D^c$ (.9976): joint probability .3919

## Example: Florida lotto

- You select 6 distinct numbers from 1 to 53 (no replacement)
- State randomly draws 6 numbers from 1 to 53
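The lotto setup can be checked numerically. The sketch below is an illustration, not part of the original example: it multiplies the conditional match probabilities one draw at a time using exact fractions, then compares the result with the number of possible 6-number tickets. The function name and its default arguments are assumptions.

```python
from fractions import Fraction
from math import comb


def match_all_probability(pool_size=53, picks=6):
    """Multiply P(match next draw | all earlier draws matched) over the draws."""
    prob = Fraction(1)
    for matched in range(picks):
        # After `matched` successes, (picks - matched) of your numbers remain
        # among the (pool_size - matched) numbers not yet drawn.
        prob *= Fraction(picks - matched, pool_size - matched)
    return prob


p_match = match_all_probability()
n_tickets = comb(53, 6)  # number of distinct 6-number selections from 53

print(p_match)
print(n_tickets)
```

Using `Fraction` keeps the arithmetic exact, so the product can be compared directly with one over the number of equally likely tickets.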

Probability you match all 6 numbers:

- First state draw: P(match 1st) = 6/53
- Given you match 1st, you have 5 left and the state has 52 left: P(match 2nd given matched 1st) = 5/52
- Process continues:
  - P(match 3rd given 1&2) = 4/51
  - P(match 4th given 1&2&3) = 3/50
  - P(match 5th given 1&2&3&4) = 2/49
  - P(match 6th given 1&2&3&4&5) = 1/48
- Multiplication rule:

$$P(\text{match all}) = \frac{6}{53}\cdot\frac{5}{52}\cdot\frac{4}{51}\cdot\frac{3}{50}\cdot\frac{2}{49}\cdot\frac{1}{48} = \frac{1}{22{,}957{,}480}$$

## Bayes's Rule - Updating Probabilities

- Let $A_1,\ldots,A_k$ be a set of events that partition a sample space (mutually exclusive and exhaustive) such that:
  - each set has known $P(A_i) > 0$ (each event can occur)
  - for any 2 sets $A_i$ and $A_j$, $P(A_i \text{ and } A_j) = 0$ (events are disjoint)
  - $P(A_1) + \cdots + P(A_k) = 1$ (each outcome belongs to one of the events)
- If C is an event such that:
  - $0 < P(C) < 1$ (C can occur, but will not necessarily occur)
  - we know the probability C will occur given each event $A_i$: $P(C \mid A_i)$
- Then we can compute the probability of $A_i$ given C occurred:

$$P(A_i \mid C) = \frac{P(A_i \text{ and } C)}{P(C)} = \frac{P(C \mid A_i)P(A_i)}{P(C \mid A_1)P(A_1) + \cdots + P(C \mid A_k)P(A_k)}$$

## Northern Army at Gettysburg

| Regiment | Label | Initial # | Casualties | $P(A_i)$ | $P(C \mid A_i)$ | $P(C \mid A_i)P(A_i)$ | $P(A_i \mid C)$ |
|---|---|---|---|---|---|---|---|
| I Corps | A1 | 10022 | 6059 | 0.1051 | 0.6046 | 0.0635 | 0.2630 |
| II Corps | A2 | 12884 | 4369 | 0.1351 | 0.3391 | 0.0458 | 0.1896 |
| III Corps | A3 | 11924 | 4211 | 0.1250 | 0.3532 | 0.0442 | 0.1828 |
| V Corps | A4 | 12509 | 2187 | 0.1312 | 0.1748 | 0.0229 | 0.0949 |
| VI Corps | A5 | 15555 | 242 | 0.1631 | 0.0156 | 0.0025 | 0.0105 |
| XI Corps | A6 | 9839 | 3801 | 0.1032 | 0.3863 | 0.0399 | 0.1650 |
| XII Corps | A7 | 8589 | 1082 | 0.0901 | 0.1260 | 0.0113 | 0.0470 |
| Cav Corps | A8 | 11501 | 852 | 0.1206 | 0.0741 | 0.0089 | 0.0370 |
| Arty Reserve | A9 | 2546 | 242 | 0.0267 | 0.0951 | 0.0025 | 0.0105 |
| Sum | | 95369 | 23045 | 1 | | 0.2416 = P(C) | 1.0002 |

- Regiments: partition of soldiers ($A_1,\ldots,A_9$). Casualty: event C
- $P(A_i)$ = (size of regiment) / (total soldiers) = (Column 3)/95369
- $P(C \mid A_i)$ = (# casualties) / (regiment size) = (Col 4)/(Col 3)
- $P(C \mid A_i)P(A_i) = P(A_i \text{ and } C)$ = (Col 5)*(Col 6)
- P(C) = sum(Col 7)
- $P(A_i \mid C) = P(A_i \text{ and } C) / P(C)$ = (Col 7)/.2416

## Independent Events

- Two events A and B are independent if P(B|A) = P(B) and P(A|B) = P(A); otherwise they are dependent (not independent)
- Cholera Example: P(D) = .0094, P(D|S) = .0140, P(D|L) = .0024. Not independent (which firm would you prefer?)
- Union Army Example: P(C) = .2416, P(C|A1) = .6046, P(C|A5) = .0156. Not independent: almost 40 times higher risk for A1 than for A5
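The Bayes's rule columns of the Gettysburg table, and the independence comparison for the Union Army example, can be rebuilt from the two count columns alone. The Python sketch below is one way to do it; the variable names are illustrative assumptions, and unrounded results can differ from the tabled values in the last digit.

```python
# (initial strength, casualties) for each unit, copied from the table.
corps = {
    "I Corps": (10022, 6059),
    "II Corps": (12884, 4369),
    "III Corps": (11924, 4211),
    "V Corps": (12509, 2187),
    "VI Corps": (15555, 242),
    "XI Corps": (9839, 3801),
    "XII Corps": (8589, 1082),
    "Cav Corps": (11501, 852),
    "Arty Reserve": (2546, 242),
}

total_soldiers = sum(size for size, _ in corps.values())

# P(Ai): chance a randomly selected soldier belongs to unit i.
prior = {name: size / total_soldiers for name, (size, _) in corps.items()}
# P(C | Ai): casualty rate within unit i.
likelihood = {name: cas / size for name, (size, cas) in corps.items()}
# P(Ai and C) = P(C | Ai) * P(Ai), by the multiplication rule.
joint = {name: likelihood[name] * prior[name] for name in corps}
# P(C): add the joint probabilities over the partition.
p_casualty = sum(joint.values())
# P(Ai | C): Bayes's rule.
posterior = {name: joint[name] / p_casualty for name in corps}

# Independence check: C and Ai are independent only if P(C | Ai) equals P(C).
risk_ratio = likelihood["I Corps"] / likelihood["VI Corps"]

print(total_soldiers, round(p_casualty, 4))
for name in corps:
    print(name, round(prior[name], 4), round(likelihood[name], 4), round(posterior[name], 4))
print(round(risk_ratio, 1))
```

Computed without intermediate rounding, the posterior probabilities sum to exactly 1 (up to floating-point error); the 1.0002 in the table reflects rounding of the individual entries.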