Abstract
- The law of total probability, Bayes' formula, and their applications
- Posterior probability and probability as a measure of belief
Complete event group (partition)
- Let $I=\{1,\cdots,n\}$ be a finite index set, and let $\Omega$ be the sample space of experiment $E$.
- If $\{B_i;i\in I\}$ satisfies:
  - $\bigcup_{i=1}^{n}B_i=\Omega$
  - $B_iB_j=\varnothing,\ i\neq j$
- then $\{B_i;i\in I\}$ is called a complete event group of $\Omega$, also known as a partition.
Basic property
- Given a complete event group $\{B_i;i\in I\}$, every sample point (every outcome of a trial) belongs to exactly one $B_i$.
Law of total probability
- Let $S$ be the sample space of experiment $E$, and let $A$ be an event of $E$.
- If $\{B_i\mid i\in I\}$ is a partition of $S$ with $P(B_i)>0$ for each $i\in I$, then
- $P(A)=\sum\limits_{i\in I}P(A|B_i)P(B_i)$
- Proof preliminaries:
  - Clearly $AB_i\subset B_i$, and $B_iB_j=\varnothing$, so $(AB_i)(AB_j)=A(B_iB_j)=A\varnothing=\varnothing$ for $i\neq j$.
  - By the multiplication rule, $P(A|B_i)P(B_i)=P(AB_i)$.
- Proof 1:
  - $\sum\limits_{i\in I}P(A|B_i)P(B_i)=\sum\limits_{i\in I}P(AB_i)$
  - $=P(AB_1\cup AB_2\cup\cdots\cup AB_n)$ (finite additivity, since the $AB_i$ are pairwise disjoint)
  - $=P\left(A\cap\left(\bigcup\limits_{i\in I}B_i\right)\right)$
  - $=P(A\Omega)$
  - $=P(A)$
- Proof 2:
  - $A=AS=A(B_1\cup\cdots\cup B_n)=AB_1\cup\cdots\cup AB_n$
  - $P(A)=P(AB_1\cup\cdots\cup AB_n)=\sum\limits_{i=1}^{n}P(AB_i)=\sum\limits_{i\in I}P(A|B_i)P(B_i)$
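The law of total probability is easy to sanity-check on a small finite sample space; a minimal Python sketch (the fair-die example below is illustrative, not taken from the text):

```python
from fractions import Fraction

# A toy finite, equally likely sample space: one roll of a fair die
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability of an event on the equally likely space omega."""
    return Fraction(len(event & omega), len(omega))

# A partition of omega, and an event A = {the roll is even}
partition = [{1, 2}, {3, 4}, {5, 6}]
A = {2, 4, 6}

# Law of total probability: P(A) = sum_i P(A|B_i) P(B_i),
# where P(A|B_i) = P(A B_i) / P(B_i)
total = sum(prob(A & B) / prob(B) * prob(B) for B in partition)

print(total, prob(A))  # both are 1/2
```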
Example
- A box contains 20 balls.
- The probabilities that the box contains 0, 1, or 2 defectives are 0.8, 0.1, 0.1 respectively.
- Let $A_i$ = {the box contains $i$ defectives}; then:
  - $P(A_0)=0.8$
  - $P(A_1)=P(A_2)=0.1$
- Let $B$ = {all 4 sampled items are conforming}.
- What is the probability of event $B$?
- Clearly $A_0,A_1,A_2$ form a partition for experiment $E$ = {observe the number of defectives in the box}, whose sample space is $\{0,1,2\}$.
- Event $B$ belongs to experiment $F$ = {observe the number of defectives among the 4 sampled items}.
- Since the reduced sample space under each $A_i$ is known, the following conditional probabilities are easily computed directly from the classical (equally likely) model, not by expanding the conditional-probability formula (which would lead back in a circle):
  - $P(B|A_0)=\frac{\binom{20}{4}}{\binom{20}{4}}=1$
  - $P(B|A_1)=\frac{\binom{19}{4}}{\binom{20}{4}}=\frac{19\cdot18\cdot17\cdot16}{20\cdot19\cdot18\cdot17}=\frac{4}{5}$ (the denominator $\binom{20}{4}$ counts all equally likely ways to draw 4 of the 20 balls)
  - $P(B|A_2)=\frac{\binom{18}{4}}{\binom{20}{4}}=\frac{12}{19}$
- By the law of total probability, $P(B)=\sum\limits_{i=0}^{2}P(B|A_i)P(A_i)=1\times0.8+\frac{4}{5}\times0.1+\frac{12}{19}\times0.1\approx0.943$
- Note: here the sample space of experiment $F$ happens to coincide with that of $E$, but if the box could contain 4 or more defectives, the sample space of $F$, $\{0,1,2,3,4\}$, would be contained in that of $E$.
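The example above can be verified with exact arithmetic; a minimal Python sketch:

```python
from fractions import Fraction
from math import comb

# Prior: probability that the box of 20 balls contains i defectives
prior = {0: Fraction(8, 10), 1: Fraction(1, 10), 2: Fraction(1, 10)}

def p_all_conforming(i, total=20, sample=4):
    """P(B | A_i): all `sample` drawn balls conforming, given i defectives."""
    return Fraction(comb(total - i, sample), comb(total, sample))

# Law of total probability: P(B) = sum_i P(B | A_i) P(A_i)
p_b = sum(p_all_conforming(i) * p for i, p in prior.items())

print(p_all_conforming(1))    # 4/5
print(p_all_conforming(2))    # 12/19
print(round(float(p_b), 3))   # 0.943
```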
Bayes' formula
- Bayes' formula builds on the law of total probability above; it is a more composite formula, but the underlying principle is simple.
- Let $S$ be the sample space of experiment $E$, and let $A$ be an event of $E$.
- If $\{B_i\mid i\in I\}$ is a partition of $S$ with $P(A)>0$ and $P(B_i)>0$ for each $i\in I$, then
- $P(B_i|A)=\frac{P(AB_i)}{P(A)}=\frac{P(A|B_i)P(B_i)}{\sum_{j=1}^{n}P(A|B_j)P(B_j)}$, $i=1,\cdots,n$; this is Bayes' formula.
- Proof: by the definition of conditional probability, the multiplication rule, and the law of total probability, Bayes' formula follows immediately:
  - $P(B_i|A)=\frac{P(AB_i)}{P(A)}$
  - $P(AB_i)=P(A|B_i)P(B_i)$
- Note that in Bayes' formula the quantity sought is $P(B_i|A)$, so expanding the numerator as $P(AB_i)=P(B_i|A)P(A)$ would be circular and leave the formula uncomputable;
- instead, expand it as $P(AB_i)=P(A|B_i)P(B_i)$,
- and expand the denominator with the law of total probability: $P(A)=\sum_{i=1}^{n}P(A|B_i)P(B_i)$.
Example
- Defective-part sourcing problem: a batch of parts comes from three suppliers.

| Supplier | Defect rate | Share of supply |
|---|---|---|
| 1 | 0.02 | 0.15 |
| 2 | 0.01 | 0.80 |
| 3 | 0.03 | 0.05 |

- Experiment: draw one part from the batch.
- $A$ = {the drawn part is defective}
- $B_i$ = {the drawn part comes from supplier $i$}
- $P(B_1)=0.15$; $P(A|B_1)=0.02$
- $P(B_2)=0.80$; $P(A|B_2)=0.01$
- $P(B_3)=0.05$; $P(A|B_3)=0.03$
- Find the probability that the drawn part is defective:
  - By the law of total probability: $P(A)=\sum\limits_{i=1}^{3}P(A|B_i)P(B_i)=0.02\times0.15+0.01\times0.80+0.03\times0.05=0.0125$
- If the drawn part turns out to be defective, what is the probability it came from supplier $i$ $(i=1,2,3)$?
- By Bayes' formula:
  - $P(B_i|A)=\frac{P(B_iA)}{P(A)}=\frac{P(A|B_i)P(B_i)}{P(A)}$
- Computing each:
  - $P(B_1|A)=\frac{0.02\times0.15}{0.0125}=0.24$
  - $P(B_2|A)=\frac{0.01\times0.80}{0.0125}=0.64$
  - $P(B_3|A)=\frac{0.03\times0.05}{0.0125}=0.12$
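The supplier example takes only a few lines of Python; a sketch (the `suppliers` dictionary mirrors the table above):

```python
# Supplier data from the table: supplier -> (defect rate P(A|B_i), share P(B_i))
suppliers = {1: (0.02, 0.15), 2: (0.01, 0.80), 3: (0.03, 0.05)}

# Law of total probability: P(A) = sum_i P(A|B_i) P(B_i)
p_defect = sum(rate * share for rate, share in suppliers.values())

# Bayes' formula: P(B_i|A) = P(A|B_i) P(B_i) / P(A)
posterior = {i: rate * share / p_defect for i, (rate, share) in suppliers.items()}

print(round(p_defect, 4))                              # 0.0125
print({i: round(p, 2) for i, p in posterior.items()})  # {1: 0.24, 2: 0.64, 3: 0.12}
```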
Common special form with complementary events
- The total probability and Bayes formulas are used most often when $n=2$, i.e. when the events $B,\overline{B}$ form a partition of the sample space.
- In that case, respectively:
  - $P(A)=P(AB)+P(A\overline{B})=P(A|B)P(B)+P(A|\overline{B})P(\overline{B})$
  - $P(B|A)=\frac{P(AB)}{P(A)}=\frac{P(A|B)P(B)}{P(A|B)P(B)+P(A|\overline{B})P(\overline{B})}$
Prior and posterior probability
Example
- Machine and product-quality problem
- Suppose that when the machine is working properly, its products are conforming with probability 0.9; otherwise the conforming rate is 0.3.
- When the machine is started up, it is working properly with probability 0.75 (the prior probability).
- One day the machine's first product turns out to be conforming; what is the probability that the machine is working properly?
- Analysis:
  - $A$ = {the first product is conforming}
  - $B$ = {the machine is working properly}
  - The probability sought is $P(B|A)$.
- From the assumptions:
  - $P(A|B)=0.9$; $P(A|\overline{B})=0.3$
  - $P(B)=0.75$, $P(\overline{B})=0.25$
- $B,\overline{B}$ form a partition of the sample space (the machine is either working properly or it is not).
- By the law of total probability, $P(A)=P(A|B)P(B)+P(A|\overline{B})P(\overline{B})=0.9\times0.75+0.3\times0.25=0.75$
- Then by Bayes' formula, $P(B|A)=\frac{P(A|B)P(B)}{P(A)}=\frac{0.9\times0.75}{0.75}=0.9$,
- i.e. given that the first product is conforming, the posterior probability that the machine is working properly is 0.9.
- The posterior probability is a revision of the prior probability.
- There are two schools of interpretation for prior and posterior probabilities:
  - Objective (frequentist): among all days on which the first product is conforming, on average 90 out of 100 have the machine working properly.
  - Subjective (Bayesian): the probabilities reflect people's differing degrees of belief about the machine's state before and after the observation.
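The two-event form of Bayes' formula used in this example is easy to wrap in a small helper; a sketch (the function name is illustrative):

```python
def posterior_two_events(prior_b, p_a_given_b, p_a_given_not_b):
    """P(B|A) via Bayes' formula with the partition {B, not-B}."""
    # Law of total probability: P(A) = P(A|B)P(B) + P(A|~B)P(~B)
    p_a = p_a_given_b * prior_b + p_a_given_not_b * (1 - prior_b)
    return p_a_given_b * prior_b / p_a

# Machine example: prior 0.75; conforming rates 0.9 (normal) and 0.3 (faulty)
print(round(posterior_two_events(0.75, 0.9, 0.3), 3))  # 0.9
```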
Probability as a measure of belief about events
- Take Aesop's fable "The Boy Who Cried Wolf" as an example.
- $A$ = {the boy lies}; $B$ = {the boy is trustworthy}. Suppose a trustworthy boy is **relatively unlikely to lie**, say with probability 0.1, i.e. $P(A|B)=0.1$; conversely, an untrustworthy boy lies with probability 0.5, i.e. $P(A|\overline{B})=0.5$.
- Suppose the villagers initially regard the boy as trustworthy with probability 0.8, i.e. $P(B)=0.8$.
- Then the probability that the boy lies is $P(A)=P(A|B)P(B)+P(A|\overline{B})P(\overline{B})=0.1\times0.8+0.5\times0.2=0.18$
- Now suppose the boy has lied once; by Bayes' formula, compute the probability that he is trustworthy:
- $P(B|A)=\frac{P(BA)}{P(A)}=\frac{P(A|B)P(B)}{P(A|B)P(B)+P(A|\overline{B})P(\overline{B})}=\frac{0.1\times0.8}{0.18}=\frac{4}{9}\approx0.444$
- That is, the posterior probability that the boy is trustworthy drops to 0.444.
- If the boy lies again, set $P(B)=0.444$ and compute the posterior again in the same way:
- $P(B|A)=\frac{0.444\times0.1}{0.444\times0.1+0.556\times0.5}\approx0.138$
- So after the boy has lied twice, the posterior probability that he is trustworthy has fallen to 0.138; to the villagers he now seems almost untrustworthy.
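The repeated belief updates can be scripted; a sketch in Python (the code carries the exact posterior forward rather than the rounded value 0.444, which only affects later decimal places):

```python
def bayes_update(prior, p_lie_trust=0.1, p_lie_untrust=0.5):
    """Posterior P(trustworthy | lied), given prior P(trustworthy)."""
    p_lie = p_lie_trust * prior + p_lie_untrust * (1 - prior)  # total probability
    return p_lie_trust * prior / p_lie

belief = 0.8           # initial belief that the boy is trustworthy
for _ in range(2):     # the boy lies twice
    belief = bayes_update(belief)
    print(round(belief, 3))
# 0.444
# 0.138
```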
Supplement
Chain rule for conditional probability
- In probability theory, the chain rule (also called the general product rule) permits the calculation of any member of the joint distribution of a set of random variables using only conditional probabilities.
- The rule is useful in the study of Bayesian networks, which describe a probability distribution in terms of conditional probabilities.
- More generally, applying the multiplication rule repeatedly yields the formula below; it looks complicated but is quite natural to use.
- $$P\left(\prod_{i=1}^{n}A_i\right)=P\left(\left(\prod_{i=1}^{n-1}A_i\right)A_n\right)=P\left(A_n\,\Big|\,\prod_{i=1}^{n-1}A_i\right)P\left(\prod_{i=1}^{n-1}A_i\right)$$
- Writing the general term as $P_k=P\left(\prod_{i=1}^{k}A_i\right)$ and $T(k)=\prod_{i=1}^{k}A_i$, then for $k=n,n-1,n-2,\cdots,1$:
- $$P_k=P(T(k))=P\left(A_k\mid T(k-1)\right)P(T(k-1))$$
- In particular:
  - $P_2=P(A_1A_2)=P(A_2|A_1)P(A_1)$
  - $P_3=P(A_1A_2A_3)=P(A_3|A_1A_2)P(A_1A_2)=P(A_3|A_1A_2)P(A_2|A_1)P(A_1)$
- Similarly:
  - $P_4=P(A_1A_2A_3A_4)=P(A_4|A_1A_2A_3)P(A_1A_2A_3)=P(A_4|A_1A_2A_3)P(A_3|A_1A_2)P(A_2|A_1)P(A_1)$
- In general, $P_n=\prod\limits_{i=1}^{n}P\left(A_{n-i+1}\Big|\prod\limits_{j=0}^{n-i}A_j\right)$, where $A_0$ is defined as the certain event, so that $P(A_0)=1$.
- Strictly speaking, $\prod_{j=1}^{n}A_j$ should be written $\bigcap\limits_{j=1}^{n}A_j$, denoting the product (intersection) of events.
- Other forms:
- $$P_n=\prod_{i=1}^{n}P\left(A_{n-i+1}\,\Big|\,\bigcap_{j=0}^{n-i}A_j\right)=\prod_{i=1}^{n}P\left(A_{i}\,\Big|\,\bigcap_{j=1}^{i-1}A_j\right)$$
- (The two products are equal because the event product is commutative: reordering the factors of an intersection does not change its meaning. By convention, the empty intersection $\bigcap\limits_{j=1}^{0}A_j$ is the certain event and is omitted from the condition.)
- Usually the conditional probabilities on the right-hand side are easy to compute,
- typically by working in the reduced (shrunken) sample space of each condition;
- otherwise, other methods of computing the probability of a product of events may be needed.
More than two events
- For more than two events $A_1,\ldots,A_n$ the chain rule extends to the formula $P(A_n\cap\ldots\cap A_1)=P(A_n\mid A_{n-1}\cap\ldots\cap A_1)\cdot P(A_{n-1}\cap\ldots\cap A_1)$, which by induction may be turned into $P(A_n\cap\ldots\cap A_1)=\prod\limits_{k=1}^{n}P\left(A_k\,\Big|\,\bigcap\limits_{j=1}^{k-1}A_j\right)$.
Example
- With four events ($n=4$), the chain rule is
  $$\begin{aligned}P(A_1\cap A_2\cap A_3\cap A_4)&=P(A_4\mid A_3\cap A_2\cap A_1)\cdot P(A_3\cap A_2\cap A_1)\\&=P(A_4\mid A_3\cap A_2\cap A_1)\cdot P(A_3\mid A_2\cap A_1)\cdot P(A_2\cap A_1)\\&=P(A_4\mid A_3\cap A_2\cap A_1)\cdot P(A_3\mid A_2\cap A_1)\cdot P(A_2\mid A_1)\cdot P(A_1)\end{aligned}$$
Example
- Drawing balls of several colors without replacement
- Suppose an urn contains 5 red, 3 black, and 2 white balls.
- Question: what is the probability that a white ball is first drawn on the third draw?
- That is, neither of the first two draws is white.
- For convenience, let $A_i$ = {a white ball is drawn on draw $i$}, $i=1,2,3$;
- if draw $i$ is not white, the event is $\overline{A_i}$.
- $$P=P(\overline{A_1}\,\overline{A_2}A_3)=P(A_3|\overline{A_1}\,\overline{A_2})P(\overline{A_2}|\overline{A_1})P(\overline{A_1})=\frac{2}{10-2}\cdot\frac{8-1}{10-1}\cdot\frac{8}{10}=\frac{2}{8}\cdot\frac{7}{9}\cdot\frac{8}{10}=\frac{7}{45}$$
- Here $P(B_n|B_{n-1}\cdots B_1)$ denotes the probability of event $B_n$ on the next draw, given that $n-1$ balls have already been drawn.
- For example, $P(\overline{A_2}|\overline{A_1})$ is the probability that, given one ball has already been drawn (and it was not white), the next ball drawn is again not white.
- In fact, a reasonably practiced high-school student can write $p=\frac{2}{8}\cdot\frac{7}{9}\cdot\frac{8}{10}$ directly.
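The chain-rule computation for the urn example can be checked with exact fractions; a minimal sketch:

```python
from fractions import Fraction

# Urn: 5 red + 3 black = 8 non-white balls, 2 white, 10 total; draws are
# without replacement. Chain rule, factor by factor:
#   P(not-white 1st) * P(not-white 2nd | 1st) * P(white 3rd | first two)
non_white, total, white = 8, 10, 2

p = (Fraction(non_white, total)            # P(~A1)       = 8/10
     * Fraction(non_white - 1, total - 1)  # P(~A2 | ~A1) = 7/9
     * Fraction(white, total - 2))         # P(A3 | ~A1 ~A2) = 2/8

print(p)  # 7/45
```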
More than two random variables (chain rule for multiple random variables)
- Consider an indexed collection of random variables $X_1,\ldots,X_n$ taking possible values $x_1,\dots,x_n$ respectively.
- Then, to find the value of this member of the joint distribution, we can apply the definition of conditional probability to obtain:
- $$P(X_n=x_n,\cdots,X_1=x_1)=P(X_n=x_n\mid X_{n-1}=x_{n-1},\ldots,X_1=x_1)\cdot P(X_{n-1}=x_{n-1},\ldots,X_1=x_1)$$
- Repeating this process with each final term and letting $A_k$ denote the event $X_k=x_k$ creates the product:
- $$P\left(\bigcap_{k=1}^{n}A_k\right)=\prod_{k=1}^{n}P\left(A_k\,\Big|\,\bigcap_{j=1}^{k-1}A_j\right)=\prod_{k=1}^{n}P\left(X_k=x_k\mid X_1=x_1,\dots,X_{k-1}=x_{k-1}\right).$$
Example
- With four variables ($n=4$), denote $P(x_n\mid x_{n-1},\dots,x_1):=P(X_n=x_n\mid X_{n-1}=x_{n-1},\dots,X_1=x_1)$ for brevity.
- Then the chain rule produces this product of conditional probabilities:
  $$\begin{aligned}P(x_4,x_3,x_2,x_1)&=P(x_4\mid x_3,x_2,x_1)\cdot P(x_3,x_2,x_1)\\&=P(x_4\mid x_3,x_2,x_1)\cdot P(x_3\mid x_2,x_1)\cdot P(x_2,x_1)\\&=P(x_4\mid x_3,x_2,x_1)\cdot P(x_3\mid x_2,x_1)\cdot P(x_2\mid x_1)\cdot P(x_1)\end{aligned}$$
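A chain-rule factorization can be used directly as a representation of a joint distribution; a minimal sketch with made-up conditional tables for three binary variables (all numbers below are illustrative):

```python
from itertools import product

# Hypothetical joint distribution of binary X1, X2, X3, specified by a
# chain-rule factorization P(x1) P(x2|x1) P(x3|x1,x2).
p_x1 = {0: 0.6, 1: 0.4}
p_x2_given_x1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}          # p[x1][x2]
p_x3_given_x12 = {(x1, x2): {0: 0.5, 1: 0.5}                        # p[(x1,x2)][x3]
                  for x1 in (0, 1) for x2 in (0, 1)}

def joint(x1, x2, x3):
    # Chain rule: P(x1, x2, x3) = P(x1) P(x2|x1) P(x3|x1,x2)
    return p_x1[x1] * p_x2_given_x1[x1][x2] * p_x3_given_x12[(x1, x2)][x3]

# The factorized probabilities sum to 1 over all 2^3 assignments
total = sum(joint(*xs) for xs in product((0, 1), repeat=3))
print(round(total, 10))  # 1.0
```

This is exactly the representation used by Bayesian networks: each variable stores only a conditional table given its predecessors, and the joint probability of any assignment is recovered by the chain-rule product.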