7. Using Newton's Method to Solve Economic Models

See also

GPU acceleration: a version of this lecture that uses JAX to run on a GPU is also available.

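7.1. Overview

Many economic problems involve finding fixed points or zeros (sometimes called "roots") of functions.

For example, in a simple supply and demand model, an equilibrium price is one that makes excess demand zero.

In other words, an equilibrium is a zero of the excess demand function.

There are various algorithms for computing fixed points and zeros.

In this lecture we study an important gradient-based technique called Newton's method.

Newton's method does not always work, but where it is applicable, convergence is often faster than with other methods.

The lecture applies Newton's method in one-dimensional and multidimensional settings to solve fixed-point and zero-finding problems.

The basic ideas behind Newton's method are:

For a fixed-point problem, the function \(f\) is replaced by a linear approximation, which is used to locate the fixed point.
For a zero-finding problem, the current estimate is updated repeatedly by solving for the zero of a linear approximation of \(f\), until it converges to the true zero.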
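The zero-finding idea above can be sketched in one dimension. The block below is a minimal illustration, not the lecture's own implementation: the function `newton_1d` and the single-good excess demand `excess` (with its hand-coded derivative `excess_prime`) are hypothetical names chosen for this example.

```python
import math

def newton_1d(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Scalar Newton iteration: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        # x_new is the zero of the tangent line to f at x
        x_new = x - f(x) / f_prime(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical single-good excess demand (assumed for illustration only)
excess = lambda p: math.exp(-p) + 1.0 - math.sqrt(p)
excess_prime = lambda p: -math.exp(-p) - 0.5 / math.sqrt(p)

p_star = newton_1d(excess, excess_prime, x0=1.0)
print(p_star, excess(p_star))
```

Each pass replaces the function by its tangent line at the current guess and moves to the zero of that line, which is the linear-approximation step described above.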

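To build intuition, we first consider a simple one-dimensional fixed-point problem where the solution is known, and solve it using both successive approximation and Newton's method.

Then we apply Newton's method in multidimensional settings to solve for market equilibria with multiple goods.

Finally, we use the automatic differentiation functionality provided by the autograd package to handle a high-dimensional equilibrium problem.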

!pip install autograd

We use the following imports in this lecture

import matplotlib.pyplot as plt
import matplotlib as mpl
FONTPATH = "fonts/SourceHanSerifSC-SemiBold.otf"
mpl.font_manager.fontManager.addfont(FONTPATH)
plt.rcParams['font.family'] = ['Source Han Serif SC']

from collections import namedtuple
from scipy.optimize import root
from autograd import jacobian
# Thinly wrapped numpy that supports automatic differentiation
import autograd.numpy as np

plt.rcParams["figure.figsize"] = (10, 5.7)

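7.4. Multivariate Newton's Method

In this section, we introduce a two-good problem, visualize it, and solve for the equilibrium of this two-good market using both a zero finder from SciPy and Newton's method.

We then extend the idea to a larger market with 3,000 goods and compare the performance of the two methods again.

We will see that Newton's method compares favorably in both running time and accuracy.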

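7.4.1. A Two-Goods Market Equilibrium

Let's start by computing the market equilibrium of a two-good problem.

We consider a market for two related products, good 0 and good 1, with price vector \(p = (p_0, p_1)\).

Supply of good \(i\) at price \(p\) is

\[ q^s_i (p) = b_i \sqrt{p_i} \]

Demand for good \(i\) at price \(p\) is

\[ q^d_i (p) = \exp(-(a_{i0} p_0 + a_{i1} p_1)) + c_i \]

Here \(c_i\), \(b_i\) and \(a_{ij}\) are parameters.

For example, the two goods might be computer components that are typically used together, in which case they are complements. Hence demand for each depends on the prices of both components.

The excess demand function is

\[ e_i(p) = q^d_i(p) - q^s_i(p), \quad i = 0, 1 \]

An equilibrium price vector \(p^*\) satisfies \(e_i(p^*) = 0\) for \(i = 0, 1\).

For this particular problem, we collect the parameters as

\[\begin{split} A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}, \qquad b = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \qquad \text{and} \qquad c = \begin{pmatrix} c_0 \\ c_1 \end{pmatrix} \end{split}\]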

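7.4.1.1. A Graphical Exploration

Since our problem is only two-dimensional, we can use graphical analysis to visualize and help understand it.

Our first step is to define the excess demand function

\[\begin{split} e(p) = \begin{pmatrix} e_0(p) \\ e_1(p) \end{pmatrix} \end{split}\]

The function below calculates the excess demand for given parameters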

def e(p, A, b, c):
    return np.exp(- A @ p) + c - b * np.sqrt(p)

Our default parameter values will be

\[\begin{split} A = \begin{pmatrix} 0.5 & 0.4 \\ 0.8 & 0.2 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad \text{and} \qquad c = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \end{split}\]

A = np.array([
    [0.5, 0.4],
    [0.8, 0.2]
])
b = np.ones(2)
c = np.ones(2)

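At the price level \(p = (1, 0.5)\), the excess demand is

ex_demand = e((1.0, 0.5), A, b, c)

print(f'The excess demand for good 0 is {ex_demand[0]:.3f} \n'
      f'The excess demand for good 1 is {ex_demand[1]:.3f}')

The excess demand for good 0 is 0.497 
The excess demand for good 1 is 0.699

Next we plot the two functions \(e_0\) and \(e_1\) on a grid of \((p_0, p_1)\) values, using contour and surface plots.

We will use the following function to build the contour plots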

def plot_excess_demand(ax, good=0, grid_size=100, grid_max=4, surface=True):

    # Create a grid_size x grid_size grid of prices
    p_grid = np.linspace(0, grid_max, grid_size)
    z = np.empty((grid_size, grid_size))

    for i, p_1 in enumerate(p_grid):
        for j, p_2 in enumerate(p_grid):
            z[i, j] = e((p_1, p_2), A, b, c)[good]

    if surface:
        cs1 = ax.contourf(p_grid, p_grid, z.T, alpha=0.5)
        plt.colorbar(cs1, ax=ax, format="%.6f")

    ctr1 = ax.contour(p_grid, p_grid, z.T, levels=[0.0])
    ax.set_xlabel("$p_0$")
    ax.set_ylabel("$p_1$")
    ax.set_title(f'Excess Demand for Good {good}')
    plt.clabel(ctr1, inline=1, fontsize=13)

Here is our plot of \(e_0\)

fig, ax = plt.subplots()
plot_excess_demand(ax, good=0)
plt.show()

[Figure: filled contour plot of \(e_0\) over \((p_0, p_1)\), with its zero contour marked]

Here is our plot of \(e_1\)

fig, ax = plt.subplots()
plot_excess_demand(ax, good=1)
plt.show()

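[Figure: filled contour plot of \(e_1\) over \((p_0, p_1)\), with its zero contour marked]

We see the black zero contour, which tells us where \(e_i(p)=0\).

For a price vector \(p\) such that \(e_i(p)=0\), we know that good \(i\) is in equilibrium (demand equals supply).

If these two contour lines cross at some price vector \(p^*\), then \(p^*\) is an equilibrium price vector.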

fig, ax = plt.subplots(figsize=(10, 5.7))
for good in (0, 1):
    plot_excess_demand(ax, good=good, surface=False)
plt.show()

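[Figure: zero contours of \(e_0\) and \(e_1\) drawn on the same axes]

It seems there is an equilibrium close to \(p = (1.6, 1.5)\).

7.4.1.2. Using a Multidimensional Root Finder

To solve for \(p^*\) more precisely, we use a zero-finding algorithm from scipy.optimize.

We supply \(p = (1, 1)\) as our initial guess.

init_p = np.ones(2)

The 'hybr' method used here is a modification of the Powell hybrid method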

%%time
solution = root(lambda p: e(p, A, b, c), init_p, method='hybr')

CPU times: user 158 μs, sys: 16 μs, total: 174 μs
Wall time: 180 μs

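Here is the resulting value

p = solution.x
p

array([1.57080182, 1.46928838])

This looks close to our guess from observing the figure. We can plug it back into \(e\) to check that \(e(p) \approx 0\)

np.max(np.abs(e(p, A, b, c)))

np.float64(2.0383694732117874e-13)

This is indeed a very small error.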
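Besides the residual, the object returned by `root` carries its own convergence report. The following self-contained sketch rebuilds the two-good problem with plain NumPy (the name `excess_demand` is chosen for this example only) and inspects the `success`, `message`, and `nfev` fields of the result.

```python
import numpy as np
from scipy.optimize import root

def excess_demand(p, A, b, c):
    # same functional form as the excess demand function in the text
    return np.exp(-A @ p) + c - b * np.sqrt(p)

A = np.array([[0.5, 0.4],
              [0.8, 0.2]])
b = np.ones(2)
c = np.ones(2)

sol = root(lambda p: excess_demand(p, A, b, c), np.ones(2), method='hybr')

print(sol.success)   # True when the solver reports convergence
print(sol.message)   # termination message from the solver
print(sol.nfev)      # number of function evaluations used
print(sol.x)         # candidate equilibrium price vector
```

Checking `sol.success` before using `sol.x` is a cheap safeguard, since `root` returns a result object even when it fails to converge.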

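7.4.1.3. Adding Gradient Information

In many cases, for zero-finding algorithms applied to smooth functions, supplying the Jacobian of the function leads to better convergence properties.

Here we manually calculate the elements of the Jacobian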

\[\begin{split} J(p) = \begin{pmatrix} \frac{\partial e_0}{\partial p_0}(p) & \frac{\partial e_0}{\partial p_1}(p) \\ \frac{\partial e_1}{\partial p_0}(p) & \frac{\partial e_1}{\partial p_1}(p) \end{pmatrix} \end{split}\]

def jacobian_e(p, A, b, c):
    p_0, p_1 = p
    a_00, a_01 = A[0, :]
    a_10, a_11 = A[1, :]
    # Exponential terms in the demand for goods 0 and 1;
    # each depends on both prices
    exp_0 = np.exp(-(a_00 * p_0 + a_01 * p_1))
    exp_1 = np.exp(-(a_10 * p_0 + a_11 * p_1))
    j_00 = -a_00 * exp_0 - (b[0]/2) * p_0**(-1/2)
    j_01 = -a_01 * exp_0
    j_10 = -a_10 * exp_1
    j_11 = -a_11 * exp_1 - (b[1]/2) * p_1**(-1/2)
    J = [[j_00, j_01],
         [j_10, j_11]]
    return np.array(J)
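A hand-coded Jacobian is easy to get wrong, so it can be worth comparing it with a numerical approximation before handing it to a solver. The sketch below is one way to do that under stated assumptions: `jacobian_analytic` is a vectorized form of \(\partial e_i / \partial p_j = -a_{ij} \exp(-(Ap)_i) - \mathbb{1}\{i=j\}\, b_i / (2\sqrt{p_i})\), and `jacobian_fd` is a hypothetical central finite-difference helper written for this example.

```python
import numpy as np

def excess_demand(p, A, b, c):
    return np.exp(-A @ p) + c - b * np.sqrt(p)

def jacobian_analytic(p, A, b, c):
    # row i is scaled by exp(-(A p)_i); the supply term only
    # contributes on the diagonal
    return -A * np.exp(-A @ p)[:, None] - np.diag(b / (2 * np.sqrt(p)))

def jacobian_fd(f, p, h=1e-6):
    # central finite differences, one column per price
    n = len(p)
    J = np.empty((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        J[:, j] = (f(p + step) - f(p - step)) / (2 * h)
    return J

A = np.array([[0.5, 0.4],
              [0.8, 0.2]])
b = np.ones(2)
c = np.ones(2)
p = np.array([1.0, 0.5])

J_a = jacobian_analytic(p, A, b, c)
J_n = jacobian_fd(lambda x: excess_demand(x, A, b, c), p)
print(np.max(np.abs(J_a - J_n)))  # should be close to zero
```

If the two matrices disagree by much more than the finite-difference error, the analytic derivative is the first place to look.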

%%time
solution = root(lambda p: e(p, A, b, c),
                init_p, 
                jac=lambda p: jacobian_e(p, A, b, c), 
                method='hybr')

CPU times: user 211 μs, sys: 22 μs, total: 233 μs
Wall time: 236 μs

Now the solution is even more accurate (although, in this low-dimensional problem, the difference is quite small):

p = solution.x
np.max(np.abs(e(p, A, b, c)))

np.float64(1.3322676295501878e-15)

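7.4.1.4. Using Newton's Method

Now let's compute the equilibrium price using the multivariate version of Newton's method

\[p_{n+1} = p_n - J_e(p_n)^{-1} e(p_n) \tag{7.6}\]

This is a multivariate version of (7.5).

(Here \(J_e(p_n)\) is the Jacobian of \(e\) evaluated at \(p_n\).)

The iteration starts from some initial guess of the price vector \(p_0\).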
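To make the update (7.6) concrete, here is a minimal sketch of a single step for the two-good example, written with plain NumPy and a hand-coded Jacobian. The helper names `excess_demand` and `jacobian_analytic` are assumptions of this example. Note that the step solves the linear system \(J_e(p_n)\, d = e(p_n)\) with `np.linalg.solve` instead of forming the inverse explicitly.

```python
import numpy as np

A = np.array([[0.5, 0.4],
              [0.8, 0.2]])
b = np.ones(2)
c = np.ones(2)

def excess_demand(p):
    return np.exp(-A @ p) + c - b * np.sqrt(p)

def jacobian_analytic(p):
    # d e_i / d p_j = -a_ij exp(-(A p)_i) - 1{i=j} b_i / (2 sqrt(p_i))
    return -A * np.exp(-A @ p)[:, None] - np.diag(b / (2 * np.sqrt(p)))

p_n = np.ones(2)                                 # initial guess
d = np.linalg.solve(jacobian_analytic(p_n), excess_demand(p_n))
p_next = p_n - d                                 # one Newton update
step_size = np.linalg.norm(p_next - p_n)

print(p_next, step_size)
```

The printed step size can be compared with the first `error` value reported when the full iteration is run from the same starting point.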

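Here, instead of coding the Jacobian by hand, we use the jacobian() function in the autograd library to differentiate automatically and compute the Jacobian.

With only slight modification, we can generalize our previous attempt to multidimensional problems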

def newton(f, x_0, tol=1e-5, max_iter=10):
    x = x_0
    q = lambda x: x - np.linalg.solve(jacobian(f)(x), f(x))
    error = tol + 1
    n = 0
    while error > tol:
        n+=1
        if(n > max_iter):
            raise Exception('Max iteration reached without convergence')
        y = q(x)
        if(any(np.isnan(y))):
            raise Exception('Solution not found with NaN generated')
        error = np.linalg.norm(x - y)
        x = y
        print(f'iteration {n}, error = {error:.5f}')
    print('\n' + f'Result = {x} \n')
    return x

def e(p, A, b, c):
    return np.exp(- np.dot(A, p)) + c - b * np.sqrt(p)

We find that the algorithm terminates in 4 steps

%%time
p = newton(lambda p: e(p, A, b, c), init_p)

iteration 1, error = 0.62515
iteration 2, error = 0.11152
iteration 3, error = 0.00258
iteration 4, error = 0.00000

Result = [1.57080182 1.46928838] 

CPU times: user 2.56 ms, sys: 0 ns, total: 2.56 ms
Wall time: 2.18 ms

np.max(np.abs(e(p, A, b, c)))

np.float64(1.461053500406706e-13)

The result is very accurate.

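7.4.2. A High-Dimensional Problem

Our next step is to investigate a large market with 3,000 goods.

A JAX version of this section, using GPU-accelerated linear algebra and automatic differentiation, is also available.

The excess demand function is essentially the same, but now the matrix \(A\) is \(3000 \times 3000\) and the parameter vectors \(b\) and \(c\) are \(3000 \times 1\).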

dim = 3000
np.random.seed(123)

# Create a random matrix A and normalize its columns to sum to one
A = np.random.rand(dim, dim)
A = np.asarray(A)
s = np.sum(A, axis=0)
A = A / s

# Set up b and c
b = np.ones(dim)
c = np.ones(dim)
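Because the setup above sums over `axis=0` and then divides by a one-dimensional array, NumPy broadcasting scales each column by its own sum. The small sketch below, on a hypothetical 4 x 4 matrix `M`, contrasts that with row normalization, which needs `axis=1` and `keepdims=True`.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((4, 4))

# Dividing by the axis=0 sums scales column j by the sum of column j
cols_normalized = M / np.sum(M, axis=0)

# keepdims=True keeps a (4, 1) shape so that row i is scaled by the sum of row i
rows_normalized = M / np.sum(M, axis=1, keepdims=True)

print(cols_normalized.sum(axis=0))   # each column sums to one
print(rows_normalized.sum(axis=1))   # each row sums to one
```

Which normalization is wanted depends on the model; the point here is only that the two are different operations.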

Here is our initial condition

init_p = np.ones(dim)

%%time
p = newton(lambda p: e(p, A, b, c), init_p)

iteration 1, error = 23.22267

iteration 2, error = 3.94538

iteration 3, error = 0.08500

iteration 4, error = 0.00004

iteration 5, error = 0.00000

Result = [1.50185286 1.49865815 1.50028285 ... 1.50875149 1.48724784 1.48577532] 

CPU times: user 32.4 s, sys: 192 ms, total: 32.6 s
Wall time: 30.7 s

np.max(np.abs(e(p, A, b, c)))

np.float64(1.5543122344752192e-15)

With the same tolerance, we compare the running time and accuracy of Newton's method with those of SciPy's root function

%%time
solution = root(lambda p: e(p, A, b, c),
                init_p, 
                jac=lambda p: jacobian(e)(p, A, b, c), 
                method='hybr',
                tol=1e-5)

CPU times: user 34.8 s, sys: 67.6 ms, total: 34.9 s
Wall time: 34.6 s

p = solution.x
np.max(np.abs(e(p, A, b, c)))

np.float64(8.295585953721485e-07)

7.5. Exercises

Exercise 7.1

Consider a three-dimensional extension of the Solow fixed-point problem with

\[\begin{split} A = \begin{pmatrix} 2 & 3 & 3 \\ 2 & 4 & 2 \\ 1 & 5 & 1 \\ \end{pmatrix}, \quad s = 0.2, \quad \alpha = 0.5, \quad \delta = 0.8 \end{split}\]

As before, the law of motion is

\[ k_{t+1} = g(k_t) \quad \text{where} \quad g(k) := sAk^\alpha + (1-\delta) k\]

However, \(k_t\) is now a \(3 \times 1\) vector.

Solve for the fixed point using Newton's method with the following initial values:

\[\begin{split} \begin{aligned} k1_{0} &= (1, 1, 1) \\ k2_{0} &= (3, 5, 5) \\ k3_{0} &= (50, 50, 50) \end{aligned} \end{split}\]

Hint

Computing the fixed point is equivalent to computing \(k^*\) such that \(g(k^*) - k^* = 0\).
If you are unsure about your solution, you can start with the solved example:

\[\begin{split} A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \\ \end{pmatrix} \end{split}\]

with \(s = 0.3\), \(\alpha = 0.3\) and \(\delta = 0.4\), and starting value:

\[k_0 = (1, 1, 1)\]

The result should converge to the analytical solution.

Solution to Exercise 7.1

Let's first define the parameters for this problem

A = np.array([[2.0, 3.0, 3.0],
              [2.0, 4.0, 2.0],
              [1.0, 5.0, 1.0]])

s = 0.2
α = 0.5
δ = 0.8

initLs = [np.ones(3),
          np.array([3.0, 5.0, 5.0]),
          np.repeat(50.0, 3)]

Then we define the multivariate version of (7.1)

def multivariate_solow(k, A=A, s=s, α=α, δ=δ):
    return (s * np.dot(A, k**α) + (1 - δ) * k)

Let's run through each starting value and see the output

attempt = 1
for init in initLs:
    print(f'Attempt {attempt}: Starting value is {init} \n')
    %time k = newton(lambda k: multivariate_solow(k) - k, \
                    init)
    print('-'*64)
    attempt += 1

Attempt 1: Starting value is [1. 1. 1.] 

iteration 1, error = 50.49630
iteration 2, error = 41.10937
iteration 3, error = 4.29413
iteration 4, error = 0.38543
iteration 5, error = 0.00544
iteration 6, error = 0.00000

Result = [3.84058108 3.87071771 3.41091933] 

CPU times: user 3.37 ms, sys: 3 μs, total: 3.37 ms
Wall time: 3.06 ms
----------------------------------------------------------------
Attempt 2: Starting value is [3. 5. 5.] 

iteration 1, error = 2.07011
iteration 2, error = 0.12642
iteration 3, error = 0.00060
iteration 4, error = 0.00000

Result = [3.84058108 3.87071771 3.41091933] 

CPU times: user 1.88 ms, sys: 0 ns, total: 1.88 ms
Wall time: 1.75 ms
----------------------------------------------------------------
Attempt 3: Starting value is [50. 50. 50.] 

iteration 1, error = 73.00943
iteration 2, error = 6.49379
iteration 3, error = 0.68070
iteration 4, error = 0.01620
iteration 5, error = 0.00001
iteration 6, error = 0.00000

Result = [3.84058108 3.87071771 3.41091933] 

CPU times: user 2.71 ms, sys: 0 ns, total: 2.71 ms
Wall time: 2.52 ms
----------------------------------------------------------------

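In this problem the result does not depend on the starting value: all three initial conditions converge to the same fixed point.

However, the number of iterations needed to converge does depend on the starting value.

Let's substitute the output back into the formula to verify our final result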

multivariate_solow(k) - k

array([-4.4408921e-16, -4.4408921e-16,  4.4408921e-16])

Note the error is very small.

We can also test our results on the known solution

A = np.array([[2.0, 0.0, 0.0],
               [0.0, 2.0, 0.0],
               [0.0, 0.0, 2.0]])

s = 0.3
α = 0.3
δ = 0.4

init = np.repeat(1.0, 3)


%time k = newton(lambda k: multivariate_solow(k, A=A, s=s, α=α, δ=δ) - k, \
                 init)

iteration 1, error = 1.57459
iteration 2, error = 0.21345
iteration 3, error = 0.00205
iteration 4, error = 0.00000

Result = [1.78467418 1.78467418 1.78467418] 

CPU times: user 1.99 ms, sys: 0 ns, total: 1.99 ms
Wall time: 1.86 ms

The result is very close to the true solution but still slightly different.

%time k = newton(lambda k: multivariate_solow(k, A=A, s=s, α=α, δ=δ) - k, \
                 init,\
                 tol=1e-7)

iteration 1, error = 1.57459
iteration 2, error = 0.21345
iteration 3, error = 0.00205
iteration 4, error = 0.00000
iteration 5, error = 0.00000

Result = [1.78467418 1.78467418 1.78467418] 

CPU times: user 2.86 ms, sys: 0 ns, total: 2.86 ms
Wall time: 2.57 ms

We can see that it is stepping towards a more accurate solution.
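For reference, the analytical solution of the diagonal test case can be written down directly. With \(A = 2I\), each component of \(k\) satisfies \(k = 2 s k^\alpha + (1 - \delta) k\), so \(\delta k = 2 s k^\alpha\) and \(k^* = (2s/\delta)^{1/(1-\alpha)}\). The short sketch below evaluates this closed form with plain NumPy; the function name `g` follows the law of motion in the exercise, and everything else is set up only for this check.

```python
import numpy as np

s, α, δ = 0.3, 0.3, 0.4
A = 2.0 * np.eye(3)

def g(k):
    # law of motion from the exercise: g(k) = s A k^α + (1 - δ) k
    return s * A @ k**α + (1 - δ) * k

# Closed-form steady state for the diagonal case, repeated for each component
k_star = np.repeat((2 * s / δ) ** (1 / (1 - α)), 3)

print(k_star)
print(np.max(np.abs(g(k_star) - k_star)))  # fixed-point residual
```

The printed vector can be compared with the `Result` lines reported by the Newton runs above.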

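Exercise 7.2

In this exercise, let's try different initial values and see how Newton's method responds to different starting points.

Let's define a three-good problem with the following default values:

\[\begin{split} A = \begin{pmatrix} 0.2 & 0.1 & 0.7 \\ 0.3 & 0.2 & 0.5 \\ 0.1 & 0.8 & 0.1 \\ \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \qquad \text{and} \qquad c = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \end{split}\]

For this exercise, use the following extreme price vectors as initial values:

\[\begin{split}\begin{aligned} p1_{0} &= (5, 5, 5) \\ p2_{0} &= (1, 1, 1) \\ p3_{0} &= (4.5, 0.1, 4) \end{aligned} \end{split}\]

Set the tolerance to \(10^{-15}\) for more accurate output.

Solution to Exercise 7.2

Define the parameters and initial values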

A = np.array([
    [0.2, 0.1, 0.7],
    [0.3, 0.2, 0.5],
    [0.1, 0.8, 0.1]
])

b = np.array([1.0, 1.0, 1.0])
c = np.array([1.0, 1.0, 1.0])

initLs = [np.repeat(5.0, 3),
          np.ones(3),
          np.array([4.5, 0.1, 4.0])] 

Let's run through each initial guess and check the output

attempt = 1
for init in initLs:
    print(f'Attempt {attempt}: Starting value is {init} \n')
    %time p = newton(lambda p: e(p, A, b, c), \
                init, \
                tol=1e-15, \
                max_iter=15)
    print('-'*64)
    attempt += 1

Attempt 1: Starting value is [5. 5. 5.] 

iteration 1, error = 9.24381

/home/runner/miniconda3/envs/quantecon/lib/python3.13/site-packages/autograd/tracer.py:54: RuntimeWarning: invalid value encountered in sqrt
  return f_raw(*args, **kwargs)
/home/runner/miniconda3/envs/quantecon/lib/python3.13/site-packages/autograd/numpy/numpy_vjps.py:184: RuntimeWarning: invalid value encountered in power
  defvjp(anp.sqrt, lambda ans, x: lambda g: g * 0.5 * x**-0.5)

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
File <timed exec>:1

Cell In[34], line 12, in newton(f, x_0, tol, max_iter)
     10 y = q(x)
     11 if(any(np.isnan(y))):
---> 12     raise Exception('Solution not found with NaN generated')
     13 error = np.linalg.norm(x - y)
     14 x = y

Exception: Solution not found with NaN generated

----------------------------------------------------------------
Attempt 2: Starting value is [1. 1. 1.] 

iteration 1, error = 0.73419
iteration 2, error = 0.12472
iteration 3, error = 0.00269
iteration 4, error = 0.00000
iteration 5, error = 0.00000
iteration 6, error = 0.00000

Result = [1.49744442 1.49744442 1.49744442] 

CPU times: user 3.37 ms, sys: 2 μs, total: 3.37 ms
Wall time: 2.99 ms
----------------------------------------------------------------
Attempt 3: Starting value is [4.5 0.1 4. ] 

iteration 1, error = 4.89202
iteration 2, error = 1.21206
iteration 3, error = 0.69421
iteration 4, error = 0.16895
iteration 5, error = 0.00521
iteration 6, error = 0.00000
iteration 7, error = 0.00000
iteration 8, error = 0.00000

Result = [1.49744442 1.49744442 1.49744442] 

CPU times: user 3.93 ms, sys: 3 μs, total: 3.93 ms
Wall time: 3.47 ms
----------------------------------------------------------------

We can see that Newton's method may fail for some starting values.

Sometimes it may take a few initial guesses to achieve convergence.
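One simple way to act on this is to loop over several starting points and keep whichever runs succeed. The sketch below is an illustration under stated assumptions, not the lecture's `newton` function: `newton_checked` is a hypothetical variant that uses a hand-coded Jacobian with plain NumPy and raises an error as soon as an iterate leaves the region where \(\sqrt{p}\) is defined.

```python
import numpy as np

A = np.array([[0.2, 0.1, 0.7],
              [0.3, 0.2, 0.5],
              [0.1, 0.8, 0.1]])
b = np.ones(3)
c = np.ones(3)

def excess_demand(p):
    return np.exp(-A @ p) + c - b * np.sqrt(p)

def jacobian_analytic(p):
    # d e_i / d p_j = -a_ij exp(-(A p)_i) - 1{i=j} b_i / (2 sqrt(p_i))
    return -A * np.exp(-A @ p)[:, None] - np.diag(b / (2 * np.sqrt(p)))

def newton_checked(p0, tol=1e-10, max_iter=50):
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        # stop early if a price is non-positive or not finite
        if not (np.all(np.isfinite(p)) and np.all(p > 0)):
            raise ValueError("iterate left the region where sqrt(p) is defined")
        p_new = p - np.linalg.solve(jacobian_analytic(p), excess_demand(p))
        if np.linalg.norm(p_new - p) < tol:
            return p_new
        p = p_new
    raise RuntimeError("no convergence within max_iter iterations")

guesses = [np.repeat(5.0, 3), np.ones(3), np.array([4.5, 0.1, 4.0])]
solutions = []
for guess in guesses:
    try:
        solutions.append(newton_checked(guess))
    except (ValueError, RuntimeError, np.linalg.LinAlgError) as err:
        print(f"guess {guess} failed: {err}")

print(solutions)
```

Catching the failure per starting point lets the loop continue to the next guess instead of stopping at the first one that breaks down.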

Substitute the result back into the formula to check it

e(p, A, b, c)

array([ 0.00000000e+00,  0.00000000e+00, -2.22044605e-16])

We can see that the result is very accurate.
