[Parallel Computing] A Classic Example: Numerical Integration

1. Introduction

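Numerical integration is a classic example used when teaching parallel computing. For one thing, it is a problem where parallel execution genuinely pays off; for another, it is simple and needs no deep mathematical background. Let's take a look!
Take the curve y = x^2 and suppose we want its integral from 0 to 10. We could of course find the antiderivative, but for more complicated functions the antiderivative often cannot be written in closed form. In that case we can fall back on numerical approximation: split the area under the curve into many narrow trapezoids, compute the area of each, and add them up.
When we need very high accuracy, the trapezoids have to be very narrow, so there are a lot of them: thousands, tens of thousands, or more, depending on the accuracy we want. Computed serially, this can take a long time. Because the area of each trapezoid is independent of the others, the work can be split across processes, and with P processes the computation can run up to roughly P times faster.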
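
Written out, the trapezoid approach described above is the standard composite trapezoid rule. The symbols a, b, n, h and x_i are notation introduced here for illustration and do not appear in the original post:

```latex
\int_a^b f(x)\,dx \;\approx\; h\left[\frac{f(x_0) + f(x_n)}{2} + \sum_{i=1}^{n-1} f(x_i)\right],
\qquad h = \frac{b-a}{n},\quad x_i = a + i\,h .
```

For y = x^2 on [0, 10] the exact value is 1000/3 ≈ 333.33, which gives something to check a numerical estimate against.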

2. Talking is Cheap, Show me the Code!

Save the following code as calculate_parallel.py:

import numpy as np
import sys
from mpi4py import MPI
from mpi4py.MPI import ANY_SOURCE

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

integral = np.zeros(1)
receive_buff = np.zeros(1)

# get parameters from command-line
start = float(sys.argv[1])
end = float(sys.argv[2])
piece = int(sys.argv[3])


# objective function
def f(x):
    return x**2


# trapezoid-rule integral over a sub-range, using `piece` trapezoids
def integrate_range(start, end, piece):
    integral = -(f(start) + f(end))/2.0
    for x in np.linspace(start, end, piece + 1):
        integral += f(x)
    integral = integral * (end - start)/piece
    return integral


# divide the integral into pieces for processes
# (assumes piece is a multiple of size; any remainder would be dropped)
width = (end - start) / piece
local_piece = piece // size     # pieces per process; must be an int for np.linspace
local_start = start + rank*local_piece*width
local_end = local_start + local_piece*width

integral[0] = integrate_range(local_start, local_end, local_piece)

# process 0 collects all partial results and sums them
if rank == 0:
    integral_sum = integral[0]
    for i in range(1, size):
        comm.Recv(receive_buff, ANY_SOURCE)
        integral_sum += receive_buff[0]
else:
    comm.Send(integral, dest=0)

if rank == 0:
    print('With %d pieces, the estimation the integral of function y = x^2 from x = %f to x = %f is: %f.'
          % (piece, start, end, integral_sum))
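
The listing above gives every process the same number of trapezoids, which only covers the whole interval when the number of pieces is a multiple of the number of processes. Below is a minimal sketch of one way to spread a remainder across ranks. It simulates the ranks in a plain loop so it runs without MPI; `local_range` is a hypothetical helper that is not part of the original script, and `integrate_range` is rewritten here without NumPy.

```python
def f(x):
    return x ** 2


def integrate_range(start, end, piece):
    """Composite trapezoid rule on [start, end] with `piece` trapezoids."""
    if piece == 0:
        return 0.0
    width = (end - start) / piece
    total = (f(start) + f(end)) / 2.0
    for i in range(1, piece):
        total += f(start + i * width)
    return total * width


def local_range(start, end, piece, rank, size):
    """Hypothetical helper: sub-interval and trapezoid count for one rank.

    The first `piece % size` ranks each take one extra trapezoid.
    """
    width = (end - start) / piece
    base, extra = divmod(piece, size)
    count = base + (1 if rank < extra else 0)
    first = rank * base + min(rank, extra)  # index of this rank's first trapezoid
    local_start = start + first * width
    return local_start, local_start + count * width, count


# Simulate 4 ranks for 18 trapezoids on [0, 10]; 18 is not a multiple of 4.
size = 4
partials = []
for rank in range(size):
    lo, hi, count = local_range(0.0, 10.0, 18, rank, size)
    partials.append(integrate_range(lo, hi, count))

total = sum(partials)
print(total, integrate_range(0.0, 10.0, 18))
```

The two printed numbers should agree up to floating-point rounding, since the per-rank sub-intervals tile [0, 10] without gaps or overlap.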

3. Running it

Enter the following at the command line:

mpiexec -n 4 python calculate_parallel.py 0 10 16

The output is:

(base) C:\Users\44375\Documents\python_proj>mpiexec -n 4 python calculate_parallel.py 0 10 16
With 16 pieces, the estimation the integral of function y = x^2 from x = 0.000000 to x = 10.000000 is: 333.984375.
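
The estimate is a little above the exact value 1000/3 ≈ 333.33 because 16 trapezoids is a coarse grid. The serial sketch below, which is not part of the original script (the name `trapezoid` is illustrative), shows how the error shrinks as the number of pieces grows. For the trapezoid rule the error is expected to fall by about a factor of 4 each time the piece count doubles.

```python
def f(x):
    return x ** 2


def trapezoid(a, b, n):
    """Composite trapezoid rule on [a, b] with n equal-width trapezoids."""
    h = (b - a) / n
    inner = sum(f(a + i * h) for i in range(1, n))
    return h * ((f(a) + f(b)) / 2.0 + inner)


exact = 10.0 ** 3 / 3.0  # antiderivative x^3 / 3 evaluated from 0 to 10

for n in (16, 32, 64, 128):
    estimate = trapezoid(0.0, 10.0, n)
    print(n, estimate, estimate - exact)
```

With n = 16 this reproduces the 333.984375 printed by the parallel run above.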

4. How can it be improved?

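Attentive readers may have noticed that the final summation in the code above is still serial, which works against the speedup we are after. Is there another way?
Here is one idea for reference:

As the figure shows, scheme A is the approach used in the code above: the collecting process has to add the results from processes 2, 3 and 4 one after another. In scheme B, part of that work is moved to process 3, so the collecting process only performs two additions. That is just one fewer here, but the gap widens quickly as the number of processes grows: with P processes, scheme A needs P - 1 sequential additions, while a tree-shaped scheme like B needs only about log2(P) rounds.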
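
To make the difference concrete, here is a small sketch that simulates both schemes on a plain list of partial results and counts the sequential steps each one needs. It runs without MPI, and the function names `linear_reduce` and `tree_reduce` are illustrative, not from the original post. In a real mpi4py program the collective `comm.Reduce(sendbuf, recvbuf, op=MPI.SUM, root=0)` can replace the hand-written Send/Recv loop; whether it uses a tree internally is up to the MPI implementation.

```python
def linear_reduce(values):
    """Scheme A: the root adds every other partial result one at a time."""
    total = values[0]
    steps = 0
    for v in values[1:]:
        total += v
        steps += 1  # each addition waits for the previous one
    return total, steps


def tree_reduce(values):
    """Scheme B: pairs are added in the same round, halving the count each time."""
    values = list(values)
    rounds = 0
    while len(values) > 1:
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:  # an unpaired value passes through unchanged
            paired.append(values[-1])
        values = paired
        rounds += 1
    return values[0], rounds


for p in (4, 64, 1024):
    partials = list(range(1, p + 1))  # stand-ins for per-process integrals
    print(p, linear_reduce(partials), tree_reduce(partials))
```

Both functions return the same sum; only the number of sequential steps differs, which is what matters when each step involves waiting on a message.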

References

1. Point-to-Point Communication with Python and mpi4py (lecture 4/5)
