  [library Boost.MPI
      [quickbook 1.6]
      [authors [Gregor, Douglas], [Troyer, Matthias] ]
      [copyright 2005 2006 2007 Douglas Gregor, Matthias Troyer, Trustees of Indiana University]
      [id mpi]
      [license
          Distributed under the Boost Software License, Version 1.0.
          (See accompanying file LICENSE_1_0.txt or copy at
          <ulink url="http://www.boost.org/LICENSE_1_0.txt">
              http://www.boost.org/LICENSE_1_0.txt
          </ulink>)
      ]
  ]
  
  [/ Links ]
  [def _MPI_         [@http://www-unix.mcs.anl.gov/mpi/ MPI]]
  [def _MPI_implementations_ 
     [@http://www-unix.mcs.anl.gov/mpi/implementations.html
      MPI implementations]]
  [def _Serialization_ [@boost:/libs/serialization/doc
                        Boost.Serialization]]
  [def _BoostPython_ [@http://www.boost.org/libs/python/doc
                        Boost.Python]]
  [def _Python_      [@http://www.python.org Python]]
  [def _MPICH_        [@http://www-unix.mcs.anl.gov/mpi/mpich/ MPICH2]]
  [def _OpenMPI_      [@http://www.open-mpi.org OpenMPI]]
  [def _IntelMPI_     [@https://software.intel.com/en-us/intel-mpi-library Intel MPI]]
  [def _accumulate_   [@http://www.sgi.com/tech/stl/accumulate.html
                       `accumulate`]]
  
  [include introduction.qbk]
  [include getting_started.qbk]
  [include tutorial.qbk]
  [include c_mapping.qbk]
  
  [xinclude mpi_autodoc.xml]
  
  [include python.qbk]
  
  [section:design Design Philosophy]
  
  The design philosophy of the Parallel MPI library is very simple: be
  both convenient and efficient. MPI is a library built for
  high-performance applications, but its Fortran-centric,
  performance-minded design makes it rather inflexible from the C++
  point of view: passing a string from one process to another is
  inconvenient, requiring several messages and explicit buffering;
  passing a container of strings from one process to another requires
  an extra level of manual bookkeeping; and passing a map from strings
  to containers of strings is positively infuriating. The Parallel MPI
  library allows all of these data types to be passed using the same
  simple `send()` and `recv()` primitives. Likewise, collective
  operations such as [funcref boost::mpi::reduce `reduce()`]
  allow arbitrary data types and function objects, much like the C++
  Standard Library would. 
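
  For example, under these abstractions a container of strings travels
  with a single `send()`/`recv()` pair. The following minimal sketch
  (the ranks and tag value are arbitrary choices for illustration; run
  it with at least two processes) shows the intent:

      #include <boost/mpi.hpp>
      #include <boost/serialization/map.hpp>
      #include <boost/serialization/string.hpp>
      #include <boost/serialization/vector.hpp>
      #include <map>
      #include <string>
      #include <vector>

      namespace mpi = boost::mpi;

      int main(int argc, char* argv[])
      {
        mpi::environment env(argc, argv);
        mpi::communicator world;

        std::map<std::string, std::vector<std::string> > data;
        if (world.rank() == 0) {
          data["greetings"].push_back("hello");
          data["greetings"].push_back("bonjour");
          world.send(1, 0, data);   // one call sends the whole map
        } else if (world.rank() == 1) {
          world.recv(0, 0, data);   // one call receives it
        }
        return 0;
      }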
  
  The higher-level abstractions provided for convenience must not have
  an impact on the performance of the application. For instance, sending
  an integer via `send` must be as efficient as a call to `MPI_Send`,
  which means that it must be implemented by a simple call to
  `MPI_Send`; likewise, an integer [funcref boost::mpi::reduce
  `reduce()`] using `std::plus<int>` must be implemented with a call to
  `MPI_Reduce` on integers using the `MPI_SUM` operation: anything less
  will impact performance. In essence, this is the "don't pay for what
  you don't use" principle: if the user is not transmitting strings,
  they should not pay the overhead associated with strings.
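
  As an illustration, a reduction over `int` values with `std::plus<int>`
  amounts to a single `MPI_Reduce` using `MPI_SUM`. A minimal sketch:

      #include <boost/mpi.hpp>
      #include <functional>
      #include <iostream>

      namespace mpi = boost::mpi;

      int main(int argc, char* argv[])
      {
        mpi::environment env(argc, argv);
        mpi::communicator world;

        int value = world.rank();
        int sum = 0;
        // std::plus<int> over int is recognized as MPI_SUM over MPI_INT,
        // so this call forwards to MPI_Reduce with no extra overhead.
        mpi::reduce(world, value, sum, std::plus<int>(), 0);

        if (world.rank() == 0)
          std::cout << "sum of ranks: " << sum << std::endl;
        return 0;
      }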
  
  Sometimes, achieving maximal performance means foregoing convenient
  abstractions and implementing certain functionality using lower-level
  primitives. For this reason, it is always possible to extract enough
  information from the abstractions in Boost.MPI to minimize
  the amount of effort required to interface between Boost.MPI
  and the C MPI library.
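
  For instance, a [classref boost::mpi::communicator communicator]
  converts to the underlying `MPI_Comm`, so lower-level C MPI routines
  can be mixed in where needed, as in this minimal sketch:

      #include <boost/mpi.hpp>
      #include <mpi.h>

      namespace mpi = boost::mpi;

      int main(int argc, char* argv[])
      {
        mpi::environment env(argc, argv);
        mpi::communicator world;

        // The communicator exposes the raw MPI_Comm it wraps, so
        // C-level MPI calls remain available alongside Boost.MPI.
        int size = 0;
        MPI_Comm_size(static_cast<MPI_Comm>(world), &size);
        return 0;
      }
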
  [endsect]
  
  [section:performance Performance Evaluation]
  
  Message-passing performance is crucial in high-performance distributed
  computing. To evaluate the performance of Boost.MPI, we modified the
  standard [@http://www.scl.ameslab.gov/netpipe/ NetPIPE] benchmark
  (version 3.6.2) to use Boost.MPI and compared its performance against
  raw MPI. We ran five different variants of the NetPIPE benchmark:
  
  # MPI: The unmodified NetPIPE benchmark.
  
  # Boost.MPI: NetPIPE modified to use Boost.MPI calls for
    communication.
  
  # MPI (Datatypes): NetPIPE modified to use a derived datatype (which
    itself contains a single `MPI_BYTE`) rather than a fundamental
    datatype.
  
  # Boost.MPI (Datatypes): NetPIPE modified to use a user-defined type
    `Char` in place of the fundamental `char` type. The `Char` type
    contains a single `char`, a `serialize()` method to make it
    serializable, and specializes [classref
    boost::mpi::is_mpi_datatype is_mpi_datatype] to force
    Boost.MPI to build a derived MPI data type for it (a sketch of such
    a type appears after this list).
  
  # Boost.MPI (Serialized): NetPIPE modified to use a user-defined type
    `Char` in place of the fundamental `char` type. This `Char` type
    contains a single `char` and is serializable. Unlike the Datatypes
    case, [classref boost::mpi::is_mpi_datatype
    is_mpi_datatype] is *not* specialized, forcing Boost.MPI to perform
    many, many serialization calls.
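
  The benchmark sources are not reproduced here, but a minimal sketch of
  what such a `Char` wrapper might look like follows; dropping the
  [classref boost::mpi::is_mpi_datatype is_mpi_datatype] specialization
  turns the Datatypes variant into the Serialized one:

      #include <boost/mpi/datatype.hpp>
      #include <boost/mpl/bool.hpp>
      #include <boost/serialization/serialization.hpp>

      // Hypothetical reconstruction of the benchmark's Char wrapper: a
      // single char with a serialize() member function.
      struct Char
      {
        char value;

        template<typename Archive>
        void serialize(Archive& ar, const unsigned int /*version*/)
        {
          ar & value;
        }
      };

      // Marking Char as an MPI datatype tells Boost.MPI to build a
      // derived MPI data type for it; without this specialization,
      // every Char is serialized individually.
      namespace boost { namespace mpi {
        template<>
        struct is_mpi_datatype<Char> : mpl::true_ { };
      } }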
  
  The actual tests were performed on the Odin cluster in the
  [@http://www.cs.indiana.edu/ Department of Computer Science] at
  [@http://www.iub.edu Indiana University], which contains 128 nodes
  connected via InfiniBand. Each node contains 4 GB of memory and two AMD
  Opteron processors. The NetPIPE benchmarks were compiled with Intel's
  C++ Compiler, version 9.0, Boost 1.35.0 (prerelease), and
  [@http://www.open-mpi.org/ Open MPI] version 1.1. The NetPIPE results
  follow:
  
  [$../../libs/mpi/doc/netpipe.png]
  
  There are some observations we can make about these NetPIPE
  results. First of all, the top two plots show that Boost.MPI performs
  on par with MPI for fundamental types. The next two plots show that
  Boost.MPI performs on par with MPI for derived data types, even though
  Boost.MPI provides a much more abstract, completely transparent
  approach to building derived data types than raw MPI. Overall
  performance for derived data types is significantly worse than for
  fundamental data types, but the bottleneck is in the underlying MPI
  implementation itself. Finally, when forcing Boost.MPI to serialize
  characters individually, performance suffers greatly. This particular
  instance is the worst possible case for Boost.MPI, because we are
  serializing millions of individual characters.  Overall, the
  additional abstraction provided by Boost.MPI does not impair its
  performance.
  
  [endsect]
  
  [section:history Revision History]
  
  * *Boost 1.36.0*: 
    * Support for non-blocking operations in Python, from Andreas Klöckner
  
  * *Boost 1.35.0*: Initial release, containing the following post-review changes:
    * Support for arrays in all collective operations
    * Support default-construction of [classref boost::mpi::environment environment]
      
  * *2006-09-21*: Boost.MPI accepted into Boost.
  
  [endsect:history]
  
  [section:acknowledge Acknowledgments]
  Boost.MPI was developed with support from Zurcher Kantonalbank. Daniel
  Egloff and Michael Gauckler contributed many ideas to Boost.MPI's
  design, particularly in the design of its abstractions for
  MPI data types and the novel skeleton/context mechanism for large data
  structures. Prabhanjan (Anju) Kambadur developed the predecessor to
  Boost.MPI that proved the usefulness of the Serialization library in
  an MPI setting and the performance benefits of specialization in a C++
  abstraction layer for MPI. Jeremy Siek managed the formal review of Boost.MPI.
  
  [endsect:acknowledge]