Fast Multipole Method
 We implement a parallel Fast Multipole Methods (FMM) based on the tree data structures available from the domain decomposition. We use FMM to estimate efficiently the far-field conditions for the multigrid solver. Moreover, FMM may be used in the multigrid solver to avoid communication between sub-domains and, therefore, between processors. Commonly used Fast Multipole Methods were introduced by Greengard and Rokhlin (1988).   We consider pairwise particle interactions in a potential field of the form $\Phi(x_i)= \sum_{n=1, n \neq i}^{N} \frac{q_n}{\| x_i-x_n \|}.$   where $x_1,...,x_n$ are particle locations and $q_1,...,q_n$ are charges.   The multipole expansion of the potential $\Phi(x)$ at the particle position $x=(r,\theta,\phi)$ generated by the charges $q_1,..,q_N$ located at $x_1,...,x_N$  is given by Greengard and Rokhlin (1988):   $\Phi(x_i)= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{M_n^m}{r^{n+1}} \cdot Y_n^m(\theta, \phi)$ , where $M_n^m$ are coefficients depending on the particle position  $x_1,...,x_N$ and charges  $q_1,..,q_N$. $Y_n^m$ are spherical harmonics of degree n and order m.   The evaluation of the potential involves traversing of the tree data structure and the summation of the local multipole expansions satisfying the opening angle criteria (Barnes and Hut, 1986). There exist several approaches to compute this summation on parallel architectures (Greengard and Grop, 1990 ; Kurzak and Pettitt, 2005). The PPM library creates one global tree data structure based on all particles that is  accessible from all processors. Each processor computes the multipole expansions for the leaf boxes in its sub domains. A pre-traversal determines the boxes that are needed for shifting the multipole expansion to the next higher tree level and creates an interaction list. In a communication step, the necessary multipole expansions are exchanged before they are shifted. The final traversal of the tree for a target point is based on the pre-traversal interaction to reduce the amount of communication. After sending these multipole expansions to the processor containing the target point, the final evaluation of the potential is performed without communication.