The direct linear solver options involve an LU decomposition of the system matrix A, i.e., computation of a lower triangular matrix L and an upper triangular matrix U such that A=LU. Once the factorization is complete, the solution to the matrix equation Ax=b is obtained in two steps. First, Ly=b is solved for y (so-called forward substitution). Then Ux=y is solved for x (backward substitution).
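As a small worked illustration of the two substitution steps, consider the 2×2 system below. The matrix and right-hand side are chosen here purely for convenience and do not come from any simulator.

```latex
A = \begin{pmatrix} 4 & 3 \\ 6 & 3 \end{pmatrix}
  = \underbrace{\begin{pmatrix} 1 & 0 \\ 1.5 & 1 \end{pmatrix}}_{L}
    \underbrace{\begin{pmatrix} 4 & 3 \\ 0 & -1.5 \end{pmatrix}}_{U},
\qquad
b = \begin{pmatrix} 10 \\ 12 \end{pmatrix}

% Forward substitution, Ly = b:
y_1 = 10, \qquad 1.5\,y_1 + y_2 = 12 \;\Rightarrow\; y_2 = -3

% Backward substitution, Ux = y:
-1.5\,x_2 = -3 \;\Rightarrow\; x_2 = 2, \qquad
4\,x_1 + 3\,x_2 = 10 \;\Rightarrow\; x_1 = 1
```

Substituting back confirms the result: 4(1)+3(2)=10 and 6(1)+3(2)=12.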
The advantage of a direct solver over an iterative solver (discussed in the next section) is both robustness and the speed of generating solutions once the factorization has been computed. Although the factorization operation can take substantial time and memory, the subsequent forward-backward substitution (FBS), used to obtain a solution for a given right-hand side (excitation), is very fast. The disadvantage is that factorization is computationally expensive, with memory requirements that can exceed 10 times the storage requirements for the system matrix.
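The factor-once, solve-many pattern described above can be sketched in a few lines of plain Python. This is a generic dense LU with partial pivoting written for illustration only; it is not the MFLU or HMLU implementation, and the function names `lu_factor` and `lu_solve` as well as the sample matrix are assumptions made for this example.

```python
def lu_factor(a):
    """Factor a square matrix as P*A = L*U using partial pivoting.

    Returns (lu, perm). lu stores U on and above the diagonal and the
    multipliers of the unit lower triangular L below it. perm[i] is the
    original row index that ended up in row i.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Choose the largest pivot in column k for numerical stability.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Forward-backward substitution (FBS) using a stored factorization."""
    n = len(b)
    # Forward substitution: L y = P b (L has an implicit unit diagonal).
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    # Backward substitution: U x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / lu[i][i]
    return x


# Illustrative system matrix and two right-hand sides (excitations).
A = [[4.0, 3.0, 0.0], [6.0, 3.0, 1.0], [0.0, 2.0, 5.0]]
rhs_list = [[10.0, 15.0, 19.0], [4.0, 5.0, -5.0]]

lu, perm = lu_factor(A)  # the expensive step, done once
solutions = [lu_solve(lu, perm, b) for b in rhs_list]  # cheap, once per excitation
for x in solutions:
    print(x)
```

For a dense n×n matrix the factorization costs on the order of n³ operations, while each substitution pass costs on the order of n², which is why additional right-hand sides are comparatively cheap once the factors exist.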
In the simulation properties dialog under the Solver tab, there are three options for Linear Solver/Direct/Method: MFLU, HMLU and Automatic. Although the direct solvers all create an LU factorization of the system matrix, they use somewhat different algorithms that have distinct characteristics:
MFLU: A multi-frontal method that makes use of multiple cores on a computer to speed up the factorization/solve via a task-based threading system. This solver produces an “exact” factorization which is used directly in the forward-backward substitution to obtain the linear system solution. MFLU is the preferred solver for small-to-modest sized problems running on a single node.
HMLU: Creates factors via a hierarchical decomposition of the system matrix using a graph-based subdivision of the finite-element mesh. The factors can be either exact or approximate, depending on the choice of the Factor Accuracy Level parameter. Inexact factors use compression techniques to reduce storage and improve run times; the lowest accuracy setting (a value of Low for the parameter) requires the fewest resources. This solver uses the factors as a preconditioner within an iterative solver, which typically takes 1 to 3 iterations to converge; low accuracy factors require more iterations than high accuracy factors.
As with MFLU, HMLU uses a task-based threading model to efficiently use multiple cores on a single machine. HMLU also supports use of multiple nodes on a Linux cluster to solve much larger problems than can be solved on a single computer. Distribution of the workload on a cluster is controlled by the Minimum Number Of Processes Per Linear Solve parameter. The default value of 0 will cause all available nodes to be used for each factor/solve. Choosing a value other than 0 will force HMLU to use that many nodes for each linear solve (limited by the total available nodes). HMLU is the best choice for larger problems.
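The idea of using an approximate factorization as a preconditioner, as described for HMLU above, can be illustrated with a toy example. The sketch below is not HMLU's algorithm: rounding the LU entries to a few significant digits stands in for factor compression, and simple iterative refinement stands in for the iterative solver, which the description above does not name. All function names and the sample matrix are assumptions for illustration, and the iteration counts it produces say nothing about HMLU's actual behavior.

```python
def lu_nopivot(a):
    """LU factorization without pivoting (adequate for this diagonally dominant toy matrix)."""
    n = len(a)
    lu = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def lu_apply(lu, b):
    """Solve (L U) z = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(lu[i][j] * y[j] for j in range(i))
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (y[i] - sum(lu[i][j] * z[j] for j in range(i + 1, n))) / lu[i][i]
    return z


def round_factors(lu, digits):
    """Stand-in for compression: keep only `digits` significant digits per entry."""
    return [[float(f"{v:.{digits}g}") for v in row] for row in lu]


def refine(a, factors, b, tol=1e-10, max_iter=50):
    """Iterative refinement: x <- x + M^{-1} (b - A x), with M given by `factors`.

    Returns (x, iterations), where iterations counts preconditioner applications.
    """
    n = len(b)
    x = [0.0] * n
    b_norm = max(abs(v) for v in b)
    for it in range(max_iter + 1):
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * b_norm:
            return x, it
        dx = lu_apply(factors, r)
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("refinement did not converge")


# Illustrative system; the exact solution is [1, -2, 3].
A = [[10.0, 2.0, 1.0], [3.0, 12.0, 4.0], [1.0, 5.0, 9.0]]
b = [9.0, -9.0, 18.0]

exact = lu_nopivot(A)
x_exact, iters_exact = refine(A, exact, b)
x_low, iters_low = refine(A, round_factors(exact, 2), b)
print("exact factors:", iters_exact, "iteration(s)")
print("2-digit factors:", iters_low, "iteration(s)")
```

Both runs reach the same solution to within the tolerance. The less accurate factors are cheaper to store but need more refinement passes, which mirrors the trade-off described for the Factor Accuracy Level parameter.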
The Automatic setting allows the system to choose the appropriate solver based upon the situation. Generally, smaller problems running on fewer cores will use MFLU, and larger problems and/or more cores will use HMLU. This setting also allows for fail-over to a higher accuracy factor or a different solver if necessary. The following table shows which direct linear solvers are supported for each simulator type, as well as which direct linear solver is chosen by the Automatic method for each simulator type.
Simulator | MFLU Supported | HMLU Supported | Automatic |
---|---|---|---|
“Driven Frequency (RF3p)” Ports Only Solve | ✓ | X | MFLU |
“Driven Frequency (RF3p)” Full Solve | ✓ | ✓ | MFLU or HMLU based on problem size |
“2D/3D Eigenmode (OM2p/OM3p)” | ✓ | ✓ | MFLU or HMLU based on problem size and system characteristics |
“3D Electrostatics (ES3p)” | ✓ | ✓ | MFLU |
“2D/3D Magnetostatics (MS2p/MS3p)” | ✓ | X | MFLU |