A New Approach to Load Balance for Parallel/Compositional Simulation Based on Reservoir-Model Overdecomposition
- Yuhe Wang (Texas A&M University) | John E. Killough (Texas A&M University)
- Publisher: Society of Petroleum Engineers
- Journal: SPE Journal
- Publication Date: April 2013
- Document Type: Journal Paper
- Pages: 304 - 315
- Copyright: 2013. Society of Petroleum Engineers
The quest for efficient and scalable parallel reservoir simulators has been evolving with the advancement of high-performance computing architectures. Among the various challenges of efficiency and scalability, load imbalance is a major obstacle that has not been fully addressed and solved. The causes of load imbalance in parallel reservoir simulation are both static and dynamic. Robust graph-partitioning algorithms are capable of handling static load imbalance by decomposing the underlying reservoir geometry to distribute a roughly equal load to each processor. However, these loads that are determined by a static load balancer seldom remain unchanged as the simulation proceeds in time. This so-called dynamic imbalance can be exacerbated further in parallel compositional simulations. The flash calculations for equations of state (EOSs) in complex compositional simulations not only can consume more than half of the total execution time but also are difficult to balance merely by a static load balancer. The computational cost of flash calculations in each gridblock heavily depends on the dynamic data such as pressure, temperature, and hydrocarbon composition. Thus, any static assignment of gridblocks may lead to dynamic load imbalance in unpredictable manners. A dynamic load balancer can often provide solutions for this difficulty. However, traditional techniques are inflexible and tedious to implement in legacy reservoir simulators. In this paper, we present a new approach to address dynamic load imbalance in parallel compositional simulation. It overdecomposes the reservoir model to assign each processor a bundle of subdomains. Processors treat these bundles of subdomains as virtual processes or user-level migratable threads which can be dynamically migrated across processors in the run-time system. This technique is shown to be capable of achieving better overlap between computation and communication for cache efficiency. We use this approach in a legacy reservoir simulator and demonstrate a reduction in the execution time of parallel compositional simulations while requiring minimal changes to the source code. Finally, it is shown that domain overdecomposition, together with a load balancer, can improve speedup from 29.27 to 62.38 on 64 physical processors for a realistic simulation problem.
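The idea of overdecomposition can be illustrated with a small sketch. It is not the authors' implementation or the run-time system used in the paper: the function names, the toy cost values, and the greedy heaviest-first placement rule are all assumptions chosen for illustration. Each processor starts with a contiguous bundle of subdomains (a static placement), and a simple balancer then reassigns subdomains using their measured costs, which here stand in for per-subdomain flash-calculation time. The last helper restates the reported speedups as parallel efficiency (speedup divided by processor count), which works out to roughly 46% before and 97% after on 64 processors.

```python
import heapq


def block_assignment(costs, n_procs):
    """Static placement: consecutive subdomains stay on the same processor."""
    per_proc = max(1, len(costs) // n_procs)  # overdecomposition factor
    loads = [0.0] * n_procs
    for i, cost in enumerate(costs):
        loads[min(i // per_proc, n_procs - 1)] += cost
    return loads


def greedy_rebalance(costs, n_procs):
    """Hypothetical balancer: heaviest subdomain first, onto the least-loaded processor."""
    loads = [0.0] * n_procs
    heap = [(0.0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    for cost in sorted(costs, reverse=True):
        load, p = heapq.heappop(heap)
        loads[p] = load + cost
        heapq.heappush(heap, (loads[p], p))
    return loads


def imbalance(loads):
    """Ratio of the busiest processor's load to the mean load (1.0 is perfect)."""
    return max(loads) / (sum(loads) / len(loads))


def parallel_efficiency(speedup, n_procs):
    """Speedup divided by the number of physical processors."""
    return speedup / n_procs


if __name__ == "__main__":
    # Toy data (assumed): 16 subdomains on 4 processors, with the first four
    # subdomains costing 8x more, e.g. a region with expensive flash calculations.
    costs = [8.0] * 4 + [1.0] * 12
    print("static imbalance: ", imbalance(block_assignment(costs, 4)))
    print("greedy imbalance: ", imbalance(greedy_rebalance(costs, 4)))
    print("efficiency before:", parallel_efficiency(29.27, 64))
    print("efficiency after: ", parallel_efficiency(62.38, 64))
```

With only one subdomain per processor there would be nothing to move; the finer granularity created by overdecomposition is what gives a balancer room to even out the loads.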
|File Size||1 MB||Number of Pages||12|