Towards Cloud Computing with Sun Grid Engine 6.2 is a podcast in which Sun Product Line Manager Miha Ahronovitz interviews Sun Grid Architect Daniel Templeton. The result is a clarification of the scope and meaning of cloud computing and a demonstration of how easy it is to get started with Sun Grid Engine 6.2.
Templeton explains Sun Grid Engine, calling it a distributed resource management (DRM) system that matches workload to available resources. The result is utilization rates as high as 95% or more, better workload throughput, higher user productivity, and significantly shorter completion times for user tasks.
Sun Grid Engine can resolve resource conflicts consistent with business policy and has proven its worth in such fields as financial services, oil and gas, product development, biotechnology, visual effects and film production, automotive, manufacturing, research, and education. Users include entities ranging from the HPC installation at the Texas Advanced Computing Center to Australia's Rising Sun Pictures, where Harry Potter and the Half-Blood Prince and other major feature films are in production.
With typical user demands moving ever closer to the petaFLOP range, the ability of Sun Grid Engine 6.2 to scale to meet those demands becomes important. The multi-clustering feature of Service Domain Manager enables scaling horizontally through additional machines and laterally through additional clusters, so Sun Grid Engine can accommodate more than vertical scalability alone.
As the interview reveals, lateral scalability allows customers to solve a new class of problems that require individual clusters to work independently while continuing to maximize resource utilization by sharing resources among the clusters.
Templeton explains how Sun Grid Engine 6.2 managed to scale up to 63,000 cores, noting that this was accomplished by taking the scheduler component from being its own process to being a thread in the qmaster; streamlining the communications protocols; reducing the amount of data traffic generated by resource monitoring; overhauling the utility computing framework and the interactive and parallel job framework; parallelizing the way the qmaster handles incoming requests; reducing qmaster startup time; optimizing the scheduling algorithm; and applying tighter memory management. "The net result is unsurpassed scalability that lets our customers run a production compute cluster with sixty-three thousand cores under a single qmaster that processes parallel jobs with tens of thousands of tasks," he says.
Other notable features of Sun Grid Engine 6.2 include advance reservation, which allows users to secure needed resources for a fixed time period and to coordinate the availability of compute resources with external factors such as the availability of people, equipment, or facilities. In effect, this means optimal resource utilization.
Also mentioned are array task dependencies, which, combined with the job dependencies already offered by the Sun Grid Engine software, enable users to better parallelize their workloads to achieve quicker turnaround time and reduce time to market. Array task dependencies are an important feature for visual effects companies using the Sun Grid Engine software to handle the rendering of animated videos and visual effects.
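To see why array task dependencies can shorten turnaround, here is a toy model in Python. It is only a sketch under stated assumptions, not Sun Grid Engine's scheduler: the function names and the task durations are hypothetical, and it assumes enough free slots that every task starts the moment its dependency is satisfied. With a plain job dependency, every task of the second job waits for the whole first job; with an array task dependency, task i of the second job waits only for task i of the first.

```python
from typing import Sequence


def finish_with_job_dependency(stage_a: Sequence[float],
                               stage_b: Sequence[float]) -> float:
    """Completion time when every task of job B waits for ALL tasks of job A.

    Hypothetical helper for illustration; assumes unlimited free slots.
    """
    # Job B cannot start until the slowest task of job A has finished.
    a_done = max(stage_a)
    return a_done + max(stage_b)


def finish_with_array_task_dependency(stage_a: Sequence[float],
                                      stage_b: Sequence[float]) -> float:
    """Completion time when task i of job B waits only for task i of job A.

    Hypothetical helper for illustration; assumes unlimited free slots.
    """
    if len(stage_a) != len(stage_b):
        raise ValueError("both array jobs must have the same number of tasks")
    # Each pair (A[i], B[i]) runs as an independent chain.
    return max(a + b for a, b in zip(stage_a, stage_b))


# Made-up durations in minutes: render three frames, then composite each one.
render = [10, 30, 20]
composite = [25, 5, 15]

job_level = finish_with_job_dependency(render, composite)
task_level = finish_with_array_task_dependency(render, composite)
print(f"job dependency: {job_level} min, array task dependency: {task_level} min")
```

In this made-up case the per-task dependency lets the slow composite step of the first frame overlap with the slow render of the second frame, which is where the saving comes from.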
Another important feature is support for the Solaris Service Management Facility (SMF) and Sun Service Tags, both of which make for easier management of the Sun Grid Engine 6.2 software when running on machines installed with the Solaris Operating Environment. (Sun Service Tags are also available for Linux, the interview notes.)
More information is available at the Sun Grid Engine page, which covers Sun Grid Engine, the 6.2 release, and the Grid Engine community. Templeton and Ahronovitz also encourage listeners to check out the links to the Sun Grid Engine blogs, which include blogs from members of the Sun Grid Engine product team, including the product engineers. There, you can find even more information about what you'll get in the new 6.2 release and why you'll want to upgrade.