TACC Ranger Tackles Facial Recognition Scaling to 32,768 Cores...So Far
Aaron Dubrow reports on the work of Rob Farber and Harold Trease, both with the Pacific Northwest National Laboratory (PNNL), who are using Ranger at the Texas Advanced Computing Center (TACC) to demonstrate that searchable databases can be built from massive amounts of image data using image recognition rather than text tagging.
“We want to take camera pictures, individual frames, or moving videos from webcams or YouTube, that don’t have any special annotations, and ask the question: ‘Have we seen this person’s face before?’” Farber explained. “That’s a huge data-volume, low information-content kind of video stream and it’s completely unstructured.”
Because Farber intends to use data sets thousands of times larger, he needs to get the maximum performance out of the massively parallel supercomputers he uses. He does this by using compiler intrinsic operations that give direct access to the processor’s SSE assembly instructions, enabling him to coax four flops per clock cycle, the theoretical peak, from each AMD Opteron Barcelona core on the floating-point-intensive part of his code, while scaling in a near-linear fashion up to 32,768 cores. His approach involves optimizing his code for Ranger's architecture and minimizing communication among nodes.
“It takes people who are cognizant of both the algorithms plus the runtime and communications behavior of their algorithms, to scale successfully on massively parallel systems,” Farber said. “For Ranger to get to four flops per clock, I had to rewrite some of the code to use the compiler’s SSE intrinsic operations - basically using the assembly language instructions. That really lit Ranger on fire.”
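As a rough illustration of what "using the compiler's SSE intrinsic operations" can look like, here is a minimal C sketch of a double-precision dot product written with SSE2 intrinsics. It is not Farber's code, and the function name dot_sse2 is hypothetical. The point is only that each intrinsic operates on two doubles at once, so a packed multiply paired with a packed add accounts for four floating-point operations per loop step, which is the kind of instruction mix that the four-flops-per-clock figure implies for double precision.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_mul_pd, _mm_add_pd */

/* Hypothetical helper (an assumption for illustration, not from the article):
 * dot product of two double arrays using packed SSE2 arithmetic. */
double dot_sse2(const double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL);

    __m128d acc = _mm_setzero_pd();   /* two running partial sums */
    size_t i = 0;

    /* Main loop: two elements per iteration.
     * One packed multiply plus one packed add = 4 flops per step. */
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);   /* unaligned load of a[i], a[i+1] */
        __m128d vb = _mm_loadu_pd(b + i);   /* unaligned load of b[i], b[i+1] */
        acc = _mm_add_pd(acc, _mm_mul_pd(va, vb));
    }

    /* Combine the two lanes into a single scalar. */
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    double sum = lanes[0] + lanes[1];

    /* Scalar tail for an odd element count. */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling dot_sse2(a, b, n) on two arrays of length n returns the same value a plain scalar loop would for small exact inputs. Actually sustaining peak throughput on a real kernel also depends on data alignment, loop unrolling, and keeping operands in cache, none of which this sketch attempts.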
Using a test set of pictures under varying conditions and with known identities, Farber and Trease correctly identified individuals with 99.8% accuracy, matching 1,998 out of 2,000 faces, Dubrow writes. In addition, Farber was able to maximize on-node performance on Ranger with near-linear scaling to achieve some of the best performance currently seen on high-performance computers, Dubrow adds.
As to how this level of performance might be applied, Dubrow notes that cancer scans, security surveillance, and satellite imaging may all be improved through these real-time image-detection methods and algorithms. In addition, Farber’s approach to mapping this problem onto Ranger has worked extremely well for adapting other computational problems to massively parallel machines.
The resident HPC specialists at TACC, always keen to get the most out of Ranger, are very interested in Farber's performance successes. "What I find fascinating is how Farber’s code is able to get the maximum speed and performance out of Ranger and apply it to something interesting and socially relevant," Lars Koesterke, TACC research associate, said.
According to Dubrow, Farber's research has the potential to open up video to indexing, much as search engines like Google have opened up text for searching, at a time when ever-expanding amounts of data are available for analysis, from YouTube videos to deep-space scans to seismic sensor data.