Failure Detection in HP Clusters and Computers Using Chaotic-Instruction Maps

The invention relates to high-performance computing systems, and more specifically to software that can predict hardware faults. The software system diagnoses a wide range of hardware faults in such systems, including logic-circuit failures in arithmetic logic units (ALUs), element errors in memory units, and transmission and circuit errors in interconnects. It is suited to diagnosing large computing clusters, server farms, and supercomputers that employ thousands or more processor cores, memory units, and interconnect links.

Computer Science and Mathematics Division
Oak Ridge National Laboratory
