Decentralized On-line Task Reallocation on Parallel Computing Architectures with Safety-Critical Applications
This work presents a decentralized allocation algorithm of safety-critical application on parallel computing architectures, where individual Computational Units can be affected by faults. The described method consists in representing the architecture by an abstract graph where each node represents a Computational Unit. Applications are also represented by the graph of Computational Units they require for execution. The problem is then to decide how to allocate Computational Units to applications to guarantee execution of the safety-critical application. The problem is formulated as an optimization problem, with the form of an Integer Linear Program. A state-of-the-art solver is then used to solve the problem. Decentralizing the allocation process is achieved through redundancy of the allocator executed on the architecture. No centralized element decides on the allocation of the entire architecture, thus improving the reliability of the system. Experimental reproduction of a multi-core architecture is also presented. It is used to demonstrate the capabilities of the proposed allocation process to maintain the operation of a physical system in a decentralized way while individual component fails.
READ FULL TEXT