On a simplified approach to achieve parallel performance and portability across CPU and GPU architectures