OpenCL is a general-purpose programming framework for heterogeneous computing platforms. However, because hardware architectures differ, how to achieve performance portability across platforms on top of functional portability remains an open problem. Most current research on algorithm optimization targets a single hardware platform, so the resulting code rarely runs efficiently on other platforms. This paper analyzes the differences between the underlying hardware architectures of GPUs and studies how different optimization methods affect performance on different GPU platforms, considering global memory access efficiency, full utilization of GPU compute resources, hardware resource constraints, and other aspects. On this basis, a Laplace image enhancement algorithm is implemented in OpenCL. Experimental results show that the optimized algorithm achieves speedups of 3.7 to 136.1 times, 56.7 times on average, on both AMD and NVIDIA GPUs (excluding data transfer time), and that the optimized kernel outperforms the CUDA version in the NVIDIA NPP library by 12.3% to 346.7%, 143.1% on average, which verifies the effectiveness and cross-platform capability of the optimization methods.