Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
Parallelization and Performance Optimization on Face Detection Algorithm with OpenCL: A Case Study
Author: Wang Weiyan ; Zhang Yunquan ; Yan Shengen ; Zhang Ying ; Jia Haipeng
Keyword: Algorithms ; Optimization
Source: Tsinghua Science and Technology
Issued Date: 2012
Volume: 17, Issue: 3, Pages: 287-295
Indexed Type: EI
Department: (1) Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; (2) State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; (3) Graduate University, Chinese Academy of Sciences, Beijing 100190, China; (4) Ocean University of China, Qingdao 2661, China
Abstract: Face detection applications inherently require real-time performance. Although the Viola-Jones algorithm handles this elegantly, today's ever larger high-quality images and videos pose a new challenge to meeting real-time requirements. Parallelizing the Viola-Jones algorithm with OpenCL is a good way to achieve high performance across both AMD and NVIDIA GPU platforms without introducing new algorithms. This paper identifies the bottleneck of this application and discusses how to optimize face detection step by step, starting from a very naive implementation. Techniques such as hiding CPU execution time, subtle use of local memory as a high-speed scratchpad and manual cache, and variable granularity were used to improve performance. These techniques result in a 4-13 times speedup, varying with image size. Furthermore, these ideas may shed some light on how to parallelize applications efficiently with OpenCL. Taking face detection as an example, this paper also summarizes some general advice on how to optimize OpenCL programs, to help other applications perform better on GPUs. © 2012 Tsinghua University Press.
Language: English
Content Type: Journal article
URI: http://ir.iscas.ac.cn/handle/311060/15016
Appears in Collections: Institute of Software Library, Journal Articles

Files in This Item:

There are no files associated with this item.


Recommended Citation:
Wang Weiyan, Zhang Yunquan, Yan Shengen, et al. Parallelization and performance optimization on face detection algorithm with OpenCL: a case study[J]. Tsinghua Science and Technology, 2012-01-01, 17(3): 287-295.
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 

Copyright © 2007-2020 Institute of Software, Chinese Academy of Sciences