ISCAS OpenIR
A comparative study of absent features and unobserved values in software effort data
Zhang Wen; Yang Ye; Wang Qing
2012
Journal: International Journal of Software Engineering and Knowledge Engineering
ISSN: 0218-1940
Volume: 22; Issue: 2; Pages: 185-202
Abstract: Software effort data contains a large number of missing values for project attributes. The problem of absent features, which emerged recently in machine learning, is often neglected by software engineering researchers when handling missingness in software effort data. In essence, absent features (structural missingness) and unobserved values (unstructured missingness) are different cases of missingness, although they look the same in the data set. This paper attempts to clarify the root cause of missingness in software effort data. When regarding missingness as absent features, we develop Max-margin regression to predict the real effort of software projects. When regarding missingness as unobserved values, we use existing imputation techniques to impute the missing values; ε-SVR is then used to predict the real effort of software projects from the imputed data sets. Experiments on the ISBSG (International Software Benchmarking Standards Group) and CSBSG (Chinese Software Benchmarking Standard Group) data sets demonstrate that, for effort prediction, treating missingness in software effort data as unobserved values produces better performance than treating it as absent features. This paper is the first to introduce the concept of absent features to deal with missingness in software effort data. © 2012 World Scientific Publishing Company.
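The "unobserved values" treatment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: column-mean imputation stands in for the unspecified "existing imputation techniques", and the function names, the toy `projects` data, and the `epsilon` default are all hypothetical. The second function only shows the ε-insensitive loss that ε-SVR minimizes; it does not train a regressor.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption: a column with no observed values falls back to 0.0.
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss used by epsilon-SVR: errors no larger than epsilon cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical project attributes; None marks a missing value.
projects = [[10.0, None, 3.0],
            [20.0, 4.0, None],
            [None, 6.0, 5.0]]
completed = impute_column_means(projects)
```

Under the "absent features" view, by contrast, the missing entries would not be filled in at all, because the attribute is considered not to exist for that project.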
Indexed by: EI; SCI
Keywords: Forecasting; Software Engineering
Affiliation: (1) Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Subject area: Computer Science; Engineering
Funding: National Natural Science Foundation of China 60903050, 71101138; Beijing Natural Science Fund 4122087; Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry
Language: English
WOS accession number: WOS:000304829200003
Content type: Journal article
URI: http://ir.iscas.ac.cn/handle/311060/14914
Collection: Institute of Software, Chinese Academy of Sciences
Recommended citation
GB/T 7714: Zhang Wen, Yang Ye, Wang Qing. A comparative study of absent features and unobserved values in software effort data[J]. International Journal of Software Engineering and Knowledge Engineering, 2012, 22(2): 185-202.
APA: Zhang Wen, Yang Ye, & Wang Qing. (2012). A comparative study of absent features and unobserved values in software effort data. International Journal of Software Engineering and Knowledge Engineering, 22(2), 185-202.
MLA: Zhang Wen, et al. "A comparative study of absent features and unobserved values in software effort data." International Journal of Software Engineering and Knowledge Engineering 22.2 (2012): 185-202.