ISCAS OpenIR
A Comparative Study of Absent Features and Unobserved Values in Software Effort Data
Zhang Wen; Yang Ye; Wang Qing
2012
Source: International Journal of Software Engineering and Knowledge Engineering
ISSN: 0218-1940
Volume: 22 Issue: 2 Pages: 185-202
English Abstract: Software effort data contains a large number of missing values of project attributes. The problem of absent features, which emerged recently in machine learning, is often neglected by software engineering researchers when handling missingness in software effort data. In essence, absent features (structural missingness) and unobserved values (unstructured missingness) are different cases of missingness, although they appear identical in the data set. This paper attempts to clarify the root cause of missingness in software effort data. When regarding missingness as absent features, we develop Max-margin regression to predict the real effort of software projects. When regarding missingness as unobserved values, we use existing imputation techniques to impute the missing values. Then, ε-SVR is used to predict the real effort of software projects with the input data sets. Experiments on the ISBSG (International Software Benchmarking Standards Group) and CSBSG (Chinese Software Benchmarking Standard Group) data sets demonstrate that, for the task of effort prediction, treating missingness in software effort data as unobserved values produces better performance than treating it as absent features. This paper is the first to introduce the concept of absent features to deal with missingness in software effort data. © 2012 World Scientific Publishing Company.
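The two views of missingness contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' Max-margin regression or ε-SVR pipeline: the toy data, the weight vector, and the helper names `impute_column_means`, `restricted_score`, and `restricted_norm` are all hypothetical, and column-mean imputation stands in for the unspecified "existing imputation techniques". Under the unobserved-values view, a missing entry is assumed to have a true value and is filled in; under the absent-features view, the attribute is assumed not to exist for that project, so scores and norms are computed only over the features the project actually has.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy project data: None marks a missing attribute value.
projects = [
    [10.0, 2.0, None],
    [20.0, None, 3.0],
    [30.0, 4.0, 5.0],
]

def impute_column_means(rows):
    """Unobserved-values view: fill each None with the mean of the
    observed values in the same column."""
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]

def restricted_score(weights, row):
    """Absent-features view: linear score over present features only."""
    return sum(w * x for w, x in zip(weights, row) if x is not None)

def restricted_norm(weights, row):
    """Absent-features view: norm of the weight vector restricted to the
    features that exist for this row."""
    return sqrt(sum(w * w for w, x in zip(weights, row) if x is not None))

# Hypothetical fixed weights for illustration; nothing is learned here.
weights = [3.0, 4.0, 12.0]

imputed = impute_column_means(projects)
scores = [restricted_score(weights, r) for r in projects]
norms = [restricted_norm(weights, r) for r in projects]
```

In the imputation view every project ends up in the same full feature space, so any standard regressor can be trained on `imputed`. In the absent-features view each project keeps its own subspace, which is why the norm in `norms` differs from row to row.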
Indexed Type: EI; SCI
Keyword: Forecasting Software Engineering
Department: (1) Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Subject: Computer Science; Engineering
Sponsorship: National Natural Science Foundation of China 60903050, 71101138; Beijing Natural Science Fund 4122087; Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry
Language: English
WOS ID: WOS:000304829200003
Content Type: Journal article
URI: http://ir.iscas.ac.cn/handle/311060/14914
Collection: Institute of Software, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Zhang Wen, Yang Ye, Wang Qing. A comparative study of absent features and unobserved values in software effort data[J]. International Journal of Software Engineering and Knowledge Engineering, 2012, 22(2): 185-202.
APA: Zhang Wen, Yang Ye, & Wang Qing. (2012). A comparative study of absent features and unobserved values in software effort data. International Journal of Software Engineering and Knowledge Engineering, 22(2), 185-202.
MLA: Zhang Wen, et al. "A Comparative Study of Absent Features and Unobserved Values in Software Effort Data." International Journal of Software Engineering and Knowledge Engineering 22.2 (2012): 185-202.
Files in This Item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.