Institutional Repository
| A Comparative Study of Absent Features and Unobserved Values in Software Effort Data | |
| Zhang Wen; Yang Ye; Wang Qing | |
| 2012 | |
| Source | International Journal of Software Engineering and Knowledge Engineering |
| ISSN | 0218-1940 |
| Volume | 22; Issue: 2; Pages: 185-202 |
| English Abstract | Software effort data contains a large number of missing values for project attributes. The problem of absent features, which emerged recently in machine learning, is often neglected by software engineering researchers when handling missingness in software effort data. In essence, absent features (structural missingness) and unobserved values (unstructured missingness) are different cases of missingness, although they appear identical in the data set. This paper attempts to clarify the root cause of missingness in software effort data. When regarding missingness as absent features, we develop Max-margin regression to predict the real effort of software projects. When regarding missingness as unobserved values, we use existing imputation techniques to impute the missing values. Then, ε-SVR is used to predict the real effort of software projects with the resulting input data sets. Experiments on the ISBSG (International Software Benchmarking Standard Group) and CSBSG (Chinese Software Benchmarking Standard Group) data sets demonstrate that, for the task of effort prediction, treating missingness in software effort data as unobserved values produces better performance than treating it as absent features. This paper is the first to introduce the concept of absent features to deal with missingness in software effort data. © 2012 World Scientific Publishing Company. |
| Indexed Type | EI ; SCI |
| Keyword | Forecasting ; Software Engineering |
| Department | (1) Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China |
| Subject | Computer Science ; Engineering |
| Sponsorship | National Natural Science Foundation of China 60903050, 71101138; Beijing Natural Science Fund 4122087; Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry |
| Language | English |
| WOS ID | WOS:000304829200003 |
| Content Type | Journal article |
| URI | http://ir.iscas.ac.cn/handle/311060/14914 |
| Collection | Institute of Software, Chinese Academy of Sciences |
| Recommended Citation GB/T 7714 | Zhang Wen, Yang Ye, Wang Qing. A comparative study of absent features and unobserved values in software effort data[J]. International Journal of Software Engineering and Knowledge Engineering, 2012, 22(2): 185-202. |
| APA | Zhang Wen, Yang Ye, & Wang Qing. (2012). A comparative study of absent features and unobserved values in software effort data. International Journal of Software Engineering and Knowledge Engineering, 22(2), 185-202. |
| MLA | Zhang Wen, et al. "A comparative study of absent features and unobserved values in software effort data." International Journal of Software Engineering and Knowledge Engineering 22.2 (2012): 185-202. |
| Files in This Item: | There are no files associated with this item. |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.