Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
Design and Implementation of a Deep-Learning-Based Patient Similarity Measurement Tool (基于深度学习的病人相似性度量工具的设计与实现)
Author: 倪嘉志
Issued Date: 2017-04
Supervisor: 叶丹
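Major: Computer Software and Theory
Degree Grantor: University of Chinese Academy of Sciences
Place of Degree Grantor: Beijing
Degree Level: Master's
Keyword: patient similarity; deep learning; metric learning; transfer learning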
Abstract:

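As healthcare services are increasingly digitized and medical data accumulates, medical artificial intelligence has become a research hotspot in the medical field. Deriving clinically meaningful similarity measures between patients from outpatient, inpatient, medication, and health data is an important technique in clinical decision support and patient cohort identification research. Traditional methods measure similarity through keyword retrieval, SQL queries, and similar means, and cannot make effective use of the large amount of latent medical knowledge in electronic health records; patient similarity measurement based on deep learning can serve as a complement.

This thesis studies the problem of measuring patient similarity. How to effectively extract features from patient health data, how to fuse the raw features, how to obtain supervision information for patient similarity, and how to use deep learning to measure the similarity between patients in a sound way are all key technical problems. Existing work has proposed supervised distance metric learning and feedback learning with expert interaction to address these problems, but real medical settings usually present the following difficulties: (1) traditional supervised distance metric learning cannot characterize patient similarity at the medical-semantic level through nonlinear transformations; (2) the number of patient samples in a specific disease domain is very limited, so traditional metric learning methods cannot effectively learn a metric for those patients; (3) supervision information is very hard to obtain: the medical field covers a great many diseases, and relying on experts to provide supervision information in every disease domain is not feasible.

To address these problems, this thesis studies the following key techniques: (1) it designs a patient feature profile model based on electronic health records and gives mapping rules for static features, discrete numerical features, and continuous numerical features; (2) it proposes a deep-learning-based patient similarity algorithm that uses patients' diagnosis data as supervision information, exploits the strong feature representation ability of deep learning to map patients into a nonlinear embedding space, and optimizes the objective function in that space to characterize the similarity between patients more accurately; (3) it proposes a transfer-learning-based algorithm for transferring knowledge between disease domains, which uses knowledge from a source disease domain to select a suitable metric in the target disease domain and thereby characterizes the similarity between patients there.

Building on this research, the thesis applies the patient similarity measurement tool to a medical record retrieval system and designs and implements it. The similarity algorithms are evaluated using multi-label classification. Experiments show that, compared with the original algorithm, the proposed deep learning algorithm improves accuracy by 8% and the transfer learning algorithm improves accuracy by 12.3%, and both algorithms perform well in terms of stability.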

English Abstract:

With the development of health informatization and the accumulation of health data, medical artificial intelligence has become a research hotspot in the medical field. Patient similarity assessment, which derives clinically meaningful similarity measures from outpatient, inpatient, medication, and health data, is an important technique for clinical decision support applications and patient cohort identification studies. Traditional approaches rely on keyword search or SQL queries, which cannot make effective use of the implicit knowledge in EHR (Electronic Health Record) data; patient similarity measurement based on deep learning can complement them.

This thesis studies patient similarity measurement. The key technical problems are: How can features be effectively extracted from raw health data and fused? How can supervision information for patient similarity be obtained? How can a proper patient similarity measure be designed using deep learning? Existing work addresses these problems with supervised distance metric learning and expert interaction. However, current work on patient similarity has several shortcomings: (1) Traditional supervised distance metric learning has a high computational cost and cannot describe patient similarity at the medical semantic level. (2) Traditional metric learning methods cannot learn a good metric because the number of cases in certain disease areas is limited. (3) The wide variety of diseases in the medical field makes supervision information difficult to obtain, so relying on physicians to provide supervision information in every disease area is unrealistic.

To address these problems, this thesis focuses on the following techniques: (1) We design a patient representation model based on structured and unstructured data and give mapping rules from the original data to feature vectors. (2) We propose a patient similarity algorithm based on deep learning, which maps each patient into a nonlinear embedding space and optimizes the objective function in that space. (3) We propose a transfer-learning-based algorithm for transferring knowledge between disease domains. The algorithm uses known source-domain knowledge to select an appropriate metric in the target disease domain and then uses that metric to describe patient similarity in the target domain.
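The idea in point (2) can be illustrated with a minimal sketch. The abstract does not specify the network architecture or the objective function, so everything below is an assumption made for illustration: a single tanh layer stands in for the deep network, the standard contrastive loss stands in for the objective, and two patients are treated as similar when they share at least one diagnosis code. The function names and the example codes (`I10`, `E11`, `J45`) are hypothetical.

```python
import math
import random


def embed(x, W, b):
    """Map a feature vector into the embedding space.
    One tanh layer is a stand-in for the deep network (assumption)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def euclidean(u, v):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(u, v)))


def is_similar(dx_a, dx_b):
    """Assumed supervision rule: a shared diagnosis code means 'similar'."""
    return bool(set(dx_a) & set(dx_b))


def contrastive_loss(d, similar, margin=1.0):
    """Standard contrastive loss (an assumption, not the thesis objective):
    similar pairs are pulled together, dissimilar pairs are pushed apart
    until they are at least `margin` away from each other."""
    if similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def pair_loss(xa, xb, dx_a, dx_b, W, b, margin=1.0):
    """Loss for one patient pair, computed in the embedding space."""
    d = euclidean(embed(xa, W, b), embed(xb, W, b))
    return contrastive_loss(d, is_similar(dx_a, dx_b), margin)


# Toy usage: 3 input features mapped to a 2-dimensional embedding.
rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
b = [0.0, 0.0]
patient_a = ([0.2, 0.7, 0.1], ["I10", "E11"])
patient_b = ([0.3, 0.6, 0.0], ["I10"])
patient_c = ([0.9, 0.1, 0.8], ["J45"])
print(pair_loss(patient_a[0], patient_b[0], patient_a[1], patient_b[1], W, b))
print(pair_loss(patient_a[0], patient_c[0], patient_a[1], patient_c[1], W, b))
```

In a real system the weights would be trained by minimizing the summed pair loss with gradient descent; the sketch only shows how diagnosis-derived labels and an embedding-space distance combine into one objective.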

Building on the research into these key technologies, this thesis applies the patient similarity measurement tool to a medical record retrieval system and designs and implements it. Multi-label classification is used to evaluate the similarity algorithms. The experimental results show that, compared with the original algorithm, the deep learning algorithm improves accuracy by 8% and the transfer learning algorithm improves accuracy by 12.3%, and both perform well in terms of stability.
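One plausible way to evaluate a similarity measure through multi-label classification is sketched below. The abstract does not describe the evaluation protocol, so the k-nearest-neighbour voting scheme, the Jaccard-style accuracy, and all names and toy data are assumptions.

```python
import math


def knn_predict_labels(query, train, distance, k=3, threshold=0.5):
    """Predict a label set for `query` by voting among its k nearest
    training patients under `distance`. A label is predicted when at
    least `threshold` of the neighbours carry it."""
    neighbors = sorted(train, key=lambda item: distance(query, item[0]))[:k]
    counts = {}
    for _, labels in neighbors:
        for label in labels:
            counts[label] = counts.get(label, 0) + 1
    return {label for label, c in counts.items()
            if c / len(neighbors) >= threshold}


def jaccard_accuracy(true_sets, pred_sets):
    """Example-based multi-label accuracy: the mean of
    |true AND pred| / |true OR pred| over all patients."""
    scores = []
    for t, p in zip(true_sets, pred_sets):
        union = t | p
        scores.append(len(t & p) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Toy data: (embedding, diagnosis label set) pairs, purely illustrative.
TRAIN = [
    ((0.0, 0.0), {"A"}),
    ((0.1, 0.0), {"A", "B"}),
    ((5.0, 5.0), {"C"}),
    ((5.1, 5.0), {"C"}),
]
TEST = [((0.0, 0.1), {"A"}), ((5.0, 5.1), {"C"})]

# math.dist is the Euclidean distance (Python 3.8+); a learned metric
# would be passed in its place.
preds = [knn_predict_labels(x, TRAIN, math.dist, k=2) for x, _ in TEST]
print(jaccard_accuracy([labels for _, labels in TEST], preds))
```

Under this setup a better similarity measure places patients with overlapping diagnoses closer together, which raises the neighbour-vote accuracy, so the score can be used to compare metrics.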

Content Type: Thesis
URI: http://ir.iscas.ac.cn/handle/311060/18902
Appears in Collections: Technology Center of Software Engineering (软件工程技术研究开发中心), Theses

Files in This Item:
File Name / File Size: 毕业论文-2014级硕士-倪嘉志-v40-毕业提交版本.pdf (3683KB)
Content Type: Thesis
Version: --
Access: Restricted (full text available on request)

description.institution: Institute of Software, Chinese Academy of Sciences

Recommended Citation:
倪嘉志. 基于深度学习的病人相似性度量工具的设计与实现[D]. 北京. 中国科学院大学. 2017-04-01.

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
