Computer Engineering and Applications (计算机工程与应用), 2016, 52(18): 222
Hybrid plagiarism detection method in program code based on multiple techniques
YANG Chao (杨超)
Basic Teaching and Experimental Department, Hefei University (合肥学院基础教学与实验中心), Hefei 230601, China
YANG Chao. Hybrid plagiarism detection method in program code based on multiple techniques. Computer
Engineering and Applications, 2016, 52(18): 222-227.
Abstract: Based on an analysis of the characteristics and drawbacks of existing plagiarism detection systems for
program code, a hybrid plagiarism detection method combining text analysis, structure metrics, and attribute
counting is proposed. First, document fingerprinting and the Winnowing algorithm are used to compute text
similarity. Second, the program code is translated into a Dynamic Control Structure (DCS) tree, and the Winnowing
algorithm is applied to estimate the similarity of the DCS trees, which serves as the structural similarity. Then,
information about each variable in the code is collected and counted, and a variable similarity algorithm is
applied to the variable information nodes to obtain the variable similarity. Finally, the text similarity,
structural similarity, and variable similarity are each assigned a weight and combined into the total code
similarity. The experimental results show that the proposed method can effectively detect all kinds of plagiarism.
Across different threshold values, the accuracy and recall of the proposed method are higher than those of the
JPLAG system. In particular, for program code with a simple structure, the average accuracy of the proposed method
and the JPLAG system is 82.5% and 69.5%, respectively, which shows that the proposed method is more effective.
Key words: plagiarism detection; similarity; Winnowing algorithm; structure metrics; attribute counting
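As a rough illustration of the text-similarity step and the final weighting described in the abstract, the Python sketch below selects Winnowing fingerprints from k-gram hashes, compares two programs by their fingerprint sets, and combines three component scores with weights. It is a minimal sketch under stated assumptions, not the paper's implementation: the k-gram length `k`, the window size `w`, the MD5-based hash, the Jaccard comparison, and the weights `(0.4, 0.4, 0.2)` are all illustrative choices that the abstract does not specify. The structural (DCS tree) and variable similarities are not computed here and are simply passed in as numbers.

```python
from __future__ import annotations

import hashlib


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-gram of the text after removing whitespace and lowercasing."""
    cleaned = "".join(text.split()).lower()
    return [
        # Assumption: first 4 bytes of MD5 as a stable 32-bit hash.
        int.from_bytes(hashlib.md5(cleaned[i:i + k].encode("utf-8")).digest()[:4], "big")
        for i in range(len(cleaned) - k + 1)
    ]


def winnow(hashes: list[int], w: int) -> set[tuple[int, int]]:
    """Return (hash, position) fingerprints: the minimum hash of each window
    of w consecutive hashes, taking the rightmost occurrence on ties."""
    fingerprints: set[tuple[int, int]] = set()
    if not hashes:
        return fingerprints
    # A sequence shorter than w is treated as a single window.
    n_windows = max(len(hashes) - w + 1, 1)
    for start in range(n_windows):
        window = hashes[start:start + w]
        smallest = min(window)
        offset = len(window) - 1 - window[::-1].index(smallest)
        fingerprints.add((smallest, start + offset))
    return fingerprints


def text_similarity(a: str, b: str, k: int = 5, w: int = 4) -> float:
    """Jaccard overlap of fingerprint hash values (an assumed comparison rule)."""
    fa = {h for h, _ in winnow(kgram_hashes(a, k), w)}
    fb = {h for h, _ in winnow(kgram_hashes(b, k), w)}
    if not fa and not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


def total_similarity(text_sim: float, struct_sim: float, var_sim: float,
                     weights: tuple[float, float, float] = (0.4, 0.4, 0.2)) -> float:
    """Weighted sum of the three similarities; the weights are placeholders."""
    w_text, w_struct, w_var = weights
    return w_text * text_sim + w_struct * struct_sim + w_var * var_sim
```

Taking the rightmost minimum on ties follows the usual statement of Winnowing, and storing positions alongside hashes keeps repeated selections of the same k-gram from being recorded twice. A pair of submissions would then be flagged when `total_similarity` exceeds a chosen threshold, which is the quantity the abstract varies in its comparison with JPLAG.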
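Abstract: Based on an analysis of the characteristics and limitations of existing program code plagiarism detection
systems, a hybrid program plagiarism detection method combining text analysis, structure metrics, and attribute
counting is proposed. Document fingerprinting and the Winnowing algorithm are applied to compute the text
similarity of programs; the program code is represented as a Dynamic Control Structure tree (DCS), and the
Winnowing algorithm is used to compute the DCS tree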