
dc.contributor.author Sorokina, Daria
dc.contributor.author Gehrke, Johannes
dc.contributor.author Warner, Simeon
dc.contributor.author Ginsparg, Paul
dc.date.accessioned 2019-10-24T04:37:59Z
dc.date.available 2019-10-24T04:37:59Z
dc.date.created 2017-01-05 01:05
dc.date.issued 2007-04-04
dc.identifier oai:ecommons.cornell.edu:1813/5743
dc.identifier http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cis/TR2006-2046
dc.identifier http://hdl.handle.net/1813/5743
dc.identifier.uri http://hdl.handle.net/20.500.12424/849038
dc.description.abstract We describe a large-scale application of methods for finding plagiarism and self-plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14 year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger.
dc.format.medium 175570 bytes
dc.language en_US
dc.language.iso eng
dc.publisher Cornell University
dc.subject computer science
dc.subject technical report
dc.title Plagiarism Detection in arXiv
dc.type Technical Report
ge.collectioncode OAIDATA
ge.dataimportlabel OAI metadata object
ge.identifier.legacy globethics:10446847
ge.identifier.permalink https://www.globethics.net/gel/10446847
ge.lastmodificationdate 2017-01-05 01:05
ge.lastmodificationuser admin@pointsoftware.ch (import)
ge.submissions 0
ge.oai.exportid 148934
ge.oai.repositoryid 600
ge.oai.setname Computing and Information Science
ge.oai.setname Faculty of Computing and Information Science
ge.oai.setname Computing and Information Science Technical Reports
ge.oai.setspec com_1813_358
ge.oai.setspec com_1813_7730
ge.oai.setspec col_1813_5602
ge.oai.streamid 2
ge.setname GlobeEthicsLib
ge.setspec globeethicslib
ge.link http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cis/TR2006-2046
ge.link http://hdl.handle.net/1813/5743

