Design an Approach for Finding the Similarity between the Documents

Ms. Shilpa Satone, Prof. Jayant Adhikari, Prof. Jayant Rohankar


Published: Jun 30, 2016


Abstract

Nowadays, data management is an important issue. The volume of data in the cloud is very large, and web users need tools to manage information easily. Doing this manually is cumbersome and time-consuming because there are many near-duplicate results. Efficient detection of near-duplicate articles is therefore important in many applications that draw on a large amount of data for a specific task. We introduce an algorithm that extracts key phrases and matches signatures to detect near-duplicate articles. It is based on an N-gram (bigram and trigram) algorithm for key-phrase extraction and Jaccard similarity for measuring the similarity between documents. The algorithms are applied to articles and text documents, and the results show that the proposed methods are more effective than other existing methods.
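
The approach outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how key phrases are selected or how signatures are built, so the sketch assumes that a document's signature is simply the set of all its word bigrams and trigrams. The function names (tokenize, ngrams, signature, jaccard, similarity) and the regular-expression tokenizer are hypothetical choices made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return the set of word n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def signature(text, sizes=(2, 3)):
    """Build a document signature from its bigrams and trigrams.

    Assumption: every bigram and trigram counts as a key phrase; the
    paper may filter or rank phrases before matching.
    """
    tokens = tokenize(text)
    sig = set()
    for n in sizes:
        sig |= ngrams(tokens, n)
    return sig


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        # Convention chosen here: two empty signatures count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def similarity(doc_a, doc_b):
    """Jaccard similarity between the n-gram signatures of two documents."""
    return jaccard(signature(doc_a), signature(doc_b))


if __name__ == "__main__":
    first = "The quick brown fox jumps"
    second = "The quick brown fox sleeps"
    print(similarity(first, second))
```

Under these assumptions, two documents would be flagged as near-duplicates when their score exceeds a chosen threshold. The threshold is a tuning parameter that the abstract does not specify.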

How to Cite

Satone, S., Adhikari, J., & Rohankar, J. (2016). Design an Approach for Finding the Similarity between the Documents. International Journal on Future Revolution in Computer Science & Communication Engineering, 2(6), 01–03. Retrieved from http://www.ijfrcsce.org/index.php/ijfrcsce/article/view/33

Issue

Vol. 2 No. 6 (2016): June (2016) Issue

Section

Articles


