Paper Title
A Method of Similarity Measure with Different Domains' Documents for Semantic Classification using Word2vec and Jaccard Index
Abstract
Lifestyles have recently been changing because of the ‘fourth industrial revolution’, and jobs are changing rapidly as a result.
Many developed countries have defined national competency standards and train the professional technical personnel that
companies require, starting while they are still students. Typical examples are the United Kingdom’s National Occupational
Standards (NOS) and Germany’s ‘Dual System’. In Korea, although the government developed the National Competency
Standards (NCS) and a job dictionary, each uses its own independent vocabulary, and university curricula were designed
without regard to them. As a result, students who find a job lack practical business knowledge, and companies must
provide them with additional training. To reduce these costs, the job dictionary, the NCS, and university curricula need
to be connected. To this end, we propose a matching algorithm that uses Word2vec and the Jaccard index to connect documents
from different domains. To evaluate accuracy, we compare the proposed method with the Doc2vec algorithm and an existing
Word2vec-based document similarity algorithm. The accuracy of the proposed algorithm was about 17.52% higher than that of
the existing Word2vec algorithm and about 23.85% higher than that of Doc2vec.
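The abstract does not spell out how Word2vec and the Jaccard index are combined, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It shows the plain Jaccard index on token sets, and one hypothetical way to soften it with word vectors: a token also counts as shared when the other document contains a near-synonym, judged by cosine similarity. The function names, the `threshold` value, and the two-dimensional toy vectors are all illustrative assumptions; in practice the vectors would come from a trained Word2vec model.

```python
import math


def jaccard(a, b):
    """Plain Jaccard index |A ∩ B| / |A ∪ B| of two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def soft_jaccard(a, b, vectors, threshold=0.8):
    """Hypothetical embedding-softened Jaccard (an assumption, not the paper's method).

    A token in the union counts as matched if the other document contains
    the same token or a token whose vector has cosine >= threshold.
    With no vectors available this reduces to the plain Jaccard index.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0

    def has_match(token, other):
        if token in other:
            return True
        if token not in vectors:
            return False
        return any(
            o in vectors and cosine(vectors[token], vectors[o]) >= threshold
            for o in other
        )

    matched = {t for t in a if has_match(t, b)} | {t for t in b if has_match(t, a)}
    return len(matched) / len(a | b)


# Toy data: the vectors are made up for illustration only.
job = ["software", "developer", "coding"]
course = ["software", "programmer", "coding"]
toy_vectors = {
    "software": [0.0, 1.0],
    "developer": [1.0, 0.1],
    "programmer": [0.9, 0.2],
    "coding": [0.5, 0.5],
}

print(jaccard(job, course))
print(soft_jaccard(job, course, toy_vectors))
```

On this toy data the plain index penalizes the "developer" / "programmer" mismatch, while the softened variant treats the pair as a match because their toy vectors are nearly parallel. That is the kind of cross-domain vocabulary gap the abstract describes between the job dictionary, the NCS, and curricula.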
Keywords - Word2vec, Doc2vec, Document Similarity, National Competency Standard, Jaccard Index
Author - Sung-En Kim, Gui-Hyun Baek, Kee-Hong Ahn, Su-Kyoung Kim
Published: Volume-5, Issue-9 (Sep, 2018)
DOIONLINE Number - IJAECS-IRAJ-DOIONLINE-13680
Published on 2018-12-03