Applied Sciences, Vol. 10, Issue 12 (2020)

A Polarity Capturing Sphere for Word to Vector Representation

Sandra Rizkallah, Amir F. Atiya and Samir Shaheen

Abstract

Embedding the words of a dictionary as vectors in a space has become an active research field because of its many uses in natural language processing applications. Distances between the vectors should reflect the relatedness between the corresponding words. The problem with existing word embedding methods is that they often fail to distinguish between synonymous, antonymous, and unrelated word pairs. Meanwhile, polarity detection is crucial for applications such as sentiment analysis. In this work we propose an embedding approach that is designed to capture polarity. The approach embeds the word vectors on a sphere, whereby the dot product between any two vectors represents the similarity between the corresponding words. Vectors corresponding to synonymous words lie close to each other on the sphere, while a word and its antonym lie at opposite poles of the sphere. The vectors are designed using a simple relaxation algorithm. The proposed word embedding is successful in distinguishing between synonyms, antonyms, and unrelated word pairs. It achieves results that are better than those of some of the state-of-the-art techniques and competes well with the others.
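
The abstract does not spell out the relaxation update, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes unit-length vectors, a target dot product of +1 for synonym pairs and -1 for antonym pairs, and a plain gradient step followed by renormalization onto the sphere. The function name `relax_on_sphere` and all of its parameters are hypothetical, and unrelated pairs (which would presumably get a target near 0) are left out for brevity.

```python
import numpy as np

def relax_on_sphere(n_words, synonyms, antonyms, dim=8, steps=500, lr=0.1, seed=0):
    """Toy relaxation on the unit sphere (illustrative assumption, not the paper's method).

    synonyms / antonyms are lists of (i, j) word-index pairs.
    Synonym pairs are pulled toward dot product +1, antonym pairs toward -1.
    """
    rng = np.random.default_rng(seed)
    # Start from random points on the unit sphere.
    V = rng.normal(size=(n_words, dim))
    V /= np.linalg.norm(V, axis=1, keepdims=True)

    # Attach the assumed target dot product to every pair.
    pairs = [(i, j, 1.0) for i, j in synonyms] + [(i, j, -1.0) for i, j in antonyms]

    for _ in range(steps):
        for i, j, target in pairs:
            # Error between the desired and the current similarity.
            err = target - V[i] @ V[j]
            vi, vj = V[i].copy(), V[j].copy()
            # Gradient step on 0.5 * err**2 for both vectors.
            V[i] += lr * err * vj
            V[j] += lr * err * vi
            # Project both vectors back onto the unit sphere.
            V[i] /= np.linalg.norm(V[i])
            V[j] /= np.linalg.norm(V[j])
    return V
```

For instance, with four hypothetical words indexed 0 to 3, calling `relax_on_sphere(4, synonyms=[(0, 1), (2, 3)], antonyms=[(0, 2), (1, 3)])` returns a 4 x 8 array whose rows can be compared with `V[i] @ V[j]`: values near +1 suggest synonyms and values near -1 suggest antonyms.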
