Future Internet, Vol. 13, No. 1 (2021)

Using Machine Learning for Web Page Classification in Search Engine Optimization

Goran Matošević, Jasminka Dobša and Dunja Mladenić

Abstract

This paper presents a novel approach that uses machine learning algorithms, grounded in experts' knowledge, to classify web pages into three predefined classes according to how closely their content follows search engine optimization (SEO) recommendations. Classifiers were built and trained to assign an unseen sample (web page) to one of the three predefined classes and to identify the important factors that affect the degree of page adjustment. The data in the training set were manually labeled by domain experts. The experimental results show that machine learning can be used to predict the degree of adjustment of web pages to the SEO recommendations: classifier accuracy ranges from 54.59% to 69.67%, which is higher than the baseline accuracy of assigning every sample to the majority class (48.83%). The practical significance of the proposed approach is that it provides the core for building software agents and expert systems that automatically detect web pages, or parts of web pages, that need improvement to comply with the SEO guidelines and could therefore gain higher rankings in search engines. The results also contribute to the field of detecting optimal values of the ranking factors that search engines use to rank web pages. The experiments suggest that the important factors to consider when preparing a web page are the page title, meta description, H1 tag (heading), and body text, which is in line with the findings of previous research. A further result of this research is a new data set of manually labeled web pages that can be used in further research.
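The comparison against a majority-class baseline mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the class names (`low`, `medium`, `high`), the toy labels, and the helper functions are hypothetical and chosen only to show how such a baseline is computed and compared with a classifier's accuracy.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical expert labels for ten pages (three classes, as in the abstract).
y_true = ["low", "medium", "medium", "high", "medium",
          "low", "medium", "high", "medium", "low"]
# Hypothetical predictions from some trained classifier.
y_pred = ["low", "medium", "medium", "medium", "medium",
          "low", "high", "high", "medium", "medium"]

baseline = majority_baseline_accuracy(y_true)
model_acc = accuracy(y_true, y_pred)
print(f"baseline={baseline:.2%} model={model_acc:.2%}")
```

A classifier is only informative if its accuracy exceeds this baseline, which is the comparison the abstract reports (54.59% to 69.67% against 48.83%).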
