© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

In natural language processing, word sense disambiguation (WSD) remains a major challenge, especially for low-resource languages, where linguistic variation and scarce data make model training and evaluation more difficult. This systematic review and meta-analysis summarizes the body of knowledge on WSD techniques for low-resource languages, emphasizing the advantages and disadvantages of different strategies. A thorough search of several databases produced articles assessing WSD methods in low-resource languages. Effect sizes and performance measures were extracted from a subset of these studies. Pooled effect estimates were computed by meta-analysis, and heterogeneity among studies was evaluated. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were used to develop the process for choosing the relevant papers for extraction. The meta-analysis included 32 studies, encompassing a range of WSD methods and low-resourced languages. The overall pooled effect size indicated moderate effectiveness of WSD techniques. Heterogeneity among studies was high, with an I² value of 82.29%, suggesting substantial variability in WSD performance across studies. The tau-squared (τ²) value of 5.819 further reflects the extent of between-study variance. This variability underscores the challenges in generalizing findings and highlights the influence of diverse factors such as language-specific characteristics, dataset quality, and methodological differences. The p-values from the meta-regression (0.454) and the meta-analysis (0.440) suggest that the variability in WSD performance is not statistically significantly associated with the investigated moderators, indicating that the performance differences may be influenced by factors not fully captured in the current analysis. The absence of significant p-values raises the possibility that the problems presented by low-resource situations are not yet well addressed by the models and techniques in use.
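
The heterogeneity statistics reported above (pooled effect, I², τ²) can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird random-effects estimator, a common choice, although the abstract does not state which estimator or software the authors used. The function name `dersimonian_laird` and the input values are hypothetical and are not taken from the reviewed studies.

```python
from math import sqrt


def dersimonian_laird(effects, variances):
    """Random-effects pooling using the DerSimonian-Laird tau^2 estimator.

    effects   -- per-study effect sizes (hypothetical inputs)
    variances -- per-study sampling variances, all strictly positive
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))

    return {"pooled": pooled, "se": se, "q": q, "tau2": tau2, "i2": i2}


if __name__ == "__main__":
    # Illustrative numbers only, not data from the meta-analysis.
    result = dersimonian_laird([0.0, 1.0], [0.25, 0.25])
    print(result)
```

Under this estimator, an I² above roughly 75% is conventionally read as high heterogeneity, which is consistent with how the abstract interprets its value of 82.29%.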

Details

Title
Word Sense Disambiguation for Morphologically Rich Low-Resourced Languages: A Systematic Literature Review and Meta-Analysis
Author
Hlaudi, Daniel Masethe 1; Mosima, Anna Masethe 2; Ojo, Sunday Olusegun 3; Giunchiglia, Fausto 4; Pius Adewale Owolawi 5

1 Department of Data Science, Faculty of Information Communication Technology, Tshwane University of Technology, Pretoria 0001, South Africa
2 Department of Computer Science and Information Technology, School of Science and Technology, Sefako Makgatho Health Sciences University, Ga-Rankuwa 0208, South Africa
3 Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, Durban 4001, South Africa
4 Department of Information Engineering and Computer Science, Faculty of Information Communication Technology, University of Trento, 38100 Trento, Italy
5 Department of Computer Systems Engineering, Faculty of Information Communication Technology, Tshwane University of Technology, Pretoria 0001, South Africa
First page
540
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
2078-2489
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3110511189