Multiclass Classification of Unconstrained Handwritten Arabic Words Using Machine Learning Approaches



Jawad H. AlKhateeb, Jianmin Jiang, Jinchang Ren, Fouad Khelifi, Stan S. Ipson
School of Informatics (EIMC), University of Bradford, BD7 1DP, UK


© AlKhateeb et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Kasetsart Agricultural and Agro-Industrial Product Improvement Institute (KAPI), Kasetsart University, 50, Chatuchak, Bangkok, 10900, Thailand; Tel./Fax. 66-2942-8599; E-mail: aappln@ku.ac.th, p_vaithanomsat@yahoo.com


Abstract

In this paper, we propose and describe an efficient approach to the multiclass classification and recognition of unconstrained handwritten Arabic words using two machine learning classifiers: the K-nearest neighbor (K-NN) classifier and the neural network (NN). The technical details are presented in three stages: preprocessing, feature extraction, and classification. First, words are segmented from the input scripts and normalized in size. Second, features are extracted from each segmented word using several feature extraction methods. Finally, these features are used to train the K-NN and NN classifiers. To validate the proposed techniques, extensive experiments are conducted with both classifiers on the IFN/ENIT database, which contains 32492 Arabic words; the proposed algorithms achieve good accuracy compared with other methods.

Keywords: Offline Arabic handwriting recognition, Full word features, Feature extraction, Multiclass classification, Machine learning, K-NN, NN.
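
The three stages outlined in the abstract (size normalization, feature extraction, K-NN classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the 8x8 target size, and the zone-density feature are hypothetical and are not the features or parameters used in this paper. The NN classifier would consume the same feature vectors and is not shown.

```python
import math
from collections import Counter


def normalize_size(image, height=8, width=8):
    """Resample a binary word image (list of rows of 0/1) to a fixed size
    using nearest-neighbour sampling. The 8x8 target is an assumption."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def zone_density_features(image, zones=2):
    """Split the image into zones x zones blocks and return the fraction of
    ink pixels in each block. Illustrative feature only, not the paper's."""
    h, w = len(image), len(image[0])
    zh, zw = h // zones, w // zones
    features = []
    for zr in range(zones):
        for zc in range(zones):
            block = [
                image[r][c]
                for r in range(zr * zh, (zr + 1) * zh)
                for c in range(zc * zw, (zc + 1) * zw)
            ]
            features.append(sum(block) / len(block))
    return features


def knn_classify(train, query, k=3):
    """Return the majority label among the k training samples closest to
    `query` in Euclidean distance. `train` is a list of (features, label)."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

In use, each training word image would pass through `normalize_size` and `zone_density_features` to build the `(features, label)` pairs, and a test word would be processed the same way before calling `knn_classify`.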