A RULE-BASED APPROACH TO UNDERSTAND QUESTIONS IN ARABIC QUESTION ANSWERING

(Received: 2016-06-30, Revised: 2016-09-08, Accepted: 2016-11-09)
Emad Al-Shawakfa,
Research on Arabic Natural Language Processing (NLP) faces many problems due to the complexity of the language, the lack of machine-readable resources and limited interest among Arab researchers. One field in which research has started to appear is Question Answering (QA). Although some work has been done in this area, few systems have proved effective at producing exact, relevant answers. Among the issues that have affected the accuracy of the answers produced are the proper tagging of entities and the proper analysis of a user’s question. In this research, a set of 60+ tagging rules, 15+ question analysis rules and 20+ question patterns was built to enhance answer generation for natural language questions posed over corpora collected from different sources. A QA system was built, and experiments showed good results, with an accuracy of 78%, a recall of 97% and an F-measure of 87%.