Analysis of the Alignment of the A1-Level Arabic Language Assessment on the AlifBee App with the CEFR Levels
Keywords:
assessment, Arabic language, CEFR, digital learning, Modern Standard Arabic (MSA)
Abstract
Introduction: This study analyzes how well the A1-level assessment in the AlifBee Arabic learning application aligns with the Common European Framework of Reference for Languages (CEFR), and highlights the importance of valid level classification in digital language learning.
Methods: The research used a descriptive qualitative approach based on document analysis. The data sources were the assessment items in AlifBee's A1 level and the CEFR descriptors, which were compared to determine how closely the task demands match the proficiency descriptors.
Results: Of the 48 assessment items, 33 (69%) align with the A1 level, while 15 (31%) are more appropriately classified at the Pre-A1 level. Misalignment occurs mainly in vocabulary translation, letter arrangement, and oral repetition tasks, which measure isolated linguistic knowledge rather than communicative competence.
Discussion: These findings indicate that the A1 label in AlifBee is not yet fully supported by task demands that consistently match the CEFR descriptors. The study suggests adding more communication-oriented tasks to strengthen the validity of the level classification.
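The percentages reported in the Results can be checked with a short tally. The sketch below is illustrative only: the function name `alignment_summary` and the label strings are assumptions, and the list of labels is reconstructed from the reported totals (33 and 15 out of 48), not from the study's item-level coding.

```python
from collections import Counter


def alignment_summary(labels):
    """Return {level: (count, rounded percentage)} for a list of level labels.

    Hypothetical helper for illustration; not part of the study's procedure.
    """
    counts = Counter(labels)
    total = len(labels)
    return {level: (n, round(100 * n / total)) for level, n in counts.items()}


# Labels rebuilt from the totals stated in the abstract (an assumption:
# the per-item classifications themselves are not given here).
labels = ["A1"] * 33 + ["Pre-A1"] * 15
summary = alignment_summary(labels)

for level, (count, percent) in summary.items():
    print(f"{level}: {count} of {len(labels)} items ({percent}%)")
```

Rounding to whole percentages reproduces the figures in the abstract: 33 of 48 is 68.75%, reported as 69%, and 15 of 48 is 31.25%, reported as 31%.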
License
Copyright (c) 2026 Alyaa Adhinna Putri, Asep Sopian, Hikmah Maulani

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.









