Annie Lee
Professor Annie En-Shiun Lee is an assistant professor at Ontario Tech University and the University of Toronto (status-only). With a commitment to making language technology inclusive and accessible, she leads the Lee Language Lab (L³), focusing on research in language diversity and multilingualism. Her work aligns with the institutional vision of “Tech with a Conscience,” aiming to bridge technology and societal needs.
Professor Lee’s research is widely recognized, with publications in prestigious venues such as npj Digital Medicine, ACM Computing Surveys, ACL, SIGCSE, IEEE TKDE, and Bioinformatics. Her contributions extend beyond academia: for more than a decade she has carried out successful technology transfers to industry and government, involving over a dozen partnerships. In recognition of this work, she was selected as Demo Co-Chair for NAACL 2024, and she brings extensive expertise in applying research to real-world applications.
Previously, Professor Lee served as an assistant professor (teaching stream) at the University of Toronto, where she won the ARIA spotlight award for making transformative contributions to AI education. During her tenure, she introduced three new concentrations across multiple faculties and nearly doubled enrollment while maintaining diversity.
Professor Lee holds a PhD from the University of Waterloo, where she specialized in unsupervised pattern analysis, domain adaptation, and model interpretation under the supervision of Andrew K. C. Wong and Daniel Stashuk. She has also held visiting researcher positions at the Fields Institute and the Chinese University of Hong Kong and has worked as a research scientist and team lead at VerticalScope and Stradigi AI.
Education
Doctor of Philosophy, 2009-2014 – University of Waterloo
Master of Mathematics, 2006-2008 – University of Waterloo
Joint Honours Bachelor of Mathematics (computer science and combinatorics and optimization) with Co-operative Education, 1999-2004 – University of Waterloo
Publications
Surangika Ranathunga, En-Shiun Annie Lee, Marjana Prifti Skenduli, Ravi Shekhar, Mehreen Alam, Rishemjit Kaur. 2023. Neural Machine Translation for Low-resource Languages: A survey. ACM Computing Surveys, 55(11), 1–37.
Beate Franke, Jean‐François Plante, Ribana Roscher, En-Shiun Annie Lee, Cathal Smyth, Armin Hatefi, Fuqi Chen, Einat Gil, Alexander Schwing, Alessandro Selvitella, Michael M Hoffman, Roger Grosse, Dieter Hendricks, Nancy Reid. 2016. Statistical inference, learning and models in big data. International Statistical Review, 84(3), 371–389.
David Ifeoluwa Adelani, Hannah Liu, Xiaoyu Shen, Nikita Vassilyev, Jesujoba O Alabi, Yanke Mao, Haonan Gao, Annie En-Shiun Lee. 2024. SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 226–245, St. Julian’s, Malta. Association for Computational Linguistics.
Genta Indra Winata, Frederikus Hudi, Patrick Amadeus Irawan, David Anugraha, Rifki Afina Putri, Yutong Wang, Adam Nohejl, Ubaidillah Ariq Prathama, Nedjma Ousidhoum, Afifa Amriani, Anar Rzayev, Anirban Das, Ashmari Pramodya, Aulia Adila, Bryan Wilie, Candy Olivia Mawalim, Ching Lam Cheng, Daud Abolade, Emmanuele Chersoni, Enrico Santus, Fariz Ikhwantri, Garry Kuwanto, Hanyang Zhao, Haryo Akbarianto Wibowo, Holy Lovenia, Jan Christian Blaise Cruz, Jan Wira Gotama Putra, Junho Myung, Lucky Susanto, Maria Angelica Riera Machin, Marina Zhukova, Michael Anugraha, Muhammad Farid Adilazuarda, Natasha Santosa, Peerat Limkonchotiwat, Raj Dabre, Rio Alexander Audino, Samuel Cahyawijaya, Shi-Xiong Zhang, Stephanie Yulia Salim, Yi Zhou, Yinxuan Gui, David Ifeoluwa Adelani, En-Shiun Annie Lee, et al. 2024. WorldCuisines: A massive-scale benchmark for multilingual and multicultural visual question answering on global cuisines. In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2025).
Kosei Uemura, Mahe Chen, Alex Pejovic, Chika Maduabuchi, Yifei Sun, En-Shiun Lee. 2024. AfriInstruct: Instruction Tuning of African Languages for Diverse Tasks. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 13571–13585, Miami, Florida, USA. Association for Computational Linguistics.
Aditya Khan, Mason Shipton, David Anugraha, Kaiyao Duan, Phuong H Hoang, Eric Khiu, A Seza Doğruöz, En-Shiun Annie Lee. 2025. URIEL+: Enhancing linguistic inclusion and usability in a typological and multilingual knowledge base. In Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025), pages 6937–6952.
David Ifeoluwa Adelani, John Ojo, Ibrahim A. Azime, Jia-Yi Zhuang, Jesujoba O. Alabi, Xiaoyu He, Michael Ochieng, Sara Hooker, Ayanda Bukula, En-Shiun Annie Lee, Chidera Chukwuneke, Henry Buzaaba, Bonginkosi Sibanda, Gift Kalipe, James Mukiibi, Samuel Kabongo, Faith Yuehgoh, Mpho Setaka, Lucky Ndolela, and Pontus Stenetorp. 2024. IrokoBench: A new benchmark for African languages in the age of large language models. In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2025).
En-Shiun Annie Lee, Sarubi Thillainathan, Shravan Nayak, Surangika Ranathunga, David Ifeoluwa Adelani, Ruisi Su, Arya D McCarthy. 2022. Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?. In Findings of the Association for Computational Linguistics: ACL 2022, pages 58–67, Dublin, Ireland. Association for Computational Linguistics.
Eric Khiu, Hasti Toossi, David Anugraha, Jinyu Liu, Jiaxu Li, Juan Armando Parra Flores, Leandro Acros Roman, A Seza Doğruöz and En-Shiun Annie Lee. 2024. Predicting Machine Translation Performance on Low-Resource Languages: The Role of Domain Similarity. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1474–1486, St. Julian’s, Malta. Association for Computational Linguistics.
Andrew KC Wong, Pei-Yuan Zhou, En-Shiun Annie Lee. 2023. Theory and rationale of interpretable all-in-one pattern discovery and disentanglement system. npj Digital Medicine, 6(1):92.
