9:00-9:10 Opening Remarks
9:10-10:30 Morning Session: Downstream Tasks
Chair: Mariana Romanyshyn
9:10-9:25 | Improving Named Entity Recognition for Low-Resource Languages Using Large Language Models: A Ukrainian Case Study
Vladyslav Radchenko and Nazarii Drushchak |
9:25-9:45 | A Framework for Large-Scale Parallel Corpus Evaluation: Ensemble Quality Estimation Models Versus Human Assessment
Dmytro Chaplynskyi and Kyrylo Zakharov |
9:45-10:05 | UNLP 2025 Best Paper
Introducing OmniGEC: A Silver Multilingual Dataset for Grammatical Error Correction
Roman Kovalchuk, Mariana Romanyshyn and Petro Ivaniuk |
10:05-10:25 | Improving Sentiment Analysis for Ukrainian Social Media Code-Switching Data
Yurii Shynkarov, Veronika Solopova and Vera Schmitt |
10:30-11:00 Morning Coffee Break
11:00-12:00 Morning Session: Towards a Ukrainian LLM
Chair: Oleksii Ignatenko
11:00-11:20 | From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages
Artur Kiulian, Anton Polishko, Mykola Khandoga, Yevhen Kostiuk, Guillermo Gabrielli, Łukasz Gągała, Fadi Zaraket, Qusai Abu Obaida, Hrishikesh Garud, Wendy Wing Yee Mak, Dmytro Chaplynskyi, Selma Belhadj Amor and Grigol Peradze |
11:20-11:40 | Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains
Yurii Paniv, Artur Kiulian, Dmytro Chaplynskyi, Mykola Khandoga, Anton Polishko, Tetiana Bas and Guillermo Gabrielli |
11:40-12:00 | On the Path to Make Ukrainian a High-Resource Language
Mykola Haltiuk and Aleksander Smywiński-Pohl |
12:00-13:00 Keynote: Sebastian Ruder
13:00-14:15 Lunch
14:15-15:30 Afternoon Session: Linguistics and NLP
Chair: Mariana Romanyshyn
14:15-14:30 | Developing a Universal Dependencies Treebank for Ukrainian Parliamentary Speech
Maria Shvedova, Arsenii Lukashevskyi and Andriy Rysin |
14:30-14:50 | Vuyko Mistral: Adapting LLMs for Low-Resource Dialectal Translation
Roman Kyslyi, Yuliia Maksymiuk and Ihor Pysmennyi |
14:50-15:10 | Context-Aware Lexical Stress Prediction and Phonemization for Ukrainian TTS Systems
Anastasiia Senyk, Mykhailo Lukianchuk, Valentyna Robeiko and Yurii Paniv |
15:10-15:30 | Precision vs. Perturbation: Robustness Analysis of Synonym Attacks in Ukrainian NLP
Volodymyr Mudryi and Oleksii Ignatenko |
15:30-16:00 Afternoon Coffee Break
16:00-17:00 Keynote: Illia Strelnykov
17:00-17:50 Afternoon Session: Responsible AI
Chair: Oleksii Ignatenko
17:00-17:15 | UAlign: LLM Alignment Benchmark for the Ukrainian Language
Andrian Kravchenko, Yurii Paniv and Nazarii Drushchak |
17:15-17:30 | GBEM-UA: Gender Bias Evaluation and Mitigation for Ukrainian Large Language Models
Mykhailo Buleshnyi, Maksym Buleshnyi, Marta Sumyk and Nazarii Drushchak |
17:30-17:50 | Gender Swapping as a Data Augmentation Technique: Developing Gender-Balanced Datasets for Ukrainian Language Processing
Olha Nahurna and Mariana Romanyshyn |
17:50-18:00 Closing Words
9:00-10:30 Morning Session: Shared Task
Chair: Roman Kyslyi
10:30-11:00 Morning Coffee Break
11:00-13:00 Panel Discussion: Disinformation Detection from a Business Perspective
Panelists: Kateryna Burovova, Nataliia Romanyshyn, Yaroslav Peliushenko, Yuliia Dukach
Chair: Roman Kyslyi
Illia Strelnykov, Data Scientist at YouScan, Ukraine
Topic: Leveraging User Feedback to Improve Your Models
While academic research provides a strong foundation for model development, the ultimate goal is to deploy these models in real-world applications, where they interact with actual users. This talk addresses the critical challenge of effectively leveraging user feedback to enhance model performance in practical scenarios. We’ll explore ways to incorporate the highly valuable — yet inherently noisy — user-provided data into model training and fine-tuning pipelines. First, we’ll cover methods for collecting user feedback and the challenges involved in processing it, including issues like bias and conflicting information. Then we will examine various solutions for tackling these challenges and how to use refined feedback for model improvement.
Sebastian Ruder, Research Scientist at Meta, Germany
Topic: Multilinguality in Llama 4 and Beyond
Abstract: Multilingual LLMs have become so powerful that they can be used in real-world conversations in a variety of applications. While this presents many opportunities, it also poses challenges associated with the complexity of natural language. In this talk, I will seek to connect academic research to real-world challenges of multilingual conversational AI. I will first provide an overview of multilinguality in Llama 4, highlighting the importance of evaluation. I will then discuss what it takes to bridge the gap between academic and real-world evaluations. Finally, I will discuss how we can develop models that are useful to speakers in their local context, across the globe and for the Ukrainian language.
Kateryna Burovova, ML Engineer at LetsData
Kateryna specializes in AI-powered solutions for detecting and combating harmful information operations, leveraging NLP and computational social science to create threat detection pipelines that analyze content semantics, user behavior patterns, network dynamics, and other contextual signals.
Nataliia Romanyshyn, AI Specialist at Texty.org.ua
Nataliia focuses on the detection and analysis of Russian disinformation. Her expertise includes natural language processing, specifically topic modeling, named entity recognition, large language models, and multilingual NLP. She plays a key role in developing analytical frameworks that transform complex textual data into actionable insights aimed at uncovering disinformation mechanisms.
Yaroslav Peliushenko, Head of Analytics at Osavul
Yaroslav is the Head of Analytics at Osavul, a technology company developing AI-powered solutions for deep intelligence and countering information threats. At UNLP, he will share insights into how their team analyzes and structures information, the frameworks they use, and the thinking behind their approach.
Yuliia Dukach, PhD, Data Journalist and Head of Disinformation Investigations at OpenMinds
Yuliia is an expert in disinformation research with over five years of experience in investigative data journalism. Yuliia applies advanced skills in Python and machine learning to analyze computational propaganda and online misinformation.
Vasyl Starko, Ukrainian Catholic University, Ukraine
Andriy Rysin, Independent researcher, USA
Topic: BRUK Team’s Resources for Ukrainian Corpus Creation
The talk will focus on the key resources and tools developed by the BRUK team for the automatic processing of Ukrainian texts, especially for building Ukrainian corpora. The resources include:
* BRUK (Ukrainian Brown Corpus, a projected one-million-word POS gold standard)
* VESUM (A Large Electronic Dictionary of Ukrainian, over 420,000 lemmas and counting, for POS tagging)
* USL (Ukrainian Semantic Lexicon for semantic tagging).
The tools come in the form of the NLP_UK suite for Ukrainian text tokenization, lemmatization, POS tagging, and cleaning. The application of NLP_UK to build multiple iterations of the GRAC corpus will be discussed.