An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT)
Burchell, Laurie,
De Gibert Bonet, Ona,
Arefyev, Nikolay,
Aulamo, Mikko,
Bañón, Marta,
Chen, Pinzhen,
Fedorova, Mariia,
Guillou, Liane,
Haddow, Barry,
Hajič, Jan,
Helcl, Jindřich,
Henriksson, Erik,
Klimaszewski, Mateusz,
Komulainen, Ville,
Kutuzov, Andrey,
Kytöniemi, Joona,
Laippala, Veronika,
Mæhlum, Petter,
Malik, Bhavitvya,
Mehryary, Farrokh,
Mikhailov, Vladislav,
Moghe, Nikita,
Myntti, Amanda,
O’Brien, Dayyán,
Oepen, Stephan,
Pal, Proyag,
Piha, Jousia,
Pyysalo, Sampo,
Ramírez-Sánchez, Gema,
Samuel, David,
Stepachev, Pavel,
Tiedemann, Jörg,
Variš, Dušan,
Vojtěchová, Tereza,
and Zaragoza-Bernabeu, Jaume
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
July
2025