We see three basic approaches to making an NLP system (a chatbot, a search engine, NER, a classifier...) that works well for one language, like English, work well for more languages (minimal sketches of each follow the list below):
1. machine-translating at inference (or query) time
2. machine-translating labelled training data (or search indices), and training a multilingual model
3. zero-shot approaches with a multilingual model like BERT or LASER
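To make approach 1 concrete, here is a minimal sketch that translates an incoming German query to English at inference time and then runs an existing English-only pipeline. The specific models (Helsinki-NLP/opus-mt-de-en, the default sentiment classifier) and the task are assumptions for illustration, not a recommendation:

# Sketch of approach 1: translate the input to English at inference time,
# then run the existing English-only pipeline. Model choices and the
# sentiment task are assumptions for illustration.
from transformers import pipeline

# Any MT model works here; Helsinki-NLP/opus-mt-de-en is a freely
# available German-to-English model on the Hugging Face hub.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")

# The existing English-only system, e.g. a sentiment classifier.
classifier = pipeline("sentiment-analysis")

def classify_german(text: str) -> dict:
    """Translate a German input to English, then classify the English text."""
    english = translator(text)[0]["translation_text"]
    return classifier(english)[0]

print(classify_german("Der Film war großartig."))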
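For approach 2, a minimal sketch of machine-translating labelled English training data into a target language while carrying the labels over unchanged; the combined corpus would then be used to fine-tune one multilingual model with a standard training loop. The tiny dataset and the model name are assumptions for illustration:

# Sketch of approach 2: translate the labelled training data, keep the
# labels, and train one multilingual model on the combined set.
from transformers import pipeline

# English-to-German translation model from the Hugging Face hub.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

english_data = [
    ("The movie was great.", "positive"),
    ("I want a refund.", "negative"),
]

# Translate the text of each example; the label carries over unchanged.
german_data = [
    (translator(text)[0]["translation_text"], label)
    for text, label in english_data
]

# This combined corpus would then feed the fine-tuning of a multilingual
# model such as bert-base-multilingual-cased.
training_data = english_data + german_data
print(training_data)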
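And for approach 3, a minimal zero-shot sketch: embed text with a multilingual encoder, train a classifier on English examples only, and apply it directly to another language, relying on the shared cross-lingual representation space. The mean-pooling and the scikit-learn classifier are assumptions for illustration:

# Sketch of approach 3: train on English only, apply zero-shot to German.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(texts):
    """Mean-pooled mBERT embeddings, one vector per input text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Train on English labels only...
X_en = embed(["The movie was great.", "I want a refund."])
clf = LogisticRegression().fit(X_en, ["positive", "negative"])

# ...and apply directly to German, with no German training data.
print(clf.predict(embed(["Der Film war großartig."])))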
When to use which approach?
We at ModelFront have shared an open guide on multilingual search: https://modelfront.com/search.
In March, Nerses Nersesyan from Polixis and I will give an online workshop on this at Applied Machine Learning Days.
https://appliedmldays.org/events/amld-epfl-2021/workshops/how-to-make-your-nlp-system-multilingual
We'd be happy to have your input and feedback, and we hope it can help you with your own work, though unfortunately the workshop won't be in time for the ACL/EACL submission deadline.
Thanks and regards,
Adam