Cross-lingual learning-to-rank with shared representations

Shota Sasaki, Shuo Sun, Shigehiko Schamoni, Kevin Duh, Kentaro Inui

Research output: Conference contribution

23 Citations (Scopus)

Abstract

Cross-lingual information retrieval (CLIR) is a document retrieval task where the documents are written in a language different from that of the user's query. This is a challenging problem for data-driven approaches due to the general lack of labeled training data. We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. Further, we present a simple yet effective neural learning-to-rank model that shares representations across languages and reduces the data requirement. This model can exploit training data in, for example, Japanese-English CLIR to improve the results of Swahili-English CLIR.

Original language: English
Title of host publication: Short Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 458-463
Number of pages: 6
ISBN (Electronic): 9781948087292
Publication status: Published - 2018
Event: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018 - New Orleans, United States
Duration: 1 Jun 2018 to 6 Jun 2018

Publication series

Name: NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Volume: 2

Conference

Conference: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
Country/Territory: United States
City: New Orleans
Period: 1 Jun 2018 to 6 Jun 2018

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Computer Science Applications
