TY - JOUR
T1 - Assisting authors to convert raw products into polished prose
AU - Ito, Takumi
AU - Kuribayashi, Tatsuki
AU - Kobayashi, Hayato
AU - Brassard, Ana
AU - Hagiwara, Masato
AU - Suzuki, Jun
AU - Inui, Kentaro
N1 - Funding Information:
We would like to thank the Tohoku NLP laboratory members as well as Benjamin Heinzerling, Michael Zock, and Marie Josée Brassard for their feedback. Many thanks also to Masato Mita for his advice on carrying out the experiments. Last but not least, thanks to JSPS, which partly supported J. Suzuki's work (JSPS KAKENHI Grant Number 19H04162).
Publisher Copyright:
© 2020 Institute for Cognitive Science.
PY - 2020
Y1 - 2020
N2 - Being a notoriously complex problem, writing is generally decomposed into a series of subtasks: idea generation, expression, revision, etc. Given some goal, the author generates a set of ideas (brainstorming), which they integrate into some skeleton (outline, text plan). This leads to a first draft, which is then submitted for revision, possibly yielding changes at various levels (content, structure, form). Having made a draft, authors usually revise, edit, and proofread their documents. We confine ourselves here to academic writing, focusing on sentence production. While there has been considerable work on this topic, most writing assistance has dealt mainly with grammatical errors, editing, and proofreading, the goal being the correction of surface-level problems such as typography, spelling, or grammatical errors. We broaden the scope by also including cases where the entire sentence needs to be rewritten in order to properly express all of the planned information. Hence, Sentence-level Revision (SentRev) becomes part of our writing assistance task. Systems performing well in this task can be of considerable help to inexperienced authors by producing fluent, well-formed sentences based on the user's drafts. In order to evaluate our SentRev model, we have built a new, freely available crowdsourced evaluation dataset, which consists of incomplete sentences produced by nonnative writers paired with final-version sentences extracted from published academic papers. We also used this dataset to establish baseline performance on SentRev.
AB - Being a notoriously complex problem, writing is generally decomposed into a series of subtasks: idea generation, expression, revision, etc. Given some goal, the author generates a set of ideas (brainstorming), which they integrate into some skeleton (outline, text plan). This leads to a first draft, which is then submitted for revision, possibly yielding changes at various levels (content, structure, form). Having made a draft, authors usually revise, edit, and proofread their documents. We confine ourselves here to academic writing, focusing on sentence production. While there has been considerable work on this topic, most writing assistance has dealt mainly with grammatical errors, editing, and proofreading, the goal being the correction of surface-level problems such as typography, spelling, or grammatical errors. We broaden the scope by also including cases where the entire sentence needs to be rewritten in order to properly express all of the planned information. Hence, Sentence-level Revision (SentRev) becomes part of our writing assistance task. Systems performing well in this task can be of considerable help to inexperienced authors by producing fluent, well-formed sentences based on the user's drafts. In order to evaluate our SentRev model, we have built a new, freely available crowdsourced evaluation dataset, which consists of incomplete sentences produced by nonnative writers paired with final-version sentences extracted from published academic papers. We also used this dataset to establish baseline performance on SentRev.
KW - Academic writing assistance
KW - Dataset creation
KW - Deep learning
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85087832725&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087832725&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85087832725
VL - 21
SP - 103
EP - 140
JO - Journal of Cognitive Science
JF - Journal of Cognitive Science
SN - 1598-2327
IS - 1
ER -