We studied the use of case particles in modern Japanese, focusing on differences between written and spoken language as well as differences among subcorpora (registers). As data, we used the core part (newspapers, magazines, books, white papers,
Chiebukuro (Wisdom Q&A), and blogs) of the Balanced Corpus of Contemporary Written Japanese (BCCWJ), together with four monologues and four dialogues from the Corpus of Spoken Japanese (CSJ). In both BCCWJ and CSJ, 30% of all words are particles, and case particles make up the majority of these. In BCCWJ, white papers and
Chiebukuro show opposite tendencies. White papers and newspapers use more formal, written-style expressions such as "ni-oite" in place of the more colloquial "de."
Chiebukuro and blogs are closer to spoken language. Furthermore,
Chiebukuro has a distinctive style of its own. In CSJ, monologues and dialogues showed different characteristics. Finally, we found some similarity between monologues in CSJ and formal documents (such as white papers) in BCCWJ.