Abstract |
Lee, Yong-hun & Joh, Gihyun. (2019). Identifying suicide notes using forensic linguistics and machine learning. The Linguistic Association of Korean Journal, 27(2), 171-191. This paper shows how to identify the characteristic properties of suicide notes using analysis methods from forensic linguistics and how to apply that knowledge to machine learning research. For this purpose, a corpus of six texts was compiled from Virginia Woolf's literary works and suicide notes. Each text was then analyzed with the LIWC (Linguistic Inquiry and Word Count) software. Because the analysis results were complex, their dimensionality was reduced using Principal Component Analysis (PCA). The PCA showed that, even though all the texts were written by the same author, the suicide notes were clearly distinguishable from the literary works. The LIWC results were then used as input to a machine learning technique, a Support Vector Machine (SVM), and classification accuracy was measured using six real texts and three hypothetical texts. The SVM distinguished the suicide notes from the literary works with 100% accuracy. The current study demonstrates that the linguistic properties of texts can be used to distinguish suicide notes from other types of writing and that they can be used in machine learning research. |
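The pipeline described in the abstract (per-text feature vectors, then PCA, then an SVM) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature values are invented and are not LIWC output, the three feature columns and the 3/3 split between text types are hypothetical, and the abstract does not say which software, kernel, or settings the authors used. The sketch uses only the Python standard library: the first principal axis is found by power iteration, and a linear soft-margin SVM is fitted by subgradient descent on the hinge loss.

```python
import math

# Hypothetical feature rows (NOT real LIWC output): one row per text,
# three invented category rates per row. Labels: -1 = literary, +1 = note.
FEATURES = [
    [2.0, 1.0, 5.0],
    [2.5, 1.2, 4.8],
    [1.8, 0.9, 5.3],
    [9.0, 4.0, 1.0],
    [8.5, 4.4, 1.2],
    [9.4, 3.8, 0.8],
]
LABELS = [-1, -1, -1, 1, 1, 1]


def standardize(rows):
    """Z-score each column so that no single feature dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def first_component(rows, iters=200):
    """Leading principal axis of centered rows, via power iteration."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
           for i in range(d)]
    axis = [1.0] * d
    for _ in range(iters):
        axis = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(a * a for a in axis))
        axis = [a / norm for a in axis]
    return axis


def train_linear_svm(xs, ys, lr=0.1, lam=0.01, epochs=100):
    """Linear soft-margin SVM fitted by subgradient descent on hinge loss."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]  # L2 shrinkage step
            if margin < 1:  # inside the margin: move the boundary away
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


Z = standardize(FEATURES)
AXIS = first_component(Z)
SCORES = [sum(a * v for a, v in zip(AXIS, row)) for row in Z]  # PC1 scores
W, B = train_linear_svm(Z, LABELS)
PREDICTIONS = [predict(W, B, row) for row in Z]
```

With data this cleanly separated, the two groups fall on opposite sides of the first principal axis and the classifier reproduces the training labels. That says nothing about generalization: with only a handful of texts, perfect accuracy is easy to reach, so a result of this kind is best read as a proof of concept.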