Methods of author identification
In this work, two methods were implemented for identifying the unknown author of a text, where the author is assumed to belong to a library of known authors. A text clustering method was also implemented, and both identification methods were tested with and without clustering. In addition, a criterion was proposed for selecting the 𝑛-grams that best serve as markers for identifying the author. Testing used 800 texts by 16 authors. The method based on the density of the distribution function proved suitable for identifying the authors of both large texts (50,000+ characters) and small texts (10,000+ characters), whereas the method based on p-statistics is suitable only for large texts. With text clustering, both methods gave much better results on the test sample.
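The abstract does not describe how the 𝑛-gram markers are computed or compared, so the following is only a minimal illustrative sketch of the general idea of 𝑛-gram-based attribution, not the thesis's method. It assumes character 𝑛-grams and replaces the density-based and p-statistics methods with a plain L1 distance between frequency profiles. The names `ngram_frequencies`, `profile_distance`, and `nearest_author` are hypothetical.

```python
from collections import Counter
from typing import Dict


def ngram_frequencies(text: str, n: int = 2) -> Dict[str, float]:
    """Relative frequency of each character n-gram in `text`."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    total = len(grams)
    return {gram: count / total for gram, count in counts.items()}


def profile_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """L1 distance between two n-gram frequency profiles.

    Assumption: a simple stand-in for the statistical comparisons
    named in the abstract, which are not specified there.
    """
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def nearest_author(unknown_text: str, library: Dict[str, str], n: int = 2) -> str:
    """Return the author in `library` whose profile is closest to the unknown text.

    `library` maps an author name to a sample of that author's text.
    """
    target = ngram_frequencies(unknown_text, n)
    return min(
        library,
        key=lambda author: profile_distance(
            target, ngram_frequencies(library[author], n)
        ),
    )
```

A call such as `nearest_author(sample, {"A": text_a, "B": text_b})` returns the key of the closest profile. With texts as short as these toy inputs the result is not meaningful; the abstract reports results only for texts of 10,000 characters or more.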
Field of knowledge and specialty
11 Mathematics and Statistics, 113 Applied Mathematics
Mykhailiuk V. Methods of author identification : graduation thesis … master's : 113 "Applied Mathematics" / Vladyslav Mykhailiuk. - Kyiv, 2020. - 25 p.