Shakespeare Statistics
Authorship of 2 Henry IV
All data were generated with the R package Stylo (see: Computational Stylistics Group Homepage).
Rolling Classify makes use of classifiers such as nsc (nearest shrunken centroids), svm (support vector machine) and delta.
They were applied to word frequencies (mf1w), character bigrams (mf2c) and character trigrams (mf3c). The improved methodology
draws on a large number of reference texts, all of which are sole-authored and well attributed. Core plays were selected from large
corpora to keep bias out. The window size is 8000 words, and a slice overlap of 7750 words (a step of 250 words between successive
windows) keeps the results comparable with those of Rolling Delta. Each classifier rests on a different mathematical kernel, which
explains differences in the results. Nsc has a rather low decision level, whereas svm has a high one and is more reliable for that reason.
The evaluation of the matrix is given at the end of the table: the number of cells and their percentages are recorded.
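
To make the window settings concrete, here is a minimal Python sketch of how a window size of 8000 words and an overlap of 7750 words translate into window start positions. It is an illustration only and not Stylo's own code; the function name `window_starts` and the 10000-word text length are assumptions, and Stylo may handle a final partial window differently.

```python
def window_starts(n_words, slice_size=8000, slice_overlap=7750):
    """Return the start index of every full window over a text of n_words words.

    Hypothetical helper for illustration; not part of Stylo.
    """
    step = slice_size - slice_overlap  # 8000 - 7750 = 250 words
    if step <= 0:
        raise ValueError("slice_overlap must be smaller than slice_size")
    if n_words < slice_size:
        return []  # text too short for even one full window
    # Only full windows are kept; a trailing partial window is dropped here.
    return list(range(0, n_words - slice_size + 1, step))


# Assumed example: a text of 10000 words.
starts = window_starts(10000)
print(len(starts), starts[0], starts[-1])
```

Because each window advances by only 250 words, neighbouring windows share most of their text, so the attribution changes gradually along the play.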



Compare these evaluations with the results of Rolling Delta.
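
For such a comparison, the cell counts and percentages can be recomputed from any matrix of attributions. The Python sketch below assumes that each cell of the matrix holds one author label; the labels "A" and "B", the matrix layout and the function name `tally_cells` are hypothetical and serve only as illustration.

```python
from collections import Counter


def tally_cells(matrix):
    """Count how many cells carry each label and give the share in percent.

    matrix is assumed to be a list of rows, each row a list of author labels.
    """
    labels = [cell for row in matrix for cell in row]  # flatten the matrix
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


# Assumed toy matrix: 2 rows (e.g. classifier/feature settings) x 4 windows.
example = [["A", "A", "B", "A"],
           ["A", "B", "B", "A"]]
for label, (n, pct) in sorted(tally_cells(example).items()):
    print(label, n, f"{pct:.1f}%")
```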