Shakespeare Statistics
Authorship of The Life and Death of Jack Straw
All data were generated by R Stylo. See: Computational Stylistics Group Homepage
Rolling Classify uses classifiers such as nsc (nearest shrunken centroid), svm (support vector machine) and delta.
They were applied to word frequencies (mf1w), character bigrams (mf2c) and character trigrams (mf3c). An improved methodology
drew on a large number of reference texts, all of which are single-authored and securely attributed. Restricting large corpora to their core plays
ensured that no bias arose. Window sizes range from 1000 to 7000 words in steps of 1000 words, and a slice overlap of 250 words keeps the results comparable
with those of Rolling Delta. The classifiers rest on different mathematical kernels, which explains the differences in their results. Nsc has a rather
low decision level, whereas svm has a high one and is more reliable for that reason as well. Vocabulary (mf1w), however, is less reliable than
character bigrams and trigrams.
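The slicing and feature counting described above can be sketched in a few lines. This is a minimal illustration in Python, not the R Stylo implementation: the function names (slices, char_ngrams, relative_frequencies, features) are hypothetical, and the sketch assumes that a slice overlap of 250 words means consecutive windows share 250 words, so that each window starts (size minus overlap) words after the previous one. It also assumes that character n-grams are counted across word boundaries, spaces included, and that a trailing partial window is dropped.

```python
from collections import Counter


def slices(words, size, overlap):
    """Yield full windows of `size` words; neighbours share `overlap` words.

    Assumption: the step between window starts is size - overlap.
    A trailing window shorter than `size` is dropped.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    for start in range(0, len(words) - size + 1, step):
        yield words[start:start + size]


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def relative_frequencies(items, top):
    """Map the `top` most frequent items to their relative frequency."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: count / total for item, count in counts.most_common(top)}


def features(window, kind, top=100):
    """Build a frequency profile for one window of words.

    kind: "mf1w" (words), "mf2c" (character bigrams),
          "mf3c" (character trigrams).
    """
    if kind == "mf1w":
        items = window
    elif kind in ("mf2c", "mf3c"):
        n = int(kind[2])  # the digit in the label gives the n-gram length
        # Assumption: n-grams run over the window joined by single spaces.
        items = char_ngrams(" ".join(window), n)
    else:
        raise ValueError(f"unknown feature kind: {kind}")
    return relative_frequencies(items, top)
```

Calling `features(window, "mf3c")` on each window produced by `slices(words, 1000, 250)` would give one frequency profile per window, which a classifier could then compare with the profiles of the reference texts.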
A majority of attributions favours William Shakespeare as the author of The Life and Death of Jack Straw, followed by Samuel Rowley,
particularly in the nsc classifications.




Unlike Rolling Delta, the classifiers, which score differently, propose a large
number of candidate authors, depending on the choice of variables (mf1w, mf2c and mf3c)
and on the window size. The chart above results from a pre-selected set of plays:
chettle_hoffman.txt; daniels_cleop.txt; greene_friarbb.txt; kyd_soliman.txt; kyd_spanpure.txt;
lodge_mariusscilla.txt; lyly_motherbombie.txt; mar_tamburlain1.txt; mar_tamburlain2.txt;
mars_antmellid.txt; mars_malcontent.txt; nashe_summerslast.txt; peele_oldwives.txt;
row_whenysee.txt; shak_hamlet.txt; shak_thnight.txt; sidney_marcantonie.txt;
wilson_3ladieslondon.text;
It is the larger windows, however, that show a clear preference for Shakespeare. Only two of his core plays
were used, to avoid a bias towards authors with a larger corpus.
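The idea of limiting each author's share of the reference set can be illustrated with a short Python sketch. It is not part of the original workflow: the helper names (author_of, cap_per_author) are hypothetical, and it assumes that the part of each file name before the first underscore identifies the author, as the file list above suggests.

```python
from collections import defaultdict

# File names as listed in the text above.
REFERENCE_FILES = [
    "chettle_hoffman.txt", "daniels_cleop.txt", "greene_friarbb.txt",
    "kyd_soliman.txt", "kyd_spanpure.txt", "lodge_mariusscilla.txt",
    "lyly_motherbombie.txt", "mar_tamburlain1.txt", "mar_tamburlain2.txt",
    "mars_antmellid.txt", "mars_malcontent.txt", "nashe_summerslast.txt",
    "peele_oldwives.txt", "row_whenysee.txt", "shak_hamlet.txt",
    "shak_thnight.txt", "sidney_marcantonie.txt",
    "wilson_3ladieslondon.text",
]


def author_of(filename):
    """Assumption: the prefix before the first underscore names the author."""
    return filename.split("_", 1)[0]


def cap_per_author(filenames, limit):
    """Keep at most `limit` files per author, preserving the input order."""
    kept = []
    seen = defaultdict(int)
    for name in filenames:
        author = author_of(name)
        if seen[author] < limit:
            kept.append(name)
            seen[author] += 1
    return kept


def files_per_author(filenames):
    """Count how many reference files each author contributes."""
    counts = defaultdict(int)
    for name in filenames:
        counts[author_of(name)] += 1
    return dict(counts)
```

Under this naming assumption, `files_per_author(REFERENCE_FILES)` shows that no author contributes more than two plays, and `cap_per_author(candidates, 2)` would enforce the same limit on a longer candidate list.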
Compare these evaluations with the results of Rolling Delta and the General Imposters Method.