Shakespeare Statistics


Authorship of The Life and Death of Jack Straw

All data were generated by R Stylo. See:


Rolling Delta is a feature of the R Stylo suite that measures the delta distances between a set of reference texts and a target text. Delta was developed by John Burrows in 2002. A single delta figure, however, covers a whole text; Rolling Delta overcomes this limitation by moving overlapping windows of a fixed size through the target text, so that many measurements are taken. In this way collaborations between authors and affinities between texts can be detected.

To reduce subjectivity in the choice of reference texts, a corpus of more than 150 dramatic reference texts from the Renaissance period, collected over the years, was used in the analysis. This took my PC to the end of its tether, but the results obtained with 5000-word windows, a step size of 250 words and character trigrams as variables give clear indications of authorship and textual affinity.

The results were recorded in a spreadsheet. Column A lists the titles of the plays. Column B gives the delta value of each play at 2500 words (the midpoint of the first 5000-word window), column C gives the delta values at 2750 words (step size 250 words), and so on to the right, depending on the length of the target text. In each column except A the three lowest deltas were marked, and all play titles without any marking were erased. (Plays that could be excluded on chronological grounds were erased as well, except for the Shakespeare core plays. Among these, the early Shakespeare plays are compromised as references by possible collaborations.)
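The windowing described above can be illustrated with a minimal Python sketch. It is not the R Stylo implementation: the function names, the choice of the 100 most frequent trigrams, and the use of the reference corpus alone for the z-scores are assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean, pstdev

def window_starts(n_words, window=5000, step=250):
    """Start offsets of every full window that fits into n_words."""
    return list(range(0, n_words - window + 1, step))

def trigram_freqs(text):
    """Relative frequencies of the character trigrams in a string."""
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def burrows_delta(target, reference, corpus, features):
    """Mean absolute difference of z-scores over the chosen features.
    Mean and standard deviation come from the reference corpus (assumption)."""
    diffs = []
    for f in features:
        values = [profile.get(f, 0.0) for profile in corpus]
        sd = pstdev(values)
        if sd == 0:
            continue  # a feature without variation carries no information
        mu = mean(values)
        z_target = (target.get(f, 0.0) - mu) / sd
        z_reference = (reference.get(f, 0.0) - mu) / sd
        diffs.append(abs(z_target - z_reference))
    return mean(diffs) if diffs else 0.0

def rolling_delta(target_words, references, window=5000, step=250, n_features=100):
    """Return window midpoints and {title: [delta per window]}."""
    profiles = {title: trigram_freqs(text) for title, text in references.items()}
    totals = Counter()
    for profile in profiles.values():
        totals.update(profile)  # sum relative frequencies across the corpus
    features = [g for g, _ in totals.most_common(n_features)]
    corpus = list(profiles.values())
    table = {title: [] for title in references}
    midpoints = []
    for start in window_starts(len(target_words), window, step):
        midpoints.append(start + window // 2)
        win = trigram_freqs(" ".join(target_words[start:start + window]))
        for title, profile in profiles.items():
            table[title].append(burrows_delta(win, profile, corpus, features))
    return midpoints, table
```

With the default settings the first two midpoints are 2500 and 2750 words, which corresponds to columns B and C of the spreadsheet.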
From column B to the right, the texts (and authors) with the lowest deltas are noted. The lowest delta in each column is marked in green, the second-lowest in yellow, and the third-lowest in red.
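The marking step can be sketched in a few lines of Python. This is only an illustration of the procedure described above, assuming the delta values are held in a dictionary that maps each play title to its list of deltas (one per window); the function name and the data layout are hypothetical and were not part of the original spreadsheet work.

```python
MARKS = ("green", "yellow", "red")  # lowest, second-lowest, third-lowest

def mark_lowest(table):
    """table: {title: [delta per window]}.
    Returns {title: [mark or None per window]}, keeping only the titles
    that are among the three lowest deltas in at least one column."""
    titles = list(table)
    n_cols = len(table[titles[0]]) if titles else 0
    marks = {title: [None] * n_cols for title in titles}
    for col in range(n_cols):
        # Sort the titles by their delta in this column, lowest first.
        ranked = sorted(titles, key=lambda title: table[title][col])
        for rank, title in enumerate(ranked[:len(MARKS)]):
            marks[title][col] = MARKS[rank]
    # Titles without any marking are dropped, as in the spreadsheet.
    return {title: row for title, row in marks.items() if any(row)}
```

A title that never reaches the three lowest deltas in any window disappears from the result, which mirrors the erasing of unmarked play titles.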


Please find below the chart provided by Rolling Delta. Here the step size was 125 words and the two lowest delta curves represent the files with the smallest stylistic distance from Jack Straw.

This looks very much as if Samuel Rowley and/or William Shakespeare wrote the play. Some measurements still have to be taken. Frequencies of words, character bigrams etc. are helpful, particularly with different window sizes. Classifications with nsc, svm, and delta are also indispensable, and the General Imposters method should be tried with a set of suitable reference texts.
Compare these evaluations with the results of Rolling Classify and the General Imposters method.