Shakespeare Statistics

Authorship of Mucedorus, ca. 1590

All data were generated from R Stylo (see: Computational Stylistics Group Homepage).

MF1W, MF2C, and MF3C

The charts display attribution results with windows of different sizes, growing by 500 words each time. Within a window, the step size is 250 words, generating a close network of measuring points, the lowest of which represent the smallest stylistic difference between reference texts and the text in question. The number of variables depends on the window size and is displayed in brackets on each chart. N gives the culling value.
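
The spacing of the measuring points can be illustrated with a short sketch. It is not part of R Stylo; the function name and the convention that each measurement is assigned to the centre of its window are assumptions, chosen so that a 5000-word window yields its first point at word 2500 and the next at word 2750.

```python
def window_midpoints(n_words, window=5000, step=250):
    """Return the word position at the centre of each rolling window.

    Hypothetical helper for illustration only: windows start at word 0
    and advance by `step` words until the window no longer fits.
    """
    if n_words < window:
        return []  # text too short for even one full window
    starts = range(0, n_words - window + 1, step)
    return [start + window // 2 for start in starts]


# A 6000-word text measured with a 5000-word window and 250-word steps
print(window_midpoints(6000))
```

A larger window leaves fewer measuring points for the same text, because fewer full windows fit before the end is reached.
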
Rolling delta is a feature of the R Stylo suite, a program to measure the delta distances between reference texts
and a target text. Delta was developed by John Burrows in 2002, and Rolling delta overcame the problem that one
delta figure covered a whole text by introducing windows of a particular size that ‘rolled’ through the text with
an overlap, so that many measurements were taken. In this way collaborations of authors and affinities between texts
could be detected. To reduce subjectivity in the choice of reference texts, all 100 drama reference texts from the Renaissance
period that had been collected over the years were used in the analysis. This took my PC to the end of its tether, but
the results gained from 5000-word windows, a step size of 250 words and character trigrams as variables give clear
indications of authorship and textual affinities, provided that no reference text is missing. A spreadsheet listed
the play titles in column A and, in column B, the delta value of each play at 2500 words (5000-word window). Column C
held the delta measurements taken at 2750 words (step size 250 words), and this continued to the right, depending on
the length of the target text. In each column (except A) the three lowest deltas were marked and all play titles without
any marking were erased. The remaining texts went into a new table (transposed, i.e. rotated by 90°), which can be found below. Column A
gives the measuring points at intervals of 250 words. From column B to the right, the texts (and authors) with the lowest deltas
are listed. The last two columns on the right give the scenes of the play and their cumulative word counts, aligned to
the 250-word intervals of column A. The lowest delta in each row is marked in green, the second lowest in yellow, and
the third lowest in red.
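
The spreadsheet procedure described above can be sketched in a few lines of Python. This is only an illustration of the filtering step, not the workflow actually used: the function names are hypothetical, and the play titles and delta values are invented placeholders, not results from the analysis.

```python
def three_lowest_per_point(deltas):
    """deltas maps a play title to its list of delta values,
    one value per measuring point (all lists have equal length).

    Returns one row per measuring point, holding the three
    (title, delta) pairs with the lowest delta, lowest first.
    In the table these ranks correspond to green, yellow and red.
    """
    titles = list(deltas)
    n_points = len(next(iter(deltas.values())))
    rows = []
    for i in range(n_points):
        ranked = sorted(titles, key=lambda t: deltas[t][i])
        rows.append([(t, deltas[t][i]) for t in ranked[:3]])
    return rows


def surviving_titles(rows):
    """Titles that were among the three lowest at least once."""
    return sorted({title for row in rows for title, _ in row})


# Placeholder data: five hypothetical plays, two measuring points
deltas = {
    "Play A": [0.91, 1.02],
    "Play B": [1.10, 0.88],
    "Play C": [0.95, 0.97],
    "Play D": [1.30, 1.25],
    "Play E": [1.05, 1.40],
}
rows = three_lowest_per_point(deltas)
for row in rows:
    print(row)
print(surviving_titles(rows))
```

Each row of the result corresponds to one measuring point, which matches the rotated layout of the table: measuring points run down the rows, and a play that never ranks among the three lowest (here "Play D") disappears.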