Top Ten Findings

1. Did Fanny Write “The Destroying Angel” and “The Fair Cuban”? Probably.

Months of trials led us to one overarching result: the two stories allegedly authored by Fanny, “The Destroying Angel” and “The Fair Cuban,” do not fall significantly outside the boundaries of the rest of The Dynamiter on a stylometric level. In other words, our tests offer no solid evidence to support the claim that she was the author.

More curiously, other stories within the volume wound up more distinct from the comparative RLS corpus. The most likely explanations we can draw from this are:

Fanny did write the two stories as she says, but Mr. Stevenson edited them so heavily that his footprint became more defined than hers, or

Fanny thought up the stories and created them for her husband, and he was responsible for physically writing them down.

Either way, what we discovered is that our question was one of much more nuance than a simple distribution of writing responsibilities between Fanny and Robert Louis Stevenson.

2. The South Sea Tales are unique enough to be the work of a different writer 

The radical authorial differences in these stories compared to the rest of Mr. Stevenson’s work are astonishing, and our team spent a long time researching the possible explanations. In the end, while it may seem that Stevenson’s travelling and health influenced his writing heavily, more to the story can be found here.

3. “The Half-White” is an authorial anomaly

Fanny’s story of natives and mysticism, though it resembles her other work in theme and characterisation, is also very much like her husband’s work in a stylometric sense. We struggled to find an answer for this short story’s inconsistent placement in our results, but you’ll see what we ultimately discovered here.

4. The Corpora: Size Really Does Matter 

Because of Fanny’s limited body of work, and her tendency to write short stories rather than extended pieces, we had a very difficult time using all the material we collected. Read on through here to find out why, though all the Stevensons’ work is important to our research, we couldn’t use some of what we had. Much of our reading noted that a standardized sample size would be beneficial to tests like the kind we attempted, but since we couldn’t ask Fanny to write longer stories, we did our best to triangulate the texts into our corpora. The size of our samples was our single most challenging variable.
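To make the idea of a standardized sample size concrete, here is a minimal Python sketch of one common way to cut texts into equal-length samples. It is an illustration only, not the procedure we ran: the function name `fixed_size_samples` and the tiny sample size are assumptions made for the example.

```python
def fixed_size_samples(text, size):
    """Split a text into consecutive samples of exactly `size` words.

    Any trailing remainder shorter than `size` is dropped, so every
    sample returned has the same length. A text shorter than `size`
    therefore yields no samples at all.
    """
    words = text.split()
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]


# Hypothetical miniature "story"; real samples would run to thousands of words.
short_story = "one two three four five six seven"
samples = fixed_size_samples(short_story, 3)
```

The last comment in the docstring is the crux of the problem described above: a story shorter than the chosen sample size contributes nothing, which is why short pieces are so hard to fit into a uniform corpus.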

5. “Chy Lung, the Chinese Fisherman” is an outlier, and it’s because of a few specific words 

This story, much like “The Half-White”, was an outlier in a majority of our tests. The type of language used in “Chy Lung, the Chinese Fisherman” was demonstrably different from many of the other texts in the corpus. Function words like “for” and “near” had a significant impact on the comparative location of this story, and we learned quickly that distinguishing between parts of speech could make a big difference in the results. See what we learned about Fanny’s story here.

6. Genres are not Homogeneous

Compiling a corpus would have been much easier had we been able to consider cross-genre writing like nonfiction and letters. However, we worked with what we were given.  Trying to compare standalone short stories, samples from novels, and discrete, yet related, short stories from a collection was not an ideal methodology. This hindered our already formidable research question considerably. Given the work the Stevensons actually did, we had to do our best to remember that unexpected results could simply be an issue of genre.

7. Pronouns: Gender Blindness Doesn’t Work

Given that Henry James has fairly accurately described Robert Louis Stevenson’s work as having a notable “absence of care for things feminine,” removing the pronouns from our corpus aided us significantly. In simple terms, Fanny tends to write about women, and Robert Louis tends to write about men. By removing words like ‘s/he’ and ‘her/him’, we could look at the composition of their work outside the bounds of character gender. This gave us a more accurate image of authorial fingerprint and showed what Fanny and Robert Louis had in common or did differently, apart from writing gender-distinct characters. James continued in his explanation of Robert Louis that “everything he has written is a direct apology for boyhood,” and in many ways we discovered that to be the case. The tests that included pronouns were quite clearly affected by the inclusion or exclusion of female characters.
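For readers curious what removing pronouns looks like in practice, the following Python sketch filters gendered pronouns out of a tokenised text before any counting takes place. The pronoun list, the tokeniser, and the names `GENDERED_PRONOUNS`, `tokenize`, and `strip_pronouns` are all assumptions made for illustration; they are not the exact list or tooling used in our tests.

```python
import re

# Hypothetical pronoun list for illustration; a real study would decide
# carefully which forms to cull (for example, whether to include "it").
GENDERED_PRONOUNS = {"he", "she", "him", "her", "his", "hers",
                     "himself", "herself"}


def tokenize(text):
    """Lowercase the text and return its words (straight apostrophes kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def strip_pronouns(tokens, pronouns=GENDERED_PRONOUNS):
    """Return the tokens with every listed pronoun removed."""
    return [t for t in tokens if t not in pronouns]


tokens = tokenize("She gave him her word, and he kept his.")
culled = strip_pronouns(tokens)
```

After culling, only "gave", "word", "and", and "kept" remain, so any later frequency count no longer reflects whether the characters in a passage are men or women.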

8. Function Words: Unconscious Stylistic Decisions Matter

One of the central tensions within authorship attribution is between the value of function words and the value of outstanding content words. Authorial footprints can be related to noticeable language like superlatives, onomatopoeia, and the like, or even to a preference for certain nouns over similar ones. In reality, all nouns, verbs, and adjectives are content words. These words are decisions made by the author. Function words (conjunctions, articles, and other helper parts of speech) are more deeply ingrained within a personal subjectivity, and it’s harder to consider them as authorial choices. Rather, they make up what we refer to as an ‘authorial fingerprint,’ a mark left by the writer that might easily be forgotten or disregarded (except by digital humanists). These fingerprints were what we sought through our research: we not only read that function words were a better signifier of authorship, but also saw it borne out in our own trials. In The Dynamiter, the words that matter are the seemingly insignificant ones.
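As a rough illustration of how function words can be turned into a comparable ‘fingerprint,’ the Python sketch below computes the relative frequency of a handful of function words in two samples and measures how far apart the resulting profiles are. The word list, the Manhattan distance, and the names `FUNCTION_WORDS`, `function_word_profile`, and `manhattan_distance` are assumptions chosen for simplicity; this is not a description of the specific measures behind our results.

```python
from collections import Counter

# Illustrative list only; stylometric studies typically use the most
# frequent words of the whole corpus, often a hundred or more.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "for", "but"]


def function_word_profile(tokens, vocabulary=FUNCTION_WORDS):
    """Relative frequency of each function word among all tokens."""
    total = len(tokens)
    if total == 0:
        return {word: 0.0 for word in vocabulary}
    counts = Counter(tokens)
    return {word: counts[word] / total for word in vocabulary}


def manhattan_distance(profile_a, profile_b):
    """Sum of absolute differences between two profiles over the same words."""
    return sum(abs(profile_a[w] - profile_b[w]) for w in profile_a)


# Two hypothetical miniature samples.
sample_a = "the cat and the dog sat in the sun".split()
sample_b = "a cat sat for a while but a dog ran".split()

profile_a = function_word_profile(sample_a)
profile_b = function_word_profile(sample_b)
distance = manhattan_distance(profile_a, profile_b)
```

Both samples mention a cat and a dog, yet the profiles are far apart because one leans on "the" and the other on "a". The content words play no part in the distance at all.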

9. The Biographical Context: Do Your Research

Publication years also turned out to be telling of a text’s stylistic nature. We found a fairly regular pattern of change in writing over time for both Fanny and Robert Louis. Once we added the year of publication to our chapter and story titles, it was easier for us to track which periods of the Stevensons’ lives led to literary similarities. While date of publication may seem like an obvious factor in determining textual likenesses, it was very important for us to make sure we had a full picture of the writing context so as to identify and explicate outliers. Also, keeping in mind where in the world the Stevensons were as they wrote (Samoa, Skerryvore, and California, to name a few) was key to our work: the writing from the South Seas is very different from that of their time in Hyères. Reading the opinions of friends like Sidney Colvin, who knew the Stevensons at different times in their lives, also clued us in to personal developments that may have influenced their writing, like health or familial factors.

10. The Rest of The Dynamiter: Visualisations are Everything

There are three other chapters in the volume that produced results worth discussing. We considered the idea that Fanny had written them, particularly “The Tale of the Spirited Old Lady,” but this seemed altogether unlikely given the inconsistency of our results. Eventually, the “Spirited Old Lady,” along with “The Brown Box” and “The Squire of Dames,” led us to realize that the differences were mostly based on the kind of visual representation at which we were looking. Between Principal Component Analysis (PCA), Cluster Analysis (CA), Multidimensional Scaling (MDS), and Latent Dirichlet Allocation (LDA) (details of which can be found on our methodology page), we had widely varied pictures of our data. Different visualisations led to startlingly different results, and these three stories bore the brunt of the variance. In the end, our conclusion was that within a more controlled and uniform set of texts, the three tales would have produced more normalised results. As it happened, we were led to question pronouns and gender variance through our closer examination of these chapters, and as such our research grew and improved anyway.