Statistical Writing With Others 

by Andrew Mosbo, Statistics

Writing a statistical report or article with others presents many obstacles. Of course, some of these issues apply to any type of writing, while others also apply when writing solo. In STAT5020W, several groups of three to four students must write reports by the end of the spring semester. As anyone who has participated in group projects knows, the “divide and conquer” method is almost always used. That is, Student A will write the introduction and summary, Student B will write the methods, Student C will write the results, and Student D will write the discussion and conclusion. This is not necessarily a bad approach, but it creates some important issues to review. 

The first is inconsistent terminology. While a simple fix, this has the most impact on the comprehensibility of a report. When statisticians are writing for non-statisticians, comprehensibility is paramount. Groupmates may use different terms for the same concept, which is fine when talking to other statisticians, but should be avoided when reporting to non-statisticians. For example, the inverse of a covariance matrix is called a “precision matrix,” but it can also be called a “concentration matrix.” Swapping between these two names in a paper is unnecessary and will make the report harder to follow. A reader should never be expected to know every name for the same thing. Even if the equivalence is stated, inconsistent terminology should be avoided. 
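One way to catch mixed terminology before the final edit is a small script run over the combined draft. The following is only a minimal sketch: the function name `find_mixed_terms`, the `SYNONYM_GROUPS` list, and the sample draft text are all made up for illustration, and the group would need to fill in its own list of equivalent names.

```python
import re

# Hypothetical synonym groups: each tuple lists different names for one concept.
# The group should agree on a single name per tuple.
SYNONYM_GROUPS = [
    ("precision matrix", "concentration matrix"),
]


def find_mixed_terms(text, synonym_groups=SYNONYM_GROUPS):
    """Return one dict per synonym group in which the text uses more than one name.

    Each dict maps the names that appear in the text to their counts.
    """
    # Lowercase and collapse whitespace so a term split across lines still matches.
    normalized = " ".join(text.lower().split())
    mixed = []
    for group in synonym_groups:
        counts = {
            term: len(re.findall(r"\b" + re.escape(term) + r"\b", normalized))
            for term in group
        }
        used = {term: n for term, n in counts.items() if n > 0}
        if len(used) > 1:
            mixed.append(used)
    return mixed


# Made-up draft in which two sections name the same object differently.
draft = (
    "Methods: The precision matrix was estimated from the sample. "
    "Results: The concentration matrix has many zero entries."
)
print(find_mixed_terms(draft))
```

A check like this only finds names that are already on the list, so it supplements a careful read-through rather than replacing it.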

The second is the voice. I and many others tend to use the active voice with the pronouns “we” or “our,” but many writers instead use the passive voice. Using pronouns tends to be better, as authors can specify whether they or the cited authors produced certain results. While switching between the two is less of an issue than inconsistent terminology, it should still be edited out to create a better flow between sections. For example, a methods section may contain a sentence like “A linear model was fitted…” while the conclusion summarizes the results as “Our linear model…” While not the worst mistake to make, it can cause confusion. Another example is this very blog post. The previous paragraph used no pronouns regarding the author. However, the second sentence of this paragraph used pronouns referencing the author, but not again for the rest of the paragraph, including this sentence and the previous one. Confusing, right? 

Some more general pieces of advice also apply to statistical writing, such as simplicity. For example, I once saw the first sentence below in a colloquium presentation, and the second sentence is an equivalent, simpler way of saying the first: 

  • “The vector space spanned by the basis of the column space” 
  • “The column space” 

These sentences mean the exact same thing. Another thing to avoid is redundancy. You should never say the same thing twice. Also, see what I meant about pronouns? 
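For readers who want to check the column space equivalence, here is a short sketch. The symbols are assumptions introduced only for this illustration and do not come from the presentation: A is a matrix with columns a_1, …, a_n, and b_1, …, b_r is any basis of its column space.

```latex
% Column space of A: the span of its columns.
\operatorname{Col}(A) = \operatorname{span}\{a_1, \dots, a_n\}

% By definition, a basis of a vector space spans that space, so for any
% basis \{b_1, \dots, b_r\} of \operatorname{Col}(A):
\operatorname{span}\{b_1, \dots, b_r\} = \operatorname{Col}(A)
```

The longer phrase describes the left-hand side of the second line and the shorter phrase describes the right-hand side, so the extra words add nothing.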

Many issues can arise from writing with others, but reviewing and editing can help significantly. Another way to avoid these issues is to write together. Asking the other authors for advice or opinions while you are writing can reduce the number of these errors. It also makes the writing process slightly easier, as massive edits do not need to be made at the very end, when everything is already written. Instead, substantial changes are avoided since the snowballs are removed before they can cause an avalanche.