How ChatGPT can change the conversation about summarizing and revising in scholarly writing

To begin, a caveat: I know almost nothing about Large Language Models (LLMs) in general or ChatGPT in particular. I will also admit that my immediate reaction is to want to hide my head in the sand, to fall back on the old humanist values of originality and that ineffable concept that often goes by the name of “soul.” Finally, I will not hide my pleasure at the fact that GPT (Generative Pre-trained Transformer), pronounced in French, sounds like “J’ai pété” (“I farted”). It won’t stop the spread of this kind of technology, but at least it gives me a measure of puerile amusement.

As someone whose primary job is to work with graduate students on their scholarly writing, I find that the accessibility and power of ChatGPT (and other such bots) inevitably raise concern, but I agree that this technology is also an opportunity to rethink assignments and how we assess knowledge, learning and critical thinking.

My discussion here is specifically about one aspect of scholarly writing that is both massively important and chronically undervalued: revision. I don’t mean editing, which is a relatively mechanical task, involving correcting small errors in spelling, word choice, punctuation and syntax. I actually think a bot could be very helpful as a copy-editor. I mean revision, the larger, structural and stylistic transformations that more or less gradually turn a messy draft into a coherent, cohesive piece of writing.

Writers often dread revision. I suspect that this is in part because many writers, and particularly graduate students, still think of writing in terms of product without recognizing the importance of process. The need to revise can feel like a problem or even a failure if what we have in mind is the perfect final product. In my workshops I try to change this view by framing revision as a necessary good, rather than a necessary evil (let alone a mark of failure to get it right the first time–an impossible goal with any writing that achieves something new).

Why do drafts of a dissertation chapter or journal article almost always need at least one or two rounds of significant revision? Because drafts are a form of pre-writing, of outlining. Don’t think of them as an attempt at the final product. Think of drafts as an outline in prose, a sketch of the chapter/article to be. This is why revision is necessary: revision takes the inevitably messy/shitty draft and re-shapes it into something closer to the final product, though it may need revisions of its own.

Here’s where ChatGPT comes back in. Until yesterday, I had thought of this technology as something that could write for you. What I had seen of ChatGPT’s outputs made me think the panic about it was far overblown. From what I could tell, the bot could write convincingly like a human, but not like a human who knows how to write. The best outputs I had seen were C+/B- level at a first-year university level, to be generous. For now, then, I didn’t see ChatGPT as a problem for graduate writers, who as a whole aim for something rather better than a B- at a Biology 101 level.

But yesterday I attended a discussion on ChatGPT and its implications for graduate education, and I was made aware of some of the other functions that actually complicate my response. I mean specifically its ability to summarize texts and to revise drafts. This made me more nervous, not because I felt that it threatens my job but because I can definitely see the appeal of these functions for many graduate researchers. I can imagine that a graduate student would want to use GPT to summarize an article they’re reading, or to summarize their own article into an abstract. I can also imagine a harried grad student wanting automated help in transforming a messy first draft into something a bit more shapely. As you can probably guess from what I said about revision’s role above, this is worrying.

First, summarizing. Summarizing is a basic skill for researchers who engage with huge amounts of reading, and it’s a difficult one. Summarizing is often also an unpleasant task–I know this from having to write abstracts summarizing my own articles, and really finding the work unexciting and frustrating. How tempting to get a bot to do it for you! But here’s the thing: summarizing is not a neutral task. Like translation, summary is an interpretation. When you summarize your own article–or someone else’s–you’re not just extracting its basic points: you’re choosing what counts as its basic points, for your purposes. As a purely linguistic tool, ChatGPT can’t make those decisions for you. In fact, YOU might not really know what those basic points are UNTIL YOU DO THE SUMMARIZING. In other words, summary is not just interpretation: it’s an act of meaning creation. ChatGPT can mimic creativity, but I don’t believe it can do it for real.

(For more on this line of thinking, you might be interested in this pre-print: Mahowald, Kyle, et al. “Dissociating language and thought in large language models: a cognitive perspective.” arXiv preprint arXiv:2301.06627 (2023).)

Now, revision. Just as I can imagine a grad student wanting automated help in summarizing something, I can easily picture the temptation to use ChatGPT to revise a draft. The reasons are similar: revision is a lot of work, and without a method it can feel not only slow but sometimes counterproductive. (There are plenty of very good methods of revision, including reverse outlining, abstracting, and sentence outlining–on which more soon.) But the tempting automation of revision concerns me even more than the automation of summary, though for similar reasons. Revision is an act of refining and organizing thought. It is the act of recognizing which half-formed ideas are worth foregrounding and which are worth eliminating, as well as the act of turning a mess into an argumentative progression that is often very inchoate in early drafts.

Think of revision as thinking. Or, think of revision as a kind of whetstone for your ideas, a sparring match in which you, the Reviser, parry with another self, the Drafter, pushing that former self to explain their ideas, to unpack implications, to order information in accordance with the significance of your contribution to the field, as a real reader would need that information to be ordered. ChatGPT can’t do that for you because it cannot know what that significance is; what’s more, YOU often don’t know what that significance is–or at least you don’t know it as well and clearly as you could–until you put yourself and your draft through the act of revision.

There’s a scene in The Matrix when Neo is plugged into a program and emerges with an amazed expression, “I know kung fu.” Wouldn’t that be nice? But would it work? Can you know something without rehearsal, without putting your body and/or mind through the moves, through trial and error?

Think of revision not as a form of correction, but rather as a step in an education. A messy first draft (a redundant expression because first drafts are always messy)… A messy first draft is not a failed performance: it is a rehearsal early in a sequence of rehearsals, in which revision plays a crucial role. ChatGPT can give you notes, I guess, but it can’t do the revision for you. It can perform a mimicry of revision, and that might be helpful. But it doesn’t do the work. I mean, it doesn’t think, or have ideas.

All the above is a specific version of a larger (and older) argument about the role of writing in scholarship. In some fields more than others, writing is often seen as a bothersome extra task, something separate from the research. The argument that your research doesn’t matter until it’s written/published doesn’t go far enough: that still implies that writing is an add-on. But writing is more than an add-on: writing is not just “writing up.” Even in experimental or quantitative fields, it includes activities we might not think of as writing, such as talking through the study with collaborators, supervisors and conference attendees; producing outlines; generating hypotheses and alternative hypotheses; experimental design; note-taking; and of course drafting, revising and revising again.

I understand that in some fields, writing is more instrumental than it is in mine, where the research happens largely through the writing process. But even instrumental writing isn’t just an output. There is a feedback process, a recursivity performed by the acts of drafting and revising, that is crucial not only to clarifying your ideas for readers, but also to generating those ideas for yourself. ChatGPT might seem like a convenient shortcut, but skipping the feedback process is not like cutting diagonally across an intersection to avoid crossing twice: it’s like going from A to C when C cannot truly exist without B.

“We write and revise our earliest drafts to discover and express what we mean,” as Joseph Williams puts it in Style: Ten Lessons in Clarity and Grace. If a bot is revising for us (let alone writing for us), we are not merely skipping a burdensome task on the way from draft to manuscript: we are forfeiting the creation of knowledge and clarity that emerge from the messy task of revising.

No doubt I’ll have more to say about this in the coming months and years. In fact, I have no trouble admitting that this is an unrevised first draft (except for the belated addition of the Williams quotation above), undoubtedly still largely unformed as an argument. It’s fine for a post on a blog no one reads, but I would certainly revise it (and show it to other readers for comment) before submitting it to another platform.

For now, I simply wanted to write about this. ChatGPT is an opportunity here: it makes it even more apparent that we need to rethink how we present writing and revising to our students, especially our graduate students. Many graduate students are never explicitly told anything about writing and especially not about revising (except that it needs to be done), and I think the massive task of supporting our graduate students has always included the need to clarify the role of revision in the process. ChatGPT’s ability to perform an imitation of revision makes that old problem new again.

I wonder if there’d be interest in a workshop “What ChatGPT teaches us about the importance of revision”?

Work cited

Joseph M. Williams. Style: Ten Lessons in Clarity and Grace, 6th ed. Longman, 2000.

Essay contest for PhDs or Postdocs

Here is a fantastic publication opportunity for doctoral students and postdocs for whom English is not the first language. In addition to the book prize and the publication of the essay, this opportunity enables an emerging scholar to expand their impact beyond their primary field and to gain some significant editorial experience with a journal. In today’s dismal job market, these experiences could give an extra boost to a strong research and teaching record.

See the prize website for details. Basics are reproduced, slightly abridged, below:

CHRISTOPHER BRUMFIT ESSAY PRIZE 2021

Language Teaching announces the award of an essay prize which honours one of the founding editors of this journal.

PRIZE

The winner will receive a £500 credit to be used to purchase books available in the current Cambridge University Press catalogue. The winning essay – revised where appropriate in line with referees’ comments – will be prioritised for publication in the first available issue of the journal. The winner will be nominated for a one-year period as a member of the Language Teaching Editorial Board and designated in all outlets of the journal as the “Christopher Brumfit Award Winner”.

Write an essay which presents an argument of relevance to second/foreign language learning or acquisition. This essay should not be a research report, with its standard format of literature review, methodology, results, and conclusion. What is required is something different. Like all academic work, it should be rational and scholarly. The chosen topic should ideally be handled with aplomb, seeking to engage and entertain readers as well as inform them, and stimulate ideas. The essay might also reflect on, challenge, or question any number of ideas and assumptions. It may therefore be speculative, provocative, personal, or controversial. Above all, it should be interesting.

See the prize website for details.

Communicating scholarship from left field

The results of the 2020 “Dance Your PhD” competition are in (for info, see here; for some of the videos, see here). The overall winners this year have won extra approval from 100% of preteen boys in my house. Not bad for atmospheric physicists who do simulations.

There’s so much to love about this competition, starting with the fact that it exists at all. But of course this kind of eccentric approach to communicating research is not just whimsical. Like the 3MT competition (3 minute thesis), Dance Your PhD can be hugely productive for researchers not despite but because of its constraints and creativity. I once asked a group of PhDs in chemistry to draw their research in a single cartoon panel, and the results were surprisingly informative–not just to me but also, apparently, to them. Part of this may be due to the freedom to relax and just try something new, but I think more is gained from the constraints that these activities impose. By disallowing researchers from using the same old explanations, you can help them find new and often better ways to reach their various audiences (whether general or specialized).

The more I try these activities, the more I believe they actually work. I’ve often had grad students write their dissertation in six words (a version of the classic 6-word novel, not actually invented by Hemingway, by the way). It’s a quick but surprisingly challenging experiment that often yields clarifying results. One version for my own current project is this: “Science uses (and creates!) narrative forms.” I’ve also seen some fantastic results from having students write up their project as the blurb for a limited series that people would actually want to watch. Another trick I tried–perhaps my most interesting yet–was to get students to invent the perfect epigraph by whomever they want for their own dissertation; it was incredible to see how well these invented epigraphs got at and even uncovered the key concerns of the project. (I’m going to ask my students for permission to reproduce some of these experiments here; stay tuned!)

I don’t imagine I’ll be asking my students to dance their dissertation, but I will certainly encourage them to watch the winning entries. Now, I’m going back into that rabbit hole myself.

Academic Writing Pattern # 2: “with-linked phrases”

Like the “and is” pattern covered in AWP #1, this common pattern is not incorrect. Nor is it even necessarily awkward or otherwise undesirable. But some writers, especially in the sciences, use it uncritically and too often. As a case in point, the examples I quote below were all taken from articles I already happened to have on my hard drive; of all the papers I opened searching for the “with-link” pattern, only two didn’t use it at least once.

From my observations I’d say this pattern, like “and is,” tends to appear early in papers, when authors are trying to get a load of background covered before getting to the real matter of interest. But it can appear anywhere. Here is an example from an article on CBC.ca:

Variants recently identified in the United Kingdom, South Africa and Brazil are transmitting much more easily than the original strain, with the first estimated to be at least 56 per cent more transmissible.

Adam Miller, “How the spread of coronavirus variants could completely change the pandemic in Canada.” CBC.ca (16 Jan. 2021).

In this pattern, the preposition “with” serves as the connection between the main sentence and a new piece of information contained in a phrase.

Here are a few more examples, these from scholarly articles:

Development of improved systems for CO2 conversion has been an area of intense activity, with significant emphasis placed on EC design and on discovery of improved CO2 reduction (4, 5) and H2O oxidation (6) catalysts based on nonprecious earth-abundant elements.

Kunene et al. “Solar-powered synthesis of hydrocarbons from carbon dioxide and water.” PNAS 116.20 (2019): 9694.

Coot eggs in a nest hatch asynchronously, with an average of 9 to 10 eggs hatching over a period between 2 and 11 d, depending on the nest (median 6 d).

Lyon and Shizuka. “Extreme offspring ornamentation in American coots is favored by selection within families, not benefits to conspecific brood parasites.” PNAS 117.4 (2020): 2057.

The PPA is thought to represent the basic geometry and content of a scene, with these representations being intolerant to low-level shifts in information (Epstein, 2008)….

Douglas et al. “Perception of Impossible Scenes Reveals Differential Hippocampal and Parahippocampal Place Area Contributions to Spatial Coherency.” Hippocampus 27 (2017): 61.

As you’ll probably agree, these sentences are fine. Fine: that is, not bad, not ungrammatical. But could they be better? My first observation is that these constructions sound very unlike the way people actually speak, even when they speak to fellow specialists about their research. So: is the “with” actually helpful; is it actually necessary? More to the point, what alternative arrangements does it conceal?

In many cases, the simplest alternative is just to delete “with,” leaving the sentence otherwise unchanged. This isn’t always possible, but when it is, it’s worth a try. I suspect that the “with” can be removed this way when the phrase it introduces contains a verb ending in -ing. The second and third examples above are exemplary:

Coot eggs in a nest hatch asynchronously, an average of 9 to 10 eggs hatching over a period between 2 and 11 d, depending on the nest (median 6 d).

The PPA is thought to represent the basic geometry and content of a scene, these representations being intolerant to low-level shifts in information (Epstein, 2008)….

In other cases, you can replace the “with” with a verb (ending with -ing or not, as the case requires). For example, here is a revision of the first part of the first quotation:

Development of improved systems for CO2 conversion has been an area of intense activity, which has placed significant emphasis on EC design…

Why is this better? Well, whether it is better is a matter of taste, I guess. I prefer it because the new verb (“has placed”) adds action to and therefore clarifies the relationship between various parts of this short sentence. “With” represents the same relationship but less evidently, less actively.

Look at a sentence I wrote earlier: “In many cases, the simplest alternative is just to delete ‘with,’ leaving the sentence otherwise unchanged.” I could have used the “with-link” arrangement instead: “In many cases, the simplest alternative is just to delete ‘with,’ with the sentence otherwise unchanged.” I hope you’ll agree that the original just sounds more natural (and not just because the revision ends up repeating “with” twice in a row). Instead of “with,” my original sentence uses a verb ending with -ing: leaving. Action!

The benefit of such a strategy becomes clearer when you’re dealing with longer, more complex sentences, as in the full version of the first quotation above:

Development of improved systems for CO2 conversion has been an area of intense activity, significant emphasis being placed on EC design and on discovery of improved CO2 reduction (4, 5) and H2O oxidation (6) catalysts based on nonprecious earth-abundant elements.

I would actually recommend dividing this sentence into two sentences–not because it’s that long, but because as I read it it really does address two ideas (1 = this is an area of intense activity; 2 = within this area of activity, special emphasis is placed on X). But if it’s one sentence I think this revision is clearer, more active and (not unrelatedly) more like spoken English.

Academic Writing Pattern # 1

This is the first of my posts in my Notes on Academic Writing series. It will focus on small but (to my eye) troublesome patterns in academic writing, as well as on neat tricks for improving academic prose. The issues addressed in this series are not the usual ones. Nor are they all that important–nothing like the issues of concision, clarity, cohesion and coherence that are treated so well elsewhere (for instance in my colleague Rachael Cayley’s fantastic blog Explorations of Style). No, this series is about small issues. But even small issues can have a big effect on the clarity, rhythm, professionalism and beauty of academic writing.

This first post is about one of those troublesome patterns, what I call the “and is” construction. It’s particularly common in scientific writing. Here are two examples, with the pattern highlighted:

In light of scientific developments in the field of medical research, the document aims to address a range of issues which involves ethical controversies and is criticised by pro-life ethicists.

Patrick Foong. “Human Embryonic Stem Cell (HESC) Research in Malaysia: Multi-Faith Perspectives.” Asian Bioethics Review 3.3 (2011): 191.

The Gini coefficient has gained popularity in the social sciences as an accepted way to measure income inequality (Allison 1978) and is used in many studies of income inequality.

Laura Duncan. “Money and Membership: Effects of Neighbourhood Poverty, Income Inequality and Individual Income on Voluntary Association Membership in Canada.” Canadian Journal of Sociology 35.4 (2010): 580.

These sentences, though correct, both sound a bit “off” to me. What accounts for this feeling? Without doing a more in-depth analysis, I suspect two causes. One is the fact, demonstrated above, that the sentence’s subject is linked to two verbs, the first of which is active and the second passive. Thus, in the first example, “the document” is the subject associated with two verbs: “aims” (active voice) and “is criticised” (passive voice). In the second example, the subject “Gini coefficient” is associated with two verbs: “has gained” (active voice) and “is used” (passive voice).

The second factor, which is more subjective and difficult to assess, is the fact that while both parts of each sentence are “about” the same topic (the document, the Gini coefficient), each part is about a fairly distinct aspect of that topic. In the first sentence, the first half is about what the document does (what it “aims to address”), while the second half is about how people have reacted to it. In short, though both parts of the sentence are about the document, they are still about two very different aspects of that document–a difference that is somewhat obscured by the way both halves of the sentence are connected (with a simple “and,” as if both sides were equivalent or symmetrical). Another way to put it is that the two parts appear to be different kinds of statement about the same topic.

Here is one more example, which also reflects the issues I mentioned above.

Intriguingly, these organisms have only a single noncentromeric histone H3 that resembles H3.3 and is deposited during both replicative and nonreplicative phases of the cell cycle.

Harmit S. Malik and Steven Henikoff. “Major Evolutionary Transitions in Centromere Complexity.” Cell 138 (2009): 1071.

Again, we have a subject (“single noncentromeric histone H3”) working with two verbs in active (“resembles”) and passive (“is deposited”) voice. Like the other examples, this two-verb pattern also mixes description (“resembles”) with action (“is deposited”).

To get a better sense of why “and is” sentences often seem awkward, it may be helpful to look at an exception, where the “and is” structure works rather well. For example:

This software is institutionally available to U.S. Fish and Wildlife Service employees and is thus cost effective.

Victoria M. Hunt et al. “A Decision Support Tool for Adaptive Management of Native Prairie Ecosystems.” Interfaces 46.4 (2016): 339.

Here, the two halves of the sentence seem to be “about” the same or at least closely related aspects of the software in question. In fact, the second fact about the software (i.e. that it is cost effective) is directly related to the first fact about it (i.e. that it is institutionally available to employees). Compare that pattern to the sentences above. Note, too, that both verbs here are in the same voice (in this case active voice).

But I’m not really convinced that I’ve put my finger on why most “and is” sentences seem awkward and why some don’t. Consider this hypothetical case:

Malaria is a leading cause of mortality and is one of the most challenging infectious diseases to control.

This sentence does not fit the patterns I identified in the first three quoted sentences, yet it still sounds awkward in the same way. The voice is consistent here (there is no switch from active to passive voice, or vice versa); furthermore, I think it’s fair to say that both parts of the sentence are about more or less the same aspect of the same topic (both are descriptive and both deal with the seriousness of malaria). Yet I find that the “and is” structure is problematic in the same way as the three quoted examples above. I can’t quite figure this out.

I will keep thinking about this issue. But I suspect that the pattern I identified above is generally responsible for the oddity of “and is” sentences. In the meantime, don’t fret: it’s not a major problem! Still, I think it’s good to identify patterns that make our writing less effective, as well as patterns we end up relying on.

To conclude, I’ll go back to my hypothetical example to offer some easy fixes. In many cases, the easiest solution to the “and is” issue is to divide the sentence into two. But there are other options that might be preferable, such as

Malaria, a leading cause of mortality, is one of the most challenging infectious diseases to control.

A leading cause of mortality, malaria is one of the most challenging infectious diseases to control.

Malaria is a leading cause of mortality and among the most challenging infectious diseases to control.

These are just three of many more possible revisions….