The Rhetoric of Content Types

For think tanks, the structure of a piece of content isn’t always the same thing as its function. It’s time for our CMSs to stop treating them as if they were.

jjosephmiller
10 min read · Dec 14, 2020

It’s been nearly seven years—an entire generation in Internet Time—since Jeff Eaton published “The Battle for the Body Field.” The post beautifully sums up the central problem in producing online narrative content:

This fields-and-templates approach works great for content that follows predictable patterns, like product information sheets, photo galleries, and podcasts. It’s at the heart of NPR’s successful “Create Once, Publish Everywhere” system, and it’s hard to find a CMS or web publishing tool that doesn’t offer some way to model different types of content.

But Team Chunk has a deadly weakness. When narrative text is mixed with embedded media, complex call-outs, or other rich supporting material, structured templates have trouble keeping up.

Content strategists have solved a lot of problems since Jeff’s article. Quickly and easily producing clean, reusable narrative content isn’t one of them.

The state of think tank content management systems

Thankfully, the days of think tanks relying on default WordPress content types are mostly past. Open up the hood of a think tank CMS and you’ll probably see content types like publication, article, event, news, project and person.

On newer sites, you might also see a content type called longform or featured, assembled from page-builder-style components using tools like Gutenberg (WordPress) or Paragraphs (Drupal).

More about that below.

On the ambiguity of “content type”

Conversations often run aground on problems of definitions.¹ Indeed, “What is a content type?” has generated approximately 1743839754 Slack posts at Soapbox and prompted some long Twitter conversations. (Have I mentioned how much I love my job?)

If you’re an author, content type refers to the types of things you create. For think tanks, that means stuff like reports and briefs and working papers. If you’re someone who builds think tank websites, then content type is a term of art, one that revolves around particular collections of fields and permissions.

In the interest of preserving our company Slack as something other than a single-issue message board, I’m going to avoid using the term content type through the rest of this piece.

Also note that the remainder of this section is adapted from Deane Barker’s excellent Real World Content Modeling. If you’re already familiar with Deane’s book, you can skip to the next section. (If you’re not, now is a great time to pick up a copy.)

Attributes

An attribute is the most basic level of stuff that a CMS database holds. It consists most fundamentally of two things:

  1. A piece of data.
  2. A label for that data.

In practice, attributes hold a bunch of other information (e.g., an internal name, a user-facing label, a datatype, and a UI interface). All those things are important for how your website is built, but they aren’t as important here, so we’re going to ignore them.

What matters for our purposes is that an attribute is a data point together with the full set of things stored with that data point (datatype, internal name, user-facing label, etc.).
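For the developers in the audience, here is a minimal Python sketch of that definition. The field names (`internal_name`, `label`, `datatype`, `value`) and the `is_valid` check are my own illustration, not the schema of any particular CMS.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Attribute:
    """A piece of data plus everything stored alongside it (all names hypothetical)."""

    internal_name: str  # machine-readable name
    label: str          # user-facing label
    datatype: type      # the kind of data this attribute is allowed to hold
    value: Any = None   # the piece of data itself

    def is_valid(self) -> bool:
        # An empty attribute is fine; a filled one must match its datatype.
        return self.value is None or isinstance(self.value, self.datatype)


title = Attribute("title", "Title", str, "The Rhetoric of Content Types")
page_count = Attribute("page_count", "Page count", int, "ten")  # wrong datatype on purpose
```

Here `title.is_valid()` is true and `page_count.is_valid()` is false, because the string "ten" is not an integer.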

Entities

An entity is a set of attributes that forms a logically independent piece of content.²

On this understanding, an entity does not store data. (More precisely, it’s a sort of super-attribute that contains a machine name, a human name, and some validation and permission rules. Again, this is important for developing a CMS, but not so important here.)

The function of an entity is to act as a wrapper for a particular set of attributes. The information is contained in the attributes themselves.

What makes an entity type unique?

Leibniz’s principle of the identity of indiscernibles applies to entity types. That means entities P and Q are the same entity type if every attribute possessed by P is also possessed by Q, and vice versa.³
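As a rough sketch, the identity test comes down to comparing sets of attributes in both directions. The `EntityType` class and the attribute names below are assumptions made up for illustration, and the news and article examples are borrowed from note 3.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class EntityType:
    """A wrapper around a set of attribute names (hypothetical model)."""

    machine_name: str
    attributes: FrozenSet[str]


def same_entity_type(p: EntityType, q: EntityType) -> bool:
    # Set equality checks both directions: every attribute of p is in q,
    # and every attribute of q is in p. The machine name is ignored.
    return p.attributes == q.attributes


news = EntityType("news", frozenset({"title", "body", "publish_date"}))
article = EntityType("article", frozenset({"title", "body", "publish_date"}))
event = EntityType("event", frozenset({"title", "body", "start_time", "venue"}))
```

Under this test, `news` and `article` count as the same entity type despite their different names, while `event` does not match either one.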

Think tanks and the body field problem

For a long time think tanks could get away with only a handful of entity types, because every think tank adopted the same approach to the body field problem: stuff all the body content into a PDF.

When all the messy body content is inside a PDF, the remainder of your entity can be fully structured and easily templated. The thing itself is the PDF. The CMS simply stores all the metadata about the thing. It’s the CMS as document management system.

The (pre-redesign) Chatham House reader, which turns InDesign files into HTML.

The body-as-PDF solution works, so long as your definition of “works” includes “I don’t want anyone to read this thing.”

Thankfully, more and more think tanks are shifting from PDF-only to PDF-also models of publishing. That process runs in one of two directions.

  1. Turn InDesign files into clean HTML. This process requires discipline in creating your InDesign files themselves—no inline styling!—a sophisticated set of CSS styles for the website, and some custom code to massage the export into something your website can read. We’ve built such a solution for Chatham House and the World Resources Institute.
  2. Generate a PDF from HTML. Yes, there is print CSS. But many think tanks need more than that. A lot of policymaking still happens face-to-face. Putting a Printed Thing on a Minister’s desk at the start of a conversation carries real weight, which means that any sort of PDF you generate needs to look like something that has been professionally typeset. The International Budget Partnership does this quite well. Here’s the HTML. And the PDF, autogenerated from the HTML.

No question that both options are a huge improvement over the PDF-only model. But they also introduce a whole new wrinkle into the body field problem.

The same kinds of content—let’s say, a policy brief—can now have different sets of CMS attributes.

The essence of briefs

No, not that kind. The kind that think tanks write. Here’s a nice summary from the University of North Carolina:

A policy brief presents a concise summary of information that can help readers understand, and likely make decisions about, government policies. Policy briefs may give objective summaries of relevant research, suggest possible policy options, or go even further and argue for particular courses of action.

Infographic from the Royal Academy of Engineering.

The important thing about policy briefs is that they are intended for a very specific audience and for a very specific purpose. They are aimed primarily at people who make decisions about policies and secondarily at people who influence the people who make decisions about policies.

So what form would a concise summary plus policy options take? There’s really no one answer. A brief might be an infographic, a chartbook, a traditional narrative, or a collection.

Structurally, these are very different things.

  • An infographic may need to execute extra code or allow for a full-width embed.
  • A chartbook needs repeating paired fields for an interactive chart plus corresponding text.
  • A traditional narrative needs mostly text interspersed with embed fields.
  • A collection needs to hold some introductory text and some fields that allow editors to curate a number of distinct pieces of content.

And, of course, any (or all) of those things could exist as HTML or they could be contained inside a PDF.

So is a brief one entity type? Or have we just described four different entity types?

Authors and readers vs the CMS

If we take seriously the idea that an entity is a logically independent collection of attributes, and that two things with different attributes are different entities, then the answer is four.

From the perspective of a CMS, an entity is all about the structure of the content.

But if your concern is about providing concise summaries that help decision-makers—and those who influence them—understand and make decisions about public policy, then the answer is one.

From the perspective of an author, a thing’s type is all about the rhetoric of the content.
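The two answers can be put side by side in a small sketch. The attribute sets are hypothetical stand-ins for the four structures listed earlier; only the counting logic matters here.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical attribute sets for the four structures a brief might take.
BRIEF_STRUCTURES: Dict[str, FrozenSet[str]] = {
    "infographic": frozenset({"title", "summary", "full_width_embed", "custom_script"}),
    "chartbook": frozenset({"title", "summary", "chart_text_pairs"}),
    "narrative": frozenset({"title", "summary", "body_text", "embeds"}),
    "collection": frozenset({"title", "summary", "intro_text", "curated_items"}),
}

# Each item pairs a rhetorical function with a structure. All four are policy briefs.
Item = Tuple[str, FrozenSet[str]]
items: List[Item] = [("policy brief", attrs) for attrs in BRIEF_STRUCTURES.values()]


def count_by_structure(items: List[Item]) -> int:
    """The CMS's answer: how many distinct attribute sets are there?"""
    return len({attrs for _, attrs in items})


def count_by_rhetoric(items: List[Item]) -> int:
    """The author's answer: how many distinct rhetorical functions are there?"""
    return len({function for function, _ in items})


print(count_by_structure(items))  # 4
print(count_by_rhetoric(items))   # 1
```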

What is rhetoric?

I’m quite fond of the term rhetoric, for all that it’s a bit old-fashioned.

My very first professional job: Lecturer in Rhetoric at my alma mater.

Rhetoric refers to the way that content needs to work in order to achieve its goals.

It’s about knowing the best way to say things such that they inform, persuade, entertain, or make it possible for a person to do a thing.

It’s about knowing how to string words and ideas together in such a way as to have the desired effect on your readers.

It’s why we’re moved when Barack Obama delivers a speech and befuddled when Donald Trump spits out word salad.

Good rhetoric delivers the right information, in the right way, at the right time.

Rhetoric can take many forms, and many of those forms have a specific structure. But rhetoric isn’t simply about having a structure. Rhetorical skill means knowing which structure is most effective for your audience and then stringing together compelling sentences and visuals to enable that structure to work.

Entities and rhetoric

Entities define the attributes that make up a piece of content. Rhetoric is about the actual data contained inside those attributes.

An example might help.

Consider a bicycle and an e-scooter. They are made up of very different components—one has a seat and pedals and the other has an electric motor and a charging port. But in most cities, the two have the same function: helping people commute (relatively) short distances more quickly than they could do so on foot.

An e-scooter’s technical schematics are an entity.

Users rent an e-scooter for its rhetorical function.

If you’re a company in the business of renting out transportation modes, it certainly matters a lot whether you specialize in bicycles or e-scooters. Having the right set of technical schematics definitely matters a lot to the people who build the devices you’ll be renting.

Commuters don’t care which thing you make, so long as they can find one of them right outside their door when they’re running late for work.

What does this mean for our CMS?

In some industries, there is a one-to-one relationship between a CMS entity and a rhetorical function. (There’s a good reason that every introduction to structured content uses recipes as its very first example.)

Think tank content isn’t like a recipe.

Items with different rhetorical functions can have identical sets of attributes. This is most evident for legacy content that is posted as PDF files. A brief and a working paper have very different rhetorical functions. But the rhetoric is entirely contained in the PDF. The structural bits stored in the CMS are usually identical.

And, as we discussed earlier, items with different sets of attributes can have identical rhetorical functions.
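One way to make the contrast with recipes concrete is to test whether the mapping between rhetorical functions and attribute sets is one-to-one. The sketch below uses invented attribute names and two toy catalogues; it describes the shape of the problem, not any real site.

```python
from collections import defaultdict
from typing import FrozenSet, Iterable, Tuple

Item = Tuple[str, FrozenSet[str]]  # (rhetorical function, attribute names)


def is_one_to_one(items: Iterable[Item]) -> bool:
    """True if each function has exactly one structure and each structure one function."""
    structures_by_function = defaultdict(set)
    functions_by_structure = defaultdict(set)
    for function, attrs in items:
        structures_by_function[function].add(attrs)
        functions_by_structure[attrs].add(function)
    return all(len(s) == 1 for s in structures_by_function.values()) and all(
        len(f) == 1 for f in functions_by_structure.values()
    )


# Hypothetical attribute sets.
PDF_WRAPPER = frozenset({"title", "authors", "publish_date", "pdf_file"})
CHARTBOOK = frozenset({"title", "authors", "publish_date", "chart_text_pairs"})

recipe_site = [("recipe", frozenset({"title", "ingredients", "steps", "cook_time"}))]
think_tank = [
    ("brief", PDF_WRAPPER),
    ("working paper", PDF_WRAPPER),  # different rhetoric, identical attributes
    ("brief", CHARTBOOK),            # same rhetoric, different attributes
]
```

`is_one_to_one(recipe_site)` holds, while `is_one_to_one(think_tank)` fails in both directions at once.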

A missing distinction

The underlying problem is that there’s a distinction that content management systems fail to capture. The CMS is famous for enabling designers and authors to separate content from presentation. (Well, every CMS other than the one that rhymes with Bird Dress.)

Separating collections of attributes from rhetorical function is every bit as important. But it’s poorly supported in major open source CMSs. I think there’s a case to be made that content strategy is currently stuck with the equivalent of tables and spacer gifs for building content models.

Failure to separate out these concepts is at the heart of our labeling issues: Does content type refer to a rhetorical function or to a logically distinct collection of attributes?

It’s also at the heart of a bunch of technical challenges—for example, should a CMS build workflow and permission systems around collections of attributes or around rhetorical function? The fact that most CMSs privilege the former creates a host of implementation challenges.
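To see what the alternative might look like, here is a small sketch in which approval is keyed to rhetorical function rather than to the collection of attributes. It borrows the news and article example from note 3. The `Item` class, the `text_page` entity type, and the lookup table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    rhetorical_function: str  # what the content is for
    entity_type: str          # which collection of attributes it uses


# Workflow rules attached to the rhetorical function, not the entity type.
APPROVER_BY_FUNCTION = {
    "news": "director of communications",
    "article": "director of research",
}


def approver_for(item: Item) -> str:
    """Look up the approver from what the item does, ignoring how it is structured."""
    return APPROVER_BY_FUNCTION[item.rhetorical_function]


# Both items share one entity type but follow different approval routes.
announcement = Item("New fellowship announced", "news", "text_page")
analysis = Item("What the budget means for schools", "article", "text_page")
```

Because the rules hang off the rhetorical function, there is no need to clone the `text_page` entity type just to give news and articles different approvers.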

But there are workarounds.

Our team at Soapbox very recently helped the Ada Lovelace Institute launch a new website that uses what we call our Flexible Publishing Model. It addresses some of these problems by allowing authors to first choose the rhetorical function and then choose the structure. We’ve an ongoing R&D project to streamline the authoring experience.
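To be clear, what follows is not the Flexible Publishing Model’s actual code. It is only a toy sketch of the two-step choice described above (rhetorical function first, structure second), with every name and attribute set invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

# Hypothetical structures, each a set of attribute names.
STRUCTURES: Dict[str, FrozenSet[str]] = {
    "narrative": frozenset({"title", "body_text", "embeds"}),
    "chartbook": frozenset({"title", "chart_text_pairs"}),
    "pdf": frozenset({"title", "pdf_file"}),
}

# Hypothetical rules: which structures each rhetorical function may use.
STRUCTURES_BY_FUNCTION: Dict[str, Tuple[str, ...]] = {
    "brief": ("narrative", "chartbook", "pdf"),
    "working paper": ("narrative", "pdf"),
}


@dataclass(frozen=True)
class Draft:
    rhetorical_function: str
    structure: str
    fields: FrozenSet[str]


def start_draft(rhetorical_function: str, structure: str) -> Draft:
    """Step 1: pick the rhetorical function. Step 2: pick one of its allowed structures."""
    allowed = STRUCTURES_BY_FUNCTION[rhetorical_function]
    if structure not in allowed:
        raise ValueError(f"{structure!r} is not offered for {rhetorical_function!r}")
    return Draft(rhetorical_function, structure, STRUCTURES[structure])


draft = start_draft("brief", "chartbook")
```

The design choice worth noticing is that the structure list is looked up from the rhetorical function, so authors start from what the piece is for and only then decide how to build it.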

In an upcoming post, I’ll talk more about what we’ve done thus far and where the team is headed.

Notes

¹ Fun fact: an entire school of early 20th-century philosophy, logical empiricism, held that all disagreements are rooted in imprecise language, and that if we limit our arguments to well-formed, meaningful sentences with shared definitions of terms, disagreement would evaporate. Logical empiricism, like much else in philosophy, is both wrong and grounded in important and instructive insights. Not all disagreements are linguistic. But a lot are.

² This is where things get a bit risky. I’ve chosen to use entity here to avoid the controversy around content type. But of course entity has a technical meaning in Drupal, one that is broader than the Drupal meaning of content type. For example, an image plus its associated metadata is a Drupal entity, but it’s not a content type. Where necessary, I’ll refer to entities (logically connected, independent sets of attributes) and nested entities (logically connected but not independent sets of attributes).

I’m not sure if this will ultimately work. But that’s a problem for Future Joe.

³ Remember that business about entities being super-attributes? This is where it comes back into play. If I have an entity called “news” and another called “article,” both of which contain the same set of attributes, then they aren’t really two entities; they’re subtypes of a single entity type.

But now suppose that news and article have different workflows. Perhaps news items need to be approved by the director of communications while article items need approval from the director of research. Since permissions are an attribute of content type, these may need to be separate.

In my view, this is less an objection to my overall argument and more an additional point of evidence for the claim that content type and editorial function should be treated separately at the CMS level.

⁴ IBP’s Open Budget Survey is cool for all sorts of reasons. The page-to-PDF bit is one of them. The fact that they’ve also effectively turned Drupal into a component content management system is another. You can read more about that at On Think Tanks.

⁵ The basic rules of the English language would imply that content type should refer to the kinds of things an organization produces. In practice, a great many content management systems use content type as a technical term referring to a particular collection of attributes. This is why your product teams need content strategists.

jjosephmiller

Employing hypertext to explore ambiguous idea spaces. Principal, Fountain Digital Consulting. Author SCREENS, RESEARCH AND HYPERTEXT. Recovering philosopher.