Author: Reading the Pictures

I write a small newsletter about environmental policy, which means I spend a lot of time reading things I only partially understand. Last fall I was trying to make sense of a paper in Nature Climate Change about sea-level projections. Not because anyone assigned it to me, but because I’d read a newspaper column that cited it, and the column was doing that thing newspaper columns do when they summarize research: collapsing months of careful work into a sentence designed to make you either angry or afraid. I wanted to see for myself what the researchers actually found.

The paper was open-access, which was the first good sign. I downloaded the PDF, scrolled past the abstract (too compressed to be useful to anyone outside the field) and did what I suspect most non-specialists do when they open a journal article: I went straight to the figures.

This is, I think, common behavior. When you don’t know a discipline’s vocabulary well enough to parse its prose, the figures become your interpreter. They’re supposed to condense the finding into something your eyes can grasp faster than your reading can. The figure is where you go to decide whether the rest is worth your time.

So the figures in this particular paper should have told me something. They did, technically. They told me the authors were using MATLAB, probably with default settings. The color scheme was a rainbow gradient, red through yellow and green to blue, applied to data where the only meaningful distinction was “higher” and “lower.” The axis labels were in 8-point font. One figure had a legend that required me to distinguish between six shades of blue-green, which on my laptop screen compressed into three.

I stared at Figure 3 for what felt like a long time. A time series of some variable I thought I understood. The lines went up. That part was clear enough. But the confidence intervals overlapped so thoroughly that the visual message amounted to: anything could happen. Which may have been the scientifically honest message. It was not a communicatively useful one.

I closed the PDF and reread the newspaper summary. At least it had the decency to be wrong in a way I could follow.

Edward Tufte, in his 1983 book The Visual Display of Quantitative Information, made a case that has become a kind of commandment in data visualization circles: “Above all else show the data.” The principle is simple, almost tautological. A figure exists to make data visible. If it obscures, decorates, or distracts, it has failed at its only job.

Tufte is the canonical reference here, but a more interesting one might be W.E.B. Du Bois, who in 1900 created sixty-three hand-drawn data visualizations for the Paris Exposition. His charts depicted the economic and social conditions of Black Americans, and they were gorgeous: spiraling color wheels, stacked area charts with hand-lettered labels, bar graphs arranged with the care of a painting. Du Bois understood that a figure is an argument. Its visual form is part of what it means. His charts didn’t just present data about inequality; they demanded that viewers look at it, carefully, by making the looking itself a considered experience. They won a gold medal.

By that standard, most scientific figures today are failing. Not through malice or incompetence, but through inattention. The people making them are researchers, trained to design experiments, not graphics. They reach for whatever plotting software their lab uses, accept its defaults, and move on. The figure is the last thing finished before submission, assembled sometime around midnight on the day of the deadline. Nobody stops to ask whether a reader will be able to parse it.

I understand this. I know about deadlines and the triage of priorities. When you’re trying to get a manuscript through peer review, the prose matters, the statistics matter, the figures are whatever the code produced. Reviewers rarely comment on visual quality unless something is egregiously misleading. The incentive structure rewards publishable data, not communicable data. These are different things, and the distance between them is where public understanding of science goes to quietly erode.

My friend Sarah teaches introductory biology at a university in the Midwest. A few years ago she started an experiment of her own: she assigned Tufte’s book as required reading before students could begin their research posters.

Most students had never thought about a figure as something you design. A figure was something you generated. You ran the analysis, the software produced a chart, you put the chart on the poster. Design implied intention, and there had been no intention at any point in the process.

After the Tufte unit, she told me, three things happened. Students stopped using 3D effects. They stopped using rainbow color maps. And they started arguing with each other about what the figure was supposed to show: not what data it should include, but what message it should deliver. She said that last part surprised her. She’d expected better aesthetics. She hadn’t expected a shift in how students thought about communication itself.

“Nobody taught them to make figures,” she said. “They learned by copying their advisors, who learned by copying their advisors. The chain goes back to whenever Excel became the default and nobody questioned it.”

I asked whether the posters actually improved.

“The bad ones got less bad. The good ones got much better. A few of them looked like they were designed by someone who’d thought about the person standing three feet away, trying to learn something.”

I asked if it lasted. Whether students carried the habit into their graduate work or reverted to lab defaults the moment a thesis advisor handed them a different plotting library.

She paused. “Some revert,” she said. “The pressure to match what everyone else in the lab does is strong. But a few of them keep fighting the defaults. Those are the ones whose papers I can actually read.”

That phrase stuck with me: “thought about the person standing three feet away.” It gets at the thing most scientific figures are missing. Not skill, not software, not time (though all three help). What’s missing is a reader.

I started paying closer attention after that conversation. I’d open papers the way I always had, figures first, and try to articulate what was going wrong when a figure failed to communicate. Usually it wasn’t one catastrophic choice. It was an accumulation of small neglects. A color palette chosen because it was the software default, not because it mapped onto the data’s structure. Axis labels abbreviated into acronyms that meant something to the authors and nothing to anyone else. Legends that required cross-referencing with a table buried three pages later. Each choice individually forgivable. Together, they built a wall.

The worst part was that the underlying findings were often interesting. I could tell, from the surrounding text, that the authors had discovered something worth knowing. But the figures, the part of the paper that should have carried the finding most efficiently to the widest audience, had been treated as an afterthought. The ideas were good. The transmission was broken.

A few months after that conversation with Sarah, I was trying to recreate one of those broken figures for my newsletter. I wanted to show readers what the sea-level projections actually looked like when you stripped away the clutter. I’m not a designer. I can use a spreadsheet. The results were predictably mediocre.

While searching for alternatives, I stumbled onto tools for generating scientific figures from text descriptions. The concept was simple: instead of building a chart element by element, you describe what you want it to communicate, and the tool handles the visual choices. I tried one called FigureGPT on the sea-level data.

The output wasn’t art. But it was clear. Two distinct trajectories, labeled in readable type, with uncertainty shown as a shaded region instead of spaghetti lines. I hadn’t specified any of those design decisions. I’d just described the point I wanted the figure to make. The tool had produced something that a non-specialist could parse in ten seconds. My hand-built spreadsheet version took thirty and still left ambiguity.

What struck me about this wasn’t the technology itself. It was what the technology revealed about the problem. The original paper’s authors had better data than I did, better software, better training. The difference was that they’d built their figure for the analysis pipeline. I’d described mine for a reader. And the tool, by asking me to articulate the point before it could produce anything, had forced exactly the moment of intention that Sarah’s students learned in her Tufte unit: what is this figure supposed to tell someone?

I don’t think automation replaces the kind of deep visual thinking Du Bois brought to his data portraits. That was artistry married to argument. But for the vast majority of scientific figures, the ones assembled at midnight by someone who needs something in the manuscript by morning, a higher default floor matters. It’s the difference between a figure that was designed and one that was merely produced.

There is a strange irony at the center of all this. Our visualization tools have gotten extraordinarily powerful. Our figures have gotten worse. It’s a version of what Michael Lind once called technological regress: a system that advances in capability while declining in the quality of its outputs, because nobody is paying attention to the part that faces the user. The tools can do almost anything. The humans operating them are asking for almost nothing.

Science is under political and cultural pressure it hasn’t faced in decades. I don’t think better figures will fix that. The roots of distrust are structural, economic and political and tribal, and no amount of good typography will paper over them. But I do think that scientists control more of the communication pipeline than they exercise. Every time a curious non-specialist, a teacher, a journalist, someone like me with a downloaded PDF, encounters a figure that was made for no one, something small is lost.

I don’t mean this romantically. I mean it practically. My friend Dana works in public health communications for a state agency. Her job, as she describes it, is “translating papers into things a legislator can read at breakfast.” Last year she received a report on childhood nutrition outcomes in rural counties. Seventeen figures. Every one technically correct. She sat with them for an afternoon and could not construct a coherent briefing.

The problem wasn’t the science, she told me. The problem was that the figures had been designed for peer review, not for a committee staffer with eleven minutes before a hearing. She spent three days redrawing them, simplifying color schemes, relabeling axes in plain language, cutting six figures entirely because they duplicated information already in the other eleven.

“They’re not making figures for us,” she said. “They’re making them for the journal. We just happen to need the same information.”

When a figure is clear, a reader who isn’t an expert can at least locate the claim. They can see the trend, the comparison, the uncertainty. They may not follow every detail, but they’ve been given an honest entry point. When a figure is opaque, not because the science is complex, but because nobody thought about the person looking at it, the entry point doesn’t exist.

I went back to the sea-level paper a few weeks later. Someone had linked me to a blog post by a science communicator who’d redrawn Figure 3 using the same data. Two colors instead of six. Confidence intervals as a shaded band rather than overlapping spaghetti lines. Axes labeled in words a person might use in conversation.

The finding came through immediately: under high-emission scenarios, sea-level rise by 2100 was projected at roughly double the moderate-action trajectory. Uncertainty was real but the direction was not in doubt.

I had the same data the whole time. The numbers were there in the original paper. But I couldn’t reach them, because the original figure hadn’t been made for anyone. It was output, not communication. Somewhere in the chain from analysis to publication, nobody had paused to consider who might be on the other end, trying to understand.

Maybe that sounds like a small thing. One person failing to read one figure in one paper about a topic she could have learned about from a dozen other sources. But I think about Dana spending three days reworking seventeen charts. I think about Sarah telling her students that a figure is not a puzzle. I think about Du Bois, painting data by hand because he understood that how you show something is part of what it means.

Mostly I think about the next paper I’ll download, and whether the figures will let me in. I’m not optimistic. But I keep opening the PDFs.

04/05/2013
