Big Data Systems for a Small, Qualitative World: Introduction

Elijah Meshnick
3 min read · Jan 21, 2022
Abstract blue and purple design that looks a little like a circuit and a little like a ball of electricity.
Design from Software Engineering Institute at CMU

There is so much to be said about why data is unlike us. It reduces us to a number. That cliche has weight. It gestures at why data often feels like a bad word. It’s sterile and intimidating, unapproachable unless you have a relevant degree. But I’ve been wondering how these concepts can be transformed, and if there’s anything about data systems that I can actually relate to on a holistic, human level.

Maybe it’s best to start out with a common definition. When I teach coding to high school students, I always start out with vocabulary. Learning is a lot easier in a shared language.

Data refers to quantities, measures, and numbers that describe. It is a snapshot that can be compared and categorized, used as evidence. Based on units that help build credible assumptions. A rhetorical device for those with the tools to collect it.

Data deserves the bad reputation that it has in some circles. It’s often taken without our consent, and it is frequently used against us. It raises issues of privacy and control. Data almost always comes from an authority, as a census, a system of surveillance.

But we can imagine data as a process rather than an object, because data is meant to be processed. Databases contain numbers but they also contain images. They store words, and in doing so, they store messages, and in doing so, they store love letters. Data is nothing unless it is kept, revisited, and interpreted, and in that way, data is a lot like memory.

Data is just one-dimensional information, “points.” But you can draw a line between any two points, and a picture is made up of lines, just like stories are made up of information. Data is defined by rules, but so are games. And what if the rules of data, of storing it, sharing it, and shaping it, can be used to better understand our more human systems of knowledge?

Reflecting on knowledge is a study of its own, named “epistemology.” That’s a word I’ve used a handful of times, so that people think I’m smart, and each time (including this one) I’ve looked up the definition immediately to make sure I used it correctly. It’s a word that’s too long and irrelevant to perfectly remember. It rolls off the tongue, flowy and pretentious. It’s nothing like “data,” curt and businesslike.

I don’t have enough patience or interest to study epistemology, but I’m constantly studying systems of data as a software engineer. Maybe there’s an overlap between data engineering and the types of intimate knowledge we use but rarely describe, and if there isn’t, I’ll make one. After all, you can draw a line between any two points.

This is the first of a series of reflections as I read Designing Data-Intensive Applications by Martin Kleppmann. These reflections are meant to be accessible to people who know nothing about data OR applications.
