A Funny History of Data Collection
A couple of weeks ago, I was lucky enough to present to the folks at Legal and General about the importance of data collection, and of making sure you have strong data capture governance and processes. Now, as a data-collection geek, I find the topic very interesting and nuanced, but I was concerned it might come across as slightly dry to the audience. To make it more captivating to those uninitiated in the ways of client- and server-side data collection, composable CDP architecture or identity resolution, I thought it would be important to look at the history of data collection, what it taught us, and how this has led us to where we are today. The result was this funny history of data collection.
The Palaeolithic Era: The Writing’s on the Wall
Data Collection Method: Cave paintings, flint tools, charcoal.
Purpose: Keeping track of prey animals, invention of written communication, worship of ancestors.
The first age of data collection was happening all the way back in caveman times. Hunters put diagrams on the walls to recount their exploits to others. Now this was great at the time; people were capturing data where before there was nothing. But when you look back there were a number of challenges. The data wasn’t governed (Is that a man or a woman? Or is it a hyena, dog or lion??). It was also stuck in one place, and couldn’t be replicated easily!
The Metal Ages (Bronze, Iron, etc.): The Birth of Data
Data Collection Method: Papyrus, clay tablets, carved hieroglyphs, occasional henge (stone or wood options).
Purpose: Taxation, worship of gods, military campaign management, early timekeeping instruments.
The second age was in ancient Egyptian times. Things had made a huge leap forwards. Scribes recorded taxes, crop yields, military records and health data on tablets (stone ones, not iPad or Surface) and papyrus. The fact that this data still exists today and gives real insights into the time is testimony to the governance of the data collected. But the risk of data loss due to the fragility of the media it was captured on highlights the impact of technological advancement on data storage.
The Middle Ages: The Dark (Data) Age
Data Collection Method: Parchments, scrolls, and monks.
Purpose: Tax collection, keeping track of kings, and occasionally a plague or two.
In medieval times, monks did most of the data collection, copying everything by hand. Some poor guy in a monastery had to count how many pigs were in the village and write it down. And he had to hope nobody spilled ink all over the parchment. “Big Data” was probably the size of a slightly larger-than-usual scroll. It wasn’t very scalable – only the monks and a few others could read. But it did keep track of historical events and taxes in an organised way (like the Domesday Book or the Anglo-Saxon Chronicle), so it did mean data had a clear value.
And in Arabia, the age wasn’t so dark, and they did something quite important there for the future digital age. Scholars there took up the concept of zero, developed earlier in India, and passed it on to the rest of the world!
The Renaissance: Data Gets Fancy
Data Collection Method: Early census data, star charts, Renaissance art catalogues, printing presses.
Purpose: Counting people, studying the stars, and making sure Michelangelo’s expenses were well-documented.
Ah, the Renaissance! A time when people started getting serious about counting things, like the number of people in Florence who owned shoes. Data collection was still slow, though. Just imagine Leonardo da Vinci tallying the results of a survey, only for someone to ask him to paint the Mona Lisa again. However, the printing press did change the reach of data visualisation and literacy; suddenly, your data dissemination could go through the roof. Although hopefully not the one on the Sistine Chapel – Michelangelo would go nuts!
The Industrial Revolution: Steam-Powered Data
Data Collection Method: Punch cards, early census systems, distributed data collection collated through telegraph machines.
Purpose: Keeping track of industrial production and census records without causing factory explosions; producing weather reports; changing perceptions and decisions based on data.
With the rise of machines, humans finally realized they could automate data collection! Enter punch cards: little pieces of paper that helped keep track of everything from census results to factory outputs. Of course, if one punch card got lost or eaten by the machine, it was back to square one. But hey, at least nobody had to chisel data into stone anymore; they needed the chisels for digging out coal! We did see innovation in data in other areas. FitzRoy’s use of repeatable and precise weather stations, with dedicated people then cabling the results into his office, allowed the first weather forecasts. And his contemporary Florence Nightingale devised a chart to show how cleaning hospitals saved lives, which showed how data could influence public health policy.
The 20th Century: Data Goes Digital (Kinda)
Data Collection Method: Mainframes, spreadsheets, and punch cards (still).
Purpose: Tallying population stats, stock prices, and how many people wore bell-bottoms.
As computers started to gain traction, the idea of collecting data digitally sounded like a dream! But early systems were as friendly as a grumpy librarian. Want to run a report? That’ll take two hours, minimum. Want to sort the data? Well, buckle up, because your spreadsheet is about to crash. But still, progress was made, and companies could finally gather more data than ever before… though understanding it was a whole other problem.
The Internet Era: Data Overload
Data Collection Method: Websites, social media, and cookies (not the delicious kind).
Purpose: Tracking what you’re clicking, buying, eating, thinking, and feeling.
Once the internet took off, so did data collection. Companies began tracking everything you did online, whether you were shopping for socks or binge-watching cat videos, and suddenly your browsing history became the most valuable commodity on Earth. The bad news? Half the world got addicted to cute dog videos. The good news? Someone figured out how to recommend the perfect dog food ad, right after you Googled “puppy care.”
The Big Data Era: Too Much of a Good Thing?
Data Collection Method: Sensors, social media, machine learning, and apps that know what you had for breakfast.
Purpose: Predicting what you’ll buy next, optimizing business decisions, and figuring out how to monetize your steps.
With the rise of Big Data, companies went from not having enough information to drowning in it. Businesses could now predict what colour socks you’d buy in five years, and how many steps you’d take in the meantime. But with all that data came the challenge: “How do we use this without getting lost in it?” Answer: hire a data scientist, and give them a shiny title, like “Chief Data Wrangler.”
The AI and IoT Age: The Data Explosion
Data Collection Method: Everything. Literally everything.
Purpose: Knowing you better than you know yourself.
Today, data collection has reached a level where it feels like your fridge might soon ask you for a customer satisfaction survey. Your phone tracks your steps, your car tracks your driving habits, and your smartwatch knows your heart rate better than your doctor. Companies collect so much data, you’d think they were preparing for a “Data Olympics.” The question now isn’t “How do we get data?” but “How do we use all this data without causing the Matrix to happen?”
The Future: The Data Singularity
Data Collection Method: Probably brain waves, or something involving quantum computers.
Purpose: Not letting the robots take over (hopefully).
In the future, data collection will likely get so advanced that your toaster will know you better than your family. Companies will have so much data on you that predicting your every move will seem less like marketing and more like mind-reading. But don’t worry—there’ll always be room for cute cat videos and endless debates on how to best organize your music playlists.
And that’s the funny history of data collection! From cave paintings to quantum computing, humans have always found a way to collect, lose, and analyze data (sometimes all at once). Here’s to the next chapter of chasing those elusive data points, hopefully with fewer punch cards and more humour!