A suggestion about data: generate it now, and figure out curation later.

Example: Labeling Tickets

Here’s an example from my work. When I write software engineering tickets, I label them; if I’m writing a ticket that improves patching, I label it “security.”

In Jira, those labels are links. Click “security” and you’ll see dozens of tickets that we’ve written to improve our security posture. And so, when a stakeholder asks, “What are we doing about security?”, we have an answer. All because we continually applied one little label.

What About Curation?

What about curation? Selecting the right labels, curating them, labeling legacy work, etc.? How do I manage the system? Am I just generating data? Yes, mostly.

I don’t think long about the labels that I pick. The label “security” can cover everything from protecting against attacks to encrypting data.

I don’t curate the labels. Both “infosec” and “security” can exist, and a ticket can have both.

I don’t label legacy tickets, as a rule. They’re in the past.

Don’t overthink this! Curation is a tomorrow problem.
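
To make "tomorrow problem" concrete, here's a minimal Python sketch. It isn't how Jira works under the hood; the ticket keys, the `ALIASES` map, and the `index_by_label` helper are all hypothetical names I made up for illustration. The point is only that messy labels applied today can be folded together later with a few lines, without touching the tickets themselves.

```python
from collections import defaultdict

# Hypothetical tickets; the keys and labels are invented for illustration.
tickets = [
    {"key": "OPS-1", "labels": ["security"]},
    {"key": "OPS-2", "labels": ["infosec", "security"]},
    {"key": "OPS-3", "labels": ["infosec"]},
    {"key": "OPS-4", "labels": []},  # legacy ticket, never labeled
]

# "Tomorrow's" curation: a small alias map, written only once it is needed.
ALIASES = {"infosec": "security"}


def index_by_label(tickets, aliases=None):
    """Group ticket keys by label, optionally folding aliases together."""
    aliases = aliases or {}
    index = defaultdict(set)
    for ticket in tickets:
        for label in ticket["labels"]:
            # Map a label to its canonical name if one exists, else keep it.
            index[aliases.get(label, label)].add(ticket["key"])
    return {label: sorted(keys) for label, keys in index.items()}


raw = index_by_label(tickets)               # labels exactly as they were typed
curated = index_by_label(tickets, ALIASES)  # synonyms merged after the fact
```

Here `raw` keeps “infosec” and “security” as separate groups, while `curated` merges them into one. The unlabeled legacy ticket simply doesn't show up in either, which matches how I treat legacy work.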

Example: Knowledge Bases

Another example: team knowledge bases. You’ll often see teams nitpicking the organization of their internal knowledge bases. Every document must be in the right place, have an owner, and be refined and maintained.

In my experience, great knowledge bases are overloaded with information. Employees contribute now and sort it out later. Until you have data, organization is premature.

Example: This Blog

Last example: this blog. I’ve tagged this blog post “data”. In the platform I’m using, Hugo, tags are a taxonomy, so this generates a page that lists every post with that tag.

Is there more than one “data” post there? Yes. Also, I don’t care! I’m just generating that data. It’s my default. I may use it later.

Caveat & Conclusion

I’d like to thank Jomkit Jujaroen, who helped me surface an important caveat to this advice: it might depend on your personality. I tend to want a perfect system before I start using it, so much so that I often don’t use it at all. This approach helps me put down that perfectionism. It might not be the right one for every person or team.

How do you approach data generation and curation? I’d love to compare notes.