Data that plays nice with others

Photo by justgrimes.

The Ontario Trillium Foundation is an unusual organization: it’s both a provincial government agency and one of Canada’s largest grantmakers. Its approach to open data is distinctive, too.

The OTF started publishing grant data in August 2015, in an open and machine-readable format. The foundation has taken on a leadership role in this area — including becoming the first foundation in Canada to introduce a completely paperless granting system — and we’re proud to be helping them develop an open data plan.

Last week the OTF released another batch of granting data. It includes grants made from the Youth Opportunities Fund, along with the latest additions to the 15 years of granting data previously released. There are more datasets to come, but we wanted to use this example to highlight a best practice the OTF is using, one that is relevant for anyone publishing granting data: shared unique identifiers.

Why be unique?

Grant recipients in the OTF data are identified with either an incorporation or charitable registration number. These are “unique” identifiers because there’s no way for them to overlap: two organizations can’t share the same number.

At the most basic level, this is a best practice because it eliminates ambiguity that comes from only providing an organization’s name. A grantmaker may know which CPSA they’ve supported, but to an outsider looking at the data, the CPSA could be the Canadian Professional Sales Association, the Canadian Political Science Association, the College of Physicians and Surgeons of Alberta, or perhaps even the Clay Pigeon Shooting Association. We probably want to distinguish between clay pigeons and salespeople — at least when it comes to data analysis.

Along with resolving these issues, using shared unique identifiers can unlock new value. An organization’s charitable or business number is a shared identifier that is already being used in a variety of contexts. Including it in a dataset makes it easier to augment it with other sources, such as the Canada Revenue Agency’s records, or data from other funders that decide to follow the OTF’s lead in the future.
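To make that concrete, here is a minimal sketch in Python of what augmenting a grants dataset might look like. Everything in it is an assumption made for illustration: the field names (`org_id`, `legal_name`), the registration numbers, and the dollar amounts are invented placeholders, not real OTF or Canada Revenue Agency records.

```python
# Hypothetical grant records. Both recipients go by the same acronym,
# so the name alone cannot tell them apart. All numbers are invented.
grants = [
    {"recipient": "CPSA", "org_id": "100000001RR0001", "amount": 25000},
    {"recipient": "CPSA", "org_id": "100000002RR0001", "amount": 40000},
]

# Hypothetical second source, keyed by the same shared identifier.
registry = {
    "100000001RR0001": {"legal_name": "Canadian Political Science Association"},
    "100000002RR0001": {"legal_name": "Canadian Professional Sales Association"},
}


def augment(grants, registry):
    """Attach registry fields to each grant by matching on the shared identifier."""
    merged = []
    for grant in grants:
        # Grants with no match in the registry are kept unchanged.
        extra = registry.get(grant["org_id"], {})
        merged.append({**grant, **extra})
    return merged


augmented = augment(grants, registry)
for row in augmented:
    print(row["org_id"], row["legal_name"], row["amount"])
```

Because the match is made on the identifier rather than the name, the two CPSAs resolve to different organizations, and any other dataset that carries the same number could be joined in the same way.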

Much like using data standards, providing shared unique identifiers is forward-thinking: when done correctly, it opens the door to all sorts of collaboration. And when done incorrectly — well, that’s a story for another blog post.