Denise McKenzie is the community and ethics partner at PLACE, a mapping data institution that acts as a trusted intermediary between public and private entities. She has over 20 years of experience with the global geospatial community, in domains including smart cities, agriculture, defense, and insurance. Previously, she co-directed the Benchmark Initiative with Ordnance Survey.
This is an extended cut of the interview from AI from Above that has been edited for ease of reading.
Why is it important to apply an ethical lens to location and map data?
Maps are how we make sense of the world; they are an incredible tool for understanding why we as humans do what we do. Increasingly, we’ve got amazing algorithms that scan over satellite imagery and aerial photography and pick out features. If you haven’t trained them properly, you may pick out some features but miss others. A good example of this is the automated creation of a data type called “land cover”. I was just talking to someone in South Africa about this the other day. We know there are people living in an area of grassland, but that doesn’t appear in the land cover layer. The danger is that those humans ‘don’t exist’ when someone looks at the data to make a decision. So there’s a real risk in mapping and in the derived products you create. If you haven’t trained your machine learning well enough, you’re basically going to rub humans out of existence, or rub types of trees out of existence, or take animals out of existence, because you haven’t detected them with your artificial intelligence.
What should AI researchers consider before they use location datasets?
There’s this really fantastic phrase that gets used in the geospatial world. It’s whether data is “fit for purpose”. Data is hard. In this field, you almost never get the perfect dataset for exactly what you need. So we often grab data and try to mould it. You always hear people go, ‘Oh I had to clean the data, I had to re-structure it. I had to do all this work to make it do what I needed it to.’ What I would say to AI professionals is that you have to think carefully about how much manipulation you do before you run your algorithms over a dataset. Have you thought about what the inherent bias might be in the data?
I literally just had a conversation like this. I was talking to a solution provider that pulls a lot of data together and passes it on to the finance industry. And I asked about their confidence in the data and its completeness in different geographies. And I was shocked that their answer was, ‘Oh, that’s not my problem’. I think we all have a responsibility to understand the data that we’re working with, and how it’s going to impact the outputs of AI and machine learning.
You can have either too little or too much data, can’t you?
What we often need is for decision makers who are looking after a place to have data that shows that a population exists and explains the needs of that community. In the geospatial community, “collect once and reuse many times” is often how we think about data.
We know that roughly half of the globe’s population doesn’t have internet access, yet we collect things like mobile phone data and declare that it’s representative of a population when we know that perhaps a third of people don’t carry phones. So I think when it comes to some particularly vulnerable populations that are not digitally connected, we need to find the right ways to make sure they’re in the data and to ensure that we’re designing data collection that is ethical, appropriate, and accessible.
How does PLACE strike the right balance?
What we do is collect really high-resolution, high-quality aerial imagery of urban environments that can be used by governments for decision-making. They get ownership of the data that is about their nation and their cities. Then we build a responsible data user community around that data that allows this one single dataset to be reused in multiple different ways.
The first pilot we completed is in Côte d’Ivoire. There are incredible images within the datasets that can show things like the air conditioners on top of buildings and how big they are. And if you can teach your AI to identify them, you can actually calculate, for a particular area of the city, the size and capacity of all the air conditioning units, and from that estimate the energy draw.
If you’re a city or an energy provider within that space, it allows you to start predictively thinking, “Where do I actually need to improve infrastructure?”
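To make that arithmetic concrete, here is a minimal sketch of the kind of downstream analysis described above. It assumes hypothetical detections of rooftop air-conditioning units with estimated footprints from imagery; the per-unit power figures and district names are illustrative placeholders, not anything from PLACE’s actual pipeline.

```python
# Rough sketch (not PLACE's pipeline): given hypothetical AI detections of rooftop
# air-conditioning units, aggregate them per district and estimate total energy draw.
# Footprint-to-power mapping below is an illustrative assumption, not calibrated data.
from dataclasses import dataclass

@dataclass
class DetectedUnit:
    district: str        # city district the rooftop falls in
    footprint_m2: float  # estimated unit footprint measured from imagery

def estimated_kw(footprint_m2: float) -> float:
    """Map a detected footprint to an assumed power draw in kW (illustrative only)."""
    if footprint_m2 < 1.0:
        return 1.5    # small residential split unit
    if footprint_m2 < 4.0:
        return 5.0    # mid-size commercial unit
    return 20.0       # large rooftop chiller

def district_energy_draw(units: list[DetectedUnit]) -> dict[str, float]:
    """Sum the assumed power draw (kW) of all detected units in each district."""
    totals: dict[str, float] = {}
    for unit in units:
        totals[unit.district] = totals.get(unit.district, 0.0) + estimated_kw(unit.footprint_m2)
    return totals

# Example with made-up detections:
detections = [
    DetectedUnit("Plateau", 0.8),
    DetectedUnit("Plateau", 5.2),
    DetectedUnit("Cocody", 2.5),
]
print(district_energy_draw(detections))  # {'Plateau': 21.5, 'Cocody': 5.0}
```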
How do you build trust with communities at PLACE?
With trust, it’s very much about transparency and the communication that you give. The PLACE model isn’t so much about co-owning with communities as about putting agency and sovereignty over data into the hands of the government — putting data back into a nation’s set of assets to enhance its ability to make the right decisions. We will build a relationship with a national mapping agency within a country and get recommendations for at least two local organizations that can actually do the flying with drones or planes.
One of the questions I get is, “But you’re not open data, you know, you’re not open by default?” in this almost pious way. And I challenge that to say, actually we are open. We’re as open as we can be, and as closed as we need to be. The type of data that we’re collecting is incredibly detailed, even if it’s not personally identifiable. It’s looking at homes, at schools, at shopping centers, at roads where people go. If you were to just make that freely available to anybody, then it’s just as open to people who might do harm with it. When we give the data to the government, it is anonymized. They then “gift” us a license and a copy of the data that we put in a legal trust, which PLACE Community members pay for access to. And that’s how we create a sustainable funding model, so we can continue to fly new imagery again the following year and start the process all over. We create a sustainable cycle of supply and demand this way.
In the PLACE Community we expect members to adhere to a code of conduct. We look to organizations we can trust that actually have the best interests of vulnerable populations at heart. This is incredibly detailed, powerful information that the world needs — that governments need — to understand their populations. But equally a terrorist could be looking for the same type of data to work out where to locate a bomb. So we want to know the people who use our data, because we want to understand that they’re using it for the right reasons. That’s stewardship, for us. We take it quite seriously and we foster a community approach around the data.
You helped write the Locus Charter, a document that calls for ethical practices for location data use. Who is it for?
I would say it’s for a lot of different location data practitioners to look at their practices and say, ‘How do we measure up against these principles? Can we do better at how we build trust with the public and with other organizations through what we do?’ The Locus Charter came out of two pieces of work: the EthicalGEO program, working with the American Geographical Society in the United States, and the Benchmark Initiative with Ordnance Survey in the United Kingdom.
During the pandemic, we saw location data challenges in how global dashboards were showing the spread of the virus across the world. There were questions about the quality of data, where data was coming from, and whether it was equitable data. We talked to a huge variety of technical professionals, and found that they wanted to be doing the right thing, but didn’t really have a yardstick to measure against. So we said, ‘Well, why don’t we have a go at writing those principles?’
Portrait photo of Denise McKenzie is by Hannah Yoon (CC-BY) 2022
Mozilla has taken reasonable steps to ensure the accuracy of the statements made during the interview, but the words and opinions presented here are ascribed entirely to the interviewee.