We ask Lauren Etter from Boston University about how efforts to build trust and ensure equitable access to healthcare led to the development of an artificial intelligence system that uses images of ears to identify patients in remote communities.
This Pulsar podcast is brought to you by #MOSatHome. We ask questions submitted by listeners, so if you have a question you'd like us to ask an expert, send it to us at email@example.com.
For more information on Project SEARCH, visit their research page here.
ERIC: Over the past few months we've received quite a number of questions at the Museum of Science, Boston about technologies like artificial intelligence and pattern recognition, with people especially interested in what real-life applications these technologies have.
Today on Pulsar we're highlighting a new project that aims to create a more versatile way to identify a patient by taking a quick snapshot of their ear.
My guest today is Lauren Etter, the coordinator for Project SEARCH, Scanning Ears for Child Health. Lauren, thanks so much for joining me today.
LAUREN: Thanks for having me. I'm excited to be here.
ERIC: So your project hopes to create a brand new way for doctors to identify their patients. So let's start at the beginning with why having a system like that is so important.
LAUREN: It's a great question, and one that I've spent a lot of time digging into. Unique identification-- I like to think of it as, if you're a patient walking into a clinic, and you've been there a number of times before, and you are not matched with a record that is currently in the system and they have to create a new record for you, you can imagine that that new record is going to be missing a lot of information about you. Things like notes from previous visits, allergies that you might have that you forget to tell them for creating this new record-- a number of different things can be missing from that new record.
So that's one of the most obvious reasons why having a nice, accurate patient identification system is important, is to link you to a record that is already in the system that already has your information on file. And there are a lot of implications of these what we call "duplicate records," these kind of shell of a record that has a lot of incomplete information there. And a lot of it comes down to quality of care and then can also aggregate into reporting effects downstream and cost effects downstream. It's an important field of study, I think.
ERIC: So important no matter what you're doing, but when you're talking about medicine and people's health, it's really important. Things like allergies, if you miss that, that can have really drastic consequences.
LAUREN: Exactly, exactly. And to put this into perspective, there was a recent audit of a medical center in Texas that was done a couple of years ago. And they found that 22% of all records in their system were duplicate records. And it was this massive cleaning effort to reconcile records where they could. And they broke down the costs, which was very interesting. So it was at $96 on average per duplicate record found in terms of administration costs to clean up the records, and then $1,100 per patient in terms of duplicate costs, which is just shocking.
ERIC: So there are a lot of unique things about each human being, like fingerprints, the first thing that kind of comes to mind. So why did your team choose an ID system based on ears?
LAUREN: Yeah. So when you say biometric, the first things that come to mind are fingerprints, iris scanning, facial recognition. And if it is on your list at all, way down at the bottom is going to be ears. We took a look at ears mainly because our project is in the context of tracking vaccination records for infants. And we found that these more traditional biometrics have a very hard time accurately identifying infants at a very young age, because infants are just growing too fast.
So in the literature, fingerprinting and iris scanning and facial recognition have very poor recognition rates when you're talking about anyone from zero to five years of age. What is not documented well is how a picture of the ear performs on those younger populations. For this project we wanted to dig into how the ear grows in the first several months of life, which is not documented currently, and then also whether it can be a reliable identifier in this younger population.
The other thing to mention too is that the ear is on the side of your head. You're not really paying attention to it. It's very impersonal. Whereas if you're talking about facial recognition or an iris scan, it's very personal. And fingerprinting is often used in a criminal justice context, so there's a lot of historical context between these other biometrics that have been used previously.
ERIC: Now, I would say that it would be nearly impossible for a person to recognize another person's ear out of a database of thousands of ears. What kind of technology is involved in trying to make this identification?
LAUREN: Where our project is right now-- for identification we're using a more simplistic algorithm, which is a pattern recognition algorithm. Ours is called SIFT, which stands for Scale-Invariant Feature Transform. It's a fancy way of saying that it takes a 2D photograph and picks up high contrast points on that photograph and stores those points and the relationship of those points in a vector. And that vector then becomes the unique identifier for that person.
ERIC: So how would this work in practice?
LAUREN: You can imagine a patient would show up to the clinic. They've already been registered in the system. They have a vector in the database. They're coming back for a return visit so they get a picture of their ear taken again, which has a new vector. And then that vector is compared to all of the other vectors in the database. And the vectors with the smallest distance are going to be your strongest match. So the way that we're currently running it is we display a top 10 list of matches based on the smallest distances that were found in the database. And if it's an accurate system you're going to have the patient show up in the top 10 list more often than not.
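The matching step Lauren describes (compare the new vector to every stored vector, then show the closest ones) can be sketched in a few lines. This is only an illustration under stated assumptions, not Project SEARCH's code: the Euclidean distance, the three-number vectors, the patient IDs, and the function names `euclidean` and `top_matches` are all hypothetical, and real SIFT output is far larger.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two equal-length vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_matches(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to k (patient_id, distance) pairs, smallest distance first."""
    scored = [(pid, euclidean(query, vec)) for pid, vec in database.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Hypothetical registered patients, each with a stored vector.
database = {
    "patient_a": [0.0, 0.0, 0.0],
    "patient_b": [1.0, 1.0, 1.0],
    "patient_c": [5.0, 5.0, 5.0],
}

# Vector from the photo taken at the return visit (made-up numbers).
new_photo_vector = [0.9, 1.1, 1.0]

shortlist = top_matches(new_photo_vector, database, k=2)
for patient_id, distance in shortlist:
    print(patient_id, round(distance, 3))
```

Showing a ranked shortlist rather than a single answer leaves the final choice to the clinic worker, which matches the top 10 list described above.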
ERIC: And so what it does sound like, I mean, is fingerprints, where you're looking for specific points and the distance between them, sometimes, to uniquely identify someone. And when I first thought about it I was impressed that you could do that with an ear. But ears are complicated. Ears have that same structure and patterns, and it makes sense that a computer would be able to tell them apart, right?
LAUREN: Yeah, yeah, exactly. And we like to say, too, if you have family members around, even just to look at their ears. And even between family members, between people who are related genetically, your ear is totally different. It's quite fascinating. The ear is very unique in that way.
ERIC: Yeah, I can see that being true for sure. Now, how good does a picture have to be for this pattern recognition to work? What level of input can the algorithm be successful with?
LAUREN: Yeah. So this is a great question. So before SIFT does its magic we have steps that fall into the category of preprocessing. And this is getting at image quality. So we actually found that resizing the image, reducing the size of the image, actually reduces the amount of information that SIFT has to use to make its match and to make its vector.
The image resizing step is a very important step to increase our accuracy. You don't need a high resolution photograph going into the algorithm itself. And this is something that we paid attention to especially because our project is based in a clinic in Zambia where access is not the same as it is here in the United States. And the internet connectivity is also something that we've dealt with there as well. So we've tried to create the most robust system that we can and SIFT-- it doesn't take much to get a match from SIFT.
ERIC: And so looking at the bigger picture, this project is not just about improving one system, but using technology to provide more equitable access to health care.
LAUREN: Exactly. We're dealing specifically from zero to five years of age, what's called the under five care system in Zambia. And currently they're using a paper card to identify the patient. They're currently trying to bring all of that information and that data centralized into an electronic registry at the ministry level. But what they're running into is the accuracy of identifying based on name or date of birth, they're having a lot of issues. So they're currently still using that paper card to identify the patient, which often, you can imagine, gets lost, damaged, is unreadable at times. Our system is meant to be plugged into their electronic registry and be a way to identify a patient that way.
ERIC: Now, earlier you mentioned facial recognition, which is a similar technology that we get a lot of questions about relating to things like ethics and fairness and bias. So what do these concerns look like in practice for a pattern recognition technique like this?
LAUREN: Yeah, and that's a great point. I'm so glad that you brought that up. And I want to call attention to, in terms of facial recognition-- I know it's been a hot topic in recent times. But there is this great paper that came out of MIT Media Lab called Gender Shades. And they take a look at these different gender classifiers that use facial recognition. And they test that on different groups of people. And they've grouped people based on skin tone and then sex.
They found that darker-skinned females had higher error rates across the board. They took a look at Microsoft, IBM, and Face++ as these facial recognition platforms. And darker-skinned females had higher error rates that ranged between 20.8% and 34.7% across those three platforms. And in contrast to the lighter-skinned males, the error rates there were between 0% and 0.8%. So there's this stark contrast, right, in terms of how these different platforms are trained and what data they are using to train these systems on.
ERIC: So even something like not having a range of skin colors in your initial set of data can end up causing a bias in the technology itself.
LAUREN: Exactly. And I think as individuals, as scientists, as society as a whole, this is something that we really need to be aware of. And it's going to be there. I mean, there's bias in any algorithm that you train. And the goal is to mitigate the bias in your training data sets. And that's something that we've run into as well as a project. Our main effects paper just came out in Gates Open Research. And we go through our process-- it's been very iterative-- of taking three different benchmarking data sets and testing our algorithm on all three of those.
And one of them is actually the data set that was collected at the Museum of Science over a period of about a year that had many different variables. And we found, compared to-- we had an initial Boston University data set. We found our accuracy dropped by about 30% just based on the differences in the data set characteristics there. So it's definitely something that we need to be paying attention to and being aware of as we create these data sets to train these algorithms on.
ERIC: So the earlier you can identify the bias in something like pattern recognition, whatever the cause, the more you can do about it, and the less it's going to end up in the final technology.
LAUREN: Exactly. Exactly. It's always going to be there. It's not going away. And I think the more attention that is paid to it is definitely worthwhile.
ERIC: So to wrap up, what's something that's been really cool about working on this project?
LAUREN: I would say our team is so fun. We're under Dr. Christopher Gill at the School of Public Health. And he has a project portfolio that has been about 17 years in Zambia. And there's a well established team on the Zambian end. So we've been able to have some computer scientists from the University of Zambia on our team, who are brilliant, and then computer scientists from the Boston University team, who are equally brilliant, and different public health perspectives on the team as well. It's very interdisciplinary, with people from different backgrounds contributing all the time, so it's fun. It's a lot of information that I'm learning constantly, so it makes it fun.
ERIC: Well, Lauren, good luck with the future of Project SEARCH, and thanks so much for telling us all about it.
LAUREN: Thank you so much for having me.
ERIC: You can find out more about this ongoing effort by searching for Project SEARCH on the Gates Open Research website or by visiting this episode's web page on mos.org/pulsar. Until next time, keep asking questions.
Theme song by Destin Heilman