602 Park Point Drive, Suite 225, Golden, CO 80401 – +1 303.495.2073
© 2024 Medical Affairs Professional Society (MAPS). All Rights Reserved Worldwide.
How Machine Learning Can Be Used to Understand Audiences
Many of us think about machine learning and AI as a way for an audience to get information: We ask ChatGPT a question and it provides the answer. Here we explore machine learning as a way to get information about an audience, covering both audience segments/personas and individuals within these groups. We also discuss the limitations and data privacy concerns that distinguish what we “could” know from what we “should” discover. And it turns out machine learning is not just a tool for mining the external environment, but also a way to make sense of a company’s internal data to discover hidden experts and resources within the company.
Garth Sundem 00:00
Welcome to this episode of the Medical Affairs Professional Society podcast series: “Elevate”. I’m your host, Garth Sundem. And today we’re exploring machine learning to understand audiences with Julia Mutygullina, Product Solutions Manager, and Data Scientist Carlos Areia, both from Digital Science. This episode is sponsored by Digital Science. Okay, so Carlos, we’ve chatted before about machine learning. And I think our audience, and I’ll include myself in this, hears a lot about machine learning and AI as a way for an audience to get information. That’s how I use it, in my very basic use of machine learning: I ask ChatGPT things. But we are talking about machine learning as a way to get information about an audience. So first, for today, are we talking about identifying and understanding customer personas? Or are we talking about machine learning to identify individuals in this digital space?
Carlos Areia 01:08
Well, hello, Garth. Nice to be here again, thanks for inviting me. So yeah, we can talk a bit about both. I know machine learning is becoming an old thing now, with artificial intelligence and everything growing at a really, really fast pace. But yeah, let’s see where the conversation takes us. For example, we’re doing a bit of work on both inside Digital Science. I think we’ve previously discussed, on yet another podcast, finding unicorns and key opinion leaders and digital opinion leaders and how we match them, et cetera. And that will partially answer your question on the personas. The other part is work such as segmenting audiences, like saying, for example, that a group of cardiologists are all tweeting about a particular product or a particular study. Those are both bits of work that we are doing and developing at the moment, and on both we are applying a bit of machine learning and developing AI capabilities too. So yeah, I think it’s a very, very interesting topic. For me, it’s mind-blowing how things are changing, with new tools and new capabilities and more and more data showing up and now being used.
Garth Sundem 02:32
Okay, I apologize for asking this, because it’s going to show my lack of basic understanding. But I imagine that someone else out there could use this same thing. Can you please, just as we get started, tell me the difference between AI and ML? I know this is a basic question, and I apologize. But just give me the context here. Is AI replacing ML? What’s the difference?
Carlos Areia 02:58
So, it’s a bit of a blend now. The new AI is like this huge pool of data and self-learning capabilities that, well, everyone is competing on: ChatGPT and Google and all the other competitors are using that. Whereas machine learning can be algorithms and models, in a simpler way, perhaps more specific. Artificial intelligence is something that you can iterate on and change in almost near real time. For example, with ChatGPT you can have these conversational iterations with it, and it will update instantly. Whereas with machine learning, at least the traditional way, this will involve more steps, and sometimes even coding, etc.
Garth Sundem 03:51
Okay, thanks for that. And you can decide later whether we edit that out, but I just needed that little bit for myself. And I think maybe some other folks do, too. So Julia, it’s funny to start with limitations, but this brings up, you know, we’re identifying individuals. So what can we do? What can we look out into the landscape and see, and what can’t we do? And who decides? Can we find people’s LinkedIn profiles and message them? Can we just know that someone’s going to be at a congress, and they go on our list of people we’d like to meet? What can we do when we’re identifying audiences out in this space?
Julia Mutygullina 04:46
Hello, Garth, and thanks for having me here. This is indeed a very interesting question. I’m being the voice of the customer here, because as a product solutions manager and as a consultant I work a lot with our clients, who ask exactly the same question: How far can we go? How much data can we get about a group of people or about specific individuals? And as Carlos mentioned, it’s a little bit of both. So we usually start with identifying bigger, larger groups of people, larger groups of users of our tools or audiences online. And some of our clients are absolutely fine with staying on this level, saying, okay, this is already good enough for us to know who we are connecting with; we don’t need to go deeper into individual profiles, into more personal details and personal information. On the other side, a lot of clients would like to know who in this big group of key opinion leaders, these healthcare practitioners, would probably be the most interesting or the most apologetic for their products, who they want to connect with, talk to personally, and maybe learn more from and collaborate with. And in this case comes the question of how much information we can bring in on a personal level, right? And again, I probably won’t have the correct answer on that. But we have all these laws around how much personal information a company can collect and store, how far we can go, what we can actually collect online, and even these new AI regulations that have recently been issued in the European Union, right? I think it’s just important to keep in mind that whatever our clients want, we as Digital Science need to be vigilant about complying with our clients’ requirements, but also complying with legal requirements and all the regulations that we have on this side. So, saying that, yes, we can bring personal data into the conversation to some extent.
But we also need to understand what the limit is to having this data. Should it be the name? I don’t know, LinkedIn profile, X profile, phone number, email address? How much can we collect and store?
Garth Sundem 07:25
Gosh, I think I hear your drones circling my house right now…Exactly. I think I have one here somewhere as well. But I think that sounds really useful for a company who just wants the information. And is it part of, you know, and I’ll speak maybe more broadly than just Digital Science, but do you see it as part of the role of a solution provider to be aware of these guardrails? And then also to make clients aware of these guardrails and limitations as well?
Julia Mutygullina 08:05
Yes, this is definitely a very important part of our role, for both myself and Carlos as customer-facing people at our company. And I think these are the most frequent conversations that we have: first, what is available, what can we provide? And what are the limitations of what we can provide? So yeah, that’s something that we discuss quite often. And I think that there is some clarity on this side. But there are also questions that still have to be answered, probably.
Carlos Areia 08:39
Yeah. And a lot of the conversation with clients is, for example: just because we can doesn’t mean that we should.
Garth Sundem 08:50
Yeah, that’s a really, really interesting one. Oh, next episode: the ethics of customer identification. Interesting. All right. Well, let’s get into how we do this. So Carlos, you know, I’m thinking about MAPS. I know our CRM, and we have people tagged in a couple of different ways. And, you know, we know folks have maybe attended certain kinds of webinars or that kind of thing. But is that what a company needs in order to unleash machine learning onto its people to create personas and identify individuals? Or is it just that a company can ask you for information on its audiences and you could just go out into X, you know, Twitter, and find it? How does a company best prepare either data resources or CRM or whatever to use machine learning to identify audiences or understand audiences?
Carlos Areia 09:55
So when we talk about machine learning, at least the way we use it, for example, for getting into the KOLs, the key opinion leaders, what we do is use Dimensions data. And by Dimensions, I mean the research landscape information. And then we tag it along with Altmetric, which is the platform for the impact of all this research. And that gives us access to tools like X, etc. So, for example, one machine learning example that we used was to match the Dimensions profile of a researcher, where you have listed that, let’s say, Mr. Garth published 10 papers in a specific therapeutic area. So we would find Mr. Garth on Dimensions. And then, using machine learning with our Altmetric capabilities, me and Mike Taylor, whom you’ve already had a couple of times on your podcast, had this hypothesis, both of us being researchers as well who tweet about, well, shouldn’t say tweet now, post about our research, that a good percentage of people share their own research. So if Mr. Garth Sundem published the paper, and Mr. Garth Sundem a couple of hours or days later tweeted about the paper, they’re very likely the same person. And using that we managed to match between half a million and a million people, researchers. And that’s very useful inside the KOL field, because sometimes these key opinion leaders are also digital opinion leaders, sometimes on their own research, but sometimes on other people’s research. And then this creates a rabbit hole of who this KOL tweets about, and then other KOLs as well, and who they tweet about. And then we managed to find this pool of key opinion leaders that share a lot about research, their own and others’. And then we are also able to segment this by specific products or specific therapeutic areas.
And that is where a lot of our clients find value in this data, because the data is there. Using machine learning and other tools, and we’re developing AI capabilities to take this further as well, we make sense of this data and basically turn all these huge tables of information into insights that actually say something. So, for example, they can use it in their CRM systems and get more information about this pool of people, with insights in very specific and narrow fields, or broad ones, whatever they want.
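The self-sharing hypothesis Carlos describes (an author who posts about a paper shortly after publishing it is probably the same person) can be illustrated with a minimal Python sketch. This is not Digital Science's actual method or the Dimensions or Altmetric API: the `Paper` and `Post` records, the 7-day window, and the 0.8 name-similarity threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher


@dataclass
class Paper:  # hypothetical record, not a Dimensions schema
    doi: str
    author: str
    published: datetime


@dataclass
class Post:  # hypothetical record, not an Altmetric schema
    handle: str
    display_name: str
    doi: str
    posted: datetime


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_matches(papers, posts, window=timedelta(days=7), min_similarity=0.8):
    """Pair authors with profiles that shared their paper soon after publication."""
    papers_by_doi = {}
    for paper in papers:
        papers_by_doi.setdefault(paper.doi, []).append(paper)

    matches = []
    for post in posts:
        for paper in papers_by_doi.get(post.doi, []):
            delay = post.posted - paper.published
            # Assumed rule: the post must follow publication within the window.
            if timedelta(0) <= delay <= window:
                score = name_similarity(paper.author, post.display_name)
                if score >= min_similarity:
                    matches.append((paper.author, post.handle, round(score, 2)))
    return matches


papers = [Paper("10.1000/example", "Garth Sundem", datetime(2024, 1, 1))]
posts = [
    Post("@gsundem", "Garth Sundem", "10.1000/example", datetime(2024, 1, 2)),
    Post("@someone", "Jane Doe", "10.1000/example", datetime(2024, 1, 2)),
]
matches = candidate_matches(papers, posts)
```

In this toy data only the `@gsundem` profile is paired with the author. A production approach would presumably estimate a match probability from many more signals, since, as noted in the conversation, profiles often use an acronym rather than a full name.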
Garth Sundem 13:01
So that’s cool. It sounds like you have, you know, the potential to offer a couple of metrics. I can imagine you saying, you know, here’s who’s driving the research, and then here’s who’s driving the opinion about the research. And I should ask: is a lot of the machine learning application learning who among the KOLs is a DOL?
Carlos Areia 13:32
Yes, like the probability. For example, we calculate the probability of Mr. Garth Sundem, the researcher, matching Mr. Garth Sundem, the X profile, because sometimes they won’t put their own name; they put an acronym or a tag. And then, for example, we use it to understand if Mr. Garth Sundem, the Twitter profile, is an oncologist or a cardiologist, or is a patient or a cancer survivor, for example. And then we create these groups of people, so that we can gather all the information and use it in our metrics. So by doing this matching of profiles, we’ll be able to describe Mr. Garth’s activity as a researcher, like how many publications, how many grants, how many times his own research has been cited, but also his social media profile: how many times has he cited pages about his own work or other people’s work inside the therapeutic area? What types of research does he share? Etc. And that’s why we use a bit of machine learning on what his tweets and description tell about him: Is he an oncologist? What types of products does he work with? What kind of things does he post? It’s basically creating this persona and making sense of all the numbers, bringing in a bit of qualitative data, and then a bit more about.
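The persona step described here, deciding from a profile description whether someone is an oncologist, a cardiologist, or a patient, can be sketched in Python as well. The sketch below uses a plain keyword rule standing in for a trained text classifier; the `ROLE_KEYWORDS` table and the `classify_bio` function are hypothetical and not taken from any Digital Science tool.

```python
import re

# Assumed keyword lists for illustration; a real system would learn these
# signals from labeled profiles rather than hard-code them.
ROLE_KEYWORDS = {
    "oncologist": {"oncologist", "oncology"},
    "cardiologist": {"cardiologist", "cardiology"},
    "patient": {"patient", "survivor"},
}


def classify_bio(bio: str) -> str:
    """Return the role whose keywords appear most often in the bio, or 'unknown'."""
    tokens = set(re.findall(r"[a-z]+", bio.lower()))
    scores = {role: len(tokens & words) for role, words in ROLE_KEYWORDS.items()}
    best_role = max(scores, key=scores.get)
    return best_role if scores[best_role] > 0 else "unknown"


role = classify_bio("Medical oncologist. Posts about oncology trials.")
```

Grouping profiles by the returned role gives the kind of segment (for example, all oncologists posting about a product) that the numeric metrics can then be computed over.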
Garth Sundem 15:18
The qualitative data too. So Julia, on the customer side, what are you hearing from folks about how they’re using this? I can imagine Carlos gives me a list of, I don’t know, 1,000 oncologists in a certain therapeutic area who might be using a certain product. And then, Julia, what are companies doing with this? And also, are you getting questions from companies asking for ways they would like to use this stuff?
Julia Mutygullina 15:57
Oh, that’s also a great and very diverse question. So first of all, let’s broaden this beyond, okay, we found these 1,000 researchers and oncologists, or other doctors, and we want to see how they interact with our research. I think, first of all, to the great point that Carlos made, we have all this data that we retrieve ourselves, so we have access to it. And one of the key benefits that our clients find in our data is that they don’t need to search for it again themselves, right? They have it there, and they can use it. And they can actually see, without doing their own additional research: okay, I published an article; who is reading this article, who is talking about this article, both in the academic space and on social media? And again, this is one outlook, on the outer world: let’s look at the researchers who are interacting with my research, and I can probably even connect with them at some point. Another sort of outlook would be: I have my own data. A lot of our clients have a lot of their own in-house data, information that they have collected over the years. But in a lot of cases, they are not using this data as one big system, right? Different teams will have different bits and pieces, information about, like, their individual likes, their specific XR articles, or the specific group of healthcare practitioners that they are talking to. But on a company level, a lot of people want to sort of globalize this and create the holistic picture. And that’s where our data and the work of Carlos, Mike, and our other data scientists come into place: we can provide our data and merge it with a company’s data. That’s how companies can have not only the outlook on the outside world and what’s happening there.
But also, I would say, in a lot of cases, and this is probably the most interesting feedback that I’ve heard from our clients, it’s how much more they learn about their own researchers. It is really fascinating sometimes to see: oh, I didn’t even know that one of my colleagues actually published another three articles in collaboration with another three groups of our researchers, just because I’m focused on my one single thing. And that’s perfectly fine; people are usually busy with what they have. So again, sometimes it is really interesting to get this feedback too, that first of all the company has a chance to learn more about its own researchers, and only after that connect these researchers with someone else outside.
Garth Sundem 19:05
Oh, so not only identifying the external audience that might be useful or interesting, but maybe identifying the appropriate internal resources to have the conversations with those external DOLs. Carlos, does it drive you crazy when you have data from four different departments and 40 different teams, and each team has coded the data differently? Or can you now somehow use the magic of machine learning to say, let’s heal all this together, and heal all this together with the external resources? I mean, is that how it works, or do you still look at that and just shake your head and cry a single tear?
Carlos Areia 19:49
I cry several tears. No, I’m kidding. No, it’s usually quite straightforward, because we don’t need a lot of data to start the process. It can be a list of publications or a list of researchers that we can match. A lot of the time it can be a single therapeutic area or product, for example. Because, as I mentioned, besides publications and all the rest of the data, there are links to affiliations, etc., and sometimes you can segment it using that. And that’s where, as Julia mentioned, clients sometimes discover things about their own company or their own team that they didn’t know, or that were a surprise to them. And it can be as simple as, oh, has my research been cited by patents this many times, or even inside social media, how much attention was he getting, or is he getting, etc. But it’s really interesting that Julia mentioned it, because, for example, inside companies, we can also use all this information to classify and to, not predict, but create trajectories of leadership and emerging research. So, for example, it’s very good for finding early-career people, either academic or inside the company, inside corporate, who are on an ascending trend. We combine, again, the data that we have, not only the research data but the attention and the impact that the research is creating, to build a trajectory for that researcher. So we will be able to say, okay, inside this pool of 10,000 researchers with a publication age of less than 10 or 15 years, these are the most likely to become leaders in specific fields, etc.
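The trajectory idea above (filter to researchers with a short publication age, then look for an ascending trend in research impact plus attention) can be sketched in Python. Everything in this sketch is an assumption for illustration: the record fields, the choice to add citations and attention together, and the use of a least-squares slope as the "trend" are stand-ins, not the method used at Digital Science.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    xs = range(n)
    x_mean, y_mean = mean(xs), mean(values)
    denominator = sum((x - x_mean) ** 2 for x in xs)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    return numerator / denominator


def rising_researchers(records, current_year, max_publication_age=10):
    """Rank early-career researchers by the trend of citations plus attention."""
    ranked = []
    for record in records:
        publication_age = current_year - record["first_publication_year"]
        if publication_age < max_publication_age:
            # Assumed combination: yearly citations and attention simply added.
            yearly = [
                c + a
                for c, a in zip(record["citations_per_year"], record["attention_per_year"])
            ]
            ranked.append((record["name"], slope(yearly)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical records, oldest year first.
records = [
    {"name": "Researcher A", "first_publication_year": 2019,
     "citations_per_year": [1, 3, 6, 10], "attention_per_year": [0, 2, 5, 9]},
    {"name": "Researcher B", "first_publication_year": 2018,
     "citations_per_year": [5, 5, 6, 5], "attention_per_year": [2, 2, 2, 2]},
    {"name": "Researcher C", "first_publication_year": 2000,
     "citations_per_year": [50, 60, 70, 80], "attention_per_year": [9, 9, 9, 9]},
]
ranked = rising_researchers(records, current_year=2024)
```

Researcher C is excluded by the publication-age filter, and Researcher A ranks above Researcher B because the combined yearly numbers climb faster. The same function could be pointed at a company's internal researchers or at an external pool.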
Garth Sundem 22:01
That’s neat. And, well, first of all, it strikes me that there’s a lot of really neat creativity in how you apply this machine learning. I never would have thought to go to their affiliations. You know, so maybe you hypothesize through your machine learning that someone is an oncologist, and then you see all these profiles, and you can see that they, you know, belong to Asher, or, you know, a gastroenterology society or something like that, their affiliations, and then you have more information about them. You know, it seems like machine learning is (a) only as good as the data, but (b) also only as good as the, you know, human understanding to ask the right questions in the right way.
Carlos Areia 22:39
Oh, yeah. Especially now, getting back to artificial intelligence, there’s this whole new job category being created. Not sure if you’ve ever heard of it, but it’s the prompt creators, or prompt engineers, which is a very fancy way to say people who know how to ask the machine questions. Yeah.
Garth Sundem 22:59
Absolutely. Oh, don’t take us down that rabbit hole, Carlos. We’ve got a lot to talk about. And you don’t even have to answer this, but I can imagine: wouldn’t it be cool if you could use this same learning internally, Julia, as you brought up, to identify the rising stars within a company? I’m not sure exactly what sort of, I don’t know, creative, cool metrics you would use to do that. It probably wouldn’t just be research citations and the Impact Factor and Altmetric. But boy, if you could find a way to turn that same approach internally and identify the rising stars, I imagine that would be neat. But all right, let’s leave it there for today. Carlos and Julia, thank you for joining us. To learn more about how your company can partner with Digital Science, visit Digital-science.com. MAPS members, don’t forget to subscribe. And we hope you enjoyed this episode of the Medical Affairs Professional Society podcast series: “Elevate”.