Real-World Evidence and the Future of Clinical Research
In This Episode
In this edition of Life Sciences DNA, hosts Amar Drawid and Daniel Levine sit down with Dr. Manfred Stapff, Founder and Managing Director of Candid Advisory, a consultancy specializing in AI-driven health analytics and regulatory strategy. A physician and former pharmaceutical executive with leadership roles at Merck, Forest Labs, and TriNetX, Manfred shares his perspective on how real-world data (RWD) and real-world evidence (RWE) are transforming the way drugs are developed, evaluated, and regulated. From addressing misinformation in healthcare to redefining how patient data informs drug safety and efficacy, this conversation explores the critical role of RWE in bridging the gap between clinical trials and real-world patient outcomes.
- The Data Gap in Clinical Trials: Why trial eligibility criteria often admit only about 10% of real-world patients, and how RWD can help close this gap.
- From Experiments to Evidence: How integrating RWE with clinical trials creates a more holistic understanding of safety, efficacy, and patient diversity.
- Bias, Quality, and Trust: The challenges of bias, data consistency, and privacy, and why building trust and transparency is key to RWE adoption.
- AI and Real-World Data: How AI enables faster analysis of massive, heterogeneous data sources like EMRs, claims, and digital health tools, and how to manage inherent biases.
- The Regulatory Frontier: Why agencies like the FDA are increasingly embracing RWE in decision-making and Manfred’s vision for a future stepwise drug approval model.
Transcript
Daniel Levine (00:00)
The Life Sciences DNA podcast is sponsored by Agilisium Labs, a collaborative space where Agilisium works with its clients to co-develop and incubate POCs, products, and solutions. To learn how Agilisium Labs can use the power of its generative AI for life sciences analytics, visit them at labs.agilisium.com. Amar, we've got Manfred Stapff on the show today. Who is Manfred?
Amar Drawid
Manfred is a physician and pharmaceutical executive specializing in internal medicine, clinical pharmacology, and data-driven healthcare innovation. He has held leadership positions at major pharmaceutical companies such as Merck, Forest Labs, and LG Chem, where he managed global clinical development, regulatory affairs, and medical operations. He served as chief medical officer at TriNetX, helping build the largest real-world data research network in healthcare.
Daniel Levine
What is Candid Advisory?
Amar Drawid
Candid Advisory is a consultancy firm founded by Dr. Stapff specializing in AI-driven health analytics and regulatory strategy. The firm offers guidance in health analytics, drug development, regulatory affairs, and the strategic use of real-world data for clinical research and decision-making.
Daniel Levine
And what are you hoping to hear from him today?
Amar Drawid (01:25)
Manfred has written and talked about the need to combat misinformation and advance evidence-based medicine. This is a very real problem in healthcare. So I'd like to hear about how we can do a better job of generating real-world evidence, and how AI can help drug developers, regulators, policymakers, providers, and patients make better decisions based on real-world data and real-world evidence.
Daniel Levine
Before we begin, I want to remind our audience that they can stay up on the latest episodes of Life Sciences DNA by hitting the subscribe button. If you enjoy the content, be sure to hit the like button and let us know your thoughts in the comments section. And don't forget to listen to us on the go by downloading the audio-only version of the show from your preferred podcast platform. With that, let's welcome Manfred to the show.
Amar Drawid
Thanks for joining us. Today we're going to talk about how real-world evidence and AI can help various stakeholders in our healthcare system make better decisions. We're in the age of social media, where information and misinformation spread quickly. There are times today when it seems like facts don't matter; we can think about the COVID pandemic and the controversy around vaccines. How do you view the information landscape today? What has changed, and has misinformation become a public health threat?
Manfred Stapff (02:54)
First, thank you very much for having me. I'm really glad to be here to exchange opinions and to share my message with the audience, or with the world, I have to say, because this is a really interesting time. The information landscape, you are absolutely right, is changing dramatically. What I've found is that it becomes dangerous if we fail to deal with it correctly, if we cannot handle it. The reason is the difference in speed: the technology for distributing information, as we all know, has accelerated dramatically in the past 20, 10, 5 years. But the speed at which our brains digest this information, and the speed at which the regulatory and legislative systems regulate this information flow, have not accelerated that much. Out of this discrepancy, between how quickly information spreads, how we can deal with it, and how slowly the regulation of information changes, this can become a threat. But if we deal with it correctly, it can also be a huge opportunity. I'm sure we will go a little more into the details, but this is what I see as the biggest impact: the difference between the speed of spreading information and the speed of digesting it.
Amar Drawid (04:33)
Okay. And before we dig into Candid Advisory, what role do you see for AI analytics, applied to real-world evidence, in helping address this threat of misinformation?
Manfred Stapff (04:50)
I mean, AI is a wonderful tool if you use it correctly and don't abuse it, absolutely. Again, it's the speed and the volume that AI can digest compared to our brain. So as a research tool, it's a wonderful help. But AI can only digest or analyze the information and the data it is given, and usually it screens the internet for information. So if the information on the internet is biased in a certain political direction, a certain racial direction, a certain scientific direction, then the output of the AI will also be biased. That is definitely something we have to keep in mind. The same goes for pharmaceutical companies: if they apply AI to their clinical research data, the output will carry the same non-representativeness and the same experimental character that clinical trial data have compared to real-world data. So again, it's a wonderful tool, but we have to use it correctly.
Amar Drawid (06:05)
Before we get to clinical and real-world data, in the world of information, how do you regulate that? Because as you described, AI uses the information it's trained on. How do we make sure it is being trained on the right information or the right data? How do we keep an eye on that, or how do we manage it?
Manfred Stapff (06:33)
Where we can exert influence, we obviously should be aware of it. The example of the pharmaceutical industry analyzing its own clinical trial data is a typical case where we can influence things, meaning we can limit or define the data volume, the information volume, that the system, the artificial intelligence, gets. In a case like ChatGPT, we do not have that influence. We can certainly ask questions, and by asking the questions tell the system: I want to see this type of population, this type of science, this type of publication coming from that area. So by asking the correct questions, we have a certain influence on which information, for example, ChatGPT will digest. But overall, we have no influence on what ChatGPT sees on the World Wide Web. Again, asking the correct questions may help us push it in a direction that is perhaps less biased, or push it in a direction with perhaps more bias. We are not completely powerless, but let's be honest: ChatGPT does what it can and what it has been trained to do.
Amar Drawid (08:06)
Yeah, absolutely. So in the pharmaceutical world, since you mentioned clinical trials, can you talk about some of the limitations of clinical trial data? Why is it not good enough?
Manfred Stapff (08:20)
Clinical trials are an experimental system. We have a study protocol. We have defined visit intervals. We have defined inclusion and exclusion criteria. We have basically an optimized population, optimized for efficacy and optimized for safety. It has nothing to do with cheating; it is simply for scientific quality. In order to evaluate efficacy with the best scientific correctness, we want to define the patient population as precisely as possible. And for safety, also for the safety of the study patients, we want to exclude patients who carry a certain risk: elderly people, pregnant people, lactating people; we all know these exclusion criteria. The problem comes when we move forward after phase 2. In phase 2, we want this scientifically super-correct population. Very often there is then a copy-and-paste situation from phase 2 to phase 3, and this idealistic patient population continues to be enrolled in phase 3, which has not much to do with reality. The inclusion and exclusion criteria need to become much wider, much more liberal, much closer to the characteristics of the population who will later take the drug once it's on the market. Our own analysis showed that in most studies, the eligibility criteria restrict the population to approximately 10% of the population who will later have the indication to take the drug. Vice versa, this means that for 90% of patients, because of the various combinations of inclusion and exclusion criteria, there are not enough efficacy and safety data. If I were sitting at the FDA, I would give a pharmaceutical company all the exclusion criteria of their phase 3 studies as contraindications. I would say: you don't have data. It happens partially; you don't have data on kids, you don't have data on this group or that. But it's much more a matter of representativeness of the population. And we know, partly from more or less political and ideological discussions, that minorities are underrepresented. Study patients are more often white, more often male, more often young, and more often of higher socioeconomic status than the real population who has the disease and is supposed to take the drug when it's on the market. So here the ideologically tinged discussion aligns very well with the scientific discussion: yes, minorities are underrepresented in clinical trials.

And I was only talking about the population. Now let's talk about the procedures. We have a study protocol with visit intervals, every week, every two weeks, every month, with controls, checks, double checks. Special treatment: if you are a study patient, perhaps you do not have to wait in the waiting room for hours like other people in the emergency room. No patient in a real-world situation gets treated as well and with as much care as study patients are treated in a clinical trial. The consequences are a non-representative population, a placebo effect, and a Hawthorne effect. The Hawthorne effect means that people who know they are being observed behave differently. In many studies there are questionnaires to fill out, and since I had to provide my informed consent, I know I am being studied, so of course I may behave differently than I would in reality. All of this makes randomized clinical trials a perfect scientific tool, but it needs to be complemented by real-world data. The two sets of results don't have to be identical, a total match, but they should point in the same direction, because clinical trials are an experimental, artificial situation and very often have not too much to do with reality.
Amar Drawid (13:23)
So one of the arguments pharmaceutical companies make is that with a heterogeneous population, the statistics become harder. You need a larger sample size, right? That increases the cost a lot, and it increases the study time a lot. What would you say to that?
Manfred Stapff (13:43)
Sample size alone will not solve it; you have to look at the representativeness of the population. If the population is too young and too male, you can increase the sample size and get a false sense of certainty, because by increasing the sample size you only increase the statistical significance, not necessarily the clinical relevance. So it has to be complemented by real-world data.
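To make that distinction concrete, here is a minimal simulation in Python with entirely made-up numbers (a hypothetical 1 mmHg blood-pressure effect, not anything from the episode): as the sample grows, the p-value shrinks toward zero while the clinically trivial effect size stays the same.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical, clinically trivial effect: the drug lowers systolic
# blood pressure by 1 mmHg against a noisy background (SD 15 mmHg).
true_effect, sd = 1.0, 15.0

for n in (100, 1_000, 10_000, 100_000):
    control = rng.normal(120.0, sd, n)
    treated = rng.normal(120.0 - true_effect, sd, n)
    t, p = stats.ttest_ind(treated, control)
    # The effect stays ~1 mmHg, but p drops below 0.05 once n is large enough.
    print(f"n={n:>7}  effect={control.mean() - treated.mean():5.2f} mmHg  p={p:.4f}")
```

With enough patients, even a 1 mmHg difference becomes "statistically significant"; whether 1 mmHg matters to a patient is a separate, clinical question.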
Amar Drawid (14:13)
Okay, so let's talk about real-world data. The terms real-world data and real-world evidence are related. For the audience members who are not familiar with them, can you talk a bit more about what they mean?
Manfred Stapff (14:27)
Yeah, the definition of real-world data I like the most is the FDA's. It's funny, they define what it is not rather than what it is, but that is easier. Real-world data is everything that does not come from an experimental situation: it does not come from a clinical trial, does not come from any kind of scientifically influenced or controlled situation. It can be whatever data come from the day-to-day interactions between doctors and patients in the hospital. It can also come from smartwatches and smartphones; a lot of health-related digital health applications can measure heart rate, can read an ECG, and some even claim to measure glucose values or oxygen saturation. So it can come from any situation that is not influenced by a scientific study protocol or experimental protocol.
Amar Drawid (15:33)
And what about real-world evidence, which is related? Can you talk about that?
Manfred Stapff (15:37)
Thank you for prompting that, because this is the key point about real-world data: real-world data can only become real-world evidence if the data are collected well, in high quality, and analyzed with proper statistical methods, without bias and with cross-checking for outliers, for example; if the interpretation of the statistical analysis is done correctly, similar to clinical trials; and if the interpretation and conclusions are communicated properly, adjusted to the respective audience. So real-world data alone are not the solution; it's what we make of the data through proper analysis and proper communication.
Amar Drawid (16:30)
So real-world data by definition are not going to be as clean as the clinical data from the well-defined experiments, the clinical trials, that pharmaceutical companies run, right? Real-world data will have, as you were saying, a heterogeneous population; the way people take drugs will differ; and all the tests that are neatly done in clinical trials won't be there. How different do you usually find the analysis of real-world data compared to the analysis of clinical trials? And what are some of the challenges you find with analyzing real-world data?
Manfred Stapff (17:11)
One thing is obviously bias, because the real world, unlike the clinical trial world, is not randomized. Changes we see in the overall course of a disease may also be explainable by the reason a patient got a particular drug. The simplest example: if a specific drug is preferentially given to elderly or more fragile people because it has a reputation for being very safe, then of course we have a bias in the result. I mention this because it is basically what broke Vioxx's neck: Vioxx was promoted as so safe that a higher percentage of at-risk patients received it, and those patients then, of course, also experienced myocardial infarctions and other complications. Those events could have been attributed to the age of the patients, but they could also have been attributed to the drug. The conclusion is that real-world data, without randomization, can be biased. Therefore, in the analysis, if one compares clinical trials, which are in most cases randomized, with real-world data, one has to take this into consideration. The demography may be different, the concomitant therapy may be different, the risk profile may be different, and this has to be accounted for, for example with propensity scoring during the analysis.
Amar Drawid (18:53)
Well, I was wondering about that. So real-world data then removes the biases that are inherent in clinical trial populations, right?
Manfred Stapff (19:04)
I would phrase it this way: with randomization you have less bias than in real-world data. Therefore, in real-world evidence generation, meaning the proper analysis, you have to acknowledge that you may have this bias and you have to work on it: for example, by identifying potential influence factors other than the medication. If you compare a drug with some other product, you do propensity scoring to balance by age, by risk factors, by smoking habits, or by whatever is important for your specific therapy or your specific disease.
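As a rough illustration of what that balancing looks like in practice, here is a minimal propensity-score-matching sketch in Python on a fabricated cohort (all variable names, coefficients, and numbers are hypothetical, not from any real dataset or from the episode):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5_000

# Fabricated cohort with confounders that may drive who gets the drug.
df = pd.DataFrame({
    "age": rng.normal(60, 12, n),
    "smoker": rng.integers(0, 2, n),
    "risk_score": rng.normal(0, 1, n),
})
# Channeling bias: older, higher-risk patients are more likely to be treated.
p_treat = 1 / (1 + np.exp(-(0.04 * (df["age"] - 60) + 0.5 * df["risk_score"])))
df["treated"] = rng.random(n) < p_treat

# Step 1: model the probability of treatment given the confounders.
X = df[["age", "smoker", "risk_score"]]
df["ps"] = LogisticRegression().fit(X, df["treated"]).predict_proba(X)[:, 1]

# Step 2: greedy nearest-neighbour 1:1 matching on the propensity score.
treated = df[df["treated"]].sort_values("ps")
controls = df[~df["treated"]].sort_values("ps")
matched = pd.merge_asof(treated, controls, on="ps",
                        direction="nearest", suffixes=("_t", "_c"))

# After matching, the groups should look comparable on the confounders.
print(matched[["age_t", "age_c", "risk_score_t", "risk_score_c"]].mean())
```

A production analysis would add caliper limits, match without replacement, and check covariate balance formally, but the idea is the same: compare treated and untreated patients who were equally likely to receive the drug.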
Amar Drawid (19:45)
Okay. We talked about bias, but in what other ways do real-world data differ from clinical trial data, like collection, or even cleanliness? Can you talk about some of those?
Manfred Stapff (20:03)
Yeah. Since there is no scientific experiment, there is no study protocol that prescribes which data to collect and when to collect them. There is no clinical monitor doing source data verification. There is no data management group at a pharmaceutical company making sure the collected data set is complete and going back to the investigator if the data set is incomplete, has missing data, inconsistencies, or just typing errors, decimal-point errors. All of these quality management steps exist in clinical studies but not in real-world data. Therefore, similar steps need to be introduced when the data are collected. When the data come in, usually to a vendor, a data broker, an organization that collects these data, those companies need a quality management system that applies reasonable data quality tests. The simplest one is to look for outliers: you cannot survive with a potassium of 50; you can have a potassium of 5.0. Tests like that, of course, need to be done. Also tests for completeness: if the data set contains a treatment for thyroid disease, there should also be a diagnosis of thyroid disease. So these are tests for consistency, completeness, and plausibility, and such quality management steps should be introduced. It is similar to a factory building a car: the parts that arrive from a supplier need to be tested for quality before they are mounted into the car, and at the end, when the car is finished, there is again a quality check on the end product. Something similar needs to happen in organizations or companies that collect real-world data, combine them, and sell them to the pharmaceutical industry or to other organizations interested in real-world data.
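Manfred's two examples, the impossible potassium value and the thyroid drug without a thyroid diagnosis, translate directly into automated checks. Here is a minimal sketch in Python; the records and the plausibility thresholds are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical RWD extract: one row per patient.
records = pd.DataFrame({
    "patient_id": [101, 102, 103, 104],
    "potassium": [4.1, 50.0, 5.0, 3.8],  # mmol/L; 50.0 is a decimal-point error
    "drugs": [["levothyroxine"], [], ["metformin"], ["levothyroxine"]],
    "diagnoses": [["hypothyroidism"], [], ["type 2 diabetes"], []],
})

# Outlier check: flag lab values outside a plausible physiological window.
records["potassium_plausible"] = records["potassium"].between(1.5, 9.0)

# Consistency check: a thyroid drug should come with a thyroid diagnosis.
def thyroid_consistent(row):
    on_thyroid_drug = "levothyroxine" in row["drugs"]
    has_thyroid_dx = any("thyroid" in dx for dx in row["diagnoses"])
    return (not on_thyroid_drug) or has_thyroid_dx

records["thyroid_consistent"] = records.apply(thyroid_consistent, axis=1)

# Rows failing any check would be queried or excluded, the way a clinical
# data manager would go back to an investigator.
print(records[~(records["potassium_plausible"] & records["thyroid_consistent"])])
```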
Amar Drawid (22:38)
So what you're saying is that data preparation for real-world data is much more involved than for clinical trial analysis, right? Because you have to go through all these checks that, in a clinical trial, are already controlled for because it's an experiment.
Manfred Stapff (22:55)
It can be automated. All these checks can run automatically through various algorithms, just as data management in a clinical trial does: consistency checks, plausibility checks, many of them automatic. It doesn't have to be manual work. But in some instances it certainly makes sense for a medically trained person to look at the data set with an eye for consistency, like the thyroid example I mentioned. There are other consistency checks where one can compare diagnosis against history and ask whether it makes medical sense or whether there are errors in there. I would not necessarily say these quality checks are much more effort; I'm just saying they are necessary. And don't forget: when you do a real-world data study, you do not have to search for patients or worry about enrollment speed, and you do not have to wait out the therapy duration. If you are running a long-term clinical outcomes study over five years, you cannot accelerate it. If you do a real-world data study of five-year outcomes, you need data that span five years, but you can do it retrospectively. So overall, the cost and effort for real-world data and real-world evidence are much, much lower than for randomized clinical trials.
Amar Drawid (24:39)
Okay. As we discussed, real-world data are less complete and less clean. Does that mean you usually can't use all the data, or all of a patient's data? How do you find that when you're doing the analysis?
Manfred Stapff (25:00)
That's perhaps where sample size does come into play, because high variation or missing data can be compensated for by a much higher sample size. In a clinical trial, the statistician gives you the sample size necessary to test the hypothesis at the planned p level, usually 5%, and that is the number you enroll. You do not want to over-enroll for financial reasons, but also for ethical reasons: you should expose only as many patients as statistically necessary to an unknown drug, a drug whose safety is not yet well evaluated. So you keep the numbers low for ethical, statistical, and financial reasons. In the real world, in studies you can do with real-world data, you often take whatever you can get, because a higher sample size gives you more certainty, not only statistically, but also to compensate for potentially missing data, potentially erroneous data, or a potentially higher standard deviation.
Amar Drawid (26:32)
Can you talk about some examples of real-world data? There's claims data, there's registry data, there's EMR data. What are the different types of real-world data, and can you talk a bit about their characteristics?
Manfred Stapff (26:48)
Yeah, when you talk about real-world data, you automatically think about electronic medical records. Thanks to the 21st Century Cures Act, almost all hospitals, physicians, and practices use electronic medical records. The challenge, obviously, is that these data sources do not necessarily talk to each other; they have different definitions, different structures. So patient linking is an important task, but it can be done with standards. Then you can connect data from different sources, and not only different electronic medical record sources, because a patient can be in a hospital, can see a general practitioner, and the same patient can also visit a dermatologist or an ophthalmologist. These are all situations where data need to be combined; patient linking is the magic word here. Then it continues: claims data is also an interesting source, because usually you have one insurer, and all these data from different sources come together at that one insurer. There are claims before they have been settled and after they have been settled, so there are some specifics to consider. Then think about medication: in the electronic medical records you find what medication was prescribed, but in pharmacy or claims data you see whether the patient actually filled that prescription, whether the patient actually went to CVS or Walgreens or whatever pharmacy and at least took the drug home. You still don't necessarily know whether the patient actually took the medication. In a clinical trial you do a pill count; that's also no proof that the patient took it, but if the patient brings back an empty bottle, the likelihood is higher. Drug compliance is often an uncertainty. What else do we have? Then there are the patients' own digital health tools: smartphones, smartwatches, all of those sources. When we talk about registries, those are already scientifically planned situations. I'm hesitant to count a registry as real-world data, because registries use questionnaires, so the patient knows they will be asked certain questions, mostly about symptoms and how they feel. So that does not really belong to the real world.
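The patient linking Manfred describes is often done with deterministic match tokens: hash a normalized set of identifiers in each source so the same person gets the same opaque key everywhere. A minimal sketch in Python (the field names, records, and normalization rules are hypothetical; real tokenization services add salting, vendor-specific keys, and fuzzy matching):

```python
import hashlib
import pandas as pd

def link_token(first: str, last: str, dob: str) -> str:
    """Build a deterministic, privacy-preserving match key by hashing
    normalized identifiers, so records link without exposing PHI."""
    key = f"{first.strip().lower()}|{last.strip().lower()}|{dob}"
    return hashlib.sha256(key.encode()).hexdigest()

# Hypothetical extracts from two unconnected sources.
hospital = pd.DataFrame({
    "first": ["Ada"], "last": ["Smith"], "dob": ["1958-03-01"],
    "diagnosis": ["hypertension"],
})
dermatology = pd.DataFrame({
    "first": ["ada "], "last": ["SMITH"], "dob": ["1958-03-01"],
    "drug": ["topical corticosteroid"],
})

for frame in (hospital, dermatology):
    frame["token"] = [link_token(f, l, d)
                      for f, l, d in zip(frame["first"], frame["last"], frame["dob"])]

# The normalized hash matches despite casing/whitespace differences,
# giving one longitudinal view of the same patient across sources.
linked = hospital.merge(dermatology, on="token", suffixes=("_hosp", "_derm"))
print(linked[["diagnosis", "drug"]])
```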
Amar Drawid (29:50)
Okay, got it. And you were about to give an example of real-world data analysis and of what Candid Advisory does. Can we start with that? Maybe an example that brings together a lot of what you've talked about.
Manfred Stapff (30:08)
The example I wanted to bring up earlier, on the difference between clinical trials and real-world data, is very striking. Part of what I do in my Candid Advisory life is look at data and evidence for practical examples of the theoretical things we discuss, the theoretical advice I give; and this is a practical example of the difference between science and the real world. A Canadian study published recently looked at treatments for multiple myeloma, MM, and examined seven treatment schemes that had previously been tested in randomized clinical trials. They tested these seven treatment schemes against real-world data and found that in six of the seven, the clinical trial results showed up to 1.4 times longer progression-free survival than in the real world. So in the clinical trials, the treatment schemes worked better than in reality in six out of seven cases. That's a very interesting example, because it shows that, especially in oncology, where we get a lot of these results from very specific randomized clinical trials, the findings do not necessarily materialize in the real world. And, not surprisingly, the first finding of the analysis was that the multiple myeloma patients in the real-world data set were significantly older than those in the clinical trial data set, which may also be a reason for the worse real-world performance of the schemes.
Amar Drawid (32:19)
That's very interesting. Tell us about maybe another example, let's say on the clinical development side: how have you helped pharmaceutical companies with clinical trial design? Is there an example that comes to mind?
Manfred Stapff (32:37)
Yeah, a classical application of real-world data is learning about the patient population and the disease before you even start clinical development. What's the demography? What are the usual risk factors? What are the usual therapies? What is the patient journey through a diagnosis or through the disease? What are the lines of treatment? Very often treatments change over the course of the disease, so this is a first step that should always be done. The next step is support for protocol writing, in the sense of feasibility. We all know, and it has been a message going through the pharmaceutical industry for 25 years without anything changing, that the combination of inclusion and exclusion criteria is often so complicated that it's very hard to find these patients in reality. So feasibility means running the protocol in silico before it is finalized, to find out what the screen failure rate will be, what the enrollment speed may be, and whether you can find these patients at all. Then it continues with the potential for an external control group. If you want to compare drug A and drug B, and drug A is new while drug B is on the market, perhaps you do not need an arm for drug B in the randomized clinical trial; that would cut your trial population in half, because you can take the comparator data from the real world. It's a little tricky: you have to make sure the populations are comparable, because then you don't have randomization, and you still have a Hawthorne effect; the drug B patients are in the real world, while the drug A patients know they are in a study and perhaps behave differently. So it works for very objective endpoints where we are not concerned about the placebo effect or other psychological effects that might influence the treatment, but it's an option that can be discussed. Then there are the post-approval requirements. There should always be post-approval pharmacovigilance: observation of the patients, of the safety profile, of emerging side effects, of newly appearing diagnoses. This should become standard, and instead of a promotional observational study, which pharma relatively often uses to nudge prescribing, it should simply be done by collecting real-world data if the numbers are high enough.
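The in silico feasibility step is essentially set arithmetic over a real-world cohort: apply each draft criterion in turn and watch the eligible fraction shrink. A minimal sketch in Python; the cohort, criteria, and thresholds are invented for illustration, and they connect back to the roughly 10% eligibility figure Manfred cited earlier:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 20_000

# Fabricated real-world cohort of patients with the target indication.
cohort = pd.DataFrame({
    "age": rng.normal(64, 14, n),
    "egfr": rng.normal(70, 20, n),            # kidney function, mL/min/1.73m2
    "prior_mi": rng.random(n) < 0.15,         # history of myocardial infarction
    "on_anticoagulant": rng.random(n) < 0.25,
})

# Draft protocol criteria, applied in silico before the protocol is final.
criteria = {
    "age 18-75": cohort["age"].between(18, 75),
    "eGFR >= 60": cohort["egfr"] >= 60,
    "no prior MI": ~cohort["prior_mi"],
    "no anticoagulant": ~cohort["on_anticoagulant"],
}

eligible = pd.Series(True, index=cohort.index)
for name, passes in criteria.items():
    eligible &= passes
    print(f"after '{name}': {eligible.mean():.1%} of the cohort remains eligible")

# Criteria compound multiplicatively; a real protocol with a dozen of them
# can easily end up near the ~10% eligibility Manfred describes.
```

Running the same filters against actual EMR or claims data, rather than simulated columns, is what turns this exercise into a screen-failure and site-selection estimate.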
Amar Drawid (35:38)
Okay. And to what extent do you see pharmaceutical and biotech companies actually taking advantage of real-world data? You gave multiple examples of where it can be used, in clinical trial planning and even post-market. Do you see that as common, or do you think there is still a long way to go in adoption?
Manfred Stapff (36:00)
I think, thanks to the advertising and promotional efforts of the organizations that provide real-world data, the classical use cases, protocol feasibility and site selection, are the most frequently applied. Site selection, because if you check all the inclusion and exclusion criteria and can differentiate by site, then you know where you can find these patients, at least geographically, sometimes even hospital by hospital. Those are the most common use cases. I do not see the general learning about the disease too often, let's call it the preclinical aspect: learning about the disease in the real world, as I mentioned before, the demography, the treatments, the standard of care, what patients go through. I don't see that too often. I also do not yet see very often the pharmacovigilance part, actively collecting real-world data after launch in order to learn more, unless the FDA mandates it. One could claim the pharmaceutical industry is hesitant to actively collect adverse event data that might taint their label. So it would work best if the FDA just mandates it, or does it themselves; they have their Sentinel system, so why bother forcing the pharmaceutical industry if the FDA can do it themselves?
Amar Drawid (37:50)
And what about specific examples in, say, regulatory strategy? How can real-world data be used there, or how have you used it?
Manfred Stapff (37:59)
There are always confidentiality agreements, so I have to keep it a little general, without naming specific products or companies. But imagine a smaller company that needs external advice when it comes to developing in the United States: a company outside the United States that wants to develop, in the US, a nasal spray with a biological compound for an orphan disease. There are so many unpredictable situations where such companies need external advice. In this specific case, it turned out to be so complicated that perhaps there was no return on investment. A decision to say no is also a decision, because it protects the company from a later failure. These strategic discussions very often arise because smaller companies, startup companies, do not have this broad expertise. Perhaps they don't even have a chief medical officer, or if they do, they prefer a specialist who has worked in this specific disease for years or even decades. And then come the strategic questions, the regulatory approach questions, where someone with wider cross-functional expertise is needed. I have worked in medical services, clinical development, and clinical operations in Europe and the United States, so my experience is very broad, while startup companies very often want employees with very narrow but very deep experience. I can complement that when specific questions come up.
Amar Drawid (40:05)
That sounds good. So where do you see the biggest challenges in integrating real-world data analytics and AI into life sciences research?
Manfred Stapff (40:16)
One is still trust, and this is related to privacy; it's also related to quality. Everybody working in this field has to do a lot to explain what is really behind real-world data. How do the companies work? Where do the data come from? What kind of quality management systems do they have? What analytical methods are used? And, especially for physicians and patients, what data protection measures are in place, not only HIPAA compliance but also compliance with European data protection laws? Creating transparency and education in order to create trust is a very important step. Fortunately, the FDA has already accepted real-world data, with proper real-world evidence, obviously, and has already acted on it in regulatory decision-making, for example for new indications for drugs already on the market. Tacrolimus is one example, but also in the medical device sector, different indications for certain medical devices have been approved based solely on real-world data. This is certainly an area where we can expect more in the future. So trust is really important, and this is one reason I try to build trust through podcasts like yours, thank you for inviting me, and by writing the book, Real World Evidence Unveiled: Navigating Through the Maze of Modern Misinformation. This is not only important for drug developers and the pharmaceutical industry; it also applies to our daily life. The same way the FDA is bombarded with statements that a drug is safe and efficacious and wants to see the evidence, we are bombarded every day with information, with statements and headlines in the media, social media, and newspapers, and we should also want to see the evidence. The overall thinking should be the same. And since it's a young discipline, let's call it that, it is very dynamic and changes year by year. Right now everything is still a little shaky, but there's more to come, and it will become more and more accepted.
Amar Drawid (42:59)
And is your book out or is that something you're planning to publish?
Manfred Stapff (43:03)
Yes, the book is out now; it's available on Amazon as an ebook, hardcover, or paperback. And again, it's about real-world evidence: you'll find it faster if you search for "Real World Evidence Unveiled," and then the subtitle, "Navigating Through the Maze of Modern Misinformation," and of course you'll find it.
Amar Drawid (43:25)
That sounds great. So we've talked about a lot of these applications of real-world data across the life cycle of a drug. Where do you see it having the greatest impact, near-term and long-term?
Manfred Stapff (43:40)
Near-term, it's already accepted for safety-related questions, partially by pharma, but mainly by regulatory agencies; mention again the Sentinel Initiative. Long-term, I could imagine, and this is very long-term, almost a little revolutionary, a step-wise approval process. We had something like it with the COVID vaccine: when there was an urgent need for a vaccine, there was a preliminary approval without long-term studies and with a very incomplete CMC package, the chemistry, manufacturing, and controls package, because it was necessary. Now I would argue for a step-wise process not only because it's necessary, but because it's possible. We could have a step-wise approval process in which step two, the phase 2 studies, are the classical scientific studies with a strict protocol and a lot of inclusion and exclusion criteria. These scientific studies would lead to a preliminary approval and allow the pharmaceutical company to bring the drug to market. Then phase 3 would be replaced by basically what we have today as phase 4, a post-market observation, in this case with mandatory collection of real-world data on how the drug behaves in the real world. Obviously patients need to know this is only a preliminary approval; there needs to be a lot of education and transparency. But this would speed up the drug development process, make drugs available sooner for patients who need them, and reduce costs. Just imagine what two confirmatory phase 3 studies cost. And I don't think it would jeopardize patient safety if the real-world safety observation is very dynamic and reacts very quickly when signals come up.
Amar Drawid (46:03)
That sounds great, and I hope we march in that direction and can use real-world data to really accelerate clinical trials while also making sure we have data for different types of populations. I really appreciate it. Dr. Manfred Stapff, founder of Candid Advisory. Manfred, thank you very much for your time today.
Manfred Stapff (46:24)
Thank you very much for having me and thank you for helping me to distribute the message and the facts behind real world data, real world evidence. Thank you very much.
Daniel Levine (46:36)
Well, we really covered a big range there in real world data. It was interesting to hear Manfred talk about these different aspects. What did you think?
Amar Drawid
I find real-world data to be a fascinating area. Of course, in the last few years we've been talking about it more and have started using a lot of real-world data, but there is still a long way to go, including from the regulatory point of view. So it was really good to hear about a lot of the applications and how they're coming of age more and more, but also to recognize that I don't believe we're using real-world data to its fullest potential at this point.
Daniel Levine
One of the things that really struck me was when he said that the eligibility criteria for most clinical trials restrict the population that can enroll to about 10% of the actual population that will later use the drug. So we're not adequately testing drugs for 90% of the populations that will use them. What did you make of that?
Amar Drawid
I think pharmaceutical companies do realize that the clinical trials they design don't necessarily cover everything. And as we discussed, there are limitations: these companies need to show statistical significance in a population, and there's a limit to how many patients they can recruit. I know most pharma companies are quite active in diversifying their patient populations, to include multiple races and to have the genders represented to a large extent. A lot of them are trying; it's something that started a few years ago, and we have yet to see to what extent they have been successful in diversifying their populations. So there is genuine need and genuine desire from pharmaceutical companies to reach those populations; it's a matter of how they can do that within the constraints that exist. That's why, if real-world data can complement some of those aspects, it would be really beneficial.
Daniel Levine
Another thing Manfred talked about is that for ethical, financial, and statistical reasons, you want to keep study sizes small. Real-world data is quite different, particularly now that we can manage and analyze massive data sets. What are the implications of that?
Amar Drawid
With real-world data, yes, you can get a lot of retrospective insight. But many regulatory approvals still require a prospective design: you define the specific population and you run the trial, so it's much more prospective than retrospective. One thing about retrospective data, though, is that it lets you understand what is happening in a disease area, the treatment landscape of the drugs already on the market, whereas a clinical trial involves a new drug that isn't out there yet. So real-world data becomes much more helpful for a specific drug once it's on the market, because that's when the data become available and you get the retrospective view. There are limitations, but for understanding the market, or for some of the applications he talked about, like site selection, or understanding which inclusion and exclusion criteria help and which only hurt the trial because you just cannot recruit patients, and that's a real problem; sometimes those criteria are so stringent it's hard to recruit, using real-world data makes a lot of sense to me. But I'm not sure many companies do that consistently. As he said, doing real-world data analytics is itself an expense, so the bigger pharmas with more money can tap into it much more than the smaller biotechs, which, as he mentioned, tend to have people with a narrow skill set in the specific indication they're going after rather than the larger, broader view, or even the means to do that kind of analysis: to buy the data, and to have people who can clean the data and do the real-world analytics. I definitely see a gap between the ideal of what's possible and where we are.
Daniel Levine
You talked about electronic health records as one of the primary and obvious sources of real-world data, and about some well-known challenges to do with the structure and variability of electronic health records. One of the things I'm hearing, though, is that the problem with these records from a research point of view is that they're really designed to satisfy the needs of insurers and may not be well designed for research purposes. Do we need to think about changing that to make them more useful?
Amar Drawid
The question is how, right? These EMRs are usually managed by specific health systems, and they capture a lot of data so their doctors can see a patient's history. What they want to know is: what were the diagnoses, what was prescribed. But as you said, the prescription information is there, while compliance is not necessarily there, because that is maybe something a doctor discusses with the patient rather than something that gets recorded. And for pharmaceutical purposes, we really need to know the compliance, how much the drug is really being used, because only then do you understand the effect of the drug, if the patient is actually taking it. So the records serve different purposes. On the other hand, I understand that if pharmaceutical companies start getting into the EMRs, people are going to worry about what they're trying to get from them, right? Different people maintain the EMRs, and the purposes differ. So yes, there is a need to gather a lot of this information, but I don't see how that is going to happen in reality.
Daniel Levine
You asked about the extent to which biopharma companies are adopting real-world data. Manfred mentioned site selection as the example where they're probably most aggressive, less so on the preclinical research to understand a disease, or, on the other side, pharmacovigilance once drugs are on the market. Why do you think that is? Is it a matter of access to adequate data? Is it cost, or something else?
Amar Drawid
When you want to do clinical trials, you want to do them very fast; you want to accelerate them. And maybe in that rush, this falls a bit by the wayside: taking the time to get the data, buy the data, do the analysis, really understand. So I think one reason is time. Another is, of course, resources. Another could be trust: companies don't necessarily know what could come out of the real-world data. So maybe these companies need to spend more time understanding the value they can get from real-world data, and that's why having these kinds of conversations is helpful, so companies understand more and can make use of it. But again, as we said, this is a new, burgeoning field. Ten or fifteen years ago the use of real-world data was tiny, whereas now it's much greater. Companies have started to understand it, but it's a journey; we're on that journey, and I believe companies will use real-world data more and more as time goes on. I'm glad we were able to have Manfred on the show to talk about an area we haven't really explored in depth before. That was a great conversation, and thanks as always. Good to be here, Danny.
Daniel Levine
Thanks again to our sponsor, Agilisium Labs. Life Sciences DNA is a bi-monthly podcast produced by the Levine Media Group with production support from Fullview Media. Be sure to follow us on your preferred podcast platform. Music for this podcast is provided courtesy of the Jonah Levine Collective. We'd love to hear from you. Pop us a note at danny at levinemediagroup.com. Life Sciences DNA, I'm Daniel Levine.
Thanks for joining us.