Aired:
December 12, 2024
Category:
Podcast

Generating Proteins that Nature Never Imagined

In This Episode

This episode of the Life Sciences DNA Podcast takes us to the edge of innovation, where AI and synthetic biology are helping create proteins that nature never designed. It’s a bold look at how scientists are rewriting the rules of biology to fight disease in entirely new ways.

Episode highlights
  • Explains how researchers are no longer constrained by nature—they’re now designing novel proteins from scratch to take on tough diseases.
  • Shows how AI models are helping predict structure, function, and stability—turning trial and error into targeted design.
  • Highlights real examples of synthetic proteins being used to target diseases once considered out of reach.
  • Reveals how AI shortens the time from idea to candidate—and helps ensure those candidates are highly precise and scalable.
  • Explores how this technology could transform the development of vaccines, precision medicines, and treatments for resistant conditions.

Transcript

Daniel Levine (00:00)

The Life Sciences DNA podcast is sponsored by Agilisium Labs, a collaborative space where Agilisium works with its clients to co-develop and incubate POCs, products, and solutions. To learn how Agilisium Labs can use the power of its generative AI for life sciences analytics, visit them at labs.agilisium.com.

Amar, we've got Mike Nally on the show today. For audience members not familiar with Mike, who is he? Mike is the CEO of Generate Biomedicines and is a CEO partner at Flagship Pioneering. He's held several leadership positions at Merck, including Chief Marketing Officer. He was responsible for the global marketing and brand strategy of a diverse portfolio encompassing innovative medicines and vaccines, generating over $40 billion in revenue.

Before that, he served as the president of Global Vaccines for Merck, managing the company's operations in Sweden and in the UK. He holds an MBA from Harvard Business School, a degree in accounting and finance from the London School of Economics, and a BA in economics from Middlebury College. And what is Generate Biomedicines? Generate Biomedicines is a company that emerged from Flagship Pioneering Innovation Foundry.

It officially launched in 2020. It's using machine learning to generate new antibodies, peptides, enzymes, cytokines, and other therapeutic proteins. It's working to dramatically increase the speed and reduce the cost of drug discovery. And what are you hoping to hear from Mike today? I'd like to understand how its platform technology works, what's unique about its approach, and how the company's work speaks to the broader changes

that are happening in drug discovery. Well, before we begin, I'd like to remind our audience that if they'd like to keep up on the latest episodes of Life Sciences DNA, they should hit the subscribe button. If you enjoy the show, hit the like button and let us know your thoughts in the comments. With that, let's welcome Mike to the show. Mike, thanks for joining us today.

We're going to talk about generative AI, protein therapies, and how your platform technology is expanding the potential for protein therapeutics. Let's start with the name of the company. What's the meaning of Generate Biomedicines? Amar, thanks so much for having me and excited to share a little bit about the great work our scientists are doing here at Generate.

Generate was named six years ago because there was a fundamental belief that, using generative modeling, we could ultimately come up with protein constructs that nature hasn't discovered. And that was largely based on research suggesting that, while we've made huge progress in protein engineering as a field over the last 40, 50 years, what we also saw was that

there was a data-driven approach that could ultimately yield superior results. So, using this wave of generative modeling that had started to emerge in the tech sector, we believed those sorts of approaches could find patterns in data to ultimately come up with better answers for protein-based therapeutics. So you started this way before the generative AI hype that's out there right now, right? You actually started

using generative modeling six years ago. So that's fantastic. Yeah, I think that, you know, I will say I think our founders, Gevorg Grigoryan and Molly Gibson, deserve a ton of credit in seeing how this confluence and this emergence of these sorts of technologies could be applied to biology. And that's what's really so exciting. I mean, I think if you think about our field as a whole, you know, where we've struggled is we don't understand biology all that well

as humanity. And I think it's without a doubt that, you know, with the help of some of these computation-based techniques and this increasing computing power that we've seen across all of society, we are going to start to understand biology in a way that was previously unimaginable. And through the help of machines, we're starting to understand these patterns that underlie biology in pretty unique and distinctive ways.

So how does Generate Biomedicines' platform technology integrate machine learning with biological engineering? Can you go a bit deeper into that? It's a great question. And I think this is actually at the heart of the company. So at our core, we integrate cutting-edge computational techniques, so the latest in generative modeling, diffusion models, with a cutting-edge experimental laboratory. And so

we don't rely purely on computation to understand biology. We actually have a strong underlying prior model that suggests DNA sequences. Each of those DNA sequences is then built into a full-length protein format of choice. All of those proteins are then measured within our labs for both biophysical and functional characterization. And then, through an informatics layer, every data point is captured

and fed back to refine the computational approach. And so what we're seeing is very much a learning system, where we are learning from directly suggesting, building, and then testing DNA sequences to try and capture the interrelationships between sequence and ultimately function. Because if you're in the therapeutics business, what you really care about are how

proteins interact and how they function and drive biological processes. So, to try to understand it better: are you generating these sequences to understand the targets and the disease mechanisms better, or are you also generating them to come up with new drugs, which are proteins? Yeah, I think in the early days of the company, where we saw the most immediate application was in molecular design.

We believed that we could come up with better versions of existing starting points, but also, and equally profound: could we come up with entirely new compositions from scratch? You're talking about the proteins, the new proteins. So yeah, let's just say you have a target of interest, right? So yeah. And then you have a desired landing spot on the target, known as an epitope,

where the drug needs to bind. Exactly right. Can you come up with a protein construct or molecule that will bind with specificity to that location and that location only? In many ways, before these techniques existed, it was really hard to come up with a de novo antibody construct. As far as we're aware, that had never been done until roughly three years ago. We saw our first de novo hits around three years ago. And what that gives you now

is a control over biology that we've never had before. Because think about how we've historically found protein-based therapeutics: we immunize a human, a mouse, or a llama, we comb through plasma, we see what sticks to the target, and then we tinker with it to have drug-like properties. This is allowing you a programmability dimension that we think is pretty unique and distinctive. Sorry, go ahead.

No, you're basically able to go from sequence to functional molecule. And that's what's so exciting for us: we believe we are on the brink of an era of programmable biology. We're still very much in the early days. There's a long road to go. But the ability to, in some ways, make biology truly engineerable, we believe, will lead to an industrial revolution in biology.

And for our audience members who are not familiar with the term de novo, or with what the traditional methodology is, can you explain a bit more how protein therapeutics were created before, and what is really revolutionary about what you're doing? Yeah, I mean, I think a lot of, you know, the latest antibodies had historically been found, you know, either in human

plasma, right? So you have an infection, your immune system creates antibodies, you see those antibodies that bind to the target of interest, and then we try and manipulate them to have better binding affinity to the target or better functional performance over time. The way we've historically manipulated proteins is we've either used random mutagenesis, so error-prone PCR, to try and mutate

single amino acids or a series of amino acids to try and find superior-performing proteins. This was a completely random pursuit. On the other hand, there were biophysics-based techniques (people may be familiar with a software called Rosetta) that allowed us to try and build proteins up, bottom-up, atom by atom, and understand protein dynamics. The problem with

both of those techniques is that while they've been revolutionary in terms of our ability to explore biology, they were also limited in their ability to explore beyond the narrow area around a starting point. Yeah. What de novo gives us is an ability to explore this vast universe of potential proteins. So the average protein, just for context, is about 200 amino acids long.

There are 20 natural amino acids. So the combinatorial possibilities in protein space are 20 to the 200th power. That's atoms in the universe cubed. One of our board members is the Nobel laureate Frances Arnold. She said to me when I was contemplating joining Generate, “Mike, you have to recognize, despite the majesty of nature, nature has only surveyed one drop of water in all the Earth's oceans of potential proteins.”

And what these generative tools give you are techniques to start to survey the breadth of the ocean.
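
As a quick sanity check on those magnitudes, here is a back-of-the-envelope sketch; the roughly 10^80 atoms in the observable universe is the usual order-of-magnitude estimate, not a figure from the episode:

```python
import math

AMINO_ACIDS = 20   # natural amino acids
AVG_LENGTH = 200   # "average" protein length quoted above

# log10(20^200) = 200 * log10(20) ~ 260, i.e. ~10^260 possible sequences.
log10_space = AVG_LENGTH * math.log10(AMINO_ACIDS)
print(f"20^200 ~ 10^{log10_space:.0f}")   # -> 10^260

# Atoms in the universe (~10^80) cubed is 10^240, so if anything
# the "cubed" quip understates the size of protein sequence space.
print(f"(10^80)^3 = 10^{3 * 80}")         # -> 10^240
```

By this arithmetic, 20 to the 200th power is around 10^260, even larger than atoms-in-the-universe cubed.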

Absolutely. And so the proteins that you're generating, how much are they different from the traditional antibody proteins that are out there? Like, can you give us a sense of that? Are they like completely different? Yeah, it's a great question. And again, part of what you're trying to delicately balance is the recognition that if you change them so completely,

they become constructs that are foreign to our immune systems, and our immune systems have well-honed defense mechanisms to protect us against foreign entities. What we try and do is manipulate the most meaningful parameters on a protein in order to get the desired therapeutic effect. And so in some cases that will be redesigning

the binding region of a protein. Those are the six CDR loops, which are kind of responsible for the interaction between the protein and the target. We'll do pretty substantial manipulation of those domains. We've shown the capability to change 20, 30, 40, 50, 60% of the CDRs, where historical techniques would only allow us to change about 10% before we couldn't find any functional variants.
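
To make those CDR percentages concrete, here is a small hypothetical illustration; the two loop sequences below are invented for the example, not real antibody CDRs:

```python
def cdr_change_fraction(parent: str, variant: str) -> float:
    """Fraction of CDR positions that differ between a parent loop and a redesign."""
    assert len(parent) == len(variant), "compare equal-length CDR loops"
    mismatches = sum(p != v for p, v in zip(parent, variant))
    return mismatches / len(parent)

# Invented example loops (illustration only):
parent_cdr = "ARDRGYSSGWYFDV"
redesigned = "ARDKGWTSGAYFEV"
print(f"{cdr_change_fraction(parent_cdr, redesigned):.0%} of positions changed")  # -> 36%
```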

It's just a great example of how these tools are giving us an ability to explore a much broader swath of biology and do so in a way that introduces favorable pharmaceutical-like properties, which is ultimately the goal. Because one of the things that we believe and we've seen is creating a drug is actually a multi-parameter optimization equation. And if you think about where computers excel

over pure human ingenuity, it's in these complex formulas. Right? So if I'm making a drug, I want it to bind a certain way. I want it to be specific to the target. I want it to have certain manufacturability parameters, right? I have to look after the viscosity of the drug. I want it to have certain developability parameters. That in and of itself is a multi-parameter optimization equation. So...

how we tune each of those dials becomes really, really important. And historically what we've done is we've done that in a sequential process. So first we try and improve affinity. And then we find something that binds the way we want it, but it may not have those manufacturability properties. So then we start tinkering with it to try and iron those out. What the computer allows you to do is do that as a simultaneous optimization equation,

which ultimately gives you a tremendous advantage in terms of speed, but also potentially quality. Yes. That's also the reason it takes so long to even get a drug into the clinic, right? That's five to seven years, because you are doing this multi-parametric optimization, right? As you said, you're first checking for the right affinity, then you're trying to improve the affinity, but then there are...

safety issues, so you're trying to make sure that toxicity is not there. I mean, all these things that you talked about. It's interesting that instead of doing that sequentially, now you're trying to do that in one shot with this. That's a long-term goal. I think we're still a bit of a ways from there. That's why the wet lab integration is so important for us. Okay. But what we're seeing is you're able to do that. You mentioned a five to seven year process. What we've seen with our first molecules that have entered the clinic,

the first one was 17 months from concept to clinic. The second one was about two years. And so you're seeing a dramatic reduction in the time it takes to come up with an optimal therapeutic. Great. What about when people are using these generative models for the sequences? It's amazing that you're able to change

20% of these, you know, complementarity-determining regions, right? The CDRs of the antibodies, which is fantastic. How does your generative AI work? Like, is it like a large language model that's coming up with the next amino acid in the sequence, or how does that work? So the core foundational way we started building these models was, and there's been a pretty big evolution

over the last six years, just given, you know, number one, increasing computing power: we've seen a dramatic increase in, you know, the ability to train bigger and bigger models. Secondly, machine learning techniques have evolved quite considerably. And then, you know, thirdly, we've got better and better data sets to train on.

So the starting data sets that we trained on were actually very common data sets. We studied every amino acid sequence across all species. So 180 million amino acid sequences. We also studied the protein data bank, which has about 200,000 high quality, high resolution structures of proteins.

What those two data sets allowed us to start to understand was the interrelationship between protein sequence and protein structure. The aim of Generate was always then to add a third leg to that stool: function. So could we then, through our internal model, suggest DNA sequences that we could build and test, measure, and capture data on function?
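
As a rough illustration of that suggest-build-test-learn loop, here is a minimal sketch; every function, property, and threshold below is a hypothetical placeholder, not Generate's actual system:

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def propose_sequences(n: int, length: int = 120) -> list[str]:
    """Stand-in for the generative model proposing candidate sequences."""
    return ["".join(random.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]

def assay(seq: str) -> dict:
    """Stand-in for wet-lab characterization: several properties measured at once."""
    return {
        "affinity_pM": random.lognormvariate(5, 2),  # simulated binding affinity
        "viscosity_ok": random.random() > 0.3,       # simulated manufacturability check
        "expression_ok": random.random() > 0.2,      # simulated developability check
    }

def is_hit(m: dict) -> bool:
    # Multi-parameter criterion: every dial must be in range simultaneously,
    # rather than optimizing affinity first and patching the rest later.
    return m["affinity_pM"] < 100 and m["viscosity_ok"] and m["expression_ok"]

training_data = []
for round_number in range(3):
    for seq in propose_sequences(1000):
        result = assay(seq)
        # Failures are kept as negative labels: "hallucinations as a feature".
        training_data.append((seq, result, is_hit(result)))
    # In a real system, the generative model would be refined on training_data here.

hits = sum(1 for _, _, hit in training_data if hit)
print(f"{hits} hits out of {len(training_data)} built-and-tested designs")
```

The point of the sketch is the data flow: every built sequence, hit or miss, becomes a labeled example for refining the model, and the hit criterion is a simultaneous multi-parameter check rather than a sequential one.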

By now, we've probably generated close to 10 million proteins that nature hasn't discovered. We've built all those, we've tested all those, we've learned from all of those. To your point on model architectures, we basically employ all the different model architectures within Generate. We published one of our models, a model named Chroma, which was published in Nature about two years ago.

Chroma was a diffusion model. But we are very open in thinking about what the right model architecture is to solve these therapeutic challenges. And we learn a lot from the advances that are happening in academia and across industry.

We see how these techniques are evolving. We see the performance of different attributes. We innovate and push these frontiers ourselves, but we also certainly learn a lot from others. Given the complexity of the task that we're trying to undertake, it's going to take a collective effort to get to the best possible answers. And so we take inspiration from a number of domains. We hope to provide a bit of inspiration in our own work

because the task is really important and really complex. Yes. Yes. And as you mentioned, the diffusion model, people know these as the models producing images, like in DALL-E, right? Exactly right. I mean, one of the observations was, if you can see these sorts of generative models create great likenesses in imagery, could you do the same thing with biology? Could we actually understand

not necessarily the statistical properties of what it means to be a face, but to actually sit there and dissect that from a protein perspective. And I think what we've shown is that we can, our models can perform very well in making suggestions depending upon how much starting information we have. One of the things that's been interesting is that if you have an existing structure of a protein binding a target, and we ask our model to suggest alternative structures that would bind

that same target, we're seeing success rates of like 50, 60%. Right. So it'll come up with pretty novel sequences that will bind in that same region. When you do the de novo task that we talked about earlier, those success rates are in like the low single digits right now, but they're on a logarithmic scale of improvement. So three years ago, the hit rate for de novo antibody across the field, we believe was close to zero.

Last year, it was in the 0.1 % range, 0.2 % range, so one out of a thousand. Now we're seeing it in the low single digit range. So you're on this beautiful curve, this logarithmic curve that ultimately if we can see another logarithmic leap, we think we'll be in a place where you're gonna have enough starting material that this becomes a much more standard way of,

you know, starting design. The reason I say that is what's happening on the back end is we're changing the underlying experimental infrastructure. So when we started Generate six years ago, we could only build 100 variants to an individual target. So our model was not limited in the number of variants it could suggest; we were limited experimentally, from a capacity standpoint, in how many we could actually build.

Yes. What we've done as a company, and I think what you're seeing more broadly, is through investments in automation, through miniaturization of experiments, we're now able to do experiments on the scale of a million defined variants to an individual target. And so are you doing these high-throughput experiments to validate? Absolutely. And so if you think about it from that vantage point, if I have a 0.1 percent hit rate,

and I do a million variants, I'm still getting thousands of starting points. Yes.
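
Written out, that screening arithmetic looks like this; the 2% figure stands in for the "low single digit" rates described above:

```python
de_novo_hit_rate = 0.001        # ~0.1% hit rate reported for last year
library_size = 1_000_000        # ~1 million defined variants per target now buildable

print(f"Expected de novo hits: {de_novo_hit_rate * library_size:.0f}")  # ~1,000 starting points

# At the "low single digit" percentage rates described today (2% assumed here):
print(f"At a 2% hit rate: {0.02 * library_size:.0f} starting points")   # ~20,000
```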

And then, I mean, it reminds me of, again, bioinformatics, where you have these variants and you're trying to figure out which one is the target, right? But then, just because of the size of the experiment, you have so many variables that you get a lot of these false positives and stuff. How are you dealing with that? Because, yes, you can generate a lot of these new variants, and many of them, when you do the validation, may give you a positive signal.

How are you finding the right one, and making sure it's not just a statistical error? Well, I mean, this is the beautiful thing about experimental validation, right? We don't just sit there and computationally trust that the model is a hundred percent accurate. Yeah. When you build and test all of these, you're defining every sequence that you want to build. And then you're actually

synthesizing it in the lab. And so the beautiful thing is we're learning both from the successful sequences as well as the failures. So, one of the things that has plagued this generative revolution that we're seeing across society is things like hallucinations. Yes. We view hallucinations in our world as a feature, not a bug, because we learn as much from the hallucinated sequences

as we do from the functional sequences. Because every data point, when I build that hallucinated sequence, what I find is I have a non-functional protein. And I'm learning that, you know, that sequence that was suggested leads to a non-functional protein, and that data gets then fed back to refine the computational model. So we're starting to say, well, if I put this code in, it doesn't work. So let's learn that there's something in this code

that doesn't translate biologically. So that's interesting. So you're learning that at the level of sequences. Are you able to translate that to the level of structures? Yeah, I mean, so one of the big investments we made as a company a few years ago was a cryo-EM core. So we built a facility in Andover, Massachusetts that has four cryo-EM microscopes. As you know well,

cryo-EM is truly an artisanal craft, where it can take months to solve an individual structure. The goal of this core was to actually make cryo-EM a high-throughput instrument. And part of the reason we did this was for data purposes. We talked about the protein data bank, about the 200,000 structures of proteins. One of the gaps we saw in the protein data bank was

a gap in antibody-antigen interactions. And the reason that data is so important to us and to our models is, if you think about what you care about when you're designing a therapeutic, it's how it interacts with a target, not necessarily the freestanding protein in and of itself. And so, through a lot of hard work, through some brilliant scientists, through

advancements in sample preparation, and some machine-learning-based approaches, we're making these instruments high-throughput, and we believe we'll have more data than the public domain on protein-protein interactions in the not-too-distant future. And part of what we're doing, as you rightly note, is we're actually solving structures. We call it structure-in-the-loop learning, where, as we're doing these therapeutic exercises to try and find an optimal molecule, we'll start solving

our experimental structures and understand more clearly what's happening at an atomistic level between the protein and the desired target. So we have heard of DeepMind trying to predict structures, right? But okay, it's trained on the structures that are in nature. Now you are generating some really novel sequences. Is something like that able to predict the structures of these novel

proteins accurately or does that struggle there? Yeah, I mean, listen, what the team at DeepMind and the AlphaFold project have done is nothing short of remarkable, right? I think it's been a huge advancement for the entire field. We've embraced much of the structure prediction technologies in our own work. At the same time, if you think about where AlphaFold or tools like AlphaFold have struggled to really excel, it's at interfaces.

And so while you may be able to predict again the freestanding structure, as we know, proteins bind in a much more dynamic fashion. It's not a static interaction. Understanding the dynamism of that interaction is something that these current generations of structural prediction technologies struggle with. And it's not because the

underlying computational approach is flawed, it's because there isn't enough data to appropriately characterize it. And so that's part of why we've made the investment in cryo-EM, because we're trying to fill that data gap that will allow us to more accurately characterize that interaction. And in terms of data, so you're using, of course, the public data, but then the proprietary data, some of this that you mentioned.

Can you tell us a bit more about that, like in terms of data quality, how you're managing that, and how does that affect the model accuracy? Yeah, so we were really, I mean, really, really fortunate to have an extraordinary informatics team from day one. I think our fourth hire was our head of informatics. And so the architecture at Generate has always been purpose-built for remarkable precision and data quality and data capture, I

should note. So this is part of why we believe in a fully automated lab over time: removing measurement error to the best of our ability, high-quality assay development, high-throughput assay development. And these are features that are super, super critical. I mean, at the core of what you're getting at is data quality.

You can have the best machine learning techniques, but if you have poor data quality, you're going to get poor answers. In this world, we believe that, through a much more systematic data generation philosophy and approach, we can generate the highest of high-quality data. Where we're rate-limited in certain instances, and we see this with some of our early therapeutic programs, is

the tools to effectively measure biology with the level of precision we want. We're getting antibodies today that butt up against basically our biophysical measurement limit with current instrumentation. I know historically a good protein-based therapeutic would have a binding affinity maybe in the low picomolar levels, which would be a really, really good antibody. You know, there are many

therapeutics that have shown nanomolar binding. Our second program in the clinic is at 106 femtomolar binding. Wow. Right? Which, again, is hard to measure. I don't think we're hitting a biophysical ceiling. I think we're hitting a biophysical measurement ceiling in our ability to analyze. Very interesting.
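
To put those affinity units on one scale, here is a simple sketch; the "typical" and "good" reference values are illustrative assumptions, and only the 106 femtomolar figure comes from the conversation:

```python
# Smaller dissociation constant (K_D) = tighter binding.
# 1 nanomolar (nM) = 1,000 picomolar (pM) = 1,000,000 femtomolar (fM).
KD_TYPICAL_NM = 1.0     # assumed: a typical nanomolar-range therapeutic
KD_GOOD_PM    = 10.0    # assumed: a very good, low-picomolar antibody
KD_QUOTED_FM  = 106.0   # the 106 fM figure quoted here

fM_per_nM, fM_per_pM = 1_000_000, 1_000
print(f"1 nM  = {KD_TYPICAL_NM * fM_per_nM:,.0f} fM")
print(f"10 pM = {KD_GOOD_PM * fM_per_pM:,.0f} fM")
print(f"106 fM binds ~{KD_TYPICAL_NM * fM_per_nM / KD_QUOTED_FM:,.0f}x tighter than 1 nM")
```

So let's talk about GB-0669.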

It's a phase one monoclonal antibody candidate to treat COVID variants. It targets the S2 domain, which has been thought of as undruggable. What does this say as a proof of concept for Generate Biomedicines? Yeah, so this is the COVID program that we started working on. I think we were all, during the course of the pandemic, asking,

what could we do to uniquely contribute to stop both the global health crisis that we were experiencing and the broader social crisis that I think, you know, plagued so many around the world. When Omicron emerged, it was about Thanksgiving of, gosh, I'm trying to keep the dates straight, but I think it was maybe 2021. We asked ourselves,

Could we use our technology to do something distinctive from what the field has done to protect humanity from this virus? We talked to a bunch of evolutionary biologists and they suggested there are some highly conserved domains of the spike protein that if you could target them, they may lead to a more durable solution from a protein-based therapeutic perspective. Unfortunately, these domains are immune cryptic. So if you think about how -

you know, part of the challenge we've faced with COVID is what became clear early on was the receptor binding domain of the spike protein was a very good target because if you could block that binding domain, you could ultimately inhibit the virus. Unfortunately, when we are infected or we get a vaccine, our bodies generate antibodies to that receptor binding domain.

And those antibodies put pressure on the virus, which actually leads to viral evolution. The virus escapes those answers. What was suggested to us is there are other domains, where our bodies don't produce antibodies in anywhere near the same quantum, that are much more conserved and therefore much more stable. And if we could find something from a therapeutic or prophylactic standpoint that would bind to those domains,

you may not see viral escape at the pace that you've seen it with other technologies. So all of the antibodies that the field and, you know, our colleagues across industry had invented to date had targeted the receptor binding domain. All of those had ultimately been rendered obsolete by the evolution of the virus. And so we sat there and said, well, could we use these computational techniques to target

these immune-cryptic domains, because we're not bound by our immune systems to find good therapeutics. And we were fortunate to find an extraordinary molecule, GB-0669, that you mentioned, that binds with specificity to this S2 domain. The S2 domain houses the fusion machinery for the virus. There are some...

human-generated antibodies to this domain. Unfortunately, they don't bind the target very well and they don't neutralize the virus. We found a molecule in this case that is both potent and broadly neutralizing to this domain. And that's why, you know, you mentioned the word undruggable. I know many had seen that it was really hard to find anything that could actually be a valid therapeutic here. For GB-0669, we've just read out the interim data at IDWeek

two weeks ago, the data suggested four really important things to us. Number one, GB-0669 was safe. There were no serious adverse events in the trial. Number two, it was highly effective, so it neutralized all historical variants of concern and potential future variants of concern. It also neutralized

pan-coronavirus, so it worked against SARS-CoV-1 and a number of other bat coronaviruses. Number three, the molecule was well behaved, and when I say well behaved: the theoretical concern of computational protein generation has always been, would these molecules be immunogenic? An immune response, right? Like... Exactly. Yeah. The ADA rate here, the anti-drug antibody rate,

was very, very low in the study. At the therapeutically relevant dose, it was about 3%; across the entire cohort, it was about 8%, which is very manageable. And then fourth, we were able to demonstrate that these technologies gave us tools to prosecute biology that was almost impossible to prosecute with traditional techniques. The molecule has done everything we've asked of it in terms of its ability to block the virus. And I think

it'll serve as a good potential solution if the virus were to continue to evolve or we experience another coronavirus in the future. Okay, that's great. And so when you say immune-cryptic, that's where the immune system is not going to go after that, so there's no problem of the virus evolving there, right? Exactly. I think our immune systems are amazing, right?

But there are parts of biology that our immune system doesn't generate answers toward. And so we estimate that a substantial portion of the target landscape is immune-cryptic. And these technologies give us probably better tools to go after some of those cryptic domains. Very interesting. So you paused the development of GB-0669. Why is that?

Well, I think what we've seen is the market dynamics for COVID have radically changed. And we've also seen a very different stance from a regulatory perspective. Our belief is that the best use of GB-0669 would be in a prophylactic setting. But we also believe that for a prophylactic, the most desirable approach would be a combination-based approach. If you think about rapidly mutating viruses,

what we've learned across the broader infectious disease landscape is that combinations provide more secure treatment or prophylaxis than a single agent. Right now, there aren't a lot of great combination agents to pair with 0669. And then when you couple that with the fact that the phase 3 trial that would be the next step for an agent like it,

would likely be a 6,500-patient study, an investment of multiple hundreds of millions of dollars. That's a lot for a small biotech to take on. And so for us, we have a great molecule. We will continue to characterize it throughout the phase 1 trial. And it'll be a great port of call if we need a good answer in the future, whether against the current coronavirus or

future infectious disease attacks. Absolutely. I mean, this is another interesting way of attacking the virus, right? And if something like this happens again, another pandemic, you will be able to use this knowledge to find those

immune-cryptic epitopes and then go after them pretty quickly. That's right. And I think the other thing that's important to note is that a number of the early therapeutics that were utilized in the course of the pandemic were actually repurposed drugs, some from SARS-CoV-1, for instance. And so, you know, we've had probably three major coronavirus outbreaks in the last 15 years. Obviously, one had a

much bigger global human impact than the other two. But we now have additional tools in our arsenal and we have new approaches that should protect us better in the future. Yes. So given the broad applicability of the platform, how does the company prioritize the indications you're going to pursue? Yeah, it's one of the things we wrestle with the most.

Where we started from a company perspective was we believe we're working on a transformational technology that could change the way every large molecule is made in the future. And in that, what we've tried to do with some of our early programs is tune down biology risk to isolate, does the technology work? Sometimes what can be confounding

in drug discovery is it's hard to understand: was the technology flawed, or was the underlying biological thesis flawed? So we have started working on a series of immunology agents where we know, if we bind a target in a certain way, we drive a desired clinical therapeutic effect. What we try to do in those cases is find:

Are there molecules that we can show huge distinctive advantages with in pursuing those targets? So our second molecule in the clinic is an anti-TSLP antibody. The current standard of care is dosed every month. And biologic penetration in asthma today is 10 to 20%. With GB-0895, which is our anti-TSLP antibody, the thesis was if we could improve the binding affinity,

and extend the half-life of these sorts of approaches, could we get a molecule that could actually be dosed every six months for severe asthma? That program has now enrolled five of the six single-ascending-dose cohorts, and the early data seems to suggest we are going to be able to achieve the six-month profile. So if you think about global patient access, this would

be a huge benefit for society, for patients, to migrate from a monthly injectable to an every-six-months injectable. You just think about the change in cost of goods. You think about the broader health system costs of those visits. It could be a huge benefit for society. And so we've, in some ways, tuned down biology risk by going after precedent in biology with a better molecule.

As we look forward, we're doing a lot more work in both infectious disease and cancer. Largely, as a small biotech, we had to select some areas in which to build capability. We liked infectious disease because if you neutralize a virus preclinically, it translates very well into the clinic. So you get very early reads before you invest a lot. And we saw that the technology could

do distinctive things, as we showed in the COVID example. In cancer, we saw an ability to go after tumor-specific epitopes in a very selective way. And if you think about it, one of the challenges we have is that we now have a series of tools, whether it be antibody-drug conjugates, radioligand therapies, T-cell engagers, CAR-Ts,

that can now kill cancer cells very, very well. But the question becomes, how do we selectively target the cancer cell versus the healthy cell? And similar to what we were talking about in the COVID example around targeting cryptic domains: are there cryptic domains or highly selective domains on cancer cells that we can go after with these sorts of technologies?

And so we've got a series of different efforts on that front as well. And the company has a number of collaborations that it's announced. So how do you view the role of partnerships, and how will you balance the development of your own pipeline with the demands from the collaborators? Yeah, I mean, it's core. We are few and they are many. There are a lot of brilliant minds, a lot of brilliant scientists, in a lot of different organizations.

We have some areas of focus, some areas of expertise, but we can learn a lot from the field. And we have a technology. One of the major things I spend a lot of time thinking about is that one of the beautiful things about these computational approaches is they're introducing scalability into drug discovery in a way that has been previously almost unachievable.

If you think about how drugs have been historically discovered, it's down to an individual genius or a team of geniuses who come up with a therapeutic hypothesis and then have the perseverance to see it through. That's not a very scalable model. If I put the computer more at the center of the creative process, all of a sudden, what we know about computation is it is much more scalable. It collects

our collective wisdom and applies it across a number of different therapeutic challenges. So partnerships for us are essential if we're going to get the most out of this platform. As a small organization, we're working on about 20 different programs right now. Many of those are with partners. So we have six of those with our good partner Amgen, who placed their trust in us two and a half years ago.

And that's been a collaboration that has been really, I think, fruitful for both organizations. We work with certain academic medical centers, like MD Anderson Cancer Center down in Texas. One of the things that struck us was they collect two gigabytes of data on every patient that walks through their doors. Wow. And, you know, they were joking with us one day that the secrets of cancer lie in their freezers. And so, you know,

there were some observations that their scientists had made that allowed us to identify some of these tumor-specific epitopes that our technology could potentially prosecute very quickly. We also have a deal with Roswell Park Cancer Center. Some of the experts in cell therapy are at Roswell Park, and they were saying one of the things that they don't have the capability to do is

to really optimize binding in the frameworks of CAR-Ts. Could we use your technology to help us in that endeavor? And so we're matching their expertise in manufacturing CARs, their clinical expertise, again, with our capabilities. And then most recently, we signed a deal with Novartis that should, more similar to the Amgen deal, bring the breadth of Novartis' therapeutic area expertise and

tremendous protein engineering capability. It brings manufacturing capabilities, clinical capabilities to complement our expertise in computational protein design. And so I think in these cases, bringing the best of some of these organizations together, we can do more than either organization could do on their own. So when you're looking for like...

an ideal partner, these are some of the things that you're looking for, right? That they're not only partnering with you in a deal-making sense, but also providing a lot of their clinical expertise, and a lot of thought expertise as well. Yeah, and I think, listen, there are so many brilliant scientists in our industry, right? And if we can benefit from complementary capabilities

in partners, that's awesome. And, you know, for us, we have a humility about what we don't know. There's a lot we don't know as an organization. And there's a lot of extraordinary expertise in other organizations. And if we pair the best of what we do with the best of what they do, we're going to tackle some really important challenges. And as you know really well, I mean, biology,

it's hard. It's humbling. Right. We need, as humanity, to put our best minds on it. And not all of those best minds will be in any one organization. Yes. It is great, then, that you are working with so many different partners, really smart partners and scientists, and that you are integrating a lot of the knowledge you're getting with them, to really accelerate

how you can develop better and better therapies. Yeah, and at the core, what we're all here to do is get better medicines to patients faster. Yes. And if we tried to tackle all those things ourselves, it would take us much longer, if we ever even got to those answers. And so bringing those experts closer to us gives us a better probability, gives us a much faster decision-making framework, gives us

a lot of benefits from the expertise and the learnings that they've had on their journey that we can then complement with some of the things that we think we do pretty distinctively. Absolutely. So given the speed with which you can generate clinical candidates, what becomes the rate limiting step in advancing drugs to market? Two things. One is clinical.

Right? And tied to the clinical is the cost of clinical development. If I were to rank order it, I always deconstruct the drug discovery process into three different pieces. One is your underlying target hypothesis. Do I have a better hypothesis than someone else? And if I do, that can help me create value and create a valuable medicine.

The second thing is, can I come up with the molecular intervention that ideally tests that therapeutic hypothesis? And then the third thing is, can I develop and demonstrate that and document that for regulators, for physicians, and for patients? And those are kind of the three core ways that we create scientific

value. Now there's obviously some nuances on the manufacturing side, the commercial side. But, you know, if you think about the scientific enterprise, those are the three big tasks. What we're really good at is the molecular generation piece. Right? We can come up with extraordinary molecular answers,

and we think, because it's applicable across disease areas and across protein modalities, that gives us a lot of different hypotheses to potentially pursue. At the same time, on the clinical side, we're starting to enter into a much more highly regulated space. So while I believe computational techniques will play a huge role in redefining the clinical pathway in the future,

I think it's going to be a bit of a longer journey. Now, there are elements of clinical development that I think we can address sooner. Patient recruitment, right? These can be very long-duration exercises where we should be able to use digital technologies to identify patients much, much quicker. On the back end, study closeout

exercises, right? So you think about, you know, from the time we finish a trial to the time we file, sometimes that can be a six-month or longer exercise. Yeah. These sorts of large language models should be able to write a lot of these study reports almost instantaneously. Yes. There's the piece in the middle, though, where I think we can use computation and data to do better patient selection. So maybe enrich through biomarkers or other

markers. But ultimately, I think for these early-generation technologies, we're going to have to adhere to the current regulatory process, as we should, because the worst thing we could do as an industry is undermine the perceived safety of these sorts of approaches. That would diminish trust and potentially

undermine what I think will be one of the biggest productivity boons to the industry in the last 50 years. So I think we're in this world where there are going to be pieces of the clinical process that are going to continue to be long. But can we collapse all these other things to enhance productivity? And I think that's what we're trying to be very selective about. On the capital side, the cost of a clinical trial just keeps going up.

And we've got to figure out ways to stop the rampant inflation of what it costs to study a patient. At the same time, we also have to be better, and this goes back to the biomarker point, at selecting which patients disproportionately benefit from a medicine. If we have an enriched population, you need a smaller sample size typically, and you get faster results because you usually see the therapeutic benefit

much, much quicker. And so, there's a whole host of things, but clearly, we believe over the next 18 months, we'll add another three to six programs into the clinic. We believe that is something that could be a steady state if we can solve some of these clinical and capital challenges. Yes. And how do you see the application of generative AI changing drug discovery?

I think in the not too distant future, every molecule is going to be made this way.

Now we can debate timelines. And when I say not-too-distant future: I know, unfortunately, I've been in the drug industry long enough that, you know, not-too-distant future means a five-to-ten-year horizon. I think we're on the brink of a revolution, and at its core, if you think about biology, biology is the original information technology.

So when we are conceived, what's passed from parent to child is a code called DNA. From that code, all biological function is derived. Humans have never understood that relationship. But there is a relationship. I think it's inevitable that, with the help of computers, we are going to start

to understand that relationship better and better. For that reason, I think this has a transformative effect, because if you go back over the arc of history, over the last 150 years, what we've learned as humanity is that when complex domains become engineerable or programmable, industrial revolutions occur.

I believe we are at the start of the biological revolution. We've watched chemical systems, we've watched electrical systems. We've thought about Bernoulli's principle for flight underscoring the aviation revolution, Ohm's law for electricity. There's a whole host of examples. These sorts of things have been magical because we've distilled this complexity down to engineering.

And so, with the help of computers, I believe we're going to start to be able to not just see the code of life in DNA and read the code of life, but ultimately write in the code of life in a way that will create extraordinary solutions for some of the biggest illnesses and challenges that face the planet. I'm really looking forward to all of that. And these are really, really exciting times.

Mike Nally, CEO of Generate Biomedicines. Mike, thank you very much for your time today. Thank you very much. Appreciate it.

Well, Amar, what did you think? It was amazing to understand the platform technology that they have built. And of course, they have been doing generative AI for the last six years, as you mentioned, right? Unlike the last couple of years, when this has all been out in the public and people have been talking about it, generative AI was there before, and these guys have been visionaries in

starting with it such a long time ago. What I really liked was also that they're evolving; their models are evolving as time goes on, so they're very open to that. They're also doing a lot of lab validation, and it was very interesting to hear how they have now really started to adjust the right parameters, so that the number of potential drugs they can get is increasing

exponentially, which is amazing news. Mike mentioned that we don't understand biology all that well. He suggests that with these AI systems we'll understand biology in a way that was not previously imaginable. But now I'm wondering, will these AI systems operate as kind of black boxes, making connections that we're not aware of, or will we actually get to a new understanding? Yeah.

Of course, we don't understand biology that much. If you look at physics and chemistry, the extent of the knowledge we have there versus in biology, it's, I would say, still pretty basic. We have made tremendous advances over the last 50 years. The amount of understanding we have about immunology especially has gone up tremendously. Oncology, tremendously.

Maybe not so much in neurology; we still don't understand a lot of things about how we think and how we process things, right? So there are a lot of things that we still need to learn there. But also, see, a good measure there is: if we understand a system, are we able to then cure all the diseases in that system? Or, even better, can we prevent

any disease that might be happening there, right? So take immunology: yes, of course, we now have great drugs against arthritis, against asthma, etc. But will we ever be able to prevent those? So yeah, we have a great understanding, but we still don't have the understanding of how we can completely eradicate those diseases.

But the same thing also in terms of, see, we have so many thousands of proteins in our body. We have millions or billions of cells. How do they all interact with each other? We know some of it, but I would say maybe less than 10%. We need to understand a lot more. So yeah, absolutely, there's so much that we need to learn.

I mean, I remember that until this AI came up, people used to say that the 21st century is going to be the century of biology, which I believe is still true, right? And now, yes, there's AI to help with understanding biology. Now, to your question about whether it's going to be a black box or not: I would say we will be getting more and more

knowledge about it. Of course, the way AI is coming up with some of this knowledge may or may not stay a black box. But in biology, you can just do experiments to validate what you're finding. So it's not like you find something and never know whether it's true or not. You can validate and see if it's right or not. I think, as he said, it can do what one genius scientist

can do at a much larger scale. So I think it's a very bright future. However AI or machine learning discovers new insights about biology, it's fine; I think we will know more and more about it anyway. We have the wet lab to really go to there anyway. So of course, AI by itself is not going to be doing all this. It's going to need the wet lab and the validation to then really understand it.

You know, one of the things he talked about was hallucinations, and we've talked about hallucinations from time to time, but he talked about hallucinations as a feature rather than a bug, because he said they can learn as much from those as they can learn from functional proteins. What did you think of that? The hallucinations we're seeing right now, the context is we're asking some of these, you know, large language models questions and getting some weird answers, right? Those are the hallucinations.

The context here is different: they are asking the models to generate new sequences, and some of the sequences are weird because they came out of these hallucinations. But what they're doing is validating using their experiments and then saying, okay, well, instead of the right type of sequence, if you get this wrong type of sequence, what are the implications of that? So for them, it's just, you know,

okay, well, here are some good samples you learned from, and here are some bad samples you learned from. So the hallucination just becomes these bad samples, rather than how it feels when we are consuming it ourselves; it's different. I see hallucination here as just bad data that was generated, that's it. You talked about the control over biology that wasn't possible before, bringing about programmability to create

functional variants of proteins to optimize them as therapies. Mike said we're on the brink of a revolution and that every molecule will be created this way. What did you think of the potential to change drug discovery with this technology? I think there is a lot of potential, because until now, as you said, right, I mean, if you talk about these CDRs: you have these antibodies, and in the antibodies,

the basic structure is the same for all these antibodies, but there are the CDRs, the complementarity-determining regions. These are the regions in the antibody where the sequences can be very different. And those are the sequences that determine what target the antibody is going to bind. So that's where the variability of the sequences is. And what he talked about is that, until now, what the scientists did was look at what's already available.

And then, okay, well, let's make a change here, a change there, and see if the binding improves based on that. With this de novo approach that he talked about, you start from scratch, right? And then you can come up with a lot of very different sequences, and because of that, very different structures. So that is the power that generative AI is now giving us. And of course, now that

I mean, these are the practitioners. And as he's talking about it, even three years ago not much yield was coming out of that, but now it's basically increasing exponentially. So that means there's a lot of improvement happening in that area. And to me, that is going to be fantastic, because theoretically it makes sense to me that when you have the ability to generate a lot of different sequences, you're going to get much better drugs. But now I'm hearing from him that

in practice, they're getting better and better success. So yes, theoretically it makes sense to me, and now it's great to hear that it's happening practically as well. Well, another fascinating discussion. Amar, thanks, and until next time. Yes. Thank you, Danny.

Thanks again to our sponsor, Agilisium Labs.

For Life Sciences DNA and Dr. Amar Drawid, I'm Daniel Levine. Thanks for joining us.

Our Host

Dr. Amar Drawid, an industry veteran who has worked in data science leadership with top biopharmaceutical companies. He explores the evolving use of AI and data science with innovators working to reshape all aspects of the biopharmaceutical industry from the way new therapeutics are discovered to how they are marketed.

Our Speaker

Michael “Mike” Nally is a veteran pharmaceutical and biotech executive currently serving as CEO-Partner at Generate:Biomedicines, a Flagship Pioneering company dedicated to pioneering generative biology. Based in Boston, he combines deep commercial leadership experience with a visionary approach to merging machine learning, biological engineering, and experimental innovation to create next-generation biologics.