xander.ai
Software Engineers Meet Bio
Someone set out to found the company I founded in 2018, so this is the advice I gave them.
Xander Dunn, 5 August 2023
I Would've Appreciated This
A software engineer and tech founder reached out to me with plans to found a company that uses machine learning to develop gene therapies for longevity. I was in a very similar situation in 2018, so I gave him the advice I wish someone had taken the time to give me in 2018. I can't say for sure that it would've put me on a more productive path at the time, but I can say for sure that I would've appreciated it in retrospect.
Advice
I was Co-Founder & CEO of a biotech company in 2018 focused on using deep learning to improve the design of gene therapies, with a focus on longevity. I want to live forever just as much as anyone, and I have the pills, exercise, 0 alcohol, minimized microplastics, and calorie restriction to prove it. But I'm skeptical of the application of software, or deep learning in particular, in biotech.
Biology has great challenges that software does not have:
- Biology is fundamentally slow, because the physical world is fundamentally slow. The difference between ChatGPT / Midjourney and industrial robotics / biology is that in the former case we have billions of monkeys around the world generating data (images and text) for those models every hour of every day, whereas for the latter we have very, very little data, and it's nearly all proprietary. Machine learning hasn’t gotten much more data efficient in the past 5 years. Producing data in biology is extremely difficult. Cells won’t grow any faster than they already grow (and synthetically speeding up the process would potentially invalidate the results). A single mouse study takes years. It takes physical humans to run all of the experiments, which introduces a lot of noise. As a result, biologists don’t really even look at molecules below a very large effect size, because anything smaller could just be noise. Cloud bio labs are attempting to solve this problem by making biology experiments automated, scalable, and less noisy. Emerald Cloud Lab (ECL) is an example. But their progress has been slow. ECL hasn’t had any breakout success, and their competitors have died out. When I last checked a few years ago, they still couldn’t do mammalian cell assays; they were focused on servicing high-throughput small molecule experiments for big pharma. The timescale and noisiness of biology and the sample inefficiency of deep learning are orders of magnitude off from one another. AlphaFold made progress on the protein folding problem due to the 50+ years of open access sequence and structure data + many decades of physics modeling on protein interactions. So the question is, where do we find or create similarly sized datasets? Even AlphaFold faces challenges to market impact. Anecdotes from labs using it are that it is no panacea. With every deep learning endeavor, I ask how the Bitter Lesson fits in.
The massive successes we've seen have come from scaling deep learning across enormous datasets: the text of the entire Internet, thousands of years of simulated Go/Atari/StarCraft play, every image on the Internet, etc. Biology presents one of the most challenging data situations for deep learning: low resolution (often single-bit yes/no data points), low public data availability, expensive data collection (multi-year mouse experiments), and very noisy data (performed by hand with low reproducibility). Whatever deep learning play ends up working in biology will deeply understand the Bitter Lesson.
- The average cost of bringing a human drug therapy to market is ~$1B and ~10 years, and ~90% of drugs fail to get to market even after reaching Phase I clinical trials. It is an incredibly capital-intensive endeavor. Imagine if the average cost of putting a software MVP into the hands of its first users were $1B and 10 years. The entire tech & VC ecosystem would look radically different, and everyone would be much more risk-averse.
- As a result of the capital intensity, drug startups are either purchased by a large pharma company or, at the very least, partner with one to bring drugs to market. Ultimately this means you move at the speed of big pharma, not at the speed of a startup.
- In software, it's trivial to verify someone's claims. I can simply go to your website and try your product, or I can clone your code and run it on my laptop. Biotech is the opposite. It takes too much time to evaluate anyone’s claims, so everyone looks for credentials. Biotech is rife with credentialism. It was very difficult to get biotech VCs to give me a chance given that I have no PhD in genetics. My cofounder did have a PhD in genetics, but not in longevity research, and that mattered. Being smart and energetic sufficiently derisks a $6M software investment, but it does not sufficiently derisk a $1B biotech investment.
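A rough sketch of the noise point in the first bullet, under assumed numbers: the snippet below uses the standard normal approximation for a two-arm comparison to estimate the smallest effect an experiment can reliably detect, given the assay noise and the number of replicates per arm. The helper name `min_detectable_effect`, its parameters, and the replicate counts are all hypothetical and only illustrative; with very few replicates a t-test would be even less forgiving than this approximation.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(noise_sd, n_per_arm, alpha=0.05, power=0.8):
    """Smallest true difference in means that a two-arm experiment detects
    with the given power, using the normal approximation (hypothetical helper)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = std_normal.inv_cdf(power)          # quantile for the desired power
    return (z_alpha + z_power) * noise_sd * sqrt(2 / n_per_arm)


if __name__ == "__main__":
    # Replicate counts are illustrative assumptions, not measurements.
    for n in (3, 30, 1000):
        mde = min_detectable_effect(noise_sd=1.0, n_per_arm=n)
        print(f"{n:>5} replicates per arm -> detectable effect ~ {mde:.2f} noise SDs")
```

Under these assumptions, three replicates per arm can only detect effects more than twice the size of the noise, while a thousand replicates brings that down to a small fraction of it. That is roughly the gap between hand-run bench experiments and the data volumes deep learning is used to.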
<redacted VC 1> encouraged you to go for it now, but how much money did he give you to pursue it? Words from VCs are empty unless backed by money. In this case, it needs to be a lot of money. <redacted VC 1> invested >$150M in <redacted biotech company doing longevity>, so he's not serious unless he's investing those amounts. The "I already have a longevity investment and they might be unhappy if I invest in your longevity company" is nonsense. He already has multiple longevity investments.
<redacted VC 2> is being polite, but <redacted VC firm> invested in <one of the most successful biotech investments of all time>. Their core focus area is whatever makes money.
Perhaps it would be a better use of time to work at Retro Biosciences or NewLimit for a few years to better understand biotech and see firsthand where there might be opportunities for problem solving and company building. It may also be worth meeting Raphael Townshend, who was on the team at DeepMind that developed AlphaFold and left to start his own company in this space a few years ago. I can introduce you. Or, have you reached out to David Sinclair and asked how you can use your software skills to empower his epigenetic research? Sinclair starts a new company almost yearly, so I'm sure there is some commercial endeavor to bring his epigenetic research to market.
My current hope is that longevity will be solved indirectly within our lifetimes. For example, solve AGI / ASI, which solves longevity. Some people believe they will be able to upload their brains to silicon before we have the ability to seriously increase human biological longevity (Neuralink + e11 Bio + ASI). I think finding a really clever way to solve longevity indirectly is our best bet as software / math / physics people. But, if this is something you have high conviction in, don’t let me stop you. I hope you succeed where I did not. I am very optimistic about the future of both biology and software. All of these problems are solvable, but the path is not yet clear to me.