The ASML of DNA Sequencing

A single company's machines produce 90% of all the DNA data sequenced. Illumina holds an effective monopoly in the DNA sequencing market. Their next-generation sequencing machines blew up the sequencing process's speed and volume, reaching into the hundreds of billions of bases per run. Accordingly, the cost of DNA sequencing has rapidly declined since the turn of the century, from over a billion dollars per genome to just a thousand. Until recently, I had never heard of this company. In this video, I want to talk a bit about Illumina, the ASML of the DNA sequencing industry.

But first, let me talk a bit about our sponsor, Blinkist. Way before I started the YouTube channel, I had been an avid non-fiction reader. I would take these long walks near the SF Bay and listen to audiobooks about how our brain works during depression or the history of language. I still love doing that, so for 2023 I hope to get back to more of that and become a person more knowledgeable and attuned to the biological aspects of the world around us. But with life being so busy now with the channel, I really appreciate being able to get straight to the point, and that's where Blinkist comes in. Blinkist offers over 5,500 titles in 27 different categories. In just 15 minutes, you can get powerful insights into different topics while going about your everyday life. Learn about the world around us, or how to be a better communicator or a happier person. I personally think that there's a lot of overlap between what Asianometry does and what Blinkist provides: densely packed information in 15-to-20-minute chunks, long enough to read on the high-speed rail or while washing dishes after dinner. Get everything you want to know about a particular topic, or read to complement your everyday habits and accelerate your learning. If you are interested in the concepts in today's video, I recommend checking out The Gene: An Intimate History by Siddhartha Mukherjee. Previously, he wrote the book I wish I could have written about cancer. Listening to
The Gene on Blinkist is a wonderful follow-up to this video. I also want to call out a new feature, Blinkist Connect, which lets you share your subscription with a friend for as long as you wish. You can get more out of your current subscription, which is good. Get 25% off Blinkist Premium and enjoy two memberships for the price of one with Connect. Start your seven-day trial today by clicking on the link in the description below. Become who you want to be. Thanks to Blinkist for sponsoring the channel.

In 1997, an associate working at a New York venture capital fund, John Stuelpnagel, received a letter from an acquaintance at a technology licensing office. They wanted Stuelpnagel's VC firm, Channing Weinberg Group, or CW Group, to come check out a new technology from Tufts University. Stuelpnagel's boss at CW Group, Larry Bock, met with the technology's inventor, David Walt. Walt demonstrated what he called optical noses: bead clusters that glow a certain color in the presence of a chemical. Stuelpnagel felt that this technology had potential. After three months of due diligence, CW Group and Tufts negotiated an exclusive worldwide license for a new startup. After cycling through a few names like Sensa Technologies, they settled on Illumina.

Illumina's first products focused on measuring a type of genetic variation called single nucleotide polymorphism, or SNP. These are systems that help detect minor substitutions of a single nucleotide at a specific place in the genome. These can indicate vulnerabilities to certain diseases. For instance, cystic fibrosis is a rare genetic disease caused by a mutation in a specific gene. Throughout the 1990s, the SNP research community focused on cystic fibrosis as the rallying point to advance their research. In the late 1990s, a company called Affymetrix took leadership in SNP. Their product, the GeneChip, was made up of arrays of hundreds of DNA probes produced with semiconductor manufacturing methods.

What David Walt, inventor of Illumina's founding
technologies, demonstrated for Stuelpnagel and the investors at CW Group was not a product. It was simply a technology. Walt demonstrated that you can etch wells into the ends of a bundle of optical fiber. Then you place a bead, about three to five micrometers large, into each of the etched wells. These small beads can turn a different visible fluorescent color when in the presence of an odorant. That light goes through the optical fiber for interpretation by a CCD camera system. That was basically it. Walt envisioned that this optical fiber technology could create DNA sensors. Illumina's co-founders faced the challenge of turning this science experiment into a real product.

In 2002, Illumina released its first product, the BeadArray, which was then expanded into a full platform. First, there was the Sentrix Array Matrix and the Sentrix BeadChip. One is suited for higher throughput, while the other features more etched wells and is thus more flexible. Other parts of the platform include the BeadArray Reader and the Oligator DNA synthesis technology. You start with a patient's purified DNA. You then introduce it to a bead array on either the Array Matrix or the BeadChip. DNA molecules in the sample will bind to their matching beads. Depending on the system, either the sample or the beads are tagged with a fluorescent dye. The BeadArray Reader then shines a laser on the dye and interprets the reflected results. Together, these technologies allow someone to efficiently and effectively measure genetic variation.

SNP brought Illumina its first bucket of gold. However, the company knew that it faced a long-term threat from an adjacent market: DNA sequencing. A gene sequencer, or DNA sequencer, reads DNA and produces sequences consisting of strings of the letters A, T, C, and G, with a fifth letter, N, for ambiguity. To borrow a phrase, SNP genotyping is like reading a few words or phrases in a book. DNA sequencing is about reading larger chunks of that book, or perhaps the entire book itself. Why is this helpful? Here's another
metaphor. If you are in a dark room and you want to find the remote control, it is easier to just turn on the ceiling lights and see the whole room all at once than it is to use a flashlight to scour the room little by little. So DNA sequencing can work better than genotyping, giving access to more available data. The downside is that DNA sequencing is hard, takes a long time, and is expensive.

Let us pause for a moment to walk through the DNA sequencing industry from the very beginning, because I want to. Famously, Watson and Crick, aided by the work of Franklin and Wilkins, figured out the double helix structure in 1953. However, it would take another 15 years to actually develop the technology and capability to read the DNA sequences within that structure. A long and winding road.

The same year as the double helix discovery, the British scientist Frederick Sanger sequenced the two chains of a biological molecule, bovine insulin. Sanger's work illustrated the importance of sequence in such molecules and won him his first of two Nobel Prizes in 1958.
DNA presented a bigger sequencing challenge than simple proteins. DNA molecules are much longer. Furthermore, there are fewer component units: the four DNA bases. Unlike with the amino acids in proteins, it is harder to distinguish one base from another. Initial efforts first focused on sequencing RNA. Some types of RNA molecules are more similar to proteins, and scientists recognized that they could adapt protein sequencing methods accordingly. In 1965, Robert Holley first wholly sequenced a type of transfer RNA in a species of yeast. Others soon followed. These methods were adapted to tackle DNA sequencing.

Then, in 1970, the American scientist Hamilton Smith and his team discovered what are called type II restriction enzymes. These enzymes provided a general method to cut up the big DNA molecule into smaller, more manageable pieces. We can then sort those out by size using a method called gel electrophoresis. Smith, a UC Berkeley alum, shared the 1978 Nobel Prize in Medicine for this. These discoveries paved the way for what was to come.

In 1975, Sanger and his colleague Alan R. Coulson introduced the plus and minus method of DNA sequencing. The principles demonstrated here would dominate the gene sequencing industry for the next three or four decades. This is exceptionally complicated stuff. Bear with me as we briefly walk through this. The original Sanger method involves preparing four mixtures. Each of the mixtures contains: one, the template DNA you want to sequence; two,
DNA polymerase, which is an enzyme that catalyzes the synthesis of DNA molecules from its molecular precursors; three, a primer, which is necessary to prime the process of DNA synthesis; and four, those DNA molecular precursors, the deoxynucleotides A, C, G, and T. One of those deoxynucleotides is tagged with a radioactive substance. The presence of a DNA polymerase and a primer together in these four mixtures will trigger the synthesis of new DNA. These new DNA strands will be of varying lengths.

We then split these four mixtures into eight containers. Then we perform a second round of DNA polymerase reactions on each half of the eight variants. However, there is a twist. The twist is that we terminate the reaction at a specific step in the sequence by withholding a critical ingredient: one or some of the four nucleotide bases. The full reaction needs all four present. These are the plus and minus reactions, which give the whole thing its name. For the minus reaction, we add three of the four bases, thus withholding one. This generates DNA extensions that all terminate just before the missing base. For the plus reaction, we add just one of the four bases, thus withholding three. This makes all the generated DNA extensions end with that one base.

With this, we can load the contents of the eight containers onto lanes of polyacrylamide gels and use gel electrophoresis, which I mentioned earlier, to separate and sort the DNA extensions by their size. We then place our eight lanes onto a roll of X-ray film and leave it overnight, like a no-bake cheesecake. After we develop the film, we can infer the position of the nucleotides in DNA fragments up to 50 bases large. Sanger and Coulson used this method to run the first DNA genome sequencing, the subject being a virus, a bacteriophage called phiX174.

Two other scientists, Allan Maxam and Walter Gilbert, introduced their own method. It also uses polyacrylamide gels to sort radio-tagged DNA fragments by their length, but used a different method to generate those fragments.
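
Both of these early methods come down to the same read-out step: sort tagged fragments by length on a gel, then infer which base sits at each position. The Python sketch below is a toy model of that idea only, not the actual plus and minus chemistry. It assumes one idealized lane per base, where each lane holds the lengths of the fragments that end in that base. The function names `run_lanes` and `read_gel` are made up for this illustration.

```python
# Toy model of reading a sequence off a sequencing gel.
# Assumption: one idealized lane per base, each holding the lengths of
# the fragments that end in that base. Real gels are far messier.

def run_lanes(sequence):
    """Return {base: [fragment lengths]} for a newly synthesized strand."""
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(sequence, start=1):
        # A fragment of length `position` ends with `base`.
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Sort all bands by fragment length and read off one base per band."""
    bands = [(length, base)
             for base, lengths in lanes.items()
             for length in lengths]
    return "".join(base for _, base in sorted(bands))


lanes = run_lanes("GATTACA")
print(lanes)            # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_gel(lanes))  # GATTACA
```

Sorting by length stands in for the gel, where shorter fragments travel farther; reading the bands in order recovers the sequence one base at a time.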
Maxam-Gilbert sequencing was initially popular since it didn't require as many preparatory steps as the plus-minus method. However, it did require the use of certain toxic chemicals.

Then, in 1977, Sanger came out with the dideoxy chain-termination method. It changed some of the ingredients, including the radioactively tagged nucleotide used, so as to allow longer sequences to be read. The dideoxy chain-termination method, later renamed to just the Sanger sequencing method, became the cornerstone of what we call the first generation of automated gene sequencing. This 1977 mic drop earned Sanger his second chemistry Nobel, shared with Walter Gilbert and Paul Berg.

In 1986, Applied Biosystems introduced an automated sequencer, the ABI 370A DNA sequencer. At the start, it was more like a system that helped read and interpret the data, but automation rapidly progressed. Throughout the 1990s, Applied Biosystems was the undisputed market leader in DNA sequencing technology. The National Institutes of Health used an ABI 370A DNA sequencer for the Human Genome Project.

DNA sequencers compete in the market based on the accuracy of their reads, how fast they can do those reads, and how many reads they can do in a single run. Oh yeah, and cost too. Never forget about cost. First-generation DNA sequencers, relying on the Sanger sequencing method, mostly suffered shortcomings in sequencing throughput and cost.

In the 2000s, new technologies emerged to greatly raise throughput with parallelization. We amplify the DNA sample, split it up into many smaller pieces, and finally read those smaller pieces all at once. A computer program known as sequence alignment software can then assemble the sequenced pieces together. There are a number of open-source and commercial software packages to do this, with varying levels of sophistication and accuracy. The game-changing ability to do a million or a billion reads in a single run using parallelization thus earned the name next-generation sequencing, or high-throughput sequencing. For this video, I'm going
to use the phrase second generation. The first significant company to commercialize this technology was 454 Life Sciences, using a newly discovered method known as pyrosequencing. Pyrosequencing takes advantage of a two-enzyme reaction that generates light as a side effect. We can trigger this during DNA synthesis to measure which bases are being added throughout the synthesis. 454 was later acquired by Roche. Their first sequencer, the GS20, entered the market in 2005 and was capable of reading a total of 25 million base pairs in a single four-hour run, at a cost just one-sixth that of the legacy Sanger automated technology.

Other parallelized products entered the market after 454's, but the one with the most potential came out of the United Kingdom: Solexa. Solexa was founded by two British chemists from the University of Cambridge, the India-born Shankar Balasubramanian and David Klenerman. In late 1997, the two approached an investment firm called Abingworth with the proposal of a technology capable of speeding up gene sequencing by a factor of 10,000 to 100,000.
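
Stepping back to the parallelization idea from a moment ago: split the sample into many short pieces, read them all at once, then let software stitch the reads back together. The Python sketch below is a deliberately tiny illustration of that last step. It uses a naive greedy overlap merge, which is an assumption made for this example and not how production alignment or assembly software works. The names `shred`, `overlap`, and `greedy_assemble` are hypothetical, and the 14-base "genome" is invented.

```python
# Toy illustration of "read many short pieces, then assemble them".
# Assumption: error-free reads and a naive greedy overlap merge. Real
# alignment and assembly software is far more sophisticated than this.

def shred(genome, read_len, step):
    """Cut the genome into overlapping reads, standing in for fragmentation."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + step, step)]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads)
                 if k not in (best_i, best_j)] + [merged]
    return reads


genome = "ATGCGTACGTTAGC"       # invented example sequence
reads = shred(genome, read_len=6, step=3)
print(reads)                    # ['ATGCGT', 'CGTACG', 'ACGTTA', 'TTAGC']
print(greedy_assemble(reads))   # ['ATGCGTACGTTAGC']
```

The greedy merge works here because the toy reads overlap cleanly. Repeated stretches of DNA and read errors are what make the real problem hard.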
The following summer, Abingworth invested 3 million dollars to get the company started. Solexa's method of gene sequencing, referred to as sequencing by synthesis, is very complicated. What follows is an incredible simplification.

First, we break the DNA we want to sequence into random small pieces about 200 base pairs long. This is part of the parallelization I discussed earlier. We add adapter chemicals to the ends of these smaller pieces for marking purposes. Then we stick the whole thing to a solid plate. Then we use a method called bridge PCR amplification to create a million copies of each small piece, referred to as clusters.

Finally, we do the sequencing. Like with pyrosequencing, we trigger a DNA synthesis reaction using DNA polymerase and use fluorescence to track it. During the synthesis reaction, the DNA clusters on the plate will pick up the complementary nucleotide to match up to its existing sequence: C for G, A for T, and vice versa. Like with Sanger sequencing, we halt and limit the full DNA synthesis reaction so that it only picks up one nucleotide at a time. We tag each of these nucleotides with a fluorescent label that glows when exposed to a laser. A CCD camera reads the light signal and gradually learns the cluster's nucleotide sequence.

Solexa's innovation was using DNA amplification on a solid surface to strengthen the fluorescent light signal. This not only allowed for exceptional accuracy, but also meant that Solexa did not need the most expensive and sensitive optics.

In 2005, Solexa went public on the NASDAQ through a merger with a Bay Area biotech company called Lynx. The combined company was worth over 200 million dollars. A year later, in 2006, Solexa released what they called the 1G Genome Analyzer. The 1G stands for one gigabase, the target total output of a billion bases per run. Later devices surpassed this number, running 20, 30, and then even 50 gigabases per run over the span of several days. In 2012, Solexa's technology would power the HiSeq 2500, capable of sequencing 600
gigabases in a single run and with 99.9% accuracy.

The entire human genome has a little over 3 billion base pairs. You might be wondering: why would you need 600 gigabases? The reason seems to be redundancy, for additional coverage to avoid false positive and false negative errors during the sequencing process. Anyway, at about four hundred thousand dollars per device, the Genome Analyzer was more expensive than its contemporaries. But when you take into account the system's immense throughput, the cost per sequenced gigabase made a lot more sense.

In November 2006, shortly after the Genome Analyzer entered the market, Illumina offered 650 million dollars to purchase Solexa. Illumina felt that Solexa had a revolutionary technology, and with it they could challenge the first-generation giants like Affymetrix and Applied Biosystems. Furthermore, Illumina wanted to integrate into its existing platforms the same parallelization technologies that allowed Solexa to sequence an entire genome at scale. The acquisition paid off more than anyone could have ever imagined. The industry rapidly adopted Solexa's, now Illumina's, approach to DNA sequencing. New collaborations with industry and academia were also generating opportunities. The more academics used Illumina's products to produce highly cited peer-reviewed articles, the more it cemented the company as the undisputed leader of the second generation of DNA sequencing. Today, Illumina has about 80% market share in the DNA sequencing market, generating 90% of all DNA sequence data. Today, its newest machines can sequence the entire human genome in 48 hours at the cost of one thousand bucks.

The biotechnology industry depends very heavily on IP rights: receiving patents and protecting those patents. Illumina is no different. This company has seen and filed more than its fair share of lawsuits. In 2004, Affymetrix filed a lawsuit alleging that Illumina's BeadChip and Array Matrix infringed on six of its patents. Illumina filed countersuits. After four years, in 2008, the two companies settled their
lawsuit, with Illumina paying 90 million dollars without admitting liability. After releasing its highly parallelized second-generation sequencer, Illumina got sued by ABI, the former first-generation leader, and Columbia University. Another biotech startup, Helicos, sued Illumina as well. After several years, the Helicos patent claims were dismissed by a judge.

In 2010 and 2012, Illumina themselves filed lawsuits against a company called Complete Genomics. Complete Genomics was an upstart challenger to Illumina's monopoly, with 90 customers and a revenue backlog of about 30 million dollars. The suit was eventually settled, but it left Complete Genomics reeling. The CEO resigned and the company accepted an acquisition by the Chinese biotech giant BGI Group. In 2016, Illumina also sued to block sales of the British company Oxford Nanopore, which has a very small DNA sequencer that you can take with you on the go. This was quickly settled by August 2016.

Illumina has since had especially vicious clashes with BGI. They have sued BGI in the U.S., UK, Germany, Denmark, and so on, presumably to block BGI's competing gene sequencing products from those developed markets. They have not won all their battles. Recently, in 2022, there was news of a court awarding BGI 300 million dollars in damages for patents from its Complete Genomics acquisition.

Illumina is also a sharp-eyed acquirer. After all, its brilliant acquisition of Solexa cemented its dominant position in the second-generation genome sequencing market. Two of the company's attempted acquisitions have thus run afoul of antitrust regulators. The first happened in November 2018, with its 1.2 billion dollar proposed acquisition of Pacific Biosciences. If you recall, Illumina's sequencing technology uses millions of short reads on DNA segments. PacBio has a technology capable of longer individual sequence reads, about 15,000 bases compared to Illumina's 50 to 300 bases. These long reads slow down throughput but also offer diagnostic and industry benefits. A combined
PacBio and Illumina would have few domestic competitors. The aforementioned Nanopore and its adorably named MinION product would probably be one. So a year later, in December 2019, the Federal Trade Commission sued to block the acquisition, calling Illumina a monopolist. In January 2020, Illumina abandoned the takeover and paid PacBio a breakup fee.

Then came the announced acquisition of Grail, a former Illumina spin-off that tests for multiple early-stage cancers in the blood using DNA sequencing. In September 2020, Illumina sought to bring Grail back into the fold for 7.1 billion dollars. This alarmed regulators, who saw it as an illegal vertical expansion. Illumina is the monopoly provider of DNA sequencing equipment, and acquiring Grail would give it incentive to cut off Grail's competitors from an essential piece of equipment. Both the U.S. Federal Trade Commission and the European Commission sought to block the deal. A judge ruled against the FTC's lawsuit, but the EU decided to prohibit the acquisition. Illumina closed the acquisition anyway, a ballsy move, and as of this writing is waiting for word from the EU on what to do.

Some of Illumina's core patents on second-generation sequencing are starting to expire, so generic versions of their early hardware, as well as the chemicals used to run its sequencing, referred to as consumables, might be possible. New competitors have emerged, like Element Bio and Singular Genomics, seeking to push gigabase output, speed, flexibility, or cost. Not to mention the giant over in China, BGI. But at the same time, the company has iterated on its IP pretty well, especially on the software and business side. Illumina's software and data standards have become an industry foundation, like Adobe's have. There does loom the possibility of another sweeping technology change, a quote-unquote third generation of sequencing, that can overturn Illumina's monopoly. What that technology might look like, however, remains uncertain.

As Illumina continues to battle its competitors, pushing prices down and throughput up, perhaps the biggest benefit for us would be the insights we can glean from cheaper data sets of a complete genome. The possibilities of modern machine learning techniques applied to this abundant genomic data are pretty exciting.

Alright, that's it for tonight. Thanks for watching, and thanks again to Blinkist for sponsoring the channel. Check the link out below to subscribe. I'll see you guys next time.
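
As a footnote to the coverage question raised in the video (why 600 gigabases when the human genome is only a little over 3 billion base pairs), the arithmetic can be sketched in a few lines of Python. The two figures below are the rounded numbers quoted in the video. The 30x target used in the second calculation is an assumption chosen purely for illustration, not a figure from the video, and the function names are made up for this example.

```python
# Back-of-the-envelope coverage arithmetic using the figures quoted above.
GENOME_SIZE_BASES = 3_000_000_000    # "a little over 3 billion base pairs", rounded
RUN_OUTPUT_BASES = 600_000_000_000   # 600 gigabases per run, as quoted


def coverage(total_bases, genome_size):
    """Average number of times each base of the genome is read."""
    return total_bases / genome_size


def genomes_per_run(total_bases, genome_size, target_coverage):
    """Whole genomes that fit in one run at a chosen average coverage."""
    return total_bases // (genome_size * target_coverage)


# One genome per run: every base is read about 200 times on average.
print(coverage(RUN_OUTPUT_BASES, GENOME_SIZE_BASES))             # 200.0

# Assumed 30x target (illustrative only): several genomes share one run.
print(genomes_per_run(RUN_OUTPUT_BASES, GENOME_SIZE_BASES, 30))  # 6
```

Reading each position many times is what lets the software outvote the occasional wrong base call, which is the redundancy argument made in the video.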

2023-01-29 02:52
