[PARIS AND WASHINGTON] Concerns that Celera Genomics Corporation may be poised to obtain a monopoly over human genome data are mounting in the academic research community, as the implications of the company's licensing policy on its soon-to-be-released sequence database begin to sink in.
In a statement earlier this month, Celera said: "When sequencing and scientific analysis of the human genome is completed, the consensus sequence data will be submitted for publication in a scientific journal. These published data will be made freely available to researchers around the world under a non-redistribution agreement."
Philip Green, a biocomputing expert at the University of Washington, argues that this "non-redistribution clause" represents a "significant departure from previous promises made by Celera that the sequence would be in the public domain; this presumably means, for example, that the data will not be submitted to GenBank."
Celera's president Craig Venter promised Congress under oath, in 1998 [http://www.house.gov/science/venter_06-17.htm], that data from the human genome would be placed in the public domain and that he would work with the National Library of Medicine, which runs GenBank. "Those statements implied to all observers that there would be truly no limitations on access or use," says one scientist working for the publicly funded international genome project.
Paul Gilman, director of policy and planning at Celera, confirmed to Nature that no data will be submitted to GenBank. Instead, sequence data will either be released on the company's own web site or provided to researchers on DVD.
Gilman compares the terms of the proposed free public release with those of software licensing agreements. By opening the DVD package, the recipient would agree not to redistribute the data. "We do not want researchers to redistribute the data in competition with us," he says, fearing that this could undercut Celera's business plan, which relies on selling subscriptions to its database to large pharmaceutical companies.
But many researchers fear that the license will preclude free use of the data by scientists and that Celera may eventually close out all competitors. "I think it would be a disaster for biology if [a monopoly] happened to bioinformatics," says John Sulston.
Another genome scientist involved in the publicly funded project says that the licensing model adopted by Celera puts stringent limitations on use. "No other commercial database can use the Celera sequence to build a better annotated version," he says. "Their explicit intention is to obtain a monopoly on this field, driving out Incyte, Pangea, and a host of other creative companies whose ability to provide added value to the genome sequence is something we all want to flourish."
Other researchers are concerned that the agreement would prevent data contained on the DVD being used to design DNA chips for the whole human genome, or in other genomic or proteomic experiments that would require use of the entire genome sequence. Gilman argues that Celera is not interested in selling raw data as such, but rather in licensing sophisticated software tools for its analysis. He defends the non-redistribution agreement. "People will have the entire completed genome from us before they have it from the public and they'll have it for free."
It is true that the Celera database, likely to be available this summer, will be the best available and very attractive to scientists in academia and industry. Although by late spring Celera will have sequenced the genome only to a depth of 4X (meaning that four bases of sequence have been generated for every base of genome), its database will be close to the 10X gold standard, a depth that is, ironically, obtained only by merging its own 4X data with the public-domain 5X data. In contrast, the publicly funded project will have only its own 5X sequence to offer. "They can take our data, but we can't take theirs," points out one genome scientist. This lead will give Celera a serious competitive advantage in the period before the public project single-handedly reaches 10X coverage in around 2003. Gilman disputes that the Celera database will become less attractive with time: "It's not just about the data," he says.
Eric Lander, director of the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, says he "applauds efforts to add value to the genome" but adds: "All data should be broadly and freely available. There is no proprietary genome." Lander predicts that dozens of companies will jostle to provide bioinformatic tools to analyse the human genome, and is concerned that the terms of Celera's licensing agreements may stifle this.
Celera is thinking of charging $20,000 annually per laboratory for basic access to its database, says Gilman, adding that universities with multiple laboratories would probably pay around $5,000 per laboratory. Higher levels of access and a greater number of tools would probably cost more, he says. Academics could have "exactly the same access as the pharmaceutical companies, if they want it". Craig Venter adds: "If no one subscribes to it, we're out of business."
Gilman envisages that laboratories would subscribe to different suites of tools and access, with "deluxe", "economy" or "middle" packages. The company has not yet signed any contracts with academic laboratories, but is in discussion with some 15 universities, according to Gilman. "Our goal has always been to have academic subscribers."
Mark Magnuson, director of Transgenics/ES cell research at Vanderbilt University, says that the university is "interested" in subscribing to Celera's database, in particular to annotated data. But, he says, "they're going to have to have information that's not in GenBank."
Last Updated on 2/17/00
By Rachel Benbrook